Running system recovery using the service assistant

You can use the service assistant to start recovery when all node canisters that were members of the system are online and have candidate status. For any nodes that display error code 550 or 578, ensure that all nodes in the system are visible and all the recommended actions are completed before placing them into candidate status. To place a node into candidate status, remove system information for that node canister. Do not run the recovery procedure on different node canisters in the same system.
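The readiness rule above can be sketched as a small check. This is a minimal illustration and not part of the product: the `NodeCanister` record, its fields, and the status strings are hypothetical, and the values would have to be filled in by hand from what the service assistant displays.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical record of what the service assistant shows for one node canister.
@dataclass
class NodeCanister:
    name: str
    online: bool
    status: str                      # assumed values such as "candidate" or "service"
    error_code: Optional[int] = None

# Error codes for which the procedure says to remove system data first.
NEEDS_SYSTEM_DATA_REMOVAL = {550, 578}

def recovery_blockers(nodes: List[NodeCanister]) -> List[str]:
    """Return the reasons why recovery should not be started yet (empty if none)."""
    blockers = []
    for node in nodes:
        if not node.online:
            blockers.append(f"{node.name}: not online")
        elif node.status == "candidate":
            continue  # this node is ready
        elif node.error_code in NEEDS_SYSTEM_DATA_REMOVAL:
            blockers.append(
                f"{node.name}: error {node.error_code}, "
                "remove system data to reach candidate status"
            )
        else:
            blockers.append(f"{node.name}: status is {node.status!r}, expected 'candidate'")
    return blockers
```

An empty list means every listed node canister is online and in candidate status; any other result names the nodes that still need attention before recovery is started.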

Note: Ensure that the web browser is not blocking pop-up windows. If pop-up windows are blocked, progress windows cannot open.

Before you begin this procedure, read the recover system procedure introductory information; see Recover system procedure.

The service assistant can also be accessed by using the technician port. See Procedure: Accessing the Lenovo Storage V series service assistant from the technician port.

Attention: This service action has serious implications if not completed properly. If at any time you encounter an error that is not covered by this procedure, stop and call the support center.

Run the recovery from any node canister in the system. The node canisters must not have participated in any other system.

If the system has USB encryption, run the recovery from any node canister in the system that has a USB flash drive inserted which contains the encryption key.

If the system has key server encryption, note the following items before you proceed with the T3 recovery.
  • Run the recovery on a node that is attached to the key server. The keys are fetched remotely from the key server.
  • Run the recovery procedure on a node that has not had its hardware replaced and has not been rescued. All of the information that a node requires to fetch the key from the key server resides on the node's file system. If the contents of the node's original file system are damaged or no longer exist (for example, after a node rescue, a hardware replacement, or file system corruption), the recovery fails from this node.
Note: Each individual stage of the recovery procedure can take significant time to complete, depending on the specific configuration.
  1. Point your browser to the service IP address of one of the node canisters.

  2. Log on to the service assistant.
  3. Check that all node canisters that were members of the system are online and have candidate status.

    If any nodes display error code 550 or 578, remove their system data to place them into candidate status; see Procedure: Removing system data from a node canister.

  4. Select Recover System from the navigation.
  5. Follow the online instructions to complete the recovery procedure.
    1. Verify the date and time of the last quorum time. The time stamp must be less than 30 minutes before the failure. The time stamp format is YYYYMMDD hh:mm, where YYYY is the year, MM is the month, DD is the day, hh is the hour, and mm is the minute.
      Attention: If the time stamp is not less than 30 minutes before the failure, call the support center.
    2. Verify the date and time of the last backup date. The time stamp must be less than 24 hours before the failure. The time stamp format is YYYYMMDD hh:mm, where YYYY is the year, MM is the month, DD is the day, hh is the hour, and mm is the minute.
      Attention: If the time stamp is not less than 24 hours before the failure, call the support center.

      Changes that are made after the time of this backup date might not be restored.
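The two time stamp checks in step 5 are simple date arithmetic. The following minimal sketch shows one way to verify them by hand; the function name, the example failure time, and the example stamps are hypothetical, and treating a stamp that is later than the failure as invalid is an assumption rather than something the procedure states.

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%Y%m%d %H:%M"  # YYYYMMDD hh:mm, as shown by the service assistant

def is_recent_enough(stamp: str, failure_time: datetime, limit: timedelta) -> bool:
    """True if `stamp` is not after the failure and is less than `limit` older than it."""
    when = datetime.strptime(stamp, STAMP_FORMAT)
    age = failure_time - when
    return timedelta(0) <= age < limit

# Hypothetical values for illustration only.
failure = datetime(2024, 3, 5, 14, 20)

# Last quorum time: must be less than 30 minutes before the failure.
quorum_ok = is_recent_enough("20240305 14:02", failure, timedelta(minutes=30))

# Last backup date: must be less than 24 hours before the failure.
backup_ok = is_recent_enough("20240304 13:00", failure, timedelta(hours=24))
```

With these example values the quorum time is 18 minutes before the failure and passes, while the backup is more than 25 hours old and does not, which is the case in which the procedure says to call the support center.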

Any one of the following categories of messages might be displayed:
  • T3 successful
    The volumes are back online. Use the final checks to get your environment operational again.
  • T3 recovery completed with errors
    One or more of the volumes are offline because there was fast write data in the cache. To bring the volumes online, see Recovering from offline volumes using the CLI.
  • T3 failed
    Call the support center. Do not attempt any further action.
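The three message categories map to three different follow-up actions. For a runbook or checklist script, that mapping could be captured as in the sketch below; the enum, the dictionary, and the function are hypothetical, and the exact message text shown by the service assistant might differ from the category names used here.

```python
from enum import Enum

class RecoveryOutcome(Enum):
    SUCCESSFUL = "T3 successful"
    COMPLETED_WITH_ERRORS = "T3 recovery completed with errors"
    FAILED = "T3 failed"

# Follow-up action for each category, paraphrased from the list above.
NEXT_ACTION = {
    RecoveryOutcome.SUCCESSFUL:
        "Volumes are back online; run the final checks.",
    RecoveryOutcome.COMPLETED_WITH_ERRORS:
        "One or more volumes are offline; recover the offline volumes using the CLI.",
    RecoveryOutcome.FAILED:
        "Call the support center; do not attempt any further action.",
}

def next_action(message: str) -> str:
    """Look up the follow-up action; raises ValueError for an unknown category."""
    return NEXT_ACTION[RecoveryOutcome(message)]
```

Raising an error for an unrecognized message is deliberate: anything outside the three documented categories is not covered by this procedure and should not be handled automatically.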
Verify that the environment is operational by completing the checks that are provided in What to check after running the system recovery.

If any errors are logged in the error log after the system recovery procedure completes, use the fix procedures to resolve these errors, especially the errors that are related to offline arrays.

If the recovery completes with offline volumes, run the command-line interface (CLI) svctask recovervdisk command to access the volumes.