Several tasks must be completed before you use the system.
The recovery procedure re-creates the old system from the quorum data. However, some things
cannot be restored, such as cached data or system data managing in-flight I/O. This latter loss
of state affects RAID arrays that manage internal storage. The detailed map of where data is
out of synchronization has been lost, meaning that all parity information must be restored, and
mirrored pairs must be brought back into synchronization. Normally this action results in either
old or stale data being used, so only writes in flight are affected. However, if the array lost
redundancy (such as syncing, degraded, or critical RAID status) before the error that requires
system recovery, then the situation is more severe. In this situation, you need to check the
internal storage:
- Parity arrays are likely syncing to restore parity; they do not have redundancy while this
operation proceeds.
- Because there is no redundancy in this process, bad blocks might be created where data is
not accessible.
- Parity arrays might be marked as corrupted. This indicates that the extent of lost data is
wider than in-flight I/O; to bring the array online, the data loss must be acknowledged.
- RAID6 arrays that were degraded before the system recovery might
require a full restore from backup. For this reason, it is important to have at least a
capacity match spare available.
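The internal storage check above amounts to finding arrays whose RAID status is anything other than online. The following Python sketch illustrates that triage. It assumes you have captured a colon-delimited array listing with a header row (for example, from the system's lsarray command). The column names mdisk_name and raid_status and the sample rows are assumptions for illustration, not output from a real system.

```python
import csv
import io

# Hypothetical sample listing: column names and rows are invented for illustration.
SAMPLE = """mdisk_id:mdisk_name:status:raid_status:raid_level
0:mdisk0:online:online:raid5
1:mdisk1:online:syncing:raid5
2:mdisk2:degraded:degraded:raid6
"""


def arrays_needing_attention(text, delim=":"):
    """Return (name, raid_status) for every array whose RAID status is not 'online'."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delim)
    return [
        (row["mdisk_name"], row["raid_status"])
        for row in reader
        if row["raid_status"] != "online"
    ]


if __name__ == "__main__":
    for name, status in arrays_needing_attention(SAMPLE):
        print(f"{name}: {status}")
```

Treating every status other than online as worth a manual look is a deliberately conservative choice; adjust the filter if your system reports additional benign states.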
Be aware of these differences about the recovered configuration:
- FlashCopy mappings are restored as
"idle_or_copied" with 0% progress. Both volumes must be restored to their original I/O
groups.
- The management ID is different. Any scripts or associated programs that refer to the
system-management ID of the system must be changed.
- Any FlashCopy mappings that were not in the
"idle_or_copied" state with 100% progress at the point of disaster have inconsistent
data on their target disks. These mappings must be restarted.
- Intersystem partnerships and relationships are not restored and must be re-created
manually.
- Consistency groups are not restored and must be re-created manually.
- Intrasystem Metro Mirror relationships are restored if all dependencies were successfully restored to their original
I/O groups.
- If hardware was replaced before the recovery, the SSL certificate might not be restored. If
it is not restored, then a new self-signed certificate is generated with a validity of 30
days. Follow the associated Directed Maintenance Procedures (DMP) for a permanent
resolution.
- The system time zone might not be restored.
- Any Global Mirror secondary volumes on the recovered system might have inconsistent data if
there was replication I/O from the primary volume that is cached on the secondary system at
the point of the disaster. A full synchronization is required when re-creating and restarting
these relationships.
- Immediately after the T3 recovery process runs, compressed disks do not know the
correct value of their used capacity. The disks initially set the capacity as the entire real
capacity. When I/O resumes, the capacity is shrunk down to the correct value.
Similar behavior occurs when you use the -autoexpand option on volumes. The real capacity of a
disk might increase slightly, caused by the same kind of behavior that affects compressed
volumes. Again, the capacity shrinks down as I/O to the disk is resumed.
- Manual actions might be necessary on the hosts to trigger them to rescan
for devices. You can complete this task by disconnecting and reconnecting the Fibre Channel
cables to each host bus adapter (HBA) port.
- Verify that all mapped volumes can be accessed by the hosts.
- Run the application consistency checks.
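The FlashCopy rule in the list above (only mappings that were "idle_or_copied" with 100% progress at the point of disaster have consistent targets) can be applied to whatever pre-disaster record of mapping states you keep. The following Python sketch assumes such a record exists; the Mapping type, the mapping names, and the sample states are hypothetical and are not read from the system.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    name: str      # hypothetical mapping name
    status: str    # state recorded before the disaster
    progress: int  # copy progress in percent, recorded before the disaster


def needs_restart(mapping):
    """True unless the mapping was idle_or_copied with 100% progress."""
    return not (mapping.status == "idle_or_copied" and mapping.progress == 100)


def mappings_to_restart(recorded):
    """Return the names of mappings whose targets must be treated as inconsistent."""
    return [m.name for m in recorded if needs_restart(m)]


if __name__ == "__main__":
    # Invented pre-disaster records for illustration only.
    recorded = [
        Mapping("fcmap0", "idle_or_copied", 100),
        Mapping("fcmap1", "copying", 40),
        Mapping("fcmap2", "stopped", 75),
    ]
    print(mappings_to_restart(recorded))
```

Because every mapping is restored as "idle_or_copied" with 0% progress, the post-recovery state cannot tell you which targets are consistent; the decision has to come from records taken before the disaster.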
For Virtual Volumes (VVols), complete the following tasks.
- After you confirm that the T3 completed successfully, restart Spectrum Control Base (SCB)
services. Use the Spectrum Control Base command "service ibm_spectrum_control start".
- Refresh the storage system information on the SCB GUI to ensure that the systems are in sync
after the recovery.
- To complete this task, log in to the SCB GUI.
- Hover over the affected storage system, select the menu launcher, and then select
Refresh. This step repopulates the system.
- Repeat this step for all Spectrum Control Base instances.
- Rescan the storage providers from within the vSphere Web Client.
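If you manage several Spectrum Control Base instances, the service start step above can be wrapped in a small helper. The following Python sketch uses the command exactly as it is quoted in the text. The helper name restart_scb and the dry-run default are assumptions, and the sketch further assumes that it runs on the SCB host with sufficient privileges. The GUI refresh and the vSphere rescan remain manual steps.

```python
import shlex
import subprocess

# Command as quoted in the text above; not verified against any particular SCB release.
SCB_START = "service ibm_spectrum_control start"


def restart_scb(dry_run=True):
    """Return the argument list; run it only when dry_run is False."""
    argv = shlex.split(SCB_START)
    if not dry_run:
        # check=True raises CalledProcessError if the service command fails.
        subprocess.run(argv, check=True)
    return argv


if __name__ == "__main__":
    # Dry run by default: show what would be executed without executing it.
    print(" ".join(restart_scb()))
```

Defaulting to a dry run keeps the helper safe to try on a workstation; pass dry_run=False only on the SCB host after you confirm that the T3 recovery completed successfully.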
For Virtual Volumes (VVols), also be aware of the following information.
FlashCopy
mappings are not restored for VVols. The implications are as follows.
- The mappings that describe the VM's snapshot relationships are lost. However, the Virtual
Volumes that are associated with these snapshots still exist, and the snapshots might still
appear on the vSphere Web Client. This outcome might have implications for your VMware backup
solution.
- Do not attempt to revert to snapshots.
- Use the vSphere Web Client to delete any snapshots for VMs on a VVol data store to free
up disk space that is being used unnecessarily.
- The targets of any outstanding 'clone' FlashCopy relationships might not function as
expected (even if the vSphere Web Client recently reported clone operations as complete). For
any VMs that are targets of recent clone operations, complete the following tasks.
- Perform data integrity checks as recommended for conventional volumes.
- If clones do not function as expected or show signs of corrupted data, take a fresh
clone of the source VM to ensure that data integrity is maintained.