Several tasks must be completed before you use the system.
The recovery procedure re-creates the old system from the quorum data. However, some things
cannot be restored, such as cached data or system data managing in-flight I/O. This latter loss of
state affects RAID arrays that manage internal storage. The detailed map about where data is out of
synchronization has been lost, meaning that all parity information must be restored, and mirrored
pairs must be brought back into synchronization. Normally this action results in old or stale
data being used, so only writes in flight are affected. However, if the array had lost redundancy (such
as syncing, degraded, or critical RAID status) before the error that requires system recovery, then
the situation is more severe. In this situation, check the internal storage:
- Parity arrays are likely syncing to restore parity; they do not have redundancy while this
operation proceeds.
- Because there is no redundancy in this process, bad blocks might be created where data is not
accessible.
- Parity arrays might be marked as corrupted. This indicates that the extent of lost data is wider
than in-flight I/O; to bring the array online, the data loss must be acknowledged.
- RAID6 arrays that were degraded before the system recovery might require
a full restore from backup. For this reason, it is important to have at least a capacity match spare
available.
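The redundancy check described above can be sketched as a small triage script. This is a minimal illustration, not a product tool: the record layout (a dict with "name" and "raid_status" keys), the function name, and the sample array names are assumptions made for the example. Only the status names syncing, degraded, and critical come from the text above, and how the records are collected from the system is deliberately left out.

```python
# Statuses named in the text as meaning the array had lost redundancy.
NON_REDUNDANT_STATUSES = {"syncing", "degraded", "critical"}


def arrays_without_redundancy(arrays):
    """Return the names of arrays whose status indicates no redundancy.

    `arrays` is a hypothetical list of dicts with "name" and "raid_status"
    keys; this layout is an assumption for the sketch.
    """
    flagged = []
    for array in arrays:
        status = array.get("raid_status", "").strip().lower()
        if status in NON_REDUNDANT_STATUSES:
            flagged.append(array["name"])
    return flagged


if __name__ == "__main__":
    # Illustrative sample data only.
    sample = [
        {"name": "array0", "raid_status": "online"},
        {"name": "array1", "raid_status": "syncing"},
        {"name": "array2", "raid_status": "degraded"},
    ]
    for name in arrays_without_redundancy(sample):
        print(f"{name}: no redundancy, check internal storage")
```

Arrays that this filter flags are the ones to inspect for bad blocks, corruption markers, or the need for a restore from backup.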
Be aware of these differences about the recovered configuration:
- FlashCopy mappings are restored as
"idle_or_copied" with 0% progress. Both volumes must be restored to their original I/O
groups.
- The management ID is different. Any scripts or associated programs that refer to the
system-management ID of the clustered system (system) must be changed.
- Any FlashCopy mappings that were not in the
"idle_or_copied" state with 100% progress at the point of disaster have inconsistent data on
their target disks. These mappings must be restarted.
- Intersystem
partnerships and relationships are not restored and must be re-created manually.
- Consistency groups are not restored and must be re-created manually.
- Intrasystem Metro Mirror relationships are restored if all dependencies were successfully restored to their original I/O
groups.
- If hardware was replaced before the recovery, the SSL certificate might not be restored. If it
is not restored, then a new self-signed certificate is generated with a validity of 30 days. Follow
the associated Directed Maintenance Procedures (DMP) for a permanent resolution.
- The system time zone might not be restored.
- Any Global Mirror secondary volumes on the recovered system might have inconsistent data if
there was replication I/O from the primary volume that is cached on the secondary system at the
point of the disaster. A full synchronization is required when re-creating and restarting these
relationships.
- Immediately after the T3 recovery process runs, compressed disks do not know the
correct value of their used capacity. The disks initially set the capacity as the entire real
capacity. When I/O resumes, the capacity shrinks down to the correct
value.
Similar behavior occurs when you
use the -autoexpand option on volumes. The real capacity of a disk
might increase slightly, caused by the same kind of behavior that affects compressed
volumes. Again, the capacity shrinks down as I/O to the disk is resumed.
- Manual actions might be necessary on the hosts to trigger them to rescan for
devices. You can complete this task by disconnecting and reconnecting the Fibre Channel cables to
each host bus adapter (HBA) port.
- Verify that all mapped volumes can be accessed by the hosts.
- Run the application consistency checks.
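The FlashCopy rule in the list above (a mapping that was not "idle_or_copied" with 100% progress at the point of disaster has inconsistent data on its target and must be restarted) can be expressed as a simple filter. The sketch below assumes that a record of the mapping states from before the disaster is available, because the recovered system reports every mapping as "idle_or_copied" with 0% progress. The dict keys "name", "status", and "progress", and the function name, are hypothetical and chosen only for illustration.

```python
def mappings_to_restart(pre_disaster_mappings):
    """Return names of mappings whose targets must be treated as inconsistent.

    A mapping is considered safe only if it was "idle_or_copied" with
    100% progress at the point of disaster. The record layout is a
    hypothetical assumption for this sketch.
    """
    restart = []
    for mapping in pre_disaster_mappings:
        complete = (
            mapping["status"] == "idle_or_copied"
            and int(mapping["progress"]) == 100
        )
        if not complete:
            restart.append(mapping["name"])
    return restart


if __name__ == "__main__":
    # Illustrative sample data only.
    saved_state = [
        {"name": "fcmap0", "status": "idle_or_copied", "progress": "100"},
        {"name": "fcmap1", "status": "copying", "progress": "42"},
    ]
    for name in mappings_to_restart(saved_state):
        print(f"{name}: target data inconsistent, restart the mapping")
```

If no pre-disaster record exists, the conservative choice is to treat every mapping as a candidate for restart.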
For Virtual Volumes (VVols), complete the following tasks.
- After you confirm that the T3 completed successfully, restart Spectrum Control Base (SCB)
services. Use the Spectrum Control Base command service ibm_spectrum_control
start.
- Refresh the storage system information on the SCB GUI to ensure that the systems are in sync
after the recovery.
- To complete this task, log in to the SCB GUI.
- Hover over the affected storage system, select the menu launcher, and then select
Refresh. This step repopulates the system.
- Repeat this step for all Spectrum Control Base instances.
- Rescan the storage providers from within the vSphere Web Client.
For Virtual Volumes (VVols), also be aware of the following information.
FlashCopy
mappings are not restored for VVols. The implications are as follows.
- The mappings that describe the VM's snapshot relationships are lost. However, the Virtual
Volumes that are associated with these snapshots still exist, and the snapshots might still appear
on the vSphere Web Client. This outcome might have implications for your VMware backup solution.
- Do not attempt to revert to snapshots.
- Use the vSphere Web Client to delete any snapshots for VMs on a VVol data store to free up disk
space that is being used unnecessarily.
- The targets of any outstanding 'clone' FlashCopy relationships might not function as expected
(even if the vSphere Web Client recently reported clone operations as complete). For any VMs that
are targets of recent clone operations, complete the following tasks.
- Perform data integrity checks as recommended for conventional volumes.
- If clones do not function as expected or show signs of corrupted data, take a fresh clone of
the source VM to ensure that data integrity is maintained.