Manual quorum disk override in a HyperSwap® system

A manual quorum disk override is required to recover from a rolling disaster. A rolling disaster is a rare situation in which an incident has wide scope and its effects are felt in multiple steps over an extended time period. The following example scenario describes a rolling disaster and shows how to recover from it.

An example of a rolling disaster occurs when the following events happen in sequence:
  1. The link between the two sites fails, at which point one site uses the automatic quorum feature to continue operation.
  2. The site that has control of the quorum disk fails (for example, because of a power outage).

This example leaves the second site as the only site that is potentially capable of continuing data I/O. However, it is unable to do so until it gains control of the quorum disk. The MDisks in the second site stop. Nodes at the site display the node error 551, indicating that an insufficient number of nodes are available to form a quorum in a HyperSwap® system configuration.

In this scenario, you can run the satask overridequorum command to override the automatic quorum disk selection and create a new system that contains the nodes in the second site.
Note: If a fabric disruption occurs while the satask overridequorum command is running, it is possible that only a subset of the nodes update their cluster (system) ID. The updated nodes display node error 550 and the nodes that were not updated display node error 551, which leaves the nodes assigned to two different systems. In this situation, you can run the satask overridequorum command again on one of the nodes that reported error 551. This command updates all the nodes in the two systems with a new cluster (system) ID. You can then recover data.
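
The choice of node for the second run can be sketched as follows. This is a minimal illustration in Python, not a system tool: the function name and the mapping of node names to displayed error codes are hypothetical, and only the error codes 550 and 551 come from this topic.

```python
from typing import Dict, Optional


def pick_rerun_node(node_errors: Dict[str, int]) -> Optional[str]:
    """Return a node on which to rerun the override, or None if no rerun is indicated.

    node_errors maps a hypothetical node name to the node error it displays.
    A mix of 550 and 551 suggests that the first run was interrupted, so the
    rerun targets a node that still reports 551.
    """
    codes = set(node_errors.values())
    if 550 in codes and 551 in codes:
        # sorted() only makes the choice deterministic; any 551 node qualifies.
        return sorted(name for name, err in node_errors.items() if err == 551)[0]
    return None
```

For example, with one node at 550 and two nodes at 551, the sketch returns one of the two nodes that reported 551. If every node reports the same error, it returns None because the nodes were not split across two systems.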

Enforcing conditions for a quorum

To make the satask overridequorum command available if a rolling disaster occurs, you must run the chsystem -topology hyperswap command as part of the installation process for the system. The satask overridequorum command is not available in systems that do not have the topology set to hyperswap. Before you can use the command, the following prerequisites must be met:

When these prerequisites are met and automatic quorum selection is enabled, the system attempts to assign one quorum disk in each of the three sites. If a site does not have an MDisk suitable to be a quorum disk, a quorum disk is not assigned to it.
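
The per-site assignment rule can be illustrated with a short Python sketch. The function name, the site labels, and the MDisk names are hypothetical. Picking the first suitable MDisk is an assumption made only for illustration, because this topic does not describe how the system chooses among several suitable MDisks.

```python
from typing import Dict, List, Optional, Tuple


def assign_quorum_disks(
    mdisks_by_site: Dict[str, List[Tuple[str, bool]]]
) -> Dict[str, Optional[str]]:
    """Assign at most one quorum disk per site.

    mdisks_by_site maps a site label to (mdisk_name, is_suitable) pairs.
    A site with no suitable MDisk gets None, meaning no quorum disk.
    """
    assignment: Dict[str, Optional[str]] = {}
    for site, mdisks in mdisks_by_site.items():
        suitable = [name for name, ok in mdisks if ok]
        # Assumption: take the first suitable MDisk; the real selection may differ.
        assignment[site] = suitable[0] if suitable else None
    return assignment
```

A site whose MDisks are all unsuitable maps to None in the result, which matches the rule that no quorum disk is assigned to it.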

Note: After the chsystem -topology hyperswap command is run, you cannot alter the site assignment of any controller except where that controller is a new controller that has only unmanaged MDisks.

The system also does not allow the site settings for nodes to be changed. This enforcement is required so that the satask overridequorum command can operate correctly.

When you run the chsystem -topology standard command, it is again possible to alter the site settings for nodes and controllers. However, this command disables the override quorum feature. Therefore, run the chsystem -topology hyperswap command when you complete your changes to reenable this support.
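
The ordering of this maintenance procedure can be sketched in Python as a plan of commands. The function name is hypothetical, and the site-change commands are supplied by the caller as placeholders because this topic does not name them. Only the two chsystem commands are taken from the text.

```python
from typing import List


def plan_site_change(site_change_commands: List[str]) -> List[str]:
    """Return the commands in the order they would be run.

    The plan always ends by restoring the hyperswap topology so that the
    override quorum feature is reenabled after the changes.
    """
    if not site_change_commands:
        raise ValueError("no site changes requested; leave the topology as it is")
    return [
        "chsystem -topology standard",   # allows site changes, disables override quorum
        *site_change_commands,           # caller-supplied placeholders
        "chsystem -topology hyperswap",  # reenables override quorum
    ]
```

The sketch only builds the list and does not run anything. Its purpose is to show that the hyperswap topology is restored as the final step, so the system does not stay in the standard topology, where the override is unavailable.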