A quorum disk is an MDisk or a managed drive that contains a reserved area that is used exclusively for system management. A system automatically assigns quorum disk candidates. When you add new storage to a system or remove existing storage, however, it is a good practice to review the quorum disk assignments.
It is possible for a system to split into two groups, where each group contains half the original number of nodes in the system. A quorum disk determines which group of nodes stops operating and processing I/O requests. In this tie-break situation, the first group of nodes to access the quorum disk is marked as the owner of the quorum disk and, as a result, continues to operate as the system, handling all I/O requests. If the other group of nodes cannot access the quorum disk, or finds that the quorum disk is already owned by the first group, it stops operating as the system and does not handle I/O requests.
A system can have only one active quorum disk that is used for a tie-break situation. However, the system uses three quorum disks to record a backup of system configuration data to be used in the event of a disaster. The system automatically selects one active quorum disk from these three disks. The active quorum disk can be specified by using the chquorum command-line interface (CLI) command with the active parameter. To view the current quorum disk status, use the lsquorum command.
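For example, the following command sequence is a minimal sketch of that workflow; the quorum index value 1 is illustrative and depends on the output of lsquorum on your system:

# List all quorum disk candidates and show which one is currently active
lsquorum

# Make the quorum disk candidate with index 1 the active quorum disk
chquorum -active 1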
In a system with a single control enclosure or without any external managed disks, quorum is automatically assigned to drives. In this scenario, manual configuration of the quorum disks is not required.
In a system with two or more I/O groups, the drives are physically connected to only some of the node canisters. In such a configuration, drives cannot act as tie-break quorum disks; however, they can still be used to back up metadata.
If suitable external MDisks are available, these MDisks are automatically used as quorum disks that do support tie-break situations.
If no suitable external MDisks or IP quorum devices exist, the entire system might become unavailable if exactly half the node canisters in the system become inaccessible (for example, because of a hardware failure or because they are disconnected from the fabric).
In systems with exactly two control enclosures, an uncontrolled shutdown of a control enclosure might lead to the entire system becoming unavailable because two node canisters become inaccessible simultaneously. It is therefore vital that node canisters are shut down in a controlled way when maintenance is required.
These criteria are not requirements. In configurations where it is not possible to meet all of the criteria, quorum is still configured automatically.
It is possible to assign quorum disks to alternative drives by using the chquorum command. However, you cannot move quorum to a drive that creates a less optimal configuration. You can override the dynamic quorum selection by using the override yes option of the chquorum command. This option is not advised unless you are working with your support center.
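As an illustrative sketch, the following commands show how a quorum assignment might be moved and then pinned; the quorum index 2 and the MDisk name mdisk7 are examples only, not values from your configuration:

# Assign quorum index 2 to the external MDisk named mdisk7
chquorum -mdisk mdisk7 2

# Prevent dynamic quorum selection from moving this assignment
# (not advised unless you are working with your support center)
chquorum -override yes 2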
To provide protection against failures that affect an entire location (for example, a power failure), you can use active-active relationships with a configuration that splits a single system between two physical locations. For more information, see HyperSwap configuration details. For detailed guidance about HyperSwap system configuration for high-availability purposes, contact your IBM regional advanced technical specialist.
If you configure a HyperSwap system, the system automatically selects quorum disks so that one is placed in each of the three sites.