551: A cluster cannot be formed because of a lack of cluster resources.

Explanation

The node does not have sufficient connectivity to other nodes or the quorum device to form a cluster.

Attempt to repair the fabric or quorum device to establish connectivity. If a disaster occurred and the nodes at the other site cannot be recovered, the nodes at the surviving site can be allowed to form a system by using local storage.

User Response

Follow troubleshooting procedures to correct connectivity issues between the cluster nodes and the quorum devices.

  1. Check for any node errors that indicate issues with Fibre Channel connectivity. Resolve any issues.
  2. Ensure that the other nodes in the cluster are powered on and operational.
  3. Using the service assistant tool (SAT) GUI or the CLI (sainfo lsservicestatus), display the Fibre Channel port status. If any port is not active, perform the Fibre Channel port problem determination procedures.
  4. Ensure that Fibre Channel network zoning changes have not restricted communication between nodes or between the nodes and the quorum disk.
  5. Perform the problem determination procedures for the network.
  6. If the quorum disk has failed or cannot be accessed, perform the problem determination procedures for the disk controller.
  7. As a last resort, when the nodes at the other site cannot be recovered, you can allow the nodes at the surviving site to form a system by using local site storage:

    To avoid data corruption, ensure that all host servers that were previously accessing the system have had all volumes unmounted or have been rebooted. Ensure that the nodes at the other site are not operational and are unable to form a system in the future.

    After the satask overridequorum command is run, a full resynchronization of all mirrored volumes is performed when the other site is recovered. This resynchronization is likely to take many hours or days to complete.

    Contact IBM support personnel if you are unsure.

    Warning: Before continuing, confirm that you have taken the following actions. Failure to perform these actions can lead to data corruption that is undetected by the system but affects host applications.
    1. All host servers that were previously accessing the system have had all volumes unmounted or have been rebooted.
    2. The nodes at the other site are not operating as a system, and actions have been taken to prevent them from forming a system in the future.

    After these actions have been taken, the satask overridequorum command can be used to allow the nodes at the surviving site to form a system that uses local storage.
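
The port check in step 3 and the two preconditions in the warning can be sketched as a small script. This is an illustrative sketch only, not an IBM-supplied tool. The sample text, its field names (port_id, port_status), and the helper names inactive_ports and override_allowed are assumptions introduced for this example; compare them against the actual sainfo lsservicestatus output on your system before relying on anything similar.

```python
# Hypothetical excerpt of `sainfo lsservicestatus` output. The field names
# and layout are assumptions for illustration, not captured from a real node.
SAMPLE_STATUS = """\
node_status Starting
error_code 551
port_id 1
port_status Active
port_id 2
port_status Inactive
"""


def inactive_ports(status_text):
    """Return the IDs of ports whose reported status is not 'Active'."""
    ports = []
    current_id = None
    for line in status_text.splitlines():
        # Each line is assumed to be "<key> <value>".
        key, _, value = line.strip().partition(" ")
        if key == "port_id":
            current_id = value
        elif key == "port_status" and current_id is not None:
            if value.strip().lower() != "active":
                ports.append(current_id)
            current_id = None
    return ports


def override_allowed(hosts_quiesced, remote_site_fenced):
    """Both preconditions from the warning must hold before the override."""
    return hosts_quiesced and remote_site_fenced


if __name__ == "__main__":
    print("Inactive ports:", inactive_ports(SAMPLE_STATUS))
    if override_allowed(hosts_quiesced=True, remote_site_fenced=False):
        print("Preconditions met; satask overridequorum may be considered.")
    else:
        print("Do not run satask overridequorum yet.")
```

The script deliberately does not run satask overridequorum itself. It only reports whether both confirmations were given, so the decision to run the command stays with the operator or IBM support.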