Nodes can notify their hosts of events for SCSI commands that are issued.
Some events are part of the SCSI architecture and are handled by the host application or device drivers without reporting an event. Some events, such as read and write I/O events and events that are associated with the loss of nodes or loss of access to backend devices, cause application I/O to fail. To help troubleshoot these events, SCSI commands are returned with the Check Condition status and a 32-bit event identifier is included with the sense information. The identifier relates to a specific event in the event log.
If the host application or device driver captures and stores this information, you can relate the application failure to the event log.
Table 1 describes the SCSI status and codes that are returned by the nodes.
Status | Code | Description |
---|---|---|
Good | 00h | The command was successful. |
Check condition | 02h | The command failed and sense data is available. |
Condition met | 04h | N/A |
Busy | 08h | An Auto-Contingent Allegiance condition exists and the command specified NACA=0. |
Intermediate | 10h | N/A |
Intermediate - condition met | 14h | N/A |
Reservation conflict | 18h | Returned as specified in SPC-2 and SAM-2 where a reserve or persistent reserve condition exists. |
Task set full | 28h | The initiator has at least one task queued for that LUN on this port. |
ACA active | 30h | This code is reported as specified in SAM-2. |
Task aborted | 40h | This code is returned if TAS is set in the control mode page 0Ch. The node has a default setting of TAS=0, which cannot be changed; therefore, the node does not report this status. |
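A host-side tool that captures the status byte can map it to the names in Table 1. The following is a minimal sketch, not part of the product; the names `SCSI_STATUS`, `describe_status`, and `has_sense_data` are hypothetical and chosen only for illustration.

```python
# Status byte values and names copied from Table 1.
SCSI_STATUS = {
    0x00: "Good",
    0x02: "Check condition",
    0x04: "Condition met",
    0x08: "Busy",
    0x10: "Intermediate",
    0x14: "Intermediate - condition met",
    0x18: "Reservation conflict",
    0x28: "Task set full",
    0x30: "ACA active",
    0x40: "Task aborted",
}


def describe_status(code: int) -> str:
    """Return the Table 1 name for a status byte, or a placeholder if unlisted."""
    return SCSI_STATUS.get(code, f"Unknown status {code:02X}h")


def has_sense_data(code: int) -> bool:
    """Table 1 lists Check condition (02h) as the status with sense data available."""
    return code == 0x02
```

A caller would typically test `has_sense_data(status)` first and inspect the sense buffer only when it returns `True`.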
Nodes notify the hosts of events on SCSI commands. Table 2 defines the SCSI sense keys, codes, and qualifiers that are returned by the nodes.
Key | Code | Qualifier | Definition | Description |
---|---|---|---|---|
2h | 04h | 01h | Not Ready. The logical unit is in the process of becoming ready. | The node lost sight of the system and cannot perform I/O operations. The additional sense does not have additional information. |
2h | 04h | 0Ch | Not Ready. The target port is in the state of unavailable. | The following conditions are possible: |
3h | 00h | 00h | Medium event | This is only returned for read or write I/Os. The I/O suffered an event at a specific LBA within its scope. The location of the event is reported within the sense data. The additional sense also includes a reason code that relates the event to the corresponding event log entry. For example, a RAID controller event or a migrated medium event. |
4h | 08h | 00h | Hardware event. A command to logical unit communication failure has occurred. | The I/O suffered an event that is associated with an I/O event that is returned by a RAID controller. The additional sense includes a reason code that points to the sense data that is returned by the controller. This is only returned for I/O type commands. This event is also returned from FlashCopy target volumes in the prepared and preparing state. |
5h | 25h | 00h | Illegal request. The logical unit is not supported. | The logical unit does not exist or is not mapped to the sender of the command. |
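The key, code, and qualifier in Table 2 can be pulled out of a captured sense buffer and matched against the table. The sketch below assumes that the buffer uses the fixed sense data format defined by SPC (response code 70h or 71h), in which the sense key is the low four bits of byte 2, the code is byte 12, and the qualifier is byte 13. That format assumption is not stated in this topic, and the names `TABLE_2`, `SenseInfo`, and `parse_fixed_sense` are hypothetical.

```python
from typing import NamedTuple

# (key, code, qualifier) -> definition, copied from Table 2.
TABLE_2 = {
    (0x2, 0x04, 0x01): "Not Ready. The logical unit is in the process of becoming ready.",
    (0x2, 0x04, 0x0C): "Not Ready. The target port is in the state of unavailable.",
    (0x3, 0x00, 0x00): "Medium event",
    (0x4, 0x08, 0x00): "Hardware event. A command to logical unit communication failure has occurred.",
    (0x5, 0x25, 0x00): "Illegal request. The logical unit is not supported.",
}


class SenseInfo(NamedTuple):
    key: int
    code: int
    qualifier: int
    definition: str


def parse_fixed_sense(sense: bytes) -> SenseInfo:
    """Extract key, code, and qualifier from fixed-format sense data (assumed format)."""
    if len(sense) < 14:
        raise ValueError("need at least 14 bytes of sense data")
    response_code = sense[0] & 0x7F  # bit 7 of byte 0 is the VALID flag, not part of the code
    if response_code not in (0x70, 0x71):
        raise ValueError(f"not fixed-format sense data: response code {response_code:02X}h")
    key = sense[2] & 0x0F   # sense key: low nibble of byte 2
    code = sense[12]        # additional sense code
    qualifier = sense[13]   # additional sense code qualifier
    definition = TABLE_2.get((key, code, qualifier), "Not listed in Table 2")
    return SenseInfo(key, code, qualifier, definition)
```

Combinations that do not appear in Table 2 are reported as unlisted instead of being guessed.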
The reason code appears in bytes 20-23 of the sense data and relates the event to a specific event log entry. The field is a 32-bit unsigned number that is presented with the most significant byte first. Table 3 lists the reason codes and their definitions.
If the reason code is not listed in Table 3, the code refers to a specific event in the event log and corresponds to the sequence number of that event log entry.
Reason code (decimal) | Description |
---|---|
40 | The resource is part of a stopped FlashCopy mapping. |
50 | The resource is part of a Metro Mirror or Global Mirror relationship and the secondary LUN is offline. |
51 | The resource is part of a Metro Mirror or Global Mirror relationship and the secondary LUN is read-only. |
60 | The node is offline. |
71 | The resource is not bound to any domain. |
72 | The resource is bound to a domain that was recreated. |
73 | The resource is running on a node that is contracted out for a reason that is not attributable to any path going offline. |
80 | Wait for the repair to complete, or delete the volume. |
81 | Wait for the validation to complete, or delete the volume. |
82 | An offline thin-provisioned volume caused data to be pinned in the directory cache. Adequate performance cannot be achieved for other thin-provisioned volumes, so they are taken offline. |
85 | The volume is taken offline because checkpointing to the quorum disk failed. |
86 | The repairvdiskcopy -medium command created a virtual medium error where the copies differed. |
93 | An offline RAID-5 or RAID-6 array caused in-flight-write data to be pinned. Good performance cannot be achieved for other arrays, so they are taken offline. |
94 | An array MDisk that is part of the volume is taken offline because checkpointing to the quorum disk failed. |
95 | This reason code is used in MDisk bad block dump files to indicate that the data loss was caused by having to resync parity with rebuilding strips or some other RAID algorithm reason due to multiple failures. |
96 | A RAID-6 array MDisk that is part of the volume is taken offline because an internal metadata table is full. |
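The reason code can be read from bytes 20-23 of a captured sense buffer as a big-endian 32-bit unsigned integer and then checked against Table 3. The following is a minimal sketch under that description; the names `REASON_CODES`, `extract_reason_code`, and `explain_reason_code` are hypothetical, and the dictionary text is abridged from the table.

```python
# Reason code (decimal) -> short description, abridged from Table 3.
REASON_CODES = {
    40: "The resource is part of a stopped FlashCopy mapping.",
    50: "Metro Mirror or Global Mirror relationship: the secondary LUN is offline.",
    51: "Metro Mirror or Global Mirror relationship: the secondary LUN is read-only.",
    60: "The node is offline.",
    71: "The resource is not bound to any domain.",
    72: "The resource is bound to a domain that was recreated.",
    73: "Running on a node that is contracted out.",
    80: "Wait for the repair to complete, or delete the volume.",
    81: "Wait for the validation to complete, or delete the volume.",
    82: "Thin-provisioned volumes taken offline (data pinned in directory cache).",
    85: "Volume offline: checkpointing to the quorum disk failed.",
    86: "Virtual medium error created by repairvdiskcopy -medium.",
    93: "Arrays taken offline (in-flight-write data pinned).",
    94: "Array MDisk offline: checkpointing to the quorum disk failed.",
    95: "Data loss from parity resync after multiple failures (bad block dump files).",
    96: "RAID-6 array MDisk offline: internal metadata table is full.",
}


def extract_reason_code(sense: bytes) -> int:
    """Read bytes 20-23 as a 32-bit unsigned number, most significant byte first."""
    if len(sense) < 24:
        raise ValueError("sense data is too short to contain a reason code")
    return int.from_bytes(sense[20:24], byteorder="big", signed=False)


def explain_reason_code(code: int) -> str:
    """Return the Table 3 description, or treat the code as an event log sequence number."""
    if code in REASON_CODES:
        return REASON_CODES[code]
    return f"Not in Table 3; event log sequence number {code}"
```

For example, `explain_reason_code(extract_reason_code(sense))` yields either a Table 3 description or the sequence number to look up in the event log.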