Troubleshooting procedures help you diagnose problems.
Best practices for troubleshooting
Using certain configuration options and recording vital system access information in advance make troubleshooting easier.
Understanding medium errors and bad blocks
A storage system returns a medium error response to a host when it is unable to successfully read a block. The system responds to a host read in the same way.
RAID write response time
This function enables the RAID software layer, where redundancy exists, to prevent a misbehaving drive from having an unlimited impact on I/O performance. In addition, while there is full redundancy, the system tries to avoid immediately committing to an array rebuild because of a brief offline event from a single drive.
User interfaces for servicing your system
The system provides several user interfaces to troubleshoot, recover, or maintain your system. The interfaces provide various sets of facilities to help resolve situations that you might encounter.
Event reporting
Events that are detected are saved in an event log. As soon as an entry is made in this event log, the condition is analyzed. If any service activity is required and you have set up notifications, a notification is sent.
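The event reporting flow described above can be illustrated with a minimal sketch. It is not the system's implementation: the names Event, EventLog, notifier, and requires_service are hypothetical, and the analysis step is reduced to a flag supplied with each event.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Event:
    event_id: int           # hypothetical identifier, not a real event code
    description: str
    requires_service: bool  # stands in for the result of the analysis step


@dataclass
class EventLog:
    # Hypothetical callback; None models "notifications are not set up".
    notifier: Optional[Callable[[Event], None]] = None
    entries: List[Event] = field(default_factory=list)

    def record(self, event: Event) -> bool:
        """Save the event, analyze it, and notify if appropriate.

        Returns True when a notification was sent.
        """
        self.entries.append(event)       # 1. the event is saved in the log
        if not event.requires_service:   # 2. the condition is analyzed
            return False
        if self.notifier is None:        # 3. notifications must be set up
            return False
        self.notifier(event)
        return True


# Usage: collect notifications in a list instead of sending them anywhere.
sent: List[str] = []
log = EventLog(notifier=lambda e: sent.append(e.description))
log.record(Event(1, "informational message", requires_service=False))
log.record(Event(2, "drive fault", requires_service=True))
```

In this sketch every event is logged, but only events that need service activity reach the notifier, and only when a notifier is configured.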
Resolving a problem
These procedures help you resolve fault conditions that might exist on your system. A basic understanding of the system concepts is required.
Recover system procedure
The recover system procedure recovers the entire storage system if the system state is lost from all control enclosure node canisters. The procedure re-creates the storage system by using saved configuration data. The recovery might not be able to restore all volume data. This procedure is also known as Tier 3 (T3) recovery.
Servicing storage systems
Storage systems that are supported for attachment to the system are designed with redundant components and access paths to enable concurrent maintenance. Hosts have continuous access to their data during component failure and replacement.
Removing and replacing parts
You can remove and replace customer-replaceable units (CRUs) in control enclosures or expansion enclosures.