Performance statistics

Real-time performance statistics provide short-term status information for the system. To access these performance statistics, click Monitoring > Performance in the management GUI. In addition, the management GUI displays an overview of system performance, in the Performance section on the Dashboard.

You can use system statistics to monitor the bandwidth of all the volumes, interfaces, and MDisks that are being used on your system. You can also monitor the overall CPU utilization for the system. These statistics summarize the overall performance health of the system and can be used to monitor trends in bandwidth and CPU utilization. You can monitor changes to stable values or differences between related statistics, such as the latency between volumes and MDisks. These differences can then be evaluated further with performance diagnostic tools.

Additionally, with system-level statistics, you can quickly view the bandwidth of volumes, interfaces, and MDisks. Each of these graphs displays the current bandwidth in megabytes per second and a view of bandwidth over time. Each data point can be accessed to determine its individual bandwidth use and to evaluate whether a specific data point represents a performance impact. For example, you can monitor the interfaces to determine whether the host data-transfer rate is different from the expected rate.
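For example, you might export interface samples and compare them against the rate you expect from your hosts. The following Python sketch illustrates one such comparison; the observed rate, expected rate, and tolerance are illustrative assumptions rather than values that are supplied by the system.

# Minimal sketch: flag sample periods where the observed host data-transfer rate
# on an interface deviates from the rate you expect. The expected rate and the
# tolerance are illustrative assumptions, not values provided by the system.

def unexpected_rate(observed_mbps, expected_mbps, tolerance=0.2):
    """Return True if the observed rate differs from the expected rate by more
    than the given fractional tolerance (20% by default)."""
    return abs(observed_mbps - expected_mbps) > expected_mbps * tolerance

print(unexpected_rate(observed_mbps=310, expected_mbps=400))   # True: about 22% below expectation
print(unexpected_rate(observed_mbps=390, expected_mbps=400))   # False: within tolerance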

You can also select node-level statistics, which can help you determine the performance impact of a specific node. As with system statistics, node statistics help you to evaluate whether the node is operating within normal performance metrics.

CPU utilization

The CPU utilization graph shows the current percentage of CPU usage and peaks in utilization. If compression is being used, you can monitor the amount of CPU resources that are being used for compression and the amount that is available to the rest of the system.

The gradient at the top of the graph indicates when any data point is above 95% system CPU utilization. A single spike usually does not indicate a performance impact on the system; however, if data points are consistently above 95% utilization and I/O input is high, the system might be overloaded, which can indicate a need for more back-end storage. For compression utilization alone, it can be normal to see rates at 100%, especially if compression is used frequently on the system. If both compression utilization and I/O input are high, the system might need additional storage to accommodate the workload.
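The following Python sketch illustrates the difference between an isolated spike and sustained high utilization when you review exported CPU samples; the sample values, the threshold handling, and the window size are illustrative assumptions, not behavior of the management GUI.

# Minimal sketch: distinguish a single CPU spike from sustained high utilization.
# The sample lists, the 95% threshold, and the window size are illustrative
# assumptions, not values taken from the management GUI.

def sustained_high_cpu(samples, threshold=95.0, window=5):
    """Return True if at least `window` consecutive samples exceed `threshold` percent."""
    consecutive = 0
    for pct in samples:
        consecutive = consecutive + 1 if pct > threshold else 0
        if consecutive >= window:
            return True
    return False

# A single spike does not trip the check, but a sustained run does.
print(sustained_high_cpu([60, 97, 55, 58, 61]))          # False: isolated spike
print(sustained_high_cpu([96, 97, 98, 99, 97, 96]))      # True: consistently above 95%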

Interfaces

The Interfaces graph shows all possible interface types that can be configured on different models of the system. Depending on the model of your system and the interface adapters that are installed, data points might not be available for all the displayed interfaces. To view data points for an interface, select that interface type in the Interfaces graph. You can use this information to help identify connectivity issues that might impact performance.

In addition to host I/O, the Fibre Channel interface is used for communication within the system. The iSCSI interface is used for read and write workloads from iSCSI-attached hosts.

The SAS interface is used for read and write operations to drives. The SAS interface can show activity even when there is no incoming workload on the Fibre Channel or iSCSI interfaces, because of FlashCopy operations or background RAID activity such as data scrubbing and array rebuilding. The workload on the SAS interface can also be higher than the workload from hosts because of the additional write operations that are necessary for the different RAID types. For example, a write operation to a volume that uses a RAID-10 array requires twice the SAS interface bandwidth to accommodate the RAID mirroring.

The IP Remote Copy interface displays read and write workloads for Remote Copy traffic over IP connections, and the IP Remote Copy (Compressed) interface displays read and write workloads for Remote Copy traffic over compressed IP connections. Data is compressed as it is sent between systems in the remote-copy partnership, which can reduce the amount of bandwidth that is required for the IP connection. Compression must be enabled on both systems in the remote-copy partnership to compress data over the IP connection.
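As an illustration of the RAID write overhead described above, the following Python sketch estimates the back-end SAS write bandwidth that a given host write rate can generate; the amplification factors are simplified assumptions (parity reads, I/O size, and caching effects are ignored), not figures that are reported by the system.

# Minimal sketch: estimate back-end SAS write bandwidth from host write bandwidth
# for a few RAID levels. The amplification factors are simplified illustrations;
# real overhead depends on I/O size, stripe geometry, and caching.

WRITE_AMPLIFICATION = {
    "raid10": 2,   # each host write is mirrored to two drives
    "raid1": 2,    # same mirroring behavior as RAID-10
    "raid5": 2,    # data plus parity are written (parity reads are not counted here)
}

def backend_write_mbps(host_write_mbps, raid_level):
    """Approximate SAS write bandwidth generated by the given host write rate."""
    return host_write_mbps * WRITE_AMPLIFICATION[raid_level]

# 200 MBps of host writes to a RAID-10 array needs roughly 400 MBps on the SAS interface.
print(backend_write_mbps(200, "raid10"))   # 400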

MDisks and volumes

The MDisks and Volumes graphs on the Performance panel show four metrics: Read, Write, Read latency, and Write latency. You can use these metrics to help determine the overall performance health of the volumes and MDisks on your system. Consistent unexpected results can indicate errors in configuration, system faults, or connectivity issues. Both graphs contain the same metrics so that you can compare them and evaluate performance; however, the data points for these metrics can be quite different because of the impact of system cache, RAID overhead, and Copy Services functions. You can choose to display data points either in megabytes per second (MBps) or in I/O operations per second (IOPS).

If you select the read metric, data points on the graph indicate the average amount of data, in MBps or IOPS, for read operations that were processed over the sample period. Read metrics represent how much data the system is processing (the bandwidth). The read latency metric for volumes measures the average time, in milliseconds, that the system takes to respond to read requests over the sample period. The MDisk latency metrics measure the response time of the back-end storage. Increased workload or error recovery can cause spikes in read latency. Values that are consistently higher than expected can indicate a fault condition or an overloaded system. Volume write latency tends to be lower than MDisk write latency because of volume caching. If volume write latency is consistently equal to or greater than MDisk write latency, the drives might be overloaded and more drives might be necessary to accommodate the workload. Volume latency can also be higher than MDisk latency when hosts use large I/O sizes; by itself, this condition does not indicate a problem with the system.
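The following Python sketch illustrates the volume-to-MDisk write latency comparison described above; the sample latencies and the flagging rule are illustrative assumptions that mirror the guideline, not thresholds that are defined by the system.

# Minimal sketch: compare average volume write latency against average MDisk write
# latency over a sample period. The sample values are illustrative; the rule mirrors
# the guideline that volume write latency is normally lower than MDisk write latency
# because of the volume cache.

def average(values):
    return sum(values) / len(values)

def write_cache_effective(volume_write_latency_ms, mdisk_write_latency_ms):
    """Return True while average volume write latency stays below average MDisk write
    latency; False can suggest that the drives are overloaded."""
    return average(volume_write_latency_ms) < average(mdisk_write_latency_ms)

volume_samples = [0.4, 0.5, 0.6, 0.5]   # milliseconds
mdisk_samples = [2.1, 1.8, 2.4, 2.0]    # milliseconds
print(write_cache_effective(volume_samples, mdisk_samples))   # True: the cache is absorbing writes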

The difference between read and write IOPS shows the mixture of workload that the system is running. You can estimate the average transfer size of data that the system is handling by dividing the read and write bandwidth in MBps by the read and write operations in IOPS. This information can be used to validate or predict the disk configuration for the system, or as input to a disk-provisioning application. Write latency is the average time, in milliseconds, that the system takes to write data to volumes or MDisks; it does not include the time for write operations that keep volumes in Global Mirror relationships synchronized. As with read latency, MDisk write latency tends to be higher than volume write latency because of write caching and RAID overhead. For example, a write operation to a volume can result in additional read and write operations on the MDisk, depending on the RAID type of the array.
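The transfer-size calculation can be shown as a short worked example; the bandwidth and IOPS figures in the following Python sketch are illustrative, not measurements from a real system.

# Worked example of the transfer-size calculation described above: divide the
# bandwidth (MBps) by the operation rate (IOPS) to estimate the average I/O size.
# The sample figures are illustrative, not measurements from a real system.

def average_transfer_size_kib(mbps, iops):
    """Average I/O size in KiB, assuming 1 MB = 1024 KiB for display purposes."""
    return (mbps * 1024) / iops

# 400 MBps of read traffic at 6,400 read IOPS works out to 64 KiB per operation.
print(average_transfer_size_kib(400, 6400))   # 64.0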