Starting statistics collection

The system collects statistics over an interval and creates files that can be viewed.

Introduction

For each collection interval, the management GUI creates four statistics files: one for managed disks (MDisks), named Nm_stat; one for volumes and volume copies, named Nv_stat; one for nodes, named Nn_stat; and one for SAS drives, named Nd_stat. The files are written to the /dumps/iostats directory on the node. To retrieve the statistics files from the non-configuration nodes onto the configuration node, use the svctask cpdumps command.

A maximum of 16 files of each type can be created for the node. When the 17th file is created, the oldest file for the node is overwritten.

Fields

The following fields are available for user definition:
Interval
Specify the interval in minutes between the collection of statistics. You can specify 1 - 60 minutes in increments of 1 minute.

Tables

The following tables describe the information that is reported for individual nodes and volumes.

Table 1 describes the statistics collection for MDisks, for individual nodes.

Table 1. Statistics collection for individual nodes
Statistic name Description
id Indicates the name of the MDisk for which the statistics apply.
idx Indicates the identifier of the MDisk for which the statistics apply.
rb Indicates the cumulative number of blocks of data that are read (since the node started running).
re Indicates the cumulative read external response time in milliseconds for each MDisk. The cumulative response time for disk reads is calculated by starting a timer when a SCSI read command is issued and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
ro Indicates the cumulative number of MDisk read operations that are processed (since the node started running).
rq Indicates the cumulative read queued response time in milliseconds for each MDisk. This time is measured from above the queue of commands to be sent to an MDisk because the queue depth is already full. This calculation includes the elapsed time that is taken for read commands to complete from the time they join the queue.
wb Indicates the cumulative number of blocks of data written (since the node started running).
we Indicates the cumulative write external response time in milliseconds for each MDisk. The cumulative response time for disk writes is calculated by starting a timer when a SCSI write command is issued and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
wo Indicates the cumulative number of MDisk write operations processed (since the node started running).
wq Indicates the cumulative write queued response time in milliseconds for each MDisk. This time is measured from above the queue of commands to be sent to an MDisk because the queue depth is already full. This calculation includes the elapsed time taken for write commands to complete from the time they join the queue.
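Because re and ro are cumulative counters, the average external read latency over one interval can be derived by differencing two consecutive samples. A minimal sketch; the function name and the dictionary-of-counters representation are assumptions of this example, not part of the file format:

```python
def interval_read_latency_ms(prev, curr):
    """Average external read latency in ms over one collection interval,
    derived from two cumulative samples of the 're' and 'ro' statistics."""
    ops = curr["ro"] - prev["ro"]
    if ops == 0:
        return 0.0  # no reads completed in this interval
    return (curr["re"] - prev["re"]) / ops
```

The same delta technique applies to the we/wo and rq/ro pairs.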
Table 2 describes the VDisk (volume) information that is reported for individual nodes.
Note: MDisk statistics files for nodes are written to the /dumps/iostats directory on the individual node.
Table 2. Statistic collection for volumes for individual nodes
Statistic name Description
id Indicates the volume name for which the statistics apply.
idx Indicates the identifier of the volume for which the statistics apply.
rb Indicates the cumulative number of blocks of data read (since the node started running).
rl Indicates the cumulative read response time in milliseconds for each volume. The cumulative response time for volume reads is calculated by starting a timer when a SCSI read command is received and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
rlw Indicates the worst read response time in microseconds for each volume since the last time statistics were collected. This value is reset to zero after each statistics collection sample.
ro Indicates the cumulative number of volume read operations processed (since the node started running).
wb Indicates the cumulative number of blocks of data written (since the node started running).
wl Indicates the cumulative write response time in milliseconds for each volume. The cumulative response time for volume writes is calculated by starting a timer when a SCSI write command is received and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
wlw Indicates the worst write response time in microseconds for each volume since the last time statistics were collected. This value is reset to zero after each statistics collection sample.
wo Indicates the cumulative number of volume write operations processed (since the node started running).
wou Indicates the cumulative number of volume write operations that are not aligned on a 4K boundary.
xl Indicates the cumulative read and write data transfer response time in milliseconds for each volume since the last time the node was reset. When this statistic is viewed for multiple volumes and with other statistics, it can indicate whether the latency is caused by the host, fabric, or the Lenovo Storage V series.
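Since the volume counters are likewise cumulative, per-interval rates such as IOPS can be computed by differencing consecutive Nv_stat samples. A hedged sketch, assuming the samples have already been parsed into dictionaries of counters (the function name is a choice of this example):

```python
def interval_iops(prev, curr, interval_seconds):
    """Read and write IOPS for a volume over one interval, from the
    cumulative 'ro' and 'wo' counters in two consecutive samples."""
    reads = (curr["ro"] - prev["ro"]) / interval_seconds
    writes = (curr["wo"] - prev["wo"]) / interval_seconds
    return reads, writes
```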

Table 3 describes the VDisk information that is related to Metro Mirror or Global Mirror relationships that is reported for individual nodes.

Table 3. Statistic collection for volumes that are used in Metro Mirror and Global Mirror relationships for individual nodes
Statistic name Description
gwl Indicates the cumulative secondary write latency in milliseconds. This statistic accumulates the cumulative secondary write latency for each volume. You can calculate the amount of time to recover from a failure based on this statistic and the gws statistic.
gwo Indicates the total number of overlapping volume writes. An overlapping write occurs when the logical block address (LBA) range of a write request collides with another outstanding request to the same LBA range and the write request is still outstanding to the secondary site.
gwot Indicates the total number of fixed or unfixed overlapping writes. When all nodes in all clusters are running Lenovo Storage V series version 4.3.1, this records the total number of write I/O requests received by the Global Mirror feature on the primary that overlapped. When any nodes in either cluster are running Lenovo Storage V series versions earlier than 4.3.1, this value does not increment.
gws Indicates the total number of write requests issued to the secondary site.

Table 4 describes the port information that is reported for individual nodes.

Table 4. Statistic collection for node ports
Statistic name Description
bbcz Indicates the total time in microseconds for which the buffer credit counter was at zero. Note that this statistic is only reported by 8 Gbps Fibre Channel ports. For other port types, this is 0.
cbr Indicates the bytes received from disk controllers.
cbt Indicates the bytes transmitted to disk controllers.
cer Indicates the commands received from disk controllers.
cet Indicates the commands initiated to disk controllers.
dtdc Indicates the number of transfers that experienced excessive data transmission delay.
dtdm Indicates the number of transfers that had their data transmission delay measured.
dtdt Indicates the total time in microseconds for which data transmission was excessively delayed.
hbr Indicates the bytes received from hosts.
hbt Indicates the bytes transmitted to hosts.
her Indicates the commands received from hosts.
het Indicates the commands initiated to hosts.
icrc Indicates the number of CRCs that are not valid.
id Indicates the port identifier for the node.
itw Indicates the number of transmission word counts that are not valid.
lf Indicates a link failure count.
lnbr Indicates the bytes received from other nodes in the same cluster.
lnbt Indicates the bytes transmitted to other nodes in the same cluster.
lner Indicates the commands received from other nodes in the same cluster.
lnet Indicates the commands initiated to other nodes in the same cluster.
lsi Indicates the loss-of-signal count.
lsy Indicates the loss-of-synchronization count.
pspe Indicates the primitive sequence-protocol error count.
rmbr Indicates the bytes received from other nodes in other clusters.
rmbt Indicates the bytes transmitted to other nodes in other clusters.
rmer Indicates the commands received from other nodes in other clusters.
rmet Indicates the commands initiated to other nodes in other clusters.
wwpn Indicates the worldwide port name for the node.

Table 5 describes the node information that is reported for each node.

Table 5. Statistic collection for nodes
Statistic name Description
cluster_id Indicates the name of the cluster.
cluster Indicates the name of the cluster.
cpu
busy - Indicates the total CPU average core busy milliseconds since the node was reset. This statistic reports the amount of time the processor has spent polling while waiting for work versus actually doing work. This statistic accumulates from zero.
comp - Indicates the total CPU average core busy milliseconds for compression process cores since the node was reset.
system - Indicates the total CPU average core busy milliseconds since the node was reset. This statistic reports the amount of time the processor has spent polling while waiting for work versus actually doing work. This statistic accumulates from zero. This is the same information as that provided by the cpu busy statistic, which it will eventually replace.
cpu_core
id - Indicates the CPU core ID.
comp - Indicates the per-core CPU average core busy milliseconds for compression process cores since the node was reset.
system - Indicates the per-core CPU average core busy milliseconds for system process cores since the node was reset.
id Indicates the name of the node.
node_id Indicates the unique identifier for the node.
rb Indicates the number of bytes received.
re Indicates the accumulated receive latency, excluding inbound queue time. This statistic is the latency that is experienced by the node communication layer from the time that an I/O is queued to cache until the time that the cache gives completion for it.
ro Indicates the number of messages or bulk data received.
rq Indicates the accumulated receive latency, including inbound queue time. This statistic is the latency from the time that a command arrives at the node communication layer to the time that the cache completes the command.
wb Indicates the bytes sent.
we Indicates the accumulated send latency, excluding outbound queue time. This statistic is the time from when the node communication layer issues a message out onto the Fibre Channel until the node communication layer receives notification that the message arrived.
wo Indicates the number of messages or bulk data sent.
wq Indicates the accumulated send latency, including outbound queue time. This statistic includes the entire time that data is sent. This time includes the time from when the node communication layer receives a message and waits for resources, the time to send the message to the remote node, and the time taken for the remote node to respond.
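The paired wq/we (and rq/re) counters make it possible to isolate queue time from transmission time: the average outbound queue delay per message is the difference of the two accumulated latencies divided by the send count. A sketch with hypothetical names, not a product utility:

```python
def avg_send_queue_delay(stats):
    """Average outbound queue delay per message, from cumulative node
    statistics: 'wq' includes queue time, 'we' excludes it, and 'wo'
    counts messages or bulk data sent."""
    if stats["wo"] == 0:
        return 0.0
    return (stats["wq"] - stats["we"]) / stats["wo"]
```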

Table 6 describes the statistics collection for volumes.

Table 6. Cache statistics collection for volumes and volume copies
Statistic | Acronym | Statistics for volume cache | Statistics for volume copy cache | Statistics for volume cache partition | Statistics for volume copy cache partition | Statistics for the node overall cache | Cache statistics for MDisks | Units and state

read ios | ri | Yes | Yes | | ios, cumulative
write ios | wi | Yes | Yes | | ios, cumulative
read misses | r | Yes | Yes | | sectors, cumulative
read hits | rh | Yes | Yes | | sectors, cumulative
flush_through writes | ft | Yes | Yes | | sectors, cumulative
fast_write writes | fw | Yes | Yes | | sectors, cumulative
write_through writes | wt | Yes | Yes | | sectors, cumulative
write hits | wh | Yes | Yes | | sectors, cumulative
prefetches | p | Yes | | sectors, cumulative
prefetch hits (prefetch data that is read) | ph | Yes | | sectors, cumulative
prefetch misses (prefetch pages that are discarded without any sectors read) | pm | Yes | | pages, cumulative
modified data | m | Yes | Yes | sectors, snapshot, non-cumulative
read and write cache data | v | Yes | Yes | sectors, snapshot, non-cumulative
destages | d | Yes | Yes | | sectors, cumulative
fullness Average | fav | Yes | Yes | %, non-cumulative
fullness Max | fmx | Yes | Yes | %, non-cumulative
fullness Min | fmn | Yes | Yes | %, non-cumulative
Destage Target Average | dtav | | Yes | Yes | IOs capped 9999, non-cumulative
Destage Target Max | dtmx | | Yes | IOs, non-cumulative
Destage Target Min | dtmn | | Yes | IOs, non-cumulative
Destage In Flight Average | dfav | | Yes | Yes | IOs capped 9999, non-cumulative
Destage In Flight Max | dfmx | | Yes | IOs, non-cumulative
Destage In Flight Min | dfmn | | Yes | IOs, non-cumulative
destage latency average | dav | Yes | Yes | Yes | Yes | Yes | Yes | µs capped 9999999, non-cumulative
destage latency max | dmx | | Yes | Yes | Yes | µs capped 9999999, non-cumulative
destage latency min | dmn | | Yes | Yes | Yes | µs capped 9999999, non-cumulative
destage count | dcn | Yes | Yes | Yes | Yes | Yes | ios, non-cumulative
stage latency average | sav | Yes | Yes | | Yes | µs capped 9999999, non-cumulative
stage latency max | smx | | Yes | µs capped 9999999, non-cumulative
stage latency min | smn | | Yes | µs capped 9999999, non-cumulative
stage count | scn | Yes | Yes | | Yes | ios, non-cumulative
prestage latency average | pav | Yes | | Yes | µs capped 9999999, non-cumulative
prestage latency max | pmx | | Yes | µs capped 9999999, non-cumulative
prestage latency min | pmn | | Yes | µs capped 9999999, non-cumulative
prestage count | pcn | Yes | | Yes | ios, non-cumulative
Write Cache Fullness Average | wfav | | Yes | %, non-cumulative
Write Cache Fullness Max | wfmx | | Yes | %, non-cumulative
Write Cache Fullness Min | wfmn | | Yes | %, non-cumulative
Read Cache Fullness Average | rfav | | Yes | %, non-cumulative
Read Cache Fullness Max | rfmx | | Yes | %, non-cumulative
Read Cache Fullness Min | rfmn | | Yes | %, non-cumulative
Pinned Percent | pp | Yes | Yes | Yes | Yes | Yes | % of total cache, snapshot, non-cumulative
data transfer latency average | tav | Yes | Yes | | µs capped 9999999, non-cumulative
Track Lock Latency (Exclusive) Average | teav | Yes | Yes | | µs capped 9999999, non-cumulative
Track Lock Latency (Shared) Average | tsav | Yes | Yes | | µs capped 9999999, non-cumulative
Cache I/O Control Block Queue Time | hpt | | Yes | Average µs, non-cumulative
Cache Track Control Block Queue Time | ppt | | Yes | Average µs, non-cumulative
Owner Remote Credit Queue Time | opt | | Yes | Average µs, non-cumulative
Non-Owner Remote Credit Queue Time | npt | | Yes | Average µs, non-cumulative
Admin Remote Credit Queue Time | apt | | Yes | Average µs, non-cumulative
Cdcb Queue Time | cpt | | Yes | Average µs, non-cumulative
Buffer Queue Time | bpt | | Yes | Average µs, non-cumulative
Hardening Rights Queue Time | hrpt | | Yes | Average µs, non-cumulative
Note: Any statistic whose name ends in av, mx, mn, or cn is not cumulative; these statistics reset every statistics interval. Statistics whose names do not end in av, mx, mn, or cn and that report IOs or counts are fields that contain a running total.
  • The term pages means in units of 4096 bytes per page.
  • The term sectors means in units of 512 bytes per sector.
  • The term µs means microseconds.
  • Non-cumulative means totals since the previous statistics collection interval.
  • Snapshot means the value at the end of the statistics interval (rather than an average across the interval or a peak within the interval).
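The suffix convention in the note can be checked mechanically. A small helper, offered only as an illustration of the rule above (the function name is hypothetical):

```python
def is_non_cumulative(name):
    """True if a cache statistic resets every statistics interval,
    judged by the suffix convention (av, mx, mn, cn) in the note."""
    return name.endswith(("av", "mx", "mn", "cn"))
```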

Table 7 describes the statistic collection for volume cache for individual nodes.

Table 7. Statistic collection for volume cache for individual nodes. This table describes the volume cache information that is reported for individual nodes.
Statistic name Description
cm Indicates the number of sectors of modified or dirty data that are held in the cache.
ctd Indicates the total number of cache-initiated track writes (destages) that were submitted to other components as a result of a volume cache flush or destage operation.
ctds Indicates the total number of sectors that are written for cache-initiated track writes.
ctp Indicates the number of track stages that are initiated by the cache that are prestage reads.
ctps Indicates the total number of staged sectors that are initiated by the cache.
ctrh Indicates the number of total track read-cache hits on prestage or non-prestage data. For example, a single read that spans two tracks where only one of the tracks obtained a total cache hit, is counted as one track read-cache hit.
ctrhp Indicates the number of track reads received from other components, treated as cache hits on any prestaged data. For example, if a single read spans two tracks where only one of the tracks obtained a total cache hit on prestaged data, it is counted as one track read for the prestaged data. A cache hit that obtains a partial hit on prestage and non-prestage data still contributes to this value.
ctrhps Indicates the total number of sectors that are read for reads received from other components that obtained cache hits on any prestaged data.
ctrhs Indicates the total number of sectors that are read for reads received from other components that obtained total cache hits on prestage or non-prestage data.
ctr Indicates the total number of track reads received. For example, if a single read spans two tracks, it is counted as two total track reads.
ctrs Indicates the total number of sectors that are read for reads received.
ctwft Indicates the number of track writes received from other components and processed in flush-through write mode.
ctwfts Indicates the total number of sectors that are written for writes that are received from other components and processed in flush-through write mode.
ctwfw Indicates the number of track writes received from other components and processed in fast-write mode.
ctwfwsh Indicates the number of track writes in fast-write mode that were written in write-through mode because of a lack of memory.
ctwfwshs Indicates the total number of sectors for track writes in fast-write mode that were written in write-through mode because of a lack of memory.
ctwfws Indicates the total number of sectors that are written for writes that are received from other components and processed in fast-write mode.
ctwh Indicates the number of track writes received from other components where every sector in the track obtained a write hit on already dirty data in the cache. For a write to count as a total cache hit, the entire track write data must already be marked in the write cache as dirty.
ctwhs Indicates the total number of sectors that are received from other components where every sector in the track obtained a write hit on already dirty data in the cache.
ctw Indicates the total number of track writes received. For example, if a single write spans two tracks, it is counted as two total track writes.
ctws Indicates the total number of sectors that are written for writes that are received from components.
ctwwt Indicates the number of track writes received from other components and processed in write-through write mode.
ctwwts Indicates the total number of sectors that are written for writes that are received from other components and processed in write-through write mode.
cv Indicates the number of sectors of read and write cache data that is held in the cache.

Table 8 describes the XML statistics specific to an IP Partnership port.

Table 8. XML statistics for an IP Partnership port
Statistic name Description
ipbz Indicates the average size (in bytes) of data that is being submitted to the IP partnership driver since the last statistics collection period.
iprc Indicates the total bytes that are received before any decompression takes place.
ipre Indicates the bytes retransmitted to other nodes in other clusters by the IP partnership driver.
iprt Indicates the average round-trip time in microseconds for the IP partnership link since the last statistics collection period.
iprx Indicates the bytes received from other nodes in other clusters by the IP partnership driver.
ipsz Indicates the average size (in bytes) of data that is being transmitted by the IP partnership driver since the last statistics collection period.
iptc Indicates the total bytes that are transmitted after any compression (if active) has taken place.
iptx Indicates the bytes transmitted to other nodes in other clusters by the IP partnership driver.
Table 9 describes the offloaded data transfer (ODX) VDisk and node level I/O statistics.
Table 9. ODX VDisk and node level statistics
Statistic name Acronym Description
Read cumulative ODX I/O latency orl Cumulative total read latency of ODX I/O per VDisk. The unit type is microseconds (µs).
Write cumulative ODX I/O latency owl Cumulative total write latency of ODX I/O per VDisk. The unit type is microseconds (µs).
Total transferred ODX I/O read blocks oro Cumulative total number of blocks read and successfully reported to the host by the ODX WUT (Write Using Token) command, per VDisk. It is represented in blocks.
Total transferred ODX I/O write blocks owo Cumulative total number of blocks written and successfully reported to the host by the ODX WUT command, per VDisk. It is represented in blocks.
Wasted ODX I/Os oiowp Cumulative total number of wasted blocks written by the ODX WUT command, per node. It is represented in blocks.
WUT failure count otrec Cumulative total number of failed ODX WUT commands per node. It includes WUT failures due to a token being revoked or expired.

Actions

The following actions are available to the user:

OK
Click this button to apply changes to statistics collection.
Cancel
Click this button to exit the panel without changing statistics collection.
XML formatting information
The XML statistics files have a nested structure, as seen in this raw XML from the volume (Nv_statistics) statistics file. Notice that the attribute names are similar, but because they appear in different sections of the XML, they refer to different parts of the VDisk.
<vdsk idx="0"
ctrs="213694394" ctps="0" ctrhs="2416029" ctrhps="0"
ctds="152474234" ctwfts="9635" ctwwts="0" ctwfws="152468611"
ctwhs="9117" ctws="152478246" ctr="1628296" ctw="3241448"
ctp="0" ctrh="123056" ctrhp="0" ctd="1172772"
ctwft="200" ctwwt="0" ctwfw="3241248" ctwfwsh="0"
ctwfwshs="0" ctwh="538" cm="13768758912876544" cv="13874234719731712"
gwot="0" gwo="0" gws="0" gwl="0"
id="Master_iogrp0_1"
ro="0" wo="0" rb="0" wb="0"
rl="0" wl="0" rlw="0" wlw="0" xl="0">
<!-- Vdisk/Volume statistics -->
<ca r="0" rh="0" d="0" ft="0"
wt="0" fw="0" wh="0" ri="0"
wi="0" dav="0" dcn="0" pav="0" pcn="0" teav="0"  tsav="0"  tav="0"
pp="0"/>
<cpy idx="0">
<!-- volume copy statistics -->
<ca r="0" p="0" rh="0" ph="0"
d="0" ft="0" wt="0" fw="0"
wh="0" pm="0" ri="0" wi="0"
dav="0" dcn="0" sav="0" scn="0"
pav="0" pcn="0" teav="0"  tsav="0"
tav="0"  pp="0"/>
</cpy>
</vdsk>
The <cpy idx="0"> element indicates that the statistics are in the volume copy section of the VDisk, whereas the statistics shown under Vdisk/Volume statistics are outside the cpy idx section and therefore refer to the VDisk/volume itself.
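Parsing this nesting correctly means taking the <ca> element that is a direct child of <vdsk> for volume-level statistics, and the <ca> inside each <cpy> for copy-level statistics. A sketch using Python's standard ElementTree; the function name and the returned layout are choices of this example, not part of the product:

```python
import xml.etree.ElementTree as ET

def parse_vdisk_cache(xml_text):
    """Separate the volume-level <ca> statistics from the per-copy <ca>
    statistics inside each <vdsk> element of an Nv_statistics file."""
    root = ET.fromstring(xml_text)
    result = {}
    # iter("vdsk") also yields the root itself when the root is a <vdsk>
    for vdsk in root.iter("vdsk"):
        volume_ca = vdsk.find("ca")           # direct child: volume cache
        copies = {
            cpy.get("idx"): cpy.find("ca").attrib
            for cpy in vdsk.findall("cpy")    # nested: volume copy cache
        }
        result[vdsk.get("idx")] = {
            "volume": volume_ca.attrib if volume_ca is not None else {},
            "copies": copies,
        }
    return result
```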
Similarly for the volume cache statistics for node and partitions:
<uca><ca dav="18726" dcn="1502531" dmx="749846" dmn="89"
sav="20868" scn="2833391" smx="980941" smn="3"
pav="0" pcn="0" pmx="0" pmn="0"
wfav="0" wfmx="2" wfmn="0"
rfav="0" rfmx="1" rfmn="0"
pp="0"
hpt="0" ppt="0" opt="0" npt="0"
apt="0" cpt="0" bpt="0" hrpt="0"
/><partition id="0"><ca dav="18726" dcn="1502531" dmx="749846" dmn="89"
fav="0" fmx="2" fmn="0"
dfav="0" dfmx="0" dfmn="0"
dtav="0" dtmx="0" dtmn="0"
pp="0"/></partition></uca>
This output describes the volume cache node statistics; the statistics within <partition id="0"> are for partition 0.

Replacing <uca> with <lca> means that the statistics are for volume copy cache partition 0.
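The same section-based reading applies here: the section an element appears in (<uca> or <lca>), not the attribute name, determines what it describes. A hedged sketch that gathers the per-partition dav values from both sections (the function name is an assumption of this example):

```python
import xml.etree.ElementTree as ET

def partition_destage_latency(xml_text):
    """Collect the per-partition average destage latency ('dav') from
    the volume cache (<uca>) and volume copy cache (<lca>) sections."""
    root = ET.fromstring(xml_text)
    out = {}
    for section in ("uca", "lca"):
        for elem in root.iter(section):
            for part in elem.iter("partition"):
                ca = part.find("ca")
                out[(section, part.get("id"))] = int(ca.get("dav", "0"))
    return out
```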