Part 4 of the "Manage VSAN with RVC" series covers commands that are useful for troubleshooting VSAN configurations. These commands can measure performance metrics and fix configuration issues:
- vsan.obj_status_report
- vsan.check_state
- vsan.fix_renamed_vms
- vsan.reapply_vsan_vmknic_config
- vsan.vm_perf_stats
To make the commands easier to read, I created marks for a cluster, a virtual machine and an ESXi host. This allows me to use ~cluster, ~vm and ~esx in my examples:
/localhost/DC> mark cluster ~/computers/VSAN-Cluster/
/localhost/DC> mark vm ~/vms/vma.virten.lab
/localhost/DC> mark esx ~/computers/VSAN-Cluster/hosts/esx1.virten.lab/
Troubleshooting VSAN
vsan.obj_status_report
Provides information about objects and their health status. With this command, you can verify that all object components are healthy, which means that the witness and all mirrors are available and synced. It also identifies possibly orphaned objects.
Usage from help page:
/localhost/DC> help vsan.obj_status_report
Print component status for objects in the cluster.
  cluster_or_host: Path to a ClusterComputeResource or HostSystem
  --print-table, -t: Print a table of object and their status, default all objects
  --filter-table, -f: Filter the obj table based on status displayed in histogram, e.g. 2/3
  --print-uuids, -u: In the table, print object UUIDs instead of vmdk and vm paths
  --ignore-node-uuid, -i: Estimate the status of objects if all comps on a given host were healthy.
  --help, -h: Show this message
Example 1 - Simple component status histogram.
We can see 45 objects with 3/3 healthy components and 23 objects with 7/7 healthy components. With default policies, 3/3 are disks (2 mirrors + witness) and 7/7 are namespace directories. We can also see an orphaned object in this example.
/localhost/DC> vsan.obj_status_report ~cluster
2014-01-03 19:10:13 +0000: Querying all VMs on VSAN ...
2014-01-03 19:10:13 +0000: Querying all objects in the system from esx1.virten.lab ...
2014-01-03 19:10:14 +0000: Querying all disks in the system ...
2014-01-03 19:10:15 +0000: Querying all components in the system ...
2014-01-03 19:10:15 +0000: Got all the info, computing table ...

Histogram of component health for non-orphaned objects

+-------------------------------------+------------------------------+
| Num Healthy Comps / Total Num Comps | Num objects with such status |
+-------------------------------------+------------------------------+
| 3/3                                 | 45                           |
| 7/7                                 | 23                           |
+-------------------------------------+------------------------------+
Total non-orphans: 68

Histogram of component health for possibly orphaned objects

+-------------------------------------+------------------------------+
| Num Healthy Comps / Total Num Comps | Num objects with such status |
+-------------------------------------+------------------------------+
| 1/3                                 | 1                            |
+-------------------------------------+------------------------------+
Total orphans: 1
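If you capture this output to a file, the histogram rows are easy to check automatically. The following is a minimal Python sketch, not part of RVC: the function name `degraded_rows` and the sample text are my own assumptions, and it only understands the plain histogram rows shown above.

```python
import re

# Matches one histogram row such as "| 3/3 | 45 |":
# healthy components / total components, then the number of objects.
ROW = re.compile(r"\|\s*(\d+)/(\d+)\s*\|\s*(\d+)\s*\|")


def degraded_rows(report: str):
    """Return (healthy, total, num_objects) for rows where healthy < total."""
    rows = [tuple(int(g) for g in m.groups()) for m in ROW.finditer(report)]
    return [row for row in rows if row[0] < row[1]]


# Hypothetical sample: the three histogram rows from Example 1.
sample = """
| 3/3 | 45 |
| 7/7 | 23 |
| 1/3 | 1 |
"""

print(degraded_rows(sample))  # [(1, 3, 1)]
```

An empty result would mean that every histogram row reports all components as healthy.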
Example 2 - Add a table with all objects and their status to the command output. That output reveals which object is actually orphaned.
/localhost/DC> vsan.obj_status_report ~cluster -t
2014-01-03 19:42:13 +0000: Querying all VMs on VSAN ...
2014-01-03 19:42:13 +0000: Querying all objects in the system from esx1.virten.lab ...
2014-01-03 19:42:14 +0000: Querying all disks in the system ...
2014-01-03 19:42:15 +0000: Querying all components in the system ...
2014-01-03 19:42:16 +0000: Got all the info, computing table ...

Histogram of component health for non-orphaned objects

+-------------------------------------+------------------------------+
| Num Healthy Comps / Total Num Comps | Num objects with such status |
+-------------------------------------+------------------------------+
| 3/3                                 | 45                           |
| 7/7                                 | 23                           |
+-------------------------------------+------------------------------+
Total non-orphans: 68

Histogram of component health for possibly orphaned objects

+-------------------------------------+------------------------------+
| Num Healthy Comps / Total Num Comps | Num objects with such status |
+-------------------------------------+------------------------------+
| 1/3                                 | 1                            |
+-------------------------------------+------------------------------+
Total orphans: 1

+-----------------------------------------------------------------------------+---------+---------------------------+
| VM/Object                                                                   | objects | num healthy / total comps |
+-----------------------------------------------------------------------------+---------+---------------------------+
| perf9                                                                       | 1       |                           |
|    [vsanDatastore] 735ec152-da7c-64b1-ebfe-eca86bf99b3f/perf9.vmx           |         | 7/7                       |
| perf8                                                                       | 1       |                           |
|    [vsanDatastore] 195ec152-92a3-491b-801d-eca86bf99b3f/perf8.vmx           |         | 7/7                       |
[...]
+-----------------------------------------------------------------------------+---------+---------------------------+
| Unassociated objects                                                        |         |                           |
[...]
|    795dc152-6faa-9ae7-efe5-001b2193b9a4                                     |         | 3/3                       |
|    d068ab52-b882-c6e8-32ca-eca86bf99b3f                                     |         | 1/3*                      |
|    ce5dc152-c5f6-3efb-9a44-001b2193b3b0                                     |         | 3/3                       |
+-----------------------------------------------------------------------------+---------+---------------------------+
+------------------------------------------------------------------+
| Legend: * = all unhealthy comps were deleted (disks present)     |
|         - = some unhealthy comps deleted, some not or can't tell |
|         no symbol = We cannot conclude any comps were deleted    |
+------------------------------------------------------------------+
Example 3 - Add a filtered table with unhealthy components only. We filter the table to show only objects with 1/3 healthy components.
/localhost/DC> vsan.obj_status_report ~cluster -t -f 1/3
[...]
+-----------------------------------------+---------+---------------------------+
| VM/Object                               | objects | num healthy / total comps |
+-----------------------------------------+---------+---------------------------+
| Unassociated objects                    |         |                           |
|    d068ab52-b882-c6e8-32ca-eca86bf99b3f |         | 1/3*                      |
+-----------------------------------------+---------+---------------------------+
[...]
Example 4 - Print the table with object UUIDs instead of vmdk and vm paths.
/localhost/DC> vsan.obj_status_report ~cluster -t -u
[...]
+-----------------------------------------+---------+---------------------------+
| VM/Object                               | objects | num healthy / total comps |
+-----------------------------------------+---------+---------------------------+
| perf9                                   | 1       |                           |
|    735ec152-da7c-64b1-ebfe-eca86bf99b3f |         | 7/7                       |
| perf8                                   | 1       |                           |
|    195ec152-92a3-491b-801d-eca86bf99b3f |         | 7/7                       |
| perf11                                  | 1       |                           |
|    d65ec152-7d27-f441-39bb-eca86bf99b3f |         | 7/7                       |
[...]
vsan.check_state
Checks the state of VMs and VSAN objects. This command can also re-register VMs whose objects are out of sync. I could not reproduce a situation where VMs are out of sync, so I have not been able to test that yet. I will update this post when I have further information.
Example 1 - Check state
/localhost/DC> vsan.check_state ~cluster
2014-01-03 19:53:36 +0000: Step 1: Check for inaccessible VSAN objects
Detected 1 objects to not be inaccessible
Detected d068ab52-b882-c6e8-32ca-eca86bf99b3f on esx2.virten.lab to be inaccessible

2014-01-03 19:53:37 +0000: Step 2: Check for invalid/inaccessible VMs

2014-01-03 19:53:37 +0000: Step 3: Check for VMs for which VC/hostd/vmx are out of sync
Did not find VMs for which VC/hostd/vmx are out of sync
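To follow up on the objects reported in Step 1, it can help to pull the UUID and host out of the check_state output. This is a minimal Python sketch under the assumption that the lines look exactly like the ones in the example above. The function name `inaccessible_objects` is hypothetical and not part of RVC.

```python
import re

# Matches lines like:
# "Detected <object uuid> on <host> to be inaccessible"
INACCESSIBLE = re.compile(r"Detected (\S+) on (\S+) to be inaccessible")


def inaccessible_objects(output: str):
    """Return a list of (object_uuid, host) pairs found in the output."""
    return INACCESSIBLE.findall(output)


# Sample lines copied from the Step 1 output of the example.
sample = (
    "Detected 1 objects to not be inaccessible\n"
    "Detected d068ab52-b882-c6e8-32ca-eca86bf99b3f on esx2.virten.lab "
    "to be inaccessible\n"
)

for uuid, host in inaccessible_objects(sample):
    print(uuid, host)
```

The summary line ("Detected 1 objects ...") is skipped on purpose because it carries no UUID or host name.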
vsan.fix_renamed_vms
This command fixes VMs that vCenter has renamed to their vmx file path after the storage became inaccessible.
This is a best-effort command, as the real VM name is unknown.
Example 1 - Fix a renamed VM
/localhost/DC> vsan.fix_renamed_vms ~/vms/%2fvmfs%2fvolumes%2fvsanDatastore%2fvma.virten.lab%2fvma.virten.lab.vmx/
Continuing this command will rename the following VMs:
  %2fvmfs%2fvolumes%2fvsanDatastore%2fvma.virten.lab%2fvma.virten.lab.vmx -> vma.virten.lab
Do you want to continue [y/N]? y
Renaming...
  Rename %2fvmfs%2fvolumes%2fvsanDatastore%2fvma.virten.lab%2fvma.virten.lab.vmx: success
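The renamed VM carries its vmx path with every "/" encoded as "%2f", which is why the original name can only be guessed. The Python sketch below illustrates one plausible way to arrive at the same guess as in the example: decode the path and take the vmx file name without its extension. I am not claiming this is how RVC implements it, and the function name `guess_vm_name` is my own.

```python
import posixpath
from urllib.parse import unquote


def guess_vm_name(renamed: str) -> str:
    """Guess a VM name from a percent-encoded vmx path (best effort)."""
    path = unquote(renamed)           # "%2f" becomes "/"
    name = posixpath.basename(path)   # keep only the file name
    if name.endswith(".vmx"):
        name = name[: -len(".vmx")]   # drop the .vmx extension
    return name


renamed = "%2fvmfs%2fvolumes%2fvsanDatastore%2fvma.virten.lab%2fvma.virten.lab.vmx"
print(guess_vm_name(renamed))  # vma.virten.lab
```

If the VM's display name was different from its vmx file name before the outage, a guess like this gives the wrong result, so check the proposed names before confirming the rename.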
vsan.reapply_vsan_vmknic_config
Re-enables VSAN on vmk ports. This can be useful when you have network configuration problems in your VSAN cluster.
Example 1 - Unbind and rebind VSAN on a host
/localhost/DC> vsan.reapply_vsan_vmknic_config ~esx
Host: esx1.virten.lab
  Reapplying config of vmk1:
    AgentGroupMulticastAddress: 224.2.3.4
    AgentGroupMulticastPort: 23451
    IPProtocol: IPv4
    InterfaceUUID: 776ca852-6660-c6d8-c9f4-001b2193b9a4
    MasterGroupMulticastAddress: 224.1.2.3
    MasterGroupMulticastPort: 12345
    MulticastTTL: 5
  Unbinding VSAN from vmknic vmk1 ...
  Rebinding VSAN to vmknic vmk1 ...
vsan.vm_perf_stats
Displays performance statistics from a virtual machine. The following metrics are supported:
- IOPS (read/write)
- Throughput in KB/s (read/write)
- Latency in ms (read/write)
Example 1 - Display performance stats (Default: 20sec interval)
/localhost/DC> vsan.vm_perf_stats ~vm
2014-01-03 20:23:44 +0000: Querying info about VMs ...
2014-01-03 20:23:44 +0000: Querying VSAN objects used by the VMs ...
2014-01-03 20:23:45 +0000: Fetching stats counters once ...
2014-01-03 20:23:46 +0000: Sleeping for 20 seconds ...
2014-01-03 20:24:06 +0000: Fetching stats counters again to compute averages ...
2014-01-03 20:24:07 +0000: Got all data, computing table
+-----------+--------------+------------------+--------------+
| VM/Object | IOPS         | Tput (KB/s)      | Latency (ms) |
+-----------+--------------+------------------+--------------+
| win7      | 325.1r/80.6w | 20744.8r/5016.8w | 2.4r/12.2w   |
+-----------+--------------+------------------+--------------+
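Each cell in this table combines a read and a write value, for example 325.1r/80.6w. If you want to log or graph the numbers, a small helper can split them. The following Python sketch is my own illustration, with the hypothetical function name `parse_rw`, and assumes the cell format shown above.

```python
import re

# A cell looks like "<read>r/<write>w", e.g. "325.1r/80.6w".
CELL = re.compile(r"^\s*(\d+(?:\.\d+)?)r/(\d+(?:\.\d+)?)w\s*$")


def parse_rw(cell: str):
    """Split a 'XXr/YYw' cell into (read, write) floats."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return float(match.group(1)), float(match.group(2))


# Values taken from the win7 row of the example.
iops_read, iops_write = parse_rw("325.1r/80.6w")
tput_read, tput_write = parse_rw("20744.8r/5016.8w")
print(iops_read, iops_write)  # 325.1 80.6
print(tput_read, tput_write)  # 20744.8 5016.8
```

The helper raises a ValueError for empty cells, such as the VM summary row that appears when objects are listed individually.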
Example 2 - Display performance stats for all VM objects with an interval of 5 seconds.
/localhost/DC> vsan.vm_perf_stats ~vm --show-objects --interval=5
2014-01-03 20:29:08 +0000: Querying info about VMs ...
2014-01-03 20:29:08 +0000: Querying VSAN objects used by the VMs ...
2014-01-03 20:29:09 +0000: Fetching stats counters once ...
2014-01-03 20:29:09 +0000: Sleeping for 5 seconds ...
2014-01-03 20:29:14 +0000: Fetching stats counters again to compute averages ...
2014-01-03 20:29:15 +0000: Got all data, computing table
+-----------------------------------------------------+-------------+----------------+--------------+
| VM/Object                                           | IOPS        | Tput (KB/s)    | Latency (ms) |
+-----------------------------------------------------+-------------+----------------+--------------+
| win7                                                |             |                |              |
|    ddc4a952-c91c-a777-771f-001b2193b9a4/win7.vmx    | 0.0r/0.3w   | 0.0r/0.2w      | 0.0r/26.1w   |
|    ddc4a952-c91c-a777-771f-001b2193b9a4/win7.vmdk   | 166.6r/3.9w | 10597.2r/13.2w | 5.2r/23.5w   |
|    ddc4a952-c91c-a777-771f-001b2193b9a4/win7_1.vmdk | 0.0r/72.6w  | 0.0r/4583.7w   | 0.0r/13.1w   |
+-----------------------------------------------------+-------------+----------------+--------------+
How do I remove an unassociated object?
Is there a way to remove these unassociated objects with VSAN2?