Monitor ESXi, vCenter, and Virtual Machines
Alarms can monitor Virtual Machines, Hosts, Clusters, Datacenters, Datastores, Networks, vSphere Distributed Switches, Distributed Port Groups, and vCenter Server.
Two types of alarms:
- Condition Based
- Event Based
Valid Alarm Responses for Alarms created in vCenter:
- Run a Command
- Send an Email
- Send an SNMP Trap
SMTP and SNMP settings are configured in the vCenter Server Settings.
A sender account and an SMTP server name (or IP address) must be configured to allow vCenter to send email responses to alarms.
SNMP Community Name, SNMP server name (or IP address), and SNMP port must be configured to send an SNMP response to alarms.
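As a rough checklist, the required mail and SNMP settings above can be sketched as a small validation routine. The field names below are hypothetical labels chosen for illustration; they are not vCenter API property names.

```python
# Hypothetical field names: illustrative labels only, not vCenter API keys.
REQUIRED_MAIL_FIELDS = ("smtp_server", "sender_account")
REQUIRED_SNMP_FIELDS = ("community", "receiver", "port")


def missing_fields(settings, required):
    """Return the required fields that are absent or empty in settings."""
    return [name for name in required if not settings.get(name)]


# Example values are placeholders.
mail = {"smtp_server": "smtp.example.com", "sender_account": ""}
snmp = {"community": "public", "receiver": "192.0.2.10", "port": 162}

print(missing_fields(mail, REQUIRED_MAIL_FIELDS))  # the sender account is empty
print(missing_fields(snmp, REQUIRED_SNMP_FIELDS))  # all SNMP fields are present
```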
There are 4 vCenter statistics collection levels (1-4). The default statistics level for vCenter 5 is 1.
- Level 1 - Basic metrics: average usage for CPU, memory, disk, and network. Statistics for devices are not included at this level.
- Level 2 - All metrics with average, summation, and latest rollup types. Statistics for devices are not included at this level.
- Level 3 - All metrics, including devices, for all counter groups.
- Level 4 - All metrics supported by vCenter Server, including minimum and maximum rollup values.
vCenter can keep statistics in intervals of 5 minutes, 30 minutes, 2 hours, and 1 day.
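To get a feel for how much data each interval produces, the sketch below counts the samples kept per counter. The four intervals come from the line above; the retention periods (1 day, 1 week, 1 month, 1 year) are assumed defaults and should be checked against the vCenter Server Settings, and a month is approximated as 30 days.

```python
from datetime import timedelta

# (collection interval, retention period).
# Retention periods are assumed defaults, not taken from the notes.
INTERVALS = [
    (timedelta(minutes=5), timedelta(days=1)),
    (timedelta(minutes=30), timedelta(weeks=1)),
    (timedelta(hours=2), timedelta(days=30)),  # one month approximated as 30 days
    (timedelta(days=1), timedelta(days=365)),
]


def samples_retained(interval, retention):
    """Number of data points kept per counter for one collection interval."""
    return retention // interval


counts = [samples_retained(interval, retention) for interval, retention in INTERVALS]
print(counts)
```

Multiplying these counts by the number of counters collected at the chosen statistics level gives a rough sense of why higher levels grow the database quickly.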
The Database Size tool can be used to estimate the size of the vCenter database. Enter the expected number of Physical Hosts and number of Virtual Machines in inventory and an estimated space requirement will be generated based on the statistics intervals and levels selected.
None, Error, Warning, Information, Verbose, and Trivia are all vCenter logging options. The Trivia logging option captures the most data.
Logging and Statistics are configured in vCenter Server Settings.
System Logs can be exported from Home -> System Logs -> Export System Logs, from the Administration menu -> Export System Logs, or from File -> Export -> Export System Logs.
ESXi Dump Collector can be configured to dump a host’s kernel core to a network server when a system failure occurs.
resxtop can be run from the vMA to remotely monitor ESXi performance metrics.
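A remote resxtop run is often scripted in batch mode so the output can be saved as CSV. The sketch below only assembles the argument list and does not execute anything; the host names and user are placeholders, and the flags (--server, --username, --vihost, -b, -d, -n) should be confirmed against the resxtop help on your vMA before use.

```python
def resxtop_batch_args(server, username, delay_s=5, iterations=12, vihost=None):
    """Build an argument list for a batch-mode resxtop run.

    server is the ESXi host or vCenter Server to connect to. vihost is only
    needed when connecting through vCenter and names the target ESXi host.
    """
    args = ["resxtop", "--server", server, "--username", username]
    if vihost:
        args += ["--vihost", vihost]
    # -b: batch mode, -d: delay between samples in seconds, -n: number of samples
    args += ["-b", "-d", str(delay_s), "-n", str(iterations)]
    return args


# Placeholder host and user names.
print(" ".join(resxtop_batch_args("esx01.example.com", "root")))
```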
A few R/ESXTOP Metric Definitions:
- %SYS - Time spent running system services.
- %RUN - Percentage of total scheduled time.
- %RDY - Time a world is ready to run but has not been given CPU resources on which to execute.
- %MLMTD - Time a world is ready to run but cannot due to a CPU limit.
- %WAIT - Time a world spends waiting for resources.
- %IDLE - Time a vCPU is in idle loop.
- %SWPWT - Time a world is waiting for VMkernel to swap memory.
- %USED - Physical CPU time accounted to a world.
The CPU Ready performance counter identifies the time a virtual machine is ready to use CPU but cannot because CPU resources could not be allocated.
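In the performance charts, CPU Ready is reported as a summation in milliseconds, which is easier to judge as a percentage of the sampling interval. The sketch below shows that conversion; the 20-second default assumes the real-time chart interval, so pass a different interval for historical charts.

```python
def cpu_ready_percent(ready_ms, interval_s=20):
    """Convert a CPU Ready summation (ms) to a percentage of the interval.

    interval_s defaults to 20 seconds, the assumed real-time chart interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return ready_ms * 100 / (interval_s * 1000)


# 1000 ms of ready time within a 20-second sample.
print(cpu_ready_percent(1000))  # → 5.0
```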
The Kernel Command Latency data counter (usually 2-3 ms) monitors the average time spent in the VMkernel per SCSI command.
The Physical Device Command Latency counter (15-20 ms) displays the average time the physical device takes to complete a SCSI command.
Device Latency (DAVG/cmd) + Kernel Latency (KAVG/cmd) = Guest Latency (GAVG/cmd)
High latency results in lower throughput.
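The latency relationship above can be checked with a few lines of arithmetic. In this sketch, the warning limits use the upper ends of the ranges quoted in these notes (3 ms kernel, 20 ms device); treating them as thresholds is an assumption, and suitable values depend on the storage hardware.

```python
# Assumed warning limits, taken from the upper ends of the ranges in the notes.
KAVG_LIMIT_MS = 3
DAVG_LIMIT_MS = 20


def guest_latency_ms(davg_ms, kavg_ms):
    """Guest latency is device latency plus kernel latency, in ms per command."""
    return davg_ms + kavg_ms


def latency_flags(davg_ms, kavg_ms):
    """Name the components whose latency exceeds the assumed limits."""
    flags = []
    if kavg_ms > KAVG_LIMIT_MS:
        flags.append("kernel")
    if davg_ms > DAVG_LIMIT_MS:
        flags.append("device")
    return flags


print(guest_latency_ms(12, 1))  # → 13
print(latency_flags(25, 1))     # → ['device']
```

Splitting the guest figure into its two parts shows whether a slow datastore points at the array (device) or at queuing inside the host (kernel).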
Performance reports can be exported from File -> Report -> Performance.
Performance report chart options: Line Graph, Stacked Graph, or Stacked Graph (Per VM). Chart sizes: Small, Medium, and Large.