
Compute dashboard

Overview

The compute dashboard aggregates the status of every VM in your organization, along with GPU, CPU, memory, network, and storage usage. It refreshes every 60 seconds and makes it easy to spot unhealthy VMs quickly.

Prerequisites
  • Resource.OrganizationResourceSummary.READ permission
  • Metric.Metric.READ permission (for the metric charts)

How to open it

From the left-hand menu, choose Compute > Dashboard.


Tabs

The dashboard consists of these tabs:

Tab              Contents
Overview + List  Alerts for unhealthy VMs plus the full VM list with status
GPU              Time-series chart of GPU utilization for running VMs
CPU              Time-series chart of CPU utilization
Memory           Time-series chart of memory utilization
Network          Time-series chart of network I/O
Storage          Block storage usage chart

Apart from Overview + List, each tab aggregates metrics across all VMs into a single time-series chart so that VMs can be compared side by side.


Time range

The time filter at the top changes the query window.

Filter                   Meaning
Today                    Midnight today to now
This week / Last week    The current week / the previous week
This month / Last month  The current month / the previous month
Specific month           A particular month

For shorter or arbitrary ranges, use the Metrics Explorer.
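
As a rough illustration of how the filters above could map to query windows, here is a minimal Python sketch. It is not the dashboard's implementation: the function name `query_window`, the filter keys, the Monday week start, and the choice to end "this week" and "this month" at the current time are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def query_window(filter_name, now, year=None, month=None):
    """Return a (start, end) pair for a hypothetical time filter key."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # Assumption: weeks start on Monday (weekday() == 0).
    week_start = midnight - timedelta(days=midnight.weekday())
    month_start = midnight.replace(day=1)

    if filter_name == "today":
        return midnight, now
    if filter_name == "this_week":
        return week_start, now
    if filter_name == "last_week":
        return week_start - timedelta(days=7), week_start
    if filter_name == "this_month":
        return month_start, now
    if filter_name == "last_month":
        # Step back one day from the 1st, then snap to that month's 1st.
        prev_start = (month_start - timedelta(days=1)).replace(day=1)
        return prev_start, month_start
    if filter_name == "specific_month":
        start = datetime(year, month, 1, tzinfo=now.tzinfo)
        # 32 days past the 1st always lands in the following month.
        end = (start + timedelta(days=32)).replace(day=1)
        return start, end
    raise ValueError(f"unknown filter: {filter_name}")
```

For example, `query_window("last_month", datetime(2024, 3, 14, 15, 30))` yields the window from 1 February 2024 to 1 March 2024.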


Spotting unhealthy VMs

On the Overview + List tab, when unhealthy VMs are detected, an Unhealthy VMs dialog appears. The Diagnostic state column distinguishes two states:

Diagnostic state  Meaning
Unhealthy         The VM's allocated GPU is not detected or is not functioning correctly
Unknown           The monitoring system can't receive state information from the VM

Click View remediation guide in the dialog to see the resolution steps. The full step-by-step procedure is also documented in VM diagnostic state.
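
If you export the VM list and want to triage it outside the dashboard, the two diagnostic states can be grouped with a few lines of Python. This is only a sketch: the record layout (dictionaries with `name` and `diagnostic_state` keys) and the `Healthy` value in the usage example are assumptions, not a documented export format.

```python
from collections import defaultdict

# The two states described in the Diagnostic state table.
ATTENTION_STATES = ("Unhealthy", "Unknown")


def group_by_diagnostic_state(vms):
    """Group VM names by diagnostic state, keeping only states that need attention.

    `vms` is a list of dicts with hypothetical keys "name" and "diagnostic_state".
    """
    groups = defaultdict(list)
    for vm in vms:
        state = vm.get("diagnostic_state")
        if state in ATTENTION_STATES:
            groups[state].append(vm["name"])
    return dict(groups)


vms = [
    {"name": "train-01", "diagnostic_state": "Healthy"},
    {"name": "train-02", "diagnostic_state": "Unhealthy"},
    {"name": "infer-01", "diagnostic_state": "Unknown"},
]
print(group_by_diagnostic_state(vms))
```

Keeping the two states separate matters because they call for different first steps: Unhealthy points at the GPU itself, while Unknown points at the path between the VM and the monitoring system.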


Auto-refresh

The dashboard auto-refreshes every 60 seconds. To refresh immediately, click the refresh button at the top. For organizations with many VMs, a refresh can take up to 30 seconds depending on data volume.


Tips

Detect idle GPUs to cut costs

  • Use the GPU tab to identify VMs whose utilization stays low for long stretches
  • Stop or delete VMs as soon as their training finishes
  • For automation, set up an alert
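
What counts as "low for long stretches" can be made concrete with a small check over utilization samples. The sketch below assumes you already have per-VM GPU utilization percentages in time order (for example, read off a chart or exported); the function name `idle_vms`, the 5% threshold, and the run length are illustrative defaults, not values the dashboard uses.

```python
def idle_vms(samples, threshold=5.0, min_idle_samples=60):
    """Return VMs whose GPU utilization stayed below `threshold` percent
    for at least `min_idle_samples` consecutive samples.

    `samples` maps a VM name to a list of utilization percentages in time order.
    """
    idle = []
    for vm, series in samples.items():
        longest = run = 0
        for value in series:
            # Extend the current low-utilization run, or reset it on a busy sample.
            run = run + 1 if value < threshold else 0
            longest = max(longest, run)
        if longest >= min_idle_samples:
            idle.append(vm)
    return idle


samples = {
    "vm-a": [1.0, 2.0, 0.0, 1.0],   # idle throughout
    "vm-b": [1.0, 90.0, 1.0, 1.0],  # a busy sample interrupts the run
}
print(idle_vms(samples, threshold=5.0, min_idle_samples=3))
```

Requiring a consecutive run rather than a low average avoids flagging VMs that alternate between short bursts of work and brief pauses.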

Catch storage-fill issues early

Drill into a single VM

  • Click a VM on the Overview + List tab to jump to that VM's Metrics tab
  • For deeper analysis, compare multiple charts in the Metrics Explorer

Next steps