Dashboard Cards

Cards are the building blocks of your dashboards. Each card shows a specific view of your data, and you can resize, move, and configure it.

Card Features

Every card has:

  • Drag handle - Move it around
  • Menu button - Configure, replace, or remove
  • AI button - Ask AI about this card
  • Expand button - Make it full screen
  • Refresh indicator - See when data was last updated

All 110+ Card Types

The console ships with 115+ built-in cards, and you can create more using the Card Factory. Below are the main categories.

Cluster Health Cards (7)

# | Card | What it shows
1 | Cluster Health | Health status of all clusters with green/red/gray indicators
2 | Cluster Metrics | Time-series graphs of CPU, memory, pods, nodes
3 | Cluster Focus | Detailed view of a specific cluster
4 | Cluster Comparison | Side-by-side comparison of multiple clusters
5 | Cluster Costs | Cost breakdown per cluster
6 | Upgrade Status | Version info and available upgrades
7 | Cluster Resource Tree | Hierarchical view of cluster resources

Workload Cards (6)

# | Card | What it shows
8 | Deployment Status | Donut chart of deployment health
9 | Deployment Issues | Table of deployments with problems
10 | Deployment Progress | Rollout progress gauge
11 | Pod Issues | Table of pods with problems (crashes, OOM, etc.)
12 | Top Pods | Bar chart of top resource-consuming pods
13 | App Status | Overall application health status

Compute Cards (8)

# | Card | What it shows
14 | Compute Overview | Summary of CPU, memory, nodes, pods, GPUs
15 | Resource Usage | Gauge showing CPU/memory/GPU utilization
16 | Resource Capacity | Bar chart of used vs available resources
17 | GPU Overview | Summary of GPU resources and utilization
18 | GPU Status | Donut chart of GPU allocation
19 | GPU Inventory | Table of GPU nodes with types and counts
20 | GPU Workloads | Table of workloads using GPUs
21 | GPU Usage Trend | Time-series graph of GPU utilization

Storage Cards (2)

# | Card | What it shows
22 | Storage Overview | Summary of storage resources
23 | PVC Status | Table of Persistent Volume Claims

Network Cards (3)

# | Card | What it shows
24 | Network Overview | Summary of network resources
25 | Service Status | Table of services
26 | Cluster Network | Network status per cluster

GitOps Cards (7)

# | Card | What it shows
27 | Helm Release Status | Status of Helm releases
28 | Helm History | Event timeline of Helm deployments
29 | Helm Values Diff | Compare Helm values between releases
30 | Chart Versions | Available chart version updates
31 | Kustomization Status | Status of Kustomize overlays
32 | Overlay Comparison | Compare Kustomize overlays
33 | GitOps Drift | Detect when clusters don't match git

ArgoCD Cards (3)

# | Card | What it shows
34 | ArgoCD Applications | Status of ArgoCD apps
35 | ArgoCD Sync Status | Donut chart of sync status
36 | ArgoCD Health | Health status of ArgoCD

Operator Cards (3)

# | Card | What it shows
37 | Operator Status | Status of OLM operators
38 | Operator Subscriptions | Table of operator subscriptions
39 | CRD Health | Health of Custom Resource Definitions

Namespace Cards (4)

# | Card | What it shows
40 | Namespace Overview | Summary of namespace resources
41 | Namespace Quotas | Gauge of quota usage
42 | Namespace RBAC | Table of RBAC rules
43 | Namespace Events | Event stream for namespace

Security & Events Cards (3)

# | Card | What it shows
44 | Security Issues | Table of security problems
45 | Event Stream | Live event feed
46 | User Management | Table of console users

Live Trend Cards (4)

# | Card | What it shows
47 | Events Timeline | Time-series of events
48 | Pod Health Trend | Time-series of pod health
49 | Resource Trend | Time-series of resource usage
50 | GPU Utilization | Time-series of GPU usage

AI Cards (3)

# | Card | What it shows
51 | AI Issues | Issues detected by AI
52 | Kubeconfig Audit | Audit of your kubeconfig
53 | AI Health Check | AI health check gauge

Alerting Cards (2)

# | Card | What it shows
54 | Active Alerts | Currently firing alerts
55 | Alert Rules | Table of alert rules

Cost Cards (3)

# | Card | What it shows
56 | Cluster Costs | Cost per cluster
57 | OpenCost Overview | OpenCost integration data
58 | Kubecost Overview | Kubecost integration data

Policy Cards (2)

# | Card | What it shows
59 | OPA Policies | OPA Gatekeeper policies
60 | Kyverno Policies | Kyverno policy status

Compliance Cards (3)

# | Card | What it shows
61 | Compliance Score | Overall compliance percentage (CIS, NSA, PCI)
62 | Compliance Findings | Table of compliance findings by severity
63 | Security Posture | Combined security posture overview

Provider Health Cards (1)

# | Card | What it shows
64 | Provider Health | Status of AI providers (Claude, OpenAI, Gemini) and cloud providers

Workload Monitor Cards (2)

# | Card | What it shows
65 | Workload Status | Cascading cluster/namespace/workload selector with resource details
66 | Resource Allocation | Resource allocation across clusters

llm-d Inference Cards (10)

llm-d Cards

# | Card | What it shows
67 | llm-d Request Flow | Animated request flow through the inference stack with throughput/latency metrics
68 | KV Cache Monitor | KV cache utilization, per-pod cache stats, aggregated/per-pod toggle
69 | EPP Routing | Endpoint Picker routing decisions with RPS and routing distribution
70 | P/D Disaggregation | Prefill and Decode server load, queue depth, throughput, TPOT, GPU memory
71 | llm-d Benchmarks | Stacks vs Comparison vs Latency views with TTFT, throughput, bar charts
72 | llm-d AI Insights | AI-generated insights about balanced P/D configuration and optimization
73 | llm-d Configurator | Configure inference strategies: Intelligent Scheduling, P/D Disaggregation, Wide Expert Parallelism, Variant Autoscaling

llm-d Stack

# | Card | What it shows
74 | llm-d Stack | Stack health, component status, model serving details with cluster discovery
75 | llm-d Models | Loaded models with namespace, cluster, and GPU allocation
76 | llm-d Inference Servers | Running inference servers with status and throughput

PROW CI Cards (3)

# | Card | What it shows
77 | PROW CI Monitor | Overall PROW health: success rate, job counts (running, pending, failed)
78 | PROW Jobs | Filterable job list with type, state, PR number, duration, and age
79 | PROW History | Revision history with pass/fail trends

Hardware Health Card (1)

# | Card | What it shows
80 | Hardware Health | GPU/accelerator node health with alerts, inventory, IPMI-style monitoring. Shows critical/warning counts, device search, and per-device status with disappearance tracking

Predictive Health Card (1)

# | Card | What it shows
81 | Predictive Health Monitor | AI-powered failure prediction with offline node count, GPU issues, and predicted failures. Shows confidence levels, severity, and correlates with traffic patterns

ML Job & Notebook Cards (2)

# | Card | What it shows
82 | ML Jobs | Running ML training jobs (Kubeflow, Ray, custom) with GPU count, ETA, and status
83 | ML Notebooks | Active Jupyter/notebook servers with user, resources, and status

Kagenti AI Agent Cards (7)

# | Card | What it shows
84 | Kagenti Overview | Agent count, MCP tools, builds, framework breakdown (LangGraph, CrewAI, AG2)
85 | Agent Fleet | Searchable agent list with cluster, framework, replicas, and status
86 | Agent Topology | Visual topology of agent relationships and dependencies
87 | SPIFFE Identity | SPIFFE identity coverage and certificate status
88 | Agent Builds | Build history with status (succeeded, failed, building)
89 | Agent MCP Tools | MCP tool inventory per agent
90 | Agent Logs | Aggregated agent logs with filtering

Deploy Cards (5)

# | Card | What it shows
91 | Workloads | All workloads with status, drag-to-deploy to cluster groups
92 | Cluster Groups | Target groups (production, staging, edge) with health
93 | Deployment Missions | AI-assisted deployment missions with status tracking
94 | Resource Marshall | Cascading cluster/namespace/workload selector for resource placement
95 | Deployment History | Timeline of recent deployments with rollback options

GPU Node Health Monitor (1)

# | Card | What it shows
96 | GPU Node Health Monitor | Proactive GPU health checks across 4 tiers (Critical, Standard, Full, Deep). CronJob management, per-node results, alert integration, AI Diagnose button

Flatcar Container Linux Card (1)

# | Card | What it shows
97 | Flatcar Container Linux Status | Flatcar node count, OS version distribution, update status and health

Nightly E2E Test Cards (1)

# | Card | What it shows
98 | Nightly E2E Status | Run history dots (green=pass, red=fail, amber=GPU unavailable, blue=running), per-run metadata, log/artifact links, AI Diagnose on failures

Monitoring Cards (2)

# | Card | What it shows
99 | Thanos Monitoring Status | Thanos sidecar, store gateway, compactor, and query health across clusters. Shows component status, replication lag, and query performance
100 | wasmCloud Monitoring | wasmCloud host status, running actors, capability providers, and lattice health

Community-Contributed Cards (2)

Card | What it shows
Crossplane Managed Resources | Managed resource count, provider health, composite resource status, resource table with sync/ready status
Cloud Native Buildpacks | Build counts, success rates, active builders, recent builds with duration and builder info

Additional Cards (44+)

The console includes 44+ additional specialized cards across categories like:

  • Events - Event timeline and filtering
  • Data Compliance - Data classification and compliance checks
  • Arcade - 21 Kubernetes-themed games (AI Checkers, Kube Chess, Container Tetris, etc.)
  • Card History - Track card changes over time
  • User Management - Console user management
  • Weather, Stocks, RSS - Widget-style cards for external data

Plus any custom cards you create using the Card Factory.


Visualization Types

Each card presents its data using one of these visualization types:

Type | Icon | What it looks like
Gauge | ⏱️ | Circular progress indicator
Table | 📋 | Rows and columns of data
Timeseries | 📈 | Line chart over time
Events | 📜 | Scrolling event feed
Donut | 🍩 | Pie/donut chart
Bar | 📊 | Bar chart
Status | 🚦 | Status indicators (green/yellow/red)

Adding Cards

  1. Click the Add Card button
  2. Browse by category or search
  3. Click a card to add it
  4. Drag it where you want
  5. Click the menu to configure it

Creating Custom Cards (Card Factory)

Don't see the card you need? Create your own:

  1. Open the Card Factory
  2. Choose your method:
    • AI-Assisted - Describe what you want in plain English
    • JSON - Write a declarative card definition
    • TSX Code - Write a React component (compiled at runtime)
  3. Preview your card
  4. Add it to any dashboard
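
To give a feel for the declarative (JSON) method, here is a minimal sketch that models a card definition as a typed TypeScript object with a small validator. The field names (`title`, `visualization`, `refreshIntervalSeconds`, `clusters`, `namespaces`) and the `validateCardDefinition` helper are assumptions made for this illustration. They are not the console's actual schema, so check the Card Factory itself for the real format.

```typescript
// Hypothetical card definition shape, for illustration only.
// These field names are NOT taken from the console's real schema.
type Visualization =
  | "gauge" | "table" | "timeseries" | "events" | "donut" | "bar" | "status";

interface CardDefinition {
  title: string;
  visualization: Visualization;
  refreshIntervalSeconds: number;
  clusters: string[];   // assumed convention: empty array means "all clusters"
  namespaces: string[]; // assumed convention: empty array means "all namespaces"
}

// Returns a list of human-readable problems; an empty list means the definition passed.
function validateCardDefinition(def: CardDefinition): string[] {
  const errors: string[] = [];
  if (def.title.trim() === "") {
    errors.push("title must not be empty");
  }
  if (!isFinite(def.refreshIntervalSeconds) || def.refreshIntervalSeconds <= 0) {
    errors.push("refreshIntervalSeconds must be a positive number");
  }
  return errors;
}

// An example definition: a table of pods, limited to a cluster named "staging".
const examplePodCard: CardDefinition = {
  title: "Crashing pods in staging",
  visualization: "table",
  refreshIntervalSeconds: 30,
  clusters: ["staging"],
  namespaces: [],
};
```

Validating a definition before previewing it, as in step 3, is one way to catch mistakes such as a blank title or a zero refresh interval early.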

Configuring Cards

Click the menu (three dots) on any card:

  • Configure - Change settings such as filters and the refresh interval
  • Replace - Swap for a different card type
  • Remove - Take it off your dashboard

Common Configuration Options

  • Clusters - Show data from specific clusters
  • Namespaces - Filter to specific namespaces
  • Refresh interval - How often to update
  • Show count - How many items to display
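
The options above can be pictured as a filter over the items a card would otherwise display. The following sketch assumes a hypothetical `CardConfig` object and `CardItem` rows; neither type nor the `applyCardConfig` function comes from the console, and an empty filter list is assumed to mean "no restriction".

```typescript
// Hypothetical types for illustration; the console's real configuration format may differ.
interface CardItem {
  name: string;
  cluster: string;
  namespace: string;
}

interface CardConfig {
  clusters: string[];             // Clusters option
  namespaces: string[];           // Namespaces option
  refreshIntervalSeconds: number; // Refresh interval option (not used by the filter below)
  showCount: number;              // Show count option
}

// An empty allow-list is treated as "match everything" (an assumption of this sketch).
function matchesFilter(allowed: string[], value: string): boolean {
  return allowed.length === 0 || allowed.indexOf(value) !== -1;
}

// Keeps items from the selected clusters and namespaces, then truncates to showCount.
function applyCardConfig(items: CardItem[], config: CardConfig): CardItem[] {
  return items
    .filter((item) => matchesFilter(config.clusters, item.cluster))
    .filter((item) => matchesFilter(config.namespaces, item.namespace))
    .slice(0, Math.max(0, config.showCount));
}
```

Filtering first and truncating last means the show count applies to the items that match, not to the unfiltered list.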

AI Card Suggestions

In High AI mode, the console watches what you look at and suggests new cards:

  1. AI notices you're focusing on pods
  2. It suggests adding the Pod Issues card
  3. You can Accept, Snooze (1 hour), or Dismiss

This helps your dashboard evolve with your needs!
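
The three responses to a suggestion can be modeled as a small state transition. This is a sketch under stated assumptions, not the console's implementation: the `CardSuggestion` type and both functions are hypothetical, and it assumes that a snoozed suggestion becomes visible again once the hour has passed.

```typescript
// Hypothetical model of a card suggestion; names are illustrative only.
type SuggestionStatus = "pending" | "accepted" | "snoozed" | "dismissed";
type SuggestionAction = "accept" | "snooze" | "dismiss";

interface CardSuggestion {
  cardType: string;
  status: SuggestionStatus;
  snoozedUntilMs: number | null; // epoch milliseconds, set only while snoozed
}

// Snooze lasts 1 hour, as described above.
const SNOOZE_DURATION_MS = 60 * 60 * 1000;

// Returns a new suggestion reflecting the user's choice; the input is not mutated.
function applySuggestionAction(
  suggestion: CardSuggestion,
  action: SuggestionAction,
  nowMs: number
): CardSuggestion {
  switch (action) {
    case "accept":
      return { cardType: suggestion.cardType, status: "accepted", snoozedUntilMs: null };
    case "snooze":
      return {
        cardType: suggestion.cardType,
        status: "snoozed",
        snoozedUntilMs: nowMs + SNOOZE_DURATION_MS,
      };
    case "dismiss":
      return { cardType: suggestion.cardType, status: "dismissed", snoozedUntilMs: null };
  }
}

// Assumption: a snoozed suggestion reappears after the snooze window ends.
function shouldShowSuggestion(suggestion: CardSuggestion, nowMs: number): boolean {
  if (suggestion.status === "pending") {
    return true;
  }
  if (suggestion.status === "snoozed") {
    return suggestion.snoozedUntilMs !== null && nowMs >= suggestion.snoozedUntilMs;
  }
  return false; // accepted and dismissed suggestions are not shown again
}
```

Passing the current time in as `nowMs` instead of reading the clock inside the functions keeps the logic easy to test.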