Objective
Set up Prometheus and Grafana for TON node metrics. kube-prometheus-stack is recommended because the chart includes a ServiceMonitor template for automatic scrape discovery.
Prerequisites
- Enable the metrics HTTP server in node config (config.json). The server exposes /metrics (Prometheus format), /healthz (liveness), and /readyz (readiness). If metrics is absent, the server is not started.
- Set ports.metrics in Helm values. The port must match the metrics.address port in node config.
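As a sketch, the two settings might look like the fragments below. The exact config.json schema (an address field under metrics) and the port value 9100 are assumptions, not confirmed by this chart's reference:

```json
{
  "metrics": {
    "address": "0.0.0.0:9100"
  }
}
```

```yaml
ports:
  metrics: 9100
```

The Helm port (9100 here) must equal the port in metrics.address.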
Network security
The metrics port is never exposed on public per-replica LoadBalancer services. The chart creates a dedicated internal <release>-metrics ClusterIP service instead, accessible only inside the cluster.
External metrics access can be added with a custom LoadBalancer service that targets the metrics port, but the recommended approach is an ingress with authentication (basic auth, an OAuth2 proxy, or similar) that proxies to <release>-metrics.
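A sketch of the ingress approach, assuming ingress-nginx with a pre-created basic-auth Secret; the annotation names, host, and port 9100 are assumptions for illustration:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ton-node-metrics
  annotations:
    # ingress-nginx basic-auth; "metrics-basic-auth" is a hypothetical Secret
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: metrics-basic-auth
spec:
  rules:
    - host: metrics.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: <release>-metrics   # substitute the actual release name
                port:
                  number: 9100
```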
Quick start
Minimal values to enable metrics, probes, and ServiceMonitor:
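A minimal values sketch; aside from ports.metrics, the key names (metrics.enabled, metrics.serviceMonitor.enabled) are assumptions about this chart's schema:

```yaml
ports:
  metrics: 9100          # must match the metrics.address port in config.json
metrics:
  enabled: true          # assumed toggle for the metrics server and probes
  serviceMonitor:
    enabled: true        # assumed toggle for the ServiceMonitor template
```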
ServiceMonitor configuration
Enable ServiceMonitor so kube-prometheus-stack discovers and scrapes node metrics automatically:
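A values sketch under the same assumed key layout (metrics.serviceMonitor is not confirmed here):

```yaml
metrics:
  serviceMonitor:
    enabled: true
```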
Label matching
Some Prometheus Operator installations filter ServiceMonitor resources by labels (serviceMonitorSelector in the Prometheus custom resource). If a Prometheus instance requires labels:
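A sketch assuming the chart exposes a labels map on the ServiceMonitor; the release: kube-prometheus-stack label is the common default selector, but check the serviceMonitorSelector in your Prometheus CR:

```yaml
metrics:
  serviceMonitor:
    enabled: true
    labels:
      release: kube-prometheus-stack   # must match the Prometheus serviceMonitorSelector
```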
Scrape interval
By default, ServiceMonitor inherits the global Prometheus scrape interval (typically 30s). To override:
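A sketch assuming an interval field on the chart's serviceMonitor values:

```yaml
metrics:
  serviceMonitor:
    interval: 15s   # overrides the global Prometheus scrape interval
```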
Cross-namespace monitoring
If Prometheus runs in a different namespace, set the ServiceMonitor namespace to the namespace where Prometheus looks:
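A sketch assuming a namespace field on the serviceMonitor values; "monitoring" is a placeholder for the namespace Prometheus watches:

```yaml
metrics:
  serviceMonitor:
    namespace: monitoring   # namespace where Prometheus discovers ServiceMonitors
```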
A namespaceSelector is added automatically so Prometheus can discover services in the release namespace.
Alternative: Prometheus annotations
If Prometheus Operator is not used and services are scraped through prometheus.io/* annotations:
Add prometheus.io/scrape, prometheus.io/port, and prometheus.io/path to the <release>-metrics ClusterIP service.
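The annotations on the <release>-metrics service would then look roughly like this (port 9100 is a placeholder matching ports.metrics):

```yaml
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9100"
    prometheus.io/path: "/metrics"
```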
Alternative: static scrape config
For other Prometheus setups, the metrics endpoint is available through the internal ClusterIP service:
<release>-metrics.<namespace>.svc.cluster.local
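A static scrape job sketch for a plain Prometheus config; replace the release/namespace placeholders, and note that port 9100 is an assumption that must match ports.metrics:

```yaml
scrape_configs:
  - job_name: ton-node
    metrics_path: /metrics
    static_configs:
      - targets:
          - <release>-metrics.<namespace>.svc.cluster.local:9100
```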
Grafana dashboard
The Grafana dashboard is authored in TypeScript with the Grafana Foundation SDK and generated to JSON. The dashboard source is available in the TON Rust Node Grafana source. The generated output file name is ton-node-overview.json.
The dashboard uses two multi-select template variables: network and node_id. They are populated from labels set via global_labels in node metrics config.
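Assuming global_labels is a flat label map inside the metrics block of config.json (the exact schema is not confirmed here), the labels behind the two variables could be set like this:

```json
{
  "metrics": {
    "address": "0.0.0.0:9100",
    "global_labels": {
      "network": "mainnet",
      "node_id": "node-0"
    }
  }
}
```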
Dashboard sections:
- Node Status
- Build Info
- Transactions per second
- Sync and Block Progress
- Validation and Collation
- Outbound Message Queue
- Network
- Database and Storage
Generate dashboard JSON
Run bun run generate from the TON Rust Node repository root; it writes ton-node-overview.json.
Import into Grafana
- Open Dashboards > New > Import.
- Upload ton-node-overview.json.
- Select a Prometheus data source.
- Click Import.
Edit workflow
- Edit dashboard TypeScript source files.
- Run bun run generate.
- Import the generated JSON and verify panels.
- Commit TypeScript source files. The generated JSON file is ignored by Git.
Alert rules
PrometheusRule resources can be created to trigger alerts based on TON node metrics.
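A PrometheusRule sketch for a basic availability alert; the resource name, the release label (assumed to match the Prometheus ruleSelector), and the job regex are placeholders to adapt:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: ton-node-alerts
  labels:
    release: kube-prometheus-stack   # assumed to match the Prometheus ruleSelector
spec:
  groups:
    - name: ton-node
      rules:
        - alert: TonNodeMetricsDown
          # "up" is the standard per-target scrape-health metric
          expr: up{job=~".*-metrics"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: TON node metrics endpoint is not being scraped
```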