Hardware Metrics Sourcing: Multi-Exporter Configuration#

Overview#

Grafazos provides configurable hardware metrics sourcing for Octez Grafana dashboards, allowing users to combine any set of metric exporters:

netdata (default): Application-aware metrics from the netdata daemon
process-exporter: Process-level metrics (CPU, memory, disk I/O, file descriptors) via namedprocess_namegroup_* metrics
node-exporter: System-level metrics (storage, network) via node_* metrics
local-storage-exporter: Size on disk via local_storage_pv_used_bytes

HARDWARE_SRC accepts a comma-separated list of exporters. When multiple exporters provide the same metric, all variants appear in the same Grafana panel with an [exporter-name] legend suffix.

Key Features#

Multi-Exporter Panels: Multiple exporters can contribute queries to the same panel
Legend Suffixes: When multiple exporters match, legends are suffixed with [exporter-name]
Granular Selection: Choose exactly which exporters to use (e.g., netdata,node-exporter)
Side-by-Side Comparison: A comparison dashboard shows netdata metrics and Prometheus-exporter metrics side by side
Build Integration: Exporters are selected through the existing Makefile build, using the HARDWARE_SRC variable

Motivation#

Hardware metrics can be sourced from different monitoring backends. This system supports netdata, process-exporter, node-exporter, and local-storage-exporter, allowing users to combine them as needed.

Solution Approach#

This implementation provides flexible metric sourcing through:

selectMetrics() pattern: Each hardware metric function declares which exporters can provide it via a mapping object. All matching exporters contribute queries to the panel.
Environment Variable Control: Users specify a comma-separated list of exporters via HARDWARE_SRC
Exporter Mapping: Each query in a metric's mapping is explicitly tied to the exporter that provides it (netdata, process-exporter, node-exporter, or local-storage-exporter)
Comparison Dashboard: Visual reference showing netdata and the Prometheus exporters side by side for evaluation

Configuration Parameters#

HARDWARE_SRC Parameter#

Type: Environment variable (comma-separated list)

Default Value: netdata

Valid Exporter Names:

netdata - Application-aware metrics from netdata daemon
process-exporter - Process-level metrics (namedprocess_namegroup_*)
node-exporter - System-level metrics (node_filesystem_*, node_network_*)
local-storage-exporter - Size on disk metrics (local_storage_pv_used_bytes)

How It Works:

User sets environment variable: export HARDWARE_SRC=process-exporter,node-exporter
Makefile receives the value, falling back to the default when the variable is unset: HARDWARE_SRC ?= netdata
Passed to jsonnet compiler: --ext-str hardware_src="$(HARDWARE_SRC)"
Parsed in base.jsonnet into a list of exporters (hardware_exporters)
selectMetrics() finds all matching exporters and returns queries for each
When multiple match, legend suffixes are added automatically
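
The steps above can be modelled outside the build. The following Python sketch is not the Grafazos jsonnet code; it only mirrors the behaviour described here. The function names parse_hardware_src and select_metrics, the shape of the mapping, the app_group value, and the process-exporter query string are all assumptions made for illustration.

```python
def parse_hardware_src(value="netdata"):
    """Split a comma-separated HARDWARE_SRC value into exporter names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def select_metrics(mapping, exporters, legend):
    """Return one (query, legend) pair per selected exporter that provides the metric.

    `mapping` maps an exporter name to a query string (hypothetical shape).
    """
    matches = [(name, mapping[name]) for name in exporters if name in mapping]
    # Legends are suffixed only when several exporters feed the same panel.
    suffix = len(matches) > 1
    return [
        (query, f"{legend} [{name}]" if suffix else legend)
        for name, query in matches
    ]


# Illustrative mapping; the label value and the second query are assumptions.
cpu = {
    "netdata": 'netdata_app_cpu_utilization_percentage_average{app_group="octez"}',
    "process-exporter": "rate(namedprocess_namegroup_cpu_seconds_total[1m]) * 100",
}

selected = parse_hardware_src("netdata,process-exporter")
for query, label in select_metrics(cpu, selected, "Cpu load"):
    print(label, "<-", query)
```

With a single matching exporter the legend is returned unchanged, which matches the rule that suffixes only appear when multiple exporters match.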

Example Usage:

# Build with default netdata metrics
make

# Build with both Prometheus exporters
HARDWARE_SRC=process-exporter,node-exporter make

# Combine netdata with node-exporter (process metrics come from netdata only; storage and network panels show both, with legend suffixes)
HARDWARE_SRC=netdata,node-exporter make

# Show both netdata and process-exporter in the same panels (with legend suffixes)
HARDWARE_SRC=netdata,process-exporter make

# Build comparison dashboard (shows both side-by-side, independent of HARDWARE_SRC)
make compare-hardware.jsonnet

LOGS_LABEL Parameter#

Type: Environment variable

Default Value: job

Purpose: Configure which label field to filter logs by

How It Works:

User sets environment variable: export LOGS_LABEL=service
Makefile passes to jsonnet: --ext-str logs_label="$(LOGS_LABEL)"
Used in log filter queries: {$logs_label="octez-node"}, where $logs_label stands for the configured label name (job by default)
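
As a rough illustration of the substitution, the Python sketch below builds the selector string for a given label. The function name log_selector is hypothetical, and the actual substitution happens in the jsonnet sources, not in Python.

```python
def log_selector(logs_label="job", value="octez-node"):
    """Build a log stream selector such as {job="octez-node"}."""
    return '{%s="%s"}' % (logs_label, value)


print(log_selector())           # default label
print(log_selector("service"))  # LOGS_LABEL=service
```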

Example Usage:

# Filter logs by 'job' label (default)
make octez-with-logs.jsonnet

# Filter logs by 'service' label
LOGS_LABEL=service make octez-with-logs.jsonnet

# Filter logs by 'app' label
LOGS_LABEL=app make octez-with-logs.jsonnet

Metrics Comparison#

Comparison#

Metric Category      netdata  process-exporter  node-exporter  local-storage-exporter
Disk I/O (Process)   ✅       ✅                -              -
CPU Utilization      ✅       ✅                -              -
Memory Usage         ✅       ✅                -              -
Open FDs             ✅       ✅                -              -
Storage/Filesystem   ✅       -                 ✅             ✅
Network I/O          ✅       -                 ✅             -
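
The table can also be read as a coverage check: for a given HARDWARE_SRC value, which metric categories have no selected exporter at all? The Python sketch below encodes the table as a dictionary and answers that question. The names COVERAGE and uncovered are hypothetical, and this page does not say how Grafazos renders a panel with no matching exporter, so the sketch only reports the gap.

```python
# Providers per metric category, transcribed from the table above.
COVERAGE = {
    "Disk I/O (Process)": {"netdata", "process-exporter"},
    "CPU Utilization": {"netdata", "process-exporter"},
    "Memory Usage": {"netdata", "process-exporter"},
    "Open FDs": {"netdata", "process-exporter"},
    "Storage/Filesystem": {"netdata", "node-exporter", "local-storage-exporter"},
    "Network I/O": {"netdata", "node-exporter"},
}


def uncovered(hardware_src):
    """List categories that none of the selected exporters provides."""
    selected = {name.strip() for name in hardware_src.split(",") if name.strip()}
    return [
        category
        for category, providers in COVERAGE.items()
        if not providers & selected
    ]


print(uncovered("process-exporter"))
print(uncovered("process-exporter,node-exporter"))
```

Selecting process-exporter alone leaves the storage and network categories without a source, which is why the examples on this page pair it with node-exporter.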

Disk I/O Metrics#

netdata Approach:

// Logical disk I/O at app_group level
'netdata_app_disk_logical_io_KiB_persec_average{dimension="%s", app_group="%s"}'

Metric Name: netdata_app_disk_logical_io_KiB_persec_average
Granularity: Application group aggregation
Dimensions: “reads” and “writes” as dimension labels
Units: KiB/sec (kibibytes per second)
Coverage: Logical disk operations per app_group

Strengths:

Integrated with application group awareness
Simple aggregation level

Weaknesses:

Doesn’t show individual process breakdown
Limited to application group filtering
Logical I/O only (no physical device information)

process-exporter Approach:

// Physical disk I/O with process-level granularity
'rate(namedprocess_namegroup_read_bytes_total[1m])/1024',
'rate(namedprocess_namegroup_write_bytes_total[1m])/1024'

Metric Name: namedprocess_namegroup_*_bytes_total (counters)
Granularity: Per-process-group with groupname label
Dimensions: Read and write as separate metric series
Units: Bytes; rate() yields bytes/sec and the division by 1024 converts to KiB/sec
Coverage: Actual bytes read/written per process group

Strengths:

Per-process-group tracking with explicit groupname label
Shows actual bytes transferred (physical I/O)
Standard Prometheus counter pattern (monotonic)
Compatible with external monitoring infrastructure

Weaknesses:

Requires rate() calculation for per-second metrics
Process-exporter must be deployed separately
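
To make the counter-to-rate conversion concrete, the Python sketch below computes an average KiB/sec value from two samples of a byte counter. It is a simplification: Prometheus's rate() also handles counter resets and extrapolates to the window boundaries, which this sketch does not. The function name kib_per_sec and the sample values are made up for illustration.

```python
def kib_per_sec(first, last):
    """Average rate in KiB/s between two (timestamp_seconds, counter_bytes) samples."""
    (t0, v0), (t1, v1) = first, last
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    # Bytes per second over the interval, then bytes -> KiB.
    return (v1 - v0) / (t1 - t0) / 1024


# 3 MiB read over a 60-second window (illustrative numbers).
print(kib_per_sec((0, 1_048_576), (60, 4_194_304)))
```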

CPU Utilization Metrics#

netdata Approach:

'netdata_app_cpu_utilization_percentage_average{app_group="%s"}'

Metric Name: netdata_app_cpu_utilization_percentage_average
Type: Gauge (direct percentage)
Units: Percent (0-100 per core, so up to 400 on a 4-core machine)
Coverage: CPU usage aggregated per app_group
Time Aggregation: Already averaged by netdata

process-exporter Approach:

'namedprocess_namegroup_cpu_seconds_total'

Metric Name: namedprocess_namegroup_cpu_seconds_total
Type: Counter (cumulative CPU time in seconds)
Units: Seconds (must be converted via irate/rate)
Coverage: Per-CPU-mode tracking via the mode label (user, system)
Time Aggregation: Requires rate() calculation
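
To compare the two approaches, the counter has to be turned into a percentage. The Python sketch below shows the arithmetic, assuming two readings of the per-mode CPU-second counters taken a known number of seconds apart. The function name cpu_percent and the sample readings are hypothetical; in a dashboard the same result would come from a rate() expression, not from Python.

```python
def cpu_percent(first, last, elapsed_seconds):
    """CPU usage as a percentage of one core, summed over CPU modes.

    `first` and `last` map a mode name to cumulative CPU seconds.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    used = sum(last[mode] - first[mode] for mode in last)
    # CPU seconds consumed per wall-clock second, scaled to percent.
    return used / elapsed_seconds * 100


before = {"user": 120.0, "system": 30.0}
after = {"user": 150.0, "system": 36.0}
# 36 CPU-seconds consumed over 60 s, i.e. 60% of one core.
print(cpu_percent(before, after, 60))
```

As with the netdata gauge, the result can exceed 100 when the process group uses more than one core.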

Memory Usage Metrics#

netdata Approach:

'netdata_app_mem_usage_MiB_average{app_group="%s"}',     // RAM
'netdata_app_swap_usage_MiB_average{app_group="%s"}'     // Swap

Metric Names:
- netdata_app_mem_usage_MiB_average (RAM)
- netdata_app_swap_usage_MiB_average (Swap)
Type: Gauge
Units: MiB (mebibytes)
Coverage: Memory and swap usage per app_group
Breakdown: Separate metrics for RAM vs Swap

process-exporter Approach:

'namedprocess_namegroup_memory_bytes{memtype="resident"}',  // Physical RAM (RSS)
'namedprocess_namegroup_memory_bytes{memtype="virtual"}',   // Virtual memory (VMS)
'namedprocess_namegroup_memory_bytes{memtype="swapped"}'    // Swap

Metric Names: namedprocess_namegroup_memory_bytes (with memtype label)
Type: Gauge
Units: Bytes (requires /1024/1024 conversion to MiB)
Coverage: RSS (resident), VMS (virtual), and Swap per groupname
Breakdown: Three memory types via label matching

Storage/Filesystem Metrics#

netdata Approach:

'netdata_disk_space_GiB_average'  // All disks averaged together

Metric Name: netdata_disk_space_GiB_average
Coverage: Aggregate disk space
Granularity: Single value for all filesystems
Units: GiB (gibibytes)

node-exporter Approach:

'node_filesystem_size_bytes{mountpoint=~"^/$"}'  // Root filesystem only

Metric Name: node_filesystem_size_bytes (from node-exporter)
Coverage: Per-filesystem granularity via mountpoint label
Granularity: Individual filesystem data
Units: Bytes (requires /1024/1024/1024 conversion to GiB)
Label: device shows actual device (e.g., /dev/sda1)

When to Use Each#

Choose netdata when:

Already running netdata infrastructure
Need out-of-the-box application-aware metrics
Prefer pre-aggregated, immediately readable values
Want a simple deployment without additional exporters

Choose process-exporter when:

Need explicit process group identification (groupname labels)
Need fine-grained process-level insights (individual FD counts, per-group CPU time)
Running multiple process groups and need clear separation

Choose node-exporter when:

Require device-level filesystem metrics
Need system-level network traffic monitoring
Want standardized Prometheus counter/gauge patterns

Usage Instructions#

Default Build (netdata)#

cd grafazos
make

This compiles all dashboards using netdata metrics. Generated files appear in the output/ directory.

Build with Prometheus Exporters#

cd grafazos
HARDWARE_SRC=process-exporter,node-exporter make

This recompiles all dashboards using process-exporter and node-exporter metrics.

Build with Multiple Exporters in Same Panel#

cd grafazos
HARDWARE_SRC=netdata,process-exporter make

When both exporters provide the same metric (e.g., CPU), both queries appear in the same panel with legend suffixes like Cpu load [netdata] and Cpu load [process-exporter].

Build Specific Dashboard#

# octez-full with both Prometheus exporters
HARDWARE_SRC=process-exporter,node-exporter make octez-full.jsonnet

# octez-basic with netdata (default)
make octez-basic.jsonnet

# Comparison dashboard
make compare-hardware.jsonnet
