How Joular Core Works

This page describes the internal architecture of Joular Core: how it reads hardware power data, how it attributes that power to processes and applications, how it produces output, and how the different subsystems fit together.

High-Level Architecture

At startup, Joular Core:

  1. Parses command-line arguments
  2. Detects the platform and initialises the appropriate power readers and CPU utilisation trackers
  3. Optionally calibrates an idle CPU baseline
  4. Enters a monitoring loop that fires once per second
  5. On each iteration: reads power, reads CPU utilisation, attributes process/app power, and sends the result to all configured output channels
  6. On Ctrl+C: flushes outputs and exits cleanly
┌──────────────────────────────────────────────────────────┐
│                   Argument Parsing                       │
└──────────────────────────┬───────────────────────────────┘
                           │
┌──────────────────────────▼───────────────────────────────┐
│                  Platform Setup                          │
│  Detect OS → create CPU/GPU energy readers               │
│             → create CPU utilisation trackers            │
│             → set up ring buffer (if -r)                 │
│             → set up API server (if --api-port)          │
└──────┬──────────────────────┬──────────────────┬─────────┘
       │                      │                  │
       ▼                      ▼                  ▼
  Monitor Loop          Ring Buffer          API Server
  (1 Hz)                Writer               (async)
       │
       ├── CPU energy reader
       ├── GPU energy reader
       ├── CPU utilisation reader
       ├── Process tracker (optional)
       └── App tracker (optional)
            │
            ▼
       MonitorSample
            │
            ├── Terminal output
            ├── CSV file output
            ├── Ring buffer write
            └── API broadcast

Platform Abstraction

The codebase is built around three traits defined in src/energy.rs:

  • CPUEnergy — returns the current CPU power in watts via get_power()
  • GPUEnergy — returns the current GPU power in watts via get_power()
  • PlatformEnergy — a factory that creates the above readers and the CPU utilisation trackers, and provides the process/app power attribution formula

Each supported OS has a concrete implementation of these traits. The monitoring loop talks only to these trait objects, so it is identical on every platform.
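As an illustration, the trait shapes might look like the following. This is a simplified sketch: the real definitions in src/energy.rs may differ in signatures and error handling, and ConstantPower is an invented dummy used only to show the pattern.

```rust
/// Simplified sketch of the abstraction described above.
pub trait CPUEnergy {
    /// Current CPU power draw in watts.
    fn get_power(&mut self) -> f64;
}

pub trait GPUEnergy {
    /// Current GPU power draw in watts.
    fn get_power(&mut self) -> f64;
}

/// Dummy reader that always reports the same power value.
pub struct ConstantPower(pub f64);

impl CPUEnergy for ConstantPower {
    fn get_power(&mut self) -> f64 {
        self.0
    }
}
```

Because the loop only ever holds trait objects such as `Box<dyn CPUEnergy>`, supporting a new platform means implementing these traits, not touching the loop.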

Linux

CPU power is read from the Intel RAPL (Running Average Power Limit) sysfs interface at /sys/class/powercap/intel-rapl/. RAPL is supported on Intel processors since Sandy Bridge (2011) and on AMD processors since Ryzen. The interface exposes cumulative energy counters in microjoules; Joular Core reads two consecutive values a fixed time apart and converts the delta to watts.
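The delta-to-watts conversion can be sketched as follows. The helper name and the wrap-around handling are my assumptions; the powercap sysfs interface does expose max_energy_range_uj, which is what makes wrap detection possible.

```rust
/// Convert two cumulative RAPL energy readings (microjoules) taken
/// `interval_secs` apart into an average power in watts.
/// `max_uj` is the counter's range (max_energy_range_uj), used when
/// the cumulative counter wraps back past zero.
fn rapl_power_watts(prev_uj: u64, curr_uj: u64, interval_secs: f64, max_uj: u64) -> f64 {
    let delta_uj = if curr_uj >= prev_uj {
        curr_uj - prev_uj
    } else {
        // Counter wrapped around between the two reads.
        max_uj - prev_uj + curr_uj
    };
    // microjoules -> joules, then joules per second = watts
    (delta_uj as f64 / 1_000_000.0) / interval_secs
}
```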

CPU utilisation is computed from /proc/stat. Joular Core reads the user, nice, system, idle, iowait, irq, softirq, and steal tick counters on each sample, computes the delta from the previous sample, and calculates utilisation as:

utilisation = 1 − (Δidle / Δtotal)

Per-process utilisation is computed from /proc/<pid>/stat, which exposes cumulative user and kernel CPU time (in clock ticks) for each process. Joular Core computes the delta from the previous sample normalised by the total CPU time delta from /proc/stat.
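The system-wide calculation can be sketched like this (struct and helper names are mine; the field order follows the /proc/stat layout listed above):

```rust
/// Tick counters from the aggregate "cpu" line of /proc/stat.
struct CpuTicks {
    user: u64, nice: u64, system: u64, idle: u64,
    iowait: u64, irq: u64, softirq: u64, steal: u64,
}

impl CpuTicks {
    fn total(&self) -> u64 {
        self.user + self.nice + self.system + self.idle
            + self.iowait + self.irq + self.softirq + self.steal
    }
}

/// utilisation = 1 - (delta_idle / delta_total), as a fraction in [0, 1].
fn utilisation(prev: &CpuTicks, curr: &CpuTicks) -> f64 {
    let d_total = curr.total().saturating_sub(prev.total());
    if d_total == 0 {
        return 0.0; // no ticks elapsed: report idle rather than divide by zero
    }
    let d_idle = curr.idle.saturating_sub(prev.idle);
    1.0 - d_idle as f64 / d_total as f64
}
```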

GPU power on Linux is read by:

  • Nvidia: calling nvidia-smi --query-gpu=power.draw --format=csv,noheader,nounits and parsing the output. Power values from all GPUs are summed.
  • AMD: calling amd-smi or rocm-smi with JSON output and extracting the power fields.
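For the Nvidia case, the parsing step amounts to summing one float per output line. A hedged sketch (the actual parsing in src/platform/nvidia.rs may differ, e.g. in how malformed lines are reported):

```rust
/// Sum per-GPU power values from `nvidia-smi --query-gpu=power.draw
/// --format=csv,noheader,nounits`, which prints one float per GPU per
/// line, e.g. "35.12\n71.40\n". Unparseable lines are skipped.
fn total_gpu_power(smi_output: &str) -> f64 {
    smi_output
        .lines()
        .filter_map(|line| line.trim().parse::<f64>().ok())
        .sum()
}
```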

Windows

CPU power is read from Hubblo’s RAPL Windows kernel driver via DeviceIoControl. The driver exposes RAPL data through a Windows device interface, returning power in watts directly.

CPU utilisation is read from the Win32 API via GetSystemTimes(), which returns idle, kernel, and user times as FILETIME structures. Utilisation is computed from the deltas between two successive calls.

Per-process CPU time is read via GetProcessTimes(). Application monitoring enumerates processes using the Windows Toolhelp32 API (CreateToolhelp32Snapshot, Process32First, Process32Next).

GPU power on Windows follows the same approach as Linux: nvidia-smi for Nvidia, amd-smi for AMD.

macOS

CPU, GPU, and overall system power are all read from Apple’s powermetrics tool, which ships with macOS. Joular Core spawns powermetrics as a subprocess with JSON output format and parses the result. On Apple Silicon, powermetrics reports CPU and GPU power separately; on Intel Macs it reports CPU power only.

Because powermetrics requires elevated privileges to access hardware counters, Joular Core must be run with elevated access on macOS.

CPU utilisation on macOS is read via the Mach kernel API (host_statistics64 with HOST_CPU_LOAD_INFO), which returns per-CPU user, system, and idle tick counters.

Process and application monitoring on macOS uses Mach task info APIs to read per-process CPU times.

Single-Board Computers (SBC)

SBC platforms (Raspberry Pi, Asus Tinker Board) do not have a hardware power interface accessible to software. Instead, Joular Core calculates CPU power from CPU utilisation using polynomial regression models:

power = c₀ + c₁·u + c₂·u² + … + cₙ·uⁿ

where u is the current CPU utilisation (0–100) and c₀…cₙ are model coefficients measured empirically for each specific board model and revision.
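Evaluating the model is a plain polynomial evaluation. A sketch using Horner's method; the coefficient values in the example below are made up for illustration, not taken from any real board model:

```rust
/// Evaluate power = c0 + c1*u + c2*u^2 + ... + cn*u^n for CPU
/// utilisation u (0-100), with `coeffs` ordered [c0, c1, ..., cn].
/// Uses Horner's method to avoid explicit exponentiation.
fn sbc_power(coeffs: &[f64], u: f64) -> f64 {
    coeffs.iter().rev().fold(0.0, |acc, &c| acc * u + c)
}
```

For invented coefficients [2.0, 0.5] and u = 10, this yields 2.0 + 0.5·10 = 7.0 W.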

The built-in models cover all supported Raspberry Pi models. A custom model can be supplied via the SBC_POWER_MODEL_JSON environment variable; the file format must match the Joular Power Models Database schema.

GPU power is always 0 on SBC platforms.

Virtual Machines

In a VM, there is no direct hardware interface. Joular Core reads power from a shared file written by the host. The file is read on every sampling cycle. If the file is absent or empty, the power value for that sample is 0. The supported file formats are described in Virtual Machines.
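A sketch of that read path, assuming the simplest possible file format (a single float in watts; the helper name is mine and the real reader in src/vm.rs handles the formats documented in that chapter):

```rust
use std::fs;
use std::path::Path;

/// Read the host-provided power value; fall back to 0.0 when the
/// file is absent, empty, or unparseable, matching the behaviour
/// described above.
fn read_vm_power(path: &Path) -> f64 {
    fs::read_to_string(path)
        .ok()
        .and_then(|s| s.trim().parse::<f64>().ok())
        .unwrap_or(0.0)
}
```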

The Monitoring Loop

The monitoring loop is in src/monitor.rs (JoularCoreMonitor::poll). It runs once per second and produces a MonitorSample:

pub struct MonitorSample {
    pub timestamp: u64,       // Unix epoch seconds
    pub cpu_power: f64,       // Watts
    pub gpu_power: f64,       // Watts
    pub total_power: f64,     // cpu_power + gpu_power
    pub cpu_usage: f64,       // Percentage (0–100)
    pub process_power: Option<f64>,       // Watts, if -p is set
    pub app_power: Option<(f64, usize)>,  // (Watts, PID count), if -a is set
}

Each field:

  • timestamp: wall-clock time at the moment of reading (SystemTime::now())
  • cpu_power / gpu_power: raw readings from the hardware interface
  • total_power: sum of CPU and GPU power
  • cpu_usage: system-wide CPU utilisation percentage
  • process_power / app_power: attributed power, computed as described below

Before the main loop starts, loop_init() takes one throwaway reading to warm up the energy counters. RAPL counts cumulative energy since boot; the first real delta needs a prior reference value.

Process and Application Power Attribution

Joular Core attributes CPU power to a process (or application) using a proportional model:

process_power = 100 × (process_cpu_utilisation × attributed_cpu_power) / system_cpu_utilisation

where:

  • process_cpu_utilisation is the fraction of total CPU time used by the process (or the sum of all PIDs for an application) in the last second
  • attributed_cpu_power is max(0, cpu_power − idle_baseline) — the raw CPU power minus the idle baseline (zero if no baseline is configured)
  • system_cpu_utilisation is the overall CPU utilisation percentage

If system_cpu_utilisation is zero (the CPU is completely idle), the attributed power is zero to avoid a division by zero.
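The formula and the guard combine into a few lines. A sketch with invented parameter names, following the unit conventions above (process utilisation as a fraction, system utilisation as a percentage):

```rust
/// process_power = 100 * (proc_util * attributed_cpu_power) / sys_util_pct
///
/// `proc_util` is the process's fraction of total CPU time (0-1);
/// `cpu_power` the raw CPU power in watts; `baseline` the idle
/// baseline (0.0 if none); `sys_util_pct` the system utilisation
/// percentage (0-100).
fn process_power(proc_util: f64, cpu_power: f64, baseline: f64, sys_util_pct: f64) -> f64 {
    if sys_util_pct == 0.0 {
        return 0.0; // CPU fully idle: avoid division by zero
    }
    let attributed = (cpu_power - baseline).max(0.0);
    100.0 * (proc_util * attributed) / sys_util_pct
}
```

For example, a process holding half the CPU time while the system is at 50% utilisation is attributed the full (baseline-corrected) CPU power.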

This model assumes that a process’s share of CPU power is proportional to its share of CPU time. This is an approximation — it does not account for frequency scaling within a core, NUMA topology, or work done in kernel threads on behalf of the process — but it is a practical and widely used approach for software-level power attribution.

CPU Idle Baseline

When the --cpu-idle-baseline or --calibrate-cpu-idle-baseline option is set, the baseline is subtracted from the raw CPU power before attribution:

attributed_cpu_power = max(0, cpu_power − baseline)

This removes the base power the CPU consumes just running the operating system at rest, so that the attributed power more accurately reflects the energy consumed by the workload itself.

Auto-calibration (--calibrate-cpu-idle-baseline) collects 5 power samples at 1-second intervals before starting the main loop and uses their average as the baseline.
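That calibration step is just an average over timed samples. A sketch with the power reader abstracted into a closure so it can run without hardware (the function name is mine):

```rust
use std::time::Duration;

/// Average `n` power readings taken `interval` apart, as
/// --calibrate-cpu-idle-baseline does with n = 5 and interval = 1 s.
fn calibrate_baseline<F: FnMut() -> f64>(mut read_power: F, n: usize, interval: Duration) -> f64 {
    let mut sum = 0.0;
    for i in 0..n {
        sum += read_power();
        if i + 1 < n {
            std::thread::sleep(interval); // wait between samples, not after the last
        }
    }
    sum / n as f64
}
```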

Application Monitoring and PID Refresh

When monitoring a named application (-a), Joular Core maintains a list of all PIDs whose process name matches the supplied string. This list is refreshed periodically (every --app-refresh-interval seconds, default 3) to pick up new processes spawned after monitoring started and to drop ones that have exited. Setting the interval to 0 rescans every second.
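The refresh decision itself is a simple interval check; a sketch (the helper name is mine, and the real implementation may track this differently):

```rust
use std::time::{Duration, Instant};

/// Decide whether the PID list should be rescanned on this iteration.
/// `interval_secs` mirrors --app-refresh-interval; 0 means rescan on
/// every iteration of the 1 Hz loop, i.e. every second.
fn should_refresh(last_refresh: Instant, now: Instant, interval_secs: u64) -> bool {
    interval_secs == 0
        || now.duration_since(last_refresh) >= Duration::from_secs(interval_secs)
}
```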

Output Pipeline

After each poll() call, the MonitorSample is sent to an OutputBundle, which dispatches it to all active output channels:

  1. Terminal (OutputWriter in Terminal mode): formats the reading with ANSI colour codes and writes it to stdout, overwriting the previous line using carriage return and ANSI erase-line sequences. In numeric mode (-i), a bare float is printed instead.

  2. CSV file (OutputWriter in CsvFile mode): appends one row to the CSV file. In overwrite mode (-o), the file is truncated to zero before each write so only the latest row is kept.

  3. Ring buffer (RingBufferWriter): writes 5 f64 values to a shared-memory region. On Linux and macOS this is a memory-mapped file (memmap2); on Windows it uses Win32 file mapping (CreateFileMapping / MapViewOfFile).

  4. API (when the api feature is enabled): broadcasts an ApiData struct to all connected HTTP and WebSocket clients via a Tokio broadcast channel. The HTTP handler (GET /data) reads the latest broadcast value; the WebSocket handler (/ws) streams each new broadcast as a JSON message.

API Server

The API server (src/api.rs) is built with the Axum web framework running on a Tokio async runtime. It runs in a separate async task alongside the synchronous monitoring loop. Communication between the two is done via a tokio::sync::broadcast::Sender<ApiData>.

  • GET /data: the handler subscribes to the broadcast channel and immediately returns the last received value as JSON.
  • /ws: the WebSocket handler subscribes to the broadcast channel and pushes each new value as a JSON message to the connected client.
  • CORS is enabled via tower-http’s CorsLayer, so the API can be consumed directly from browser-based dashboards.

Graceful Shutdown

Joular Core installs a Ctrl+C handler via the ctrlc crate. When the signal is received:

  1. A shared Arc<AtomicBool> flag is set to false, causing the monitoring loop to exit after the current iteration.
  2. The ANSI cursor (hidden at startup) is restored.
  3. All buffered output is flushed and file handles are dropped cleanly.
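The flag mechanism in step 1 can be sketched with std types alone; the real handler is registered via the ctrlc crate, and run_until_stopped is an invented stand-in for the monitoring loop:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Iterate until the shared `running` flag is cleared, which is what
/// the Ctrl+C handler does; `iterate` stands in for one 1 Hz sampling
/// iteration. Returns the number of completed iterations.
fn run_until_stopped<F: FnMut(u64)>(running: Arc<AtomicBool>, mut iterate: F) -> u64 {
    let mut iterations = 0;
    while running.load(Ordering::SeqCst) {
        iterate(iterations);
        iterations += 1;
    }
    iterations // flag checked between iterations, so the current one finishes
}
```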

Code Layout

  • src/main.rs — CLI entry point: argument parsing, monitoring loop, output setup
  • src/maingui.rs — GUI binary entry point
  • src/lib.rs — Library root: re-exports all public modules
  • src/args.rs — Clap-derived argument struct
  • src/common.rs — JoularContext: platform detection and component initialisation
  • src/monitor.rs — JoularCoreMonitor: the sampling loop and MonitorSample
  • src/energy.rs — CPUEnergy, GPUEnergy, PlatformEnergy traits
  • src/cpu.rs — CPUUtilization, ProcessCPUUtilization, AppCPUUtilization traits and Linux implementations
  • src/output.rs — OutputWriter, OutputBundle, OutputSink traits
  • src/ringbuffer.rs — RingBufferWriter: shared-memory ring buffer for Linux/macOS/Windows
  • src/api.rs — Axum-based HTTP and WebSocket API server
  • src/logging.rs — tracing-based structured logging helpers
  • src/platform/linux.rs — Linux-specific power readers and process trackers
  • src/platform/windows.rs — Windows-specific power readers and process trackers
  • src/platform/macos.rs — macOS-specific powermetrics integration and process trackers
  • src/platform/sbc.rs — SBC regression model power calculation
  • src/platform/nvidia.rs — Nvidia GPU power via nvidia-smi
  • src/platform/amdgpu.rs — AMD GPU power via amd-smi / rocm-smi
  • src/vm.rs — Virtual machine shared-file power reader
  • src/gui/ — egui-based GUI: model, views, history, theme

Key Dependencies

  • clap 4.x — Command-line argument parsing
  • egui / eframe 0.34 — Cross-platform GUI framework
  • axum 0.8 — HTTP and WebSocket API server
  • tokio 1.x — Async runtime for the API server
  • tower-http 0.6 — CORS middleware
  • serde / serde_json 1.x — JSON serialisation for the API
  • sysinfo 0.38 — System and process information (used for process enumeration)
  • memmap2 0.9 — Memory-mapped I/O for the Unix ring buffer
  • windows 0.62 — Win32 API bindings
  • mach2 0.6 — Mach kernel API bindings (macOS)
  • libc 0.2 — C library bindings (macOS)
  • ctrlc 3.x — Cross-platform Ctrl+C handler
  • tracing 0.1 — Structured logging
  • rfd 0.17 — Native file dialog (GUI file picker)