CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Common Commands
Development
# Install dependencies
mix deps.get
# Compile the project
mix compile
# Run in development (non-halt mode)
mix run --no-halt
# Run tests
mix test
# Run specific test
mix test test/systant_test.exs
# Enter development shell (via Nix)
nix develop
# Run both server and dashboard together (recommended)
just dev
# or directly: hivemind
# Run components individually
just server # or: cd server && mix run --no-halt
just dashboard # or: cd dashboard && mix phx.server
# Other just commands
just deps # Install dependencies for both projects
just compile # Compile both projects
just test # Run tests for both projects
just clean # Clean both projects
Production
# Build production release
MIX_ENV=prod mix release
# Run production release
_build/prod/rel/systant/bin/systant start
Architecture Overview
This is an Elixir OTP application that serves as a systemd daemon for MQTT-based system monitoring, designed for deployment across multiple NixOS hosts to integrate with Home Assistant.
Core Components
- Systant.Application (lib/systant/application.ex): OTP application supervisor that starts the MQTT client
- Systant.MqttClient (lib/systant/mqtt_client.ex): GenServer handling MQTT connection, metrics publishing, and command subscriptions
- Systant.MqttHandler (lib/systant/mqtt_handler.ex): Custom Tortoise handler for processing command messages with security validation
- Systant.CommandExecutor (lib/systant/command_executor.ex): Secure command execution engine with whitelist validation and audit logging
- Systant.SystemMetrics (lib/systant/system_metrics.ex): Comprehensive Linux system metrics collection with configuration support
- Systant.Config (lib/systant/config.ex): TOML-based configuration loader with environment variable overrides
- Dashboard.Application (dashboard/lib/dashboard/application.ex): Phoenix LiveView dashboard application
- Dashboard.MqttSubscriber (dashboard/lib/dashboard/mqtt_subscriber.ex): Real-time MQTT subscriber that feeds data to the LiveView dashboard
Key Libraries
- Tortoise: MQTT client library for pub/sub functionality
- Jason: JSON encoding/decoding for message payloads
- Toml: TOML configuration file parsing
- Phoenix LiveView: Real-time dashboard framework
MQTT Behavior
- Publishes comprehensive system metrics (CPU, memory, disk, GPU, network, temperature, processes) to stats topic
- Subscribes to commands topic for incoming events that can trigger user-customizable actions
- Uses hostname-based randomized client ID to avoid conflicts across multiple hosts
- Configurable startup delay (default 5 seconds) before first metrics publish
- Real-time metrics collection with configurable intervals
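
As an illustration of the hostname-based randomized client ID mentioned above, here is a minimal sketch. The module name `ClientIdSketch`, the `prefix-hostname-suffix` layout, and the six-digit hex suffix are assumptions for illustration, not the project's actual format.

```elixir
defmodule ClientIdSketch do
  # Hypothetical helper: builds "<prefix>-<hostname>-<random hex>".
  # The exact layout used by Systant is not stated in this file.
  def build(prefix, hostname) do
    suffix =
      :rand.uniform(0xFFFFFF)
      |> Integer.to_string(16)
      |> String.downcase()
      |> String.pad_leading(6, "0")

    "#{prefix}-#{hostname}-#{suffix}"
  end

  # Look up the local hostname through Erlang's :inet module.
  def local_hostname do
    {:ok, name} = :inet.gethostname()
    List.to_string(name)
  end
end
```

Calling `ClientIdSketch.build("systant", ClientIdSketch.local_hostname())` yields a different ID on each start, so two instances started on the same host should rarely collide on the broker.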
Configuration System
Systant uses a TOML-based configuration system with environment variable overrides:
- Config File: systant.toml (current dir, ~/.config/systant/, or /etc/systant/)
- Module Control: Enable/disable metric collection modules (cpu, memory, disk, gpu, network, temperature, processes, system)
- Filtering Options: Configurable filtering for disks, network interfaces, processes
- Environment Overrides: MQTT_HOST, MQTT_PORT, SYSTANT_INTERVAL, SYSTANT_LOG_LEVEL
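
The override order described above (file values first, environment variables on top) could be sketched as follows. The module name `ConfigSketch` and the nested `%{mqtt: %{host: ..., port: ...}}` shape are assumptions; only the variable names `MQTT_HOST` and `MQTT_PORT` come from this file.

```elixir
defmodule ConfigSketch do
  # Apply environment overrides on top of an already-parsed config map.
  # The env map defaults to the real process environment but can be
  # injected, which keeps the function easy to exercise in isolation.
  def resolve(file_config, env \\ System.get_env()) do
    file_config
    |> override(env, "MQTT_HOST", [:mqtt, :host], & &1)
    |> override(env, "MQTT_PORT", [:mqtt, :port], &String.to_integer/1)
  end

  # Replace the value at `path` only when the variable is set.
  defp override(config, env, var, path, cast) do
    case Map.fetch(env, var) do
      {:ok, value} -> put_in(config, path, cast.(value))
      :error -> config
    end
  end
end
```

With no matching variables set, `resolve/2` returns the file configuration unchanged, so the TOML file remains the single source of defaults.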
Key Configuration Sections
- [general]: Collection intervals, enabled modules
- [mqtt]: Broker settings, client ID prefix, credentials
- [commands]: Command execution settings, security options
- [[commands.available]]: User-defined command definitions with security parameters
- [disk]: Mount filtering, filesystem exclusions
- [gpu]: NVIDIA/AMD GPU limits and settings
- [network]: Interface filtering, traffic thresholds
- [processes]: Top process limits, sorting options
- [temperature]: CPU/sensor temperature monitoring
Default Configuration
- MQTT Host: mqtt.home (configurable via MQTT_HOST)
- Stats Topic: systant/${hostname}/stats (per-host topics)
- Command Topic: systant/${hostname}/commands (per-host topics)
- Response Topic: systant/${hostname}/responses (command responses)
- Publish Interval: 30 seconds (configurable via SYSTANT_INTERVAL)
- Command System: Enabled by default with example commands (restart, info, df, ps, ping)
NixOS Deployment
This project includes a complete Nix packaging and NixOS module:
- Package: nix/package.nix - Builds the Elixir release using beamPackages.mixRelease
- Module: nix/nixos-module.nix - Provides services.systant configuration options
- Development: Use nix develop for development shell with Elixir/Erlang
The NixOS module supports:
- Configurable MQTT connection settings
- Per-host topic naming using ${config.networking.hostName}
- Environment variable configuration for runtime settings
- Systemd service with security hardening
- Auto-restart and logging to systemd journal
Dashboard
The project includes a Phoenix LiveView dashboard (dashboard/) that provides real-time monitoring of all systant instances.
Dashboard Features
- Real-time host status updates via MQTT subscription
- LiveView interface showing all connected hosts
- Automatic reconnection and error handling
Dashboard MQTT Configuration
- Subscribes to systant/+/stats to receive updates from all hosts
- Uses hostname-based client ID: systant-dashboard-${hostname} to avoid conflicts
- Connects to mqtt.home:1883 (same broker as systant instances)
Important Implementation Notes
- Tortoise Handler: The handle_message/3 callback must return {:ok, state}, not []
- Topic Parsing: Topics may arrive as lists or strings, handle both formats
- Client ID Conflicts: Use unique client IDs to prevent connection instability
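
The first two notes can be illustrated with a small, dependency-free sketch. `TopicSketch` is a hypothetical module that only mirrors the shape of the handle_message/3 callback; it does not use Tortoise itself, and the `:received` state key is an assumption.

```elixir
defmodule TopicSketch do
  # Accept a topic either as a list of levels or as a "/"-delimited string
  # and always return a list of string levels.
  def normalize(topic) when is_list(topic), do: Enum.map(topic, &to_string/1)
  def normalize(topic) when is_binary(topic), do: String.split(topic, "/")

  # Shaped like the handle_message/3 callback: every clause returns
  # {:ok, state}, never a bare list.
  def handle_message(topic, payload, state) do
    case normalize(topic) do
      ["systant", host, "commands"] ->
        entry = {host, payload}
        {:ok, Map.update(state, :received, [entry], &[entry | &1])}

      _other ->
        {:ok, state}
    end
  end
end
```

Normalizing once at the top of the callback keeps the pattern match on topic levels identical for both input formats.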
Development Roadmap
Phase 1: System Metrics Collection (Completed)
- ✅ SystemMetrics Module: server/lib/systant/system_metrics.ex - Comprehensive metrics collection
- ✅ CPU Metrics: Load averages (1/5/15min) via /proc/loadavg
- ✅ Memory Metrics: System memory data via /proc/meminfo with usage percentages
- ✅ Disk Metrics: Disk usage and capacity via df command with configurable filtering
- ✅ GPU Metrics: NVIDIA (nvidia-smi) and AMD (rocm-smi) GPU monitoring with temperature, utilization, memory
- ✅ Network Metrics: Interface statistics via /proc/net/dev with traffic filtering
- ✅ Temperature Metrics: CPU temperature and lm-sensors data via system files and sensors command
- ✅ Process Metrics: Top processes by CPU/memory via ps command with configurable limits
- ✅ System Info: Uptime via /proc/uptime, kernel version, OS info, Erlang runtime data
- ✅ MQTT Integration: Real metrics published with configurable intervals replacing simple messages
- ✅ Configuration System: Complete TOML-based configuration with environment overrides
- ✅ Dashboard Integration: Phoenix LiveView dashboard with real-time graphical metrics display
Implementation Details
- Uses Linux native system commands and /proc filesystem for accuracy over Erlang os_mon
- Configuration-driven metric collection with per-module enable/disable capabilities
- Advanced filtering: disk mounts/types, network interfaces, process thresholds
- Graceful error handling with fallbacks when commands/files unavailable
- JSON payload structure: {timestamp, hostname, cpu, memory, disk, gpu, network, temperature, processes, system}
- Dashboard displays metrics as progress bars and cards with color-coded status indicators
- TOML configuration with environment variable overrides for deployment flexibility
Phase 2: Command System (Completed)
- ✅ Command Execution: server/lib/systant/command_executor.ex - Secure command processing with whitelist validation
- ✅ MQTT Handler: server/lib/systant/mqtt_handler.ex - Custom Tortoise handler for command message processing
- ✅ User Configuration: Commands fully configurable via systant.toml with security parameters
- ✅ MQTT Integration: Commands via systant/{hostname}/commands, responses via systant/{hostname}/responses
- ✅ Security Features: Whitelist-only execution, parameter validation, timeouts, comprehensive logging
- ✅ Built-in Commands: list command shows all available user-defined commands
Command System Features
- User-Configurable Commands: Define custom commands in systant.toml with triggers, allowed parameters, timeouts
- Enterprise Security: No arbitrary shell execution, strict parameter validation, execution timeouts
- Simple Interface: Send {"command":"trigger","params":[...]}, receive structured JSON responses
- Request Tracking: Auto-generated request IDs for command/response correlation
- Comprehensive Logging: Full audit trail of all command executions with timing and results
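
To make the whitelist idea above concrete, here is a minimal sketch of validating an already-decoded command message. The module name `WhitelistSketch`, the hard-coded command table, and the `allowed_params` field are assumptions for illustration; the real definitions live in systant.toml and may be structured differently.

```elixir
defmodule WhitelistSketch do
  # Hypothetical whitelist: trigger name => allowed parameter values.
  @commands %{
    "info" => %{allowed_params: []},
    "df" => %{allowed_params: ["/", "/home"]}
  }

  # Accept only known triggers whose params all appear in the whitelist.
  def validate(%{"command" => trigger} = message) do
    params = Map.get(message, "params", [])

    case Map.fetch(@commands, trigger) do
      :error ->
        {:error, :unknown_command}

      {:ok, %{allowed_params: allowed}} ->
        if is_list(params) and Enum.all?(params, &(&1 in allowed)) do
          {:ok, trigger, params}
        else
          {:error, :invalid_params}
        end
    end
  end

  # Anything without a "command" key is rejected outright.
  def validate(_message), do: {:error, :malformed}
end
```

Because unknown triggers and unlisted parameters are rejected before anything runs, a payload such as `{"command":"df","params":["/; rm -rf /"]}` never reaches the execution step.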
Example Command Usage
# Send commands via MQTT
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"list"}'
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"info"}'
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"df","params":["/home"]}'
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"restart","params":["nginx"]}'
# Listen for responses
mosquitto_sub -t "systant/+/responses"
Phase 3: Home Assistant Integration (Completed)
- ✅ MQTT Auto-Discovery: server/lib/systant/ha_discovery.ex - Publishes HA discovery configurations for automatic device registration
- ✅ Device Registration: Creates unified "Systant {hostname}" device in Home Assistant with comprehensive sensor suite
- ✅ Sensor Auto-Discovery: CPU load averages, memory usage, system uptime, temperatures, GPU metrics, disk usage, network throughput
- ✅ Configuration Integration: TOML-based enable/disable with homeassistant.discovery_enabled setting
- ✅ Value Templates: Proper JSON path extraction for nested metrics data with error handling
- ✅ Real-time Updates: Seamless integration with existing MQTT stats publishing - no additional topics needed
Home Assistant Integration Features
- Automatic Discovery: No custom integration required - uses standard MQTT discovery protocol
- Device Grouping: All sensors grouped under single "Systant {hostname}" device for clean organization
- Comprehensive Metrics: CPU, memory, disk, GPU (NVIDIA/AMD), network throughput, temperature, and system sensors
- Configuration Control: Enable/disable discovery via systant.toml configuration
- Template Flexibility: Advanced Jinja2 templates handle optional/missing data gracefully
- Topic Structure: Discovery on homeassistant/#, stats remain on systant/{hostname}/stats
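
A discovery message for one sensor might be assembled roughly as below. The topic layout and payload keys follow Home Assistant's standard MQTT discovery format; the module name `DiscoverySketch`, the `systant_` ID prefix, and the JSON path in the example template are assumptions, not taken from ha_discovery.ex.

```elixir
defmodule DiscoverySketch do
  # Build a {topic, payload} pair for a single sensor. The payload is a
  # plain map; encoding it to JSON (e.g. with Jason) is left to the caller.
  def sensor(hostname, key, name, unit, template) do
    node_id = "systant_#{hostname}"
    topic = "homeassistant/sensor/#{node_id}/#{key}/config"

    payload = %{
      name: name,
      unique_id: "#{node_id}_#{key}",
      # All sensors read from the existing per-host stats topic.
      state_topic: "systant/#{hostname}/stats",
      value_template: template,
      unit_of_measurement: unit,
      # A shared identifier groups every sensor under one device.
      device: %{identifiers: [node_id], name: "Systant #{hostname}"}
    }

    {topic, payload}
  end
end
```

For example, `DiscoverySketch.sensor("nas", "memory_used_percent", "Memory Usage", "%", "{{ value_json.memory.used_percent | default(0) }}")` produces a config topic under homeassistant/sensor/ while the state topic stays on systant/nas/stats. Hostnames containing dots would need sanitizing first, since discovery topic IDs only allow letters, digits, underscores, and hyphens.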
Setup Instructions
- Configure MQTT Discovery: Set homeassistant.discovery_enabled = true in systant.toml
- Start Systant: Discovery messages published automatically on startup (1s after MQTT connection)
- Check Home Assistant: Device and sensors appear automatically in MQTT integration
- Verify Metrics: All sensors should show current values within 30 seconds
Available Sensors
- CPU: Load averages (1m, 5m, 15m), temperature
- Memory: Usage percentage, used/total in GB
- Disk: Root and home filesystem usage percentages
- GPU: NVIDIA/AMD utilization, temperature, memory usage
- Network: RX/TX throughput in MB/s for primary interface (real-time bandwidth monitoring)
- System: Uptime in hours, kernel version, online status
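
The network sensor reports a rate, while /proc/net/dev exposes cumulative byte counters, so a throughput value has to be derived from two samples. The sketch below shows one way to do that; the module name `ThroughputSketch`, the decimal definition of MB (1,000,000 bytes), and the two-decimal rounding are assumptions rather than the project's confirmed behavior.

```elixir
defmodule ThroughputSketch do
  # Parse one interface line from /proc/net/dev. After the "name:" prefix
  # the first numeric field is received bytes and the ninth is transmitted
  # bytes.
  def parse_line(line) do
    [name, rest] = String.split(line, ":", parts: 2)
    fields = rest |> String.split() |> Enum.map(&String.to_integer/1)

    %{
      interface: String.trim(name),
      rx_bytes: Enum.at(fields, 0),
      tx_bytes: Enum.at(fields, 8)
    }
  end

  # Convert two cumulative counter samples into MB/s. A negative delta
  # (counter reset or wrap) is clamped to zero instead of being reported.
  def rate_mb_s(previous, current, elapsed_seconds) when elapsed_seconds > 0 do
    delta = max(current - previous, 0)
    Float.round(delta / elapsed_seconds / 1_000_000, 2)
  end
end
```

With a 30-second publish interval, a counter that grows by 30,000,000 bytes between samples works out to 1.0 MB/s.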
Future Plans
- Multi-host deployment for comprehensive system monitoring
- Advanced alerting and threshold monitoring
- Historical data retention and trending