# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Common Commands

### Development

```bash
# Install dependencies
mix deps.get

# Compile the project
mix compile

# Run in development (non-halt mode)
mix run --no-halt

# Run tests
mix test

# Run specific test
mix test test/systant_test.exs

# Enter development shell (via Nix)
nix develop

# Run both server and dashboard together (recommended)
just dev
# or directly: hivemind

# Run components individually
just server     # or: cd server && mix run --no-halt
just dashboard  # or: cd dashboard && mix phx.server

# Other just commands
just deps     # Install dependencies for both projects
just compile  # Compile both projects
just test     # Run tests for both projects
just clean    # Clean both projects
```

### Production

```bash
# Build production release
MIX_ENV=prod mix release

# Run production release
_build/prod/rel/systant/bin/systant start
```

## Architecture Overview

This is an Elixir OTP application that serves as a systemd daemon for MQTT-based system monitoring, designed for deployment across multiple NixOS hosts to integrate with Home Assistant.

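As a rough illustration of this shape (a supervisor keeping a periodically publishing process alive), the sketch below uses only the Elixir standard library. `Sketch.MqttClient`, the `:interval_ms` option, and `publish_count/0` are hypothetical names chosen for illustration, and the `:publish` handler only counts ticks. The real `Systant.MqttClient` talks to a broker through Tortoise and may be structured differently.

```elixir
defmodule Sketch.MqttClient do
  # Hypothetical stand-in for an MQTT client process; it never opens a network connection.
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Returns how many publish ticks have fired so far.
  def publish_count, do: GenServer.call(__MODULE__, :publish_count)

  @impl true
  def init(opts) do
    # Assumed default of 30 seconds, mirroring the documented publish interval.
    interval = Keyword.get(opts, :interval_ms, 30_000)
    Process.send_after(self(), :publish, interval)
    {:ok, %{interval: interval, count: 0}}
  end

  @impl true
  def handle_info(:publish, state) do
    # A real client would collect metrics and publish them to the broker here.
    Process.send_after(self(), :publish, state.interval)
    {:noreply, %{state | count: state.count + 1}}
  end

  @impl true
  def handle_call(:publish_count, _from, state) do
    {:reply, state.count, state}
  end
end

# One supervised child; :one_for_one restarts only that child if it crashes.
children = [{Sketch.MqttClient, interval_ms: 100}]
{:ok, supervisor} = Supervisor.start_link(children, strategy: :one_for_one)
```

Saved as a script (for example a hypothetical `sketch.exs`), this can be run with `elixir sketch.exs`. The supervisor and its child stop when the script exits, whereas a release keeps them running for the lifetime of the application.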
### Core Components

- **Systant.Application** (`server/lib/systant/application.ex`): OTP application supervisor that starts the MQTT client
- **Systant.MqttClient** (`server/lib/systant/mqtt_client.ex`): GenServer handling MQTT connection, metrics publishing, and command subscriptions
- **Systant.MqttHandler** (`server/lib/systant/mqtt_handler.ex`): Custom Tortoise handler for processing command messages with security validation
- **Systant.CommandExecutor** (`server/lib/systant/command_executor.ex`): Secure command execution engine with whitelist validation and audit logging
- **Systant.SystemMetrics** (`server/lib/systant/system_metrics.ex`): Comprehensive Linux system metrics collection with configuration support
- **Systant.Config** (`server/lib/systant/config.ex`): TOML-based configuration loader with environment variable overrides
- **Dashboard.Application** (`dashboard/lib/dashboard/application.ex`): Phoenix LiveView dashboard application
- **Dashboard.MqttSubscriber** (`dashboard/lib/dashboard/mqtt_subscriber.ex`): Real-time MQTT subscriber that feeds data to the LiveView dashboard

### Key Libraries

- **Tortoise**: MQTT client library for pub/sub functionality
- **Jason**: JSON encoding/decoding for message payloads
- **Toml**: TOML configuration file parsing
- **Phoenix LiveView**: Real-time dashboard framework

### MQTT Behavior

- Publishes comprehensive system metrics (CPU, memory, disk, GPU, network, temperature, processes) to the stats topic
- Subscribes to the commands topic for incoming events that can trigger user-customizable actions
- Uses a hostname-based randomized client ID to avoid conflicts across multiple hosts
- Configurable startup delay (default 5 seconds) before the first metrics publish
- Real-time metrics collection with configurable intervals
- **Connection verification**: Tests MQTT connectivity on startup with timeout-based validation
- **Graceful shutdown**: Exits cleanly via `System.stop(1)` when the MQTT broker is unavailable (prevents crash dumps)

### Configuration System

Systant uses a
TOML-based configuration system with environment variable overrides:

- **Config File**: `systant.toml` (current dir, `~/.config/systant/`, or `/etc/systant/`)
- **Module Control**: Enable/disable metric collection modules (cpu, memory, disk, gpu, network, temperature, processes, system)
- **Filtering Options**: Configurable filtering for disks, network interfaces, processes
- **Environment Overrides**: `MQTT_HOST`, `MQTT_PORT`, `SYSTANT_INTERVAL`, `SYSTANT_LOG_LEVEL`

#### Key Configuration Sections

- `[general]`: Collection intervals, enabled modules
- `[mqtt]`: Broker settings, client ID prefix, credentials
- `[commands]`: Command execution settings, security options
- `[[commands.available]]`: User-defined command definitions with security parameters
- `[disk]`: Mount filtering, filesystem exclusions
- `[gpu]`: NVIDIA/AMD GPU limits and settings
- `[network]`: Interface filtering, traffic thresholds
- `[processes]`: Top process limits, sorting options
- `[temperature]`: CPU/sensor temperature monitoring

### Default Configuration

- **MQTT Host**: `mqtt.home` (configurable via `MQTT_HOST`)
- **Stats Topic**: `systant/${hostname}/stats` (per-host topics)
- **Command Topic**: `systant/${hostname}/commands` (per-host topics)
- **Response Topic**: `systant/${hostname}/responses` (command responses)
- **Publish Interval**: 30 seconds (configurable via `SYSTANT_INTERVAL`)
- **Command System**: Enabled by default with example commands (restart, info, df, ps, ping)

### NixOS Deployment

This project includes complete Nix packaging and a NixOS module:

- **Package**: `nix/package.nix` - Builds the Elixir release using beamPackages.mixRelease
- **Module**: `nix/nixos-module.nix` - Provides `services.systant` configuration options
- **Development**: Use `nix develop` for a development shell with Elixir/Erlang

The NixOS module supports:

- Configurable MQTT connection settings
- Per-host topic naming using `${config.networking.hostName}`
- Environment variable configuration for
runtime settings
- Systemd service with security hardening
- Auto-restart and logging to systemd journal

## Dashboard

The project includes a Phoenix LiveView dashboard (`dashboard/`) that provides real-time monitoring of all systant instances.

### Dashboard Features

- Real-time host status updates via MQTT subscription
- LiveView interface showing all connected hosts
- Automatic reconnection and error handling

### Dashboard MQTT Configuration

- Subscribes to `systant/+/stats` to receive updates from all hosts
- Uses hostname-based client ID: `systant-dashboard-${hostname}` to avoid conflicts
- Connects to `mqtt.home:1883` (same broker as systant instances)

### Important Implementation Notes

- **Tortoise Handler**: The `handle_message/3` callback must return `{:ok, state}`, not `[]`
- **Topic Parsing**: Topics may arrive as lists or strings, handle both formats
- **Client ID Conflicts**: Use unique client IDs to prevent connection instability

## Development Roadmap

### Phase 1: System Metrics Collection (Completed)

- ✅ **SystemMetrics Module**: `server/lib/systant/system_metrics.ex` - Comprehensive metrics collection
- ✅ **CPU Metrics**: Load averages (1/5/15min) via `/proc/loadavg`
- ✅ **Memory Metrics**: System memory data via `/proc/meminfo` with usage percentages
- ✅ **Disk Metrics**: Disk usage and capacity via `df` command with configurable filtering
- ✅ **GPU Metrics**: NVIDIA (nvidia-smi) and AMD (rocm-smi) GPU monitoring with temperature, utilization, memory
- ✅ **Network Metrics**: Interface statistics via `/proc/net/dev` with traffic filtering
- ✅ **Temperature Metrics**: CPU temperature and lm-sensors data via system files and `sensors` command
- ✅ **Process Metrics**: Top processes by CPU/memory via `ps` command with configurable limits
- ✅ **System Info**: Uptime via `/proc/uptime`, kernel version, OS info, Erlang runtime data
- ✅ **MQTT Integration**: Real metrics published with configurable intervals replacing simple messages
- ✅ **Configuration
System**: Complete TOML-based configuration with environment overrides
- ✅ **Dashboard Integration**: Phoenix LiveView dashboard with real-time graphical metrics display

#### Implementation Details

- Uses Linux native system commands and `/proc` filesystem for accuracy over Erlang os_mon
- Configuration-driven metric collection with per-module enable/disable capabilities
- Advanced filtering: disk mounts/types, network interfaces, process thresholds
- Graceful error handling with fallbacks when commands/files unavailable
- JSON payload structure: `{timestamp, hostname, cpu, memory, disk, gpu, network, temperature, processes, system}`
- Dashboard displays metrics as progress bars and cards with color-coded status indicators
- TOML configuration with environment variable overrides for deployment flexibility

### Phase 2: Command System (Completed)

- ✅ **Command Execution**: `server/lib/systant/command_executor.ex` - Secure command processing with whitelist validation
- ✅ **MQTT Handler**: `server/lib/systant/mqtt_handler.ex` - Custom Tortoise handler for command message processing
- ✅ **User Configuration**: Commands fully configurable via `systant.toml` with security parameters
- ✅ **MQTT Integration**: Commands via `systant/{hostname}/commands`, responses via `systant/{hostname}/responses`
- ✅ **Security Features**: Whitelist-only execution, parameter validation, timeouts, comprehensive logging
- ✅ **Built-in Commands**: `list` command shows all available user-defined commands

#### Command System Features

- **User-Configurable Commands**: Define custom commands in `systant.toml` with triggers, allowed parameters, timeouts
- **Enterprise Security**: No arbitrary shell execution, strict parameter validation, execution timeouts
- **Simple Interface**: Send `{"command":"trigger","params":[...]}`, receive structured JSON responses
- **Request Tracking**: Auto-generated request IDs for command/response correlation
- **Comprehensive Logging**: Full audit trail of all
command executions with timing and results

#### Example Command Usage

```bash
# Send commands via MQTT
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"list"}'
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"info"}'
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"df","params":["/home"]}'
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"restart","params":["nginx"]}'

# Listen for responses
mosquitto_sub -t "systant/+/responses"
```

### Phase 3: Home Assistant Integration (Completed)

- ✅ **MQTT Auto-Discovery**: `server/lib/systant/ha_discovery.ex` - Publishes HA discovery configurations for automatic device registration
- ✅ **Device Registration**: Creates unified "Systant {hostname}" device in Home Assistant with comprehensive sensor suite
- ✅ **Sensor Auto-Discovery**: CPU load averages, memory usage, system uptime, temperatures, GPU metrics, disk usage, network throughput
- ✅ **Configuration Integration**: TOML-based enable/disable with `homeassistant.discovery_enabled` setting
- ✅ **Value Templates**: Proper JSON path extraction for nested metrics data with error handling
- ✅ **Real-time Updates**: Seamless integration with existing MQTT stats publishing - no additional topics needed

#### Home Assistant Integration Features

- **Automatic Discovery**: No custom integration required - uses standard MQTT discovery protocol
- **Device Grouping**: All sensors grouped under single "Systant {hostname}" device for clean organization
- **Comprehensive Metrics**: CPU, memory, disk, GPU (NVIDIA/AMD), network throughput, temperature, and system sensors
- **Configuration Control**: Enable/disable discovery via `systant.toml` configuration
- **Template Flexibility**: Advanced Jinja2 templates handle optional/missing data gracefully
- **Topic Structure**: Discovery on `homeassistant/#`, stats remain on `systant/{hostname}/stats`

#### Setup Instructions

1. **Configure MQTT Discovery**: Set `homeassistant.discovery_enabled = true` in `systant.toml`
2. **Start Systant**: Discovery messages published automatically on startup (1s after MQTT connection)
3. **Check Home Assistant**: Device and sensors appear automatically in MQTT integration
4. **Verify Metrics**: All sensors should show current values within 30 seconds

#### Available Sensors

- **CPU**: Load averages (1m, 5m, 15m), temperature
- **Memory**: Usage percentage, used/total in GB
- **Disk**: Root and home filesystem usage percentages
- **GPU**: NVIDIA/AMD utilization, temperature, memory usage
- **Network**: RX/TX throughput in MB/s for primary interface (real-time bandwidth monitoring)
- **System**: Uptime in hours, kernel version, online status

### Future Plans

- Multi-host deployment for comprehensive system monitoring
- Advanced alerting and threshold monitoring
- Historical data retention and trending