ryan eff32b3233 Implement comprehensive system metrics collection with real-time monitoring
2025-08-05 12:48:44 -07:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Common Commands

Development

# Install dependencies
mix deps.get

# Compile the project
mix compile

# Run in development (non-halt mode)
mix run --no-halt

# Run tests
mix test

# Run specific test
mix test test/systant_test.exs

# Enter development shell (via Nix)
nix develop

# Run dashboard (Phoenix LiveView)
cd dashboard && mix phx.server
# or use justfile: just dashboard

Production

# Build production release
MIX_ENV=prod mix release

# Run production release
_build/prod/rel/systant/bin/systant start

Architecture Overview

This is an Elixir OTP application that runs as a systemd daemon for MQTT-based system monitoring. It is designed to be deployed across multiple NixOS hosts and to integrate with Home Assistant.

Core Components

  • Systant.Application (lib/systant/application.ex): OTP application supervisor that starts the MQTT client
  • Systant.MqttClient (lib/systant/mqtt_client.ex): GenServer that handles MQTT connection, publishes stats every 30 seconds, and listens for commands
  • Dashboard.Application (dashboard/lib/dashboard/application.ex): Phoenix LiveView dashboard application
  • Dashboard.MqttSubscriber (dashboard/lib/dashboard/mqtt_subscriber.ex): Real-time MQTT subscriber that feeds data to the LiveView dashboard
  • Configuration: MQTT settings configurable via environment variables or config files

Key Libraries

  • Tortoise: MQTT client library for pub/sub functionality
  • Jason: JSON encoding/decoding for message payloads

MQTT Behavior

  • Publishes a JSON metrics payload (timestamp, hostname, and system metrics) to the stats topic every 30 seconds; this replaces the earlier "Hello from systant" messages
  • Subscribes to the commands topic for incoming events that can trigger user-customizable actions
  • Uses a randomized client ID to avoid conflicts across multiple hosts
  • Publishes an initial message immediately on startup instead of waiting for the first interval
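
The publishing behavior above (one message at startup, then one per interval) can be sketched as a small GenServer. This is a minimal illustration, not the actual Systant.MqttClient: the module name and the `publish_fun` and `interval_ms` options are hypothetical, and the MQTT call is abstracted behind a function so the sketch runs without a broker.

```elixir
defmodule PeriodicPublisherSketch do
  # Hypothetical sketch; the real Systant.MqttClient may be organized differently.
  use GenServer

  # 30 seconds, matching the documented default publish interval.
  @default_interval_ms 30_000

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      # Assumption: in the real client this would wrap an MQTT publish call.
      publish_fun: Keyword.fetch!(opts, :publish_fun),
      interval_ms: Keyword.get(opts, :interval_ms, @default_interval_ms)
    }

    # Trigger the first publish right away; the timer drives the rest.
    send(self(), :publish)
    {:ok, state}
  end

  @impl true
  def handle_info(:publish, state) do
    state.publish_fun.()
    Process.send_after(self(), :publish, state.interval_ms)
    {:noreply, state}
  end
end
```

Scheduling the next publish from inside `handle_info/2` keeps a slow publish from stacking up timers, at the cost of a small drift in the interval.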

Default Configuration

  • MQTT Host: mqtt.home (not localhost)
  • Stats Topic: systant/${hostname}/stats (per-host topics)
  • Command Topic: systant/${hostname}/commands (per-host topics)
  • Publish Interval: 30 seconds
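
As an illustration of how these defaults could be combined with environment-variable overrides, here is a hedged sketch. The environment variable names (`SYSTANT_MQTT_HOST` and so on), the module name, and the client ID format are assumptions made for this example; check the project's config files for the real names.

```elixir
defmodule ConfigSketch do
  # Hypothetical sketch; variable names and the client ID format are assumptions.

  # Accepts the environment as a map so the function is easy to exercise.
  def load(env \\ System.get_env()) do
    host = hostname()

    %{
      mqtt_host: Map.get(env, "SYSTANT_MQTT_HOST", "mqtt.home"),
      stats_topic: Map.get(env, "SYSTANT_STATS_TOPIC", "systant/#{host}/stats"),
      command_topic: Map.get(env, "SYSTANT_COMMAND_TOPIC", "systant/#{host}/commands"),
      publish_interval_ms:
        env |> Map.get("SYSTANT_PUBLISH_INTERVAL_MS", "30000") |> String.to_integer(),
      # A random suffix keeps client IDs distinct across hosts and restarts.
      client_id: "systant-#{host}-#{random_suffix()}"
    }
  end

  defp hostname do
    {:ok, name} = :inet.gethostname()
    to_string(name)
  end

  defp random_suffix do
    :rand.uniform(0xFFFFFFFF) |> Integer.to_string(16) |> String.downcase()
  end
end
```

Calling `ConfigSketch.load(%{})` yields the documented defaults, while passing a map with overrides shows how a per-host setting would take precedence.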

NixOS Deployment

This project includes a complete Nix packaging and NixOS module:

  • Package: nix/package.nix - Builds the Elixir release using beamPackages.mixRelease
  • Module: nix/nixos-module.nix - Provides services.systant configuration options
  • Development: Use nix develop for development shell with Elixir/Erlang

The NixOS module supports:

  • Configurable MQTT connection settings
  • Per-host topic naming using ${config.networking.hostName}
  • Environment variable configuration for runtime settings
  • Systemd service with security hardening
  • Auto-restart and logging to systemd journal

Dashboard

The project includes a Phoenix LiveView dashboard (dashboard/) that provides real-time monitoring of all systant instances.

Dashboard Features

  • Real-time host status updates via MQTT subscription
  • LiveView interface showing all connected hosts
  • Automatic reconnection and error handling

Dashboard MQTT Configuration

  • Subscribes to systant/+/stats to receive updates from all hosts
  • Uses hostname-based client ID: systant-dashboard-${hostname} to avoid conflicts
  • Connects to mqtt.home:1883 (same broker as systant instances)

Important Implementation Notes

  • Tortoise Handler: The handle_message/3 callback must return {:ok, state}, not []
  • Topic Parsing: Topics may arrive as lists or strings; handle both formats
  • Client ID Conflicts: Use unique client IDs to prevent connection instability
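
The first two notes can be sketched together. The module below is a simplified stand-in, not the project's handler: a real one would add `use Tortoise.Handler`, which is left out here so the example runs without the dependency. The state shape (`%{hosts: %{}}`) is an assumption for illustration.

```elixir
defmodule StatsHandlerSketch do
  # Hypothetical sketch of a handle_message/3 callback for systant/+/stats.

  # Always returns {:ok, state}, including for topics that are ignored.
  def handle_message(topic, payload, state) do
    case normalize_topic(topic) do
      ["systant", hostname, "stats"] ->
        {:ok, put_in(state, [:hosts, hostname], payload)}

      _other ->
        {:ok, state}
    end
  end

  # Topics may arrive as a list of levels or as a single string.
  defp normalize_topic(topic) when is_list(topic), do: topic
  defp normalize_topic(topic) when is_binary(topic), do: String.split(topic, "/")
end
```

Normalizing to a list of levels first means a single pattern match covers both input formats and extracts the hostname from the wildcard position.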

Development Roadmap

Phase 1: System Metrics Collection (Completed, except items marked TODO)

  • SystemMetrics Module: server/lib/systant/system_metrics.ex - Comprehensive metrics collection
  • CPU Metrics: Load averages (1/5/15min) and utilization via :cpu_sup
  • Memory Metrics: System memory data and monitoring via :memsup
  • Disk Metrics: Disk usage and capacity for all mounted drives via :disksup
  • System Info: Uptime, Erlang/OTP versions, scheduler info
  • System Alarms: Active os_mon alarms (disk_almost_full, system_memory_high_watermark, etc.)
  • MQTT Integration: Real metrics published every 30 seconds replacing simple messages
  • 🔄 Network Metrics: TODO - Interface statistics, bandwidth utilization
  • 🔄 GPU Metrics: TODO - NVIDIA/AMD GPU utilization, temperatures, memory usage

Implementation Details

  • Uses Erlang's built-in :os_mon application (cpu_sup, memsup, disksup)
  • Collects active system alarms from :alarm_handler with structured format
  • Graceful error handling with fallbacks when metrics unavailable
  • JSON payload structure: {timestamp, hostname, cpu, memory, disk, system, alarms}
  • Dashboard automatically receives and displays real-time system data and alerts
  • Alarm format: {severity, path/details, id} for clean consumption
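
A rough sketch of how the points above could fit together is shown below. It is not the project's SystemMetrics module: the module name, map keys, and severity values are assumptions. The `:cpu_sup`, `:memsup`, `:disksup`, and `:alarm_handler` calls are the standard OTP ones.

```elixir
defmodule MetricsSketch do
  # Hypothetical sketch; the real Systant.SystemMetrics may differ in shape and naming.

  def collect do
    # os_mon (and sasl, which provides :alarm_handler) must be running first.
    _ = Application.ensure_all_started(:os_mon)

    %{
      timestamp: DateTime.utc_now() |> DateTime.to_iso8601(),
      hostname: hostname(),
      cpu: with_fallback(&cpu/0),
      memory: with_fallback(fn -> Map.new(:memsup.get_system_memory_data()) end),
      disk: with_fallback(fn -> Enum.map(:disksup.get_disk_data(), &format_disk/1) end),
      system: %{otp_release: System.otp_release(), schedulers: System.schedulers_online()},
      alarms: with_fallback(fn -> Enum.map(:alarm_handler.get_alarms(), &format_alarm/1) end)
    }
  end

  # Runs a collector and returns an error map instead of crashing the caller.
  def with_fallback(fun) do
    fun.()
  rescue
    error -> %{error: Exception.message(error)}
  catch
    :exit, reason -> %{error: inspect(reason)}
  end

  # Severity values below are assumptions chosen for illustration.
  def format_alarm({{:disk_almost_full, mount}, _descr}) do
    %{severity: "warning", details: to_string(mount), id: "disk_almost_full"}
  end

  def format_alarm({:system_memory_high_watermark, _descr}) do
    %{severity: "critical", details: "system memory", id: "system_memory_high_watermark"}
  end

  def format_alarm({id, descr}) do
    %{severity: "info", details: inspect(descr), id: inspect(id)}
  end

  # cpu_sup reports load averages multiplied by 256.
  defp cpu do
    %{load1: :cpu_sup.avg1() / 256, load5: :cpu_sup.avg5() / 256, load15: :cpu_sup.avg15() / 256}
  end

  defp format_disk({mount, size_kb, used_percent}) do
    %{mount: to_string(mount), size_kb: size_kb, used_percent: used_percent}
  end

  defp hostname do
    {:ok, name} = :inet.gethostname()
    to_string(name)
  end
end
```

The resulting map contains only JSON-friendly values, so it could be passed to Jason for encoding before publishing. Wrapping each collector separately means one failing subsystem yields an `error` entry for that key while the rest of the payload is still reported.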

Phase 2: Command System

  • Subscribe to systant/+/commands in MqttClient
  • Implement secure command execution framework with validation/whitelisting
  • Support commands like: restart services, update packages, system queries
  • Response mechanism to send command results back via MQTT
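
Since Phase 2 is not implemented yet, the following is only one possible shape for the validation step. The payload format (`%{"command" => name}`), the allowlisted commands, and the module name are all hypothetical.

```elixir
defmodule CommandSketch do
  # Hypothetical allowlist: command name => {executable, fixed arguments}.
  # Callers never supply the executable or its arguments directly.
  @allowed %{
    "uptime" => {"uptime", []},
    "disk_usage" => {"df", ["-h"]}
  }

  # Accepts only payloads that name an allowlisted command.
  def validate(%{"command" => name}) when is_binary(name) do
    case Map.fetch(@allowed, name) do
      {:ok, spec} -> {:ok, spec}
      :error -> {:error, :not_allowed}
    end
  end

  def validate(_payload), do: {:error, :invalid_payload}

  # Runs a validated command and returns a map suitable for an MQTT response.
  def run(payload) do
    with {:ok, {executable, args}} <- validate(payload) do
      {output, status} = System.cmd(executable, args, stderr_to_stdout: true)
      {:ok, %{status: status, output: output}}
    end
  end
end
```

Mapping names to fixed argument lists, rather than accepting arguments from the message, avoids passing untrusted input to the shell.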

Phase 3: Home Assistant Integration

  • Custom MQTT integration following Home Assistant patterns
  • Auto-discovery of systant hosts via MQTT discovery protocol
  • Create entities for metrics (sensors) and commands (buttons/services)
  • Dashboard cards and automation support
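
For the auto-discovery item, Home Assistant's MQTT discovery expects a config message on a topic of the form `homeassistant/<component>/<node_id>/<object_id>/config`. The sketch below builds such a message for one sensor. The `systant_` prefix, the module and function names, and the JSON path into the stats payload are assumptions, and hostnames containing characters other than letters, digits, `_`, or `-` would need sanitizing first.

```elixir
defmodule DiscoverySketch do
  # Hypothetical sketch; identifiers and the JSON path are illustrative only.

  # Returns {topic, payload} for a single sensor entity.
  def sensor_config(hostname, object_id, json_path) do
    topic = "homeassistant/sensor/systant_#{hostname}/#{object_id}/config"

    payload = %{
      "name" => "#{hostname} #{object_id}",
      "unique_id" => "systant_#{hostname}_#{object_id}",
      # Entities read their value from the existing per-host stats topic.
      "state_topic" => "systant/#{hostname}/stats",
      "value_template" => "{{ value_json.#{json_path} }}"
    }

    {topic, payload}
  end
end
```

For example, `DiscoverySketch.sensor_config("alpha", "load1", "cpu.load1")` produces a config that would read a `cpu.load1` field, if the stats payload exposes one under that name.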

Future Plans

  • Multi-host deployment for comprehensive system monitoring
  • Advanced alerting and threshold monitoring
  • Historical data retention and trending