Building a self-hosted agentic OS I call AEGIS — Adaptive Execution & Generative Intelligence System. Running on a single workstation with a consumer GPU.
The core idea is a three-tier model cascade: a cloud model handles architecture and review, a local 32B model handles execution and code generation, smaller local models handle evaluation. The cloud model never executes directly — it reviews diffs and approves before anything gets committed.
The interesting problems so far: GPU arbitration across competing inference services using a distributed lock, giving local models read-only access to institutional memory before task execution so they're not flying blind, and autonomous fleet provisioning — I spun up a new server node last night without touching it after the USB went in.
Next phase is adding department queues so the system understands context — infrastructure work vs. client consulting work vs. internal tooling — and idle-time priority advisory so it starts anticipating what I need rather than waiting to be asked.
Goal is something closer to Jarvis than a chatbot. Early days but the bones are solid.
reply