The sidebar method is better than agents for many reasons.
The main one is that I'd rather review the limited changes from a one-shot prompt than a pile of changes made by an agent.
Another is the massive token usage agents force you into. As the context window fills up, quality goes down and token usage spikes as the model traps itself in low-confidence reasoning loops.
I tried to briefly outline the project's features at the start of the page. I'm sorry if I didn't communicate it clearly. Feel free to pinpoint where you were thrown off.
You need immutability so you don't have to worry about or waste time maintaining the system, and can instead focus on your containers.
Lightwhale isn't a special Debian variant. It's built from the ground up with Buildroot. It is literally purpose-built for this task.
A package manager doesn't remove the burden of maintenance; it just makes it easier. But it's still maintenance. I'm basically arguing it's unnecessary as long as you have a Docker Engine.
As for Snap and Flatpak, while they're totally different concepts, I agree.
Linux (or GNU) doesn't solve any of this by itself.
True, the root account is a layer of security. But there are still many other problems, even attack surfaces, on a system.
Indeed, Debian stable with podman/Docker is "immutable enough" for me.
It's also insurance that I'll get help whenever I'm stuck.
Sure it could be smaller ... but when it already runs fine on any hardware, even weird stuff like a BananaPi with a low-end RISC-V processor, then I have a difficult time wanting anything else.
I won't try to change your mind, but I think you should know that...
Debian isn't immutable at all. Everything is writable, from the partition table and the kernel to the C library and every executable.
Lightwhale has very few moving parts, making it less likely to get stuck. The single-page website is easily searchable and has information to get most people started. The Lightwhale Discord server answers the rest.
Here's a link to the advantages that Lightwhale's immutability brings to the table, just to give you an idea of what you're dealing with: https://lightwhale.asklandd.dk/#immutability
"Local AI should be a default, not a privilege: private data, no per-token bill, no vendor lock-in. The hardware to run capable models already sits on desks. The software to run those chips well doesn't."
So figure out how to run it on Vulkan instead of requiring the user to be locked into expensive CUDA cards.
Just so everyone is aware: you can already run Qwen3.5-27B on Vulkan or on Apple's hardware. Every major inference engine supports it right now.
This repo is a vibecoded demo implementation of some recent research papers, combined with some optimizations that sacrifice quality for speed to get a big number that looks impressive. The 207 tok/s figure they're claiming only appears in the headline; the results they actually show are half that or less, so I already don't trust anything they say they've accomplished.
If you want to run Qwen3.5-27B, you can do it with the llama.cpp project on CUDA, Vulkan, Apple silicon, or even the CPU.
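For anyone who wants to try: here's a minimal sketch using the llama-cpp-python bindings. The GGUF filename is a placeholder (grab whichever quant you want from Hugging Face), and the backend is chosen when the wheel is built, not at runtime.

```python
# Minimal sketch with llama-cpp-python; the backend is baked in at
# install time, e.g. a Vulkan build of the wheel is roughly:
#   CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="qwen-27b-q4_k_m.gguf",  # placeholder local GGUF file
    n_gpu_layers=-1,   # offload all layers to the GPU (0 = pure CPU)
    n_ctx=8192,        # context window size
)

out = llm("Write a haiku about local inference.", max_tokens=64)
print(out["choices"][0]["text"])
```

The same script runs unchanged on a CUDA, Vulkan, Metal, or CPU build; only the install step differs.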
You can run pretty much every model on Vulkan, including the Qwen MoE models. You can also run pretty much every model on ROCm, Apple Silicon via MLX, and Intel hardware via OpenVINO.

Nvidia got there first, but they're no longer clearly dominant in the self-hosting space, simply because of the high cost. I think Apple probably has the lead there, due to unified memory allowing big models to run without multiple big dedicated GPUs, but stuff like Strix Halo with 128GB of unified memory is also pretty much sold out everywhere. There's a lower bound on how small a model can be and still be useful.
Anyway, I don't have any Nvidia hardware, and I've got several local models running and/or training at all times.
Like with all new tech trends, it takes them a hot minute to catch up, but it's highly likely they will (eventually) release some killer platforms for local AI. The shared memory, high bandwidth, and power efficiency of their M chips is a near-ideal architecture. If/when they finally push out the M5 Ultra, that could be round one (albeit still not the best price/performance vs. comparable cloud API tokens). A real mass-market killer device for local LLMs is still going to require some remediation of the global DRAM shortages, and maybe the M6/M7 generation.
Apple has Metal, which is already pretty well integrated in llama.cpp, various Python libs, and mistral-rs & candle. Unpopular opinion, but Vulkan is hot garbage and the definition of "design by committee." There's a reason people still prefer CUDA, though most code could likely be ported programmatically anyway.
After the steep increase in sales of Mac Studios specifically for LLMs, I'm waiting for Apple to release a frontier-level model optimized for the highest end of Apple hardware (probably hardware-locked to a specific neural processor, which would in turn lock the memory config).
The built-in Apple Intelligence right now is very small, but even just having a small LLM you know is always there, online, fast, and ready makes you think about building apps differently. I would love the context to expand from the meager ~4K tokens.
I'm in .NET as well. The clean code virus runs rampant.
Swimming in DTOs and ViewModels that are exact copies of Models; services with two methods in them: a command method and then the actual command the command method calls, even though the calling class already has access to the data the command method operates on; three layers of generic abstractions that ultimately boil down to a three-method class.
Debugging anything is a nightmare with all the jumps through all the different classes. Hell, just learning the code base was a nightmare.
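If that sounds abstract, here's roughly what the pattern looks like (sketched in Python for brevity; the C# version just adds more ceremony, and all names here are made up):

```python
from dataclasses import dataclass

@dataclass
class Order:          # the Model
    id: int
    total: float

@dataclass
class OrderDto:       # an exact copy of the Model, for no reason
    id: int
    total: float

class OrderService:
    def submit(self, order: Order) -> None:
        # the "command method": its only job is to call...
        self._do_submit(order)

    def _do_submit(self, order: Order) -> None:
        # ...the actual command, which the caller could have run itself
        print(f"submitting order {order.id} for {order.total}")
```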
Now I'm balls deep in a warehouse migration, which means rewriting the ETL to accommodate both systems until we flip the switch. And the people who originally wrote the ETL apparently didn't read the documentation for any of it.
For the California legislation there were no "nay" votes. It's disappointing that this performatively protective stance permeates both dominant right-of-global-center parties in America, but it is "all of them".
Pretty much the same laws in red and blue states, yeah. It always gets confusing when Americans use the word liberal: everyone is a liberal, but it never meant *your* liberty.
Responsible parents don't have separate OS accounts for their children.