
steffs · 2026-04-26T00:11:19 1777162279

The onion model is the right mental frame. The nastiest failures are often not obviously bad SQL; they are valid queries that become dangerous only after the planner sees real cardinalities. Row limits and statement timeouts help, but a query can still thrash caches or hold locks before the timeout hits. Is your pre-execution cost check based on an EXPLAIN-style plan with relation-level budgets, or is it mostly AST heuristics plus database backstops? That boundary usually decides whether something feels safe enough for production data.
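To make the "EXPLAIN-style plan with relation-level budgets" option concrete, here is a rough sketch of what I have in mind. It assumes PostgreSQL's `EXPLAIN (FORMAT JSON)` output shape (`Plan`, `Plans`, `Relation Name`, `Plan Rows`, `Total Cost`); the budget table, the cost ceiling, and the function names are made up for illustration, and the sample plan is hand-written rather than captured from a real database. It only sees planner estimates, so it inherits every misestimate the planner makes.

```python
from typing import Any, Iterator

Plan = dict[str, Any]


def walk(node: Plan) -> Iterator[Plan]:
    """Yield a plan node and all of its descendants, depth first."""
    yield node
    for child in node.get("Plans", []):
        yield from walk(child)


def check_plan(explain_json: list[Plan],
               row_budgets: dict[str, int],
               max_total_cost: float) -> list[str]:
    """Return human-readable budget violations for one EXPLAIN result.

    row_budgets and max_total_cost are hypothetical policy inputs,
    not anything PostgreSQL provides.
    """
    root = explain_json[0]["Plan"]
    violations = []
    if root["Total Cost"] > max_total_cost:
        violations.append(
            f"total cost {root['Total Cost']} exceeds {max_total_cost}")
    for node in walk(root):
        relation = node.get("Relation Name")
        if relation is None:
            continue  # joins, sorts, etc. do not touch a relation directly
        budget = row_budgets.get(relation)
        if budget is not None and node["Plan Rows"] > budget:
            violations.append(
                f"{node['Node Type']} on {relation} estimates "
                f"{node['Plan Rows']} rows, budget is {budget}")
    return violations


# Hand-written sample in the shape of EXPLAIN (FORMAT JSON) output.
sample = [{
    "Plan": {
        "Node Type": "Hash Join", "Plan Rows": 5000000, "Total Cost": 250000.0,
        "Plans": [
            {"Node Type": "Seq Scan", "Relation Name": "orders",
             "Plan Rows": 5000000, "Total Cost": 180000.0},
            {"Node Type": "Index Scan", "Relation Name": "customers",
             "Plan Rows": 1, "Total Cost": 8.5},
        ],
    }
}]

violations = check_plan(sample,
                        row_budgets={"orders": 100000, "customers": 1000},
                        max_total_cost=1000000.0)
for v in violations:
    print(v)
```

With the sample above, only the sequential scan on `orders` is flagged. An AST-only check would not catch that, since the query text looks the same whether `orders` has a hundred rows or five million.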

steffs · 2026-04-20T14:51:02 1776696662

I think the deeper point is not CLIs specifically; it is structured local tooling with predictable side effects. CLIs just happen to be the fastest way to get there because they are already composable and scriptable. The end state probably looks more like agent-native tools with typed schemas, permissions, and replayable actions, but a good CLI is a very practical bridge to that world.
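As a rough illustration of "typed schemas, permissions, and replayable actions", here is a minimal sketch. Every name in it (`Tool`, `Session`, `replay`, the `word_count` tool, the `"read"` permission) is hypothetical and not taken from any existing agent framework; it just shows the three properties in one place.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    params: dict[str, type]      # typed schema: argument name -> expected type
    requires: frozenset[str]     # permissions the caller must hold
    run: Callable[..., str]


@dataclass
class Session:
    granted: frozenset[str]
    log: list[tuple[str, dict]] = field(default_factory=list)

    def call(self, tool: Tool, **args) -> str:
        missing = tool.requires - self.granted
        if missing:
            raise PermissionError(f"{tool.name} needs {sorted(missing)}")
        if set(args) != set(tool.params):
            raise TypeError(f"{tool.name} expects {sorted(tool.params)}")
        for key, expected in tool.params.items():
            if not isinstance(args[key], expected):
                raise TypeError(f"{key} must be {expected.__name__}")
        # Record the action before running it so the log can be replayed.
        self.log.append((tool.name, dict(args)))
        return tool.run(**args)


def replay(log: list[tuple[str, dict]], tools: dict[str, Tool],
           granted: frozenset[str]) -> list[str]:
    """Re-run a recorded log in a fresh session with the given permissions."""
    session = Session(granted)
    return [session.call(tools[name], **args) for name, args in log]


word_count = Tool(
    name="word_count",
    params={"text": str},
    requires=frozenset({"read"}),
    run=lambda text: str(len(text.split())),
)

session = Session(granted=frozenset({"read"}))
print(session.call(word_count, text="a good CLI"))
print(replay(session.log, {"word_count": word_count}, frozenset({"read"})))
```

A CLI gives you a weaker version of each piece already: flags as the schema, OS permissions as the grant, and shell history as the log. That is roughly why it works as a bridge.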

steffs · 2026-04-20T14:50:21 1776696621

The part that stands out is that you are optimizing for warm state instead of cold boot. That feels right for dev shells. If the workload is repeated short-lived environments, template fork time matters more than booting a minimal kernel fast. How do you handle template drift over time? Do you periodically rebuild and re-warm from scratch, or can you patch a warm template in place without losing the memory-sharing gains?

rasengan · 2026-04-20T21:22:59 1776720179

Great question! We rebuild if there's a security update or otherwise every few weeks. We're working on a better method, but right now a few templates can be kept warm so users aren't forced to reboot.

steffs · 2026-04-02T21:47:10 1775166430

The multi-modal bundling is the part that stands out more than the raw inference speed. If you are building an app that needs text generation, image generation, and speech recognition, right now the local setup is three separate services with three different APIs and three different model management stories. Having one server handle all of that behind OpenAI-compatible endpoints is a real quality of life improvement for anyone prototyping locally. The NPU angle is interesting but probably overstated for most use cases. The discussion in the thread confirms what I would expect: NPUs shine for small always-on models and prefill offloading, not for the chatbot workloads most people care about. Where this gets genuinely compelling is if AMD can make the combined GPU plus NPU scheduling transparent enough that developers do not need to think about which hardware is running which part of the pipeline. That is not a solved problem on any platform yet, and if Lemonade gets it right for even a subset of workloads, it becomes the default choice on AMD hardware regardless of how it benchmarks against Ollama on pure text generation.

steffs · 2026-03-30T02:30:43 1774837843

The no-JSON-boundary piece is the part that stands out to me. Most polyglot runtimes spend a lot of cycles serializing and deserializing at the language boundary, and that cost compounds fast when you are doing SSR or tight per-connection loops. Having Erlang read the native DOM directly without a string rendering step is a real architectural win, not just a convenience. Curious how you handle the supervision semantics when a JS runtime crashes.

fouc · 2026-03-30T05:29:37 1774848577

"is a real architectural win, not just a convenience." AI use spotted

steffs · 2026-03-19T14:19:35 1773929975

The StumbleUpon comparison is apt but I think what made StumbleUpon work was the social layer: you could see what your friends upvoted, and that created an implicit filter against the pure randomness. Pure random discovery is fun for a session but gets old. Would love to see something like a lightweight trust graph here where a site vouching for other sites carries weight, similar to how Webring worked but with signal about quality rather than just affiliation.

steffs · 2026-01-20T03:17:48 1768879068

Doesn't make sense... most of the software used on 3D printers is open source.

steffs · 2026-01-20T02:29:23 1768876163

I love this low-level stuff. Do you think this is the reason that in the past year or two DNS providers have stopped allowing CNAME records for the root domain? That is, is it to ensure backwards compatibility, so that order-sensitive DNS resolution tools do not get confused?

steffs · on May 12, 2022

The realness here is palpable

steffs · on May 10, 2022

So guilty of this... 1st thought: "I should just learn this new skill so I can do it myself." 2nd thought: "That will take days/weeks... learn enough to hire it out." 3rd thought: "I'm glad I went with the 2nd thought."