Fair point on the syntax; I should have been clearer. What I meant is that your existing Ruby code doesn't need modifications. In Python you'd need to switch to a different HTTP library and add `async def` and `await` everywhere, etc. In Ruby the same `Net::HTTP` call works in both sync and async contexts.

The `Async do` wrapper sits only at the orchestration level, not throughout your codebase. That's a huge difference in practice.
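
Roughly, it looks like this (a minimal sketch, assuming the async gem 2.x on Ruby 3.x; the hosts and method name are placeholders of mine, not from the post):

```ruby
require "async"
require "net/http"

# The same method works in both contexts: no async/await "coloring" needed.
def fetch_status(host)
  Net::HTTP.get(URI("https://#{host}/status"))
end

fetch_status("example.com")  # plain synchronous call

Async do |task|
  # Inside the reactor, Net::HTTP yields the fiber while waiting on the socket.
  tasks = %w[a.example.com b.example.com].map { |h| task.async { fetch_status(h) } }
  tasks.map(&:wait)
end
```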

Regarding pgbouncer - yes, it helps with connection pooling, but you still have the fundamental issue of 25 workers = 25 max concurrent LLM streams. Your 26th user waits. With fibers, you can handle thousands on the same hardware because they yield during the 30-60s of waiting for tokens.
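
A toy sketch of the scale difference (the sleep stands in for waiting on streamed tokens; inside Async, sleep is non-blocking because the fiber scheduler hooks it):

```ruby
require "async"

# Each "stream" spends almost all its time waiting, so one process can
# multiplex thousands of them instead of being capped at worker count.
Async do |task|
  1_000.times do |i|
    task.async do
      sleep 30  # fiber yields here; other streams keep making progress
      puts "stream #{i} done"
    end
  end
end
```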

Sure, for pure performance you'd pick another language. But that's not the point - the point is that you can get much better performance for IO-bound workloads in Ruby today, without switching languages or rewriting everything.

It's about making Ruby better at what it's already being used for, not competing with system languages.


> Regarding pgbouncer - yes, it helps with connection pooling, but you still have the fundamental issue of 25 workers = 25 max concurrent LLM streams.

I guess my point is: why are you picking an arbitrarily low number like 25? If you know that workers are going to be "waiting for tokens" most of the time, why not bump that number way, WAY up?

And I guess I should clarify - I'm coming into this from outside the Python space (I touch Python because it's hard to avoid when doing AI work right now, but it's hardly my favorite language). Having done a lot of Go, which uses goroutines in basically the same way Ruby uses fibers (lightweight, runtime-managed thread replacements), I'll tell you up front: the orchestration level still matters a LOT, and you're going to be dealing with a lot of complexity there to make things work, even if it does mean that some lower-level code can remain unaware (colorless).

Even good ol' fashioned C++ has had this concept bouncing around for a long time (https://github.com/boostorg/fiber). It's good at some things, but it's absolutely not the silver bullet I feel like you're trying to pitch it as here.


Why not bump it to 10,000 threads? The post shows what happens: the OS scheduler struggles badly, with 18x slower thread allocation and 17x slower context switching. That's measured overhead, not theory.

Complexity? We migrated in 30 minutes. It’s just Async blocks, not goroutine scheduling gymnastics.

Not claiming it’s a silver bullet - the post explicitly says “use threads for CPU work”. But for I/O-bound LLM streaming, the massive improvement is real and in production.


Author here. Thank you, that means a lot!

Happy to answer any questions.


Author here. The problem: every thread-based job queue (Sidekiq, GoodJob, etc.) has a max_threads limit. Set it to 25? That's your hard ceiling for concurrent LLM conversations. The 26th user waits, even though your server is 99% idle.

Switched to async-job in 30 minutes. No code changes needed. With no max_threads ceiling, our AI app went from barely handling 25 users to thousands on the same hardware.

The beautiful part: existing libraries like RubyLLM automatically get async performance because Net::HTTP yields to fibers. No special async versions needed. No async/await keywords polluting your codebase.
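
In practice it looks something like this (a sketch, not our actual code: `questions` is made up, and it assumes RubyLLM is configured with provider credentials; the streaming block shape follows RubyLLM's docs):

```ruby
require "async"
require "ruby_llm"

# Stand-in prompts; in the real app these come from user requests.
questions = ["Summarize this doc", "Draft a reply", "Explain fibers"]

Async do |task|
  questions.each do |q|
    task.async do
      chat = RubyLLM.chat
      # Each chunk arrives as tokens stream; the fiber yields between chunks.
      chat.ask(q) { |chunk| print chunk.content }
    end
  end
end
```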

After a decade in Python's asyncio world, this feels to me like how async should have been done. You get massive concurrency without the complexity tax.

Happy to discuss the technical details or migration experience. The future of Ruby AI apps is here, and it's surprisingly simple.


Daily!


I checked the documentation and I don't see that I can do this.

Also, could you return only the difference in the data that appeared on the website?


There will be no diff support, but it will be updated daily.


Fascinating timing for this feature. I suspect the AI chatbot explosion has been a major driver behind the push for richer select elements.

Have you noticed how every AI interface needs to let you choose between models? The current select element is embarrassingly inadequate when you need to show more than just text labels. You want to display model capabilities, performance indicators, context sizes - not just "GPT-4" vs "Claude 3.7" as plain text.


Who's actually making the claim we should replace everything with natural language? Almost nobody serious. This article sets up a bit of a strawman while making excellent points.

What we're really seeing is specific applications where conversation makes sense, not a wholesale revolution. Natural language shines for complex, ambiguous tasks but is hilariously inefficient for things like opening doors or adjusting volume.

The real insight here is about choosing the right interface for the job. We don't need philosophical debates about "the future of computing" - we need pragmatic combinations of interfaces that work together seamlessly.

The butter-passing example is spot on, though. The telepathic anticipation between long-married couples is exactly what good software should aspire to. Not more conversation, but less need for it.

Where Julian absolutely nails it is the vision of AI as an augmentation layer rather than a replacement. That's the realistic future - not some chat-only dystopia where we're verbally commanding our way through tasks that a simple button press would handle more efficiently.

The tech industry does have these pendulum swings where we overthink basic interaction models. Maybe we could spend less time theorizing about natural language as "the future" and more time just building tools that solve real problems with whatever interface makes the most sense.


I don't think it's a straw man: there are lots of people who think it might replace everything, or who are under vague impressions that it might. Plenty of less technical people, because they haven't thought it through.

The article is useful because it enunciates arguments which many of us have intuited but are not necessarily able to explain ourselves.


Thank you! This is what the Ruby community has always prioritized - developer experience. Making complex things simple and joyful to use isn't just an aesthetic preference; it's practical engineering. When your interface matches how developers think about the problem domain, you get fewer bugs and more productivity.


Thank you for your kind words!

Valid point. I'm actually already working on better streaming using async-http-faraday, which configures Faraday's default adapter to use async_http, paired with falcon and async-job instead of thread-based approaches like Puma and SolidQueue. This should significantly improve resource efficiency for AI workloads in Ruby - something I'm not aware any other major Ruby LLM library does. The current approach with blocks is idiomatic Ruby, but the upcoming async support will make the library even better for production use cases. Stay tuned!
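
For the curious, the adapter wiring is roughly this (a sketch, assuming the :async_http adapter that async-http-faraday registers):

```ruby
require "async/http/faraday"

# Make async_http Faraday's default adapter so HTTP calls made inside an
# Async reactor yield their fiber instead of blocking an OS thread.
Faraday.default_adapter = :async_http
```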


Thanks for flagging this. The eval appeared only in the docs and was meant purely as an example, but we definitely don't want to promote dangerous patterns there. I've updated the docs.


Thank you!

