
Erlang/Elixir has some really great advantages in concurrency and parallelism but what you're describing are just badly designed systems.

Shopify, for example, uses Resque (Ruby + Redis) to process thousands of background jobs per second.

> * Require very specific ergonomics(for example, don't hand the model over, hand over the ID so you can pull over the freshest version and not overwrite)

This is good practice but certainly not a requirement. You can pass objects in a serialized format like JSON or use Protobuf etc.

> * They require a separate storage system, like your DB, Redis, etc. This doesn't sound big, but when doing complex things it can turn into hell.

ETS and Mnesia aren't production ready job queues, unfortunately: https://news.ycombinator.com/item?id=9828608

> * They have to be run in a separate process, which makes deployment more difficult.

Background tasks have different requirements so this is a good idea regardless.

> * They're slow. Almost all of them work on polling the receiving tables for work, which means you've got a lag time of 1-5 seconds per job. Furthermore, the worse your system load, the slower they go.

Redis queues have millisecond latency and there's no polling. Resque and Sidekiq use BRPOP to wait for jobs. BRPOP is O(1), so it doesn't slow down as the queue backs up.

PG has LISTEN/NOTIFY to announce new jobs or a state change of an existing job, so there's no need to poll. SELECT ... FOR UPDATE SKIP LOCKED also lets workers skip rows another worker has already locked, which keeps performance from degrading under load.
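
To make the blocking-pop point concrete, here is a minimal in-process sketch. It is an analogy only: Ruby's stdlib `Queue` stands in for a Redis list and `Queue#pop` stands in for BRPOP, so no Redis is involved and the numbers say nothing about real network latency. The `:stop` sentinel and the variable names are assumptions made for the example.

```ruby
# In-process analogy only: Queue stands in for a Redis list,
# and Queue#pop stands in for BRPOP. No Redis is involved here.
queue = Queue.new
latencies = []

worker = Thread.new do
  # pop blocks until an item arrives, so there is no sleep/poll interval
  while (enqueued_at = queue.pop) != :stop
    latencies << Process.clock_gettime(Process::CLOCK_MONOTONIC) - enqueued_at
  end
end

3.times do
  sleep 0.05 # the producer is idle between jobs, so the worker really does wait
  queue << Process.clock_gettime(Process::CLOCK_MONOTONIC)
end
queue << :stop
worker.join

# Pickup latency per job, in milliseconds
puts latencies.map { |seconds| (seconds * 1000).round(3) }.inspect
```

The worker wakes as soon as something is pushed, rather than on the next tick of a polling loop, which is the property being claimed for BRPOP above.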

> * You can't reliably "resume" from going multi-process. Lets say you're fine with the user waiting 2-3 seconds to have a request finish. With workers/queues, you either have to poll to figure out when something finished(which is not only very slow, but error prone), or you have to just go slow and not multi-process, making it into a 8-10 second request even though you've got the processing power to go faster.

There are multiple other options here which are better:

Threads - GIL allows parallel IO anyway and JRuby has no GIL

Pub/Sub - Both Redis and PG have a great basic implementation usable from the Ruby clients

Websockets - Respond early and notify directly from the background jobs
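
For the threads option, a rough stdlib-only sketch of what "parallel IO under the GIL" looks like. `fake_fetch` is a hypothetical stand-in for a blocking network call (it just sleeps, which releases the GIL the way blocking IO does in MRI), and the URLs are placeholders.

```ruby
require 'benchmark'

# fake_fetch is a hypothetical stand-in for a blocking network call;
# sleep releases the GIL the same way blocking IO does in MRI.
def fake_fetch(url)
  sleep 0.2
  "response for #{url}"
end

urls = %w[url1 url2 url3 url4]
results = nil

elapsed = Benchmark.realtime do
  # One thread per URL; Thread#value joins the thread and returns the block's result
  results = urls.map { |url| Thread.new { fake_fetch(url) } }.map(&:value)
end

puts results.inspect
puts format('%.2fs elapsed (serial would be about %.1fs)', elapsed, urls.size * 0.2)
```

The request thread gets all four results back in order without any queue or polling in between. CPU-bound work would not overlap like this on MRI.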




> This is good practice but certainly not a requirement. You can pass objects in a serialized format like JSON or use Protobuf etc.

ie, requiring very specific ergonomics. If you have to change what you're doing, it's a new domain to learn.

> ETS and Mnesia aren't production ready job queues, unfortunately: https://news.ycombinator.com/item?id=9828608

I didn't mention ETS or Mnesia? The OP was talking specifically about using job queues to get concurrency/parallelism, in which case you absolutely don't need job queues. If you need a job queue, you need a job queue.

> Background tasks have different requirements so this is a good idea regardless.

Why? You're just stating this like it's obviously true, and honestly I can't think of a time I significantly wanted a different system doing my jobs than the one handling requests.

> Redis queues have millisecond latency and there's no polling. Resque and Sidekiq use the BRPOP to wait for jobs. BRPOP is O(1), so it doesn't slow down as the queue backs up.

Redis queues have millisecond latency, Ruby using Redis queues does not. That's the part that polls when there's nothing else going on. If you're never running out of jobs to do then your latency is fast, but you're also not accomplishing things as fast as possible(since it's waiting on whatever is in front of it).

If this isn't true anymore, then alright, but when I last used Sidekiq (early 2018), the latency to start processing a job was often greater than a second.

> Threads - GIL allows parallel IO anyway and JRuby has no GIL

And are incredibly difficult to use and pass information back and forth(hence why Elixir exists at all- Jose Valim was the person implementing this on the Rails core team).

> Pub/Sub - Both Redis and PG have a great basic implementation usable from the Ruby clients

Can certainly work. To be honest, I never tried this because of the complexity of the initial setup and how green I was when I needed it.

> Websockets - Respond early and notify directly from the background jobs

Which Ruby has a lot of trouble maintaining performantly. When my original team went to use Rails5 sockets, we found we could barely support 50 sockets per machine.

---

To be clear, I'm not saying one shouldn't use Ruby: the place I work right now is primarily a Ruby shop, and my Elixir work is for event processing and systems needing microsecond response times. But we've also built things in Elixir that I would normally use Ruby or JS for, and not only does it do well, it's often write-it-and-forget-it, with deployment being literally "run a container and set the address in connected apps".


> ie, requiring very specific ergonomics. If you have to change what you're doing, it's a new domain to learn.

Resque & Sidekiq build this in by converting job arguments to JSON. There's nothing extra to learn.
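
A small sketch of what a JSON round trip does to plain job arguments, using only the stdlib `json` library rather than Resque or Sidekiq themselves. The argument values are made up for illustration.

```ruby
require 'json'

# Hypothetical job arguments; the names are illustrative only.
args = [42, { notify: true, tags: [:urgent] }]

# Roughly what happens between enqueue and perform: dump to JSON, parse back.
round_tripped = JSON.parse(JSON.generate(args))

puts round_tripped.inspect
```

Integers, booleans, arrays and hashes survive as-is, while symbol keys and symbol values come back as strings, so a job reads `opts["notify"]` rather than `opts[:notify]`.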

> I didn't mention ETS or Mnesia? The OP was talking specifically about using job queues to get concurrency/parallelism, in which case you absolutely don't need job queues. If you need a job queue, you need a job queue.

Sorry, I thought you were talking about building a background job system in Erlang using out-of-the-box OTP, but it sounds like you're actually talking about trying to get parallelism in Ruby by doing RPC over Sidekiq? That's always a bad idea!

> Redis queues have millisecond latency, Ruby using Redis queues does not. That's the part that polls when there's nothing else going on. If you're never running out of jobs to do then your latency is fast, but you're also not accomplishing things as fast as possible(since it's waiting on whatever is in front of it).

Ahhh! When Mike Perham says "Sidekiq Pro cannot reliably handle multiple queues without polling", what this really means is that a Redis client can only block on, and immediately process from, the highest priority queue. The lower priority queues are only checked when the blocking timeout expires. There's no "check all queues and sleep" polling loop adding artificial latency.

> And are incredibly difficult to use and pass information back and forth(hence why Elixir exists at all- Jose Valim was the person implementing this on the Rails core team).

Jose Valim didn't join Rails core until a couple years after Josh Peek (now working for GitHub) made Rails thread-safe.

> And are incredibly difficult to use and pass information back and forth(hence why Elixir exists at all- Jose Valim was the person implementing this on the Rails core team).

It's really not that hard anymore!

results = Parallel.map(['url1', 'url2'], in_threads: 2) { |url| HTTParty.get(url) }

From 2012-2013 onwards, Ruby got great libraries like concurrent-ruby and parallel that make things a lot easier.

> Which Ruby has a lot of trouble maintaining performantly. When my original team went to use Rails5 sockets, we found we could barely support 50 sockets per machine.

ActionCable is designed for convenience, not performance. https://github.com/websocket-rails/websocket-rails will handle thousands of connections per process.



