Event-Driven Architecture (herbertograca.com)
154 points by kiyanwang on Oct 11, 2017 | 28 comments

In my experience the most important feature of any event driven system should be to have a simple way to look up all the listeners that exist, what events they're hooking into, and in which exact order. As soon as you start having them all around the code it becomes a nightmare to follow the logic, especially if the code has side effects.

> In my experience the most important feature of any event driven system should be to have a simple way to look up all the listeners that exist, what events they're hooking into, and in which exact order.

I’m with you on the first part: finding all possible listeners to a certain event is sometimes useful in the same way that finding all possible callers of a certain function is useful.

I’m not so sure about the “which exact order” part, though. IME, this style of architecture works best when you can entirely decouple event listeners and run them asynchronously, so each event listener is unknown to both the original event source and any other event listeners, except for whatever mechanism is used to register as a listener.

If you do need an event to trigger multiple actions in different parts of your system in a specific order, then apparently those parts of the system aren’t entirely independent. In that case, I usually suggest creating a single listener somewhere near them to co-ordinate. It might be very simple, but it makes the dependency between the multiple actions explicit, and now you have a single place in your code with responsibility for triggering each action in the required sequence, without imposing any ordering requirements on your entire event-based architecture.
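A minimal sketch of that co-ordinating listener, assuming a hypothetical in-process `EventBus` and made-up action names (`reserve_stock`, `charge_card`, `send_receipt`); none of this comes from the article or a real library:

```python
from typing import Callable, Dict, List


class EventBus:
    """Hypothetical minimal bus; it promises nothing about listener order."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Callable[[dict], None]]] = {}

    def subscribe(self, event_name: str, listener: Callable[[dict], None]) -> None:
        self._listeners.setdefault(event_name, []).append(listener)

    def publish(self, event_name: str, payload: dict) -> None:
        for listener in list(self._listeners.get(event_name, [])):
            listener(payload)


calls: List[str] = []  # records what ran, in order, for illustration


def reserve_stock(order: dict) -> None:
    calls.append(f"reserve:{order['id']}")


def charge_card(order: dict) -> None:
    calls.append(f"charge:{order['id']}")


def send_receipt(order: dict) -> None:
    calls.append(f"receipt:{order['id']}")


def on_order_placed(order: dict) -> None:
    """The single co-ordinating listener: the required sequence lives here."""
    reserve_stock(order)
    charge_card(order)
    send_receipt(order)


bus = EventBus()
bus.subscribe("order_placed", on_order_placed)
bus.publish("order_placed", {"id": 42})
```

The bus only knows about one listener for this event, so the ordering requirement stays local to `on_order_placed` rather than becoming a rule for the whole event system.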

I meant the situation when you have multiple listeners listening on the same event. Even if you're running all handlers asynchronously, knowing the order of their invocation can be helpful when you're trying to figure out what happened by looking in the logs.

This makes a ton of sense. Do you have any examples / explanations / tutorials you could point to that demonstrate how this is achieved in code?

In a microservice architecture this is known as service discovery and is handled by applications like etcd, Consul, and ZooKeeper.

Inside a single process you can accomplish the same thing by having a central interface that all others use to register themselves and query others.
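For the single-process case, a rough sketch of such a central interface might look like this. `ListenerRegistry`, the `owner` label and the event names are all hypothetical, just one way to make "who listens to what, in which order" queryable:

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Tuple

Listener = Callable[[dict], None]


class ListenerRegistry:
    """Hypothetical central place where every listener must register."""

    def __init__(self) -> None:
        self._by_event: DefaultDict[str, List[Tuple[str, Listener]]] = defaultdict(list)

    def register(self, event_name: str, listener: Listener, owner: str) -> None:
        # Registration order is kept, so it doubles as the invocation order.
        self._by_event[event_name].append((owner, listener))

    def listeners_for(self, event_name: str) -> List[str]:
        entries = self._by_event.get(event_name, [])
        return [f"{owner}.{listener.__name__}" for owner, listener in entries]

    def describe(self) -> Dict[str, List[str]]:
        """Full map of event name -> listeners, in invocation order."""
        return {name: self.listeners_for(name) for name in sorted(self._by_event)}

    def dispatch(self, event_name: str, payload: dict) -> None:
        for _owner, listener in self._by_event.get(event_name, []):
            listener(payload)


def update_search_index(payload: dict) -> None:
    pass  # placeholder for real work


def send_welcome_email(payload: dict) -> None:
    pass  # placeholder for real work


registry = ListenerRegistry()
registry.register("user_created", update_search_index, owner="search")
registry.register("user_created", send_welcome_email, owner="mailer")
listing = registry.describe()
```

Dumping `registry.describe()` at startup (or exposing it on a debug endpoint) gives the one-place lookup the top comment asks for.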

How do you do this with disparate systems? For example, a Ruby application that handles and emits events concurrently with a Python one.

Centralized logging, with a track-and-trace ID that is passed around and logged. I have seen this work to great effect, with a dynamic, force-directed graph being built in real time showing the flow of calls/messages between systems. It also provided timings and drilldowns to find processing issues and performance problems.

People I've worked with kinda-sorta understand the advantages of having a trace ID (aka context ID, correlation ID, etc.), but don't code it into their systems. Or they do something like replace an incoming ID with their own, which negates its value.

For folks who aren't familiar with the concept: at the initiation of any action (user button click, timer firing, automated job starting, etc.) a new ID is created, then passed to every single method and external call. This allows you to trace that action as it wends its way through your distributed systems.
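A small sketch of the idea, with assumptions spelled out: the function names, the in-memory `log` list (standing in for centralized logging) and the `X-Trace-Id` header name are all made up for illustration.

```python
import uuid
from typing import Dict, List, Optional, Tuple

# (trace_id, message) pairs; a stand-in for a centralized log store.
log: List[Tuple[str, str]] = []


def new_trace_id() -> str:
    return uuid.uuid4().hex


def handle_button_click(trace_id: Optional[str] = None) -> str:
    # Create an ID only when an action starts; never replace an incoming one.
    trace_id = trace_id or new_trace_id()
    log.append((trace_id, "button clicked"))
    reserve_stock(trace_id)
    return trace_id


def reserve_stock(trace_id: str) -> None:
    log.append((trace_id, "stock reserved"))
    # Pass the same ID along on the external call (hypothetical header name).
    call_billing_service({"X-Trace-Id": trace_id})


def call_billing_service(headers: Dict[str, str]) -> None:
    # Stand-in for another system that logs under the ID it was given.
    log.append((headers["X-Trace-Id"], "billing called"))


trace = handle_button_click()
trail = [message for tid, message in log if tid == trace]
```

Filtering the log by one ID gives back the whole path of that action, which is what makes the cross-system tracing described above possible.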

This helps a lot, particularly if your event system spans multiple nodes. It's invaluable for correctness, but you can learn a lot about performance too.

The router should provide the feature. E.g., the crossbar.io router has meta RPC endpoints and pub/sub events letting you inspect and react to any activity taking place. It's expensive on resources, so you should use it sparingly, but it's very handy for getting a grasp on what the hell is going on.

You do it on the shared event bus.

Systems report which things they are listening to.

Listeners should all be in one place and the routes should be in one place; ideally, that pattern should apply everywhere.

My team just finished rewriting a frontend JS app, which went all-in on events for handling almost everything, as a more declarative React app with more explicit dependencies. It still has some event bindings with a central store of state to re-render components, but they are isolated to a single, manageable mechanism.

Event-Driven architecture sounds like somewhat of a panacea with no worries about hard-coded or circular dependencies, more de-coupling with no hard contracts, etc. but in practice it ends up being your worst debugging nightmare. I would much rather have a hard dependency fail fast and loudly than trigger an event that goes off into the ether silently.

With events, most stack traces, call stacks, and flame graphs become useless dead ends. Tracing the flow of the program becomes a game of grepping the entire codebase for copy-and-pasted event names, trying to figure out everywhere the event is used, called, forwarded on to another event, etc. You can attach loggers, but if you go all-in on events, you will eventually end up with so many event calls and handlers that it becomes hard to tell signal from noise.

IMHO, there are way too many downsides to use events at the application architecture level. They are certainly useful for many other things in real applications (as the author states and warns), but not the basic nuts and bolts of your architecture.

TL;DR: The author is quite correct about his 'spaghetti code' warning that can happen with events everywhere.

The problem with this is that you're attempting to do an event-driven app that has a network layer in between and a language that has very little safety surrounding it.

Event driven architectures are appropriate for services. Just not a JS based website.

The biggest problem I've run into with event-driven (microservices) architectures is that there is no built-in opportunity for any kind of application-level feedback, or even minimal synchronous data validation. Your queue may be up and accepting messages, but if you've sent bad data, or if nobody's listening, there's no way to find that out without essentially reinventing HTTP or creating your own lower-overhead error channels -- an approach which leaves a lot more room for mistakes than something well-understood like HTTP.

Why not do synchronous validation at the boundary with event-driven async architecture within the boundary?

That works OK up to a point, but if validity depends on any sort of state that might be changing concurrently then potentially it opens you up to problems with race conditions.

Just like locking shared state for thread synchronisation or operating on a shared filesystem, there are usually only two completely safe strategies. Assuming your underlying operations are guaranteed to be atomic but might fail, either you can do everything synchronously and report any failure immediately, or you can go all-in on asynchronicity, so even reporting success or failure is done by firing a message back asynchronously later.

If you’re going with anything more exotic, you’ll need a strategy for dealing with the intermediate states, for example something that looked like it was going to work but then didn’t. That might mean something like doing just enough synchronously to guarantee you’ve got a record of what needs to happen as a result of the event, then returning a success indication, and then asynchronously following up on the other required actions later, with some sort of auditing and reconciliation processes to make sure that everything that should happen eventually does. In this sort of situation, maybe you would also be able to do enough synchronous validation up-front to know that your request will be able to succeed, even if the goalposts move during the process.
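To make that "record first, follow up later, reconcile" strategy a bit more concrete, here is a rough sketch. Everything in it is an assumption for illustration: the `Outbox` class is an in-memory stand-in for a durable store, and the worker runs as a plain loop rather than a real background process.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PendingAction:
    action_id: int
    name: str
    done: bool = False


class Outbox:
    """In-memory stand-in for a durable record of what still has to happen."""

    def __init__(self) -> None:
        self._actions: Dict[int, PendingAction] = {}
        self._next_id = 1

    def record(self, name: str) -> int:
        action = PendingAction(self._next_id, name)
        self._actions[action.action_id] = action
        self._next_id += 1
        return action.action_id

    def mark_done(self, action_id: int) -> None:
        self._actions[action_id].done = True

    def unfinished(self) -> List[PendingAction]:
        return [a for a in self._actions.values() if not a.done]


def accept_request(outbox: Outbox, names: List[str]) -> List[int]:
    """Synchronous part: record the required actions, then report success."""
    return [outbox.record(name) for name in names]


def run_worker(outbox: Outbox, handlers: Dict[str, Callable[[], None]]) -> None:
    """Follow-up part: try each unfinished action; failures stay unfinished."""
    for action in outbox.unfinished():
        try:
            handlers[action.name]()
        except Exception:
            continue  # left in the outbox for the next reconciliation pass
        outbox.mark_done(action.action_id)


attempts = {"send_email": 0}


def charge() -> None:
    pass  # placeholder that always succeeds


def send_email() -> None:
    attempts["send_email"] += 1
    if attempts["send_email"] == 1:
        raise RuntimeError("mail server down")  # simulated transient failure


outbox = Outbox()
accept_request(outbox, ["charge", "send_email"])
handlers = {"charge": charge, "send_email": send_email}
run_worker(outbox, handlers)
after_first_pass = [a.name for a in outbox.unfinished()]
run_worker(outbox, handlers)
after_second_pass = [a.name for a in outbox.unfinished()]
```

The second `run_worker` call plays the role of the auditing and reconciliation process: anything that looked like it was going to work but didn't is still on record and gets another attempt.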

Some protocols like WAMP do give you feedback, propagating errors automatically.

Replay with side/external-effects is problematic, but it seems like you just need futures to handle this.

Each event that triggers external effects is bound to a future identifier. The replay actor then accumulates a map of these events. When encountering an event that represents resolving a future, that future is removed from the map. At the end of the replay, you're left with the list of futures that haven't been resolved.

At this point, you can trigger the side-effecting computation to try and resolve the futures.
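A sketch of that replay bookkeeping, assuming a hypothetical event shape with two kinds, `effect_requested` and `effect_resolved`, tied together by a `future_id`:

```python
from typing import Dict, Iterable, NamedTuple


class Event(NamedTuple):
    kind: str              # "effect_requested" or "effect_resolved" (made-up names)
    future_id: str
    description: str = ""


def replay(events: Iterable[Event]) -> Dict[str, Event]:
    """Replay without performing side effects; return still-unresolved futures."""
    pending: Dict[str, Event] = {}
    for event in events:
        if event.kind == "effect_requested":
            pending[event.future_id] = event
        elif event.kind == "effect_resolved":
            pending.pop(event.future_id, None)
    return pending


history = [
    Event("effect_requested", "f1", "send invoice email"),
    Event("effect_requested", "f2", "charge card"),
    Event("effect_resolved", "f1"),
]
unresolved = replay(history)
```

Only the entries left in `unresolved` would then be handed to whatever actually performs the side-effecting work.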

> You can buy a 1000MHz machine with 2 gigabytes of RAM and an 1000Mbit/sec Ethernet card for $1200 or so.

Even better, on applying this horsepower to serve web pages to 10,000 clients:

> It shouldn't take any more horsepower than that to take four kilobytes from the disk and send them to the network once a second for each of twenty thousand clients.

Four kilobytes. Apparently the average page size is now 2.3MB [1]. It's not hard to find popular sites that load much more than that.

[1] https://www.wired.com/2016/04/average-webpage-now-size-origi...
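For a sense of scale, here is the back-of-the-envelope arithmetic behind those two quotes. The unit conventions (4 KiB, decimal megabytes) and the one-page-per-client-per-second load are my reading of the numbers, not something either source spells out.

```python
# Figures quoted in the comments above; unit conventions are assumptions.
PAGE_THEN_BYTES = 4 * 1024          # "four kilobytes", read as 4 KiB
PAGE_NOW_BYTES = 2_300_000          # "2.3MB", read as decimal megabytes
CLIENTS = 20_000                    # "twenty thousand clients", one page/sec each
LINK_BITS_PER_SEC = 1_000_000_000   # "1000Mbit/sec Ethernet card"


def required_bits_per_sec(page_bytes: int, clients: int) -> int:
    """Bandwidth needed to send one page per second to every client."""
    return page_bytes * clients * 8


then_bps = required_bits_per_sec(PAGE_THEN_BYTES, CLIENTS)
now_bps = required_bits_per_sec(PAGE_NOW_BYTES, CLIENTS)

then_fits = then_bps <= LINK_BITS_PER_SEC
# Ceiling division: how many such links today's average page would need.
links_needed_now = -(-now_bps // LINK_BITS_PER_SEC)
```

Under those assumptions the 4 KB case needs about 655 Mbit/s and fits on the single gigabit card, while the same load with 2.3 MB pages would need roughly 368 Gbit/s.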

Which is pathetic and absolutely inexcusable.

The internet and computers keep getting faster, and instead of the web getting MORE responsive, thanks to jackoff web developers, it's getting SLOWER and SLOWER.

I'll never forget the day, as a teenager, that cable internet was finally offered in our neighborhood. Coming from 56K, it was pure heaven. We were running a (P2-era) Celeron 333 MHz and pages loaded INSTANTLY.

Now, I sit there with a connection 6-10 times faster than my teenage connection, a computer with 8 cores running at almost 4 GHz, and I have to wait for a page to load.

I just checked. ~5 seconds to load the front page of CNN. 5 seconds for Facebook. 7 seconds for Gmail. (Not like I need to check my e-mail often, eh!) All those numbers are 3x or higher on my laptop.

I actually still use the HTML version of Gmail even on my desktop. Why? Because it loads in less than a second. (I just got a time-to-load-page extension, yep. 0.52 seconds versus 6.5 seconds.)

I've taken to browsing with JS disabled by default (whitelisted for specific sites) and the speed/enjoyment trade-off is totally worth the occasional broken or oddly-laid-out page.

This is especially true in a HN/Reddit context where most links will be to articles.

OK, hands up: who actually enjoyed MVVM architecture?

I do, but is that related to this article somehow?

React is MVVM, so...
