It's an Elixir server (Phoenix) that allows you to listen to changes in your database via websockets.
Basically the Phoenix server listens to PostgreSQL's replication functionality, converts the byte stream into JSON, and then broadcasts over websockets. The beauty of listening to the replication functionality is that you can make changes to your database from anywhere - your API, directly in the DB, via a console, etc. - and you will still receive the changes via websockets.
I'd have to dig into their implementation to verify, but the main benefit over OP is scalability - I believe their implementation is tightly coupled to Postgres, but with this you could scale the Elixir servers without any additional load on your DB.
It's still in very early stages, although I am using it in production at my company and will work on it full time starting in January.
The recommended resolution online was just to send a NOTIFY with a primary key or identifier, then to query the database for the resource that changed. I didn't like this approach because it put more load on the database.
One other advantage of using the WAL is that you don't have to set up triggers every time you create a table. You just set wal_level to "logical" on the database, and then all your tables will be included in the replication stream.
Very similar to what you are doing, but instead of websockets you can decide where to send your data. Core Debezium sends it to Kafka.
And if I'm 100% honest, I just wanted to make something I thought was cool.
Disclaimer: working on Debezium
Mnesia (one of the databases built into Erlang) has a similar ability to subscribe to its tables that I've used for realtime updates in some demo projects.
In Elixir-land, there is https://github.com/commanded/eventstore (which AFAICT is closely tied to the Commanded CQRS/RS framework https://github.com/commanded/commanded)
It minimized complexity (moving parts, boundaries, etc...), everything was transactional, and a single dump of the database gave you a copy of the entire persistent state of your system at that moment (across all services).
We had several services running, but any notion of persistent state was stored in a single database. We didn't have particularly high demands, but processed ~140M jobs per month in addition to our normal query load.
Postgres handled this like a champ, had far lower p50 and p95 latency than SNS/SQS, and rarely went above single-digit percentage CPU usage on a fairly cheap DigitalOcean box.
I've worked at a lot of big tech companies (Google, Microsoft, Twitter, Salesforce) in many varieties of giant distributed systems and the most valuable lesson I've learned over and over again is: Distributed systems are hard, avoid them until you can't.
That's the best advice I can give to any startup. Until you're regularly sustaining 1,000's of TPS or have grown to dozens of TB of data, it is generally a distraction to even think about anything other than a single relational database.
The vast majority of companies and their scale will never go above a single midsize database server. All of the transactional, backup, and querying functionalities you list make it much more productive than using fancy AWS/cloud services.
We wrote a very simple task queue (<500 lines of python) using a postgres table and inbuilt pub/sub. Ran millions of jobs, scheduled and unscheduled, without a hitch.
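To make that concrete, here's a minimal sketch of the claim-and-complete pattern such a queue rests on (SQLite here so the example is self-contained; a Postgres version would claim rows with `SELECT ... FOR UPDATE SKIP LOCKED` and use LISTEN/NOTIFY to wake workers instead of sleeping):

```python
import sqlite3

# Minimal job-queue sketch. A production version on Postgres would claim
# rows with SELECT ... FOR UPDATE SKIP LOCKED and wake workers via
# LISTEN/NOTIFY; here SQLite's transactions stand in for that.
conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("""
    CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        payload TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending'  -- pending | running | done
    )
""")

def enqueue(payload):
    conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))

def claim():
    """Atomically claim the oldest pending job, or return None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock for the claim
    row = conn.execute(
        "SELECT id, payload FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        conn.execute("COMMIT")
        return None
    conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
    conn.execute("COMMIT")
    return row

def complete(job_id):
    conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))

enqueue("send-email")
enqueue("resize-image")
job = claim()       # claims (1, 'send-email')
complete(job[0])
```

The key property is that the claim is one transaction, so two workers can never grab the same row.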
I've been building systems on some pretty similar ideas for the last 5+ years, and have a similarly optimistic outlook on it. Just to add another datapoint here, we've been able to scale them up to 20-40x higher than your figure (a couple thousand jobs/sec) on a single (albeit beefy) MySQL instance. It definitely takes a fair bit of care & we certainly have some time invested in it, but it can be done.
I guess my services haven't had (gotten to? :-) ) to share databases with other services for quite some time -- they did when I started my job! -- but even with that concession, the benefits of DB guarantees have been amazing. There's nothing like a transaction for preserving your sanity.
The best tactic I've learned from these systems is to think really hard about the batching & partitioning behavior to keep msgs/sec high enough & transactions/sec low enough to stay afloat.
Thank you. May I borrow this?
Scalability for a tech product should be a concern at some point, but if you don’t have product market fit it doesn’t matter. The complexity of building a distributed system right out of the gate will cripple rapid iteration or lead to a broken product.
I've heard of him before but never really read anything much by him. Is this guy for real? He worked at one startup, got angry about a database, became a consultant, and screams about everyone being wrong about everything?
* I don't see any problem with stored procedures. They can make good sense for, say, auditing, as well as for performance.
* People describe their software systems as "Using Oracle" because it matters, not because their design is stupid. It tells me that Oracle skills are relevant, for instance.
* NoSQL is generally, in my experience, awful and chaotic, rather than liberating. Lots of good work has gone into making serious grown-up relational databases. Schemas, normal forms, constraints, rigorous work on the ACID properties. Something like MongoDB is just sloppy amateur-hour by comparison.
* In the YouTube video, he says that solid-state drives render relational databases obsolete. This strikes me as absurd. Not even ultra-fast SSD storage technologies will do that. The relational model is effective, and the associated DBMSs still well justified. Joins belong in a query language, not in imperative code. Why would you want to try managing a huge complex dataset manually?
If you have a proprietary database, stored procedures are awful. AHEM ORACLE.
Something like VoltDB, but there are other solutions. There used to be some blogs about their architecture and how they could avoid a couple of steps that a normal RDBMS needs and that take a lot of time.
I sincerely doubt that.
For it to be useful in real world situations, you'd have to build your own highly flexible, scalable, rock-solid, high-performance data-management solution, ideally one which enables the user to use a declarative means of expressing queries.
This is, of course, a DBMS.
If you build one which imposes structure on the data, you've got either a relational DBMS, or an object DBMS, or some other kind of well-studied database solution... except you've rolled your own from scratch, without input from database experts, so it's going to be a disaster.
Unless you're someone like Microsoft, Google, or Amazon, you pretty much can't build one of those. It costs tens of millions. It takes a huge amount of testing, for obvious reasons.
I really don't see any argument for not using a mature database system for managing a typical company's data.
Of course, if we aren't talking about a full-scale database that has to cope with a huge amount of messy mission-critical data in a changing business environment, then the game changes, and sure, you might have a chance just writing something yourself.
Netflix's video-streaming CDN ('OpenConnect'), for instance, obviously isn't powered by an RDBMS. They put in a huge amount of highly technical work building their own finely tuned technologies to pull data off the disk and get it to the NIC with minimal glue in between. But that's Netflix, and virtually no real companies face that kind of challenge.
Also, VoltDB looks like a streaming DB technology. Isn't that both a DBMS (rather than a means of rolling your own), and very niche? I don't see how it's relevant here. If it's really able to serve business needs better, then great, but it's still a complex-DBMS-as-a-product.
To me, he sounds more like a snake oil salesman.
He rails against a strawman where people put SQL everywhere, into views, into application logic ("mail merge" is one he mentions), and DBAs gatekeep everything.
The common sense part: no, your views shouldn't be composing SQL with string interpolation.
The moronic parts:
- He venerates application logic and regards the data model as a detail, but data models aren't details — they tend to outlive application logic.
- He holds up in-memory data structures as a platonic ideal, but doesn't address the things databases provide for you: schemas, constraints, transactional semantics and error recovery, a clear concurrency story.
A bit outdated now, but some details here: https://paul.copplest.one/blog/nimbus-tech-2019-04.html#tech...
It only starts to leak if everyone accesses the DB through raw SQL, mishmashing modules and creating insane dependencies.
Give each module ownership of a table, require that all access goes through that module, and the backend storage system becomes irrelevant.
I believe you're being deeply naive about modules owning single tables. You forego relational integrity and almost the whole power of relational algebra if you take that approach. That's heaps of functionality to leave behind when you and one or two other guys need to crank out a feature a day.
I can see how you misunderstood me there, however; it was admittedly poorly worded.
What I was talking about was basic data hygiene, as it's often called.
If you need specific data, you define a clear way to get this data and only access it through that API. This can be a class, method or anything your language of choice prefers.
If you skip that step and directly access the DB everywhere in your code, you'll create another unmaintainable dumpster fire as soon as your team grows beyond the initial programmers.
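As a sketch of what that looks like in practice (hypothetical module and table, any storage or ORM could sit behind it): one module owns the `users` table, and everything else calls its functions rather than touching the table directly:

```python
import sqlite3

# The one module allowed to touch the users table. Everything else calls
# these functions; if the storage changes, only this file changes.
_conn = sqlite3.connect(":memory:")
_conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")

def create_user(email):
    """Insert a user and return its id."""
    cur = _conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    _conn.commit()
    return cur.lastrowid

def get_user(user_id):
    """Return the user as a plain dict, or None if absent."""
    row = _conn.execute(
        "SELECT id, email FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return None if row is None else {"id": row[0], "email": row[1]}

uid = create_user("a@example.com")
user = get_user(uid)
```

Callers never see SQL or connection handles, which is the whole point: the access path is the API, not the table.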
A startup needs all the options on the table because the alternative is they go out of business and there is no startup. This idea that we can enforce “good architecture” (subject to interpretation) with technology by isolating junior engineers with networking rules needs to at least be more transparent about its motivations.
For pieces to be able to be carved off, they already have to be autonomous.
What usually happens is that devs without experience in service architectures presume that service architectures are probably just the things they are used to seeing, ie: monolithic entities.
It's usually then that we hear things like "extract the product service". The problem is that product is an entity, and entities are the last thing to build services around. That's how we end up with distributed monoliths, and subsequently failed service architecture projects.
In order for an app to be able to transition to services, the app has to be designed this way from the beginning.
And yes, it's definitely possible to know what the model partitions should be up front. They're very natural divisions. But they can't be arrived at by looking through an entity-centric lens. And unfortunately, forms-over-data apps very rarely provide us with an opportunity to learn about the "other" way to do it.
Edit: Fixed a typo
If you're a startup, just use SNS / Kinesis / Google Pub/Sub, etc.
SNS, Kinesis (Kafka), Google Pub/Sub are awesome. Not the same problem/solution fit as an event store, but awesome for the scenarios and architectures they're targeted at.
I understand that Kafka can scale horizontally and can handle crazy throughput, but I mean from a programming point of view, the idea of a unified log as a data model applies to both, correct?
Event-based just means that communication happens over events. Event-sourced means that the authoritative state of the system is sourced from events. If the events are literally the state, then how those are retrieved begins to matter.
Kafka breaks down as a message store in 2 key ways that I mentioned elsewhere in all these threads.
> The first is that one generally has a separate stream for each entity in an event-sourced system. Streams are sort of like topics in Kafka, but it would be quite challenging to, say, make a topic per user in Kafka. The second is Kafka's lack of optimistic concurrency support (see https://issues.apache.org/jira/browse/KAFKA-2260). The decision to not support expected offsets makes perfect sense for what Kafka is, but it does make it unsuitable for event sourcing.
If my only tool were Kafka, then I wouldn't be able to use the messages in the same way that I can with something like Message DB. And that's okay, different tools for different jobs.
- Dealing with security upgrades for PG and Linux
- Dealing with backups
- Dealing with HA, so setting up replicas, then what do you do when something goes wrong? Do you manually SSH in and do some magic?
- Network security / usernames / passwords
- etc.
Clearly things a startup doesn't want to do and usually lacks expertise in.
The first is that one generally has a separate stream for each entity in an event-sourced system. Streams are sort of like topics in Kafka, but it would be quite challenging to, say, make a topic per user in Kafka. The second is Kafka's lack of optimistic concurrency support (see https://issues.apache.org/jira/browse/KAFKA-2260). The decision to not support expected offsets makes perfect sense for what Kafka is, but it does make it unsuitable for event sourcing.
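To make the expected-version point concrete, here's a toy in-memory sketch (illustrative names, not Message DB's or Kafka's actual API) of the optimistic concurrency check an event store performs on append:

```python
class ConcurrencyError(Exception):
    pass

class Stream:
    """Toy event stream with optimistic concurrency on append."""
    def __init__(self):
        self.events = []

    @property
    def version(self):
        # Version is the index of the last event; -1 when empty.
        return len(self.events) - 1

    def append(self, event, expected_version):
        # Two writers that both read version N race here; the loser's
        # expected_version is stale and its append is refused.
        if expected_version != self.version:
            raise ConcurrencyError(
                f"expected version {expected_version}, stream is at {self.version}"
            )
        self.events.append(event)
        return self.version

stream = Stream()
stream.append({"type": "Opened"}, expected_version=-1)
stream.append({"type": "Deposited", "amount": 100}, expected_version=0)

# A concurrent writer that read the stream at version 0 now loses the race:
conflict = False
try:
    stream.append({"type": "Withdrawn", "amount": 50}, expected_version=0)
except ConcurrencyError:
    conflict = True
```

That refusal on a stale expected version is exactly the check KAFKA-2260 declined to add, which is why Kafka topics can't serve as event-sourced entity streams.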
Being built on top of Postgres, Message DB gives you access to event sourcing semantics using a familiar database technology.
We use Message DB in our production systems, and I'd be happy to talk more about it if you have other questions. We've found it very reliable.
As a disclaimer, I'm listed as a contributor at the Eventide Project, the project Message DB was extracted from, though I did not write any of the code behind Message DB.
Some clarity by example: In financial systems, it’s extremely important to keep track of all the transactions between your microservices (if your architecture is based on that). You could potentially lose a message delivered to you via a broker if your service fails to write it to a persistent storage. If the producer of that message never stored that message or it was produced on wire and transmitted, there is no way to recover it anymore. A system designed around a Message Store can mitigate such problems. You can build that similar architecture with brokers as well but for each of your application, you will have to implement something analogous to a Message Store to handle idempotency and things like that.
Polling sounds very crude, but for the systems Message DB is designed for, it's a virtue. No back pressure problems, for example.
Atomic changes (rollback can undo any unprocessed messages)
Fewer moving parts, if you already need a DB
Easily query the state of queues and messages with familiar SQL
"Databases suck for Messaging"
I wonder if the designers of this system disagree, or if they did anything in particular to mitigate these issues?
They didn't use it for everything; there was also a blindingly efficient homegrown messaging system for the stuff that needed to be really fast. But that was for fairly well-defined use cases. (Latency and concurrency were the key limiting factors.) By default, everything went into the RDBMS-based system. It was inherently more robust, it replaced a whole quagmire of design questions you have to make when using other messaging systems with ACID guarantees, and it made it a lot quicker & easier to do post-mortem analysis of any surprising things that happened in production.
The response given by @ethangarofolo does a good job of addressing the main points.
I would say that the slide deck linked is a bit sneaky. Sooner or later, in anything built with a computer, there's going to be polling. Higher-level libraries abstract the polling so that it appears as push (rather than pull) to the application developer. But under the hood, there's polling.
It should also be pointed out that any message queue with persistent messaging is built on a database. Even RabbitMQ's persistent messaging is built on a database (Mnesia).
In the end, a message store or event store can be used to model message queues, but the opposite is rarely true. One of the creators of the Event Store database once said to me something like, "What's a queue but a degenerate form of a stream".
A message store provides for patterns that are above and beyond message queues, like event sourcing, for example. Like Ethan mentioned, it's an application of "dumb pipes". It's in the same vein as Event Store or Kafka, rather than RabbitMQ, ActiveMQ, etc.
The critical difference is durability of messages. If you have an application that doesn't require durability of messages, then a plain old message queue or message broker technology may be a better choice for the situation.
In the end, polling is totally fine and totally manageable. What matters is that it's not done naively; that batching is intelligent and polling is only optionally triggered in the right circumstances and tuned based on batch processing cycle time and message arrival time.
Polling doesn't mean that the database will be "hammered" unless it's implemented that way.
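A sketch of what non-naive polling can look like: keep draining while batches come back non-empty, and sleep only when the queue runs dry (the in-memory queue, names, and interval here are purely illustrative):

```python
import time

def run_consumer(fetch_batch, handle, batch_size=100, idle_delay=0.2, max_cycles=None):
    """Poll loop that only sleeps when a poll comes back empty.

    fetch_batch(n) returns up to n messages; handle(msg) processes one.
    While messages keep arriving we poll again immediately, so a busy
    queue is drained at full speed, while an idle queue costs only one
    cheap query per idle_delay.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        batch = fetch_batch(batch_size)
        for msg in batch:
            handle(msg)
        if not batch:
            time.sleep(idle_delay)  # queue is dry; back off before polling again
        cycles += 1

# Demo with an in-memory "queue": 250 messages, batch size 100.
backlog = list(range(250))
handled = []

def fetch(n):
    batch, backlog[:n] = backlog[:n], []
    return batch

run_consumer(fetch, handled.append, batch_size=100, idle_delay=0.01, max_cycles=5)
```

Tuning amounts to picking batch_size and idle_delay against your batch processing cycle time and message arrival rate, which is the "not done naively" part.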
Is that true? I am reminded of this essay that goes into a deep dive of what happens under the hood with HTTP and the underlying network I/O.
And at the lowest level, the network card interrupts the CPU because it has finished reading or writing data.
> Some time after the write request started, the device finishes writing. It notifies the CPU via an interrupt.
Is that polling? It seems more like a push.
If there's polling, it happens in a matter of a few CPU cycles.
The magnitude of the time slice or polling interval is immaterial as to whether it is to be considered "polling."
Good point that any system that is "persistent" has a data store by definition. The real question is whether a SQL relational database is a good fit.
Postgres checks those boxes, as do others.
Technically, though, Postgres has been an "object-relational" database for some time.
In the end, Message DB uses a table as an append-only log, and leverages Postgres indexes, advisory locks, and JSON documents (and indexes) to implement some of the critical messaging features and patterns. It's not using the "relational" aspects of Postgres. There are no relational tables.
Here's the table schema: https://github.com/message-db/message-db/blob/master/databas...
Given the paucity of Postgres features that the message store leverages, it could have been implemented against the raw Postgres storage engine. Had we done that, though, few people could have understood it and been able to specialize it to their purposes using plain old SQL. And any performance improvements induced by skipping Postgres's tabular data abstractions would have been so negligible for the schema in question that it wouldn't have offered much return on the effort.
Also your links are (mostly) from message queue companies who have a vested interest in your not using Postgres, or are from a time before LISTEN/NOTIFY and JSON(B) columns.
Message DB is a message store, a database optimized for storing message data. The database doesn't track the status of anything---that responsibility falls to consumers of the data.
Message queues are fine pieces of technology when what you need is a message queue. Message DB, on the other hand, is used for event-sourced systems. "Event-sourced" in turn differs from merely "event-based." Since I've started building event-sourced systems, I haven't run across the need for message queues, but ymmv, and of course, like all humans, I too have my own hammer/nail biases.
I don't know about that, I have seen many microservices that process data that comes in from messages on message queues; frequently that is a better design than receiving the data over synchronous http PUT or POST.
By "message queues" I mean AWS SNS/SQS and Azure Event Hubs. Would a SQL DB be "a good fit in a microservice-based architecture" as a replacement there?
or are you suggesting that such a microservice's first action on receiving a message from a queue would be to store it in a local, private Message DB? That might make sense; I've seen enough tables that store a JSON blob already.
A message store like Message DB can at the same time serve as record of system state and communication channel. Writing a message to the store is the same as publishing it for consumers to pick up. Messages are written to streams, and interested parties subscribe to those streams by polling them for new messages.
The store itself isn't aware of which components are subscribing or managing their read positions.
So to your question, the first action of a component receiving a message is to do whatever it does to handle the message and then write new messages (specifically events) to record and signal what happened in response to the original message.
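A toy sketch of that division of responsibility (illustrative names, not Message DB's actual API): the store is a dumb append-only log that knows nothing about its readers, and each consumer carries its own read position:

```python
class MessageStore:
    """Dumb pipe: append-only streams, no knowledge of its consumers."""
    def __init__(self):
        self.streams = {}

    def write(self, stream, message):
        self.streams.setdefault(stream, []).append(message)

    def read(self, stream, position=0):
        # Hand back everything from `position` on; the store doesn't
        # remember who has read what.
        return self.streams.get(stream, [])[position:]

class Consumer:
    """Smart endpoint: tracks its own position through a stream."""
    def __init__(self, store, stream):
        self.store = store
        self.stream = stream
        self.position = 0  # a real consumer would persist this

    def poll(self, handle):
        for message in self.store.read(self.stream, self.position):
            handle(message)
            self.position += 1

store = MessageStore()
store.write("account-123", {"type": "Deposited", "amount": 100})
store.write("account-123", {"type": "Withdrawn", "amount": 40})

seen = []
consumer = Consumer(store, "account-123")
consumer.poll(seen.append)

# A second consumer added later replays the whole history independently,
# without touching the first consumer's behavior:
replayed = []
late_consumer = Consumer(store, "account-123")
late_consumer.poll(replayed.append)
```

Because positions live in the consumers, adding a new subscriber is just instantiating one; nothing in the store changes.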
The almost-but-not-quite unstated major premise of the whole argument is that you're putting a key piece of smarts into the queue: Tracking whether a message has been processed.
One could argue that that is the real antipattern. It'll hurt you big time if you're storing messages in a database table. But it'll hurt you even if you don't. For example, by tracking that kind of information inside of the queue, you're losing the ability to add a second listener without either affecting what messages are being seen by (and therefore the behavior of) the existing listener(s), or modifying the queue itself. Which you don't want to have to do any more than you have to on a critical piece of shared infrastructure like that.
It might be fine if it's an oldschool monolith that's just processing some sort of work queue on multiple threads. But, if you're doing microservices, you're probably trying to keep things more flexible than that, and want to allow them to evolve more independently.
Lots of good writing on the tradeoffs, for example: https://sookocheff.com/post/messaging/dissecting-sqs-fifo-qu...
At the risk of beating the point to death, the transition from SOA to Microservices is marked by the transition of smart pipes to dumb pipes, and dumb pipes have no knowledge of whether a message has been processed. In the microservices era, the basic presumption is that the transport has no knowledge of the state of the message processors' progress through a queue.
That doesn't seem to be characteristic of microservices at all. It's more about small bounded contexts and independent deployment of the service. All of which is completely possible over the "smart pipes" that the message queues I mentioned will give you.
Insisting that this wheel must be re-invented or worked around to make it a microservice seems very odd.
Many have their own "unique" definition, which is why "microservices" is bordering on no longer meaning anything. Some people say it and mean SOA—the characteristics you mention are pretty much characteristics of SOA when done right, though independent deployment isn't a necessity.
Martin Fowler attempted to codify a definition, and this is inline with the definition that sbellware is referring to. He (and others) refer specifically to smart endpoints and dumb pipes:
It's a very similar transition to the shift from EJB to Hibernate (or EJB to Rails) or from SOAP to REST.
In the meantime, a lot of folks who don't have that background and perspective picked up on Microservices and made a lot of presumptions based on a lot of experience with web apps and web APIs. It's this that made "Microservices" a largely meaningless, muddied term. This is ironic because "Microservices" was introduced as a means to disambiguate the many competing meanings of "Service-Oriented Architecture" that had come into existence as many vendors wanted to be seen as being involved with it without actually having been involved with it.
So, there's two meanings of "Microservices": One that comes from a background of service architectures and one that comes from a background of web development. My background is in both of them, but I hew to the meaning of "Microservices" which is closer to "SOA without big vendor smart pipes of the mid-to-late 2000s" than "The same old HTTP APIs we see the world through as web developers".
That's largely a lost battle now, just as SOA was lost to the competing interests of message tech vendors.
The term "Autonomous Services" is much more precise and unambiguous in its intent, and conveys more specifically what's intended by "SOA done right", a.k.a. "Microservices". And even more specifically, "Evented Autonomous Services" does an even better job of conveying the intent and the implications of the architecture than "Microservices" can now.
Or as Adrian Cockcroft originally put it: "Loosely coupled service-oriented architecture with bounded contexts".
Knowing what the implications of all those words are, it's quite impossible to look at what is commonly asserted about the meaning of "Microservices" in 2019 and see much left of the great value of what was originally intended. Much of it still remains unlearned.
yes, AWS SQS does that. e.g. https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQS...
Are you saying that this is bad? It doesn't seem so.
> you're losing the ability to add a second listener without either affecting what messages are being seen by (and therefore the behavior of) the existing listener(s). But, if you're doing microservices, you're probably trying to keep things more flexible than that, and want to allow them to evolve more independently.
I'm not following. Both AWS SNS/SQS and Azure Event hubs were able to handle that scenario just fine, by design.
In AWS you attach multiple queues to the same SNS endpoint and have a pool of subscribers on each queue, in Azure you declare a "consumer group" and a pool of subscribers in it. Or declare a new consumer group if need be.
We regularly scaled up AWS from 3 listeners to 30 depending on load, and back down again, or replaced instances one by one, and still it guaranteed at-least once delivery (in practice, exactly once in the vast majority of cases) across the subscribers. We _never_ "lost the ability to add another listener" either scaling up in the same pool, or creating a new pool.
It's not "bad" per se, but it's not what it seems on the surface.
ACKing a message doesn't mean that the message will not be received more than one time. As long as the smarts for recognizing recycled messages are embedded in the application logic, all will be fine.
Here's a good explanation written about SQS, but applies to all message technologies that work based on ACKs: https://sookocheff.com/post/messaging/dissecting-sqs-fifo-qu...
So, while it's possible to add more consumers, depending on the implementation of the technology's internals for tracking consumer state, you may lose messages or, more typically, receive messages more than once or receive them out of order.
This often happens without the developers' and operators' knowledge. Without foreknowledge of the causes and effects of these things, developers and operators may know that there's some kind of strange intermittent glitch, but don't presume that it's because message processors are processing messages that had already previously been processed (or are not processing messages that had been skipped).
What you can't do is add a new consumer that is interested in processing historical messages beyond the retention window of typical cloud-hosted message transports.
But that's ok because message queues are typically just transports, not message stores, and they serve the needs of moving messages from one place to another. A message store can do that as well, but has different semantics and serves other needs.
So, as long as ACK-based message processors work within the retention window, everything's good. It's when that's not the case that the problems arise. Lost messages and recycled messages are pretty much the only guarantee over the lifetime of a messaging app. As long as that's within tolerances, then it's fine.
Specifically, like all "dumb pipe" technology that came about in the post-SOA age, a message store doesn't use a protocol based on ACKs. Instead, a consumer tracks its own state and does not defer this responsibility to the transport.
And inevitably, if any consumer of any technology wants to guarantee that it never processes a recycled message, this tracking logic has to also be implemented in the application logic of even SQS, RabbitMQ, etc consumers.
There's no such thing as a messaging technology that can guarantee only-once delivery, as you alluded to. There's a good examination of this aspect of messaging and distributed systems here: https://bravenewgeek.com/you-cannot-have-exactly-once-delive...
The best guarantee from message transports that we can hope for amounts to a "maybe". And that's ok. The conditions and countermeasures are well-known. As long as we're not blind-sided and haven't taken "only once" literally, things will be fine.
If we don't realize that application logic is always responsible for making the ultimate decision as to whether to reject a recycled message, we're going to have problems that can be difficult-to-impossible to detect and correct.
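In application terms, "making the ultimate decision" usually means an idempotence check in the handler itself; here's a minimal sketch (in a real system the processed-ID set would be a table, updated in the same transaction as the state change):

```python
processed_ids = set()  # in real life: a table updated in the same transaction
balance = 0

def handle_deposit(message):
    """Handler that rejects recycled (already-processed) messages itself.

    At-least-once delivery means the same message can arrive twice; the
    handler, not the transport, is responsible for making the second
    delivery a no-op.
    """
    global balance
    if message["id"] in processed_ids:
        return False  # recycled delivery: acknowledge but do nothing
    balance += message["amount"]
    processed_ids.add(message["id"])
    return True

msg = {"id": "dep-1", "amount": 100}
first = handle_deposit(msg)
second = handle_deposit(msg)  # the transport hands it to us again
```

The same check is needed whether the transport is SQS, RabbitMQ, or a message store; only the handler knows what "already processed" means for its own state.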
Message stores and event stores aren't alternatives to message queues. They're technologies that support architectures that are themselves alternatives to each other.
Deals with FIFO and ordering. You also mention "historical messages".
Sure, if those are things that you need, then you almost certainly want a different tech to SQS, likely Kafka. I won't disagree with that, except with making it the focus of the criticism of SQS and similar message queues. It isn't that thing.
Very important bit to understand, that many smart people (including perhaps some in this HN discussion) don't know yet: the semantics of an event store are meaningfully different from the semantics of a pub/sub system. If you pick up a pub/sub system and start trying to build an event store based system architecture with a set of CQRS-ES-style followers, you will find you are reinventing a bunch of things quite painfully.
Here's what we did a few years ago. In some ways it is similar to Eventide.
We picked up JGroups, a truly spectacular piece of open source technology, and used it to automatically wake up event stream listeners in a distributed way. (At the time there was some limitation of the NOTIFY mechanism in PG that sent us in this direction.) With very little code around automated wake-up, we achieved low latency (and therefore fast automated testing) for an arbitrary group of event stream followers.
Is it masterless, or does it handwave a lot of distribution problems like most java distributed projects by just using Zookeeper?
There's a script in the Message DB codebase that will do some fairly rudimentary read and write performance measures for your Postgres deployment:
It's run right on the database, so no network I/O or client implementation in the loop - which is inevitably non-representative of "real world" scenarios.
Building on top of generic Postgres using a simple service is pretty neat but without the extra client libraries it’s hard to see a bright future.
The npm module doesn't look like it actually does anything besides unzip and run the DB setup code at the moment, so it's definitely early days for them. But with a few basic client libraries that can actually do the work of publishing and subscribing to messages, this will be a pretty neat tool.
I’m going to have to watch how this develops.