Hacker News new | past | comments | ask | show | jobs | submit login

TCP does deliver data more than once though. Sure within a TCP session you are guaranteed not to get the same byte twice out of your socket, bytes are delivered exactly once, in order. Now if your application that uses TCP pulls data out of the socket and then dies the data will need to be delivered again and the TCP protocol is unable to help us there, it's application level logic at that point.

So everyone is talking about the same thing here, data can get delivered twice, and the application must handle it in an idempotent way. For TCP this happens to be looking at some counter and throwing away anything that's already been seen. With a single client/server connection maintaining sequence numbers like TCP does solve the problem but it's harder to use this same technique in multi-client, multi-server, "stateless" requests that are common in the web world.

EDIT: To clarify, what I mean by TCP "does deliver data more than once", is that one side of the connection can send the same data twice. It's true that it's then discarded by the other end but this is what people are talking about when they talk about the theoretical impossibility of never sending anything twice. The rest is basically the idem-potency thing of ensuring that data received twice somehow doesn't cause anything abnormal.




Not when your 'socket' is a persisted, durable, transacted medium (ie. a database). Sure, applications can 'pull data out of the socket and then die', but this is a common scenarios on databases which is handled with transaction and post-crash recovery. The application comes back after the crash and find the same state as before the crash (the 'socket' still has the data ready to pull off), it pull again, process, and then commit. This is not duplicate delivery, since we're talking about an aborted and rolled back attempt, followed later by a successful processing. Again, databases and database apps have been dealing with this kind of problems for decades and know how handle them.

I've been living in this problem space for many years now and seen the wheel reinvented many times. Whenever the plumbing does not guarantees EOIO but the business demands it, it gets pushed into the app layer where TCP (retries and acks) is reimplemented, to various success levels.


> The application comes back after the crash and find the same state as before the crash (the 'socket' still has the data ready to pull off), it pull again, process, and then commit.

Isn't this the same app layer stuff that has to get reimplemented? I can see how this is often pushed back to a human (oops, my money transfer didn't go through, I'll have to redo it) but it's still something that has to be dealt with somewhere.


> it's still something that has to be dealt with somewhere

Database programmers have the means to deal with it off-the-shelf: BEGIN TRANSACTION ... COMMIT. When your queues are in the database, this becomes trivial. Even without the system I'm talking about (Service Broker) that has the queues stored in the database, most regular messaging systems do support enrolling into a distributed transaction and achieve an atomic dequeue/process sequence, is just that many apps/deployments don't bother to do it because the ops overhead (XA coordinator), reduced throughput and/or simply not understanding the consequences.

Point is that durable, persisted, transacted 'sockets' are behaving very differently from a TCP socket. Is a whole lot harder to simply lose a message in the app layer when interacting with a database.


But the application still has to deal with processing the same message twice if it dies before ACK.


He did say XA.

The transaction boundary defined in your consumer covers the interactions of that consumer with other XA aware nodes who all participate in a "distributed transaction". So you can process this message N times without committing, and thus possibly N times telling other systems to do M' side-effect of that message, but until you commit the world has not changed.

https://en.wikipedia.org/wiki/X/Open_XA


The ACK is also a transaction. Something like this:

1. Receive from server, commit state = NEW, UNACK 2. Send ACK to server, get confirmation from server, commit state = NEW, ACK 3. Start processing, commit state = PROC 4. Got results, commit state = FIN, UNACK 5. Send FIN to server, commit state = FIN, ACK

Each commit is a database transaction where you write the results of that step along with the new state. If anything fails along the way the work-in-progress is discarded along with the state change. The server has an equivalent so if it gets a duplicate ACK for the same (or earlier) state it can ignore it.

In this example, if the client crashes between 1-2, in #2 never gets confirmation, or crashes trying to commit the "NEW, ACK" state then it will retry. The server has already committed the fact that it sent the value to the client and is awaiting an ACK. If it saw the ACK and gets a duplicate it ignores it. If it never saw the first ACK then it will see the second(+) attempt and commit that it saw the ACK before sending confirmation to the client.


Right. The idea is that by having your database, message queue, and application all share the same transactional context, "reprocessing" the message twice doesn't matter, because the effects are only ever committed exactly once.

It's true that this doesn't work if your processing touches external systems or otherwise escapes the transaction context, but in those cases you do still get at-least-once delivery (or at-most-once, if you choose to commit the receipt before processing the message).

It really is a powerful technology and when leveraged can absolutely reduce the level of effort and cognitive burden to building correct asynchronous systems.


> Whenever the plumbing does not guarantees EOIO but the business demands it, it gets pushed into the app layer where TCP (retries and acks) is reimplemented, to various success levels.

Interesting! You're saying the exact opposite of what Tyler Treat is saying:

> Even with smart middleware, problems still leak out and you have to handle them at the edge—you’re now being taxed twice. This is essentially the end-to-end argument. Push responsibility to the edges, smart endpoints, dumb pipes, etc. It’s the idea that if you need business-level guarantees, build them into the business layer because the infrastructure doesn’t care about them.

(source: http://bravenewgeek.com/smart-endpoints-dumb-pipes/)

What do you think of his argument?


Right. If your definition of "delivered" is "showed a message to the user", then exactly once is a fairly trivial goal to attain. If it's defined "data over the wire", then it's nearly impossible to avoid some form of ACK/RETRY mechanism on real networks, which means that messages will need to sometimes have to be 'delivered' many times to ensure a single display to the user.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: