
The thing that is at the end of the lossy medium. It must tolerate (0 or 1) or (1 or more) things being delivered to it.





Yes, that is true. But why can't I choose to view "the system itself" as the thing that is on the other side of a de-duplicator?

It feels to me like an argument over whether or not humans can fly. An unassisted human cannot fly, but with some technological augmentation, they can. It seems a bit pedantic to deny that someone can fly from LA to New York simply because they have to get into an airplane to do it.


> why can't I choose to view "the system itself" as the thing that is on the other side of a de-duplicator?

because the "de-duplicator" would either:

* be somewhere else on an unreliable network (in which case we have the same problem)

* be on the same machine (or in the same process) as "the system itself" (in which case, from a distributed systems perspective, it is the same thing)

> It seems a bit pedantic...

It is pedantic. The only reason that these "delivery" rules are popular is because of how many times programmers have gotten it wrong. Mostly by making assumptions that either:

* the network is reliable

* the message queue (or whatever) will de-duplicate messages for me
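To make the first assumption concrete, here is a minimal sketch of why a sender that retries until it sees an acknowledgement ends up delivering duplicates. Everything in it is hypothetical: `send_until_acked`, the `deliver` callback, and the scripted list of lost acks stand in for a real network and are not taken from any particular library.

```python
def send_until_acked(message, deliver, ack_lost_script):
    """Resend `message` until an ack makes it back to the sender.

    `ack_lost_script` is a scripted list of booleans, one per attempt,
    saying whether the ack for that attempt was lost (an assumption made
    so the example is deterministic). Returns the number of attempts.
    """
    attempts = 0
    for ack_lost in ack_lost_script:
        attempts += 1
        deliver(message)      # in this sketch the message itself always arrives
        if not ack_lost:
            return attempts   # the sender saw the ack and stops retrying
    raise RuntimeError("gave up: no ack received")


received = []
# The acks for the first two attempts are lost; the third one gets through.
attempts = send_until_acked("debit-5", received.append, [True, True, False])
print(attempts, received)
```

The sender cannot tell a lost message from a lost ack, so it has to resend in both cases, and the receiver sees the same message three times.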


Having a clear system boundary is required for analysis.

Knowing that messages will be delivered 1+ times gives us a variety of ways we could choose to deal with this on the endpoint, with different vulnerabilities. (Getting "exactly once" processing usually requires making various kinds of resilience tradeoffs based on timing windows, storage requirements, etc).
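As one illustration of such a tradeoff, here is a rough sketch of an endpoint that remembers only the most recent message IDs. The class name `WindowedDeduper`, the `capacity` parameter, and the idea that messages carry an ID at all are assumptions made for the example.

```python
from collections import OrderedDict


class WindowedDeduper:
    """Remember only the `capacity` most recent message IDs.

    Bounded storage is the upside; the downside is that a duplicate
    arriving after its ID has been evicted is treated as new.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._seen = OrderedDict()

    def first_time(self, msg_id):
        if msg_id in self._seen:
            return False                      # duplicate inside the window
        self._seen[msg_id] = True
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)    # evict the oldest ID
        return True


dedup = WindowedDeduper(capacity=2)
results = [dedup.first_time(m) for m in ["a", "a", "b", "c", "a"]]
print(results)
```

The second "a" is caught, but the third one arrives after "a" has been evicted and is accepted again. That is the storage versus timing window vulnerability in miniature.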

> It seems a bit pedantic to deny that someone can fly from LA to New York

At this point I question your good faith. You're calling people out by name, and you're going full on "well, aktuallyyyy" and seeming to deliberately misunderstand other people's assertions. "People can't breathe underwater" v. "Well, once I was in a tunnel that was under a body of water, and I still breathed!(@!("

If you choose to define words differently than everyone else, you're just sabotaging your own communication to try and feel smart.


> You're calling people out by name

I am? Where?

> Getting "exactly once" processing usually requires making various kinds of resilience tradeoffs based on timing windows, storage requirements, etc

Yes, of course. But that's not the same as "impossible".


> I was reading Hacker News a few days ago and stumbled on a comment posted by ...

Really? That is what causes you to question whether or not I'm acting in good faith?

If that's what you call "calling people out by name" I guess we'll just have to agree to disagree.


The whole refusal to accept that a field could legitimately define something differently than how you prefer, and then running off to blog about it and name names... and then coming around for round II of flamewar... with ever more splitting of hairs in definitions... is not awesome.

You are especially well-answered here, I think: https://news.ycombinator.com/item?id=41599131

One reason the delivery / processing distinction exists is that, for correctness, the application very often needs to atomically persist "I have received this message" together with any other state changes made as a result of processing that message. You can't generally solve this with a layer put on top, even on the same machine. If it's not atomic, then you can still deliver duplicates to the application or end up never delivering to the application (for example, the power goes out when one side has written but not the other).

So, the state change to "already received" and the changes you want to make in response to the message being received have to happen together. TCP or even a message queueing implementation with a persistence layer cannot solve this problem for you. Thus, the application needs to deal with multiple delivery.

Imagine a "subtract $5 from my bank account" message with no ID on the message itself, and a layer "on top" that gives IDs and tries to ensure exactly once delivery. If the layer "on top" does not change state at the exact time $5 is deducted from the account, bad things can happen, and in practice a separate layer cannot make those two changes at the exact same time. Hence, the application needs to be able to cope with the "subtract $5" being delivered to it multiple times, and this deduping has to be intimately tied to it subtracting the $5 (processing).



