
Event Buffering - craigkerstiens
https://gist.github.com/1823481
======
tim_h
Basically, in a client-server relationship, use persistent buffering on the
client-side so that the client can tolerate server downtime.

I like the simplicity of this approach. It's best to keep the low-level stuff
as simple as possible when building distributed systems. It will get
complicated soon enough at the higher levels.
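
To make the idea concrete, here is a minimal sketch of a client-side persistent buffer, not the gist's actual code. It assumes SQLite as the local store, and the names `PersistentBuffer`, `pending`, and `send` are all hypothetical. An event is deleted only after `send` returns, so a server outage just leaves rows waiting for the next flush.

```python
import sqlite3


class PersistentBuffer:
    """Hypothetical client-side buffer that keeps pending events in SQLite."""

    def __init__(self, path):
        # Assumption: in real use `path` is a file on disk so events survive restarts.
        self.conn = sqlite3.connect(path)
        with self.conn:
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS pending ("
                "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
            )

    def append(self, payload):
        # Committed locally before any attempt to reach the server.
        with self.conn:
            self.conn.execute("INSERT INTO pending (payload) VALUES (?)", (payload,))

    def flush(self, send):
        """Try to deliver events in order; stop at the first connection failure."""
        rows = self.conn.execute(
            "SELECT id, payload FROM pending ORDER BY id"
        ).fetchall()
        delivered = 0
        for row_id, payload in rows:
            try:
                send(payload)  # `send` is a caller-supplied function (assumption)
            except ConnectionError:
                break  # server is down; keep the rest for a later flush
            with self.conn:
                self.conn.execute("DELETE FROM pending WHERE id = ?", (row_id,))
            delivered += 1
        return delivered
```

Because the delete happens after the send, a crash between the two steps means the event is sent again on the next flush. That is at-least-once delivery, so the server side would need to tolerate duplicates.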

------
fleitz
I wish people would stop using SQL as a message queue; message queuing is not
what SQL was designed for.

Even if you used the shittiest message queue imaginable (email) you get this
'buffering' functionality for free.

~~~
aphyr
The point, I think, is that the event buffer obeys the same transactional
semantics as the data the event refers to. If you are not able to thread all
state atomically through the message system, this can offer improved
consistency guarantees.

~~~
andma
You hit the nail on the head. If you don't need strong consistency guarantees,
any sort of message queue will do. That is a solved problem.

There are two key quotes from the article:

1\. "... if our queue begins dropping messages, we run the risk of silently
corrupting data"

The two statements they are executing, "app.update" (modifying the database)
and "enqueue" (sending to the message queue), are not atomic. They could make
them atomic using something like two-phase commit, but implementing that would
be more trouble than it's worth. Additionally, there would be performance
(latency) implications, since it is a synchronous operation.

2\. "Notice how both of our writes are inside of a local database transaction"

Now they have atomicity very easily!
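
A rough sketch of what that second quote amounts to, under my own assumptions: SQLite stands in for the real database, and the `apps` and `event_buffer` tables and the function name are made up for illustration, not taken from the article. The point is only that the data write and the event write commit or roll back together.

```python
import sqlite3


def update_app_and_buffer_event(conn, app_id, new_name, payload):
    """Write the data change and its event in one local transaction."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE apps SET name = ? WHERE id = ?", (new_name, app_id))
        conn.execute(
            "INSERT INTO event_buffer (app_id, payload) VALUES (?, ?)",
            (app_id, payload),
        )


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE apps (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE event_buffer ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "app_id INTEGER NOT NULL, payload TEXT NOT NULL)"
)
conn.execute("INSERT INTO apps (id, name) VALUES (1, 'old')")
conn.commit()

update_app_and_buffer_event(conn, 1, "new", "renamed:new")
```

If the insert into `event_buffer` fails, the update to `apps` is rolled back too, so you never end up with a change that has no event or an event that has no change.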

I agree with the grandparent that using SQL as a message queue is generally a
bad idea. However, the pitfalls are well known and can be engineered around.
See this article:
<http://www.engineyard.com/blog/2011/5-subtle-ways-youre-using-mysql-as-a-queue-and-why-itll-bite-you/>

I am actually working on something very similar to this at my day job. We
implemented the "buffer processor" that the article describes as a Scala
process. Perhaps what's most interesting about the entire system, as mentioned
at the end of the article, is maintaining the large amount of state for
durability (we use HBase) and deliverability (we wrote a custom Scala
server).

------
skMed
This sort of event buffer may also be achieved by leveraging Change Data
Capture (CDC) features if they are supported by your data store. For example,
when CDC is enabled in SQL Server, an agent service will periodically examine
the transaction log and move data changes to transient "buffer tables"
automatically. Since this is happening as a background service, there is no
need for a developer to explicitly perform a write operation to these buffer
tables.

~~~
ryandotsmith
Interesting. Given that I primarily use PostgreSQL, I have not come across
this feature in my day-to-day. However, I can imagine building such a thing in
PostgreSQL. Perhaps with a system of triggers and dblink [1] one can create a
mechanism to achieve a similar result.

1\. <http://www.postgresql.org/docs/9.1/static/dblink.html>
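
The trigger half of that idea might look roughly like the sketch below. To keep it runnable without a server it uses SQLite through Python rather than PostgreSQL, it leaves dblink out entirely, and the table and trigger names are invented for the example. A PostgreSQL version would need a trigger function and different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: `apps` is the real table, `apps_changes` is the buffer table.
conn.executescript(
    """
    CREATE TABLE apps (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    CREATE TABLE apps_changes (
        change_id INTEGER PRIMARY KEY AUTOINCREMENT,
        app_id    INTEGER NOT NULL,
        old_name  TEXT,
        new_name  TEXT
    );

    -- Record every update to apps without the application writing the row itself.
    CREATE TRIGGER apps_capture_update AFTER UPDATE ON apps
    BEGIN
        INSERT INTO apps_changes (app_id, old_name, new_name)
        VALUES (OLD.id, OLD.name, NEW.name);
    END;
    """
)

with conn:
    conn.execute("INSERT INTO apps (id, name) VALUES (1, 'old')")
    conn.execute("UPDATE apps SET name = 'new' WHERE id = 1")

changes = conn.execute(
    "SELECT app_id, old_name, new_name FROM apps_changes"
).fetchall()
print(changes)  # [(1, 'old', 'new')]
```

One difference from the CDC setup described above: a trigger runs inside the writing transaction, so the capture is synchronous, whereas the SQL Server agent reads the transaction log in the background.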

~~~
skMed
CDC is definitely a more enterprise-y feature (Oracle, SQL Server, etc.). I
have had to build the same sort of tracking via triggers in the past, so I
would definitely agree with you. I am not super familiar with the PostgreSQL
road map, but maybe it is a long-term goal.

