
    Open db connection
    get data for receipt
    generate receipt
    send email
    write success to database
    close connection

To a junior programmer this would probably look reasonable, and to be fair, it takes some experience and getting burned, or good training, to know it is not.
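In concrete terms, the trap looks something like this (a sketch in Python with psycopg2; the receipts table, id, and addresses are all made up):

    import smtplib
    import psycopg2

    receipt_id, to_addr = 42, "user@example.com"  # hypothetical values

    conn = psycopg2.connect("dbname=app")         # hypothetical DSN
    cur = conn.cursor()

    # BEGIN is implicit: everything up to commit() is one transaction,
    # so this connection is pinned and any locks are held...
    cur.execute("SELECT total FROM receipts WHERE id = %s", (receipt_id,))
    total = cur.fetchone()[0]

    body = f"Subject: Receipt\n\nYour total was {total}."

    # ...including across this slow, fallible network call:
    with smtplib.SMTP("localhost") as smtp:
        smtp.sendmail("billing@example.com", [to_addr], body)

    cur.execute("UPDATE receipts SET sent = true WHERE id = %s",
                (receipt_id,))
    conn.commit()
    conn.close()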




I was about to argue but then I realized: I've been working in the MS world for so long, I forgot that not everybody has connection pools.

The MS recommendation is the opposite: get the connection early, finish with it late, let us worry about the "real" connection. I've been working on multi-TB databases like that, with hundreds of concurrent requests, and never had problems.
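For anyone outside that world: the point of a pool is that "open" and "close" in application code become nearly free checkouts. A rough Python equivalent using psycopg2's built-in pool (DSN made up):

    from psycopg2.pool import SimpleConnectionPool

    # The pool keeps a few real TCP connections alive and hands them out.
    pool = SimpleConnectionPool(1, 20, "dbname=app")  # hypothetical DSN

    conn = pool.getconn()       # "open" early: just a checkout
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
    finally:
        pool.putconn(conn)      # "close" late: the real connection survives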


This has nothing to do with pooling: you can't use a connection pool in this scenario, because you're in a transaction. Even in the MS world, a transaction requires a single connection to be held throughout, without releasing it back to the pool. If you're not in a transaction, each access to the db uses a fresh connection from the pool and this problem never comes up; but the OP here is assuming the time between open and close is a transaction.


So the correct version is, I guess:

    Open db connection 1
    get data for receipt
    Close db connection 1
    generate receipt
    send email
    open db connection 2
    write success to database
    close db connection 2

I guess you could speed this up a lot by doing it in bulk instead of opening and closing db connections twice for every email. Anyway, the version you wrote sounds reasonable for everything but really big operations to me, but then again I'm fairly junior.
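In code, something like this (a sketch with psycopg2, reusing the made-up names from the sketch upthread; send_email is a hypothetical stand-in):

    import psycopg2

    # Connection/transaction 1: read what we need, then let go.
    conn = psycopg2.connect("dbname=app")
    with conn, conn.cursor() as cur:    # commits on exit, doesn't close
        cur.execute("SELECT total FROM receipts WHERE id = %s",
                    (receipt_id,))
        total = cur.fetchone()[0]
    conn.close()

    body = f"Subject: Receipt\n\nYour total was {total}."
    send_email(to_addr, body)           # no db resources held during this

    # Connection/transaction 2: record success, briefly.
    conn = psycopg2.connect("dbname=app")
    with conn, conn.cursor() as cur:
        cur.execute("UPDATE receipts SET sent = true WHERE id = %s",
                    (receipt_id,))
    conn.close()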


You can still do it in one transaction, sort of. Replace "send email" with "queue email", and then either eliminate the step that writes the success to the database (pushing that off to the queue processor, which will open, or will already have open, a database connection) or instead write to the DB that you queued the item.

The advantage you gain is that the queue is a nice buffer if something gets held up with email sending, and also, the queue processor can work in bulk, say sending 100 emails, and then opening a connection to the DB and writing them all in one statement.
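As a sketch, assuming a hypothetical email_outbox table:

    import psycopg2

    # One short transaction: read the receipt AND queue the email
    # atomically; no network calls happen inside it.
    conn = psycopg2.connect("dbname=app")
    with conn, conn.cursor() as cur:
        cur.execute("SELECT total FROM receipts WHERE id = %s",
                    (receipt_id,))
        total = cur.fetchone()[0]
        cur.execute("INSERT INTO email_outbox (receipt_id, body) "
                    "VALUES (%s, %s)", (receipt_id, f"Total: {total}"))
    conn.close()

    # The queue processor drains in bulk: send up to 100 emails,
    # then record them all in one statement.
    def drain_outbox(conn):
        with conn, conn.cursor() as cur:
            cur.execute("SELECT id, body FROM email_outbox "
                        "WHERE sent_at IS NULL LIMIT 100")
            rows = cur.fetchall()
        for _id, body in rows:
            send_email(body)                       # made-up sender
        with conn, conn.cursor() as cur:
            cur.execute("UPDATE email_outbox SET sent_at = now() "
                        "WHERE id = ANY(%s)", ([r[0] for r in rows],))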


So if the "send email" step fails (temporarily), the next worker to come along will grab the same receipt and send it again?

I think a better solution would be to use a centralized queue that actually does the mail sending, and retries in case of failure.


That depends on the implementation of the queue. Most queues have a "failed/retry" concept, where they mark failed jobs to be retried in some set future point in time. So one failed job does not hold up your entire queue processing.
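E.g., with attempts/next_attempt_at columns added to the hypothetical outbox table above, a failure just reschedules the job with backoff:

    def mark_failed(conn, job_id, attempts):
        # Don't dequeue on failure: bump the attempt count and push
        # next_attempt_at into the future (exponential backoff).
        with conn, conn.cursor() as cur:
            cur.execute("UPDATE email_outbox SET attempts = %s, "
                        "next_attempt_at = now() + "
                        "(%s * interval '1 minute') WHERE id = %s",
                        (attempts + 1, 2 ** attempts, job_id))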


> a centralized queue that actually does the mail sending, and retries in case of failure.

A bit like SMTP?


Not necessarily: in the first write we can set the receipt status to "queued", so the queue system can update it to "sent" or retry.


Depending on the number of email-sending workers, it could be closer to:

    transaction 1:
        mark an unsent receipt as being processed
        get the marked receipt
    send email
    transaction 2:
        mark the receipt as sent
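In Postgres, for example, transaction 1 can claim a row atomically, so concurrent workers never grab the same receipt (a sketch; the table and status values are assumed):

    def claim_receipt(conn):
        # Transaction 1: mark one unsent receipt as processing and
        # fetch it in a single statement. SKIP LOCKED makes parallel
        # workers each claim a different row instead of blocking.
        with conn, conn.cursor() as cur:
            cur.execute("""
                UPDATE receipts SET status = 'processing'
                WHERE id = (SELECT id FROM receipts
                            WHERE status = 'unsent'
                            LIMIT 1 FOR UPDATE SKIP LOCKED)
                RETURNING id, body
            """)
            return cur.fetchone()   # None if nothing is pending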


You don't need to close the database connection to not have an open transaction, right?


No, you don't. But some ORMs batch writes unless you explicitly tell them not to, and will flush when you close the connection. So some programmers will just close the connection because they don't understand the difference or because they've learned that closing the connection guarantees that the data is written.
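With SQLAlchemy, for instance, the distinction looks like this (a sketch; the Receipt model and engine are assumed to exist):

    from sqlalchemy.orm import Session

    with Session(engine) as session:             # engine assumed configured
        session.add(Receipt(id=42, sent=True))   # hypothetical model
        session.flush()    # SQL is sent to the db, but still uncommitted
        session.commit()   # THIS is what makes the write durable; you
                           # never need close() for that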


> So some programmers will just close the connection because they don't understand the difference or because they've learned that closing the connection guarantees that the data is written.

To be fair, this problem isn't limited to databases. Filesystems, even HDDs/SSDs, have been known to readily ignore flush() calls in order to achieve better benchmark results.
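The layering is visible even with plain files: flush() only moves bytes from the process buffer to the OS, and even fsync() is only as honest as the drive underneath:

    import os

    with open("receipt.log", "a") as f:
        f.write("receipt 42 sent\n")
        f.flush()              # Python buffer -> OS page cache, not disk
        os.fsync(f.fileno())   # ask the OS to hit the device; a drive
                               # with a volatile write cache may still
                               # ack before the data is truly durable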


This can still be a pretty reasonable and correct course of action. Everything depends on the amount of data you are locking and the number of transactions open simultaneously.

When there's noticeable contention due to the number of parallel transactions, one should consider ways to loosen the data guarantees (the status can stay unset for some time) and go for an explicitly asynchronous approach.


Thanks for the illustration. But if the Uber team was using an ORM, then I'd expect the ORM to take care of the connection pool.

"get data for receipt" would be a SELECT, and only "write success to database" would do an INSERT or UPDATE. I'd expect a junior programmer to complete the above in at least two SQL calls. I have a feeling they were trying something smart.

EDIT: hmm, reading the other commenter above, they were probably trying to lock the data for full data integrity. Okay, that makes sense then. I was looking at the problem from the wrong angle.


Grapevine says that they were holding a transaction open for the entire duration of the Uber ride.



