We didn't observe any automatic batching when testing Tigerbeetle with their Go client. I think we initiated a new Go client for every new transaction when benchmarking, which is typically how one uses such a client in app code. This follows with our other complaint: it handles so little you will have to roll a lot of custom logic around it to batch realtime transactions quickly.
I'm a bit worried you think instantiating a new client for every request is common practice. If you did that to Postgres or MySQL clients, you would also have degradation in performance.
PHP has created mysqli or PDO to deal with this specifically because of the known issues of it being expensive to recreate client connects per request
We shared the code with the Tigerbeetle team (who were very nice and responsive btw), and they didn't raise any issues with the script we wrote of their Tigerbeetle client. They did have many comments about the real-world performance of PostgreSQL in comparison, which is fair.
Thanks for the code and clarification. I'm surprised the TB team didn't pick it up, but your individual transfer test is a pretty poor representation. All you are testing there is how many batches you can complete per second, giving no time for the actual client to batch the transfers. This is because when you call createTransfer in GO, that will synchronously block.
For example, it is as if you created an HTTP server that only allows one concurrent request. Or having a queue where only 1 worker will ever do work. Is that your workload? Because I'm not sure I know of many workloads that are completely sync with only 1 worker.
To get a better representation for individual_transfers, I would use a waitgroup
var wg sync.WaitGroup
var mu sync.Mutex
completedCount := 0
for i := 0; i < len(transfers); i++ {
wg.Add(1)
go func(index int, transfer Transfer) {
defer wg.Done()
res, _ := client.CreateTransfers([]Transfer{transfer})
for _, err := range res {
if err.Result != 0 {
log.Printf("Error creating transfer %d: %s", err.Index, err.Result)
}
}
mu.Lock()
completedCount++
if completedCount%100 == 0 {
fmt.Printf("%d\n", completedCount)
}
mu.Unlock()
}(i, transfers[i])
}
wg.Wait()
fmt.Printf("All %d transfers completed\n", len(transfers))
This will actually allow the client to batch the request internally and be more representative of the workloads you would get. Note, the above is not the same as doing the batching manually yourself. You could call createTransfer concurrently the client in multiple call sites. That would still auto batch them
Yeah it was back in February in your community Slack, I did receive a fairly thorough response from you and others about it. However then there were no technical critiques of the Go benchmarking code, just how our PostgreSQL comparison would fall short in real OLTP workloads (which is fair).
I don’t think we reviewed your Go benchmarking code at the time—and that there were no technical critiques probably should not have been taken as explicit sign off.
IIRC we were more concerned at the deeper conceptual misunderstanding, that one could “roll your own” TB over PG with safety/performance parity, and that this would somehow be better than just using open source TB, hence the discussion focused on that.
Interesting, I thought I had heard that this is automatically done, but I guess it's only through concurrent tasks/threads. It is still necessary to batch in application code.
But nonetheless, it seems weird to test it with singular queries, because Tigerbeetle's whole point is shoving 8,189 items into the DB as fast as possible. So if you populate that buffer with only one item your're throwing away all that space and efficiency.
We certainly are losing that efficiency, but this is typically how real-time transactions work. You write real-time endpoints to send off transactions as they come in. Needing to roll more than that is a major introduction of complexity.
We concluded where Tigerbeetle really shines is if you're a large entity like a central bank or corporation sending massive transaction files between entities. Tigerbeetle is amazing for moving large numbers of batch transactions at once.
We found other quirks with Tigerbeetle that made it difficult as a drop-in replacement for handling transactions in PostgreSQL. E.g. Tigerbeetle's primary ID key isn't UUIDv7 or ULID, it's a custom id they engineered for performance. The max metadata you can save on a transaction is a 128-bit unsigned integer on the user_data_128 field. While this lets them achieve lightning fast batch transaction processing benchmarks, the database allows for the saving of so little metadata you risk getting bottlenecked by all the attributes you'll need to wrap around the transaction in PostgreSQL to make it work in a real application.
> you risk getting bottlenecked by all the attributes you'll need to wrap around the transaction in PostgreSQL to make it work in a real application.
The performance killer is contention, not writing any associated KV data—KV stores scale well!
But you do need to preserve a clean separation of concerns in your architecture. Strings in your general-purpose DBMS as "system of reference" (control plane). Integers in your transaction processing DBMS as "system of record" (data plane).
Dominik Tornow wrote a great blog post on how to get this right (and let us know if our team can accelerate you on this!):
> We didn't observe any automatic batching when testing Tigerbeetle with their Go client.
This is not accurate. All TigerBeetle's clients also auto batch under the hood, which you can verify from the docs [0] and the source [1], provided your application has at least some concurrency.
> I think we initiated a new Go client for every new transaction when benchmarking
The docs are careful to warn that you shouldn't be throwing away your client like this after each request:
The TigerBeetle client should be shared across threads (or tasks, depending on your paradigm), since it automatically groups together batches of small sizes into one request. Since TigerBeetle clients can have at most one in-flight request, the client accumulates smaller batches together while waiting for a reply to the last request.
Again, I would double check that your architecture is not accidentally serializing everything. You should be running multiple gateways and they should each be able to handle concurrent user requests. The gold standard to aim for here is a stateless layer of API servers around TigerBeetle, and then you should be able to push pretty good load.