

Why are Oracle and DB2 still on top of TPC-C? A roadmap to end their dominance. - pron
http://dbmsmusings.blogspot.com/2012/05/if-all-these-new-dbms-technologies-are.html

======
peapicker
"How is technology designed decades ago still dominating TPC-C?" -- this
assumption overlooks that, even though the interface design is decades old,
the engine underneath it can change and evolve to take advantage of new
technologies, at least to some extent.

Additionally, the reason these two are at the top of the heap is that Oracle
and IBM both put a great deal of money not only into the software, but into
designing and producing the machines that run these tests, the filesystems and
operating systems, etc -- it is a total package with a driving focus related
to database performance; these tests are run on multi-million dollar machines
made by those companies.

Until you have a complete end-to-end ecosystem, with a focus on everything
from the hardware, through the OS, all the way to the DB, you probably aren't
going to beat them with any 'new' technology...

If a 'new technology' does beat out the traditional ones, I expect the
products at the top of the performance curve for the near future (10 yrs) to
also come from IBM and Oracle... especially since Oracle has already released
an ACID NoSQL DB engine that is also tightly wedded to their full ecosystem.
(IBM may have as well, but I didn't notice if they did.)

~~~
jhugg
It is, surely, difficult to create a product that is better than Oracle/DB2 in
every dimension. These systems have thousands of engineer-years of work baked
into them.

On the other hand, it's pretty easy to beat these systems by specializing.
Take some part of what they do and do it better. [Insert sports car / minivan
/ mack truck analogy here]

The interesting question is whether the pain of using the new system is less
than the pain of using the old system. New system pain often comes from having
fewer features or a less robust implementation. Legacy pain often comes from
managing software that has to maintain compatibility with 20 year old apps and
scaling workloads on software designed when 8MB of RAM was a lot and clusters
weren't practical.

I will also say from experience: if you take a copy of Oracle and the same
hardware from the TPC-C leaderboard, you will have a hell of a time
replicating their results. They use every trick in the book and spend huge
sums of money tuning for these benchmarks (n.b. I don't fault them). In
practice, getting throughput close to what they claim on actual real-world
apps is not realistic.

Now, you should take all benchmarks with a grain of salt (even mine). Ask a
vendor how fast the system will run on a real workload that you understand
with the configuration you plan to deploy with. If their number is attractive,
build a POC and check for yourself.

------
stephen
Sounds like event sourcing [1] applied to databases.

The server-side transactions reminded me a lot of VoltDB [2], which also has
server-side transactions and turns out to be a previous system Abadi was
involved in.

My naive impression is that VoltDB is more about being in-memory, single-
threaded, and lock-free, whereas this Calvin approach is more about
deterministic ordering of events.

Sounds really cool, but there is a reason people don't like writing stored
procedures. Besides (historically) meaning you're forced into a non-general
programming language, it means you can't do anything else while executing the
logic (call out to a 3rd party, read a file from somewhere else, etc.).

While SQL has its flaws, the semantics of writing your application logic
freely over multiple queries ("make a query, do some work, make a query, do
some work, ... commit!") and having the database just make that magically work
for you is pretty darn nice and not something I'd easily give up.

Maybe a heavy dose of optimistic locking and a pleasant/mostly automated
process of coding/integrating/deploying the stored procedures would make it
possible to get all the performance benefits of their approach. It does sound
pretty smart.

[1]: <http://martinfowler.com/eaaDev/EventSourcing.html>

[2]: <http://voltdb.com/>
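The "heavy dose of optimistic locking" mentioned above can be sketched in a few lines. Everything here is invented for illustration (the table layout, the `transfer_out` name, the lock standing in for a database's atomic compare-and-swap); it only shows the read-compute-validate-retry shape of the pattern:

```python
import threading

# Toy in-memory "table": row id -> (version, balance).
table = {"acct1": (0, 100)}
lock = threading.Lock()  # stands in for the DB's atomic compare-and-swap

def transfer_out(row_id, amount, max_retries=5):
    """Optimistic update: read without locking, do arbitrary work, then
    commit only if the row's version is unchanged; otherwise retry."""
    for _ in range(max_retries):
        version, balance = table[row_id]      # snapshot read, no lock held
        new_balance = balance - amount        # "do some work" happens here
        with lock:                            # atomic check-and-commit
            if table[row_id][0] == version:   # nobody changed it under us
                table[row_id] = (version + 1, new_balance)
                return True
        # version moved: another writer committed first; re-read and retry
    return False
```

The point of the pattern is that the application logic between read and commit can run freely (call out to a third party, read a file) because no lock is held during it; conflicts are detected at commit time instead.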

~~~
zzzeek
It does say it supports ad-hoc Python blocks. I can see the potential for
Python frameworks/ORMs being able to "export" a series of instructions over to
such a system, or alternatively allow the object-relational/database
abstraction layer to just run on the server side to start with. If you can
produce a DBAPI-like API on the server, then ORMs like SQLAlchemy and maybe
others should be fully usable within that environment. Client-server
communication might use some kind of RPC-like system. Depending on how
flexible the Python environment is, the client system could send over blocks
of Python code on the fly which get cached, making the system almost
transparent.
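A toy sketch of that idea, with invented names (nothing here is a real Calvin or DBAPI interface): the server compiles client-supplied Python blocks once, caches them by content hash, and re-executes them against server-side state:

```python
import hashlib

class ProcServer:
    """Hypothetical server that accepts Python blocks shipped by clients."""

    def __init__(self):
        self.cache = {}   # sha256 hex digest -> compiled code object
        self.db = {}      # stand-in for server-side storage

    def register(self, source):
        """Compile a client block once; repeat registrations are free."""
        digest = hashlib.sha256(source.encode()).hexdigest()
        if digest not in self.cache:
            self.cache[digest] = compile(source, "<client-block>", "exec")
        return digest

    def invoke(self, digest, **params):
        """Run a cached block with storage plus client arguments in scope."""
        env = {"db": self.db, **params}
        exec(self.cache[digest], env)
        return env.get("result")

server = ProcServer()
h = server.register("db[key] = value\nresult = db[key]")
```

A client would call `server.invoke(h, key="a", value=1)`; after the first registration only the short digest travels over the wire, which is the "send code on the fly, cached" idea from the comment above.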

------
SpikeGronim
Vendors rig these benchmarks to death. An ex-coworker of mine worked on a
major proprietary SQL db. He told me that they spent 6 months with both
software and hardware vendor engineers optimizing for the benchmark. They put
TPC specific code in the query planner!

------
tlogan
This might be useful for database-as-a-service offerings, if that trend picks
up steam.

BTW, I think the reason Oracle and DB2 are still on top of TPC-C is that you
really don't need any faster database for OLTP than they are now. The number
of customers of any online shop will not exceed 4 billion... The need for
faster OLTP is just not there.

Warehousing and big data is a different beast... Oracle did not invest to make
faster OLTP but faster OLAP...

~~~
jhugg
If you count OLTP as "a system that places orders created by people", then you
might be right.

As a VoltDB eng, we see lots of OLTP workloads that require much more
throughput than legacy RDBMSs can offer on a single system.

How many transactions do you think it requires to pick which ads to show you
on a webpage?

How many updates to your Farmville-style game state are required per minute?
How many concurrent users? What about MMOs?

Are financial exchange orders OLTP? If so, you have a six-figure TPS problem.

What about sensor data? Network packet monitoring? Call data record
dashboards?

~~~
tlogan
Yes there is a market but it is not very big. There is only one Zynga.

Also note that some of the examples you mention are very well solved by
streaming databases (Streambase) and event-processing systems (which are sold
as add-ons to databases).

~~~
jhugg
A big part of my job is talking to people who have scale pain with legacy
systems. I don't know what percentage of the DB market this is, but it's
nontrivial and growing fast.

Most of the markets I mentioned in the previous comment are nearly impossible
to be successful in with a single node of legacy RDBMS sitting behind your
app. Zynga is far from alone in social gaming scale pain.

Consider digital ad-tech. How many ads do you have to show before someone
clicks on one? How many clicks do you need to earn $1? That translates into:
all of those DB operations together need to cost way less than $1, or I'm
hosed. Enter systems that can scale with less pain.

Streambase is a good example of a specialized system that can outperform
legacy RDBMSs. Still, it's not like every financial problem with scale pain
can be fixed with Streambase; it's too specialized. What if you need 100GB of
state? There are lots of problems in finance, and some of them can be solved
with Oracle/DB2 while others can't.

------
gouranga
Two reasons:

1\. TPC-C is a vendor-sponsored cock-measuring competition that no one other
than marketing people takes seriously.

2\. Most people scale out rather than scale up so it's pretty much irrelevant
as it considers monolithic computing only.

TPC-C is from the dark ages...

~~~
huggyface
_that no one other than marketing people takes seriously_

Plenty of people take it seriously. I take it seriously.

 _Most people scale out rather than scale up so it's pretty much irrelevant as
it considers monolithic computing only._

Yet the top results are clusters.

You're 0 out of 2. Any more wisdom about TPC?

It's also worth considering audiences: if you're a web-scale company holding
recipes for millions of free accounts, you're a bit different from a hedge
fund churning out performance numbers for its investor statements. Judging
the latter from the perspective of the former is asinine.

~~~
jhugg
People take TPC-C seriously because not much has come along that's more useful
as a transactional benchmark. There are lots of transactional benchmarks, but
they're often flawed in some annoying way and don't serve as well as a
baseline as TPC-C does.

That said, TPC-C is horribly out-of-date. For example, you have to simulate
human data entry time within transactions. How many systems have that problem
today? It also only ever adds data, so if you run it fast enough, you have a
petabyte problem, and most OLTP isn't a petabyte problem.

As for the second point, those clusters are clusters in name only. They use
fancy and expensive interconnects and caches that effectively give them shared
memory. Also, they won't tolerate failure of any individual component well.
Finally, individual nodes still act as gatekeepers and transaction monitors.
Most of the cluster is simply there to apply predicates on data coming off of
disk really fast.

~~~
huggyface
_For example, you have to simulate human data entry time within transactions.
How many systems have that problem today?_

How many people have problems with _concurrency_? A shitload of people, that's
who. This isn't a problem for no-locking models because... no locking. It is
when you care about consistency.

 _They use fancy and expensive interconnects and caches that effectively give
them shared memory._

They often use high-speed interconnects because the sort of customers who care
about such build-outs would naturally use them. They are, however, clusters in
every sense of the word, regardless of any no-true-Scotsman fallacies.

~~~
jhugg
I'm very familiar with the TPC-C spec and the problem can't be brushed off
with "concurrency!". There are multi-second waits in a large percentage of
transactions. Nobody does this in any performant OLTP system today, but sure,
concurrency! Except the benchmark limits how many of these transactions can be
concurrently operating on a warehouse, one of its core models. So the only way
to scale throughput is to add warehouses. You end up simulating a company with
a million warehouses, each with a fairly small load. Furthermore, to run a
million transactions a second, you'll need several million open transactions.
That's why you see armies of client nodes in the spec of the systems on the
leaderboards. The append-only data model is also difficult. If you removed
waits and allowed old data to be pruned, the benchmark would be much more
useful.
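The "several million open transactions" figure above is just Little's law: open transactions = throughput x average in-flight time. The wait figure below is illustrative, not taken from the spec:

```python
def open_transactions(tps, avg_seconds_in_flight):
    """Little's law: concurrency needed to sustain a given throughput
    when each transaction stays open for the given average time."""
    return tps * avg_seconds_in_flight

# A million transactions/sec with even a few seconds of simulated
# keying/think time held open inside each transaction:
needed = open_transactions(1_000_000, 5.0)
print(needed)  # 5000000.0 -- several million concurrently open transactions
```

That concurrency has to live somewhere, which is why the full-disclosure reports show armies of client machines alongside the database servers.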

So yes, technically Oracle Exadata OLTP Clustering has multiple CPUs connected
by a high-speed interconnect. Cluster.

My dual socket commodity Dell server also has multiple CPUs connected by a
high speed interconnect. Cluster?

My point was not to argue about a word, just that Exadata is not what some
typically think of as a cluster in modern distributed systems. They've moved
some smart filtering into a SAN and plugged that into RAC and shipped it all
in a big hot tower-thingy. It's not bad in any way, it's just closer to SMP
than other kinds of network-based parallelism.

~~~
huggyface
_My dual socket commodity Dell server also has multiple CPUs connected by a
high speed interconnect. Cluster?_

The top TPC-C cluster has 27 database servers, each with 4 processors and
512GB of RAM. Can your Dell server host 108 processors and 13.8TB of RAM?

The Sun cluster is exactly what most people think of as a "cluster" -- a
scalable set of servers that vastly exceeds single server performance.

~~~
jhugg
Yes, it's a cluster. I totally concede.

I just meant it's not the same kind of clustering as Hadoop/HBase, Vertica,
VoltDB, Cassandra, Riak, Greenplum, Netezza, Teradata or even DB2-Cluster.

If clustering is a spectrum where Dynamo-style systems like Riak are on one
side and my Dell SMP system is the other extreme, the Sun cluster is probably
closer to the SMP system than to the Riak cluster.

------
gaius
Have you not just reinvented CICS?

~~~
JoachimSchipper
If you mean <http://en.wikipedia.org/wiki/CICS>, that does look similar. From
a quick look at the Wikipedia page, it does not appear to be a very
distributed system, though.

~~~
noselasd
You can scale it pretty amazingly, though only on homogeneous IBM mainframes.

------
spitfire
I haven't read the full paper yet, but this sounds like a winner to me. It's
actually not a database but a transaction-scheduling system sitting in front
of the databases.

Hopefully these sort of ideas make it into Postgres for scalability.

------
sixbrx
Looks very interesting.

The "limitations" are especially interesting:

"Calvin’s primary limitation compared to other systems is that transactions
must be executed entirely server-side. Calvin has to know in advance what code
will be executed for a given transaction."

Also, the author mentions in the comments that certain non-deterministic
functions, such as fetching random numbers or the current date/time, will not
be allowed within the server-side transactions; the client will have to pass
such values to the server.
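A minimal sketch of that constraint, with hypothetical names (this is not Calvin's actual API): the client resolves randomness and wall-clock time up front, so the server-side transaction is a pure function of its arguments and replays identically on every replica:

```python
import random
import time

def place_order(db, order_id, item, now, coupon_roll):
    """Deterministic server-side transaction: no time.time() or random()
    calls in here; 'now' and 'coupon_roll' are supplied by the client."""
    db[order_id] = {"item": item, "ts": now, "discount": coupon_roll < 0.1}
    return db[order_id]

# Client side: evaluate the non-deterministic values, then submit them
# as plain arguments along with the rest of the transaction input.
db = {}
place_order(db, 1, "widget", now=time.time(), coupon_roll=random.random())
```

Given the same argument values, every replica that executes `place_order` in the same position of the transaction order reaches the same state, which is what makes the deterministic-ordering approach work.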

------
jacquesm
Could someone please explain the sentence at the bottom of page3 of the pdf?
It reads:

"This decoupling makes it impossible to implement certain popular recovery and
concurrency control techniques such as the physiological logging in ARIES and
next-key locking to handle phantoms (i.e. using physical surrogates for
logical properties in concurrency control)."

Also: the pre-fetch trick, where the read request is sent to the storage layer
pre-emptively with an artificial delay for all dependent operations, is
clever, but it could be derailed quite a bit when a drive recalibrates. That
can take a large multiple of the time a typical seek takes (which is the case
you'd be optimizing for here).

------
kyberias
Can someone explain why the IBM machines on the list are running the AIX
operating system and Microsoft's COM+? I thought COM+ was a Windows thing.

~~~
DrJokepu
This answer on Stackoverflow explains it:
<http://stackoverflow.com/a/401999/8954>

~~~
kyberias
Sorry, no, it doesn't explain it at all! I already know COM+ is delivered on
Windows 2000 and above. But the TPC benchmark table talks about AIX, which is
a Unix, not Windows! Why do they say AIX, not Windows, if they're running
COM+?

~~~
DrJokepu
"In fact, it's usually used as the TP monitor in TPC-C benchmark systems
because it's more efficient than .Net or Java and much cheaper than Tuxedo or
Encina (which reduces the $/TPM)."

~~~
kyberias
So are they running Windows boxes but it's nowhere to be seen? Only AIX is
mentioned.

