
Call me maybe: Aerospike - joshrotenberg
https://aphyr.com/posts/324-call-me-maybe-aerospike
======
Ankhers
I currently work for a company that uses Aerospike quite heavily. In the past
couple weeks, we have begun to notice data inconsistencies in our counters. We
are seeing fluctuations in the data, despite having no decrement operations.

We have the enterprise edition of Aerospike, allowing us to be in constant
contact with their support team and developers. A couple weeks later, and we
still have no idea why this is happening. When dealing with monetary values,
these fluctuations are very bad for us. Needless to say, we have begun
migrating away from Aerospike.
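
The symptom described above (an increment-only counter whose value goes down between reads) can be checked mechanically. What follows is a minimal sketch, assuming the counter is sampled periodically into a list; `find_regressions` and the sample readings are hypothetical, and no Aerospike client API is involved.

```python
from typing import List, Tuple


def find_regressions(readings: List[int]) -> List[Tuple[int, int, int]]:
    """Return (index, previous, current) for every read that went backwards.

    An increment-only counter should never decrease between reads, so any
    entry returned here points at lost or reordered updates.
    """
    regressions = []
    for i in range(1, len(readings)):
        if readings[i] < readings[i - 1]:
            regressions.append((i, readings[i - 1], readings[i]))
    return regressions


# Hypothetical sequence of successive reads of one counter.
sample = [10, 12, 15, 14, 18, 17, 21]
print(find_regressions(sample))
```

A check like this only detects regressions that are visible between two samples; it says nothing about which writes were lost.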

~~~
jboggan
What is the rationale for storing monetary values in this sort of system? Not
being snarky, just legitimately curious what scale of service could possibly
necessitate that and what solutions didn't work beforehand.

~~~
CyrusL
Transactions in AdTech are different than normal payments.

For example, imagine an ad campaign spending $30k/month at a rate of $5 per
1,000 impressions. The customer may want their budget spread evenly throughout
the month, so the software sets a daily budget of $1000. But this really
represents 200,000 daily impressions, each of which is a transaction that
subtracts from the available balance in real time. The buyer's software is
talking to an ad exchange and keeping track of the budget every time an
individual impression is won.

To add some more complexity, the impressions are probably billed as second-
price auctions, so they aren't all exactly $0.005 each. Some are $0.00493,
some are $0.00471, etc. Each one of these numbers is reported back from the
exchange to the buyer's software in real time and the buyer is responsible for
managing their budget.
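
The arithmetic in this example can be written out as a rough sketch. The 30-day month and the `DailyBudget` class are assumptions made for illustration; only the dollar figures and the two sample clearing prices come from the text above.

```python
from decimal import Decimal

MONTHLY_BUDGET = Decimal("30000")  # $30k/month, from the example
CPM = Decimal("5")                 # $5 per 1,000 impressions
DAYS_IN_MONTH = 30                 # assumption: a 30-day month

daily_budget = MONTHLY_BUDGET / DAYS_IN_MONTH
daily_impressions = daily_budget / CPM * 1000


class DailyBudget:
    """Hypothetical tracker: subtracts each won impression's clearing price."""

    def __init__(self, limit: Decimal) -> None:
        self.remaining = limit

    def record_win(self, clearing_price: Decimal) -> bool:
        """Subtract one impression's price; return False once exhausted."""
        self.remaining -= clearing_price
        return self.remaining > 0


budget = DailyBudget(daily_budget)
# Second-price auctions clear slightly below the nominal $0.005.
for price in (Decimal("0.00493"), Decimal("0.00471")):
    budget.record_win(price)

print(daily_budget, daily_impressions, budget.remaining)
```

`Decimal` is used rather than `float` so that sub-cent prices add up exactly. A single in-process object like this sidesteps the hard part, which is keeping that balance correct when many bidders update it concurrently.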

This is just an example, but hopefully it illustrates how it can become
impractical to account for this kind of thing using something more traditional
like PostgreSQL. It would be reasonable to log all the impressions to
something like Hadoop for the analytical piece of the software, but there
needs to be something more real-time for budgeting to prevent overspending.
The big ad exchanges can host hundreds of thousands or even millions of
auctions per second, so not turning off bidding can be very costly.

This process of auctioning ad impressions across many buyers through an API is
called real-time bidding.

~~~
hurin
Why does this need to be in real time? If their daily budget is $1000, you can
still wait quite a bit and then apply increments in aggregate (e.g. hourly).
Moreover, it sounds like the customers aren't interconnected, so it hardly
seems like a complex distributed problem.

~~~
cldellow
The impressions may be sparse, e.g. say you're retargeting CEOs (demographic
information you're getting from a DSP) who have visited your website in the
last month (via a pixel you drop) who are in New York City (via a geoIP DB).

So, fine, a probabilistic model might work well. And you might decide to bid
on 100% of impressions. And you might decide that you have to bid $200 CPM to
win -- which you're OK doing, because they're sparse.

And then say that FooConf happens in NYC and your aggressive $200 CPM bid 100%
of the time blows out your budget.

Often you can charge the customer you're acting on behalf of your actual spend
+ X% up to their campaign threshold. So you really want to ensure that you
spend as much as possible, without spending too much. Pacing is hard. Google
AdWords, for example, only promises to hit your budget +/- 20% over a 1 month
period.
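
To put rough numbers on the FooConf scenario, here is a back-of-the-envelope sketch of how much can be spent between two budget checks. The win rate of 10 impressions per second is a made-up figure for illustration; only the $200 CPM comes from the example above.

```python
def spend_between_checks(cpm_usd: float,
                         wins_per_second: float,
                         check_interval_s: float) -> float:
    """Dollars spent between two budget checks at a steady win rate."""
    price_per_impression = cpm_usd / 1000
    return price_per_impression * wins_per_second * check_interval_s


# Hypothetical burst: 10 wins per second while the conference is in town.
hourly = spend_between_checks(200, 10, 3600)  # budget checked once an hour
per_second = spend_between_checks(200, 10, 1)  # budget checked every second
print(hourly, per_second)
```

Under these assumed numbers an hourly check lets thousands of dollars through before bidding stops, while a per-second check limits the exposure to a couple of dollars.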

~~~
hurin
I'm really not seeing what you gain out of running fine-grained control all-
the-time here. Even if it were vital for a customer that you hit a budget
target exactly you could dynamically change the granularity of control as you
got closer. If anything predictive modeling would give you better budget use
when you do have the flexibility than granular adjustment would (I don't know
much about the area though and just going with your description of the problem
here.)

~~~
phamilton
A better example is frequency capping. Ever watch something on Hulu and see
the same ad 4 times in a twenty-minute commercial? Or even, worse, back to
back?

With a real-time data stack you can avoid the duplicated ad a good percent of
the time. Better experience for buyers, for publishers, and for users.
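
As a minimal sketch of what frequency capping means, here is a sliding-window cap held in one process's memory. The class name, the cap of 2 views, and the IDs are all hypothetical; a real ad stack would need this state shared across many servers at low latency, which is the part the thread is arguing about.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class FrequencyCap:
    """Allow at most `max_views` of an ad per user within `window_s` seconds."""

    def __init__(self, max_views: int, window_s: float) -> None:
        self.max_views = max_views
        self.window_s = window_s
        self._seen: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def allow(self, user_id: str, ad_id: str, now: float) -> bool:
        """Record and allow the impression unless the cap is already reached."""
        views = self._seen[(user_id, ad_id)]
        # Drop timestamps that have aged out of the window.
        while views and now - views[0] >= self.window_s:
            views.popleft()
        if len(views) >= self.max_views:
            return False
        views.append(now)
        return True


# At most 2 views of the same ad per viewer in any 20-minute window.
cap = FrequencyCap(max_views=2, window_s=20 * 60)
decisions = [cap.allow("viewer-1", "ad-42", t) for t in (0, 300, 600, 1300)]
print(decisions)
```

The third request is refused because two views already fall inside the window; the fourth is allowed again once the first view has aged out.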

~~~
hurin
> A better example is frequency capping. Ever watch something on Hulu and see
> the same ad 4 times in a twenty-minute commercial? Or even, worse, back to
> back?

Yeah, but when that happens I usually don't think, oh hey they are lacking an
optimal in memory distributed database solution.

I think, well... their engineers suck. Or they don't care. Pick one.

edit: His point is vague, so there is nothing technical to respond to. I am
very much interested in a good technical example - but the things mentioned so
far are by all appearances relatively straight-forward and linear, hence lack
of effort or bad engineering are the only reasonable assumptions left.

~~~
phamilton
Volume and latency requirements make it more difficult to track individuals on
the web. It's an easier problem to solve in 50ms. It's also much easier to
solve when it's only a million individuals rather than a couple hundred
million individuals.

Like most problems, scale makes it hard.

~~~
hurin
It's simply not a difficult problem: there are no consensus requirements
between individuals, so scaling can't be made any harder by increasing N.

------
yellowapple
When a product claims to have 100% uptime, I immediately cringe, knowing full
well that they're probably full of bovine manure.

Good read.

~~~
saryant
Same with "exactly once delivery".

------
bketelsen
Many kudos to Stripe for funding this. Truly a great gift to the community.

~~~
mdellabitta
I'm imagining the job interview.

"We want to sponsor your distributed database research, but not your Barbie
animated GIF production."

------
stephen_mcd
The rotated A in the Aerospike logo reminds me of a system that's fallen over,
and now you can't unsee it:

[http://www.aerospike.com/](http://www.aerospike.com/)

~~~
oska
"A system"? Not sure what you mean by that.

To me, it was immediately apparent that they are making the whole word
Aerospike look like a rocket, a motif they repeat through their home-page.

~~~
simoncion
Worker 1: "The server has fallen over."

Worker 2: "I'll go restart it."

See also: The frequent interchangeability of the words "system" and
"computer".

------
SchizoDuckie
Oh wow, so this is not ACID, not "Eventual Consistency", but "Eventual
Inconsistency"

------
cschneid
What is the 3 color chart that I've seen posted in several of these articles?
I get that it breaks down the different CAP combinations with the things that
it allows/implies. But is there a good breakdown of all the terms and what
they mean?

~~~
dwetterau
Here is a paper [0] out of Berkeley that explains the chart you're asking
about (the chart appears on page 8). For more information on each isolation
level you might have to refer to the cited works or other sources.

[0]
[http://db.cs.berkeley.edu/papers/vldb14-hats.pdf](http://db.cs.berkeley.edu/papers/vldb14-hats.pdf)

~~~
cschneid
Thank you! That's perfect.

------
gmagnusson
I've used Aerospike at scale (approx 1MM tx per second) in a private network,
and smaller loads in cloud. I have always found it to be fast, reliable and
extremely easy to operate (upgrade, modify cluster members, etc) w/o any
downtime or interruption. It is a critical tool in my toolbox. I also have
found their support and engineering team to be excellent.

I admire the work that Aphyr does - though at the end of the day, I need to
build systems that work for the problem I'm trying to solve (and I have to
choose from real things that are available).

Aerospike isn't the solution to every storage problem, and if you are choosing
technology based on marketing material, you're probably going to be
disappointed.

These technologies in general are trying to address _really hard problems_ and
design and architecture is the art of balancing tradeoffs. Nothing is going to
be perfect. Yet.

~~~
ploxiln
"though at the end of the day" "nothing is perfect" ... Aerospike makes
blatantly ridiculous promises in their high-level descriptions of their
database. That makes our jobs harder (before Aphyr makes them easier) because
we don't know what Aerospike is actually good at or exactly what kinds of data
loss potential we need to architect our systems around.

Isn't it kind of annoying that some technical projects bolster their
popularity/ecosystem with very fancy websites and impressive/competitive
claims, but to really do your job right you have to throw all that away? The
best you can do is try to get a sense from the reports of others who have
tried something (and may or may not have been rigorous in their evaluation) so
you can pick good candidates to even put through trials. (so again, thanks
Aphyr)

~~~
gmagnusson
I can't and won't defend Aerospike's descriptions on their website or white paper.
And yes, "Thanks Aphyr".

I came across Aerospike technology via a pre-existing system at a previous
employer, and watched that system scale up and perform in a serious way. It
wasn't all unicorns and roses all the time as real life never is, but in the
context of the real world, it was great. The software is rock solid in a way
I've rarely come across, and support was spectacular. (I forget my current
production clusters are even running sometimes they are so stable, reliable
and self-operating)

And at the end of the day, there was no other solution out there remotely
competitive that we could find. And I looked - not because we were
dissatisfied, but because that was our fiduciary responsibility to the
company, to ensure that we were deploying the most cost-effective systems that
met our feature and performance requirements.

Ultimately yes - I think that as an engineer, you need to understand what your
tools are really capable of and avoid doing what I call "BDD" (Blog-Driven
Design). That isn't the ideal answer - it would be nice to have a reliable
understanding of the capabilities of the materials we use to build systems
(like civil engineers can reason about materials like steel and concrete in
repeatable ways) but what we call "software engineering and architecture" is
still a very young discipline, very often with unrealistic expectations about
our ability to deliver in given budgetary and temporal constraints, so we do
what we can.

------
pkaye
Fast, functional or reliable... pick two.

~~~
me_again
ITYM consistent, available or partition-tolerant...

~~~
Jweb_Guru
...but you can't pick CA.

Something Aerospike didn't realize.

(And nope, sorry, I'm completely uninterested in your anecdotes about how you
haven't personally lost data when [1] there's a clear data loss scenario
highlighted in the post, [2] Aerospike actively recommend services like EC2
and GCE that routinely partition, and [3] there are people _in this thread_
who have experienced the same problems).

------
acqq
I was wondering what is going on with titles of this form appearing on HN.
Now, browsing the blog, it appears that the author is a fan of "Call me
maybe" titled posts (a lot of them there, and then also here!). From what I
understand, this phrase, as he uses it in the titles, seems to mean something
like "review of" something or "comment on" something. For what it's worth.

~~~
beberlei
It is a pun on this song:

[https://www.youtube.com/watch?v=fWNaR-rxAic](https://www.youtube.com/watch?v=fWNaR-rxAic)

~~~
acqq
Apart from the author being obviously inspired by the song, I admit I don't
see any connection or anything worth calling "a pun." Maybe it's just me.

~~~
chipsy
Making a phone call is an asynchronous event, and as the song suggests,
sometimes you give someone your number but they never call back. With any
distributed system in real world conditions, a similar situation arises where
a request doesn't get handled or is lost along the way.

</explainer>

~~~
acqq
So he uses the phrase instead of "distributed systems?" (shrug)

------
jack9
The minimum useful licensing is in the tens of thousands of dollars. For the
SLA they offer, it makes sense. Many well known Ad Serving companies utilize
Aerospike (at a fraction of the cost of their previous solutions). It's a very
impressive result, per machine, from an operational standpoint.

------
theVirginian
"schemaless" nope

------
tootie
Is there any reason he's never tried to analyze a "classic" RDBMS like Oracle
or SQL Server? I have to imagine they'd clobber a lot of this hipster
technology.

~~~
electrum
He tested PostgreSQL:

[https://aphyr.com/posts/282-call-me-maybe-postgres](https://aphyr.com/posts/282-call-me-maybe-postgres)

~~~
threeseed
I don't understand how this is comparable though.

All of the other databases were tested in clustered mode. Why not PostgreSQL
as well?

~~~
Jweb_Guru
Postgres doesn't have a builtin clustered mode, or claim to be totally
available (and also, incidentally, doesn't guarantee serializability for hot
standby replication targets, which is prominently stated in a bright red box
that says "warning" in the documentation on serializable isolation, i.e. the
first place you would look). It claims to be CP in a single-node configuration
(which it is) and aphyr tested it on those claims. Remember, Jepsen isn't
about proving that a database can't violate CAP, nor is it trying to say "all
databases are crap." It's about _verifying the marketing claims_ and then
determining whether there are any mitigating strategies. The fact that many
databases make unreasonable marketing claims is unfortunate, but certainly not
a requirement.

It's also worth posing this question in reverse: what would happen if these
distributed databases were tested in a single-node configuration? As noted
in the most recent article on Elasticsearch, many of them (e.g. Elasticsearch,
Cassandra, and Riak) acknowledge writes before fsync and can therefore lose
data due to issues like `kill -9`, power loss, and other exceptional
conditions, while Postgres doesn't. For a single-node database this robustness
is very important, while he argues that it isn't as important for a
distributed one. Because these databases aren't _designed_ to be used as
single nodes, aphyr didn't substantially ding them for that. Again, what's
important is whether the database does what its documentation says it does
when used as its documentation says it should be used.
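
For readers unfamiliar with the "acknowledge before fsync" distinction, here is a minimal sketch of the opposite ordering, where the acknowledgement is returned only after the data has been flushed. The file name and the `durable_append` / `handle_write` helpers are hypothetical and do not reflect how any of the databases mentioned implement their write path.

```python
import os
import tempfile


def durable_append(path: str, record: bytes) -> None:
    """Append a record and return only after it has been fsynced."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()             # push Python's user-space buffer to the OS
        os.fsync(f.fileno())  # ask the OS to flush its cache to the device


def handle_write(path: str, record: bytes) -> str:
    """Acknowledge a write only once durable_append has returned."""
    durable_append(path, record)
    return "ack"


with tempfile.TemporaryDirectory() as d:
    log_path = os.path.join(d, "wal.log")
    print(handle_write(log_path, b"set x=1"))
```

A store that returns "ack" before the `os.fsync` step is faster per write, but anything still sitting in a buffer is gone if the process or machine dies at that moment.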

------
elchief
If you're big enough to require something like Aerospike, you're rich enough
to build something like F1.

~~~
vosper
I dunno, there are ad-tech companies with less than 20 engineers who're
processing millions of events per minute on their endpoints, and trying to do
various things with that data. This from personal experience.

Of course, I would love if someone gave me the mandate to go out and build
something like F1...

------
DAddYE
Awesome read and still quite impressed by Aerospike.

~~~
misframer
What is impressive about being somewhat misleading and cutting corners to gain
performance?

~~~
DAddYE
As said by Aphyr this product is ideal in the ad-tech. Of course it is
misleading, but good engs will test products to see if they fit their scenario
without basing decisions on marketing slides. Yes, ideally they should be
clearer, and I bet after this article they will be.

~~~
rgbrenner
_As said by Aphyr this product is ideal in the ad-tech_

No, Aphyr said the data loss is ok for ad tracking and analytics because it
doesn't matter. That's very different.

And if that's the case, then why make those claims.. they could just as easily
give accurate info to their customers, and the customer could decide if that
fits their case. Instead they claim something very difficult (if not
impossible), and let their customers find out it's not true (possibly after
it's too late, and they've already lost valuable data).

------
manigandham
We use Aerospike heavily. It works just fine.

I'm constantly surprised by the general tone of comments on posts like these
as if it's some crazy revelation that this software still obeys the
fundamental laws of distributed systems.

There is no perfect database out there, all of them will fail with network
partitions. Aerospike was designed to work in clusters that are very close
together, often the same rack. It has much tighter timings and tolerances in
exchange for providing much higher performance in certain situations and
definitely has one of the best SSD focused storage systems I've come across.

If you don't have a high performance network interconnect between nodes, then
there will be more issues with Aerospike since it relies on that more than
some other systems that use Paxos for all writes (like aphyr mentions). We run
several TBs of data accessed at 100k+ TPS including very fine grained
counters and everything works. And yes, we run on the cloud in AWS and
SoftLayer and have yet to have major problems with the proper network setup.

Btw, there is a comment below from the current CTO of AppNexus, one of the
companies that pioneered real-time bidding for digital ads and runs several
million auctions per second on one of the biggest ad exchanges available. They
were the first customer for Aerospike and from everything I've learned from
their team, it works really well for them, and they definitely are not happy
to just "lose" data however insignificant it might seem. Volume changes
everything and even a fraction of a percent will add up. We trust Aerospike
because it's been hardened by lots of much much larger companies with very
high production usage; the key is being aware of all the technical
requirements and the environment you're deploying in.

I think the real major issue here that people seem upset with are the general
claims and marketing information. I can't speak to all that and there are
definitely some things like 100% uptime which do seem overly confident, but
this is true of every single technology vendor out there unfortunately. I'm
not saying Aerospike is any better or worse as a company but marketing
material only goes so far and it would surprise me if further research wasn't
done for any mission critical system.

~~~
teraflop
> There is no perfect database out there, all of them will fail with network
> partitions.

Some of them will fail in a way that keeps your data safe, others will fail in
a way that preserves uptime but gives you _temporarily_ inconsistent data.
Aerospike apparently does neither. Why is it unreasonable to expect them not
to falsely claim otherwise?

The "crazy revelation" for me was not that Aerospike's software is, like
everything else, subject to the CAP theorem. It's that they apparently think
it's awesome to claim that it isn't, and charge tens of thousands of dollars
for their product on that justification.

~~~
manigandham
1) Aerospike is open-source and has a free community edition if you need it.

2) Yes, marketing claims are BS. If this was a reason to not use something,
we'd have to stop using pretty much every other commercial piece of software
we have. That's why we test and run software in our environment, and there...
Aerospike works. Really well. Even with network partitions. So I can
understand Kyle's tests in this post and the reasoning and results but there's
still a big gap between this testing and the reality our company has
experienced.

~~~
yellowapple
> Yes, marketing claims are BS. If this was a reason to not use something,
> we'd have to stop using pretty much every other commercial piece of software
> we have.

Which is what I at least have indeed opted to do; I avoid commercial software
like the plague for this very reason, using it only when there isn't an
alternative (like when it's a legacy system that has to be interfaced with).
There are plenty of free software projects that don't make outrageous
marketing claims and - therefore - aren't nearly as susceptible to
disappointment and wasted money.

Aerospike's claims border on the realm of false advertising (if they don't
actually classify as false advertising, which is a big "if"; the claim of 100%
uptime is dubious at best and more likely to be an outright-malicious lie).
Why should they get my money?

