
Fast key-value stores: An idea whose time has come and gone - godelmachine
https://ai.google/research/pubs/pub48030
======
siculars
Previous discussion
[https://news.ycombinator.com/item?id=19823022](https://news.ycombinator.com/item?id=19823022)

------
shereadsthenews
I have some real beefs with this paper. Their number about how long it takes
to encode or decode a protobuf is wrong and misleading. It seems to be based
on this benchmark[1] which encodes a huge repeated int32 stuffed with random
numbers. This does not resemble a key-value workload at all. In a KV system
you would have something like key and value as bytes fields. It would be
extremely simple. By contrast this "benchmark" is the worst possible case for
protobuf because encoding random data as varint guarantees that the average
field takes 9 bytes instead of 8 and hits the slowest possible path in the
codec. The whole paper rests on this number, so the conclusions are crap. They
are not even consistent with the practical performance of memcacheg which the
authors should have been very familiar with. 1:
[https://github.com/hq6/ProtobufBenchmark/blob/master/Benchma...](https://github.com/hq6/ProtobufBenchmark/blob/master/Benchmark.cc)

~~~
alfalfasprout
Not to mention protobufs have awful performance compared to more modern
alternatives in use today like Flatbuffers, Thrift, Cap'n Proto, SBE.

In the case of Google's own Flatbuffers, the layout is going to be far more
performant.

~~~
apta
> protobufs have awful performance compared to more modern alternatives in use
> today like Flatbuffers, Thrift, Cap'n Proto, SBE

Do you have a source on that? Genuinely curious.

~~~
kentonv
Hi, I wrote Protobuf v2 (the version everyone uses) and Cap'n Proto.

I don't know if I'd say Protobuf has "awful" performance. It's certainly much
better than text-based formats like JSON. But the format is rather branch-y.
You have to process it byte-by-byte, because e.g. integers are encoded in a
variable-width encoding where each byte contains 7 bits of data plus 1 bit to
indicate if this is the last byte. This results in a compact encoding, but
takes a lot of cycles to encode and decode. Moreover, since everything is
variable-width, in order to find any one field of the message, you must scan
through all previous fields, parsing them one by one.
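A minimal sketch of that varint scheme in Python (an illustration of the format, not protobuf's actual codec):

```python
def varint_encode(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf-style varint:
    7 data bits per byte, high bit set on every byte but the last."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)

def varint_decode(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one varint starting at pos; return (value, new_pos).
    Note the byte-at-a-time loop: this is the branch-y part."""
    result = shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result, pos
        shift += 7

# Small values fit in one byte; large ones spill into many.
assert varint_encode(1) == b"\x01"
assert varint_encode(300) == b"\xac\x02"
assert varint_decode(varint_encode(2**63))[0] == 2**63
```

The compactness/CPU trade-off is visible right there: every decoded byte needs a mask, a shift, and a continuation test.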

Cap'n Proto, FlatBuffers, and SBE all use "zero-copy" encodings, meaning the
data is laid out on the wire in a format that is easy for a CPU to use
directly. This means, for example, that integers are fixed-width, and fields
are located at fixed offsets. This is much faster to parse (or even use in-
place without parsing at all), but does result in somewhat larger encodings.
(But then, you can always layer on independent compression when bandwidth
matters more than CPU.)

My understanding is that Thrift is closer to Protobuf and contemporaneous with
it, so I don't know why GP included it in the list.

~~~
shereadsthenews
For simple protocols protobuf decoding has no taken branches. I.e. if you only
use the first 15 field numbers (all your tags are 1 byte) and if all the types
are the expected types, and if all the variable-length items are < 128 bytes
long then you can decode the message without taking any branches. In C++. Most
of the other languages have simpler and slower codecs.

This is the hot path in C++[1]. A really large amount of work has gone into
protobuf C++ performance in the last 3 years or so.

1:
[https://github.com/protocolbuffers/protobuf/blob/master/src/...](https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/io/coded_stream.h#L1034-L1044)
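The one-byte-tag condition falls out of how protobuf packs tags: tag = (field_number << 3) | wire_type, varint-encoded, which fits in a single byte only while field_number <= 15. A quick Python illustration (again, not the real codec):

```python
def encode_tag(field_number: int, wire_type: int) -> bytes:
    """Protobuf tag: field number in the high bits, 3-bit wire type in
    the low bits, then varint-encoded. One byte iff the packed value < 128."""
    tag = (field_number << 3) | wire_type
    out = bytearray()
    while tag > 0x7F:
        out.append((tag & 0x7F) | 0x80)  # continuation byte
        tag >>= 7
    out.append(tag)
    return bytes(out)

WIRETYPE_LEN = 2  # length-delimited (bytes/string fields)

assert len(encode_tag(15, WIRETYPE_LEN)) == 1   # field 15: still one byte
assert len(encode_tag(16, WIRETYPE_LEN)) == 2   # field 16: tag needs two bytes
```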

~~~
kentonv
And all your integer fields must be < 128, right?

Yes, I suppose the branches in Protobuf can be pretty predictable. Still, you
do generally have to examine each byte individually.

~~~
shereadsthenews
Sure. In this specific case of a kv store it's hard to imagine how to simplify
it dramatically from protobuf. As a proto you might have: tag-length-key-tag-
length-value. Instead you could store the key and value lengths in host format
using 8-16 bytes: length-length-key-value. It's not _dramatically_ faster to
decode this, and you traded away extensibility to get a marginal speedup.
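The two layouts being compared can be sketched with Python's struct module (a hypothetical framing; the "host format" variant here assumes two fixed 8-byte little-endian lengths):

```python
import struct

def encode_proto_style(key: bytes, value: bytes) -> bytes:
    """tag-length-key-tag-length-value, assuming short fields so every tag
    and length fits in one varint byte (fields 1 and 2, wire type 2)."""
    assert len(key) < 128 and len(value) < 128
    return bytes([0x0A, len(key)]) + key + bytes([0x12, len(value)]) + value

def encode_fixed(key: bytes, value: bytes) -> bytes:
    """length-length-key-value with two fixed 8-byte lengths: marginally
    faster to decode, but no room to add new fields later."""
    return struct.pack("<QQ", len(key), len(value)) + key + value

def decode_fixed(buf: bytes) -> tuple[bytes, bytes]:
    klen, vlen = struct.unpack_from("<QQ", buf)
    key = buf[16:16 + klen]
    return key, buf[16 + klen:16 + klen + vlen]

assert decode_fixed(encode_fixed(b"user:42", b"hello")) == (b"user:42", b"hello")
```

The fixed variant costs 16 bytes of framing versus 4 for the proto-style one here, which is the extensibility-for-speed trade mentioned above in miniature.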

~~~
kentonv
Sure, I was speaking in general, not specifically about the key-value case.

I think most serialization frameworks are likely to be overkill for such a use
case, spending more time on setup than actual parsing.

Also note that storing the value (and maybe the key) with proper alignment
might make it easier to use the data in-place, saving a copy.

------
judofyr
This paper is very light on details. It defines "RInK architecture" as
something which uses stateless application servers w/ key-value store as
backend (Redis/Memcached). Section 3 then shows that it's faster to use a
stateful application server, but skips any details about scaling/durability
which makes the comparison kinda strange. Are they really just comparing a
single-threaded in-memory application server with a scalable system of
stateless application servers?

Their solution (LInK) is a key-value system "as a library": You link it into
your application servers and you can then store data (they become stateful).
The system will automatically handle sharding, replication, consistency and so
on. From your perspective you're working directly on memory. You still need to
implement marshalling, but this is only used for sharding/replication across
nodes (which the system handles completely for you).
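The paper doesn't describe the mechanism, but the sharding half of such a library can be sketched as a consistent-hash ring (hypothetical names; a real auto-sharder like Slicer also handles replication, load reporting, and resharding):

```python
import bisect
import hashlib

class HashRing:
    """Map keys to nodes with consistent hashing, so adding or removing
    a node moves only a fraction of the keys."""
    def __init__(self, nodes, vnodes=64):
        # Place vnodes points per node on the ring to smooth the distribution.
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )
        self.points = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s: str) -> int:
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash owns the key.
        i = bisect.bisect(self.points, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["app-1", "app-2", "app-3"])
assert ring.node_for("session:abc") == ring.node_for("session:abc")  # stable routing
assert {ring.node_for(f"k{i}") for i in range(100)} <= {"app-1", "app-2", "app-3"}
```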

They say nothing about _how_ this system handles data
consistency/replication/durability, so it's a bit tricky to assess it. They
also have this nice note: "(since, for performance, persistence of state is
not guaranteed)".

Summary: This is a system which couples your application logic with your
persistence layer (in the same OS process). This can give you great
performance as they co-reside with each other.

Looks like a very cool project, and I'm actually surprised there are not more
"database as a library" projects out there.

~~~
continuations
> Summary: This is a system which couples your application logic with your
> persistence layer (in the same OS process).

I don't think the paper is talking about the persistence layer. Rather it's
focusing on the cache layer.

From the paper:

"Modern internet-scale services often rely on remote, in-memory, key-value
(RInK) stores such as Redis and Memcached. These stores serve at least two
purposes. First, they may provide a cache over a storage system to enable
faster retrieval of persistent state. Second, they may store short-lived data,
such as per-session state, that does not warrant persistence."

If you look at Fig. 2 in the paper, the database (persistence) layer is always
there. What the paper is advocating is a move from an out-of-process cache
server like Redis to an in-process cache library like Java Caffeine.

~~~
zaroth
So for the canonical use case of session state in a load balanced server farm,
is the session state being replicated across every single one of these in-
process caches? Or, if a request is routed to a server without a copy of the
session state, does the in-process cache have to go find a peer server with a
copy of the session state?

This seems like an easy “solution” to show how it speeds up the happy path
while not fully addressing the real-world tradeoffs.

Stateless application servers come and go without a care in the world. You can
reason very simply about the cost and effects of bringing them up and down.

Add in a KV store with sharding, replication, a discovery protocol, a
heartbeat protocol, a sync and recovery protocol, etc... and put it all in-
process on every application server?

Have fun monitoring and debugging this.

I much prefer the 3-tier system of a local RAM cache, a Redis cache, and a
persistence layer. As much state as possible is idempotent so you can cache it
locally as well as in Redis. The load balancer can best-effort route back to
the same app server, and your KV lookups automatically check local RAM first.
But the central KV store handles replication and is easy to monitor and scale
out if needed.
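That lookup order (local RAM, then the central KV store, then the persistence layer) can be sketched as a read-through chain; the Redis and database tiers here are stand-in dicts, since the point is only the order of checks and the backfill:

```python
class TieredCache:
    """Check the local in-process cache first, then the shared KV tier,
    then fall through to the persistence layer, backfilling on the way up."""
    def __init__(self, shared_kv: dict, database: dict):
        self.local = {}           # per-process RAM cache
        self.shared = shared_kv   # stand-in for Redis/Memcached
        self.db = database        # stand-in for the persistence layer

    def get(self, key):
        if key in self.local:
            return self.local[key]
        if key in self.shared:
            self.local[key] = self.shared[key]   # backfill the local tier
            return self.local[key]
        value = self.db[key]                     # authoritative source
        self.shared[key] = self.local[key] = value
        return value

db = {"user:1": "alice"}
cache = TieredCache(shared_kv={}, database=db)
assert cache.get("user:1") == "alice"   # miss, miss, hit the database
assert "user:1" in cache.local          # later reads skip the network
```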

~~~
cjhopman
The paper assumes that you have a good amount of related context. I'd
recommend at least reading the Slicer paper mentioned several times
([https://www.usenix.org/system/files/conference/osdi16/osdi16...](https://www.usenix.org/system/files/conference/osdi16/osdi16-adya.pdf)).

------
wolf550e
Reminds me of the talk "Building Scalable Stateful Services" by Caitie
McCaffrey about stateful app servers for Microsoft Halo (the multiplayer
shooter game). She explained that stateless app servers are wasteful and the
user's data should be cached in an instance and routing should be sticky to
always reach that instance, and once it is warm performance is good. If the
instance goes down, the user data will get loaded into another instance and
will get stickied there.

[http://highscalability.com/blog/2015/10/12/making-the-case-f...](http://highscalability.com/blog/2015/10/12/making-the-case-for-building-scalable-stateful-services-in-t.html)

[https://www.youtube.com/watch?v=H0i_bXKwujQ](https://www.youtube.com/watch?v=H0i_bXKwujQ)

[https://speakerdeck.com/caitiem20/building-scalable-stateful...](https://speakerdeck.com/caitiem20/building-scalable-stateful-services)

~~~
hinkley
We are an all or nothing sort. If we use statefulness and sticky sessions we
almost always tend to use it in a way where migration to another server is
next to impossible.

If we had better tools for the moderate position then life would probably be
easier than at the extremes.

While the idea of migration is useful for resilience in the face of hardware
failure, it’s more attractive from a standpoint of elastic scalability.

But.

We are still living in a bubble where we believe that cloud providers aren’t
going to oversubscribe their hardware the way ISPs have been doing for
decades. As the ratio of private server rooms declines, that bubble starts to
wear thin.

~~~
Johnny555
_We are still living in a bubble where we believe that cloud providers aren’t
going to oversubscribe their hardware_

AWS doesn't oversubscribe their main class of cloud servers, but they will
sell you a cheaper t-series if you're willing to accept throttling under heavy
CPU use.

They used to have "noisy neighbor" issues with some of their older instance
types, where another server on the same physical hardware could monopolize the
disk/network bandwidth (but not the CPU), but they've resolved that now: every
server can use its published resources (except for network, which can burst up
to 10 Gig on some instance types but can only be sustained on the large
instance types rated for 10 Gig).

They've built their business model on selling dedicated resources, and I don't
see that changing.

------
ThePadawan
Meanwhile, the rest of the CRUD world is still using local RDBMS in place of
local key-value stores.

I understand the point the abstract is trying to make, but the title is
sensationalized at best.

~~~
anonu
I remember a headline on HN from a month or two ago. Something like "You're
not Google".

Turns out that a run of the mill RDBMS will fit 99% of the problems you're
trying to solve.

Edit:
[https://news.ycombinator.com/item?id=19576092](https://news.ycombinator.com/item?id=19576092)

~~~
lmm
What makes an RDBMS "run of the mill" and a key-value store not? If anything
key-value stores tend to be much simpler, both conceptually and in terms of
implementation.

~~~
ben509
A SQL DBMS (run of the mill) gives you a well-thought-out data model (the SQL
model, derived from the relational model) in which to implement your business
logic.

Conceptually, the relational model has relations and atoms. The algebra is
built on the primitives conjoin, disjoin, project and rename. You can't get
much simpler while being as expressive.

The SQL model adds a good deal of complexity, but at least it's standardized
and you can generally ignore the complexity you don't need.

A KV store is conceptually simple if you stick to using it as a KV store. Once
you try to implement any kind of schema, you have to build that from scratch.
As you add business logic to it, you wind up inventing an ad hoc data model,
which is likely to be conceptually more complex than either the SQL or
relational model.

You may not be aware of the complexity of your ad hoc model, but math will
inevitably remind you.

~~~
stcredzero
_You may not be aware of the complexity of your ad hoc model, but math will
inevitably remind you._

I think this idea needs to be expressed more clearly in CompSci and
programming. What is the exact nature of the complexity explosion you are
talking about here? Is there something analogous to the increase in complexity
going from regular expressions to stack machines? What math are we talking
about here? It seems like everyone just explains the relational algebra, then
leaves it right there.

~~~
ben509
> It seems like everyone just explains the relational algebra, then leaves it
> right there.

Fair point.

> What is the exact nature of the complexity explosion you are talking about
> here?

I'm thinking in terms of the entities of Occam's razor: "entities should not
be multiplied unnecessarily." "Entities" is pretty abstract; we can't identify
what complexity is directly, but what we can do is imagine, "what if we tried
to build a mathematical model that captures a real-life application?"

If we did that, and had a mathematical description of a thing we wrote, we can
formalize it by trying to reduce it to some minimal set of axioms.

And then, your more complex rules are derived from those axioms, and if you
get your math right those complex rules will be consistent. If you're very
clever, you can make it reasonably intuitive.

If you have something that's very complex, what you'd observe after modelling
it is you have mostly axioms and very few rules are able to be derived from
those axioms. That is, the rules are just the rules and there's no broader
reason for them to be so, or deeper _consistent_ patterns. And, maybe some of
those rules wind up being contradictory, and they may lack orthogonality.

The relational algebra, being an algebra, is a set of operations that are
closed over the universe of relations, so it's very nicely orthogonal and
reduces to a small set of primitives. As relations can be visualized as
"tables" they're relatively intuitive, and using techniques such as
normalization you can also structure around potential anomalies that add
unwanted complexity.

You can also control where your complexity goes. An integrity constraint can
enforce some rules about what may be in a relational variable, and if that's
enforced, all code that wants to use data from that relation can simply assume
that the data maintains that structure. Thus the complexity can be centralized
in the system.

Now, that's in the ideal world with a True Relational DBMS, we have to deal
with SQL and vendor-specific SQL at that, so we have a rather complex
underlying model. But it's typically Pretty Good.

When we build such a model ad hoc, we go by our intuition and are constrained
by the market. Our intuition leads us to use more complex structures, that
is, structures that, if they were expressed mathematically, would use a larger
set of axioms to describe them. Then, as we want those structures to
interoperate, we are unwittingly merging two sets of axioms.

Worse, because we're typically expressing it in application libraries, we wind
up repeating code, so the complexity is spread all over the application, and it
especially multiplies if "conventions" are copy-pasta'd.

Hopefully that explains the nature of the explosion of entities / complexity;
let me know if I can expand on anything.

~~~
stcredzero
_Hopefully that explains the nature of the explosion of entities / complexity;
let me know if I can expand on anything._

It does not satisfy me. One can show that a regular expression or finite state
machine is limited in specific ways, as compared to a stack machine or a
Turing machine. One can write proofs concerning the number of states a
specific machine can be in, given an input of a certain length. The explosion
in complexity can be quantified, as can the impact on the effectiveness of
testing. By comparison, "using techniques such as normalization you can also
structure around potential anomalies that add unwanted complexity," is just an
aphorism.

------
kristoff_it
Stateful applications with the linklet abstraction seem a nightmare to operate
just like plain old stateful webapps that keep sessions in memory (requiring
sticky sessions and annoying users when shut down).

I understand the dislike for using in-memory databases as a json store (by
reading/writing entire objects at once), but if you're using Redis this way,
you're doing it wrong.

Another thing that the authors seem to forget is that "rinks" also
provide coordination across multiple instances. Locks, atomic operations,
transactions, even pub/sub is useful to get the right amount of coordination
between instances (see my take on preventing cache stampedes
[https://github.com/kristoff-it/redis-memolock](https://github.com/kristoff-it/redis-memolock)).

Once you move to their new version of stateful instances you get all the
operational downsides and lose all the coordination features. "auto sharding"
is not going to be enough to solve all coordination problems.

Calendars are a good example of a domain that is tricky to model as a bunch of
kv operations, but nothing that can't be reasonably decomposed by applying DDD
+ the repository pattern + transactions + lua scripts (in the case of Redis).
And if you end up with something that is computationally expensive, then it
means that you need a custom data structure and maybe an algorithm optimized
for it, and this is where Redis Modules come in.

The real domain of "rinks" is not your business domain, but the implementation
domain of data structures, algorithms, asymptotic complexity etc. Thinking
that the focus at that level should not be on compsci, but on the business
domain, will inevitably produce bad performance.

~~~
turdnagel
> I understand the dislike for using in-memory databases as a json store (by
> reading/writing entire objects at once), but if you're using Redis this way,
> you're doing it wrong.

I guess... is it so terrible to use a dictionary/hash-type data structure with
values more than one level deep? Redis doesn't provide a data structure for
that.

~~~
sitkack
> Redis doesn't provide a data structure for that.

Both Redis and Aerospike provide more than enough mechanisms for one to roll
their own composite hierarchy. Both provide server side scripting via Lua as
well so the database can maintain consistency and prevent chatty access to the
store.
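One way to roll such a composite hierarchy (an illustration of the idea, not an API of either store) is to flatten nested values into path-keyed hash fields, which keeps individual leaves addressable instead of rewriting one big JSON blob:

```python
def flatten(obj, prefix=""):
    """Flatten a nested dict into {'a.b.c': leaf} pairs, suitable for
    storing as fields of a single Redis/Aerospike hash (map) entry."""
    fields = {}
    for k, v in obj.items():
        path = f"{prefix}.{k}" if prefix else k
        if isinstance(v, dict):
            fields.update(flatten(v, path))   # recurse into nested maps
        else:
            fields[path] = v
    return fields

session = {"user": {"id": 42, "prefs": {"theme": "dark"}}}
fields = flatten(session)
assert fields == {"user.id": 42, "user.prefs.theme": "dark"}
# e.g. HSET session:<id> user.prefs.theme dark  <- update one leaf, not the blob
```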

KV database modeling is just as rich as its relational counterpart, but if one
isn't using a KV store for a compelling reason, use Postgres.

[https://www.aerospike.com/docs/guide/data-types.html](https://www.aerospike.com/docs/guide/data-types.html)

[https://www.aerospike.com/docs/guide/udf.html](https://www.aerospike.com/docs/guide/udf.html)

------
ledneb
Microsoft have been pushing this kind of thinking for a while with Service
Fabric. If you buy in completely and use both the framework and the
infrastructure you get structures which are in-memory and replicated for you.

A couple of the .Net guys we hired preached that stateless architecture is a
little old-fashioned - over time I've come to agree. A lot of things can be
shoe-horned in to a stateless world but become much easier in a stateful one.

------
thdxr
We've been building this way ever since we moved to Elixir which makes
distributing state across your application easy. Memory on application servers
is severely underutilized

~~~
lifeisstillgood
Can you explain more?

~~~
wmeddie
Systems like Erlang/OTP, Akka, Orbit and Orleans/Service Fabric are built on
an actor model where the domain objects of the system (e.g. Users, Accounts,
Invoices, etc.) exist within the cluster and have an address, so they are
like a bunch of mini servers. These servers typically keep their state
in memory so they can respond to query messages quickly. Plus, the application
can unload idle (or no longer necessary) actors and restore their state when
they are needed again. It's very similar to the LInK (linked in-memory
key-value) idea mentioned in the paper.

~~~
cuddlecake
I think it is an anti-pattern to design a system where each domain object is a
process. Sometimes data is just data and should be managed as such.

I think what makes actor models so nice is the explicit ownership of state. It
is not possible to declare "var x = 1" in one file, and access x in another
file. You always have to retrieve state explicitly, otherwise it won't be
accessible within the scope of your function.

------
noahl
This is interesting, but I think the authors don't talk enough about CPU and
memory utilization. To me, the "classic" Google distributed systems
architecture puts different tasks in different logical servers (doesn't matter
if they're separate physical servers or not), which gives those servers more
predictable memory and CPU usage, which in turn enables tighter bin-packing of
jobs in the datacenter. The price they pay is needing a really, really fast
in-datacenter network, but in the past they've been okay with this.

The paper proposes putting application-specific processing and memory caching
on the same host, which might give the combined server less consistent CPU
usage and therefore lower utilization, but will also eliminate the network hop
from application server to in-memory cache. It seems intuitively reasonable to
me to give up some CPU utilization in exchange for eliminating an all-to-all
network connections stage, but I would like to see a real cost and speed
comparison.

~~~
shereadsthenews
An interesting take. When borg was written the main machine class was a dual
dual-core Opteron. Now I imagine the typical borglet has dual Haswell or
Skylake CPUs with, I guess, between 40 and 88 cores. Do you think (or do you
have data that indicates) the typical Google container/vm has grown to keep
pace with the machine size, or do you think the mainstream container is still
1CPU/4GB?

~~~
noahl
No idea, I don't work there any more. It would be really interesting to know
though.

------
jayd16
Provocative title, but fundamentally the paper is "what if we just run the
application on redis and skip the app server." It's not like no one has tried
sticky sessions before.

It seems like the paper is simply arguing we should bang our heads against
this yet again without a solid reason why we would succeed this time.
Sharding/routing wasn't the only problem with sticky sessions.

~~~
jdefr89
That is definitely what I took from the article. Nothing but a shock-title.
Data stores were created because previous solutions were not working as well.

------
bitL
Generational oscillation: stateful -> stateless -> stateful -> stateless ->
...

------
tschellenbach
This paper isn't great.

Yes, I can use a custom in-memory data structure, write it in Go and cluster
it using some nice Raft replication. It's not all that hard and it's much,
much faster than Redis. (we do this at Stream for activity feeds and chat)

For most apps doing this is totally impractical. If you're using PHP, Python,
Node or Ruby for anything small to large (anything that doesn't have 100
million plus users), the overhead of implementing the pattern the authors
suggest is just too large.

I think Redis will only keep on becoming more popular...

------
vast
I am confused. The major problem is claimed to be un/marshalling, but they
take no closer look at it. They advise against one architecture because of
it, only to introduce another that still has it.

Seems like the protocol overhead will stay.

~~~
shereadsthenews
Yeah, the only thing we can really learn from this paper is that the authors
aren't very familiar with protobuf. Anybody who claims it takes 10
microseconds to decode a 1KiB protocol message is either intentionally
constructing the worst-case message or doesn't know what they are talking
about.

~~~
justin66
> Yeah, the only thing we can really learn from this paper is that the authors
> aren't very familiar with protobuf.

This is stupid. Three of the four authors are engineers at Google.

~~~
hinkley
That should not surprise you. When Microsoft was at its dumbest, its corporate
identity was that they were hiring the best of the best.

Now Google has that baton. Wonder who they’ll pass it to...

~~~
justin66
I did not intend to express agreement with the OP's stupid comment.

------
pwodhouse
The authors are the owners of Google Slicer, a sharding system that enables
stateless frontends to connect to sharded data stores (user stickiness).

~~~
Dirlewanger
And so basically, the paper's tl;dr can be: It's time for everyone to stop
using well-established best practices on X, and start using our product Y.

~~~
marcosdumay
It's more like: they got much better results with Y than with the
well-established practices X (which many people disagreed were best), so they
packaged it into a product.

It may still be a marketing lie, but it's not something to dismiss
automatically, mostly because a lot of people disagreed that the practice was
"best" all along.

------
ChrisCinelli
This does not seem a very smart thing to do for any standard application,
especially without a well-tested LInK library.

When it was necessary and easy, I have had local in-memory caches for some
specific hot data acting as another level of caching. In the simplest case it
is just one variable + a timestamp. Otherwise a Map or a simple library will
usually do.
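The "one variable + timestamp" case is about as small as a cache gets; a minimal sketch (hypothetical names):

```python
import time

class OneValueCache:
    """Cache a single expensive value for ttl seconds: just the value,
    a timestamp, and a loader to call on expiry."""
    def __init__(self, loader, ttl=60.0):
        self.loader, self.ttl = loader, ttl
        self.value, self.loaded_at = None, 0.0

    def get(self):
        now = time.monotonic()
        if now - self.loaded_at >= self.ttl:   # stale, or never loaded
            self.value = self.loader()
            self.loaded_at = now
        return self.value

calls = []
cache = OneValueCache(loader=lambda: calls.append(1) or len(calls), ttl=60.0)
assert cache.get() == 1
assert cache.get() == 1      # within ttl: loader not called again
assert len(calls) == 1
```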

But substituting in-process caching for your caching layer is usually very
risky. In cases where the system cannot work without a hot cache, in-memory
caching can easily lead to extended downtimes of the whole system.

First of all, the deployment process needs to be reengineered so the new
version of the app can load the cache of the current running process.

Then if there is a bug that crashes the process, you end up losing all the
cache (unless like Redis you also save it on disk).

Until there is a well-tested library that can do this, or the system _really_
needs it, I am going to keep using Redis, plus a second in-memory cache where
it is easy and low-risk to do.

------
sly010
I have noticed that in most use cases of redis/memcached/etc, one will
abstract it into some domain-specific data structure. At scale that
abstraction becomes a service of its own (job servers, caches, pub sub, etc.),
at which point it doesn't make sense to keep the data separated from the
service, because all it does is add an extra layer of indirection. I
presume at google it's really not that big of a deal to write a fast priority
queue service that keeps things in its own memory and talks protobuf
directly.

Relatedly, I have also noticed that teams that overuse memstores predominantly
use application languages that have a terrible concurrency story (e.g.
python): they fall back to multiprocessing, where keeping anything stateful in
memory is not straightforward, so they offload that state to redis.

------
jdsully
As a maintainer of KeyDB I think there is some truth to this in the form of
naive get/set queries. More of the processing should be done server side to
avoid wasteful network traffic and the latencies that entails.

------
rythie
Memcache was released in 2003, we didn't start to see server SSDs until 2008.
For many places, if you put the database on SSDs, it's fast enough.

------
ixtli
It strikes me that you could easily satisfy this by running a Redis process
local to your application node as a cache and, indeed, you probably should
instead of trying to re-implement functionality yourself.

Also redis is cool for pub/sub.

~~~
hinkley
If your rate of churn is low. It’s fairly easy to get a server with excess
memory, especially if you’ve pushed state out of your services, such that
having a kv sidecar isn’t that onerous. But then you get cache invalidation
issues you have to find a solution for.

------
LaserToy
Makes total sense. Here is a presentation from AWS reInvent from PlayStation
folks: [https://www.slideshare.net/AmazonWebServices/aws-reinvent-20...](https://www.slideshare.net/AmazonWebServices/aws-reinvent-2016-gam302-sony-playstation-breaking-the-bandwidth-barrier-using-soft-state-and-elb-properties)

You just can't beat memory

------
snikeris
> Instead, data center services should be built using stateful application
> servers or custom in-memory stores with domain-specific APIs, which offer
> higher performance than RInKs at lower cost.

We've been using Sirius [1] (on the JVM) for this.

[1] [https://github.com/comcast/sirius](https://github.com/comcast/sirius)

------
mmatczuk
I work at ScyllaDB; we observed this a long time ago and even blogged about it
back in 2017: [https://www.scylladb.com/2017/07/31/database-caches-not-good...](https://www.scylladb.com/2017/07/31/database-caches-not-good/)

~~~
vittore
There is a DB with an integrated application server that Mail.ru wrote about
in 2016: [https://medium.com/@denisanikin/how-to-save-one-million-doll...](https://medium.com/@denisanikin/how-to-save-one-million-dollars-on-databases-with-tarantool-5eb1596ec628)

------
polskibus
Isn't a stateful service basically the actor model with optional persistence,
like in Akka.Persistence model?

------
djhworld
I'm not entirely sure I understand this paper.

Are the frontend servers hitting the auto shard service with say, a key, and
it returns a list of backend server(s) to hit where it knows that key is
stored in memory?

Wouldn't the auto sharding service come with the usual challenges of HA, load
balancing, integrity etc?

------
natch
This looks like it is trying to build a technical rationale for keeping user
data on servers, in contrast to the privacy-centric approach being pushed by
their main competitor in the mobile space, which advocates keeping user data
local to the user device for privacy reasons.

------
jdefr89
Well, they may not have intended to do so, but they convinced me, even more
than I already was, that data stores like Redis are needed. This article
sounds like they had already decided they had a problem with key-value stores
and tried to rationalize their bias desperately...

------
adsharma
"Application logic as a library" is the right way to think. We can't even make
"threads into a library", let alone "database as a library".

The challenge is to design safety so badly written application logic doesn't
crash the database.

------
bob33212
Thank you. I finally have a good reference to point folks to when I get the
question about why I'm not using redis. Although not as fun as the "web scale"
response to mongo questions.

~~~
judofyr
> I finally have a good reference to point folks to when I get the question
> about why I'm not using redis.

What are you using instead?

~~~
bob33212
.net memory cache. It is not distributed, but we have a single tenant
architecture so it works well

------
RocketSyntax
I need a place to store values organized with just a bit of logic and a nice
api while my workflow runs. I don't want the overhead of a file system or
psql... enter redis.

------
dotnwat
ITT: the mix of haters, supporters, and curious observers is the exact type of
reaction that the HotXXX series of workshops is intended to surface.

------
setheron
I don't understand. I skimmed the paper, but it sounds like NoSQL, except
using the application servers themselves as the data store?

~~~
draaglom
You're not far off, I'd analogise as

"redis, but compiled into your app, with all the necessary bits to make that
work like you'd hope (consistent hashing/replication), (native data
structures)"

~~~
YongMan
Just when I assumed the separation of the compute layer and the storage layer
goes without saying, others start talking about stateful services.

------
talkingtab
The title is click bait in my opinion as the article (as others noted) is a
proposal for:

"... LInKstore is a high level abstraction over an auto-sharder that provides
a distributed, in-memory, key-value map withrich application objects as
values, rather than strings or sim-ple data structures".

This comes from Section 4 of the .pdf which has the hard core content. The
issue is interesting.

------
peterwwillis
k/v mem store is only supposed to be a read cache. it's not supposed to handle
business logic, and it's not supposed to persist storage. if you need any of
that, _use an rdbms_.

------
keymone
> custom in-memory stores with domain-specific APIs

CIMSDA is the new CORBA?

------
cbetti
Only skimmed the paper, but to me this sounds like an incoherent description
of Reliable Actors on Azure.

I don't think it benefits the paper to position this model directly against
something like Redis.

