
Remote, in-memory key-value stores: An idea whose time has come and gone? [pdf] - Terretta
http://pages.cs.wisc.edu/~rgrandl/papers/link.pdf
======
nostrademons
The stateful application server is actually a common architecture for initial
prototypes of new apps, because of its simplicity. Hacker News, Mailinator,
Plenty of Fish, and IIRC StackOverflow were all built with this architecture.

Where it really falls down is when you have a team of developers all trying to
add new features. Code changes a lot more frequently than data. With stateful
app servers, every time you deploy new code you need complex and error-prone
routines to bring the new process up, have it read the existing state, make
sure it's consistent with the new features, and if not migrate it to the new
format.

With RInKs, a huge benefit is that you can take the application up and down
and all persistent state is already there, and usually the RInK handles
persistence & error-checking for you (because it enforces a common data model).
For minor UI tweaks and per-request features (which are a large portion of
changes in the early part of the product lifecycle), you don't need to do a
migration at all, the new version just magically appears. For more complex
data format changes, you make the application code read both the old and new
formats, convert old to new if necessary, and write back only in the new
format; then you can drop the old format once it is no longer present in the
persistent store. This lets you do the upgrade piecemeal, and avoids
corrupting your entire data store if you do it wrong (because the architecture
functions as a canary on every deployment, and you can roll back with zero to
a handful of corrupted records if things aren't working properly).
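The read-both-formats, write-new migration described above can be sketched roughly like this (the record shapes and field names here are hypothetical, purely to show the pattern):

```java
import java.util.Map;

// Sketch of a piecemeal format migration: the app reads both the old (v1)
// and new (v2) record formats, but always writes back v2. Once no v1
// records remain in the store, the v1 branch can be deleted.
public class FormatMigration {
    // Hypothetical example: v1 stored "name" as one field, v2 splits it.
    public static Map<String, String> readUser(Map<String, String> record) {
        if ("1".equals(record.get("version"))) {
            String[] parts = record.get("name").split(" ", 2);
            // Convert old format to new in memory; the caller then writes
            // the record back in v2 form only.
            record = Map.of(
                "version", "2",
                "first_name", parts[0],
                "last_name", parts.length > 1 ? parts[1] : "");
        }
        return record;
    }
}
```

Because every deploy converts only the records it happens to touch, a bad conversion shows up on a handful of records rather than on the whole store.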

The other situation where stateful app servers are useful is when scalability
is much more important than change velocity. This may be why Google is coming
out with this paper now. This would never have flown within Google when I was
there (2009-2014), but I've heard that the focus on maintenance, performance,
and architectural improvements is now much greater than that on new features,
so it may be appropriate for the Google of today.

~~~
graycat
Nice.

The paper anticipated in good detail just what the heck I did with its

"These stores enable services to be deployed using stateless application
servers [16], which maintain only per-request state; all other state resides
in persistent storage or in a RInK store. Stateless application servers bring
operational simplicity: for example, any request may be handled by any
application server, which makes it easy to add and remove application servers,
to handle server failures, and to handle skewed workloads."

Yup, I thought that was a good idea.

After reading more of the paper, for my project, I still think it is a good
idea. Your remarks add weight to that view!

~~~
bostik
In all honesty, the paper describes a solution to a problem most companies
won't need to care about. If processing latency for remote cache lookups is a
performance bottleneck, it's likely you're already dealing with either
enormous scale (Google/FB/Twitter/Uber/...) or some form of HFT
(exchanges/hedge funds/banks/adtech/...).

Amusingly enough, I have reason to believe those groups are over-represented
in the HN audience. They are also environments where computing capacity can
make up a significant fraction of your overall engineering costs.

In practically any other scenario the sub-millisecond delay is dominated and
completely masked by your business logic taking tens to even hundreds of
milliseconds per request. Remember - if you can return a response and a visual
update to a client action within 100ms (including the network roundtrip),
you're already operating in the lowest Nielsen band.

Another funny thing about this paper is that it feels to me like people are
rediscovering zero-copy network processing. Again, a topic dear to many HN
hearts but unlikely to be a prime consideration for most companies.

------
nullwasamistake
I'm astounded at the number of developers in my own company who reach for
Redis when they just need a cache. It's a huge waste of effort for far worse
performance and less reliability than something like Java's Caffeine.

Stateful servers run into huge problems when people try to access them from
non-static IPs (sticky sessions are usually IP-based). We, at least in apps I
work on, stuff all the shared state into JWT tokens to avoid this.

Everything else that needs to be cached goes into an in-memory, in-app store.
If you're using a "fast" language to start with, you won't need more than a
dozen servers unless you're at Google scale, so cache hit rates are good even
though each instance maintains its own.
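That in-app cache is simple enough to sketch with the JDK alone; a real deployment would use Caffeine, which adds proper size/TTL eviction and much better concurrency, but the shape is the same (a toy LRU, not production code):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Toy per-instance LRU cache. Each app server keeps its own copy, so the
// hit rate depends on running few instances -- which is the point above.
public class LocalCache<K, V> {
    private final Map<K, V> map;

    public LocalCache(int maxEntries) {
        // accessOrder=true turns LinkedHashMap into an LRU.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public synchronized V get(K key, Function<K, V> loader) {
        V v = map.get(key);            // a hit refreshes the LRU order
        if (v == null) {
            v = loader.apply(key);     // miss: load, e.g. from the DB
            map.put(key, v);           // put() may evict the eldest entry
        }
        return v;
    }
}
```

The `loader` is whatever slow path you're protecting (a DB query, a backend call); only misses pay that cost.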

Again, if you're using a "fast" language, the memory overhead from in-app
storage isn't much more than putting things in a dedicated KV store. There's
really no point to it when the performance drop is usually 20x. We use
DropWizard and Vert.x extensively, and most of our endpoints can handle many
thousands of requests per second. The ones that use the cache instead of
hitting the DB every time hit upwards of 1 million requests/sec. You're very
lucky to get 50k with any remote KV store in the same app.

Most load balancer strategies send many requests in a row from a client to the
same backend server anyways for TCP performance reasons.

The only reason for a KV cache is if you're running a slow language like Ruby,
PHP, or Python, where the memory overhead for objects is large and the speed
of a remote cache is roughly the same as one in memory. If you find yourself
needing a cache for good-enough performance, you shouldn't be using these
languages in the first place. In-memory caches don't work well when you need
100+ server instances to handle the load anyways.

~~~
alexchamberlain
I run several Python servers backed by Redis; Redis was added when we profiled
the apps and determined that loading data from several databases was taking
too much time. Using a different language wouldn't have changed how much time
various RDBMSs take to return an answer to your query.

I think there's some citation needed on the performance of in memory vs out of
process caches in interpreted languages.

~~~
nullwasamistake
In my experience with Postgres at least, our queries average less than a
millisecond in Java. Our slowest query averages 5. We're not running anything
crazy, pretty generic CRUD app.

We can hit about 50k queries per second on one server from our limited stress
testing. Redis was around 150k. The in-memory cache is ~1.2 million. These are
JSON requests using Vert.x on Java.

I just don't see the advantage to an out of JVM cache when your service is so
fast that you only need more than one instance for failover. In slower
languages where you need a bunch of front end servers to handle a reasonable
load, out of app caches look more appealing. The hit rate for in app caches
becomes too low when you have a bunch of instances.

But my point is that you get a lot more performance with a simpler setup if
you use a language fast enough that you can run just a few instances and cache
locally. Out-of-app caches are usually an anti-pattern brought about by
needing too many front-end servers because your app is too slow for an in-app
cache to be useful (more instances mean too low a hit rate without a shared
cache).
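The hit-rate claim is easy to quantify with a toy model: if each hot key is requested r times during its cache lifetime and a round-robin balancer spreads those requests over N instances, each instance takes its own cold miss, so the per-instance hit rate is roughly 1 - N/r (this assumes uniform routing, r >= N, and no shared cache; it is an illustration, not a benchmark):

```java
// Toy model of why adding instances hurts local-cache hit rates: every
// instance pays one cold miss per hot key, so misses scale with N.
public class HitRate {
    public static double hitRate(int instances, int requestsPerKey) {
        return 1.0 - (double) instances / requestsPerKey;
    }
}
```

With 100 requests per key, 2 instances see a ~98% hit rate while 50 instances see only 50%, which is the snowball the commenter describes.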

You're effectively forced to use slower out-of-app caching because of the
number of instances you need to run.

~~~
graycat
"We can hit about 50k queries per second on one server"

WOW! For my 1.0 case, I was thinking maybe 10 a second!

~~~
nullwasamistake
It's really far beyond what we need. We never hit over 5% cpu unless we break
an index or do something dumb. But it's great that we can run just a few
instances and in-memory cache anything coming back from slow services. We've
got a lot of dog slow apps buried in the backend, and running minimal
instances makes our hit rate on in-app caches nice enough that we can avoid
complexity and slowness of shared cache.

Postgres is extremely fast when decently tuned, probably not much slower than
Redis (both are I/O-bound). For an ORM the same can be said of Hibernate. And
Vert.x is just stupidly fast in general; you just need to be careful not to
block the event loop, or use Fibers with Quasar. There are a lot of tweaks for
Jackson serialization speed and Hibernate + Postgres performance that are off
by default as well.

Do we need all this speed? Absolutely not. But caching is one of the most evil
things in comp sci, and having so much excess capacity allowed us to simplify
a lot of things. For instance, we cache virtually nothing coming from the DB,
only slow backends and other third-party services. Hibernate also caches some
reads for us, and since we don't have many instances the hit rate is fine on
that as well.

We probably won't need to worry about performance for the rest of our app's
future. If the stack weren't so fast we would need a lot of front-end servers,
and a (slow) shared cache would be needed. Kind of a snowball effect of
complexity when you try to retrofit something to be fast down the road.

~~~
Tarq0n
What about cold starts when the cache is empty? Are those painful?

~~~
nullwasamistake
Not terrible because we do rolling restarts, but if all the servers went down
at once it could be a problem.

------
neeleshs
It's a balance between how in-process stores bring instability to the main
application vs the cost of network I/O & marshalling: out-of-memory errors
crashing the application instead of providing a degraded experience, etc. The
answer, as usual, is "it depends". We moved from in-process to remote because
we valued application stability more than raw performance.

~~~
kahnjw
Question: for data keyed by user, did you find you'd actually be better off
using a distributed in-process cache than a remote cache? I find that hard to
believe, but maybe I'm just not up to speed on how advanced these caching
solutions are.

~~~
neeleshs
Our use case was not data keyed by user, but system-generated events that were
cached in memory (in a remote cache) for deduplication. We started with a
distributed in-process cache (Apache Ignite), but found that the failure modes
of the cache were not acceptable for the core application. Hence we moved the
cache out, to Redis. The app behaves far better now.

------
pwpwp
Funny that this doesn't mention parse-free data structures like FlatBuffers or
Cap'n Proto that don't need unmarshalling.

[https://google.github.io/flatbuffers/](https://google.github.io/flatbuffers/)
[https://capnproto.org/](https://capnproto.org/)

------
stcredzero
_Instead, data center services should be built using stateful application
servers or custom in-memory stores with domain-specific APIs, which offer
higher performance than RInKs at lower cost._

I'm in the midst of implementing a stateful application server-cluster with
custom in-memory stores with domain-specific APIs. Basically, it's a way of
implementing a game loop and a multiplayer game by specifying lambdas going
against in-memory entity databases and in memory spatial databases. However,
to do this, I have gone from using PostgreSQL to the Badger fast key-value
store as a database. I'm mainly motivated by the desire to keep everything in
golang and eliminate all outside dependencies.

That said, what I'm doing is increasing the granularity of the server
processes. The server processes are composed of Actor-like things which are
responding to async input queues. So my experience both confirms and
contradicts some of the things said in the paper.

~~~
jrumbut
Would you mind sharing what made you go with a stateful application server?

I've never found myself in the situation of wanting to go that direction.

~~~
zemo
the main function of a multiplayer game server is to contain the game state.
Depending on the game, it may also run a simulation. You couldn't, for
example, write a database record every time someone shot a gun in an FPS, the
whole thing would come to a crawl.

And game clients generally only care about the present. Only the current game
state is of interest to the client in most situations; the past is rarely
queried. The outliers here are things that are naturally transactional and
persistent like character progression and item ownership; those things you'd
stick in a database.

~~~
jrumbut
Makes sense thanks!

------
analyst74
Another benefit of an in-memory cache vs a remote key-value store is
reliability.

Redis/memcached servers are super reliable, but at scale you'll run into a
non-trivial number of failures when communicating with those servers; even a
sub-second blip in network connectivity could result in hundreds of failures.

Now you have to deal with retries, ensure your application logic is
failure-tolerant, worry about timeouts, consider the impact on your
application server of temporary delays in Redis as threads pile up waiting for
results, and all that fun stuff.
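A rough sketch of the kind of defensive wrapper this implies: every remote lookup gets a timeout and a fallback, so a cache blip degrades to the slow path instead of failing the request (the suppliers here are stand-ins, not a real Redis client API):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Wrap a remote cache lookup so a network blip or stalled cache degrades
// to the slow path (e.g. the database) instead of failing the request.
public class GuardedLookup {
    static final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // don't keep the JVM alive for cache lookups
        return t;
    });

    public static <V> V getWithFallback(Supplier<V> remoteCache,
                                        Supplier<V> slowPath,
                                        long timeoutMs) {
        Future<V> f = pool.submit(remoteCache::get);
        try {
            V v = f.get(timeoutMs, TimeUnit.MILLISECONDS);
            return v != null ? v : slowPath.get();  // cache miss
        } catch (Exception e) {      // timeout or connection failure
            f.cancel(true);
            return slowPath.get();   // degraded but correct
        }
    }
}
```

Bounding the wait is what prevents the "skyrocketing threads waiting for results" failure mode: a stalled cache costs each request at most `timeoutMs` before it falls through.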

~~~
alexchamberlain
At scale, don't you have the same problems with your in memory caching
solution (specifically in the distribution and coordination parts)?

------
philihp
The author has missed a very very important feature: Redis as a distributed
lock manager.

Redis’s SETNX is a very fast and very cheap solution to resolve race
conditions and dining philosopher problems among shards and nodes in large
clusters.
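The SETNX-with-expiry pattern can be illustrated with its in-memory equivalent: `putIfAbsent` plus a lease timestamp plays the role of Redis's `SET key value NX PX ttl`. This is a simulation of the semantics only; a real distributed Redis lock has failure modes (clock skew, lost leases under failover) that this toy ignores:

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory simulation of SET ... NX PX: acquire succeeds only if the key
// is absent or the previous holder's lease has expired, so a crashed
// holder can't keep the lock forever.
public class SetNxLock {
    private final ConcurrentHashMap<String, Long> locks = new ConcurrentHashMap<>();

    public boolean tryAcquire(String resource, long ttlMs) {
        long now = System.currentTimeMillis();
        // Drop an expired lease first (Redis does this via key expiry).
        locks.compute(resource, (k, exp) -> (exp != null && exp < now) ? null : exp);
        return locks.putIfAbsent(resource, now + ttlMs) == null;
    }

    public void release(String resource) {
        locks.remove(resource);
    }
}
```

The atomicity of set-if-absent is the whole trick: two shards racing for the same resource can't both see "absent".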

~~~
saurabhsharan
I assume you've read [https://martin.kleppmann.com/2016/02/08/how-to-do-
distribute...](https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-
locking.html)

------
d--b
The title is annoying on purpose... This is a different, more complex and
more-efficient-in-some-cases approach.

------
akkartik
Sounds like their argument hinges on the efficacy of systems like
[https://eng.uber.com/intro-to-ringpop](https://eng.uber.com/intro-to-ringpop)

~~~
sytse
Yes, Ringpop was reference number 8 in the article. It is interesting that
[https://github.com/uber/ringpop-go](https://github.com/uber/ringpop-go) says:
"This project is no longer under active development"

So the time of Redis has passed but the successor has passed away as well?

~~~
roskilli
Basically it's been deprecated in favor of using projects that are more mature
like Apache Helix (but conversely more operationally intense to get started,
requires Zookeeper and a bunch of things on top if you want to use it in a
non-Java setting, i.e. with Go).

~~~
Tharkun
I've really grown to dislike ZooKeeper. While incredibly useful, it's pretty
awful. It's nearly impossible to secure in any non-trivial setup. And its
inane requirement to reserve a _physical disk_ [1] for its transaction log is
almost impossible to fulfil with virtualized hardware.

[1] From the manual:

> ZooKeeper's transaction log must be on a dedicated device.

~~~
hadsed
Is this not the assumption of many SQL databases that make certain guarantees
about complying with ACID? I'm not a DB expert but feels like something I've
read before.

~~~
tatersolid
Not since the 1990s

------
happythought
Atul Adya and Daniel Myers work on Slicer[1] so they have some bias.

[1]
[https://www.usenix.org/system/files/conference/osdi16/osdi16...](https://www.usenix.org/system/files/conference/osdi16/osdi16-adya.pdf)

------
bakhy
I'm currently working on a side project, a distributed in-process key/value
store for .NET Standard, which might be interesting in this context. I hope
you'll pardon the shameless plug, but I'm interested in getting feedback.
Particularly, what to focus on, what use cases, missing features... And of
course, if you see any issues with the approach.

[https://github.com/jbakic/Shielded.Gossip](https://github.com/jbakic/Shielded.Gossip)

It needs more work, but I think the eventually consistent part could already
be useful for a simple distributed cache layer atop a classic relational DB.

------
netik
The article reads like an overview of 2008 technologies, but offers no actual
new alternative to the existing solutions that fuel Twitter and other social
media sites.

------
dbmikus
Would this just be like using RocksDB or something like that with extensions
to auto-shard and balance between instances? That's what it looks like to me,
but I could be misunderstanding it.

------
yenwel
Event sourced persistent actors in cluster mode is where it's at!

------
jdalsgaard
Love it; "if you fetch it over a socket, it ain't no cache"

~~~
hinkley
When I'm explaining KV to a junior dev, I do it something like this:

Memcached got big during a confluence of two problems that have both been
solved now. First, we had a bunch of big, semi-stateful web applications built
in programming languages with GC, and server memory got bigger than you could
reliably collect without long pauses. Pushing some of your data into a
separate memory space raises the ceiling considerably.

Secondly, during this ten year window, ethernet cards were half an order of
magnitude faster than hard drives. Putting stuff on another machine could be
faster than sending it to swap, a memory mapped file, or some more
sophisticated data store (like a database).

We don't have to struggle with either of these now, and half the time we avoid
the first one altogether. Remote caches still have lots of places where
they're used, but you are way better off working to cache inbound requests
instead of outbound requests. That lets you move a bunch of caching to the
edge of your network, or to the user agent.

------
PunksATawnyFill
It's "key/value" store.

------
graycat
I don't get it: _key-value stores_ are a topic that just won't go away. I was
interested in key-value stores, did a little work, solved the problem for my
needs, and don't care anymore. Sure, maybe there can be some high-end,
challenging aspects, but what I did was dirt simple, childishly simple to do,
blindingly fast, with plenty of capacity -- maybe it will cover 75% of the
need for everyone?

Heck, if we want something with really deep functionality, we are just back to
relational database, right?

So, I wanted a key-value store for _session state_ for users of the Web site
of my startup.

When a user sends a page back to my server, it has in a "hidden field" the
value of an encrypted token which identifies the user's logical _session_. And
I have in VB.NET a class for the data I want to keep on that user: their
_session state_.

So, in my VB.Net Web server code, I have defined some classes: (1) An
allocated instance of one class sends a new or updated key-value pair to the
key-value server. (2) An allocated instance of another class requests from the
key-value server the value that corresponds to a key. (3) An allocated
instance of a third class gets the value back from the key-value server.

For communications, just take an instance of a class, _serialize it_ (thank
you VB.NET), copy it to a byte array, and send that via old TCP/IP sockets.
Then the receiver gets the byte array and does a _deserialization_ to get the
corresponding object instance.

The key-value server is based almost entirely on just two instances of a
standard .NET collection class. One instance holds the key-value pairs, where,
right, the key in the pair is the key in the collection class. The other
instance has a time-date as key and a key from the key-value store as value;
the time-date records when the corresponding key-value pair was last written
or read and is used for session time-outs. So, for time-outs, just read these
time-date/session-key entries in ascending order on the time-date key, and for
any sessions that have timed out, delete that collection class element and
also the corresponding element in the other collection class.

It's a few hundred lines of code and was a simple exercise in TCP/IP, object
instance de/serialization, and collection class usage.

I also have code to permit sending some system management commands to the key-
value server.

It runs as just a console application.

The code is single threaded, and I'm counting on just the standard TCP/IP FIFO
queue to hold the backlog of work to be done.

All the data is held in main memory so should be blindingly fast. Some simple
arithmetic indicates that the main memory needed is not very large for even a
quite busy Web server computer.

The session state server can serve any reasonable number of Web server
computers, and a particular user doesn't need _session affinity_ with a
particular Web server.

Looked simple and easy to me.

How much of the need will something so simple satisfy?

What needs will such a simple solution not satisfy?

~~~
zemo
that's ... literally what people do with redis and memcache, you just wrote a
slower memcache with fewer features.

that's on one server and the data is all in process memory?

~~~
throwmeawy
GP has shown a long-term tendency to create a naive implementation, in VB.NET
even, and then get surly about folks who refuse to acknowledge its
sufficiency. No good can come from engaging / trying to educate.

N.b. built-in .NET object serialization is a slow, wasteful, brittle solution,
but it could be okay if one doesn't mind the performance or portability
issues. Heck why not use XmlSerializer then?

~~~
graycat
What's wrong with VB.NET? As far as I can tell, it's as good as anything else
to get to the .NET Framework. I like it because it has more traditional syntax
I believe is easier to teach, learn, read, write, and debug than C#.

With VB.Net, what am I giving up?

I'm asking about "sufficiency": It all seems dirt simple to me, where some
simple code really is sufficient. So, where is the simple approach not
powerful enough? Where is Redis needed and something simple not good enough?
Where does my simple solution fall short? It appears that Redis is now really
important: What does it have that is so important?

Is object instance serialization really slower than, all things considered, it
has to be? Or if we accept (1) we are going to have instances of VB.Net style
classes and (2) send the data via TCP/IP, that is, basically via byte arrays,
then where can we do better than just serializing the class instances to byte
arrays? If we work with just XML text, then we don't really have the
advantages of performance of VB.Net instances of classes, and we're back to
interpretive data structures instead of compiled data structures. I can think
of some just blindingly fast approaches in PL/I because the PL/I data
structures (1) are nearly as useful as class instances and (2) occupy
essentially sequential storage -- so, could just copy storage from a starting
address to an ending address and send the result as the byte array -- reverse
that process at the receiver. Could do much the same with C structures, much
less general than PL/I structures, if avoided most uses of pointers.

This is my first Web site, first use of collection classes, first use of a
session state store -- I'm trying to learn what I'm missing out on.

~~~
zemo
> What's wrong with VB.NET?

nobody in the startup space is using it, so it'll be hard to hire for people
that have the skills. There's a feedback loop there. Because no other
companies are using it, people that don't have the skills also don't want to
spend their time learning it, because the investment won't have a good pay
off. I improved my resume and got more job offers by -removing- VB.NET ...
nine years ago.

> It appears that Redis is now really important: What does it have that is so
> important?

It's really fast, really stable, and really well-understood by a lot of
people. You can hire someone that already knows how to query it, how to
operate it, how to fix it when it breaks. A lot of tools exist to work with it
that you would have to write yourself if you rolled your own solution. When a
bug is found, it's reported to the maintainers, and then fixed for everybody.

> If we work with just XML text

XML is pretty dead as a data interchange format. It's incredibly space-
inefficient. Most people are using one of json, msgpack, thrift, or protocol
buffers. (I'm personally a fan of protocol buffers; you write a schema and use
the schema to generate the serialization and deserialization code, and the
messages are efficiently-packed binary so they're pretty good on space
efficiency.) There's a few outliers; mongodb uses bson because it predates the
publication of msgpack but it's a very similar format. Writing the in-memory
representation of a value and then reading it back on another machine and
mmapping it and operating on it directly is a rare approach because it's hard
to do it correctly while supporting changes to the types. Eventually you
update an application, and an old record written by version 1 of the
application has to be readable by version 2 of the application. It's not a
completely dead strategy though, it's what cap'n proto does.

seriously though I'm willing to bet there is almost nobody on Hacker News that
understands any of your PL/I comparisons. You keep bringing up PL/I as if you
assume everyone knows PL/I. In ten years as a professional programmer you are
literally the only person I have ever heard bring up PL/I in a discussion
that's not about history.

~~~
graycat
I have a box from IBM with a version of PL/I that can run on Windows. I did
run it years ago on Windows XP. I no longer run or want to run PL/I. I don't
expect anyone to use it.

I bring up PL/I here on HN occasionally just as an example and lesson in
programming language design. It's got some good features now lost, and for
people interested in programming language design maybe some of the features
could be brought back.

Since there has been a complaint on this thread that .Net style object
instance de/serialization is slow and otherwise not so good, part of the
problem is that such an object instance is not all by itself in some range of
sequential memory addresses and, instead, can be _scattered_ all over. So,
de/serialization needs implementation by the language developers. So, HAVING
the desired data in sequential memory could have an advantage, e.g., in speed,
that is, just copy the contents of memory in the block of addresses -- tough
to be faster than that (ah, do it in assembler and use some string copy
instruction assuming that X86 has one of those!). Could do similar things for
the simpler versions of C structs.

But in cautious shops, running code to get to memory addresses could be
frowned on!

A broad point about PL/I is that its structures are nearly as useful as the
.Net style classes but with much faster and simpler addressing. In programming
language design, that's a little point to keep in mind.

I wrote enough C code to see how nearly all that syntax worked. So, maybe
someday I'll write some C# code to be sure I know how its version of the C
_idiosyncratic syntax_ works!

Since I'm a sole, solo founder, floor sweeper to CEO, of my own startup, I get
to concentrate on the tools my startup needs instead of tools someone else's
startup uses!

The key-value store discussion here is good: I'm concluding that my needs are
simple and my simple code is enough for those needs, that from the experience
of others here in time I should want more functionality, and maybe should
convert over to Redis. By then I might be hiring people to do such things.

I'm not sure that having my code in VB.NET is a serious dead end: As I
understand, C# can do what VB.NET does with just a different flavor of
_syntactic sugar_ , there's a translator from Microsoft to go from VB.NET to
C#, and the actual semantics are so close that for the corresponding language
features (some of the more recent features in C# may be missing from VB) it is
fine to call one of the languages from the other. That is, as I understand the
situation, the real work is to get to the .NET Framework and the CLR (common
language runtime), and VB, C#, F#, etc. are all essentially equivalent for that
work. So, for differences we're talking some functionality and otherwise
syntactic sugar.

My current work is more nonsense: Last night I started a run of Robocopy to do
an incremental backup of one of my hard disk partitions. This morning Robocopy
had completed without errors, but what it copied just as an _incremental_ copy
was about 80% of the whole disk partition when I've had only minor activity on
that partition since the last full backup and turning off all the archive bits
(archive bit OFF means that the file has been backed up since it was created
or last changed, and ON means that the file was created or changed since the
last time the archive bit was set OFF).

So, somehow that disk partition got a LOT of archive bits turned on.

Open Object Rexx has a cute function can call to get a really nice description
of everything in a file system subtree, one file/directory per line, with
archive, hidden, system, read only bits, time-date stamp, size in bytes, full
tree name.

So, I compared two of these files and found that after the last full backup I
really did set all the archive bits off and for the incremental a lot of
REALLY old files, going back years, still had their correct time-date stamps,
had NOT changed, but DID have their archive bits set ON.

Bummer. Something has been walking around in that tree setting archive bits ON
for NO good reason. Really big bummer.

So, to explain, the partition drive letter is K:. There are some files in the
root of K:, but mostly the files are in K:\data01, ..., K:\data05 and
K:\prog01, ..., K:\prog05. Right, separation of data and programs!

So, looking, all the archive bits set ON for no good reason were in just
K:\data05. Hmm.

What could have happened?

Well, this computer is one I built from parts, running Windows 7 64 bit
Professional SP1, but I also have an HP laptop running Windows 10 Home Edition
with, of course, all the latest updates whether I wanted them or not.

And a week or so ago, I got okay with

net share

net use

etc. to set up file sharing between the two computers.

Well, I gave the Windows 10 system access to the Windows 7 directory, right,
you guessed it,

     K:\data05

That about has to be the cause. Somehow letting Windows 10 _share_ the Windows
7 directory

     K:\data05

had the Windows 10 system walking around all over that directory, including
several directory levels deep, and to some files years old, and setting
archive bits ON.

Bummer.

So, no more such sharing! I'll create a directory, say, HP_laptop_share, in
the root of a partition and let Windows 10 share it. I'll just put temporary
stuff there. And unless there are more bugs, Windows 10 will be limited to
making a mess only in that directory

     HP_laptop_share.

So, with the HP OFF the Windows 7 system, I'll do a full backup, set the
archive bits off, and keep Windows 10 the heck OUT of there! Lost half a day
with this. I call it mud wrestling. It has NOTHING to do specifically with my
startup. ALL the work unique to my startup has been fast, fun, and easy for me
-- the difficulty has been literally YEARS of mud wrestling, all for no good
reason. What do others do about such mud wrestling?

~~~
zemo
> I bring up PL/I here on HN occasionally just as an example and lesson in
> programming language design.

PL/I influenced B, which influenced C, which influenced so many things that
literally every person on HN has used a language influenced by C. You're out
here talking down to people like "you whippersnappers don't know what you're
doing" when you're literally decades behind the curve.

and besides, check out lambda-the-ultimate.org, that's a community more
specifically oriented at programming language design. And no, they don't talk
about PL/I either.

> A broad point about PL/I is that its structures are nearly as useful as the
> .Net style classes but with much faster and simpler addressing. In
> programming language design, that's a little point to keep in mind.

Go has this, Rust has this, lots of languages have this. Languages that give
the developer the ability to control how their data structures are laid out in
memory are not a revelation; we know about them. That's not a new or lost
feature, you just have your head buried so far in the sand that you are
incapable of viewing anyone who came after you as knowing anything.

[https://syslog.ravelin.com/go-and-memory-layout-6ef30c730d51](https://syslog.ravelin.com/go-and-memory-layout-6ef30c730d51)
[https://doc.rust-lang.org/reference/type-layout.html](https://doc.rust-lang.org/reference/type-layout.html)

And besides, Go was co-designed by Ken Thompson, who invented B, which was
strongly influenced by, drumrollllll please, PL/I!!! Your dedication to not
listening to people has robbed you of knowing about things that you would
probably like.

> just copy the contents of memory in the block of addresses

I already addressed this earlier. That's what cap'n proto and flatbuffers both
do: [https://capnproto.org/](https://capnproto.org/)
[https://google.github.io/flatbuffers/](https://google.github.io/flatbuffers/)

If you listen to people, they don't have to repeat themselves so much.

> de/serialization needs implementation by the language developers

that's literally untrue. People write serialization and deserialization
libraries all the time. Lots of languages give the programmer the ability to
lay out how their data will look in memory, and lots of serialization
libraries are mindful of creating values that have reasonable data locality
characteristics. Lots of languages and deserialization libraries don't do this
because it literally doesn't matter for most programs and for most programmers
most of the time. You're approaching the problem as if computers are
expensive, because in your PL/I days they were. But computers are cheap now,
and getting cheaper every day. For example, an ec2 instance with 4gb of memory
will run you all of $20 a month. Napkin math on some nice round numbers: a
programmer costs you $150k a year and they work 2000 hours a year, that works
out to $75 an hour, or $600 a day. If a task takes you one day, but you could
avoid the task by buying another ec2 server, you just buy the space and avoid
the task. A day of programmer time is worth 30 months of server time. Unless
you're talking about Windows servers, which are twice as expensive because
literally nobody runs Windows servers, they are garbage.

> I wrote enough C code to see how nearly all that syntax worked. So, maybe
> someday I'll write some C# code to be sure I know how its version of the C
> idiosyncratic syntax works!

That word does not mean what you think it means. Idiosyncratic is a relative
term: it specifically means someone who is outside of expectations, outside of
the norm. C-style syntax is used in, yes, C#, but also JavaScript, Go, Rust,
and a great many other programming languages. It is PL/I that is
idiosyncratic, as it is literally only you talking about it. Idiosyncratic
does not mean "unfamiliar to ME", it means "differentiating the individual
from the majority". If everyone is doing something one way, and an individual
is doing it a different way, it's the individual that is being idiosyncratic.

I don't know why you went off on a long tangent about your backup problem. I
do not care. It should be obvious to you that I would not care. And the cherry
on the top was "I call it mud wrestling". Dude. We have a name for this. It's
"yak shaving". Everyone uses that name. You would know that if you ever
listened to anyone.
[https://en.wiktionary.org/wiki/yak_shaving](https://en.wiktionary.org/wiki/yak_shaving)

> What do others do about such mud wrestling?

You put all of your code in a hosted source control provider like Github or
Bitbucket. Database backups and archival images go on Amazon's S3 or Google
Cloud storage. Usually you'd automate the backups and make them full backups
and retain a certain number of snapshots. I don't think most people nowadays
back up their entire computer. "Everything important is in the cloud" is the
most common strategy.

Talking to you is very exhausting. You consistently demonstrate a complete
lack of respect for the generations of programmers that have followed you. As
a result, you struggle with problems that beginner programmers solve every
single day without issue.

~~~
graycat
> You're out here talking down to people like "you whippersnappers don't know
> what you're doing" when you're literally decades behind the curve.

I'm doing no such thing. I've done nothing wrong.

I know C and PL/I well, and I see little or no influence on C from PL/I.

> Go has this, Rust has this, lots of languages have this. Language that give
> the developer the ability to control how their data structures are laid out
> in memory are not a revelation, we know about them.

When I was at IBM's Watson lab, we designed the language KnowledgeTool. Then I
looked at some of the literature of programming language design. I didn't find
anything very new or exciting. It looked like a dead field. I haven't looked
again since then.

Some years ago I made a decision, to do my computing on just one of Linux or
Windows.

What influenced me:

(1) After some trying, I didn't find any documentation that explained even
simple things about Linux. E.g., just what is/is not in a Linux "distro"? Do
all the distros have the same operating system APIs, the same utilities, and
run the same applications? What are the differences in the distros?

(2) What progress has Linux made in exploiting 64 bit addressing?

(3) With Linux, for operating system updates, do I have to compile the
operating system?

The books I found didn't say. Dirt simple questions; no answers.

For Windows, in contrast:

(1) I was already using 3.1 or some such.

(2) I was and still am impressed by the relatively high promises of backwards
compatibility.

(3) I looked around and saw some large Web sites using Windows.

(4) Microsoft runs some astoundingly well _organized_, huge server farms
running Windows, e.g., just for their MSDN documentation. It appears that they
have operating system installation down close to just push one button and
bring up 1 million square feet of racks with servers.

(5) From what I've seen, I like the Microsoft file system NTFS.

(6) I want a LOT of good documentation. E.g., for .NET I found, downloaded,
read, abstracted, and indexed about 5000 MSDN Web pages. I put tree names in
my code and have a single keystroke in my editor that will have Firefox
display the page. Maybe this is something like what Microsoft does with
_IntelliSense_ -- but since I don't know anything about IntelliSense I can't
say.

Microsoft's documentation is really good in some respects -- the pages are
nicely formatted; they often have references to related pages; there is some
organization to the pages; they spell the words correctly; etc.

I get torqued at times, e.g., when they use terminology without defining it,
explaining it, or linking to such content.

For several of their major efforts, I still have no idea what they are driving
at -- as far as I can tell, they don't explain. E.g., I went for some months
getting their news letters touting Azure. Lots of talk about Azure. How good
Azure can be. How to get started with Azure. And with all this, they never
explained what the heck Azure actually WAS! Eventually Azure leaked out: It's
a Microsoft server farm set up so that users can send their code and the farm
will run it. So it's _cloud_ computing, often running some user's code. Okay.
Now I understand the basics of Azure.

But I'm reluctant to bet that the Linux documentation is better.

(7) At the time, Microsoft had $40 billion in cash; now they are worth about
$1 T. So, Microsoft DOES have the resources to keep Windows, Windows Server,
SQL Server, various utility programs, malware defenses and corrections, etc.
up to date.

(8) Microsoft has a LOT of users, and sometimes some of those users post good
answers to some questions.

(9) I found that if I really get stuck and really do have a serious question,
then there are people at Microsoft who will give some fast one-on-one
tutoring. E.g., early with VB.NET, I wrote a little routine to use classic
heap sort to sort an array of instances of a class. I ran the thing, and it
ran for 600 seconds when it should have run for about 2 seconds. So, right, I
didn't understand just what I was supposed to do with _polymorphism_ -- I was
supposed to write an _interface_. Gee, we did much the same back in Fortran.
Okay, _polymorphism_ has a lot of hype and, really, is much like what we were
doing with old Fortran; I had to write an _interface_; I found some examples,
tweaked a little, got an _interface_, and got the running time down to about 3
seconds.

An amazing part is that somehow VB.NET essentially _dynamically_ discovered
the data type of the class I was passing to the sort routine. So, maybe all
that is available via _reflection_. I was impressed.

For Go, Rust, and Haskell, as for Ada, Prolog, APL, etc.: for me, for my work
now, mostly no thanks.

I need to eat; for that, I need to make money; for that, I'm starting a
business.
The business makes use of computing, and I have to write some software.

For that software, on Windows, I picked VB.NET. It's fine with me. I suspect
that I could convert over to C# without much trouble. One way to convert would
be just to run my VB.NET code through a translator that writes out equivalent
C# code -- my understanding is that Microsoft has such a translator.

As far as I can tell, VB.NET is a plenty good way to get to the Windows
services and APIs, the .NET Framework, and the common language runtime (CLR).

Maybe some programmers take pride in learning lots of languages. Okay. But
that's not what I am: Instead, I'm trying to start a business.

Besides, I don't see much attention to Go, Rust, Haskell, etc. on Windows.

On being "behind", I don't think so: The amazing thing about computing is the
low cost and high performance of processors, storage, and communications.

Other than that, the significant things in computer science and practical
computing have changed only very slowly.

To me programming is essentially defining storage, writing expressions to
manipulate storage, the control structures if-then-else, do-while, and call-
return, handling various data types, e.g., strings, dates, and handling
exceptional conditions.

To me the key to my startup is not the computing at all -- it is just routine.
The key is some applied math I derived. So, I'm an applied mathematician in
business who uses computers.

The computer tools I use are the ones I want for my business. Since I won't be
using Go, Rust, Haskell, Java, Ada, Prolog, etc., I don't pay attention to
them. Someday I may use some Python or JavaScript -- if so, then so be it.
Microsoft has IronPython, which is compiled and maybe has good or full access
to the .NET Framework. Okay.

To me, the future of computing will be in mathematics, complete with theorems
and proofs.

------
klodolph
This is a misleading title: it’s talking specifically about remote in-memory
key-value stores. I thought that this would be against e.g. Badger.

~~~
kitotik
Agreed. It’s clearly targeted at the memcached/redis model.

