Remote, in-memory key-value stores: An idea whose time has come and gone? [pdf] (wisc.edu)
146 points by Terretta on May 3, 2019 | 72 comments

The stateful application server is actually a common architecture for initial prototypes of new apps, because of its simplicity. Hacker News, Mailinator, Plenty of Fish, and IIRC StackOverflow were all built with this architecture.

Where it really falls down is when you have a team of developers all trying to add new features. Code changes a lot more frequently than data. With stateful app servers, every time you deploy new code you need complex and error-prone routines to bring the new process up, have it read the existing state, make sure it's consistent with the new features, and if not migrate it to the new format.

With RInKs, a huge benefit is that you can take the application up and down and all persistent state is already there, and usually the RInK itself handles persistence & error-checking (because it enforces a common data model itself). For minor UI tweaks and per-request features (which are a large portion of changes in the early part of the product lifecycle), you don't need to do a migration at all; the new version just magically appears. For more complex data format changes, you make it so that the application code can read both the old and new formats, convert old to new if necessary, and write back only in the new format, and then you can drop the old format once it is no longer present in the persistent store. This lets you do the upgrade piecemeal, and avoids corrupting your entire data store if you do it wrong (because the architecture functions as a canary on every deployment, and you can roll back with zero to a handful of corrupted records if things aren't working properly).
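A minimal Python sketch of the "read both formats, write back only the new one" approach described above. The record layouts, the `version` field, and the plain dict standing in for the persistent store are all assumptions made for illustration, not anything from the paper or a particular RInK.

```python
import json

def upgrade_v1(record: dict) -> dict:
    """Convert the hypothetical old layout (single "name") to the new one."""
    first, _, last = record["name"].partition(" ")
    return {"version": 2, "first_name": first, "last_name": last}

def load_and_upgrade(store: dict, key: str) -> dict:
    """Read a record in either format; rewrite it only if it was old."""
    record = json.loads(store[key])
    if record.get("version", 1) == 1:
        record = upgrade_v1(record)
        # Write back in the new format so old records disappear over time.
        store[key] = json.dumps(record).encode()
    return record

# A dict stands in for the remote store; it holds one old-format record.
store = {"user:1": json.dumps({"name": "Ada Lovelace"}).encode()}
user = load_and_upgrade(store, "user:1")
```

Because each record is upgraded only when it is touched, a bad converter damages the handful of records read before a rollback rather than the whole store.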

The other situation where stateful app servers are useful is when scalability is much more important than change velocity. This may be why Google is coming out with this paper now. This would never have flown within Google when I was there (2009-2014), but I've heard that the focus on maintenance, performance, and architectural improvements is now much greater than that on new features, so it may be appropriate for the Google of today.


The paper anticipated in good detail just what the heck I did with its

"These stores enable services to be deployed using stateless application servers [16], which maintain only per-request state; all other state resides in persistent storage or in a RInK store. Stateless application servers bring operational simplicity: for example, any request may be handled by any application server, which makes it easy to add and remove application servers, to handle server failures, and to handle skewed workloads."

Yup, I thought that was a good idea.

After reading more of the paper, for my project, I still think it is a good idea. Your remarks add weight to that view!

In all honesty, the paper describes a solution to a problem most companies won't need to care about. If processing latency for remote cache lookups is a performance bottleneck, it's likely you're already dealing with either enormous scale (Google/FB/Twitter/Uber/...) or some form of HFT (exchanges/hedge funds/banks/adtech/...)

Amusingly enough, I have reason to believe those groups are over-represented in HN audience. They are also environments where computing capacity can make up a significant fraction of your overall engineering costs.

In practically any other scenario the sub-millisecond delay is dominated and completely masked by your business logic taking tens to even hundreds of milliseconds per request. Remember - if you can return a response and a visual update to a client action within 100ms (including the network roundtrip), you're already operating in the lowest Nielsen band.

Another funny thing about this paper is that it feels to me like people are rediscovering zero-copy network processing. Again, a topic dear to many HN hearts but unlikely to be a prime consideration for most companies.

Only problem is the network caches in wide use at Google are significantly faster than what they are using as a baseline in this paper. It’s pretty misleading.

"every time you deploy new code you need complex and error-prone routines to bring the new process up, have it read the existing state, make sure it's consistent with the new features, and if not migrate it to the new format."

Pardon my ignorance ... but wouldn't a stateful model that used memory as a true 'cache' avoid that problem? That is, the in-memory store is not 'pre-loaded' per se; rather, it's loaded incrementally on an on-demand basis by the app itself?

So 'rebooting' has the effect of flushing the cache, which is then rebuilt over time after the reboot, hopefully seamlessly?

Or am I (probably) missing something?

Your upstream needs to be able to handle 100% of the load, which could be several orders of magnitude more traffic than it normally has to handle. Handling that huge spike in traffic, for what might be no more than a few minutes a year, can be insanely expensive.

Also, you're potentially giving users a noticeably worse experience as you rebuild the cache.
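To make the exchange above concrete, here is a small sketch of an on-demand (read-through) in-process cache and what a restart does to upstream load. `ReadThroughCache` and `slow_lookup` are hypothetical names, and the upstream is just a function that records its calls.

```python
class ReadThroughCache:
    """Loads a value from upstream the first time a key is requested."""

    def __init__(self, load):
        self._load = load   # function that fetches from the upstream store
        self._data = {}
        self.misses = 0

    def get(self, key):
        if key not in self._data:
            self.misses += 1
            self._data[key] = self._load(key)
        return self._data[key]

upstream_calls = []

def slow_lookup(key):
    """Stand-in for the database or backend service."""
    upstream_calls.append(key)
    return key.upper()

traffic = ["a", "b", "a", "a", "b"]

cache = ReadThroughCache(slow_lookup)
for key in traffic:
    cache.get(key)
warm_calls = len(upstream_calls)        # one upstream call per distinct key

cache = ReadThroughCache(slow_lookup)   # simulate a restart: cache is empty again
for key in traffic:
    cache.get(key)
```

After the restart every distinct key misses again, so the upstream sees the full set of first-time loads at once; that is the spike the reply is warning about.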

Well that'd be it then, thanks.

I'm astounded at the number of developers in my own company who reach for Redis when they just need a cache. It's a huge waste of effort for far worse performance and less reliability than something like Java's Caffeine.

Stateful servers run into huge problems when other people try to access them from non-static ip (usually sticky sessions are IP based). We, at least in apps I work on, stuff all the shared state into JWT tokens to avoid this.

Everything else that needs to be cached goes into an in-memory in-app store. If you're using a "fast" language to start with, you won't need more than a dozen servers unless you're at Google scale so cache hit rates are good even though each instance maintains its own.

Again if you're using a "fast" language memory overhead from in app storage isn't much more than putting things in a dedicated KV store. There's really no point to it when the performance drop is usually 20x. We use DropWizard and Vert.X extensively, and most of our endpoints can handle many thousands of requests per second. The ones that use the cache instead of hitting the DB every time hit upwards of 1 million requests/sec. You're very lucky to get 50k with any remote KV store in the same app.

Most load balancer strategies send many requests in a row from a client to the same backend server anyways for TCP performance reasons.

The only reason for a KV cache is if you're running a slow language like Ruby, PHP, Python, where the memory overhead for objects is large and the speed of a remote cache is roughly the same as one in memory. If you find yourself needing a cache for good enough performance you shouldn't be using these languages in the first place. In memory caches don't work well when you need 100+ server instances to handle the load anyways

> The only reason for a KV cache is if you're running a slow language like Ruby, PHP, Python, where the memory overhead for objects is large and the speed of a remote cache is roughly the same as one in memory. If you find yourself needing a cache for good enough performance you shouldn't be using these languages in the first place. In memory caches don't work well when you need 100+ server instances to handle the load anyways

You're saying reading a serialized Python object from Redis and deserializing it is comparable to accessing a ready in-memory object? No, it's an order of magnitude slower. I think the reason in-memory caches are not used much in Python/Ruby is the concurrency model, i.e., in Python/Ruby you usually have to run multiple processes, whereas in Java you run multiple threads, so the in-memory cache can be just a regular data structure accessed normally.
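A rough illustration of the two access paths being compared here, assuming `pickle` as the serializer and a plain dict standing in for the remote store (no network or Redis client is involved, so this shows only the extra work, not the timings quoted in the thread).

```python
import pickle

profile = {"id": 42, "tags": ["a", "b"]}

# In-process cache: the dict holds a reference to the live object.
local_cache = {"user:42": profile}

# Stand-in for a remote store: it can only hold bytes, so the object
# must be serialized on write and rebuilt on every read.
remote_cache = {"user:42": pickle.dumps(profile)}

from_local = local_cache["user:42"]                   # plain lookup
from_remote = pickle.loads(remote_cache["user:42"])   # lookup + deserialize
```

The local read returns the same object, while the remote-style read allocates a fresh copy each time, on top of whatever the network round trip costs.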

Also, I think you are overestimating Python's memory overhead compared to Java - Python's int is 24 bytes and Java's Integer is 16 bytes (and using PyPy is another story - it can use several times less memory compared to Java).

I run several Python servers backed by Redis; Redis was added when we profiled the apps and determined that loading data from several databases was taking too much time. Using a different language wouldn't have changed how much time various RDBMSs take to return an answer to your query.

I think there's some citation needed on the performance of in memory vs out of process caches in interpreted languages.

In my experience with Postgres at least, our queries average less than a millisecond in Java. Our slowest query averages 5. We're not running anything crazy, pretty generic CRUD app.

We can hit about 50k queries per second on one server from our limited stress testing. Redis was around 150k. In memory cache is ~1.2 million. This is JSON requests using Vert.X on Java.

I just don't see the advantage to an out of JVM cache when your service is so fast that you only need more than one instance for failover. In slower languages where you need a bunch of front end servers to handle a reasonable load, out of app caches look more appealing. The hit rate for in app caches becomes too low when you have a bunch of instances.

But my point is that you get a lot more performance with a simpler setup if you use a language fast enough that you can run just a few instances and cache locally. Out of app caches are usually an anti pattern brought about by needing too many front end servers because your app is too slow for in-app cache to be useful (more instances means too low a hit rate without a shared cache).

You're effectively forced to use slower out of app caching because of the number of instances you need to run
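The hit-rate argument can be put as back-of-the-envelope arithmetic. The sketch below assumes requests for a key are spread evenly over the instances, each instance has its own cache, and nothing is ever evicted; real load balancers and caches will differ.

```python
def local_hit_rate(requests_per_key: int, instances: int) -> float:
    """Hit rate for per-instance caches under the assumptions above."""
    # Each instance that sees the key misses exactly once (its cold miss).
    cold_misses = min(requests_per_key, instances)
    return 1 - cold_misses / requests_per_key

few = local_hit_rate(requests_per_key=100, instances=2)
many = local_hit_rate(requests_per_key=100, instances=50)
```

With 100 requests per key, two instances give a 98% hit rate and fifty instances give 50%, which is the effect that pushes large fleets toward a shared cache.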

> I just don't see the advantage to an out of JVM cache when your service is so fast that you only need more than one instance for failover

I don't think anyone would advocate that you move to a remote cache. If you can fit your entire app on a single machine, that's great. It's a huge reduction of complexity, and you'll likely be able to get very fast performance like you indicate. You definitely do not need or want a remote cache.

These tools are for those that have more data or traffic than a single machine can handle. Things get much more complicated when you move from one computer to two.

Sounds like someone should design a server farm architecture tool that will take in various parameters and report designs and what constraints are tight, that is, the solution is right at the constraint.

"We can hit about 50k queries per second on one server"

WOW! For my 1.0 case, I was thinking maybe 10 a second!

It's really far beyond what we need. We never hit over 5% cpu unless we break an index or do something dumb. But it's great that we can run just a few instances and in-memory cache anything coming back from slow services. We've got a lot of dog slow apps buried in the backend, and running minimal instances makes our hit rate on in-app caches nice enough that we can avoid complexity and slowness of shared cache.

Postgres is extremely fast when decently tuned, probably not much slower than Redis (both IO bound). For an ORM the same can be said of Hibernate. And Vert.X is just stupidly fast in general, just need to be careful not to block event loop or use Fibers with Quasar. There's a lot of tweaks for Jackson serialization speed and Hibernate + Postgres performance that are off by default as well.

Do we need all this speed? Absolutely not. But caching is one of the most evil things in comp sci and having so much excess capacity allowed us to simplify a lot of things. For instance, we cache virtually nothing coming from the DB, only slow backends and other third party services. Hibernate also caches some reads for us, and since we don't have many instances the hit rate is fine on that as well.

We probably don't need to worry about performance ever for the future of our app. If the stack wasn't so fast we would need a lot of front end servers and a (slow) shared cache would be needed. Kinda a snowball effect of complexity when you try to retrofit something to be fast down the road

What about cold starts when the cache is empty? Are those painful?

Not terrible because we do rolling restarts, but if all the servers went down at once it could be a problem.

I have to agree. When I first moved from the C++ world to the Web world five years ago, I was surprised to see an entire generation of engineers for whom the idea of using application memory is alien. It's as if we're ignoring an entire layer of the memory hierarchy because we're afraid of managing the complexity of application state with load balancing / sticky sessions.

But once you get to deployment, particularly if you're using shiny new containerized orchestration, in my experience, it is safer and easier to allocate the host node's memory to the redis container and keep it protected from application layer fires.

Same as you would do with a database.

It speeds up CI/CD and keeps things safe / reduces downtime in cases of roll back (assuming you're versioning your keys).
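One way to read "versioning your keys" is to put a schema version in every cache key, so old and new code never read each other's values during a deploy or rollback. The key layout and `SCHEMA_VERSION` constant below are assumptions for illustration, with a dict standing in for Redis.

```python
SCHEMA_VERSION = 3  # hypothetical; bumped whenever the cached value layout changes

def cache_key(kind: str, ident: str, version: int = SCHEMA_VERSION) -> str:
    """Build a key such as 'v3:user:42'."""
    return f"v{version}:{kind}:{ident}"

cache = {}  # stand-in for the shared cache

# The previous release and the current one write under different keys,
# so rolling back simply goes back to reading the v2 entries.
cache[cache_key("user", "42", version=2)] = {"name": "Ada Lovelace"}
cache[cache_key("user", "42")] = {"first": "Ada", "last": "Lovelace"}
```

The cost is that a version bump starts with an empty keyspace for the new version, so the old entries need a TTL or an explicit cleanup.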

> Stateful servers run into huge problems when other people try to access them from non-static ip (usually sticky sessions are IP based). We, at least in apps I work on, stuff all the shared state into JWT tokens to avoid this.

They're not saying that an application server should never query another server, they're saying that the server that functions as a state container should host a domain-specific api for manipulating the state. See figure 2 b; the frontend and stateful application server layers are separate layers.

It's a balance between how in-process stores bring instability to the main application vs. the cost of network & marshalling: out-of-memory errors crashing the application instead of providing a degraded experience, etc. The answer, as usual, is "it depends". We moved from in-process to remote because we valued application stability more than raw performance.

Question: for data keyed by user, did you find you'd actually be better off using a distributed in-process cache than a remote cache? I find that hard to believe but maybe I'm just not up to speed on how advanced these caching solutions are.

Our use case was not for data keyed in by user, but system generated events that were cached in memory (on a remote cache) for deduplication. We started with distributed in-process cache (Apache Ignite), but found out that failure modes of the cache were not acceptable for the core application. Hence we moved the cache out, to Redis. App behaves far better now.
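For readers unfamiliar with cache-based deduplication, a minimal sketch of the idea follows. It is not the commenter's implementation: the `Deduplicator` class, the TTL, and the injectable clock are assumptions, and the state lives in a local dict where their setup used Redis.

```python
import time

class Deduplicator:
    """Remembers event ids for a limited time and reports repeats."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen = {}   # event id -> time at which the entry expires

    def is_new(self, event_id):
        now = self._clock()
        expiry = self._seen.get(event_id)
        if expiry is not None and expiry > now:
            return False              # seen recently: duplicate
        self._seen[event_id] = now + self._ttl
        return True

# Usage with a fake clock so the behaviour is easy to follow.
now = [0.0]
dedup = Deduplicator(ttl_seconds=10, clock=lambda: now[0])
first = dedup.is_new("evt-1")
repeat = dedup.is_new("evt-1")
now[0] = 11.0
after_ttl = dedup.is_new("evt-1")
```

Against Redis the same check is commonly a single `SET` with the `NX` option and an expiry, which keeps the test-and-record step atomic across application servers.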

Funny that this doesn't mention parse-free data structures like FlatBuffers or Cap'n Proto that don't need unmarshalling.

https://google.github.io/flatbuffers/ https://capnproto.org/

"Instead, data center services should be built using stateful application servers or custom in-memory stores with domain-specific APIs, which offer higher performance than RInKs at lower cost."

I'm in the midst of implementing a stateful application server-cluster with custom in-memory stores with domain-specific APIs. Basically, it's a way of implementing a game loop and a multiplayer game by specifying lambdas going against in-memory entity databases and in memory spatial databases. However, to do this, I have gone from using PostgreSQL to the Badger fast key-value store as a database. I'm mainly motivated by the desire to keep everything in golang and eliminate all outside dependencies.

That said, what I'm doing is increasing the granularity of the server processes. The server processes are composed of Actor-like things which are responding to async input queues. So my experience both confirms and contradicts some of the things said in the paper.

I’m looking more and more at Erlang/OTP for many of these same reasons. Ets/Dets/Mnesia seems to address a ton of typical stateful app use cases right out of the box, and it was built for the Actor model with mass quantities of fine-grained processes you mentioned.

It’s cool to hear a similar approach can be done in golang without a ton of external deps.

ha, I ... also am working on a stateful game server in Go, the multiplayer servers that power Jackbox Games' line of games. I don't currently need disk persistence though, I just do it all in memory. It's also way easier than an MMO because the game is entirely instanced, so you can just partition the data by the game instance.

Just wanted to say playing Jackbox with my family has been really enjoyable, and it's really great how easy it is to get everyone set up and started.

Would you mind sharing what made you go with a stateful application server?

I've never found myself in the situation of wanting to go that direction.

the main function of a multiplayer game server is to contain the game state. Depending on the game, it may also run a simulation. You couldn't, for example, write a database record every time someone shot a gun in an FPS, the whole thing would come to a crawl.

And game clients generally only care about the present. Only the current game state is of interest to the client in most situations; the past is rarely queried. The outliers here are things that are naturally transactional and persistent like character progression and item ownership; those things you'd stick in a database.

Makes sense thanks!

> Would you mind sharing what made you go with a stateful application server?

I'm implementing a MMO game loop. Even though I am exposing the API as what amounts to a bunch of lambdas, under the covers, I am holding onto state and mutating it, basically for performance.

EDIT: To put it another way, there is a level of abstraction at which my system is stateless. To implement that with enough performance, I have to hold onto state, so I can deliver data to the implementation of those abstractions in a timely fashion.

Interesting perspective, thanks!

Another benefit of an in-memory cache vs. a remote key-value store is reliability.

Redis/memcached servers are super reliable, but at scale you'll run into a non-trivial number of failures when communicating with those servers; even a sub-second blip in network connections could result in hundreds of failures.

Now you have to deal with retry, ensure your application logic is failure tolerant, worry about timeout, consider impact of temporary delays in redis to your application server due to skyrocketing threads waiting for results and all that fun stuff.
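As a sketch of the kind of plumbing this adds, here is a bounded retry with exponential backoff around a remote call. `call_with_retry`, the attempt count, and the delays are illustrative assumptions; a real client would also need per-call timeouts and a cap on concurrent waiters.

```python
import time

def call_with_retry(operation, attempts=3, base_delay=0.05, sleep=time.sleep):
    """Run operation(), retrying on connection errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise                       # out of attempts: surface the failure
            sleep(base_delay * 2 ** attempt)

# Usage: a fake remote call that fails twice, then succeeds.
calls = {"n": 0}

def flaky_get():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("network blip")
    return "cached-value"

delays = []                                 # record sleeps instead of waiting
result = call_with_retry(flaky_get, sleep=delays.append)
```

None of this is needed for a lookup in process memory, which is the commenter's point.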

At scale, don't you have the same problems with your in memory caching solution (specifically in the distribution and coordination parts)?

Nice. Sounds like the clients should be willing to retry any such remote communications, if only due to little blue ET men crawling through the server farm LAN cables.

The author has missed a very very important feature: Redis as a distributed lock manager.

Redis’s SETNX is a very fast and very cheap solution to resolve race conditions and dining philosopher problems among shards and nodes in large clusters.
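A sketch of the set-if-not-exists locking pattern mentioned above. To stay runnable without a server it uses `FakeRedis`, a made-up in-memory stand-in whose `set(name, value, nx=..., px=...)` call is modelled on redis-py's; the stand-in ignores the expiry, and `acquire`/`release` are hypothetical helper names.

```python
import uuid

class FakeRedis:
    """In-memory stand-in for a Redis client (assumption; ignores expiry)."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, px=None):
        if nx and name in self._data:
            return None                 # key exists: set-if-absent fails
        self._data[name] = value
        return True

    def get(self, name):
        return self._data.get(name)

    def delete(self, name):
        return 1 if self._data.pop(name, None) is not None else 0

def acquire(client, resource, ttl_ms=5000):
    """Try to take the lock; return an ownership token, or None if it is held."""
    token = uuid.uuid4().hex
    if client.set(f"lock:{resource}", token, nx=True, px=ttl_ms):
        return token
    return None

def release(client, resource, token):
    """Release only if we still own the lock."""
    key = f"lock:{resource}"
    # Not atomic as written; against a real server this check-and-delete
    # is usually done in a single server-side script.
    if client.get(key) == token:
        client.delete(key)
        return True
    return False

client = FakeRedis()
token = acquire(client, "nightly-job")
second = acquire(client, "nightly-job")     # already held
```

The TTL matters in practice: without it, a holder that crashes would leave the lock in place forever.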

The title is annoying on purpose... This is a different, more complex and more-efficient-in-some-cases approach.

Sounds like their argument hinges on the efficacy of systems like https://eng.uber.com/intro-to-ringpop

Yes, Ringpop was reference number 8 in the article. It is interesting that https://github.com/uber/ringpop-go says: "This project is no longer under active development"

So the time of Redis has passed but the successor has passed away as well?

Basically it's been deprecated in favor of using projects that are more mature like Apache Helix (but conversely more operationally intense to get started, requires Zookeeper and a bunch of things on top if you want to use it in a non-Java setting, i.e. with Go).

I've really grown to dislike ZooKeeper. While incredibly useful, it's pretty awful. It's nearly impossible to secure in any non-trivial setup. And its inane requirement to reserve a physical disk [1] for its transaction log is almost impossible to fulfil with virtualized hardware.

[1] From the manual:

> ZooKeeper's transaction log must be on a dedicated device.

Is this not the assumption of many SQL databases that make certain guarantees about complying with ACID? I'm not a DB expert but feels like something I've read before.

Not since the 1990s

Reading about ringpop, came across this statement

> Ringpop, a library for building cooperative distributed systems, solved some of Marketplace’s problems before its adoption in other teams at Uber and beyond. It gives the high-availability, partition-tolerant properties of distributed databases like DynamoDB or Riak to developers at the application level.

I am wondering about the need to have your app services communicate with each other (is that because these are stateful applications?) or have I missed the point completely?

It's to avoid the complicated routing problem (i.e. building a router microservice that sits in front of it), and encapsulate that in the service itself (by proxying internally to another part of the service after inspecting the request).
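The "inspect the request, then handle or proxy internally" idea can be sketched with a consistent hash ring. This is not Ringpop's code: the `HashRing` class, the node names, and the fixed membership list are assumptions (Ringpop maintains membership dynamically via gossip).

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    """Stable 64-bit hash of a string."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

class HashRing:
    """Consistent hash ring with virtual points per node."""

    def __init__(self, nodes, replicas=64):
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._hashes = [h for h, _ in self._points]

    def owner(self, key: str) -> str:
        """Return the node owning the first ring point after the key's hash."""
        idx = bisect.bisect(self._hashes, _hash(key)) % len(self._points)
        return self._points[idx][1]

def handle(ring: HashRing, self_node: str, key: str) -> str:
    """Serve the request locally if this node owns the key, else forward it."""
    owner = ring.owner(key)
    return "handle locally" if owner == self_node else f"forward to {owner}"

ring = HashRing(["app-1", "app-2", "app-3"])
owner = ring.owner("rider:1234")
```

Any instance can receive any request, and the service itself does the routing, so no separate router tier has to know where a key lives.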

you mean like an orchestrator service? ex: building for an Uber Eats system which depends on responses from systems (B, C, D, ...) where each of these systems has xx-minute SLAs?

Atul Adya and Daniel Myers work on Slicer[1] so they have some bias.

[1] https://www.usenix.org/system/files/conference/osdi16/osdi16...

I'm currently working on a side project, a distributed in-process key/value store for .NET Standard, which might be interesting in this context. I hope you'll pardon the shameless plug, but I'm interested in getting feedback. Particularly, what to focus on, what use cases, missing features... And of course, if you see any issues with the approach.


It needs more work, but I think the eventually consistent part could already be useful for a simple distributed cache layer atop a classic relational DB.

Article reads like an overview of 2008 technologies, but offers no actual, new solution to the existing solution that fuels Twitter and other social media sites.

Would this just be like using RocksDB or something like that with extensions to auto-shard and balance between instances? That's what it looks like to me, but I could be misunderstanding it.

Event sourced persistent actors in cluster mode is where it's at!

Love it; "if you fetch it over a socket, it ain't no cache"

When I'm explaining KV to a junior dev, I do it something like this:

Memcached got big during a confluence of two problems that have both been solved now. First, we had a bunch of big, semi-stateful web applications built in programming languages with GC, and server memory got bigger than you could reliably collect without long pauses. Pushing some of your data into a separate memory space raises the ceiling considerably.

Secondly, during this ten year window, ethernet cards were half an order of magnitude faster than hard drives. Putting stuff on another machine could be faster than sending it to swap, a memory mapped file, or some more sophisticated data store (like a database).

We don't have to struggle with these now, and half the time we avoid the first one altogether. They still have lots of places they are used, but you are way better off working to cache inbound requests instead of outbound requests. That lets you move a bunch of caching to the edge of your network, or to the user agent.

It's "key/value" store.

I don't get it: Key-value stores is a topic that just won't go away. I was interested in key-value stores, did a little work, solved the problem for my needs, and don't care anymore. Sure maybe there can be some high end, challenging aspects, but what I did was dirt simple, childishly simple to do, blindingly fast, with plenty of capacity -- maybe will cover 75% of the need for everyone?

Heck, if we want something with really deep functionality, we are just back to relational database, right?

So, I wanted a key-value store for session state for users of the Web site of my startup.

When a user sends a page back to my server, it has in a "hidden field" the value of an encrypted token which identifies the user's logical session. And I have in VB.NET a class for the data I want to keep on that user, their session state.

So, in my VB.Net Web server code, I have defined some classes: (1) An allocated instance of one class sends a new or updated key-value pair to the key-value server. (2) An allocated instance of another class requests from the key-value server the value that corresponds to a key. (3) An allocated instance of a third class gets the value back from the key-value server.

For communications, just take an instance of a class, serialize it (thank you VB.NET), copy it to a byte array, and send that via old TCP/IP sockets. Then the receiver gets the byte array and does a deserialization to get the corresponding object instance.

The key-value server is based almost entirely on just two instances of a standard .Net collection class. One instance has the key-value pairs where, right, the key in the pair is the key in the collection class. The other instance has a time-date pair as key and for a value a key from the key value store. Here there is the time-date when the corresponding key-value was last written or read and is used for session time outs. So, for session time outs, just read this time-date and session key instance in ascending order on the time-date key and for any session time-outs delete that collection class element and also the corresponding element in the other collection class.
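A Python sketch of the two-collection design described above, written only to illustrate it; it is not the commenter's VB.NET code. It swaps the time-keyed sorted collection for an `OrderedDict` kept in last-access order, and the class name, timeout, and injectable clock are assumptions.

```python
from collections import OrderedDict
import time

class SessionStore:
    """Session values plus a last-access index used for timeouts."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._values = {}                # session key -> session state
        self._last_used = OrderedDict()  # session key -> last access, oldest first

    def _touch(self, key):
        self._last_used[key] = self._clock()
        self._last_used.move_to_end(key)   # most recently used goes last

    def put(self, key, value):
        self._values[key] = value
        self._touch(key)

    def get(self, key):
        if key not in self._values:
            return None
        self._touch(key)
        return self._values[key]

    def expire(self):
        """Drop sessions idle for at least the timeout, oldest first."""
        cutoff = self._clock() - self._timeout
        expired = []
        while self._last_used:
            key, used = next(iter(self._last_used.items()))
            if used > cutoff:
                break                      # everything after this is newer
            del self._last_used[key]
            del self._values[key]
            expired.append(key)
        return expired

# Usage with a fake clock.
now = [0.0]
sessions = SessionStore(timeout_seconds=30, clock=lambda: now[0])
sessions.put("alice", {"cart": 1})
now[0] = 10.0
sessions.put("bob", {"cart": 2})
now[0] = 20.0
sessions.get("alice")          # refreshes alice's last-access time
now[0] = 45.0
dropped = sessions.expire()
```

Because the index is ordered by access time, the expiry pass stops at the first session that is still fresh instead of scanning everything.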

It's a few hundred lines of code and was a simple exercise in TCP/IP, object instance de/serialization, and collection class usage.

I also have code to permit sending some system management commands to the key-value server.

It runs as just a console application.

The code is single threaded, and I'm counting on just the standard TCP/IP FIFO queue to hold the backlog of work to be done.

All the data is held in main memory so should be blindingly fast. Some simple arithmetic indicates that the main memory needed is not very large for even a quite busy Web server computer.

The session state server can serve any reasonable number of Web server computers, and a particular user doesn't need session affinity with a particular Web server.

Looked simple and easy to me.

How much of the need will something so simple satisfy?

What needs will such a simple solution not satisfy?

that's ... literally what people do with redis and memcache, you just wrote a slower memcache with fewer features.

that's on one server and the data is all in process memory?

GP has shown a long-term tendency to create a naive implementation, in VB.NET even, and then get surly about folks who refuse to acknowledge its sufficiency. No good can come from engaging / trying to educate.

N.b. built-in .NET object serialization is a slow, wasteful, brittle solution, but it could be okay if one doesn't mind the performance or portability issues. Heck why not use XmlSerializer then?

What's wrong with VB.NET? As far as I can tell, it's as good as anything else to get to the .NET Framework. I like it because it has more traditional syntax I believe is easier to teach, learn, read, write, and debug than C#.

With VB.Net, what am I giving up?

I'm asking about "sufficiency": It all seems dirt simple to me, where some simple code really is sufficient. So, where is the simple approach not powerful enough? Where is Redis needed and something simple not good enough? Where does my simple solution fall short? It appears that Redis is now really important: What does it have that is so important?

Is object instance serialization really slower than, all things considered, it has to be? Or if we accept (1) we are going to have instances of VB.Net style classes and (2) send the data via TCP/IP, that is, basically via byte arrays, then where can we do better than just serializing the class instances to byte arrays? If we work with just XML text, then we don't really have the advantages of performance of VB.Net instances of classes, and we're back to interpretive data structures instead of compiled data structures. I can think of some just blindingly fast approaches in PL/I because the PL/I data structures (1) are nearly as useful as class instances and (2) occupy essentially sequential storage -- so, could just copy storage from a starting address to an ending address and send the result as the byte array -- reverse that process at the receiver. Could do much the same with C structures, much less general than PL/I structures, if avoided most uses of pointers.

This is my first Web site, first use of collection classes, first use of a session state store -- I'm trying to learn what I'm missing out on.

> What's wrong with VB.NET?

nobody in the startup space is using it, so it'll be hard to hire for people that have the skills. There's a feedback loop there. Because no other companies are using it, people that don't have the skills also don't want to spend their time learning it, because the investment won't have a good pay off. I improved my resume and got more job offers by -removing- VB.NET ... nine years ago.

> It appears that Redis is now really important: What does it have that is so important?

It's really fast, really stable, and really well-understood by a lot of people. You can hire someone that already knows how to query it, how to operate it, how to fix it when it breaks. A lot of tools exist to work with it that you would have to write yourself if you rolled your own solution. When a bug is found, it's reported to the maintainers, and then fixed for everybody.

> If we work with just XML text

XML is pretty dead as a data interchange format. It's incredibly space-inefficient. Most people are using one of json, msgpack, thrift, or protocol buffers. (I'm personally a fan of protocol buffers; you write a schema and use the schema to generate the serialization and deserialization code, and the messages are efficiently-packed binary so they're pretty good on space efficiency.) There's a few outliers; mongodb uses bson because it predates the publication of msgpack but it's a very similar format. Writing the in-memory representation of a value and then reading it back on another machine and mmaping it and operating on it directly is a rare approach because it's hard to do it correctly while supporting changes to the types. Eventually you update an application, and an old record written by version 1 of the application has to be readable by version 2 of the application. It's not a completely dead strategy though, it's what cap'n proto does.

seriously though I'm willing to bet there is almost nobody on Hacker News that understands any of your PL/I comparisons. You keep bringing up PL/I as if you assume everyone knows PL/I. In ten years as a professional programmer you are literally the only person I have ever heard bring up PL/I in a discussion that's not about history.

I have a box from IBM with a version of PL/I that can run on Windows. I did run it years ago on Windows XP. I no longer run or want to run PL/I. I don't expect anyone to use it.

I bring up PL/I here on HN occasionally just as an example and lesson in programming language design. It's got some good features now lost, and for people interested in programming language design maybe some of the features could be brought back.

Since there has been a complaint on this thread that .Net style object instance de/serialization is slow and otherwise not so good, part of the problem is that such an object instance is not all by itself in some range of sequential memory addresses and, instead, can be scattered all over. So, de/serialization needs implementation by the language developers. So, HAVING the desired data in sequential memory could have an advantage, e.g., in speed, that is, just copy the contents of memory in the block of addresses -- tough to be faster than that (ah, do it in assembler and use some string copy instruction assuming that X86 has one of those!). Could do similar things for the simpler versions of C structs.
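The "data in one sequential block of memory" idea can be illustrated in Python with the standard `struct` module, which packs values into a fixed byte layout much like a simple C struct. The field layout and function names are made up for the example.

```python
import struct

# Hypothetical fixed layout, little-endian, no padding:
# 8-byte user id, 4-byte visit count, 16-byte name field.
SESSION = struct.Struct("<QI16s")

def pack_session(user_id: int, visits: int, name: str) -> bytes:
    """Lay the fields out contiguously; short names are padded with NUL bytes."""
    return SESSION.pack(user_id, visits, name.encode("utf-8"))

def unpack_session(buf: bytes):
    """Read the fields back from the same fixed offsets."""
    user_id, visits, raw_name = SESSION.unpack(buf)
    return user_id, visits, raw_name.rstrip(b"\x00").decode("utf-8")

wire = pack_session(7, 3, "ada")
decoded = unpack_session(wire)
```

The record is always 28 bytes and needs no parsing, but both sides must agree on the layout exactly, which is the versioning difficulty raised earlier in the thread.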

But in cautious shops, running code to get to memory addresses could be frowned on!

A broad point about PL/I is that its structures are nearly as useful as the .Net style classes but with much faster and simpler addressing. In programming language design, that's a little point to keep in mind.

I wrote enough C code to see how nearly all that syntax worked. So, maybe someday I'll write some C# code to be sure I know how its version of the C idiosyncratic syntax works!

Since I'm a sole, solo founder, floor sweeper to CEO, of my own startup, I get to concentrate on the tools my startup needs instead of tools someone else's startup uses!

The key-value store discussion here is good: I'm concluding that my needs are simple and my simple code is enough for those needs, that from the experience of others here in time I should want more functionality, and maybe should convert over to Redis. By then I might be hiring people to do such things.

I'm not sure that having my code in VB.NET is a serious dead end: As I understand, C# can do what VB.NET does with just a different flavor of syntactic sugar, there's a translator from Microsoft to go from VB.NET to C#, and the actual semantics are so close that for the corresponding language features (some of the more recent features in C# may be missing from VB) it is fine to call one of the languages from the other. That is, as I understand the situation, the real work is to get to the .Net Framework and the CLR (common language runtime) and VB, C#, F#, etc. are all essentially equivalent for that work. So, for differences we're talking some functionality and otherwise syntactic sugar.

My current work is more nonsense: Last night I started a run of Robocopy to do an incremental backup of one of my hard disk partitions. This morning Robocopy had completed without errors, but what it copied just as an incremental copy was about 80% of the whole disk partition when I've had only minor activity on that partition since the last full backup and turning off all the archive bits (archive bit OFF means that the file has been backed up since it was created or last changed and ON means that the file was created or changed since the last time the archive bit was set OFF).

So, somehow that disk partition got a LOT of archive bits turned on.

Open Object Rexx has a cute function one can call to get a really nice description of everything in a file system subtree, one file/directory per line, with archive, hidden, system, read only bits, time-date stamp, size in bytes, full tree name.

So, I compared two of these files and found that after the last full backup I really did set all the archive bits off and for the incremental a lot of REALLY old files, going back years, still had their correct time-date stamps, had NOT changed, but DID have their archive bits set ON.

Bummer. Something has been walking around in that tree setting archive bits ON for NO good reason. Really big bummer.

So, to explain, the partition drive letter is K:. There are some files in the root of K:, but mostly the files are in K:\data01, ... K:\data05 and K:\prog01, ..., K:\prog05. Right, separation of data and programs!

So, looking, all the archive bits set ON for no good reason were in just K:\data05. Hmm.

What could have happened?

Well, this computer is one I built from parts, running Windows 7 64 bit Professional SP1, but I also have an HP laptop running Windows 10 Home Edition with, of course, all the latest updates whether I wanted them or not.

And a week or so ago, I got okay with

net share

net use

etc. to set up file sharing between the two computers.

Well, I gave the Windows 10 system access to the Windows 7 directory, right, you guessed it, K:\data05.

That about has to be the cause. Somehow letting Windows 10 share the Windows 7 directory K:\data05 had the Windows 10 system walking around all over that directory, including several directory levels deep, and to some files years old, and setting archive bits ON.


So, no more such sharing! I'll create a directory, say, HP_laptop_share, in the root of a partition and let Windows 10 share it. I'll just put temporary stuff there. And unless there are more bugs, Windows 10 will be limited to making a mess only in that directory.

So, with the HP OFF the Windows 7 system, I'll do a full backup, set the archive bits off, and keep Windows 10 the heck OUT of there! Lost half a day with this. I call it mud wrestling. It has NOTHING to do specifically with my startup. ALL the work unique to my startup has been fast, fun, and easy for me -- the difficulty has been literally YEARS of mud wrestling, all for no good reason. What do others do about such mud wrestling?

> I bring up PL/I here on HN occasionally just as an example and lesson in programming language design.

PL/I influenced B, which influenced C, which influenced so many things that literally every person on HN has used a language influenced by C. You're out here talking down to people like "you whippersnappers don't know what you're doing" when you're literally decades behind the curve.

and besides, check out lambda-the-ultimate.org, that's a community more specifically oriented at programming language design. And no, they don't talk about PL/I either.

> A broad point about PL/I is that its structures are nearly as useful as the .Net style classes but with much faster and simpler addressing. In programming language design, that's a little point to keep in mind.

Go has this, Rust has this, lots of languages have this. Languages that give the developer the ability to control how their data structures are laid out in memory are not a revelation, we know about them. That's not a new or lost feature, you just have your head buried so far in the sand that you are incapable of viewing anyone that came after you as knowing anything.

https://syslog.ravelin.com/go-and-memory-layout-6ef30c730d51 https://doc.rust-lang.org/reference/type-layout.html

And besides, Go was co-designed by Ken Thompson, who invented B, which was strongly influenced by, drumrollllll please, PL/I!!! Your dedication to not listening to people has robbed you of knowing about things that you would probably like.

> just copy the contents of memory in the block of addresses

I already addressed this earlier. That's what cap'n proto and flatbuffers both do: https://capnproto.org/ https://google.github.io/flatbuffers/

If you listen to people, they don't have to repeat themselves so much.

> de/serialization needs implementation by the language developers

that's literally untrue. People write serialization and deserialization libraries all the time. Lots of languages give the programmer the ability to lay out how their data will look in memory, and lots of serialization libraries are mindful of creating values that have reasonable data locality characteristics. Lots of languages and deserialization libraries don't do this because it literally doesn't matter for most programs and for most programmers most of the time. You're approaching the problem as if computers are expensive, because in your PL/I days they were. But computers are cheap now, and getting cheaper every day. For example, an ec2 instance with 4gb of memory will run you all of $20 a month. Napkin math on some nice round numbers: a programmer costs you $150k a year and they work 2000 hours a year, that works out to $75 an hour, or $600 a day. If a task takes you one day, but you could avoid the task by buying another ec2 server, you just buy the space and avoid the task. A day of programmer time is worth 30 months of server time. Unless you're talking about Windows servers, which are twice as expensive because literally nobody runs Windows servers, they are garbage.
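The napkin math above, written out. The only number not stated in the comment is the 8-hour working day, which is an assumption that matches the $600 figure.

```python
salary_per_year = 150_000   # dollars, from the comment
hours_per_year = 2_000      # from the comment
hours_per_day = 8           # assumption: an 8-hour working day
server_per_month = 20       # dollars, the 4gb ec2 figure from the comment

hourly = salary_per_year / hours_per_year        # dollars per programmer hour
daily = hourly * hours_per_day                   # dollars per programmer day
months_of_server = daily / server_per_month      # server months per programmer day
```

With these inputs the three values come out to 75, 600, and 30, matching the figures in the comment.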

> I wrote enough C code to see how nearly all that syntax worked. So, maybe someday I'll write some C# code to be sure I know how its version of the C idiosyncratic syntax works!

That word does not mean what you think it means. Idiosyncratic is a relative term: it specifically means someone who is outside of expectations, outside of the norm. C-style syntax is used in, yes, C#, but also JavaScript, Go, Rust, and a great many other programming languages. It is PL/I that is idiosyncratic, as it is literally only you talking about it. Idiosyncratic does not mean "unfamiliar to ME", it means "differentiating the individual from the majority". If everyone is doing something one way, and an individual is doing it a different way, it's the individual that is being idiosyncratic.

I don't know why you went off on a long tangent about your backup problem. I do not care. It should be obvious to you that I would not care. And the cherry on the top was "I call it mud wrestling". Dude. We have a name for this. It's "yak shaving". Everyone uses that name. You would know that if you ever listened to anyone. https://en.wiktionary.org/wiki/yak_shaving

> What do others do about such mud wrestling?

You put all of your code in a hosted source control provider like Github or Bitbucket. Database backups and archival images go on Amazon's S3 or Google Cloud storage. Usually you'd automate the backups and make them full backups and retain a certain number of snapshots. I don't think most people nowadays back up their entire computer. "Everything important is in the cloud" is the most common strategy.

Talking to you is very exhausting. You consistently demonstrate a complete lack of respect for the generations of programmers that have followed you. As a result, you struggle with problems that beginner programmers solve every single day without issue.

> You're out here talking down to people like "you whippersnappers don't know what you're doing" when you're literally decades behind the curve.

I'm doing no such thing. I've done nothing wrong.

I know C and PL/I well, and I see little or no influence on C from PL/I.

> Go has this, Rust has this, lots of languages have this. Language that give the developer the ability to control how their data structures are laid out in memory are not a revelation, we know about them.

When I was at IBM's Watson lab, we designed the language KnowledgeTool. Then I looked at some of the literature of programming language design. I didn't find anything very new or exciting. It looked like a dead field. I haven't looked again since then.

Some years ago I made a decision, to do my computing on just one of Linux or Windows.

What influenced me:

(1) After some trying, I didn't find any documentation that explained even simple things about Linux. E.g., just what is/is not in a Linux "distro"? Do all the distros have the same operating system APIs, the same utilities, and run the same applications? What are the differences in the distros?

(2) What progress has Linux made in exploiting 64 bit addressing?

(3) With Linux, for operating system updates, do I have to compile the operating system?

The books I found didn't say. Dirt simple questions; no answers.

(1) For Windows, I was already using 3.1 or some such.

(2) I was and still am impressed by the relatively high promises of backwards compatibility.

(3) I looked around and saw some large Web sites using Windows.

(4) Microsoft runs some astoundingly well organized, huge server farms running Windows, e.g., just for their MSDN documentation. It appears that they have operating system installation down close to just push one button and bring up 1 million square feet of racks with servers.

(5) From what I've seen, I like the Microsoft file system NTFS.

(6) I want a LOT of good documentation. E.g., for .NET I found, downloaded, read, abstracted, and indexed about 5000 MSDN Web pages. I put tree names in my code and have a single keystroke in my editor that will have Firefox display the page. Maybe this is something like what Microsoft does with IntelliSense -- but since I don't know anything about IntelliSense I can't say.

Microsoft's documentation is really good in some respects -- the pages are nicely formatted; they often have references to related pages; there is some organization to the pages; they spell the words correctly; etc.

I get torqued at times, e.g., when they use terminology without defining it, explaining it, or linking to such content.

For several of their major efforts, I still have no idea what they are driving at -- as far as I can tell, they don't explain. E.g., I went for some months getting their newsletters touting Azure. Lots of talk about Azure. How good Azure can be. How to get started with Azure. And with all this, they never explained what the heck Azure actually WAS! Eventually Azure leaked out: It's a Microsoft server farm set up so that users can send their code and the farm will run it. So it's cloud computing often running some user's code. Okay. Now I understand the basics of Azure.

But I'm reluctant to bet that the Linux documentation is better.

(7) At the time, Microsoft had $40 billion in cash; now they are worth about $1 T. So, Microsoft DOES have the resources to keep Windows, Windows Server, SQL Server, various utility programs, malware defenses and corrections, etc. up to date.

(8) Microsoft has a LOT of users, and sometimes some of those users post good answers to some questions.

(9) I found that if really get stuck and really do have a serious question, then there are people at Microsoft who will give some fast one on one tutoring. E.g., early with VB.NET, I wrote a little routine to use classic heap sort to sort an array of instances of a class. I ran the thing, and it ran for 600 seconds when it should have run for about 2 seconds. So, right, I didn't understand just what I was supposed to do with polymorphism -- I was supposed to write an interface. Gee, we did much the same back in Fortran. Okay, polymorphism has a lot of hype and, really, is much like what we were doing with old Fortran; have to write an interface; I found some examples, tweaked a little, got an interface, and running time about 3 seconds.

An amazing part is that somehow VB.NET essentially dynamically discovered the data type of the class I was passing to the sort routine. So, maybe all that is available via reflection. I was impressed.

For GO, Rust, Haskell, as for Ada, Prolog, APL, etc., for me for my work now mostly no thanks.

I need to eat; for that, need to make money; for that, starting a business. The business makes use of computing, and I have to write some software.

For that software, on Windows, I picked VB.NET. It's fine with me. I suspect that I could convert over to C# without much trouble. One way to convert would be just to run my VB.NET code through a translator that writes out equivalent C# code -- my understanding is that Microsoft has such a translator.

As far as I can tell, VB.NET is a plenty good way to get to the Windows services and APIs, the .NET Framework, and the common language runtime (CLR).

Maybe some programmers take pride in learning lots of languages. Okay. But that's not what I am: Instead, I'm trying to start a business.

Besides, I don't see much attention to GO, Rust, Haskell, etc. on Windows.

On being "behind", I don't think so: The amazing thing about computing is the low cost and high performance of processors, storage, and communications.

Other than that, the significant things in computer science and practical computing have changed only very slowly.

To me programming is essentially defining storage, writing expressions to manipulate storage, the control structures if-then-else, do-while, and call-return, handling various data types, e.g., strings, dates, and handling exceptional conditions.

To me the key to my startup is not the computing at all -- it is just routine. The key is some applied math I derived. So, I'm an applied mathematician in business who uses computers.

The computer tools I use are the ones I want for my business. Since I won't be using Go, Rust, Haskell, Java, Ada, Prolog, etc., I don't pay attention to them. Someday I may use some Python or JavaScript -- if so, then so be it. Microsoft has IronPython that is compiled and maybe has good or full access to the .NET Framework. Okay.

To me, the future of computing will be in mathematics, complete with theorems and proofs.


I'd heard of Redis -- it looked like overkill and more work just to understand than what I wrote.

I'd never looked at memcache.

With your mention of these two, I did a little Google search. Google has some of their little nutshell relevant descriptions right away and also a link to an informative article:


This article seems to say that now Redis has the memcache functionality built-in implying that never need to use both.

The article does seem to claim that memcache is multi-threaded and Redis, single.

I've been hoping I could get by with single threaded and the TCP/IP FIFO queue -- IIRC .Net has a multi-threaded version of the collection class.

Or, maybe if want multi-threaded, better just to run that many instances of the software, each single threaded, and f'get about the overhead for the multi-threaded collection classes?

Again it looks like memcache and Redis are more complicated just to use than what I wrote.

"You just wrote a slower memcache with fewer features."

Well, well almost anything will have more features than I included!

What I wrote should be plenty fast internally if the .Net collection classes I used are from AVL trees, Red-Black trees, or something similar. Then what I wrote would be slower because (i) get to it via TCP/IP sockets instead of some addressing more direct that would have to be on the same machine.

But I didn't really want my session state store to have to be on the same machine as the Web server software that is using it. So, that essentially implies that a Web server has to get to the session state key-value store via TCP/IP sockets and stands to be about as fast as anything else with such socket access.

Let's see: For some much heavier loads with lots of Web servers and lots of session state store key-value reads/writes, start up 20 such session state stores and have each key start with a number 1-20 to specify which key-value store. Then do a crude load leveling -- for a new key-value pair, send that to 1 plus the number of the last number used modulo 20, that is, round-robin. So, that would be my approach to something scalable, distributed, IIRC, sharding. Yes, could do more where could have an instance of the session state store start a shutdown by accepting no new key-value pairs and then after the time out value with all its pairs deleted send a message somewhere saying it is shutting down. A Web server trying to send a new pair to such a shutting down server would receive back a going out of business message and not bother that server again until it has received another message. Or when intending a shutdown of a session state server, just send a notification to each Web server that the particular session state server was shutting down. So, could have some system management features for better continual operation. Maybe Redis has all this already and the details of such features are what I don't want to take time to read just now!

Sounds like if my startup is successful, I'll be converting over to Redis!

yes, redis is single threaded. The author's stated position is that if you want to use more cores, you run more instances of redis on the target machine (but then you have a scheme to map keys to running redis instances as if it were multiple servers). The benefit of keeping it single threaded is that it makes reasoning about the data guarantees simple. You're guaranteed that no two redis commands are in flight at the same time.

serialize an object to an opaque run of bytes, writing it to redis with SET, get it back with GET, is like ... the majority use case of redis. It's basically just a big hash table with a nice networking layer, that's all it is. It'll do a few hundred thousand operations per second if all you're doing is getting and setting opaque values. You don't have to use the other features, and their existence doesn't cost you anything, but you can have, for example, a hash table of strings, or a list of strings, or a skiplist of strings, or a set of strings, and those things all come in handy. For example, if you have a set, you can just add elements to the set, and if you want to know if the set has something, you just send the value and it gives you a bool back; you don't have to read the entire set to check it for one value.

yeah, you're describing a very naive implementation of modulo sharding. That's a pretty old technique, although you wouldn't generally put the shard index into the key. If you do that, you're making it impossible to move the value, because you'd have to change the key. `foo1` and `foo2` are different keys; you have to be able to move the value without changing the key. So generally you wouldn't put the offset in the key itself, you would take a hash of the key and modulo the hash to find the index of the node containing that key.
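A minimal sketch of the hash-then-modulo lookup, in Python. The function name and the choice of SHA-256 are assumptions for the example; the point is only that the hash must be stable across processes and machines, which Python's built-in `hash()` on strings is not.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) without touching the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards

index = shard_for("session-12345", 20)
```

The key itself carries no shard number, so the value can be moved later by changing the mapping instead of renaming the key.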

But anyway partitioning the data that way is easy. Nobody is contending that writing it that way once is hard. The problem occurs when you want to add or remove servers. Re-partitioning a live dataset is much harder; adding and removing a node means having to reassign and move a bunch of keys. Unless you build some sort of query-routing proxy, you're also requiring the clients to be aware of the cluster topology, and you have to coordinate cluster changes with the clients.
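To put a rough number on "a bunch of keys", here is a sketch that counts how many keys land on a different node when a plain hash-modulo scheme grows from 20 nodes to 21. The helper names and the synthetic keys are made up for the example. With a uniform hash, a key keeps its index only when the hash gives the same remainder for both node counts, which happens for about 1 key in 21.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def fraction_moved(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard index changes when the node count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count))
    return moved / len(keys)

keys = [f"session-{i}" for i in range(10_000)]  # synthetic keys, for illustration only
moved = fraction_moved(keys, 20, 21)
```

So adding a single node under naive modulo sharding reshuffles nearly the whole dataset, which is why re-partitioning live is the hard part.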

TCP buffer as command queue is hilariously idiosyncratic, I've never given it much thought. Dunno what the failure modes are there. TCP flow control should dictate that the sender stop sending data, but I don't know what that means practically; it might cause the sender to block until the queue clears.

Many thanks.

Your points on the topology of the architecture of the servers are well taken.

My idea for taking a session state server off line has been just (1) to send it a message to decline more new pairs and (2) to wait until all its key-value pairs have timed out and then just shut it down. But there's a flaw here: An actual point in server farm system management uses a variable part of the Web site logic, that is, the time out intervals. That's not so good.

But other ways of session state server shutdown would about have to be able to move key-value pairs from the server to be shutdown to another key value server. Maybe easier would be just to replace each session state server with two essentially identical. Then if don't have a proxy (itself a single point of failure?) have each client send everything to both of the session state servers. In a serious data base environment, there would have to be commit logic, etc. Hmm. Otherwise, when want to shutdown a session state server, then just shutdown one of a pair.

To bring new pairs on-line, then just send to all the clients a table of all the session state servers they could use. Lock the table, change it, and unlock it? Or just have the table accessed via a pointer and in an atomic operation change the pointer to the new table? Then to delete the old table have to wait until it is no longer in use? Hmm ... concurrency issues which I'm trying to avoid! Hmm, put the new table on some server and then send all the clients a message to go to the server and get their copy of the new table -- single threaded solution!

Moving key-value pairs from one key-value server to another sounds to me like Cadillac, filtered air, humidity controlled air conditioning in an open Jeep! Wow. My startup isn't for stock trading, banking, or health care! If a key-value store fails, then on some Web servers there will be some TCP/IP session time outs and some Web site users will get some messages with apologies and some cartoon of a mess up.

Hashing could be faster than AVL trees. There is a tweak to hashing in

Ronald Fagin, Jurg Nievergelt, Nicholas Pippenger, H. Raymond Strong, 'Extendible hashing—a fast access method for dynamic files', "ACM Transactions on Database Systems", ISSN 0362-5915, Volume 4, Issue 3, September 1979, Pages: 315 - 344.

that has a graceful way around collisions. We used that in some work at IBM's Watson lab for their Resource Object Data Manager (RODM) shipped with NetView.

For one of the collection classes I'm using, for detecting time outs, the key is a time-date stamp, and I want a fast way to read the pairs in ascending order on time-date stamp, right, starting with the oldest time-date stamp. So, mere hashing would not offer that. Ah, I fell in love with AVL trees when I first read about them in Knuth, The Art .... I'm just hoping that the .Net class is based on AVL trees or something equivalent.
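For what it's worth, the "read the oldest time-date stamp first" requirement can be sketched without a balanced tree. The Python below swaps in a binary heap (`heapq`) next to a plain dict; it is not the .NET collection described above, and the class and method names are made up for the example. Time is passed in explicitly to keep the sketch deterministic.

```python
import heapq

class SessionStore:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.values = {}        # key -> (value, expires_at)
        self.expiry_heap = []   # (expires_at, key), smallest expiry on top

    def put(self, key, value, now):
        expires_at = now + self.ttl
        self.values[key] = (value, expires_at)
        heapq.heappush(self.expiry_heap, (expires_at, key))

    def get(self, key, now):
        entry = self.values.get(key)
        if entry is None or entry[1] <= now:
            return None
        return entry[0]

    def purge_expired(self, now):
        """Remove timed-out pairs, oldest first, and return their keys."""
        removed = []
        while self.expiry_heap and self.expiry_heap[0][0] <= now:
            expires_at, key = heapq.heappop(self.expiry_heap)
            entry = self.values.get(key)
            # Skip stale heap entries left behind by a later put() on the same key.
            if entry is not None and entry[1] == expires_at:
                del self.values[key]
                removed.append(key)
        return removed

store = SessionStore(ttl_seconds=10)
store.put("a", "first", now=0)
store.put("b", "second", now=5)
first_purge = store.purge_expired(now=12)
```

Lookups by key go through the dict (hashing), and only the timeout sweep needs ordering, so the two concerns do not have to share one data structure.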

Somewhere between the user's fingers and the most distant parts of the server farm there have to be some flow controls of some kind.

So, if the TCP/IP input queue to a session state server fills, then likely, right, the Web server making the request will block, that is, wait, maybe time out. So, that would be one source of flow control.

Maybe I should think about implementing more.

There's an old approach to flow control: Just don't respond to the user so fast. Instead, make the user wait a little. That approach, in effect, slows the rate of data input to the whole Web site.

Maybe I should think, at least for the future, say, version 2.0, about actually designed means of flow control, rate limiting, etc.

What do people do about that, say, at high end sites?

> wait until all its key-value pairs have timed out and then just shut it down

you can do it that way but it often winds up making the server's shutdown dependent on a client behavior. For example, the liveness of a key is determined by the client in most applications, not the server. Any sort of naive TTL system where a read on a key extends its lifetime means that the server cannot shut down without permission from the clients, because the client can just keep reading the same key over and over, extending its lifetime until the client shuts down. I have a system that works this way now, I'm working on removing that behavior. When I want to take a server offline, I have no idea if it will take ten minutes or four hours to drain. It's very annoying.
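A toy version of that drain problem, in Python. The store, its method names, and the 60-minute TTL are hypothetical. Each read pushes the expiry forward, so a client that polls every 10 minutes keeps the server from ever emptying.

```python
class SlidingTtlStore:
    def __init__(self, ttl_minutes: int):
        self.ttl = ttl_minutes
        self.values = {}
        self.expires_at = {}

    def put(self, key, value, now):
        self.values[key] = value
        self.expires_at[key] = now + self.ttl

    def get(self, key, now):
        if key not in self.values or self.expires_at[key] <= now:
            return None
        self.expires_at[key] = now + self.ttl  # the read extends the lifetime
        return self.values[key]

    def is_drained(self, now):
        """True once every key has timed out, i.e. the server could shut down."""
        return all(t <= now for t in self.expires_at.values())

store = SlidingTtlStore(ttl_minutes=60)
store.put("session-1", "cart", now=0)
for minute in range(10, 241, 10):       # the client reads every 10 minutes for 4 hours
    store.get("session-1", now=minute)
still_busy_after_4h = not store.is_drained(now=240)
```

The server has no say in when `is_drained` turns true; that is decided entirely by when the client stops reading.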

> have each client send everything to both of the session state servers

generally a client would write to just one server (server A), and that server would write to the other server (server B). If server A applies the change locally and then confirms it with the client, then syncs the change to server B independently of the client request, that's asynchronous replication. If server A waits until server B has applied the change locally before confirming the write with the client, that's synchronous replication. If the client can only send writes to one of the servers, but can send reads to either server, that's master-replica replication. The easiest thing to reason about is synchronous master-replica systems: client sends a request to A, A applies it to A's storage, A sends it to B, B applies it to B's storage, B confirms it with A, A confirms it with the client. If a client can send a write to either server, that's master-master replication. Master-master replication is harder to implement; what if two different clients write the same key on two different servers at the same time? A conflict arises.
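The difference between the two modes can be sketched in a few lines of Python. Everything here is in-process and the class names are made up; a real system would have a network hop and failure handling between A and B.

```python
class Replica:
    """Server B: just applies whatever the primary ships to it."""
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

class Primary:
    """Server A: the only server that accepts writes from clients."""
    def __init__(self, replica: Replica, synchronous: bool):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = []   # writes not yet shipped to B (asynchronous mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            self.replica.apply(key, value)     # B applies before the client is told "ok"
        else:
            self.pending.append((key, value))  # B catches up later
        return "ok"

    def flush(self):
        """Ship queued writes to B, in order."""
        while self.pending:
            self.replica.apply(*self.pending.pop(0))

sync_b = Replica()
Primary(sync_b, synchronous=True).write("k", "v")

async_b = Replica()
async_a = Primary(async_b, synchronous=False)
async_a.write("k", "v")
lagging = "k" not in async_b.data   # B has not seen the write yet
async_a.flush()
```

In the asynchronous case there is a window where the client has been told "ok" but B does not have the value, which is the window you lose if A dies.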

Concurrency issues are unavoidable in web systems. Uncoordinated users are using the system simultaneously; "everybody wait in line" is not a realistic option in practice.

> What do people do about that, say, at high end sites?

that varies pretty widely. Depends a lot on what you're building, how much money you have, how much time you have, how many users you have, how many engineers you have, how easy it is for you to hire, what skills your current engineering team has, what clients will be accessing the data, etc. The permutations of concerns are innumerable. Lots of people buy hosted redis from redislabs.com or AWS Elasticache because they are either unwilling or unable to run a redis server themselves. That's a great solution for most people most of the time; it works great for a lot of use cases, it's relatively inexpensive, and the providers give you useful tools that you would otherwise have to build yourself.

> Maybe I should think about implementing more.

Are you selling a key-value store? What is your startup?

Maybe you should think about implementing less, and focusing on building the things that actually matter to users. Key-value stores are nice engineering problems, but they're implementation details: users don't know about and don't care about their existence, and their existence does not drive revenue. And that's why most people just buy hosted redis, download a redis client library, and call it a day. It's not technically the best solution, but it's often times the most economical solution.

Many thanks. A keeper.

Of course, you are fully correct about waiting until all the users have timed out. Even if I set the time out to 1 hour of inactivity, a user could do something each 10 minutes for days, as you mentioned.

The only things I would consider adding to my key-value Web session state store would be some little system management commands/queries. I have some now. They are dirt simple to implement. We're talking no more than an easy day. I wouldn't be trying to implement much. I'm learning: In time I'll want a graceful, fast, high performance, and reliable way to move key-value pairs from one server to another one. If Redis or some such has that, then I won't program it -- buy, not build.

Early on I have been intending to do nothing special about hardware redundancy for on-line reliability. I suspect I can get the equipment and software needed for the on-line work to run fine for weeks at a time. That should be enough for alpha test, beta test, version 1.0, and some ad revenue. Then I'll take some simple approaches to hardware redundancy and keep growing. Maybe a Cisco box between the Internet and my Web servers could help bring Web servers on/off line. And, sure, some uninterruptible power supply boxes for the computers, network equipment, lights, etc.!

> Are you selling a key-value store?

Heck no!

> What is your startup?

Might as well say -- close enough to the alpha test which I intend to announce here:

It's a new and very different Web content search engine for the, my guess, 2/3rds of the content on the Web, searches people want to do, and results they want to find currently served at best poorly. It's all safe for work, families, and children, uses no cookies, has no logins, and likely meets the EU GDPR standards.

It makes no use of old data specific to a user: That is, if two users give the same inputs on the same day, then they will get the same results (assuming I do system updates at night!). They may not get the same ads.

There is no use of key words.

It gets new data via an interactive, iterative dialog. In this way, the results are highly personalized. Users get to drill down, focus in, filter out, zoom in. Actually it's more about discovery and recommendation than search.

Or, roughly key word search requires a user looking for Web content (i) to know what they want, (ii) to know that it exists, and (iii) to have key words that accurately enough characterize that content. I suspect that now often what the users are getting is some "Top 40" version based on popularity.

When a user has (i)-(iii), key words can work well. If what they want is really popular, then they can do a sloppy key word search and still get what they want. But IMHO that covers only about 1/3rd of the content, searches, and results of interest.

Some people at Battelle knew some such things long ago.

Key words are especially poor on still images, video clips, and recorded music. All that aside, in the 2/3rds, what a user really wants is the content with the meaning they have in mind. Getting at meaning has been challenging: Key word search has a tough time getting at meaning even for content based on text.

To be clear, in the 2/3rds, the user is looking for something they have never seen before and don't even know exists. So, they need recommendation and discovery of content with the meaning they want.

The key to the whole thing is some applied math I derived, complete with theorems and proofs, to process the data. The applied math is based on some advanced pure math prerequisites. The data manipulations are not heuristic, data science, machine learning, neural anything, or artificial intelligence and, really, not in computer science. Instead, the core is some math -- of course the users won't be aware of anything mathematical.

I had the code running on Windows XP, but then that motherboard quit. I didn't lose any code. Now I'm bringing up the code, for an alpha test, on the box I plugged together, an AMD FX-8350 at 4.0 GHz with Windows 7 64 bit Professional. The code is about 24,000 programming language statements in about 100,000 lines of typing. Yup, it's all in VB.NET -- the language looked fine to me.

Likely in time I should convert to some version of Windows Server.

Now, if I can get simple file sharing not to make a mess out of archive bits, it will help.

This is a misleading title: it’s talking specifically about remote in-memory key-value stores. I thought that this would be against e.g. Badger.

Agreed. It’s clearly targeted at the memcached/redis model.

Fixed above. Thanks!
