Hacker News
Paul Graham inspired the creation of Redis (twitter.com/antirez)
411 points by oneowl on March 27, 2019 | 93 comments



I want to clarify a couple of things. I'm not saying that Paul Graham invented this pattern. Actually, after he mentioned it, I remembered a friend of my father implementing exactly that in QuickBASIC in the late '80s :-) The point is that maybe the Redis design was already inside me, but I needed a trigger: I often think of good things after being triggered. And smart people are more likely to talk about good ideas, old and new. That was the point. Similarly, I believe there are a lot of simple fundamental ideas that can be re-applied to today's technology; as the landscape changes, many things become relevant again.


I'll just take the opportunity to say how grateful I am that the idea of inventing Redis struck you, regardless of how it originated. I use it all the time, both professionally and in my free time. Awesome piece of software. An idea is worthless by itself; execution is everything.


Very much this. Redis is pretty much the Swiss Army knife of persistence around our shop.


I've been meaning to learn redis for quite some time. Do you have favorite resources for this? Thank you.


Just the documentation, to be honest. The basic functions of Redis are quite simple to learn and use, either via the redis-cli client or a language binding. Basically you SET keyname value, and GET keyname to retrieve the value. There's a ton of additional features and data types, but the basic use of it is as a key/value store.
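
For instance, a minimal session with the stock redis-cli looks like this (hypothetical key name):

    $ redis-cli
    127.0.0.1:6379> SET greeting "Hello"
    OK
    127.0.0.1:6379> GET greeting
    "Hello"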


<3


Let's say you get an idea—or, as Pooh would more accurately say, it gets you. Where did it come from? From this something, which came from that something? If you are able to trace it all the way back to its source, you will discover that it came from Nothing. And chances are, the greater the idea, the more directly it came from there. "A stroke of genius! Completely unheard of! A revolutionary new approach!" Practically everyone has gotten some sort of an idea like that sometime, most likely after a sound sleep when everything was so clear and filled with Nothing that an Idea suddenly appeared in it.

-Benjamin Hoff, The Tao of Pooh


>"I often think at good things"

The idiom is "think _of_", not "think at". :)

(BTW, Antirez, your English is very good; I'm aiming for helpfulness, not nit-picking / criticism!)

-- and given your awesome contribution to the world (Redis), you definitely do think of very good ideas!


Thanks! Sorry, in Italian it's "pensare a" (literally "to think to"), so I always get confused :-) I'll try to remember.


I quite like the image of an idea floating somewhere near me so that I can 'think at' it!


> I quite like the image of an idea floating somewhere near me so that I can 'think at' it!

For languages like Italian (as well as my own native language), the image of 'thinking at' makes sense as an analogue to 'looking at', which you'd indeed do if the idea were floating by in your vicinity.


Very true!

I think the closest English analogue is 'think about', which taken extremely literally does also place the thinker _about_ (in the vicinity of, around) the idea.


Man, some of the comments on here are really disheartening. What's the deal with trying to humble people with long-winded, tangential counterpoints, gotchas, and condescending questions? Constantly proving one's intellect seems to be a prevailing theme on HN and I'm not seeing how it adds to the quality of the content. I'm sure someone will unearth the irony in this and let me know soon enough.


Best not to read the comments. There's literally nothing you can do about the phenomenon you're describing.



Hmm, I thought this pattern was really common? That is, appending everything to a file and reading back from the file when there's a reboot. I constantly use it when a database (or redis for that matter) is simply overkill for my use case.

Here's a 34-line implementation I use on a Node production system. It writes struct-like JavaScript objects that represent events to disk. When reading them back, I do a fold (or .reduce) to build the state.

And yes, it could be way smarter (writing to memory and disk), but YAGNI has been working out pretty well so far.

  // Node.js imports this snippet relies on:
  const { createWriteStream } = require('fs');
  const { readFile } = require('fs').promises;

  class EventStore {
    constructor(file) {
      this.file = file;
      this.cache = null;
    }
  
    async appendEvent(event) {
      // Purge the cache for entries since we mutated the store.
      this.cache = null;
      return new Promise((resolve, reject) => {
        createWriteStream(this.file, { flags: 'a' })
          .on('error', reject)
          .on('close', resolve)
          .end(`${JSON.stringify(event)}\n`);
      });
    }

    async readEvents() {
      if (this.cache !== null) {
        return this.cache;
      }

      try {
        const data = await readFile(this.file, 'utf-8');
        // Replay the journal: one JSON-encoded event per line.
        const lines = data.split('\n').filter(line => line);
        const events = lines.map(line => JSON.parse(line));
        this.cache = events;
        return events;
      } catch (error) {
        // No file yet (or unreadable): treat it as an empty event log.
        return [];
      }
    }
  }
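
Usage is then append-on-write and a fold on read, e.g. (hypothetical event shape, run inside an async function):

  const store = new EventStore('./events.log');
  await store.appendEvent({ type: 'deposit', amount: 50 });
  await store.appendEvent({ type: 'deposit', amount: 25 });

  // Rebuild current state by folding over the full event history.
  const events = await store.readEvents();
  const balance = events.reduce((sum, e) => sum + e.amount, 0); // 75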


It's pretty common. All the wonderful proprietary file formats from the 90s (and probably before, and probably after) basically boil down to writing raw C structs to disk.

You can try it yourself... mmap a file, memcpy some structs there, do the reverse ... and enjoy!

(Obviously depending on the memory layout of one C compiler on one architecture does not make for portable files. But that was never a design goal of this system.)


I spent part of the mid-1990s hacking a proprietary data streaming protocol written in lex and yacc. It was what we had back then.


I've been using (and developing a fork of) NeDB [0] that does exactly what you describe: an in-memory database with append-only logs of changes for file-system persistence.

On startup, and optionally at regular intervals, it "compacts" the database by reducing all events to a single JSON string.

The README links to an article by @antirez, Redis Persistence Demystified [1]. It's been educational studying how it works.

[0] https://github.com/louischatriot/nedb#persistence

[1] http://oldblog.antirez.com/post/redis-persistence-demystifie...
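
A sketch of what that compaction amounts to (my own illustration, not NeDB's actual code; assumes newline-delimited JSON events and a caller-supplied reducer):

  const { readFile, writeFile, rename } = require('fs').promises;

  // Collapse the append-only journal into a single snapshot line, so
  // startup no longer has to replay the full event history.
  async function compact(file, reduceEvents) {
    const data = await readFile(file, 'utf-8');
    const events = data.split('\n').filter(Boolean).map(l => JSON.parse(l));
    const snapshot = JSON.stringify(reduceEvents(events));
    await writeFile(`${file}.tmp`, `${snapshot}\n`);
    await rename(`${file}.tmp`, file); // atomic replace on POSIX
  }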


> Hmm, I thought this pattern was really common?

While true, Redis predates Node, so your example does not support your use of the past tense.


A pattern is not specific to a language. Read again.


To offer a slightly alternative perspective on this, I actually think this type of "article" ("listicle"/"tweetacle"?) can have negative effects. In my mind, it lends credence to the very toxic notion of "value of ideas" over "value of execution".

The former is something that has all sorts of knock-on effects: backward IP laws "protecting" ideas, perverse incentives within large corporations with outspoken "idea men" being promoted ahead of doers, non-technical founders with "high-potential ideas" sucking up investment and expecting to execute with technical hires on untested theories.

I'm not saying any of the above applies in this case, of course, but the fact is that it is you who built Redis, not pg, nor the many others who've had similar ideas, and I think the above tweet thread lends undue weight to many of the above negative trends in our industry (and, in general, in the recorded history of IP/invention-credit battles).


It's a matter of interpretation. IMHO the tweet shows how valuable Hacker News itself is, not Paul Graham's ideas (but then HN was created by pg, so, yep, it also gives him credit). If you take a number of people who are good at doing things and put them together, this will result in more things created, because of a natural process of idea exchange / triggering. I'm an example of a very isolated programmer, so this applies especially to folks in my condition, but at this point I guess there are quite a few of "us".


> In my mind, it lends credence to the very toxic notion of "value of ideas" over "value of execution".

Ideas are required for execution; they don't have independent value (an idea without execution delivers nothing), but neither does execution (you have to have something to execute).


Of course. I didn't say otherwise: what I'm talking about is value. Ideas are often (usually?) assigned greater value than execution itself, which is absurd.

In actual fact, even beyond "inception", most execution requires ongoing iteration and innovation. No final product is solely the result of its inspiration.


We might have a biased opinion on the subject, at least those of us heavily involved on the execution side. Execution is much more subject to available resources.


Irrespective of what the perception of pg is, the tweet was pretty clear that pg only "inspired" or "triggered" the creation of Redis. That gives pg no more credibility than the moon gets for inspiring Van Gogh to paint Starry Night.

Attributing inspiration takes nothing away from an artist, nor from the work. I don't even think it meaningfully changes the credibility of the inspiring object.

I could randomly paraphrase and quote lines from The Art of Computer Programming, for instance, and if I had a wide enough audience I'd probably bring about a Great Renaissance in the field of computer science and programming. History might remember me as the Greatest Idea Man of all time, but that seems unlikely.


“Look at how awesome I am for inventing Redis” wouldn’t have been as interesting, though.

Ideas are cheap and plentiful if you have the eye for them, and learning the sources of inspiration other makers had can be informative if you also want to make things.


In fact your main problem is descoping a large number of ideas to get to a small, coherent set.


> In my mind, it lends credence to the very toxic notion of "value of ideas" over "value of execution".

On the spectrum of sh#t HN believes, this one, along with meritocracy, has got to go. No one is arguing that ideas have _more_ value than execution. But ideas clearly have value: we have entire buildings devoted to them. The internet was built to transport them. Getting exposure to them is deemed critical for the development of our young and the future of humanity.


You might have some points here (I certainly agree about meritocracy), but the examples are pretty bizarre. How do buildings devoted to ideas prove their inherent value? The internet was built to transport information: knowledge, education, documentation, history, insight. "Ideas" are none of these things: ideas are inspirational and transient in nature.

(Sure, the internet's communication protocols do transmit "ideas" in numerous ways: mainly in terms of shared experience and evolving iterative conclusions from shared knowledge, but it's hardly the intent of its creation).

Getting exposure to past knowledge and experience is critical to our young, and to the future of humanity, but, frankly, the glorified fantasy surrounding some popular histories does more to further the idea of a privileged "celebrated few" with the genius of inspiration than to instill any sense of the true study, investigation, and work put in by those who have achieved great things in the past (think even of old stories like Archimedes' bath or Edison's bulb or Newton's apple, or whichever more modern example: placing individuals on a sort of divine pedestal to be blessed with such ideas).

You're calling out meritocracy, but it's exactly this kind of championing of ideas that leads to it. Salvatore has put the long, quiet, sometimes relatively thankless hours into making this a reality, and it could seem to some that pg is being put on a pedestal for being momentarily inspirational: is that not the very thing you're calling out?


No pedestals.

> pg is being put on a pedestal for being momentarily inspirational

I am certainly not doing that. I don't think what happened wrt pg and antirez even qualifies as an idea or inspiration. A chain of events? A catalyst? Memcached was already being used like a data-structure server before Redis came along. Deep thought did go into Redis's design, which is why it works so well for so many applications.

Ideas in and of themselves have value and are needed. Really good ones take a long time to create and hone. Edison's bulb is a great example of brute force. Newton's Apple always seemed like a creation myth. History goes a lot deeper than pull quotes.


Trivia: when I wrote Redis I was not aware of Memcached. I had only heard the name multiple times, but I did not know what exactly it was.


The comment should probably still be around, right? Does anyone have the link?



Here's the one, I believe. Linked to parent for context:

https://news.ycombinator.com/item?id=14754

PS: They're basically the same, really, just adding another candidate to yours.


Wonder if HN still has all 20 million comments in a hash table?


“System prevalence[1] is a simple software architectural pattern that combines system images (snapshots) and transaction journaling to provide speed, performance scalability, transparent persistence and transparent live mirroring of computer system state.“ — https://en.m.wikipedia.org/wiki/System_prevalence


PG was also involved in the inception of Reddit: It was PG who gave Alexis and Steve the idea to make something like reddit, and also gave them the tagline "the front page of the internet".[0] PG had vetoed their initial idea to create a food-delivery app and then called them back and asked them to come up with something new. [0]: https://www.youtube.com/watch?v=5rZ8f3Bx6Po


Actually, I wonder why we don't yet have a programming language and runtime which, after a shutdown, reloads exactly like it was.


I believe you're thinking of Smalltalk there - https://en.wikipedia.org/wiki/Smalltalk#Image-based_persiste...


Unisys/Burroughs mainframes running COBOL ca. 1992 (and probably earlier) had this as a feature by default. You could walk up to one of these boxes right in the middle of a scary finance, payroll, whatever job, yank the cord out (not recommended), plug it back in, and the machine would boot up and return to what it was doing, usually with no ill effect.

Despite switching to a dozen new fad languages since then, programmers have yet to get around to replicating this in any broadly-adopted modern system. If you want to put the effort in to engineering it yourself, of course, you can. But it was nice to not have to engineer it at all.

When greybeards seem cranky, it's because of stuff like that.


If I recall correctly, Emacs currently does this. During the build it (slowly) loads all of the elisp into memory and then dumps out the memory image of it after it’s all been compiled.

I’m a bit hazy on the details, but I think it involves calling unexec(). https://lwn.net/Articles/673724/

Edit: just noticed a sibling comment mentioned this too...


Only for built-in elisp, as part of the build process. User configuration (including packages) is reloaded from scratch on each start.


Some Lisps do this. For example, SBCL's sb-ext:save-lisp-and-die function is a "quit" that also persists the memory state. When you load this image later, you get it back as it was when you "quit" last time.


"Actually, I wonder why we dont have yet a programming language and runtime which, after a shutdown, reload exactly like it was."

It's been tried many times, but you die the death of a million cuts. (Not just a thousand.) After you're shut down, the world moves on, and then you get restarted. Now your hardware has changed, your network connections have all changed, your version may have changed so the stored data may be all different, and, perhaps surprisingly worst of all, once something gets corrupted, it's corrupted forever. No "reboot" for you.

It turns out that the "reboot" step is inconvenient in the short term, but in the long term it enforced a minimum amount of discipline on programmers, making sure they don't get into an unrecoverable state.

You'll note that, if anything, the trend continues in that direction. All the recent operational work, in Docker, in things like Ansible and Chef and Puppet, in "serverless", in reproducible builds for binaries, etc., can all be read through the lens of taking things that were previously images of unknown provenance and ensuring that we can always rebuild them from an initial definition. We're in fact headed even farther away from image-based systems that reload in their previous state.


What strategy do you use to migrate old data to a new codebase? Different business logic would probably want different answers in this regard, so I doubt that there's a one-size-fits-all solution.


Emacs Lisp (and others!) does this. It's very slow because memory and disk sizes got larger, but disk transfer bandwidth didn't keep up.


Paul Graham inspired many people. His Hackers & Painters is something I enjoy going through now and then.



Those footnotes are pretty funny.


Doesn't AWS's Aurora database sort of work that way too?


The pattern in general is used by more or less every database in the form of write-ahead logging.

https://en.wikipedia.org/wiki/Write-ahead_logging


I'd be keen to see a simple example of this pattern in Lisp (or another language). Does anybody have a good link?


I use it by default for new Node.js projects (most of which are experiments but some end up in production, and have been running without problems for years).

It's a simple pattern to implement. To make it a bit easier to use repeatedly, I've got a small helper class called JournaledCollection. You pass it serialize+deserialize callbacks for your item type, and it takes care of persistence in event logs.

For a while I was thinking about releasing my helpers as a project called LAUF, short for "Lame-Ass Un-Framework". Then one could say: "Most of my projects are LAUFable, I don't need anything more serious." (Awful dad jokes are a solid reason to publish open source, right?)

Never got around to it though, but if you're interested, I could put together an example.
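
For a taste, here's a rough sketch of the idea (not the real code; persistence is just one JSON-serialized item per line in an append-only journal):

  const { appendFile, readFile } = require('fs').promises;

  class JournaledCollection {
    constructor(file, serialize, deserialize) {
      this.file = file;
      this.serialize = serialize;     // item -> plain object
      this.deserialize = deserialize; // plain object -> item
    }

    // Append one serialized item per line.
    add(item) {
      return appendFile(this.file, `${JSON.stringify(this.serialize(item))}\n`);
    }

    // Replay the journal to rebuild the collection.
    async load() {
      try {
        const data = await readFile(this.file, 'utf-8');
        return data.split('\n').filter(Boolean)
          .map(line => this.deserialize(JSON.parse(line)));
      } catch {
        return []; // no journal yet
      }
    }
  }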


I like the name -- and it sounds worth sharing!


    sjl at alephnull in ~/Desktop 
    ><((°> sbcl 
    
    [SBCL] CL-USER> (defparameter *name* "World") 
    *NAME* 
     
    [SBCL] CL-USER> (defun foo () (format t "Hello, ~A~%" *name*)) 
    FOO 
     
    [SBCL] CL-USER> (sb-ext:save-lisp-and-die "session.core") 
     
    sjl at alephnull in ~/Desktop 
    ><((°> sbcl --core session.core 
    
    [SBCL] CL-USER> (foo) 
    Hello, World 
    NIL


Here's one I'm currently using in node (wrote it as another comment): https://news.ycombinator.com/item?id=19499799


Redis is a great example of unbundling a pattern into a lib/service/business.


Probably the next logical idea would be to apply to YC with Redis?


PG inspires as usual!


> When the application would be restarted, reading back the log would recreate the in-memory data structures. I thought that it was cool, and that databases themselves could be that way, using the programming language's data structures with a networked API.

That is literally how databases work: In-Memory + WAL + Data Files on disk. You could, in theory, live without the Data Files and keep just a big WAL.


Relational databases (except maybe MemSQL) treat the file system as the source of truth. And usually the file system is bigger than the memory, so they need to constantly update their cache with more relevant data.

Redis loads everything into memory and doesn't keep the structure in the file system, only the log, recreating the structure from log+snapshot.
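
A sketch of that recovery path (hypothetical file names; applyEvent is whatever function folds one logged event into the state):

  const { readFile } = require('fs').promises;

  // Rebuild in-memory state: start from the last snapshot, then replay
  // every event appended to the log since that snapshot was taken.
  async function recover(snapshotFile, logFile, applyEvent) {
    let state = {};
    try {
      state = JSON.parse(await readFile(snapshotFile, 'utf-8'));
    } catch { /* no snapshot yet: start empty */ }

    let log = '';
    try {
      log = await readFile(logFile, 'utf-8');
    } catch { /* no log yet */ }

    return log.split('\n').filter(Boolean)
      .map(line => JSON.parse(line))
      .reduce(applyEvent, state);
  }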


Except that most databases don't store their data in anything resembling "programming language data structures". You get tables, rows, and columns (or maybe a bit of JSON if you're lucky) instead of native integers, strings, lists, sets, and dictionaries.


Object and document databases have been around for decades.

Likewise, ORMs, which allow for higher-order types etc., have been around since WebObjects, i.e. also for decades.


The primary purpose of an ORM is to overcome the "impedance mismatch" between relational databases and programming language data structures. There's no need for an ORM if you can store your data structures directly in the database.


> ... if you can store your data structures directly in the database.

ABSTRACT. Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation). It provides a means of describing data with its natural structure only—that is, without superimposing any additional structure for machine representation purposes. Accordingly, it provides a basis for a high level data language which will yield maximal independence between programs on the one hand and machine representation and organization of data on the other.

E.F. Codd. 1970. A relational model of data for large shared data banks. Commun. ACM 13, 6 (June 1970), 377-387.

You propose to reintroduce a problem that they absolutely wanted to get rid of 40 years ago. Just imagine that you first have to figure out how to painstakingly parse serialized Python dictionaries before you can access the data in another program written in, e.g., Rust.

It clearly amounts to UNSOLVING a problem that is now SOLVED ALREADY.


Well, Redis allows me to store a JavaScript array, add some more with a Ruby client, remove some items with a PHP client, and finally read it back as a Python list just fine. What's the problem that has been unsolved? :)
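
For example (sketched here with redis-cli, but each command could just as well come from a different language's client, since the server only ever sees the same wire protocol):

    127.0.0.1:6379> RPUSH mylist "a" "b"
    (integer) 2
    127.0.0.1:6379> RPUSH mylist "c"
    (integer) 3
    127.0.0.1:6379> LPOP mylist
    "a"
    127.0.0.1:6379> LRANGE mylist 0 -1
    1) "b"
    2) "c"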


https://msgpack.org is absolutely fine. I wasn't criticizing msgpack. You can also serialize the bytes of an array of C structs, if you see what I mean.


If this is a primary purpose of an ORM, then I wish I knew of one that isn't utterly failing at that.

One thing I've learned about this "impedance mismatch" is that it isn't a syntax thing, it's a fundamental difference in the way of viewing the world. The way you store data about the world is different from the way you model that world dynamically, with objects. I find it safer to always split out the "business model" from the storage layer, so that those different views don't interfere - and once you do that, you may as well implement the storage layer in a relational way.


.. and the code you implement to connect that business layer to that storage layer is an ORM.

The idea that the ORM forces the storage layer to a particular representation of the business layer is only true if they implement the ActiveRecord pattern, which isn't universal.


That is an angle.

Another way to say it is that an ORM is a workaround for the fact that most languages are VERY poor at manipulating data.

There exist 2 main reasons for the "impedance mismatch":

- Paradigms. 2 different paradigms will be at odds. Example: functional and OO. This is OK.

- Limitations. The relational model is absolutely superior to and more expressive than OO/functional at manipulating data. You need A LOT of machinery to recover that power. This is not OK.

However, this doesn't change the fact that OO is OK. Similar to how a KV store is fine, but certainly an RDBMS is much more capable.


> Other way to say it is that ORM is a workaround to the fact most languages are VERY poor at manipulate data.

This is why I love Clojure (and a particular style of Javascript). Destructuring and a good library of object/array manipulation functions make an environment well-suited to transforming data structures (which is precisely what I want to do in most programs I write), and I find I do not need an ORM where this type of data-focused programming is supported.


How is relational more powerful than OO/Functional if you can’t define an infinite dataset with it?


You can't?

Most people only see the relational model as it exists inside an RDBMS. That is like judging OO by what you can do in Mongo.

RDBMSs have some weird and well-considered restrictions for their use case.

But read a little about the relational model, and you'll find that none of it depends on SQL or says anything about how the storage works.

P.S.: It's important to note where it is MORE powerful. Remember what the creator of Pascal said:

https://en.wikipedia.org/wiki/Algorithms_%2B_Data_Structures... Algorithms + Data Structures = Programs

You could say OO/functional leans more to the "Algorithms" side, and the RM to the "Data Structures" side. OO/functional doesn't say much about how to operate on data; most of it is an exercise for the reader.

Instead, the RM gives a clear answer and defined operations for that.

You need to "spice up" one to make it more useful for the other part of the equation. The RM, kept pure, is certainly incredibly limited (it can't even print "hello world!"), but that's because it's only meant to give a solution for how to transform data...


It does if you use an object database like Realm.


I think you're slightly missing the point -- for me, a certain unique "awesomeness" lies in specifically being able to literally pipe stdout back to stdin and get back the exact same data structures.


Can't that be done easily by just printing stdin back out to stdout in the first place?


ok, without doing that :p


Strange title.

Does merely recalling a language pattern really count as inspiring?


In the tweet, the author of Redis says "[Paul Graham] ... inspired the creation of Redis". So it probably counts!


Ok, we put that in the title above for disambiguation.


Of course Erlang has had ETS (and of course you can use gen_servers of various kinds for this) built in forever. Redis is fantastic but I think there have been many examples of prior art before this tweet!


Lots of prior art but the tweet didn’t say PG inspired the paradigm; rather just Redis specifically.

Anecdotally, I’ve used plenty of in-memory DBs (and written some too), but Redis has been by far my personal favourite.


Perl's `Storable` springs to mind here - I know many places who have mini-"databases" that are effectively straight dumps of Perl variables to disk that get loaded in, worked on, then saved back out.

I guess Smalltalk's images are the ur-example here?


Is this really a brilliant idea though?

Everybody knows RAM keeps getting cheaper, while mechanical disks can't be made much faster, and SSDs have reliability limits.

It seems entirely logical to expect databases to work in RAM first and then commit to disk, for a large performance improvement.


When I started Redis people were like FTW data in memory?!


I was talking to a database teacher, and I tried arguing for ACID on an in-memory database, arguing that there was only a minimal window for database corruption in a transaction system, since the worst-case scenario would seem to be the loss of very few transactions.

He was not really listening to what I was saying or to my arguments, even though a system like Redis seems like a more-than-acceptable compromise.

It still seems a few people are reluctant to accept an in-memory database.


VoltDB is probably a better example of an in-memory database with ACID semantics. Redis usually won't be deployed in a way where it fsyncs after each write operation, because that runs too slow. You need to do some tricks like batching fsyncs to get decent performance (Redis's appendfsync everysec AOF setting batches them to roughly once per second). However, SSD fsync performance seems very high: I remember benchmarking an EC2 i3 instance that was giving 20000 fsync/s, but if Redis is giving you something like 80k writes/s, then even an fsync this fast is going to be a bottleneck. [https://redis.io/topics/benchmarks]

I think people who are deploying redis are willing to tolerate lower durability guarantees for the extra performance. A lot of the time redis is some kind of cache and there is a way of reconstructing the real data from more durable storage in the case of failure. Or people are willing to lose a second of data or whatever fsync interval people are using.
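
A minimal sketch of that batching trick in Node (my own illustration, hypothetical file name): writes land in the OS page cache immediately but are only forced to stable storage once per interval, trading up to a second of durability for throughput.

  const fs = require('fs');

  const fd = fs.openSync('journal.log', 'a');
  let dirty = false;

  // Appends return as soon as the data is in the OS page cache.
  function append(line) {
    fs.writeSync(fd, line + '\n');
    dirty = true;
  }

  // One fsync per second covers however many writes arrived meanwhile.
  setInterval(() => {
    if (dirty) {
      fs.fsyncSync(fd);
      dirty = false;
    }
  }, 1000);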


s~FTW~WTF~

of course, _now_ they are like "FTW" :)


This is the definition of hindsight bias.

The best ideas often seem obvious in retrospect.


I've had plenty of good ideas that I haven't bothered to implement because I have no need for the solution. That doesn't mean if someone else implements the idea that I'll have hindsight bias about its value. It means it wasn't valuable enough to me to direct my behavior.


What I am saying is that this idea is totally obvious, but until somebody implements it to show it works, nobody dares to try it because it sounds like a bad idea.



