Hacker News
A quick look at the Redis source code (heychinaski.com)
149 points by HeyChinaski 1288 days ago | 31 comments



Redis is my favorite example of very clean and beautiful C code: just the right amount of comments and good variable names. It is great work. You can tell Salvatore cares about and has passion for what he does just by looking at his work.

https://github.com/antirez/redis/tree/unstable/src


Wow, it's awesome. Compared to most of the C code I have seen, it's beautiful. I like the long method names, because I can actually understand what they are doing.


Maybe you haven't looked very often, but nice code has been out there for quite some time. Take a look at anything related to the GNOME stack (GLib, GTK+, all the applications) or the Linux kernel, for example. And quite frankly, I stumble upon badly written Python way more often than C.


The code for those might be great, but Redis is beautiful inside and outside. I'm not a fan of the actual output of the GNOME stack. I agree the Linux kernel code is generally great to read and to use.


IMO, OpenBSD and tarsnap are other notable samples of clean C code.


I'd also mention the FreeBSD kernel and Niels Provos' code.


"And quite frankly, I stumble upon badly written Python way more often than C."

I'm curious, which public codebases would you consider to be examples of well written Python?


The often cited requests library, Flask and Django. Some counterexamples are high-profile projects such as IPython and Matplotlib.


Thanks. I'll spend some time with Requests over the weekend. It looks like the person who maintains that library is also behind this http://docs.python-guide.org/en/latest/#writing-great-code


What are you comparing that with? I find it fairly normal for C code written after, say, 2000.

Also, one thing that made a negative impression on me is https://github.com/antirez/redis/blob/unstable/src/adlist.c:

  /* Free the whole list.
   *
   * This function can't fail. */
  void listRelease(list *list)
  {
      unsigned long len;
      listNode *current, *next;

      current = list->head;
      ...
If you write such a comment, you had better make sure it is true. And no, an early "list == NULL" check is not the only thing missing.


Compared to the average C project out there.

Others mentioned the kernel and GNOME. Redis is nice because it is fairly self-contained. It has a good set of networking code, some command parsing, storage. Kind of a happy medium.

Anyway, I didn't look in depth at understanding the semantics of every module.

Here are some more things, besides the NULL check, that I can see are worrisome in that particular piece of code:

* Can len be out of sync with the actual list length?

* Can list->free be defined (non-NULL) when the value wasn't allocated with malloc?

* What if an element is inserted twice in the list? Will list->free be called on an already freed area?

* On a 64-bit machine unsigned long will be 64 bits and on a 32-bit machine it will be 32 bits. Is that a problem?

Which ones do you see?

Now, those are issues I see by looking at the module in isolation. But it is not really isolated. It is part of a larger system. Sometimes there are invariants that are enforced at the input (at the system boundary), so it is not necessary to keep checking for NULL or validating inputs in every single internal function.

That is one good lesson I learned from Erlang. Check your inputs at the boundary, then code for the "happy" path and let errors result in a quick and early failure. Maybe prefer a quick segfault to a dangling pointer, or to wondering later how exactly to handle a NULL pointer that is not really expected to be NULL.
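To make the "check at the boundary, then code the happy path" idea concrete, here is a minimal sketch in C. It is not taken from Redis; every name in it (demo_request, demo_key_hash, demo_handle_request) is made up for illustration. The boundary function is the only place that validates untrusted input, and the internal helper just asserts the invariant so a violation fails fast.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request type for illustration; not a Redis structure. */
typedef struct demo_request {
    const char *key;
    size_t key_len;
} demo_request;

/* Internal helper: codes for the happy path. It assumes the boundary
 * has already validated req and only asserts the invariant, so a
 * broken caller crashes early instead of limping along. */
static size_t demo_key_hash(const demo_request *req)
{
    assert(req != NULL && req->key != NULL);
    size_t h = 5381;
    for (size_t i = 0; i < req->key_len; i++)
        h = h * 33 + (unsigned char)req->key[i];
    return h;
}

/* Boundary function: the only place that checks untrusted input.
 * Returns 0 on success and -1 if the input is rejected. */
int demo_handle_request(const char *key, size_t *hash_out)
{
    if (key == NULL || hash_out == NULL || key[0] == '\0')
        return -1;
    demo_request req = { key, strlen(key) };
    *hash_out = demo_key_hash(&req);
    return 0;
}
```

Whether an assert, a plain crash, or an error return is the right failure mode inside the boundary depends on the project; the sketch only shows where the checks live.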


You'd probably want to zero the length in the list structure and remove the dangling pointer to the first element; otherwise you run the risk of a double free.
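A rough sketch of that suggestion, assuming the list header outlives the call (if the header itself is freed, resetting its fields no longer helps). The types and names below (demo_node, demo_list, demo_list_empty, demo_list_push) are hypothetical and are not the definitions from adlist.h. The loop walks the pointers rather than counting down len, so a len that has drifted out of sync cannot cause an over-read.

```c
#include <stdlib.h>

/* Hypothetical types for illustration; not the ones in adlist.h. */
typedef struct demo_node {
    struct demo_node *next;
    void *value;
} demo_node;

typedef struct demo_list {
    demo_node *head;
    demo_node *tail;
    unsigned long len;
    void (*free_value)(void *value); /* optional per-value destructor */
} demo_list;

/* Push a value at the head. Returns 0 on success, -1 if malloc fails. */
int demo_list_push(demo_list *list, void *value)
{
    demo_node *node = malloc(sizeof(*node));
    if (node == NULL)
        return -1;
    node->value = value;
    node->next = list->head;
    list->head = node;
    if (list->tail == NULL)
        list->tail = node;
    list->len++;
    return 0;
}

/* Free every node but keep the header usable. head, tail and len are
 * reset afterwards, so calling this twice is a harmless no-op rather
 * than a double free. */
void demo_list_empty(demo_list *list)
{
    if (list == NULL)
        return;
    demo_node *current = list->head;
    while (current != NULL) {
        demo_node *next = current->next;
        if (list->free_value != NULL)
            list->free_value(current->value);
        free(current);
        current = next;
    }
    list->head = NULL;
    list->tail = NULL;
    list->len = 0;
}
```

This does not address the case where the same value is inserted twice and free_value is set; that needs an ownership rule, not just a reset.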


It's actually a quick tutorial describing how to add a new command to Redis, not an actual analysis of the source code as I expected.


You might be interested in one of these articles:

http://pauladamsmith.com/articles/redis-under-the-hood.html

http://blog.togo.io/how-to/adding-interval-sets-to-redis

They have both been discussed before though.


Ah yes, I remember the interval sets article. The first post is a lot more in depth than mine. Both great links, thanks.


Yeah, it's a pretty shallow introduction. I'd like to write more articles on the Redis code. The sorted set skiplist stuff is really interesting.


The entire implementation of sorted sets is really interesting, with a dual implementation of ziplists and skiplists being used depending on the number of elements in the set. I've been meaning to write a bit more about Redis internals lately; maybe I'll start on that in my commute hours.

I've got a couple of general articles on adding a command and adding a datatype to Redis at http://starkiller.net, but I don't get too into existing code. I'd be interested in writing a bit more about the other data structures as well as the multiple strategies used for EXPIRE (which recently changed, I believe).


Write: How to write custom C commands?


It's more of an analysis of the structure of Redis' code.


I'm certainly a novice in C, but as I was reading, I wondered about this:

  {"get",getCommand,2,"r",0,NULL,1,1,1,0,0},

  "The fourth field, set to "r", is specifying that the 
   command is read only and doesn’t modify any keys’ value 
   or state.  There are a whole bunch of one letter flags 
   that you can specify in this string that are explained 
   in detail in the nearby block comment.  The field 
   following this string  should always be set to zero, and 
   will be computed later.  It’s simply a bitmask 
   representation of the information implied by the string."
Why would you opt for this when you could define some constants and bitwise OR them together? Isn't that more common than calculating the bitwise flags at run time?

   COMMAND_READONLY | COMMAND_RANDOM | COMMAND_NOSIDEEFFECTS
etc.

I'm sure there's a good reason, but this style seems strange to me.

Maybe Redis makes use of the string later? But I can't help but feel it should build the string from the flags, rather than build the flags from the string.
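For comparison, here is a minimal sketch of the string-to-bitmask approach the article describes. Only the "r" letter for read-only comes from the quoted text; the other letters, the DEMO_ constants and every function and struct name are assumptions made up for this sketch and may not match what Redis actually defines. The mask field starts at zero in the table and is filled in once at startup.

```c
#include <stddef.h>

/* Hypothetical flag constants; Redis's own names and values may differ. */
enum {
    DEMO_CMD_WRITE    = 1 << 0,
    DEMO_CMD_READONLY = 1 << 1,
    DEMO_CMD_RANDOM   = 1 << 2
};

/* Convert a string of one-letter flags into a bitmask.
 * Returns the mask, or -1 if an unknown letter is found. */
int demo_parse_flags(const char *sflags)
{
    int flags = 0;
    for (const char *p = sflags; *p != '\0'; p++) {
        switch (*p) {
        case 'w': flags |= DEMO_CMD_WRITE;    break; /* assumed letter */
        case 'r': flags |= DEMO_CMD_READONLY; break; /* from the article */
        case 'R': flags |= DEMO_CMD_RANDOM;   break; /* assumed letter */
        default:  return -1;
        }
    }
    return flags;
}

/* Hypothetical, cut-down command table entry. */
typedef struct demo_command {
    const char *name;
    const char *sflags; /* human-readable flags */
    int flags;          /* computed from sflags at startup */
} demo_command;

demo_command demo_table[] = {
    {"get",       "r",  0},
    {"set",       "w",  0},
    {"randomkey", "rR", 0}
};

/* Fill in the bitmask of every entry. Returns 0 on success, -1 if
 * any entry has an unknown flag letter. */
int demo_populate_table(void)
{
    size_t count = sizeof(demo_table) / sizeof(demo_table[0]);
    for (size_t i = 0; i < count; i++) {
        int flags = demo_parse_flags(demo_table[i].sflags);
        if (flags < 0)
            return -1;
        demo_table[i].flags = flags;
    }
    return 0;
}
```

The alternative would be writing DEMO_CMD_READONLY | DEMO_CMD_RANDOM directly in the table, which is checked at compile time but makes each row considerably wider.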


In defence of the technique, the command table is quite succinct and arguably more readable at a glance than if there were a bunch of constants |ed together. I have no idea whether this was the original motivation though.


Yes. It is certainly more pithy.

ACTUALLY! It reminds me of a technique Bisqwit used when he made his emulator. He used strings to define the behavior of certain instructions; the strings were actually interpreted at compile time. Though I think this is a C++-specific trick.

http://www.youtube.com/watch?v=y71lli8MS8s

He brings in the instruction table at 1:30.


Fun. This isn't really specific to Redis, but is a good introduction to the sort of thing that C programmers often get up to. You'll see this sort of function table in all sorts of C programs. Take a look at GNU stuff like make and you'll see the same format.
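For anyone who hasn't run into the pattern, a function table in C usually boils down to something like the sketch below: an array of name and function pointer pairs plus a lookup loop. All names here (demo_entry, demo_dispatch and the two handlers) are invented for the example and are not from Redis or make.

```c
#include <stddef.h>
#include <string.h>

/* Every handler shares one signature so it fits in the table. */
typedef int (*demo_handler)(int arg);

static int demo_double(int arg) { return arg * 2; }
static int demo_negate(int arg) { return -arg; }

/* One row of the table: a command name and the function behind it. */
typedef struct demo_entry {
    const char *name;
    demo_handler handler;
} demo_entry;

static const demo_entry demo_entries[] = {
    {"double", demo_double},
    {"negate", demo_negate}
};

/* Look up name in the table and call its handler.
 * Returns 0 and stores the result on success, -1 if name is unknown. */
int demo_dispatch(const char *name, int arg, int *result)
{
    size_t count = sizeof(demo_entries) / sizeof(demo_entries[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(demo_entries[i].name, name) == 0) {
            *result = demo_entries[i].handler(arg);
            return 0;
        }
    }
    return -1;
}
```

Adding a command then means writing one function and adding one row, which is the appeal of the layout.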

Congratulations on the exploring.


Nice post. I've been working quite a lot with the internals of Redis in the past few months, adding custom commands along with the usual skimming through the builtins. Perhaps I should give people some insight by writing a few blog posts as well. It's a really nice piece of software, written in clean, high-quality C. Not sure about the tests in Tcl though :).


I was quite surprised to see the Tcl tests. I'm reserving judgement until I've tried writing one though.


To be fair, the tests themselves are alright, but I'm not too familiar with Tcl and have had problems running them in a CI build, with a lot of redis-servers being left behind. As the tests are, as far as I've seen, basically integration tests, it would be quite nice to have them in something like Python to make them a bit easier to handle.


I would buy a book written about Redis like this.


I find it fascinating to compare the H2 and Derby source code. The first was written by a single man, has more features, and is more compact and faster. The second was 'designed by committee' and evolved over a long period of time.

I would also post a link to my project, which is sort of 'Redis in Java', but it would probably be spam.


The Postgres source code has been a pleasure to read, and the commit logs are outstanding.


The Redis source code is very clean and readable. It's a great example of how a non-trivial code base can be written in good C style.


Excellent write up aimed at curious coders.



