Hacker News new | past | comments | ask | show | jobs | submit login
A fun optimization trick from rsync (plover.com)
186 points by luu 5 months ago | hide | past | web | favorite | 64 comments

I don't like it. You should put the logic where the logic should be, so it's clear to see what was intended.

    typedef int (*MethodFunc)();

    void doMethod() {
        static int methodId = 0;

        static const MethodFunc methodFuncs[] = {

        while (methodFuncs[methodId]) {
            int success = (methodFuncs[methodId])();
            if (success)
                return 0;

        return -1;

In case performance is all you are after, this has much less potential for inlining than the rsync implementation has.

True, but for rsync's case of setting the mtime of a file, performance of this logic is utterly insignificant.

"Everything is fast for small n." -Knuth.

While syncing some million files, I'm sure it'll have an impact tho.

P.S.: Yes, we use it for a million files, regularly.

it's probably not noticeable for a million files

the syscall is going to be many, many orders of magnitude slower than the difference between a direct and an indirect call

IIRC, switch statements bigger than a certain sizes are compiled to jump tables. So it's just a static jump after the code finds a working method.

In theory, it's a very cheap piece of assembly after that point.

With trivial method() functions, you could speed through that loop a million times in a millisecond.

If all of the methodFuncs fail, this will attempt to jump to whatever address happens to be in memory after the last element of methodFuncs. You've forgotten to stop iterating.

There is a terminating NULL at the end of the array to stop the iteration.

oh my... that never went wrong

A terminating null element or logical equivalent in an array is helpful in lots of situations. As one example it simplifies the logic of looking for an element in an array and performing some special operation if it was not found. No bool found variable necessary.

Well yes, but it can easily be removed by accident, and is easy to oversee. See the parent comment

That's not clever, that's language abuse and horrific -- You could do this in so many other ways there is absolutely no point in abusing #includes mid-source-code (without mentioning the code smell around statics.)

Now, as a working curiosity, this is pretty damn cool.

Not really its just a static var on a switch statement nothing really bad about it, in fact its a semi nice way to handle a simple state machine.

Not tooo uncommon either. It's a "reasonable" way to write generators in C (if you're masochistic Python programmer, I guess.)

Switch on the last return point at the top of the function, and make all variables that need to persist between calls `static`.

(I've written it before, and I swear I didn't invent it... Maybe it's just a short skip from Duff's Device? Still, using it to pick up before the last return instead of after is new trick for me!)

Yes, it's basically the idea in Duff's device. Protothreads (http://dunkels.com/adam/pt/) uses a pretty similar trick.

Ah, that's exactly the trick I was thinking of. I think I learned it from [1], but I'm pretty sure Protothreads predates that post. Thanks for the link!

1: https://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

This was modeled on early Fortran where all local variables were static and it was trivial to implement persistent state machines.

To my eyes it also looked like a Duff's device which kept track of the successful state.

Yup. First thing that popped into my mind - with Duff's observation following shortly thereafter. Clever...too clever by half, IMNSHO.

> without mentioning the code smell around statics

Or the code smell around fall through switch statements.

There are legitimate uses for fall through switch statements and/or mid-file includes, and or static function vars, but combining all of them together is not clever, it's a recipe for bugs.

Im with you. This is great and clever, but I would have to maintain code like this.

An interesting trick, but it seems like there would be a simpler way, like having set_the_mtime be a function pointer. Initially it would point to an implementation that figures out which method works and sets the function pointer to point to a particular implementation.

Sounds really clever, but also has the potential to be really opaque and confusing. On the other hand, the trick in the article is quite verbose.

Can be tricky balancing terseness with readability.

> has the potential to be really opaque and confusing

Yes, but.. compared to switch and preprocessor hacks described in the article? Granted it's subjective but I'd find a function pointer that's assigned to once to be way easier to comprehend.

Yeah, I think both methods are somewhat opaque, and the one described here is opaque + hacky. I wouldn't let this through code review.

I would say that this switch hack is lightyears more "clever" (that is, opaque and confusing) than a simple function pointer.

It's more inliner-friendly this way though.

No. Compiler can't assume value of switch_step at compile time, so it would have to inline the whole horrid thing every single time.

no guarantee the signatures are the same on all functions, can't use a fp

In that case wrapper functions can be created where needed to normalize the signatures.

Creating signature normalization wrappers, which contain superset of all the function parameters really complicates the code a lot.

It's like trying to push anything and everything through a little hole.

Have you heard of our Lord and savior, the single method interface?

Awesome, that's horrifying.

One concrete issue with this technique: it doesn't "loop". Meaning, for example, if initially method #3 doesn't work but method #4 does, it'll settle on method #4. If later, method #3 starts working, it won't switch it. Worse, if method #4 stops working, it'll just always fail.

I imagine this isn't an issue for rsync's use but could be a concern if applying it more generally.

According to the article, a step isn't skipped because it had a transient failure, it's skipped because it returns a special error code ENOSYS that means the syscall hasn't even been implemented, and therefore there's no point in ever trying it again.

That's correct. Although it's a very specific use-case.

AIUI with containers it's now entirely possible that a running program may be migrated from a host where a syscall has not been implemented to one where it has.

Just remove the `static` keyword and there you go.

But it's an "optimization", and avoiding previously attempted methods is the whole point of the trick. Otherwise it would be an easy and uninteresting function.

> Just remove the `static` keyword and there you go.

That‘ll probably try all methods on every single file that is processed, so if method #16 is the one that works, you get 15 unnecessary syscalls for every file.

Exactly, that's what the parent comment was saying.

Lets say that there are only 3 methods and method #2 is the one that works initially. On the first call it'll try #1 and when that doesn't work it'll try #2 which will be successful. For the next however many calls it'll jump straight to #2. If #2 starts returning ENOSYS at some point it'll try #3 and if that works then all future calls will jump straight to #3. However if #3 stops working then it will keep on failing until #3 starts working again even if #1 started working after the first call.

I think (I haven't actually tested it) you can do what the parent commenter wants by wrapping the whole thing in a do-loop but the original implementation is questionable enough, sticking an extra loop in there isn't doing anything to help make it less horrible.

  int set_the_mtime(...) {
    static int switch_step = 0;

    int loop_again = 1;

    do {
      switch (switch_step) {

      case 0:
        if (method_0_works(...))
          return 0;

        /* FALLTHROUGH */

      case 1:
        if (method_1_works(...))
          return 0;

        /* FALLTHROUGH */

      case 2:
        if (method_2_works(...))
          return 0;


      switch_step = 0;
    } while (loop_again--);

    return -1;   /* ultimate failure */

My understanding of the code from the article is that if method #4 stopped working, it would just carry on down the list. The static variable just means where to skip to in the case list.

Andrew Tridgell’s thesis is also a very interesting read, as rsync has some pretty innovative algorithms that make the file transfer and remote diffing fast:


Thanks, I hate it. I'm a bit bemused about this bizarro case_N.h trickery. I'm also almost positive __COUNTER__ would be adequate and entirely remove the need for __LINE__ as suggested in the addendum which is likely to be quite large in any non-trivial source unit. I mean, something gcc this way comes I guess...

__COUNTER__ is not standardized. Repeated inclusion is.

So thread unsafe?

Note that if this were C++ (11 or later) this could be done with the static variable in a thread-safe way, since function-scoped static variables initialized by functions are guaranteed to be initialized exactly once. So this code could be something like:

  int set_the_mtime(...) {
    static int starting_from = set_mtime_starting_from(0);
  // Try various ways of setting mtime until it works.
  int set_mtime_starting_from(int starting_from) {
    // horrible switch statement

In C++ you could also make an object (per thread?) and use member vars instead of static vars.

Also, if you're going to use static vars in multithreaded contexts doing this, better to write

   thing = 5;
(via that macro magic) than `thing++`. Then at least nothing too bad should happen...


It doesn't have to be. It just is, because there is no need for the overhead of dealing with threads.

So without actually looking into anything. What id you have a directory structure which is built out of 2 mounts. One local fs and one network mounted. Then lets say method 5 works on one, and method 1 works on the other, both are the only ones that work. Then it would basically fail, depending on the order of directories enumerated, eventhough there is a supported method.

I haven't looked at the code, but the ENOSYS error is more about the capabilities of the kernel. So when we eventually start swapping entire kernels at runtime then it'll break.

Edit: d'oh - I misread, the addendum is doing exactly this...

I don't understand the need for the macro-fu to generate the case values. The use of switch_step++ in each previous case seems to drive that, but couldn't the same be accomplished by setting switch_step to a constant?

  static int switch_step = 0;
  switch(switch_step) { // Falls through between cases

      switch_step = 0;
    case 0:
      if (method_0_works(...))

      switch_step = 1;
    case 1:

So, slightly off topic, but one thing from C that I really miss in other languages is subroutine-local static variables. It always annoys me in Java or C# that you can't really make a static variable that's scoped to a single method or function, you have to scope them to the entire class. It's common enough that a private static variable is only needed in a single method or function, so why not do it the way C does it?

In js you can use `this.x` or `this.methodname.x`

rsync is an amazing tool, full of delightful secrets.

One thing it doesn't have natively (unless it's really well hidden) is the ability to look at two directory structures, and generate a delta of the two out to a separate location (say a USB stick).

A very useful feature for backups where you've got good enough (to run an rsync --dry-run) network connectivity, but not good enough to actually do the transfer.

Unless you actually want whole files stored in a directory tree on your USB, --only-write-batch does exactly that. From the man page:

   you can feel free to write the batch directly to some portable media

Wow .. how long has that been there?

(Sadly it's clear it was in fact there long before I wrote a workaround in response to not finding it.)

can rsync --dry-run and | output to a file?

Yup, that's what I ended up doing [1] - but was mildly surprised it wasn't built in, and very surprised that I couldn't find anyone else's solution.

[1] https://jeddi.org/b/2016/05/28/rsync-from-a-to-b-via-usb/

I think he means a diff that includes the differences in content.

I'd be interested if someone could show me a FP friendly solution to this problem.

Some compilers turn large switches into lookup tables. Using side effects like this however, makes the compiler shy away from this optimisation, and compile it literally. So it's probably not a good idea anymore.

I'm not sure switch performance is a relevant concern for setting the last modification time of a file.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact