… which is one of the pressures driving Docker adoption. Each process tree gets its own root filesystem to trash with its multitude of dependencies. DLL hell, shared library hell, JDK hell, Ruby and Python environment hell… a lot of it can be summed up as "userland hell". Docker makes it easy to give the process its own bloody userland and be done with it.
My experience is that "configure;make;make install" has a much higher probability of success (>95% regardless of whether you are running the most up-to-date version of the OS) than something like cmake (which seems to hover around 60% if you try to build on slightly older systems).
1. How do you extend Boyer-Moore to arbitrary regexps?
2. Does "not looking at every byte" really matter when the strings being searched for are typically so short that you're gonna hit every cache line anyway?
1. You find the fixed part of the regexp, run Boyer-Moore to quickly filter the haystack, then do the deeper NFA matching only where the fixed bits are present. One example from later in that email thread: ^foo is the same as a fixed string search for \nfoo
I presume there are ways you can bound your fixed searches, and have your NFA work "backwards" through the string after a Boyer-Moore hit (or both backwards and forwards from fixed points, depending on where in the string the fixed bit is).
2. No, not for some people's small-file usage. But grep is used in many, many cases on huge log files (and mail files/queues and so on), so there is a valid case for optimizing those uses. Production sysadmins deal with multiple GBs of logfiles regularly, and grep them a lot.
You might be surprised. Consider that counting to 2 million by 5 is significantly faster than counting by 1. If you also go ahead and start prefetching the memory, I would expect this to speed things up considerably.
Edit: I should say I would still think things should be benchmarked to really know...
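To make the "counting by 5" point concrete, here is a rough sketch that counts how many haystack bytes a skip-based search actually inspects. It uses Boyer-Moore-Horspool (a simplified variant, not full Boyer-Moore), and the function name and the examined counter are made up for illustration. It only counts byte comparisons; it says nothing about cache lines or wall-clock time, which would still need a benchmark.

```python
def horspool_find(haystack: bytes, needle: bytes):
    """Return (index, bytes_examined); index is -1 if needle is absent.

    Boyer-Moore-Horspool: compare each window from its last byte backwards,
    then shift by a distance looked up from the window's last byte.
    """
    m, n = len(needle), len(haystack)
    if m == 0:
        return 0, 0
    # Shift for each byte of the needle except its final position;
    # bytes that do not occur there allow a full shift of m.
    shift = {b: m - 1 - i for i, b in enumerate(needle[:-1])}
    examined = 0
    pos = 0
    while pos + m <= n:
        j = m - 1
        while j >= 0:
            examined += 1
            if haystack[pos + j] != needle[j]:
                break
            j -= 1
        else:
            # Inner loop finished without a mismatch: full match at pos.
            return pos, examined
        pos += shift.get(haystack[pos + m - 1], m)
    return -1, examined


if __name__ == "__main__":
    hay = b"x" * 1000 + b"needle"
    index, examined = horspool_find(hay, b"needle")
    print(index, examined, len(hay))
```

On input like this, where most windows mismatch on their last byte, the number of bytes examined comes out far below the haystack length. Whether that turns into a real speedup once every cache line gets touched anyway is exactly the part that has to be measured.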
>How do you extend Boyer-Moore to arbitrary regexps?
As mentioned in the follow-up messages, some regexps can be translated to one, or at least a few, fixed-string searches (/^[Gg]rep/ is really a search for "\nGrep" or "\ngrep"). If you can't do that, your regexp may still contain longish fixed strings that let you discard most lines via Boyer-Moore before you have to run the regexp engine.
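A minimal sketch of that idea, not how grep actually implements it: the function names are hypothetical, the caller is assumed to supply a literal that every match of the pattern must contain, and the plain substring test stands in for Boyer-Moore (it uses whatever substring search the Python runtime provides).

```python
import re


def grep_with_prefilter(lines, pattern, literal):
    """Yield lines matching `pattern`, running the regex only on lines
    that contain `literal`.

    Assumption: `literal` is a fixed string that every match of `pattern`
    must contain; picking it is left to the caller.
    """
    rx = re.compile(pattern)
    for line in lines:
        if literal not in line:
            continue  # cheap fixed-string rejection, regex engine never runs
        if rx.search(line):
            yield line


def any_line_starts_with(text, words):
    """/^[Gg]rep/-style match expressed as fixed-string searches:
    '^word' on some line is the same as finding '\\nword' in '\\n' + text."""
    padded = "\n" + text
    return any("\n" + w in padded for w in words)


if __name__ == "__main__":
    log = ["error: disk full", "warning: disk ok", "error code 42 at disk 3"]
    print(list(grep_with_prefilter(log, r"error.*disk \d", "disk")))
    print(any_line_starts_with("foo\nGrep is here", ["Grep", "grep"]))
```

The prefilter only pays off when the literal is rare in the input; if nearly every line contains it, the regex engine runs on nearly every line anyway.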
The short version is that in the early days, pointers were declared like "char p", and for arrays declared like "char buf" a pointer was created, like this (in modern C):
char buf_storage; /* this is not actually visible to the programmer */
char *buf = &buf_storage;
So pointers and arrays really were the same thing. Of course, this didn't fly anymore when structs were introduced (or when embedding arrays in structs was allowed, at least) so that's when arrays were introduced as a first-class type, with weird semantics so that existing code that used array variables as if they were pointer variables wouldn't break.
Specifically I was in need of making an "object" that could have attributes set on it. You can either use lambda:anything_syntax_ok or an empty class. The empty class is much bigger. Lambda is an ugly hack but it works...
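For comparison, a small sketch of the options, assuming this is about Python. The names bag, Bag, and the attribute values are made up. types.SimpleNamespace (standard library since Python 3.3) is a third option not mentioned above.

```python
from types import SimpleNamespace

# Option 1: a throwaway lambda. Function objects accept arbitrary attributes.
bag = lambda: None
bag.host = "example"
bag.port = 8080


# Option 2: an empty class, instantiated once.
class Bag:
    pass


obj = Bag()
obj.host = "example"
obj.port = 8080

# Option 3: SimpleNamespace, which exists for exactly this purpose.
ns = SimpleNamespace(host="example", port=8080)

# A bare object() does not work: it has no __dict__, so setting an
# attribute on it raises AttributeError.
```

The lambda trick works because functions carry a `__dict__`. It is shorter than the class, but anyone reading `bag` later will find a function, which is part of why it feels like a hack.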
Sure, but more radical OO languages like Smalltalk were full of getters/setters too (e.g. in Smalltalk all member variables are private, so getters/setters are the only way to access them), so it makes sense to think that this was just the way to go.