It is for lesser mortals, too. I think everyone with half a CS degree should be able to pull it off.
That said: it's an implementation which doesn't behave like normal regexps (it's a left-shortest match first instead of the common left-longest) and it's a back-tracker. Because this implementation doesn't allow much choice, expensive examples will look contrived, but extending this implementation with character sets, groups and choices (i.e. a|b) will make its exponential nature felt very quickly.
It's a natural approach to anyone familiar with the pattern matchers explained in old Lisp books. This is not just hindsight because when I first read this chapter in Beautiful Code I stopped after the problem statement and wrote something similar before reading the rest.
(I didn't choose exactly the same grep sublanguage to implement -- it was something like supporting the most basic escaping instead of ^ and $.)
There's other whitespace in there that does get counted - for example, LOC is different depending on whether you prefer K&R braces. And since comments compensate for things that code doesn't directly express, it probably makes sense to include them too.
If we're trying to measure code complexity I think we have to ask "complexity to whom". If it's to humans, we should measure the code as it appears to humans. If it's to machines, then "line" is meaningless and we should look at tokens, the way PG suggested years ago.
Gosh dang, I have to remind myself that you're more than entitled to technical opinions as well!
A tight definition of SLOC isn't something we have. We do have sloccount, or we can install it, different tool from `wc -l` (we can sed the blank lines out first).
Really we should be calculating cyclomatic complexity like adults. I'm told this isn't straightforward! But what is the use of heaven if a man's reach cannot exceed his grasp.
The title of this article is misleading since it only matches a small subset of what's possible with regular expressions. It's like saying "A novel programming language linker and compiler in 30 lines of code" while it can only do simple arithmetic operations and is lightyears away from being turing complete.
I guess we can address this objection by reverting to the article's actual title, which is what the site guidelines call for anyhow. (Submitted title was 'A Regular Expression in 30 Lines of C')
This just illustrates the power of recursion. With it, the call stack becomes your invisible data structure - it keeps all saved program state deep down. The same approach is used to write recursive-descent parsers for languages with an inherently recursive definition.
You could also devise an iterative solution that wouldn't be more complex - then you simply have to maintain explicit stacks for operators and operands.
Rob Pike's simple C regex matcher in Go - https://news.ycombinator.com/item?id=32434412 - Aug 2022 (75 comments)
A Regular Expression Matcher (2007) - https://news.ycombinator.com/item?id=22317138 - Feb 2020 (22 comments)
A Regular Expression Matcher (2007) - https://news.ycombinator.com/item?id=12199836 - Aug 2016 (34 comments)
A Regular Expression Matcher (2007) - https://news.ycombinator.com/item?id=10654150 - Dec 2015 (1 comment)
A regular expression matcher By Rob Pike and Brian Kernighan (2007) - https://news.ycombinator.com/item?id=5672875 - May 2013 (45 comments)
A Regular Expression Matcher in 30 lines of C - https://news.ycombinator.com/item?id=2723366 - July 2011 (9 comments)