Hacker News new | comments | ask | show | jobs | submit login

Maybe it is just me, but I've seen this issue multiple times where a programmer used a regular expression thinking they were clever, only to have it backfire later when there was some corner case not covered by their regex (which likely would have obviously been caught if they had just taken the time to write out each case as an if statement). You're usually just gaining complexity in exchange for fewer lines of code, and I'm not sure that is always the best tradeoff. Something to keep in mind when deciding whether or not to use regular expressions.

With that being said, your list of things to know are all things I have to know for my job (these are things almost all programmers should know, in fact) and I would not consider myself a low-level or systems programmer, just an application programmer.






A good in-between option is a parser generator like Ragel ,http://www.colm.net/open-source/ragel/

This takes in the definition of a regular language as a set of regular expressions, and generates C code for finite state machine to parse the language. You can visualise the state machine in Graphviz to manually verify all paths, making it much easier to spot hidden corner cases, while being a lot quicker to code than a big pile of if statements.


Been using ragel for a while now, it’s awesome. Ragel is a bit like the vim of parsing: the learning curve can be pretty steep, but once you get it, you’ll be the parser magician. Parsing in general just becomes pretty easy with ragel.

Regardless of implementation (regex vs. conditionals), there should be sufficient unittesting to make sure that all the corner cases are tested. For embedded systems, the complexity tradeoff is relevant though, since a regex library will (probably) take up more code space than some conditionals.

In principle, I agree. In practice, it's easier to miss an edge or corner case in a regular expression than it is in a series of conditionals. That's just another consequence of the complexity tradeoff.

Just use re2c. There's no library.

There’s a fine line between being clever and being a smartass.



Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: