
More concise? Sometimes. Slower? Always.

    BenchmarkRegexp	  500000	      5136 ns/op
    BenchmarkStrings	10000000	       173 ns/op
http://play.golang.org/p/YT29Ao-tOt



That is an implementation-specific benchmark. A grungy real-world regexp engine (such as Perl's) will usually recognize important special cases and substitute in faster code for them.

The classic example is to recognize that you're looking for a fixed string, and substitute in Boyer Moore. But prefix/suffix recognition are two other common examples.
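
To make the special-casing idea concrete, here is a minimal sketch in Go of a fixed-string fast path. It is not how Perl or any particular engine does it: `compileMatcher` is a hypothetical helper, and `strings.Contains` stands in for a dedicated search algorithm such as Boyer-Moore. The literal check relies on `regexp.QuoteMeta` leaving a pattern unchanged only when it contains no metacharacters.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compileMatcher (hypothetical helper) returns a match function for pattern.
// If the pattern has no regexp metacharacters it is a fixed string, so a
// plain substring search is used instead of the regexp engine.
func compileMatcher(pattern string) (func(string) bool, error) {
	// QuoteMeta escapes every metacharacter; an unchanged result means
	// the pattern is a pure literal.
	if regexp.QuoteMeta(pattern) == pattern {
		return func(s string) bool { return strings.Contains(s, pattern) }, nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	return re.MatchString, nil
}

func main() {
	for _, p := range []string{"hello", "h.llo"} {
		match, err := compileMatcher(p)
		if err != nil {
			fmt.Println("bad pattern:", err)
			continue
		}
		fmt.Println(p, match("say hello"))
	}
}
```

Prefix and suffix cases could be handled the same way with `strings.HasPrefix` and `strings.HasSuffix`, at the cost of inspecting the pattern for anchors.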


Last time I checked, any time I needed the power and flexibility of a regular expression, getting the job done was over an order of magnitude more important than saving some milliseconds of processing time.


You could more than double the performance of the regexp if you did MustCompile just the once rather than within every loop.

MustCompile is generally used to make the regexp a global so that it isn't done over and over.

Just move it out of the loop, as it's really not necessary to compile regular expressions every time you want to match/replace against it.
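
A rough sketch of the difference, assuming a made-up pattern and helper names (`countSlow`, `countFast`); this is not the code from the linked playground. Both functions return the same count, but the first recompiles the pattern on every iteration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Compiled once at package initialization and reused everywhere.
var atWrapped = regexp.MustCompile(`@(.*)@`)

// countSlow recompiles the pattern for every input, repeating the
// parsing and program construction each time.
func countSlow(inputs []string) int {
	n := 0
	for _, s := range inputs {
		re := regexp.MustCompile(`@(.*)@`)
		if re.MatchString(s) {
			n++
		}
	}
	return n
}

// countFast reuses the package-level compiled pattern.
func countFast(inputs []string) int {
	n := 0
	for _, s := range inputs {
		if atWrapped.MatchString(s) {
			n++
		}
	}
	return n
}

func main() {
	inputs := []string{"@foo@", "bar", "x@y@z"}
	fmt.Println(countSlow(inputs), countFast(inputs))
}
```

A compiled `*regexp.Regexp` is safe for concurrent use, so a package-level variable works for goroutines too.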


It does have the MustCompile outside of the loop. I pasted the wrong link originally.


Ah, my apologies, I saw the earlier link.


If I am reading the chart correctly, doubled performance would still not be enough.


Absolutely. But the original version linked was twice as slow as need be.

For trivial replacements, I find string manipulation is faster and safer (fewer bugs). But there is some threshold of complexity beyond which regular expressions are both more performant and safer.


Just curious: how do the regexes compare when you use "^@(.*)@$" ? Semantically, it's closer to the string version.

Realistically, you'd expect them to behave exactly the same, but Go's pretty new, and you never know what is or isn't going to be optimized.


    `\A@(.*)@\z`

    BenchmarkRegexp	  500000	      5181 ns/op
    BenchmarkStrings	10000000	       171 ns/op
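
For readers without the playground link handy, here is a sketch of what the two approaches might look like side by side. The benchmarked code is not reproduced in this thread, so `unwrapRegexp` and `unwrapStrings` are assumed names and the string version is a guess at the intent: strip one leading and one trailing `@` when both are present.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// The anchored pattern from the comment above, compiled once.
var wrapped = regexp.MustCompile(`\A@(.*)@\z`)

// unwrapRegexp replaces a fully @-wrapped string with its contents.
// Strings that do not match are returned unchanged.
func unwrapRegexp(s string) string {
	return wrapped.ReplaceAllString(s, "$1")
}

// unwrapStrings does the same with plain string operations.
// Caveat: unlike the regexp, it also unwraps inputs containing newlines,
// because `.` does not match "\n" by default in Go.
func unwrapStrings(s string) string {
	if len(s) >= 2 && strings.HasPrefix(s, "@") && strings.HasSuffix(s, "@") {
		return s[1 : len(s)-1]
	}
	return s
}

func main() {
	for _, s := range []string{"@foo@", "foo", "@", "@@"} {
		fmt.Printf("%q %q %q\n", s, unwrapRegexp(s), unwrapStrings(s))
	}
}
```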


No, not always. That's a poor example biased to simple string handling.


Using Golang as an example of real world regex performance is borderline dishonest. Their regex engine is notoriously unoptimized and is not intended to be a strong point of the language.


It's not "unoptimized", it has a different design[1] than other language implementations, resulting in different performance characteristics.

[1] http://swtch.com/~rsc/regexp/regexp1.html


Not just a different design, a different set of features. For example, no backreferences.
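
A quick way to see this for yourself: Go's `regexp.Compile` returns an error for a pattern with a backreference rather than compiling it. `compiles` below is just a throwaway helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// `\1` would refer back to the first capture group in engines
	// that support backreferences; Go's RE2-style syntax rejects it.
	fmt.Println(compiles(`(a)\1`))
	// The same pattern without the backreference is accepted.
	fmt.Println(compiles(`(a)a`))
}
```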


Thanks for clarifying. I guess my point was that if you're just matching one email address in a form submission, for example, is performance significant?


No, a few µs vs a few ns when processing your web form won't be significant. Don't shy away from regular expressions, but be aware of their performance and readability impact.

The problem is when developers that don't know any better build parsers with regular expressions. That's almost always a bad idea.


Agreed.



