Regex Licensing (regexlicensing.org)
12 points by pabs3 9 months ago | 15 comments



There is hardly any content on this post. What are we supposed to read? Just look at the title and rant about the first thing that comes to our minds?


Almost feels like it was created by an LLM


Jeffrey Friedl's "Mastering Regular Expressions" was my gateway into Perl and server-side programming so you'll excuse me if I don't buy(!) one of your licences. Regular expressions used intelligently are amazing.


That, IMHO, is one of the best-written programming texts I've ever read, if not the best.

The Fine Article is an attempted comedic "regex considered harmful", but I think people who have read Friedl are actually completely qualified.


Yes, along with "Programming Perl" by Larry Wall et al.



Here you go: have a 2504-line regex that parses Perl code (and passes all the tests that PPI, the other Perl static-analysis parser, passes): https://metacpan.org/dist/PPR/source/lib/PPR.pm#L67


Only The Damian. How I miss Perl culture now that I have to use other languages.


It took me a while to figure out it wasn't about needing a copyright license before adding a regular expression to one's code.

After all, most people copy&paste complex regexps rather than author their own.


> After all, most people copy&paste complex regexps rather than author their own.

They do? Somehow I have not seen this happening. But if this is really how software development is done these days, then it is highly irresponsible. Simply copy-pasting complex regexps without understanding what they do is ripe for disaster. Not to mention that every regex flavor has subtle differences from every other flavor. So at minimum the person copying has to understand it and adjust it for differences in flavors.


Yes, I can copy&paste with understanding.

I think that's pretty common, yes? After all, it's pretty hard to copy&paste without at least some tweaking, even if only to change the variable names appropriately, and that requires some level of understanding.

To explain what I mean, here's a regexp which is starting to get complex: one that matches an IEEE 754 double.

I could and have done it myself. These days I'll search for then copy&paste something like:

    [-+]?((\.[0-9]+|[0-9]+\.[0-9]+)([eE][-+][0-9]+)?|[0-9]+)
from https://www.regexlib.com/REDetails.aspx?regexp_id=3098 .

Eyeballing looks right (I might need to add [+-]inf, NaN, and nan), and at the very least it's a good start, yes?

From personal experience, it's easy to forget .4e1 is valid (no digit before the period, lowercase "e", no sign for the exponent). I would rather start with someone's worked-out version than from scratch.
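
A quick sanity check of a pasted pattern is cheap. Here is a minimal Python sketch, assuming whole-string matching via `fullmatch` is what you want. `QUOTED` is the pattern exactly as pasted above; `TWEAKED` and the `matches` helper are hypothetical names of my own, and `TWEAKED` is not from the regexlib page (it only makes the exponent sign optional):

```python
import re

# The pattern exactly as quoted above.
QUOTED = re.compile(r"[-+]?((\.[0-9]+|[0-9]+\.[0-9]+)([eE][-+][0-9]+)?|[0-9]+)")

# Hypothetical tweak (an assumption, not from regexlib): exponent sign is optional.
TWEAKED = re.compile(r"[-+]?((\.[0-9]+|[0-9]+\.[0-9]+)([eE][-+]?[0-9]+)?|[0-9]+)")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True only if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    # A few edge cases worth eyeballing before trusting a pasted regexp.
    for text in ["42", "1.5", ".4", "-3.0e+10", ".4e1"]:
        print(f"{text!r:>12}  quoted={matches(QUOTED, text)}  "
              f"tweaked={matches(TWEAKED, text)}")
```

As pasted, the pattern rejects `.4e1` because the sign after the exponent letter is mandatory; the tweaked variant accepts it. That is the kind of small adjustment copy&paste with understanding tends to need.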

Regexps can of course be far more complex than that. OTOH, the most common use cases are not that hard to find, like "How do you validate a URL with a regular expression in Python?" at https://stackoverflow.com/questions/827557/how-do-you-valida... , with several good answers at different levels of completeness, and this entirely appropriate comment:

"I've needed to do this many times over the years and always end up copying someone else's regular expression who has thought about it way more than I want to think about it."


You're right. We now ask ChatGPT to write them /s


... and that is how we get nice DOS issues because of greedy regular expressions that were not checked


To be even more correct, writing your own regexps is also how we get nice DOS issues because of greedy regular expressions that were not checked.

To be even more fully correct, we get nice DOS issues because we aren't using re2 or similar tools that don't have exponential backtracking.
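
For anyone who has not seen the backtracking problem first-hand, here is a rough Python sketch using the standard `re` module, which is a backtracking engine. The pattern `(a+)+$` is a textbook illustration and not something from this thread; `NESTED`, `FLAT`, and `timed_match` are hypothetical names chosen for the example:

```python
import re
import time

# Textbook illustration (an assumption, not from this thread): nested
# quantifiers give a backtracking engine many ways to split the same input.
NESTED = re.compile(r"(a+)+$")

# Accepts the same inputs under match(), without the nested quantifier.
FLAT = re.compile(r"a+$")


def timed_match(pattern, text):
    """Return (matched, seconds) for an anchored match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small: each extra "a" roughly doubles the work for NESTED.
    for n in (10, 14, 18):
        text = "a" * n + "b"  # the trailing "b" forces the match to fail
        _, slow = timed_match(NESTED, text)
        _, fast = timed_match(FLAT, text)
        print(f"n={n:2d}  nested={slow:.4f}s  flat={fast:.6f}s")
```

On a backtracking engine the nested version typically slows down sharply as `n` grows, while the flat one stays fast. Engines like re2 avoid this by construction, which is the point made above.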


> And those who have licenses generally know it’s a better idea not to use them.

This is basically saying "don't use regex" which of course is silly. However you probably shouldn't use regex for user-facing input validation in production applications, at least not by default.



