
Perl 6 breaks with the traditional regexp syntax:


Also breaks with the traditional regexp by not actually being regular expressions :)

There have been two such breaks, not one. Perls 1-5 correspond to the first such break. Perl 6 corresponds to the second.

If "regular expressions" is taken to refer to formal language theory regular expressions -- which is NOT what 99% of devs mean by the terms regex or regular expressions -- then "breaks with the tradition" (formal language theory "tradition") happened somewhere in the 1960s to 1980s timeframe, when capturing parens and backreferences were introduced in [qs]?ed or similar. (Years before the first Perl arrived in 1987 to popularize and extend said break with tradition.)

If "traditional regexp" is instead taken to refer to this latter notion of "regular expression", i.e. to match what 99% of devs DO mean by the terms regex or regular expression -- Perl 5 compatible regexes, PCRE, etc. -- then Perl 6 represents a second break with tradition, breaking away from the currently still popular Perl 5 "tradition".

In other words:

* "Regex" meaning 1: formal regular expressions (from the 1950s)

* "Regex" meaning 2: Perl (5) compatible regexps (from the 1960s)

* "Regex" meaning 3: Perl 6 rules[1] (first officially available in 2015)

[1] "Perl 6 rules are the regular expression, string matching and general-purpose parsing facility of Perl 6 ..." (https://en.wikipedia.org/wiki/Perl_6_rules)

I think what most people mean when they say 'regex' is actually two things:

1. The syntax that PCRE-like regex engines accept.

2. Regular languages, a kind of formal language.

Many regex engines nowadays, like the one in Perl 5 and Oniguruma, already break 2 but stay compatible with 1. I think what Perl 6 does is also break 1. (I am not experienced in Perl 6. Please correct me if I am wrong.) I don't think it is a problem, though.
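To make "breaking 2" concrete, here is a minimal sketch in Python, whose `re` module follows the Perl 5 style syntax. The names `SAME_RUN` and `is_mirrored` are made up for illustration. The language { a^n b a^n } is a textbook non-regular language, yet one backreference is enough to recognize it:

```python
import re

# \1 must repeat exactly the text that group 1 captured, so this
# accepts a^n b a^n for n >= 1. No finite automaton can do that,
# because it would need to remember an unbounded count.
SAME_RUN = re.compile(r"(a+)b\1")

def is_mirrored(s: str) -> bool:
    """Return True if s is some run of a's, one b, then the same run of a's."""
    return SAME_RUN.fullmatch(s) is not None
```

So `is_mirrored("aaabaaa")` holds while `is_mirrored("aabaaa")` does not, even though the pattern still looks like an ordinary "regex" in the syntax sense.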

In Perl 6, regexes are a type of method, and you can use them in grammars, which are a type of class. (You can use them on their own as well.)

Which means you can subclass grammars, compose in regexes with roles, and have parameterized regexes.

The syntax has also had an overhaul to make it more consistent with itself as well as the rest of Perl 6. Since you can embed Perl 6 code, some features of other regular expression engines haven't been implemented as they aren't needed.

The result of using a regex or grammar is also now a parse tree rather than True/False or the matched substring.
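For anyone who doesn't read Perl 6, here is a rough Python analogy of the idea. This is not Perl 6 code and not how Perl 6 implements it; the class names, the `number` and `top` rules, and the tuple-based tree are all invented for illustration. Each rule is a method, the grammar is a class, a subclass overrides one rule, and a successful parse returns a tree instead of a boolean:

```python
import re

class ListGrammar:
    """Hypothetical grammar: number ("," number)*"""

    def number(self, s, i):
        # A rule is just a method: return (node, next_position) or None.
        m = re.compile(r"\d+").match(s, i)
        return (("number", m.group()), m.end()) if m else None

    def top(self, s):
        # Entry rule: builds a small parse tree out of the sub-rule results.
        result = self.number(s, 0)
        if result is None:
            return None
        node, i = result
        items = [node]
        while i < len(s) and s[i] == ",":
            result = self.number(s, i + 1)
            if result is None:
                return None
            node, i = result
            items.append(node)
        return ("list", items) if i == len(s) else None

class HexListGrammar(ListGrammar):
    # Subclassing the grammar: only the 'number' rule changes.
    def number(self, s, i):
        m = re.compile(r"[0-9a-f]+").match(s, i)
        return (("number", m.group()), m.end()) if m else None
```

`ListGrammar().top("1,ff")` fails, while `HexListGrammar().top("1,ff")` returns a tree with two `number` nodes, without touching the `top` rule.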

I generally recommend reading the code for JSON::Tiny::Grammar as a quick example of what it is like. https://github.com/moritz/json/blob/master/lib/JSON/Tiny/Gra...

> I think what perl6 does is also breaks 1.

No, you can use the backward-compatible syntax if you don't want to spend any time porting to the newer improved syntax.



