

Regular Expressions in CoffeeScript are Awesome - elijahmanor
http://www.elijahmanor.com/2012/02/regular-expressions-in-coffeescript-are.html 

======
mike-cardwell
His example, slightly modified for Perl:

    
    
      my $emailPattern = qr{ ^ #begin of line
       ([a-z0-9_.-]+)          #one or more letters, numbers, _ . or -
       @                       #followed by an @ sign
       ([\da-z.-]+)            #then one or more letters, numbers, . or -
       \.                      #followed by a period
       ([a-z.]{2,6})           #followed by 2 to 6 letters or periods
       $ }xi;                  #end of line and ignore case
    
      if( 'john.smith@gmail.com' =~ $emailPattern ){
         print "E-mail is valid\n";
      } else {
         print "E-mail is invalid\n";
      }
    

EDIT: A much better way of doing it though:

    
    
      use Mail::Sendmail;
      if( 'john.smith@gmail.com' =~ /$Mail::Sendmail::address_rx/ ){
         print "E-mail is valid\n";
      } else {
         print "E-mail is invalid\n";
      }

~~~
draegtun
And for just email address checking then even better IMHO is:

    
    
      use Email::Valid;
    
      if (Email::Valid->address('john.smith@gmail.com')) {
        print "E-mail is valid\n";
      }
      else { print "E-mail is invalid\n" }
    

ref: <https://metacpan.org/module/Email::Valid>

------
phuff
In JS you can comment regular expressions, too... :)

    
    
      var foo = new RegExp(
                        '\\d+' + // First run of digits
                        '-' +    // Delimiter
                        '\\d+'   // Second run of digits
                        );
      foo.test("1234-5432");

~~~
elijahmanor
True, but then you have the overhead of string concatenation... although in
the grand scheme of things that is probably very small, unless you are doing
it in a huge loop, but then you'd cache that RegExp anyway.

~~~
natrius
Concatenation of string literals can be compiled away.
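
As a rough illustration in Python rather than JS (a sketch only; `DIGIT_PAIR` is just an illustrative name): adjacent string literals are joined when the source is compiled, so the commented pieces below reach `re.compile` as one constant pattern.

```python
import re

# Adjacent string literals are merged into a single constant at compile
# time, so the per-line comments add no work when the program runs.
DIGIT_PAIR = re.compile(
    r"\d+"  # first run of digits
    r"-"    # delimiter
    r"\d+"  # second run of digits
)

print(DIGIT_PAIR.pattern)
print(bool(DIGIT_PAIR.fullmatch("1234-5432")))
```

Whether a given JavaScript engine folds `'a' + 'b'` the same way depends on the engine, so treat this as an analogy rather than a guarantee.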

------
jashkenas
For a little dollop of recursivity in your morning coffee, here's the bit
where the CoffeeScript compiler uses these extended regular expressions to
lex, among other things, extended regular expressions...

[https://github.com/jashkenas/coffee-script/blob/master/src/l...](https://github.com/jashkenas/coffee-script/blob/master/src/lexer.coffee#L595-647)

------
joshuahedlund
I know the email validation was just an example, but since we're on the
subject, if you've got an email validation that includes a maximum number of
characters for the TLD you might want to update that before the new TLDs start
rolling in later this year or next.
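
To make the concern concrete, here is a small Python sketch. It reuses the pattern from the article; the names `CAPPED` and `UNCAPPED` and the sample address are illustrative, and dropping the upper bound is only one possible relaxation.

```python
import re

# The article's pattern, with its 2-to-6 character cap on the last label.
CAPPED = re.compile(r"^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$", re.I)

# Same pattern with the upper bound removed.
UNCAPPED = re.compile(r"^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,})$", re.I)

# Illustrative address whose last label is 11 letters long.
address = "john.smith@example.photography"

capped_ok = bool(CAPPED.match(address))
uncapped_ok = bool(UNCAPPED.match(address))
print(capped_ok, uncapped_ok)
```

The capped pattern rejects the address purely because of the length limit, while both patterns still accept `john.smith@gmail.com`.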

------
peter_l_downs
Python, too:

<http://docs.python.org/library/re.html#re.VERBOSE>

    
    
      a = re.compile(r"""\d +  # the integral part
                       \.    # the decimal point
                       \d *  # some fractional digits""", re.X)
      b = re.compile(r"\d+\.\d*")

~~~
elijahmanor
Ohh nice, maybe this is where CoffeeScript got inspiration from? The syntax
looks very similar.

~~~
berntb
Afaik Python, like most open source languages, uses the PCRE lib, Perl
Compatible Regular Expressions. Google it.

PCRE isn't as good as Perl's regexps, but not bad. (I've helped people with
PCRE problems that worked in Perl.)

Edit: I googled myself. :-) The Python lib isn't pcre (anymore), it is
something home rolled. The answer to the parent question about inspiration is:
Most everyone has copied the Perl syntax (except, irritatingly, Emacs).

Edit 2: herge -- Perl is ~ a superset of awk. Use that. (Check -n and -e flags
in perldoc perlrun.)

~~~
herge
Vim and awk use the old syntax, and that irritates me to no end.

------
gmac
Perl, Ruby and some other languages support a similar 'extended' RegEx syntax
which allows comments and disregards whitespace (in Ruby, you use the 'x' flag
on a Regexp literal).

For those using plain JavaScript -- or wanting to do a one-off conversion -- I
wrote a simple JS function that converts an extended RegEx to a plain one:

[http://blog.mackerron.com/2010/08/08/extended-multi-line-js-...](http://blog.mackerron.com/2010/08/08/extended-multi-line-js-regexps/)
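
The general idea of such a conversion can be sketched in a few lines. This is not the linked function, just a naive Python illustration with a hypothetical name (`strip_extended`). It assumes that escaped characters and anything inside a character class should be kept verbatim, and it does not handle a literal `]` placed first in a class.

```python
def strip_extended(pattern: str) -> str:
    """Naive sketch: drop unescaped whitespace and '#' comments."""
    out = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            out.append(pattern[i:i + 2])  # keep the escaped character as-is
            i += 2
            continue
        if in_class:
            if ch == "]":
                in_class = False
            out.append(ch)  # whitespace and '#' are literal inside [...]
        elif ch == "[":
            in_class = True
            out.append(ch)
        elif ch == "#":
            # Skip the comment up to (not including) the end of the line.
            while i < len(pattern) and pattern[i] != "\n":
                i += 1
            continue
        elif not ch.isspace():
            out.append(ch)
        i += 1
    return "".join(out)


verbose = r"""\d +  # the integral part
              \.    # the decimal point
              \d *  # some fractional digits"""
plain = strip_extended(verbose)
print(plain)
```

Doing this once, ahead of time, keeps the readable form in the source and the compact form in what actually gets executed.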

~~~
elijahmanor
Yeah, my guess is that CoffeeScript took direction from those languages for
this feature.

Nice function to convert annotated string to a regex. I wonder if this is
being considered in the next versions of JavaScript? Or better yet... native
annotation?

I like the CoffeeScript approach since that logic is done at compile time and
not when it is being executed.

------
samarudge
You shouldn't validate emails by regular expressions since, technically (at
least according to spec)

"()<>[]:;@,\\\\\"!#$%&'*+-/=?^_`{}| ~ ? ^_`{}|~."@example.org

is a perfectly valid email address [1]

[1]
[http://en.wikipedia.org/wiki/Email_address#Valid_email_addre...](http://en.wikipedia.org/wiki/Email_address#Valid_email_addresses)

~~~
ColMustard
More like you shouldn't validate emails by using crappy regular expressions?

AFAIK there are regular expressions that cover all valid email addresses (and
some invalid, but better than not recognizing one that is valid).

Although it gets pretty nasty:
<http://www.diablotin.com/librairie/autres/mre/chBB.html>

~~~
samarudge
But since nearly every character is valid in an email address before the @
sign (within ""), it's much easier to validate `*@thisisavaliddomain.com` and
just try sending an email. The way we do it: if the user enters an email that
matches our reasonably comprehensive regexp, we consider it valid. If it
doesn't match, we send the user an email and get them to verify it with a link.
That way most users don't have to go through an annoying click-and-confirm
email validation, but we can still catch a few edge cases of people mistyping.
------
jriddycuz
"Let's face it, regular expressions aren't for everyone."

Wait, since when aren't they? I would have thought basic regex skills were a
baseline shibboleth of a programmer. I even know several non-technical people
that are proficient.

~~~
Zancarius
Your comment reminded me of Jamie Zawinski's tongue-in-cheek quotation "Some
people, when confronted with a problem, think 'I know, I'll use regular
expressions.' Now they have two problems."
(<http://regex.info/blog/2006-09-15/247>)

Jestful, off-the-wall comments aside, I love regex. It takes away a great deal
of the drudgery associated with processing volumes and volumes of text. I
can't imagine not having it.

(Aside: Now that you mention it, I can think of a couple people who aren't
programmers who use regex pretty frequently because it makes _their_ life a
little easier even if it isn't highly advanced, super-dense regex.)

------
SeoxyS
That has nothing to do with CoffeeScript; most regex implementations support
this via a flag. I've seen it done in PHP, Ruby, Perl, Objective-C. For
languages whose regular expression engines don't support this, it's easy enough
to achieve it via string concatenation.

~~~
RyanMcGreal
Python has it, too.

------
shtylman
I personally don't find this example very helpful in showing why I would do
this. The regex is simple enough and to me the comments are the equivalent of
"return here", needless and distracting.

~~~
elijahmanor
The example was to show how you could do it, not necessarily that it was
needed for that case. However, you might be surprised how many developers
aren't familiar with that example regex.

------
buddydvd
What about trailing spaces? Do they need to be escaped with parentheses?

------
mrpollo
Is there any performance penalty?

~~~
tikhonj
I'm pretty sure that these expressions are simplified to normal JavaScript
expressions at compile time, so there is no performance penalty.

Even if they were compiled at runtime, the performance penalty would probably
be negligible unless you were randomly compiling the expression in the middle
of an inner loop or something equally silly.
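
The same reasoning can be illustrated with Python's `re.VERBOSE`, where the whitespace and comments are discarded when the pattern is compiled. This is a sketch with made-up sample data and an illustrative name (`PAIR`), not a benchmark.

```python
import re

lines = ["1234-5432", "no match here", "77-8"]  # made-up sample input

# Compile once, outside the loop; the comments only matter at this step.
PAIR = re.compile(r"""
    (\d+)   # first run of digits
    -       # delimiter
    (\d+)   # second run of digits
""", re.VERBOSE)

pairs = []
for line in lines:
    m = PAIR.fullmatch(line)  # reuses the already-compiled pattern
    if m:
        pairs.append(m.groups())

print(pairs)
```

Hoisting the compile out of the loop is the habit that matters; the commented form itself costs nothing per match.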

