
Regexper – Regular expressions visualizer - xchip
https://regexper.com/
======
ken
It's always neat to see where one's ideas go! AFAIK, I was the first person to
create dynamic railroad diagrams for regular expressions (maybe 12 or 13 years
ago). I got the idea from json.org, which I think was Douglas Crockford's
brainchild.

My initial implementation was strfriend.com (in Lisp: well under 1,000 lines,
including views), and I think its main claim to fame was that Jeff Atwood made
fun of it on Twitter. (I was truly clueless at promoting myself back then. Not
only did I not have a Twitter account, but it didn't occur to me to submit it
to HN.)

Every so often, I toy around with the idea of making it into a proper
local/native application -- maybe someday. In the meantime, my obsession with
regular expressions lives on in my current application (see bio) where I parse
and rewrite the user's entered regex syntax (ICU) into whatever regex syntax
the backend requires. I don't know of any other application that does this,
but I predict that in 15 years it'll be commonplace (and I'll still be poor)!

~~~
kyberias
I was taught the algorithms to do this stuff in my Computer Science class over
20 years ago (RE is equivalent to DFA). You weren't the first person to
implement regular expression visualizations.

~~~
skate22
A dynamic visualization on the web 13 years ago? He may have been the first.

~~~
osteele
It was pretty hard to do viz on the web 13 years ago. In 2006 I made a web
tool to turn regular expressions into NFAs and DFAs and animate their states
as you typed. It took a lot of code (drawing and animating along beziers, AJAX
to a server for graphviz and a regex compilation and minimization package I
wrote for this).
[https://imgur.com/gallery/Yqqoh](https://imgur.com/gallery/Yqqoh)

These days there’s a lot more tooling and components that can snap together to
make this kind of thing.

------
gilleain
Nice. From this SO answer:

[https://stackoverflow.com/a/800847/415384](https://stackoverflow.com/a/800847/415384)

we get this:

[https://regexper.com/#%5E(%3F%3A(%3F%3A(%3F%3A0%3F%5B13578%5...](https://regexper.com/#%5E\(%3F%3A\(%3F%3A\(%3F%3A0%3F%5B13578%5D%7C1%5B02%5D\)\(%5C%2F%7C-%7C%5C.\)31\)%5C1%7C\(%3F%3A\(%3F%3A0%3F%5B13-9%5D%7C1%5B0-2%5D\)\(%5C%2F%7C-%7C%5C.\)\(%3F%3A29%7C30\)%5C2\)\)\(%3F%3A\(%3F%3A1%5B6-9%5D%7C%5B2-9%5D%5Cd\)%3F%5Cd%7B2%7D\)%24%7C%5E\(%3F%3A0%3F2\(%5C%2F%7C-%7C%5C.\)29%5C3\(%3F%3A\(%3F%3A\(%3F%3A1%5B6-9%5D%7C%5B2-9%5D%5Cd\)%3F\(%3F%3A0%5B48%5D%7C%5B2468%5D%5B048%5D%7C%5B13579%5D%5B26%5D\)%7C\(%3F%3A\(%3F%3A16%7C%5B2468%5D%5B048%5D%7C%5B3579%5D%5B26%5D\)00\)\)\)\)%24%7C%5E\(%3F%3A\(%3F%3A0%3F%5B1-9%5D\)%7C\(%3F%3A1%5B0-2%5D\)\)\(%5C%2F%7C-%7C%5C.\)\(%3F%3A0%3F%5B1-9%5D%7C1%5Cd%7C2%5B0-8%5D\)%5C4\(%3F%3A\(%3F%3A1%5B6-9%5D%7C%5B2-9%5D%5Cd\)%3F%5Cd%7B2%7D\)%24)

or from another answer:

[https://regexper.com/#((1%7C0(00)*01)((11%7C10(00)*01))*%7C(...](https://regexper.com/#\(\(1%7C0\(00\)*01\)\(\(11%7C10\(00\)*01\)\)*%7C\(0\(00\)*1%7C\(1%7C0\(00\)*01\)\(\(11%7C10\(00\)*01\)\)*\(0%7C10\(00\)*1\)\)\(\(1\(00\)*1%7C\(0%7C1\(00\)*01\)\(\(11%7C10\(00\)*01\)\)*\(0%7C10\(00\)*1\)\)\)*\(0%7C1\(00\)*01\)\(\(11%7C10\(00\)*01\)\)*\)))

~~~
awhiskeyshot
That last link shows an error.

~~~
rplnt
Add ')' at the very end (you can see it is not included in the clickable link
and it's also what the error says).

~~~
gilleain
Thanks, fixed!

------
erric
Neat! I also suggest: [https://regexr.com/](https://regexr.com/)

~~~
ChrisGranger
I'm fond of [https://regex101.com/](https://regex101.com/) as well.

~~~
criley2
I'm 100% behind you here.

Comparing [1] and [2], there's no competition.

[1]
[https://regexper.com/#%5E(%3F%3A(%3F%3A(%3F%3A0%3F%5B13578%5...](https://regexper.com/#%5E\(%3F%3A\(%3F%3A\(%3F%3A0%3F%5B13578%5D%7C1%5B02%5D\)\(%5C%2F%7C-%7C%5C.\)31\)%5C1%7C\(%3F%3A\(%3F%3A0%3F%5B13-9%5D%7C1%5B0-2%5D\)\(%5C%2F%7C-%7C%5C.\)\(%3F%3A29%7C30\)%5C2\)\)\(%3F%3A\(%3F%3A1%5B6-9%5D%7C%5B2-9%5D%5Cd\)%3F%5Cd%7B2%7D\)%24%7C%5E\(%3F%3A0%3F2\(%5C%2F%7C-%7C%5C.\)29%5C3\(%3F%3A\(%3F%3A\(%3F%3A1%5B6-9%5D%7C%5B2-9%5D%5Cd\)%3F\(%3F%3A0%5B48%5D%7C%5B2468%5D%5B048%5D%7C%5B13579%5D%5B26%5D\)%7C\(%3F%3A\(%3F%3A16%7C%5B2468%5D%5B048%5D%7C%5B3579%5D%5B26%5D\)00\)\)\)\)%24%7C%5E\(%3F%3A\(%3F%3A0%3F%5B1-9%5D\)%7C\(%3F%3A1%5B0-2%5D\)\)\(%5C%2F%7C-%7C%5C.\)\(%3F%3A0%3F%5B1-9%5D%7C1%5Cd%7C2%5B0-8%5D\)%5C4\(%3F%3A\(%3F%3A1%5B6-9%5D%7C%5B2-9%5D%5Cd\)%3F%5Cd%7B2%7D\)%24)

[2] [https://regex101.com/r/LzdGpw/1](https://regex101.com/r/LzdGpw/1)

I can use 2, but 1 just looks cool.

------
cel1ne
After 20 years of software development I've come to adopt a best practice:

Whenever I start writing a regular expression, I stop and write a "manual"
domain-specific parse function instead.

Saved me a LOT of debugging time.

Since I can now use Kotlin pretty much anywhere (JVM, browser, shell scripts),
this is easy because of the superb stdlib ("startsWith", "lastIndexOf",
"substringBeforeLast(...)").

The time saved I invest in unit tests for the parser.
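
A minimal sketch of what such a "manual" parse function might look like, written in Python rather than Kotlin. The task (splitting a file name into stem and extension) and the name `split_extension` are assumptions chosen for illustration; `str.rpartition` stands in for Kotlin's `substringBeforeLast`.

```python
def split_extension(filename: str) -> tuple[str, str]:
    """Return (stem, extension) without using a regular expression."""
    # rpartition splits on the LAST "." and returns (before, separator, after).
    stem, dot, extension = filename.rpartition(".")
    # No dot at all ("README") or only a leading dot (".bashrc"):
    # treat the whole name as the stem.
    if not dot or not stem:
        return filename, ""
    return stem, extension
```

Each branch is small enough to get its own unit test, e.g. `split_extension("archive.tar.gz")` should give `("archive.tar", "gz")`.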

~~~
hinkley
I can't shake the feeling that Regexp could be written just as efficiently as
a fluent interface with a more human friendly syntax.

I've been telling Jr devs bucking for promotion for years to explain what
they're doing in plain English, then write code that looks like that.
Basically telling them to skip right over the "gee look what a clever fuck I
am" stage and write good code instead of creating riddles.

The Regexp problem just screams this at me. What am I doing? I'm looking for a
line that starts with a capital T, then has some quantity of alphanumeric
characters greater than n (if n is not 0, 1, or infinity, this requires extra
work in Regex), followed by an equals sign with or without whitespace
characters around it.

Give me an API that does exactly that, instead of Regex. Something gets lost
in translation every time.
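
One way the API described above could look, as a rough sketch in Python. `LinePattern`, its method names, and `config_line` are all hypothetical and do not come from any existing library; the builder simply assembles an ordinary regex under the hood.

```python
import re


class LinePattern:
    """Hypothetical fluent builder; each method appends one regex fragment."""

    def __init__(self) -> None:
        self._parts: list[str] = []

    def starts_with(self, text: str) -> "LinePattern":
        self._parts.append("^" + re.escape(text))
        return self

    def more_than(self, n: int, char_class: str) -> "LinePattern":
        # "More than n" means at least n + 1 repetitions.
        self._parts.append(f"{char_class}{{{n + 1},}}")
        return self

    def optional_whitespace(self) -> "LinePattern":
        self._parts.append(r"\s*")
        return self

    def literal(self, text: str) -> "LinePattern":
        self._parts.append(re.escape(text))
        return self

    def compile(self) -> re.Pattern:
        return re.compile("".join(self._parts))


ALNUM = "[A-Za-z0-9]"


def config_line(n: int) -> re.Pattern:
    """Capital T, more than n alphanumerics, then '=' with optional spaces."""
    return (
        LinePattern()
        .starts_with("T")
        .more_than(n, ALNUM)
        .optional_whitespace()
        .literal("=")
        .optional_whitespace()
        .compile()
    )
```

With `n = 3`, `config_line(3).match("Token = 1")` succeeds while `config_line(3).match("Tab = 1")` does not, and the call chain reads close to the plain-English description.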

I think the fact that the origin of Regex is the command line interface is
pretty telling. We didn't and we don't have a convenient way to type in
imperative code on a command line. So an arcane syntax was created so you
could do the whole thing in a quarter line of text.

Speaking as someone who has had a Unix shell for 25 years, and routinely works
on mini tools for their fellow developers, I don't think we actually type
stuff into a shell that often anymore. The difference between documenting a
one-liner in a README and just building a shell script that does the same
thing is not that big. There's a difference in development effort but building
a script can allow you access to a debugger. Personally, I'd be willing to pay
that tax any day.

~~~
knight17
> I can't shake the feeling that Regexp could be written just as efficiently
> as a fluent interface with a more human friendly syntax.

You can use _SRL_ - Simple Regex Language ([https://simple-regex.com/](https://simple-regex.com/)) for making readable regex/matching
rules. It is supported in C++, Java, C#, PHP, JavaScript, and Python. Also,
you can use the web version to generate equivalent regex if your language is
one of the above.

Here is an example from the website for matching an e-mail address:

    
    
      begin with any of (digit, letter, one of "._%+-") once or more,
      literally "@",
      any of (digit, letter, one of ".-") once or more,
      literally ".",
      letter at least 2 times,
      must end, case insensitive
    

Regex to do the same:

    
    
      /^(?:[0-9]|[a-z]|[\._%\+-])+(?:@)(?:[0-9]|[a-z]|[\.-])+(?:\.)[a-z]{2,}$/i
    

The first one is readable, second one is cryptic. [https://simple-regex.com/examples](https://simple-regex.com/examples) has more examples.
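
For comparison, a possible middle ground: the same pattern written in Python's `re.VERBOSE` mode, with comments that mirror the SRL lines. This is only a sketch and is not taken from the SRL site; `EMAIL_LIKE` is an illustrative name, and the pattern is no more a complete e-mail validator than the original.

```python
import re

EMAIL_LIKE = re.compile(
    r"""
    ^
    (?:[0-9]|[a-z]|[._%+-])+   # begin with any of (digit, letter, one of "._%+-") once or more
    @                          # literally "@"
    (?:[0-9]|[a-z]|[.-])+      # any of (digit, letter, one of ".-") once or more
    \.                         # literally "."
    [a-z]{2,}                  # letter at least 2 times
    $                          # must end
    """,
    re.VERBOSE | re.IGNORECASE,  # IGNORECASE corresponds to "case insensitive"
)


def looks_like_email(text: str) -> bool:
    """True if the whole string fits the pattern above."""
    return EMAIL_LIKE.match(text) is not None
```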

SRL was previously discussed here in 2016, see
[https://news.ycombinator.com/item?id=12384862](https://news.ycombinator.com/item?id=12384862)

Also, the parse feature (DSL) of Rebol language is an excellent regex
alternative:

1\. Why Rebol, Red, and the Parse dialect are Cool
([http://blog.hostilefork.com/why-rebol-red-parse-cool/](http://blog.hostilefork.com/why-rebol-red-parse-cool/))

2\. Rebol's answer to Regex: parse and Rebol types
([https://rebol-land.blogspot.in/2013/03/rebols-answer-to-rege...](https://rebol-land.blogspot.in/2013/03/rebols-answer-to-regex-parse-and-rebol.html))

~~~
tkp
Interesting, thanks for mentioning! Looks similar to VerbalExpressions
([https://github.com/VerbalExpressions/JSVerbalExpressions/wik...](https://github.com/VerbalExpressions/JSVerbalExpressions/wiki))

------
nickcw
Here is a regexp to match an IPv4 address - looks quite nice and easy to
understand compared to the regexp! In fact the visualisation makes it easy to
spot the mistake.

[https://regexper.com/#'%5Cb((25%5B0-5%5D%7C2%5B0-4%5D%5B0-9%...](https://regexper.com/#'%5Cb\(\(25%5B0-5%5D%7C2%5B0-4%5D%5B0-9%5D%7C%5B01%5D%3F%5B0-9%5D%5B0-9%5D%3F\)\(%5C.%7C%24\)\)%7B4%7D%5Cb')

(From
[https://stackoverflow.com/q/5284147/164234](https://stackoverflow.com/q/5284147/164234)
)

~~~
dmayle
The visualization tool shows that the regex is not correct. It allows
000.000.000.000 as an IPv4 address
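
A quick way to poke at this outside the visualizer, sketched in Python. The pattern is decoded from the regexper link above with the surrounding quotes dropped; `matched_text` is just an illustrative helper name.

```python
import re

# Pattern decoded from the regexper link above, without the surrounding quotes.
IPV4_ATTEMPT = re.compile(
    r"\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b"
)


def matched_text(text):
    """Return the substring the pattern matches, or None if there is no match."""
    match = IPV4_ATTEMPT.search(text)
    return match.group(0) if match else None
```

Under Python's `re` semantics, `matched_text("000.000.000.000")` returns the whole string, and `matched_text("1.2.3.4.5")` returns `"1.2.3.4."` with the trailing dot, because `(\.|$)` lets the fourth octet end in a dot as well.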

~~~
grok2
I created this instead:
[https://regexper.com/#((%5B0-9%5D%5C.)%7C(%5B1-9%5D%5B0-9%5D...](https://regexper.com/#\(\(%5B0-9%5D%5C.\)%7C\(%5B1-9%5D%5B0-9%5D%5C.\)%7C\(1%5B0-9%5D%5B0-9%5D%5C.\)%7C\(2%5B0-5%5D%5B0-5%5D%5C.\)\)%7B3%7D\(\(%5B0-9%5D\)%7C\(%5B1-9%5D%5B0-9%5D\)%7C\(1%5B0-9%5D%5B0-9%5D\)%7C\(2%5B0-5%5D%5B0-5%5D\)\)\(%24\))

The repetition count seems to be displayed off-by-one though.

~~~
reificator
On mobile so can't (easily) test it, but doesn't this produce a false negative
for `246.{snip}`, for example?

~~~
grok2
Yes, you are right...the following better?
[https://regexper.com/#((%5B0-9%5D%5C.)%7C(%5B1-9%5D%5B0-9%5D...](https://regexper.com/#\(\(%5B0-9%5D%5C.\)%7C\(%5B1-9%5D%5B0-9%5D%5C.\)%7C\(1%5B0-9%5D%5B0-9%5D%5C.\)%7C\(2%5B0-4%5D%5B0-9%5D%5C.\)%7C\(25%5B0-5%5D%5C.\)\)%7B3%7D\(\(%5B0-9%5D\)%7C\(%5B1-9%5D%5B0-9%5D\)%7C\(1%5B0-9%5D%5B0-9%5D\)%7C\(2%5B0-4%5D%5B0-9%5D\)%7C\(25%5B0-5%5D\)\)\(%24\))
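
Since an octet has only 256 valid values, both attempts can be checked exhaustively. A small sketch in Python, using just the octet alternatives decoded from the two links in this subthread (dots and grouping removed); the names are illustrative.

```python
import re

# Octet alternatives from the first attempt (note 2[0-5][0-5]).
FIRST_TRY = r"[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-5][0-5]"
# Octet alternatives from the revised attempt.
SECOND_TRY = r"[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]"


def octet_errors(octet_pattern):
    """Return (false_negatives, false_positives) for a candidate octet regex."""
    compiled = re.compile(octet_pattern)
    # Every value 0..255 should be accepted.
    false_negatives = [n for n in range(256) if not compiled.fullmatch(str(n))]
    # No three-digit value above 255 should be accepted.
    false_positives = [n for n in range(256, 1000) if compiled.fullmatch(str(n))]
    return false_negatives, false_positives
```

The first attempt rejects every value from 200 to 249 whose last digit is 6 through 9 (246 among them); the revised one comes back with both lists empty.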

------
thrownaway954
They could really use a bunch of "try it" links or examples to show what it can
do.

------
JelteF
Looks neat, but after a quick look I think I still like
[https://www.debuggex.com/](https://www.debuggex.com/) better. Going step by
step through the regex for a given string is really a killer feature for me.

~~~
deepsun
Yep, and not having to click the "Display" button each time.

Also, partially highlighting the text you write is a pretty hard feature to
implement, I did it once. Kudos to debuggex.com for working correctly even
with browser zoom on.

------
avar
The Email::Valid Perl distribution ships with a more than 6000 character regex
to validate E-Mail addresses.

Both goo.gl and bit.ly refused to shorten it, and when I tried to paste the
link here HN refused to accept my comment.

But on a machine with Email::Valid installed do:

    
    
        perl -MEmail::Valid -wE 'say $Email::Valid::RFC822PAT'
    

And copy the output into the Regexper form. It takes a while to render, but
it'll eventually complete.

~~~
shakna
Wow. Despite the utterly insane complexity of a regex of that size, it doesn't
seem to do an insane amount of branching. Maximum depth of choices seems to be
about 4, which is less than a lot of other regex examples I've seen here.

That being said... That regex is just noise. Nobody can tackle it all at once
or by themselves, unless they specialise in just regex. It's 6599 characters,
at least on my system. So at a wild stab, it's the equivalent of 4,500 lines of
obfuscated code.

You can't audit it, you just kinda have to trust it, and hope.

But with something like regexper, I can at least read it.

~~~
avar
I can't remember where, but there's some version of that regex somewhere that
uses variables for interpolation in subsequent regexes.

Viewed like that it's really not that complex, most of it is repetition of
previously used regex sequences, it's only when fully expanded that it becomes
so humongous.
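
To illustrate the idea of building a large pattern from named pieces, here is a deliberately simplified sketch in Python. The fragments are assumptions for illustration and are NOT the RFC 822 grammar or the Email::Valid source.

```python
import re

# Simplified, illustrative fragments; not the real RFC 822 rules.
ATOM = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
DOT_ATOM = rf"{ATOM}(?:\.{ATOM})*"      # atoms separated by single dots
ADDR_SPEC = rf"{DOT_ATOM}@{DOT_ATOM}"   # local part, "@", domain


def is_addr_spec(text: str) -> bool:
    """True if the whole string matches the expanded pattern."""
    return re.fullmatch(ADDR_SPEC, text) is not None
```

Three short definitions expand into a pattern containing four copies of `ATOM`; the real thing repeats its building blocks far more often, which is where the thousands of characters come from.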

~~~
shakna
The pattern is Jeffrey Friedl's, from his book _Mastering Regular
Expressions_.

And as clear as the source could be made, I feel the fact that so many people
have just copied and pasted it means that any understanding is lost, and
they're just praying and hoping, because a regex of that size is actually
difficult for them to comprehend.

------
Raphmedia
There's also [http://emailregex.com/regex-visual-tester/#(%3F%3A%5Ba-z0-9!...](http://emailregex.com/regex-visual-tester/#\(%3F%3A%5Ba-z0-9!%23%24%25%26'*%2B%2F%3D%3F%5E_%60%7B%7C%7D~-%5D%2B\(%3F%3A%5C.%5Ba-z0-9!%23%24%25%26'*%2B%2F%3D%3F%5E_%60%7B%7C%7D~-%5D%2B\)*%7C%22\(%3F%3A%5B%5Cx01-%5Cx08%5Cx0b%5Cx0c%5Cx0e-%5Cx1f%5Cx21%5Cx23-%5Cx5b%5Cx5d-%5Cx7f%5D%7C%5C%5C%5B%5Cx01-%5Cx09%5Cx0b%5Cx0c%5Cx0e-%5Cx7f%5D\)*%22\)%40\(%3F%3A\(%3F%3A%5Ba-z0-9%5D\(%3F%3A%5Ba-z0-9-%5D*%5Ba-z0-9%5D\)%3F%5C.\)%2B%5Ba-z0-9%5D\(%3F%3A%5Ba-z0-9-%5D*%5Ba-z0-9%5D\)%3F%7C%5C%5B\(%3F%3A\(%3F%3A25%5B0-5%5D%7C2%5B0-4%5D%5B0-9%5D%7C%5B01%5D%3F%5B0-9%5D%5B0-9%5D%3F\)%5C.\)%7B3%7D\(%3F%3A25%5B0-5%5D%7C2%5B0-4%5D%5B0-9%5D%7C%5B01%5D%3F%5B0-9%5D%5B0-9%5D%3F%7C%5Ba-z0-9-%5D*%5Ba-z0-9%5D%3A\(%3F%3A%5B%5Cx01-%5Cx08%5Cx0b%5Cx0c%5Cx0e-%5Cx1f%5Cx21-%5Cx5a%5Cx53-%5Cx7f%5D%7C%5C%5C%5B%5Cx01-%5Cx09%5Cx0b%5Cx0c%5Cx0e-%5Cx7f%5D\)%2B\)%5C%5D))

Edit: Add the missing ) to the url

------
sevensor
This is a nifty tool, and to be honest I was unaware that regex visualization
was a thing before now. I usually write comments in BNF next to my regex, so
that I can make sense of them later. I'm going to keep doing that, but
visualization is going to be great for debugging and for figuring out other
people's less carefully commented regex.

------
Kagerjay
Here's an example of a regex to validate US phone numbers:

[https://regexper.com/#%2F%5E(1%5Cs%3F)%3F(%5B0-9%5D%7B3%7D%7...](https://regexper.com/#%2F%5E\(1%5Cs%3F\)%3F\(%5B0-9%5D%7B3%7D%7C%5C\(%5B0-9%5D%7B3%7D%5C\)\)%5B%5Cs%7C-%5D%3F%5B0-9%5D%7B3%7D%5B%5Cs%7C-%5D%3F%5B0-9%5D%7B4%7D%24%2Fgm)

From

[https://www.freecodecamp.org/challenges/validate-us-telephon...](https://www.freecodecamp.org/challenges/validate-us-telephone-numbers)

I found these 11 videos from Codecourse most helpful when learning regex:

[https://www.youtube.com/watch?v=GVZOJ1rEnUg&index=1&list=PLf...](https://www.youtube.com/watch?v=GVZOJ1rEnUg&index=1&list=PLfdtiltiRHWGRPyPMGuLPWuiWgEI9Kp1w)

------
tzury
Nicely done!

I use [https://www.debuggex.com/](https://www.debuggex.com/) on a daily basis.

If you want to improve your regex skills, Regex Golf is the place to go!
[https://alf.nu/RegexGolf](https://alf.nu/RegexGolf)

------
chatmasta
Nicely done.

Next step, can you do it in reverse? Would be cool to create the regex in a
graphical editor and then generate the actual expression.

Feature request: export the visualization to ASCII, so I can copy-paste it
into a comment above the regex in my code.

------
kawera
Suggestion: put a link to one or two examples to quickly get an idea of what
it does.

~~~
rplnt
And base64 share urls.

------
dbcurtis
I love the graphics. I would like the equivalent for Python and the GNU flex
(lexer) RE's.

No... wait.... I know what I want! I want a Sphinx extension that allows me to
include an RE in a Python docstring and have it render as a railroad track
graphic in the generated documentation:

    
    
      some_re_string = r'ab[0-9]*z'
      """:regex: Any string starting with 'ab', followed by
          digits, ending with 'z'.
      """
    

That should be able to go pick up the documented string and render a railroad
track diagram along with the text in the generated documentation.

------
royal_ts
I'd love to have it the other way around: human speech to RegExp. I find it
really hard to write these expressions. For example, a colleague needed to
write a RegExp for a nickname alias for a URL. It had to have only letters and
numbers or a specific number. So either name34, na34me, or 23 would be valid.

------
ythn
This is very well made. I don't know what I was expecting, but this really
impressed me.

------
kenborge
Does not look like it supports negative lookbehind
[https://regexper.com/#(%3F%3C!a)b](https://regexper.com/#\(%3F%3C!a\)b)

~~~
onion2k
It does do negative lookahead though -
[https://regexper.com/#%2F%5Cd%2B(%3F!%5C.)%2F](https://regexper.com/#%2F%5Cd%2B\(%3F!%5C.\)%2F)

Negative lookbehind only appeared in Chrome in v62
([https://v8project.blogspot.co.uk/2017/09/v8-release-62.html](https://v8project.blogspot.co.uk/2017/09/v8-release-62.html)),
so it's not too surprising that tools haven't caught up yet.
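
For anyone who wants to try the two patterns from this subthread in an engine that supports both constructs, here is a small sketch using Python's `re` module (not the JavaScript engine the site targets); the constant and function names are illustrative.

```python
import re

# Negative lookbehind: a "b" that is NOT immediately preceded by "a".
NOT_AFTER_A = re.compile(r"(?<!a)b")
# Negative lookahead: a run of digits that is NOT immediately followed by ".".
DIGITS_NOT_BEFORE_DOT = re.compile(r"\d+(?!\.)")


def match_starts(pattern, text):
    """Return the start index of every non-overlapping match."""
    return [m.start() for m in pattern.finditer(text)]
```

In `"ab cb b"` the lookbehind pattern skips the first `b` and matches the other two. One subtlety with the lookahead: on `"12.5"` it matches `"1"` and `"5"`, because the engine backtracks from `"12"` to `"1"`, which is followed by `"2"` rather than a dot.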

------
kmill
Back when [https://xkcd.com/1930/](https://xkcd.com/1930/) was posted, I made
a regular expression to create a generator using a regex sampler (for instance
[http://dwickern.github.io/regex-sample/](http://dwickern.github.io/regex-sample/)).

I've put the regex at
[https://gist.github.com/kmill/17c5ef4f99bd9ef7ad799f0b487448...](https://gist.github.com/kmill/17c5ef4f99bd9ef7ad799f0b4874484a)

The amusing thing to me is that this regex visualizer can reproduce the comic.

~~~
Corrado
[https://regexper.com/#%5EDid%20you%20know%20that%20(the%20(f...](https://regexper.com/#%5EDid%20you%20know%20that%20\(the%20\(fall%7Cspring\)%20equinox%7Cthe%20\(winter%7Csummer\)%20\(solstice%7COlympics\)%7Cthe%20\(earliest%7Clatest\)%20\(sunrise%7Csunset\)%7Cdaylight%20savings%3F%20time%7Cleap%20\(day%7Cyear\)%7CEaster%7Cthe%20\(harvest%7Csuper%7Cblood\)%20moon%7CToyota%20truck%20month%7Cshark%20week\)%20\(happens%20\(earlier%7Clater%7Cat%20the%20wrong%20time\)%20every%20year%7Cdrifts%20out%20of%20sync%20with%20the%20\(sun%7Cmoon%7Czodiac%7C\(Gregorian%7CMayan%7Clunar%7CiPhone\)%20calendar%7Catomic%20clock%20in%20Colorado\)%7Cmight%20\(not%20happen%7Chappen%20twice\)%20this%20year\)%20because%20of%20\(time%20zone%20legislation%20in%20\(Indiana%7CArizona%7CRussia\)%7Ca%20decree%20by%20the%20pope%20in%20the%201500s%7C\(precession%7Clibration%7Cnutation%7Clibation%7Ceccentricity%7Cobliquity\)%20of%20the%20\(moon%7Csun%7CEarth's%20axis%7Cequator%7Cprime%20meridian%7C\(international%20date%7CMason-Dixon\)%20line\)%7Cmagnetic%20field%20reversal%7Can%20arbitray%20decision%20by%20\(Benjamin%20Franklin%7CIsaac%20Newton%7CFDR\)\)%5C%3F%20Apparently%20\(it%20causes%20a%20predictable%20increase%20in%20car%20accidents%7Cthat's%20why%20we%20have%20leap%20seconds%7Cscientists%20are%20really%20worried%7Cit%20was%20even%20more%20extreme%20during%20the%20\(bronze%20age%7Cice%20age%7CCretaceous%7C1990s\)%7Cthere's%20a%20proposal%20to%20fix%20it%2C%20but%20it%20\(will%20never%20happen%7Cactually%20makes%20things%20worse%7Cis%20stalled%20in%20Congress%7Cmight%20be%20unconstitutional\)%7Cit's%20getting%20worse%20and%20no%20one%20knows%20why\)%5C.%24)

------
sfkamath
I searched for some JavaScript regexes to test it with and found the regex
chapter from Eloquent JavaScript:
[https://eloquentjavascript.net/09_regexp.html](https://eloquentjavascript.net/09_regexp.html)

Coincidence that the diagrams there seem to be generated by Regexper!

------
shalabajzer
has a bug: {2,4} displayed as 1..3 times

~~~
bhrgunatha
I noticed it translates repeats into a form with a required first
character/group so 5{2,4} became 55{1,3} - see
[https://regexper.com/#5%7B2%2C4%7D](https://regexper.com/#5%7B2%2C4%7D)
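
The two forms accept the same strings, which can be checked quickly. A minimal sketch in Python (names are illustrative):

```python
import re

ORIGINAL = re.compile(r"5{2,4}")
AS_DRAWN = re.compile(r"55{1,3}")  # one required "5", then one to three more


def accepted_lengths(pattern, max_length=8):
    """Lengths n (0..max_length) for which a string of n fives fully matches."""
    return [n for n in range(max_length + 1) if pattern.fullmatch("5" * n)]
```

Both patterns give `[2, 3, 4]`, so the "1..3 times" label is consistent once the extra leading `5` in the diagram is counted.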

------
jdnier
Should I expect to have to escape "/" as "\/"?

~~~
jdnier
Complex regexes do render (must escape "/").

[https://regexper.com/#%5B%5E%3C%5D%2B%7C%3C](https://regexper.com/#%5B%5E%3C%5D%2B%7C%3C)(!(--(%5B%5E-%5D
_-(%5B%5E-%5D%5B%5E-%5D_ -) _-%3E%3F)%3F%7C%5C%5BCDATA%5C%5B(%5B%5E%5D%5D_
%5D(%5B%5E%5D%5D%2B%5D) _%5D%2B(%5B%5E%5D%3E%5D%5B%5E%5D%5D_
%5D(%5B%5E%5D%5D%2B%5D) _%5D%2B)_
%3E)%3F%7CDOCTYPE(%5B%20%5Cn%5Ct%5Cr%5D%2B(%5BA-Za-
z_%3A%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)(%5BA-
Za-z0-9_%3A.-%5D%7C%5B%5E%5Cx00-%5Cx7F%5D) _(%5B%20%5Cn%5Ct%5Cr%5D%2B((%5BA-
Za-z_%3A%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)(%5BA-
Za-z0-9_%3A.-%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)_ %7C%22%5B%5E%22%5D _%22%7C
'%5B%5E'%5D_')) _(%5B%20%5Cn%5Ct%5Cr%5D%2B)%3F(%5C%5B(%3C(!(--%5B%5E-%5D_
-(%5B%5E-%5D%5B%5E-%5D _-)_
-%3E%7C%5B%5E-%5D(%5B%5E%5D%22'%3E%3C%5D%2B%7C%22%5B%5E%22%5D _%22%7C
'%5B%5E'%5D_') _%3E)%7C%5C%3F(%5BA-Za-z_%3A%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)(%5BA-
Za-z0-9_%3A.-%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)_
(%5C%3F%3E%7C%5B%5Cn%5Cr%5Ct%20%5D%5B%5E%3F%5D
_%5C%3F%2B(%5B%5E%3E%3F%5D%5B%5E%3F%5D_ %5C%3F%2B) _%3E))%7C%25(%5BA-Za-
z_%3A%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)(%5BA-
Za-z0-9_%3A.-%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)_ %3B%7C%5B%20%5Cn%5Ct%5Cr%5D%2B)
_%5D(%5B%20%5Cn%5Ct%5Cr%5D%2B)%3F)%3F%3E%3F)%3F)%3F%7C%5C%3F((%5BA-Za-
z_%3A%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)(%5BA-
Za-z0-9_%3A.-%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)_
(%5C%3F%3E%7C%5B%5Cn%5Cr%5Ct%20%5D%5B%5E%3F%5D
_%5C%3F%2B(%5B%5E%3E%3F%5D%5B%5E%3F%5D_ %5C%3F%2B)
_%3E)%3F)%3F%7C%5C%2F((%5BA-Za-z_%3A%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)(%5BA-
Za-z0-9_%3A.-%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)_
(%5B%20%5Cn%5Ct%5Cr%5D%2B)%3F%3E%3F)%3F%7C((%5BA-Za-
z_%3A%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)(%5BA-
Za-z0-9_%3A.-%5D%7C%5B%5E%5Cx00-%5Cx7F%5D) _(%5B%20%5Cn%5Ct%5Cr%5D%2B(%5BA-Za-
z_%3A%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)(%5BA-
Za-z0-9_%3A.-%5D%7C%5B%5E%5Cx00-%5Cx7F%5D)_
(%5B%20%5Cn%5Ct%5Cr%5D%2B)%3F%3D(%5B%20%5Cn%5Ct%5Cr%5D%2B)%3F(%22%5B%5E%3C%22%5D
_%22%7C '%5B%5E%3C'%5D_'))*(%5B%20%5Cn%5Ct%5Cr%5D%2B)%3F%5C%2F%3F%3E%3F)%3F)

[http://www.cs.sfu.ca/~cameron/REX.html](http://www.cs.sfu.ca/~cameron/REX.html)

~~~
jdnier
Horizontal scroll would be useful.

------
jnordwick
I would think it would be better to minimize the pseudo-DFA to collapse
states.

Also, putting each character of an alternation or subset in a light blue box
and then laying them out vertically makes them hard to read.

If the point of this is to make regexes more easily understood, you would think
you would want to make them compact.

------
bart42_0
^I love it$

------
teddyh
In Emacs:

M-x regexp-builder

------
mdip
I'm an rx geek and can often craft what I'm looking to get without a lot of
help, but I have used this tool many times before -- it's very slick. It would
be _nice_ if it supported something other than JavaScript[0], but hey, it's on
the web, it probably makes a lot of sense to be that way (and it's nice that
it's all client-side and I don't have to wait for it to ship my regular
expression back to a server for processing).

Regular expressions are simply awesome and I'm continually surprised at how
frequently I run into developers who have next-to-no understanding of
them. Case in point, I ran into some code a few months ago that spanned two
methods and 20 lines to do something that a 6-character regular expression
could have solved (and would have done so more performantly[1]); the best part
was that part of what I was responsible for handling was a bug that ended up
residing right inside one of those methods. And then there's all of the things
related to "dealing with strings" that many regular expression libraries just
handle, such as "\d" vs "[0-9]" in a world with Unicode strings[3]. It _feels_
cryptic[4] when you encounter it and you're not familiar with the syntax, but
to learn the "80% most useful parts", you needn't study much more than content
that would fit on a single printed sheet of paper (and to get the last 20%,
you'd need, maybe 2, ... 3?).

All of that said, there's also the other side of the coin; if ever the saying
"If all you have is a hammer, everything looks like a nail" had application,
it's with regular expressions. I'm not sure how many times the question "How
do I write a regular expression to parse HTML" has to be answered with
"don't" before folks quit trying[2]. It tends to be the first thing I reach
for when I have a need to process text, even when there are better tools;
heck, _all_ of my find/replace dialogues in every application that supports it
have the "Regex" box checked by default (and it really throws me off when I
hit up "Find" in the browser and need to search for something with a ( or ) in
it which I escape due to muscle memory)

[0] I have an occasional need for PCRE and .NET style; and I really miss
named-groups when I have to do something complex in JavaScript.

[1] While it's easy to accidentally end up in hell, ala
[https://blog.codinghorror.com/regex-performance/](https://blog.codinghorror.com/regex-performance/), poorly
written string-search code can be worse when the complexity of the pattern
you're searching for reaches a certain point, and that's to say nothing of the
errors per x lines of code and readability (not that rx is particularly
readable under complexity).

[2] And hey, I've got a shell script that downloads a few status pages on my
server at home that uses awk with regular expressions to extract values from a
web page. I wouldn't say it necessarily qualifies as "parsing HTML" since it's
really only concerned with looking for a small string which it filters a
second time to get the value -- horribly inefficient, but it's worked for 5
years through page changes without requiring adjustment.

[3] At least in the C# world, "\d" is about twice as slow as "[0-9]" due to
handling digits "correctly":
[https://stackoverflow.com/questions/16621738/d-is-less-effic...](https://stackoverflow.com/questions/16621738/d-is-less-efficient-than-0-9)

[4] While it's usually written cryptically, many (most?) implementations
support flags to ignore whitespace and support comment features. I've had a
few crazy-ugly rx's that I had to use to extract data from a ticketing
system's "blob field" to insert into a structured format; were it not for that
feature, it would have been impossible to write and support.
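
To make the "\d" vs "[0-9]" point from footnote [3] concrete, here is a small sketch using Python's `re` module rather than .NET (the behaviour is analogous for `str` patterns, but the timing claim above is specific to C#). The names are illustrative.

```python
import re

ARABIC_INDIC_42 = "\u0664\u0662"  # the digits 4 and 2 in Arabic-Indic script


def digit_matches(text):
    """Report which digit patterns fully match `text`."""
    return {
        "unicode_d": bool(re.fullmatch(r"\d+", text)),          # \d, Unicode-aware
        "class_0_9": bool(re.fullmatch(r"[0-9]+", text)),       # ASCII digits only
        "ascii_d": bool(re.fullmatch(r"\d+", text, re.ASCII)),  # \d limited to ASCII
    }
```

For `"42"` all three patterns match; for `ARABIC_INDIC_42` only the Unicode-aware `\d+` does, which is the extra work "handling digits correctly" refers to.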

------
antonagestam
Isn't this just debuggex.com? That's hacker but not news.

~~~
freehunter
The guidelines say you can post whatever is interesting to hackers, whether
it's new/news or not.

------
coinerone
My best friend for Regex: [http://www.txt2re.com/](http://www.txt2re.com/)

With code examples!

