Unfortunately if you don't "understand" RegEx it won't help much. It is more for people who already have it down.
For me I am still stuck in copy/paste land. I could never get my head around the "logic" of RegEx, it just seems completely random and arbitrary.
Plus the same characters get reused with multiple meanings (e.g. ^ for both NOT and START).
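For what it's worth, the two meanings of ^ depend only on where it appears. A minimal sketch in Python (the patterns and test strings here are made up for illustration, not taken from the tool):

```python
import re

# Outside brackets, ^ anchors the match to the start of the string.
starts_with_a = re.compile(r"^a")

# As the first character inside brackets, ^ negates the character class.
not_a = re.compile(r"[^a]")

print(bool(starts_with_a.search("abc")))  # True: the string starts with "a"
print(bool(starts_with_a.search("bac")))  # False: the "a" is not at the start
print(not_a.findall("banana"))            # ['b', 'n', 'n']
```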
By the way, could someone explain this to me - do regexes match a string if a part of a string matches the regex, or if the whole string does?
For example, this regex "([+|-]* )(\d+[,|.]* )(\d+)(\.)?(e)?(\d+)" (intended to match decimal/scientific numbers - see http://regexone.com/example/0) matches "720p" on regexone, but fails to match on debuggex. So it seems like it varies depending on some configuration - is that right?
As for your question, it generally depends on how they're called, e.g. in Python's re module, whether you call search() or match(). You can force a regex to match the entire string by adding ^ at the start and $ at the end.
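A rough illustration of that difference in Python, using a simplified stand-in pattern (\d+) rather than the full number regex from the question:

```python
import re

pattern = re.compile(r"\d+")  # simplified stand-in, not the regex from the question
text = "720p"

# search() succeeds if any part of the string matches.
print(bool(pattern.search(text)))        # True: finds "720"

# match() only anchors at the start; trailing text is allowed.
print(bool(pattern.match(text)))         # True: "720" is at the start

# fullmatch() requires the whole string to match.
print(bool(pattern.fullmatch(text)))     # False: the "p" is left over

# Adding ^ and $ gives whole-string behaviour even with search().
print(bool(re.search(r"^\d+$", text)))   # False
```

So a tool that reports "720p" as a match is probably doing a search, and one that rejects it is probably doing a whole-string match.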
Seriously thank you.
I'm thinking of doing some tutorials geared towards teaching students in grade school how to use them. I think a visual representation would help significantly.
I find it sort of sad that several people have responded by linking to their preferred (but clearly inferior) Regex pages, which detracts from the accomplishment of this one.
The way I learned RegEx was simply spending 2 work days writing a parser with it. I think the problem is that there is a moment when RegEx suddenly makes sense, and you cannot understand how anyone can be confused by it (even when you yourself were confused just 5 minutes ago).
This kills the browser.
A good read on executing regular expressions in linear (and thus predictable) time is http://swtch.com/~rsc/regexp/regexp1.html
Many other algorithms have exponential edge cases. This can open you up to DoS attacks if you accept regular expressions from users (e.g. in a search feature).
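To get a feel for the exponential edge case, here is a small sketch in Python. The pattern ^(a+)+$ is an assumed textbook example of nested quantifiers, not something taken from the tool, and the timings depend on the engine (CPython's re module is a backtracking matcher):

```python
import re
import time

# Assumed example pattern: the nested quantifiers let a backtracking engine
# try a huge number of ways to split the run of "a"s before giving up.
NESTED = re.compile(r"^(a+)+$")

def time_failed_match(n):
    """Return the seconds spent failing to match n "a"s followed by a "b"."""
    subject = "a" * n + "b"
    start = time.perf_counter()
    result = NESTED.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing "b" can never satisfy the $ anchor
    return elapsed

# Keep n small: on a backtracking engine the time grows very quickly with n.
for n in (14, 16, 18, 20):
    print(n, f"{time_failed_match(n):.4f}s")
```

An automaton-based matcher of the kind described in the rsc article would handle the same input in time proportional to its length.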
2) A debugger has to be true to the input. If the user wants to debug (a) it doesn't help that the debugger just casually transforms it into a*. That wouldn't make the diagrams fun at all.
I do however agree that it's a pity how many good regular expressions are run on stupid backtracking systems out there.
1) Back references do not mean you need to have exponential edge cases for vanilla REs
2) There is no one true way to execute an RE. There are good ways and bad ways, though.
3) The Thompson algorithm does not preclude non-regular extensions.
Anyway, just wanted to add the rsc link to the discussion :)
To replicate the crash on your own, try typing:
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab'.match(/^(a)$/) in your console.
Anyway, I don't consider it a "bug" or anything, just wanted to bring up the rsc paper for discussion :) Keep up the good work!
Surprised no one brought it up
It just feels right to me. It explains your regex to you, which, in my mind, is a much better way to debug a regex than to supply a large set of test strings.
My entry into this category:
Important caveat: it makes use of a hidden Java applet, so that it can support the somewhat larger Java regex syntax, doesn't send your data anywhere else for matching, and can hook into the string-probing to animate the process. So dig out whatever browser you use for Java applets (if you still have one) to test.
Regarding the animation, click the 'animate?' link to show the animation step/speed controls. For example, you can watch the regex that tests whether a number is prime (by failing) or composite (by succeeding) via these two animations:
I really want to get rid of the applet requirement; I might someday cross-compile the JDK7 regex support to JS so that the full syntax and animation can still be supported, without an applet.
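For anyone without an applet-capable browser, the prime-testing trick can be tried in a few lines of Python. This is a sketch using the commonly cited form of the regex; I'm assuming it is close to what the animations show, and the helper name is_prime is just for illustration:

```python
import re

# Matches a string of "1"s whose length is k * m with k >= 2 and m >= 2:
# the group captures k ones, and the backreference repeats it at least once.
COMPOSITE = re.compile(r"(11+?)\1+")

def is_prime(n):
    """Treat n as prime when the composite regex fails on n ones."""
    if n < 2:
        return False  # 0 and 1 are neither prime nor matched by the regex
    return COMPOSITE.fullmatch("1" * n) is None

print([n for n in range(30) if is_prime(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

It relies on a backreference, so it needs a backtracking engine and gets slow quickly as n grows.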
Regex: TVo.\d.* [Aa] ..[^k]
Test string: TVo1-0:01.0-1:01.0 A Nashville
However, if you use the slider to slide to just past the "s" in Nashville, you can see that the end state does indeed light up.
Just use: TVo.\d.* [Aa] ..[^k].*
and it works
When I need to haul out the big guns, I load up RegexBuddy in a Wine bottle and dump a screenful of text into it along with the regex to figure out where I went wrong.
They have a very different way of visualizing the step by step, but both are great tools.
However, only exact matches are supported for the first release. I wanted to get user feedback before I built any more features. I think I have an intuitive way to visualize findAll() type matches.
Sadly it doesn't seem to understand (?i).
I think these subtle differences lead to a lot of confusion when users are not aware that the underlying implementation is different from what they are used to.
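As one concrete case, (?i) is an inline flag that some engines accept and others don't. A small sketch of how Python's re module treats it (the pattern and test strings are made up for illustration):

```python
import re

# Inline flag: case-insensitivity is requested inside the pattern itself.
inline = re.compile(r"(?i)tvo")

# The same request made through the API instead of the pattern.
flagged = re.compile(r"tvo", re.IGNORECASE)

for s in ("TVo1", "tvo1", "TVO1"):
    print(s, bool(inline.match(s)), bool(flagged.match(s)))
```

In an engine without inline flags, the first form either fails to parse or has to be rewritten using whatever flag syntax that engine provides.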
Besides that, apologies for it stretching sideways and making you scroll. Will be fixed in a future release!
Here's another one http://ocpsoft.org/tutorials/regular-expressions/java-visual...
Done with GWT and Errai, source here: https://github.com/ocpsoft/regex-tester/tree/master/src/main...
Suggestions: the regex reference could use a distinguishing feature such as a subtle light grey background and/or a line separating it, along with more whitespace.
Also, the boxes seem arbitrarily placed. I realize one is centered, and the others take up the remaining space on the next "line", but perhaps you could create better visual boundaries or something.
Lastly, apologies, but while the font Lato may look nice on your setup, it's rather jagged and unappealing on Windows.
The regex reference is only there temporarily. It'll eventually be replaced by a much better feature which is in the pipeline. I'll play around with the CSS to make it better.
I've played around with the positioning a bit, and it definitely needs iteration. However, an upcoming UI change will drastically change the demands on the UI, so it doesn't make sense to optimize that yet.
I'll replace the Lato font. I agree it looks terrible on Windows.
Thanks for all the feedback!
Unless you want to do that match-across-newlines witchcraft.
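In Python at least, the newline behaviour comes down to a couple of flags. A minimal sketch with a made-up two-line string:

```python
import re

text = "first line\nsecond line"

# By default "." does not match a newline, so this cannot span both lines.
print(re.search(r"first.*second", text))                   # None

# re.DOTALL lets "." match newlines as well.
print(bool(re.search(r"first.*second", text, re.DOTALL)))  # True

# re.MULTILINE makes ^ (and $) match at every line boundary.
print(re.findall(r"^\w+", text, re.MULTILINE))             # ['first', 'second']
```

Other engines spell these options differently, so check what the tool you are using expects.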
One quick UI note: The reference table is much easier to read if the lines are left-aligned. With centering and two columns, it's hard to tell at first which descriptions the escape sequences belong to.
Chrome, mac os.
I think I will add this to the URLs that I know off the top of my head.
A set of fail strings would be useful; it's something no one else does, but it's vital for a good user experience.
How would you recommend generating fail strings? The space of failing strings is really large. From my discussions with users, they usually have a specific failing string and they want to see why it doesn't work.
Linting is a planned feature for a future release.
I'm sure they can do better: next, please give us the ability to use a tiny URL directly from within the domain (i.e. don't force me to go to bit.ly or some other nonsense).