
Show HN: RegEx for Regular Folk – A visual, example-based introduction - shreyasminocha
https://refrf.shreyasminocha.me
======
m463
I think what regexes need is a really powerful syntax- and language-aware regex
editor.

I've been using regexes for most of my career, and still struggle to get them
right on first writing.

The #1 problem I run into is:

what is a literal character and what is a control character?

for example, both these are very common:

- match a parenthesis character or a period character

- use a parenthesis to group a match or use a period to match any one
character

You would think I would learn it once, and be good.

but my #2 problem confounds this:

what is a literal character and what is a control character - in the language
I am using?

for example I might need to escape a period to make it a literal for a regex.

If I am checking the files filexc and file.c and want to match the second, the
regex I want is

    
    
      ^.*\.c$
    

in perl, I could say:

    
    
      $rx = "^.*\\.c\$";    # \$ because $" is a Perl variable
      if ($f =~ /$rx/) { ...
    

better would be:

    
    
      if ($f =~ /^.*\.c$/) {...
    

in python I would write

    
    
      m = re.search("^.*\\.c$",f)
    

better would be:

    
    
      m = re.search(r'^.*\.c$', f)
    

in a shell script, I might say:

    
    
      grep "^.*\\.c$"
    

EDIT: crap, I had to escape _my comment_ because the asterisk in the regex was
making my text italic
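
A minimal Python sketch of the two escaping layers described above. The file names come from the comment; the helper name `matches` is made up for illustration.

```python
import re

# The same pattern written two ways; Python ends up with one backslash in each.
escaped = "^.*\\.c$"   # ordinary string: the backslash itself must be escaped
raw = r"^.*\.c$"       # raw string: backslashes are passed through untouched
assert escaped == raw

def matches(pattern: str, name: str) -> bool:
    """Return True if `pattern` matches somewhere in `name`."""
    return re.search(pattern, name) is not None

for name in ("file.c", "filexc"):
    # With the escaped period only file.c matches; with a bare period both do.
    print(name, matches(raw, name), matches("^.*.c$", name))
```

The last column shows why the escape matters: an unescaped period happily matches the `x` in `filexc`.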

~~~
dntbnmpls
The biggest problem I've noticed for regex is we use it every once in a while
and once it works we move onto other things. And a few weeks/months later, we
have forgotten much of it and have to relearn it all over again. Whereas you
generally use your programming language (C++, C#, Java, etc.) every day, which
keeps your skills sharp, regex is generally a "set it and forget it" situation
for most people. And as you noted, different languages/shells/etc. implement
their own flavor of regex, which can trip you up.

It's similar to SQL when you think about it. You set up a query to get the
data you need and move on to other things. And every RDBMS implements its
own flavor of SQL, which can complicate things.

~~~
csunbird
Please do not forget the fact that after a couple of months, you want to make
a small change, but you forgot the edge cases when you first created the regex
:D

~~~
b3kart
Unit tests will help with this.

~~~
m463
I would have to unit-test my command line commands :)

~~~
b3kart
If they are critical: why not? If they aren't: you can live with missing a
corner case or two. :-)

------
nobrains
Hi, very nice RegEx educational site/book.

Feedback:

- In the chapter
[https://refrf.shreyasminocha.me/chapters/character-classes](https://refrf.shreyasminocha.me/chapters/character-classes)
an example is given which uses:

    
    
      o ^ character outside brackets
    
      o $ end of line
    
      o +
    

But the explanation above does not introduce these yet, so a real beginner
user (like me) is lost. The ambiguous characters example is fine, since it
uses only the concepts already explained.

~~~
shreyasminocha
Thanks. Yeah, others pointed this out too. I'll get to it soon!

~~~
bathory
Just wanted to chime in and also say that this is extremely confusing for a
RegEx beginner, but thanks for your work, it looks really nice.

------
LeonB
I like the example based approach. I learn from examples far quicker than I
learn from “explanations”. If I attempt to learn from an example and my brain
hits an exception, only then do I start reading the supporting text.

Nice approach. You’ve made a valuable thing and implemented a powerful idea.

~~~
ehsankia
I honestly wish a lot more documentation started like that, with a bunch of
examples. I think one I really enjoyed recently was attrs [0].

[0]
[https://www.attrs.org/en/stable/examples.html](https://www.attrs.org/en/stable/examples.html)

~~~
kmundnic
I've been using tldr [1] instead of man pages lately to get started with a
command (or to remind myself how to use one). I've learned a lot just by
reading the examples shown, and then read the man pages if I am missing
something.

[1] [https://github.com/tldr-pages/tldr](https://github.com/tldr-pages/tldr)

------
wonnage
I personally had the most trouble with regexes because I didn't have a good
mental model of how they worked. The hard part wasn't finding the correct
symbol/character class I was trying to match, but coming to grips with
repetition, greedy/nongreedy, etc.

I took a compilers class in college where one of the projects was to implement
a simple regex matcher using NFAs. Bashing my head against this for a week
really helped with being able to "read" a regex. Not sure if this was due to
finally understanding the algorithm, or the fact that I was just constantly
staring at broken regex matches all day.

IMO it was a fairly small time investment for something that is so widely
used.

I'll recommend this post that's been on HN many times:
[https://swtch.com/~rsc/regexp/regexp1.html](https://swtch.com/~rsc/regexp/regexp1.html)
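
For the greedy/non-greedy part, a tiny Python sketch (the sample text is made up) shows the difference in one place:

```python
import re

text = "<a><b>"
greedy = re.search(r"<.+>", text).group()   # .+ grabs as much as it can
lazy = re.search(r"<.+?>", text).group()    # .+? stops at the first possible >
print(greedy)  # <a><b>
print(lazy)    # <a>
```

Both patterns start matching at the same place; they only differ in how far the repetition is allowed to run before the closing `>` is tried.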

~~~
the-pigeon
I'm not sure how to explain it but the most important thing I've learned in
over a decade of programming experience is to not use regular expressions for
many things that may seem like a regular expression problem.

For example, even something as simple as a phone number can have all sorts of
weird but valid variants. Be sure you really need to validate its format
and not just that it's present.

Trying to handle all of those variants via a regex is doable but a
pain. And in practice you as the programmer should not be defining those
variants that are valid as it's up to the business itself to define what type
of data it considers to be valid for the field.

That said I've also worked for companies with small engineering teams where
the goal has always been to be as efficient with development time as possible,
as opposed to making a near ideal system. Software has different needs when
it's used by a thousand people than when it's used by millions.

------
pmarreck
One thing not mentioned here which I think is good to be aware of as you write
intermediate to advanced regexes is understanding "catastrophic backtracking"
and how to mitigate it: [https://www.regular-
expressions.info/catastrophic.html](https://www.regular-
expressions.info/catastrophic.html)

For some reason I enjoy figuring regexes out. What I usually do is TDD them, I
have a mini test suite of examples of strings I want to match and strings I
don't want to match and I write some code to apply a candidate regex to them
all and validate, and then I iterate until it passes. Then I rewrite the regex
in extended regex format and add comments so that _other people or future me_
understand what's going on.
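
A rough Python sketch of that workflow, assuming a made-up target (three dot-separated numbers). The names `VERSION`, `SHOULD_MATCH`, `SHOULD_NOT_MATCH` and `failures` are illustrative, not from any library.

```python
import re

# Hypothetical target: three dot-separated numbers such as "1.2.3".
# re.VERBOSE ignores whitespace in the pattern and allows # comments.
VERSION = re.compile(r"""
    \d+     # major number
    \.      # literal dot
    \d+     # minor number
    \.      # literal dot
    \d+     # patch number
""", re.VERBOSE)

SHOULD_MATCH = ["1.2.3", "10.0.12"]
SHOULD_NOT_MATCH = ["1.2", "1.2.3.4", "v1.2.3", "1.2.x"]

def failures(pattern):
    """Return the example strings on which `pattern` gives the wrong answer."""
    wrong = [s for s in SHOULD_MATCH if not pattern.fullmatch(s)]
    wrong += [s for s in SHOULD_NOT_MATCH if pattern.fullmatch(s)]
    return wrong

print(failures(VERSION))  # an empty list means every example passed
```

Iterating then means editing the pattern until `failures` comes back empty, and adding a new example string each time a bug turns up.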

Doing what a good regex can do with regular code instead (which you might do
with the goal of readability or maintainability) is usually much much MUCH
slower, FYI

------
saberworks
This looks really nice but I think it suffers from the same issue a lot of
regex tutorials suffer from. It's focusing solely on the regex and not at all
on how to actually execute them. This site in particular says it's going to
use javascript but at least the first few pages don't show anything except raw
regular expressions.

For any tutorial about regular expressions I think the second thing (beyond a
very simple example regex) to show should be how to actually execute one in
code. Is it that all the tutorials want to be language-agnostic? Maybe just
show a javascript example and point out which part is the js function/method
call and which part is the actual regex.

It's nice to be told what /[aeiou]/ means but without actually typing it in
and executing it (against various inputs, not just one) it wouldn't really
sink in for me.
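
Something like the following is what I mean, sketched in Python rather than the site's JavaScript (the word list is made up). The regex is only the part inside the quotes; `re.compile` and `search` are the language's way of running it.

```python
import re

vowel = re.compile(r"[aeiou]")   # the regex itself is only what is inside the quotes

for word in ["cat", "rhythm", "sky", "queue"]:
    m = vowel.search(word)       # search() is the Python method that runs it
    print(word, "->", m.group() if m else "no match")
```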

~~~
jehlakj
Good. Unless you’re trying to write a one off script, you shouldn’t manually
use them in real world projects. They’re a big source of bugs

~~~
ben509
Every compiler and interpreter as well as all text formats like HTML, XML,
JSON, etc. have a lexing pass that uses manually crafted regular expressions.
Those are real world projects.

The defensible form of this argument is that one should prefer a serialization
library or a properly normalized database rather than trying to "stringly
type" data and then pull it out via regexes.

------
asicsp
Neatly presented.

However, I'd suggest reorganizing the chapters so that features not yet
introduced aren't shown in examples without explanations. For example, you
explain anchors and quantifiers many chapters later but use them liberally in
earlier chapters without explaining them.

~~~
shreyasminocha
Yep, thanks for pointing that out. I was finding it tricky to present features
in isolation without making the examples trivial.

I'll work on making things clearer.

~~~
nicoburns
I wouldn't worry too much about making the examples trivial. That just makes
it easy to learn! There are probably lots of good orders, but I'd probably go
with something like:

- Literal strings
- Optional characters
- Optional strings of characters (using groups)
- Alternations (using groups)
- Repetitions (using groups)

Then move on to things like character classes.

IMO character classes are quite an advanced feature (or at least confusing for
beginners) because of being character-oriented. They also don't tend to be very
useful unless you've already covered repetition.

------
twicetwice
This looks like a great resource! Like others, I vastly prefer an example-
based style, and the examples are really well chosen and very illustrative. I
generally think I know my regexes, but I've already learned a few tricks.
(Backreferences to match different delimiter options but not mixed delimiters
is very cool!)
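
For anyone who hasn't seen the trick, here is a Python sketch of the idea. The date format is an assumption for illustration; the site's own example may differ.

```python
import re

# ([-/.]) captures whichever delimiter comes first; \1 demands the same one again.
date = re.compile(r"\d{4}([-/.])\d{2}\1\d{2}")

for s in ["2020-04-01", "2020/04/01", "2020.04.01", "2020-04/01"]:
    print(s, bool(date.fullmatch(s)))
```

Only the last string is rejected, because its second delimiter does not repeat the first.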

Feedback:

The highlighting of matches is slightly shifted to the left for me in Firefox
75 but not in Chrome (both on Ubuntu 16.04). The shift is subtle but enough to
make me have to look two or three times at most examples, as the highlight
covers half of the character before the match and only half of the last
character in the match. Can I suggest adding Firefox to your test regimen, if
you haven't already? :)

Also, on the Anchors page, I believe "carat" should be spelled "caret."

Thanks for this once again! I will definitely be revisiting this site to brush
up and learn new tricks. Especially lookaround, which I have never quite
wrapped my head around!

~~~
shreyasminocha
> The highlighting of matches is slightly shifted to the left … Can I suggest
> adding Firefox to your test regimen, if you haven't already? :)

Oh, I thought I had fixed that. I primarily test with Firefox, so this is a
bit of a surprise. I'll check it out—I think it's something to do with CSS's
`letter-spacing`.

I've fixed the typo, thanks for pointing it out.

Thanks for the comments!

------
ggm
Your examples use * and $ and + before they are explained. Inductive learning
goes smoother if new concepts have context.

You explain [^ ...] So the use of these examples without explanation is ..
unexpected. If you use examples which don't depend on * or + or $ I agree it's
'boring' but for a class of learner these surprise moments interfere with
learning.

You only casually mention that a capitalised \thing is the inversion of \thing
(\d and \D). I think you would want to repeat that for \w and \W, and \s and \S;
after three, it's established.
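
A quick Python sketch of the three pairs side by side (the sample text is made up):

```python
import re

text = "a1 b_2"
pairs = [(r"\d", r"\D"), (r"\w", r"\W"), (r"\s", r"\S")]

for lower, upper in pairs:
    # Each character of the text lands in exactly one of the two lists.
    print(lower, re.findall(lower, text), upper, re.findall(upper, text))
```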

I see this a lot in e.g. Haskell tutorials: simple inductive constructive
learning examples littered with 'oh I explain that later just ignore it for
now' syntactic constructs.

\( and \) are dangerous in substitution. Their meaning shifts from regex to
variable-marker. Surely this needs to be noted in passing?
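
To make the shift concrete, a small sketch in Python's flavor, where bare parentheses group and backslash-parentheses are literal (sed's basic regexes do it the other way round). The sample strings are made up.

```python
import re

# Bare parentheses capture; \2 and \1 in the replacement refer back to the groups.
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Lovelace, Ada")
print(swapped)    # Ada Lovelace

# Escaped parentheses match literal ( and ) characters instead.
bracketed = re.sub(r"\((\w+)\)", r"[\1]", "f(x)")
print(bracketed)  # f[x]
```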

------
vasili111
Great regex tutorial: [https://www.princeton.edu/~mlovett/reference/Regular-
Express...](https://www.princeton.edu/~mlovett/reference/Regular-
Expressions.pdf)

Good regex book:
[https://www.amazon.com/gp/product/0596528124/](https://www.amazon.com/gp/product/0596528124/)

Good regex website: [https://www.regular-
expressions.info/](https://www.regular-expressions.info/)

Interesting regex links: [https://github.com/aloisdg/awesome-
regex](https://github.com/aloisdg/awesome-regex)

~~~
kccqzy
On the implementation's side, Russ Cox's articles are pretty indispensable:

[https://swtch.com/~rsc/regexp/regexp1.html](https://swtch.com/~rsc/regexp/regexp1.html)

[https://swtch.com/~rsc/regexp/regexp2.html](https://swtch.com/~rsc/regexp/regexp2.html)

And actual implementations based on these articles:
[https://github.com/google/re2](https://github.com/google/re2) and
[https://github.com/rust-lang/regex](https://github.com/rust-lang/regex)

~~~
vasili111
More articles and notes from Russ Cox:
[https://swtch.com/~rsc/regexp/](https://swtch.com/~rsc/regexp/)

------
shanecoin
RegExr [0] does a great job of showing individual highlights even when they
are in a sequential string. You can try to implement this if you want instead
of showing a callout with a note to let the reader know that the highlights
should be on individual characters.

[0] [https://regexr.com/](https://regexr.com/)

------
backzerman
Constructive criticism: I was about to send this to a friend who is new to
programming, but the introduction is just too short. It would be great if the
introduction included one or two motivational examples for the types of
trouble you run into when you _don't_ know regular expressions.

~~~
shreyasminocha
Makes sense—thank you! I'll add that in.

------
tragomaskhalos
Final exam here: [https://www.i-programmer.info/news/144-graphics-and-
games/54...](https://www.i-programmer.info/news/144-graphics-and-
games/5450-can-you-do-the-regular-expression-crossword.html)

:)

~~~
donaldihunter
And [https://regexcrossword.com/](https://regexcrossword.com/)

~~~
shreyasminocha
Yep! I included it on the next steps page —
[https://refrf.shreyasminocha.me/chapters/next-
steps](https://refrf.shreyasminocha.me/chapters/next-steps). Great fun.

------
sakekasi
[https://regex101.com/](https://regex101.com/)

This site is my go-to whenever I need to write a complex regex. It's got syntax
highlighting, explanations and a tester all rolled into one!

------
evo_9
Just some 2cent feedback - don't assume anything is known.

The BASIC lesson doesn't mention anything about /g. Having not touched regex
in years I had no idea what that was and kept thinking 'why isn't he showing
it matching a g if he has that in the example'.

~~~
shreyasminocha
Point. I've made the temporary omission of an explanation for /g explicit.
I've also included a link to the relevant section of Flags in the note.

------
canada_dry
Decent guide. It's great that all the examples are linked to
[https://regex101.com/](https://regex101.com/) so people can play/explore!

More regex resources I rely on:

[http://www.regexr.com/](http://www.regexr.com/)

[https://gchq.github.io/CyberChef](https://gchq.github.io/CyberChef)

[https://regexper.com/#.%3F%5Bv%2Ci%5D.*](https://regexper.com/#.%3F%5Bv%2Ci%5D.*)

[https://cheatography.com/davechild/cheat-sheets/regular-
expr...](https://cheatography.com/davechild/cheat-sheets/regular-expressions/)

~~~
shreyasminocha
The examples are linked to RegExr! There's a link titled `[RegExr]` in blue
next to each example.

Also, those are some amazing resources, especially CyberChef.

~~~
canada_dry
Corrected! ( _Covid brain_ ) Thanks.

------
donaldihunter
This is nicely done. It could benefit from some non /g examples on the basics
page, especially since flags are not covered until chapter 8.

One visual enhancement that could be really helpful would be to hover over the
regex or the match and see the reciprocal highlighted.

------
filmgirlcw
I love this! I love RegEx but have struggled trying to “teach” others over the
years. In addition to books like this, I often find writing RegEx with
something like Expressions[1] (and I know there are many great website
solutions, Expressions is just a great app that I find very approachable to
newcomers) is a great way to learn. When you can see what you’re writing
select what you want, you get a great grasp of how it works. This, alongside a
good book with good examples, is pretty much how I learned RegEx ~12 years
ago.

[1]:
[https://www.apptorium.com/expressions](https://www.apptorium.com/expressions)

------
binstub
Nice intro. Tangential question. Is there a regex tool that shows where the
expression failed ? Not in syntax, but the logical failure point? Would be
useful for when an expression gets a little long and nested and modifications
need to be made.

Edit: I mean like:

Target text is abcde

Regex is /abe/

Is there a tool that will tell me it matched a and b and then failed trying to
match e ?

Those sites are great resources but they are showing pass/fail and do show an
excellent breakdown when something satisfies the expression, but I’m just
wondering if there is something that shows partial matching until the failure
point?

~~~
bmn__
You want a debugger.

[http://p3rl.org/rxrx](http://p3rl.org/rxrx)

rxrx -e'"abcde" =~ /abe/'

Demo: [https://blog-cloudflare-com-
assets.storage.googleapis.com/20...](https://blog-cloudflare-com-
assets.storage.googleapis.com/2019/07/23-steps-1.gif)

[http://p3rl.org/re#'debug'-mode](http://p3rl.org/re#'debug'-mode)

perl -Mre=debug -e'"abcde" =~ /abe/'

----

[https://stackoverflow.com/questions/2348694/how-do-you-
debug...](https://stackoverflow.com/questions/2348694/how-do-you-debug-a-
regex)

~~~
binstub
Wow, where have I been. That’s awesome. Exactly what I was looking for .
Thanks. My Google fu obviously is lacking.

------
ben509
It might be nice to touch on composition as a good way to get started is to
test out individual pieces and be confident they work when you're putting them
together.

If you're building a complex regular expression, setting smaller parts in
variables and dropping them in with (?:${part}) makes things a bit more
readable.
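
A sketch of that in Python, using an rf-string where JavaScript would use a template literal. The part names and the date format are made up for illustration.

```python
import re

year = r"\d{4}"
month = r"0[1-9]|1[0-2]"
day = r"0[1-9]|[12]\d|3[01]"

# Wrapping each part in (?:...) keeps its alternation from leaking into neighbours.
date = rf"(?:{year})-(?:{month})-(?:{day})"
naive = rf"{year}-{month}-{day}"   # same parts, no grouping

print(bool(re.fullmatch(date, "2020-04-30")))   # True
print(bool(re.fullmatch(date, "2020-13-01")))   # False
print(bool(re.fullmatch(naive, "31")))          # True: the last | branch matches alone
```

Each part can be tested on its own first, which is the point: the grouped version behaves like the sum of its pieces, the ungrouped one does not.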

It also exposes a real weakness of most regex engines. In particular,
alternation is a first-class operation, but complement and intersection, while
theoretically possible[1] are typically not.

A person might guess that the way to match three keywords is
`/.*keyword1.*&.*keyword2.*&.*keyword3.*/`

Or maybe `/.*keyword1.*&(.*keyword2.*)!/` to match keyword1 and not
keyword2.

But those won't work, so it's a good idea to explain some options, an obvious
one being /keyword1/.test() && !/keyword2/.test()

In the section on lookaround assertions, it's probably useful to note that
(?=thing1)(?=thing2) can match both, and it's a good mental model for it, but
that it comes with a few gotchas.
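
A Python sketch of both options, with placeholder keywords. The names `both`, `one_not_other` and `plain` are made up for illustration.

```python
import re

# Each lookahead scans from the start without consuming anything,
# so the conditions are effectively ANDed together.
both = re.compile(r"^(?=.*keyword1)(?=.*keyword2)")
one_not_other = re.compile(r"^(?=.*keyword1)(?!.*keyword2)")

def plain(text):
    """Same test as one_not_other, written as two separate substring checks."""
    return "keyword1" in text and "keyword2" not in text

for text in ["keyword2 then keyword1", "only keyword1", "neither"]:
    print(text, bool(both.search(text)), bool(one_not_other.search(text)), plain(text))
```

One of the gotchas: `.` does not match a newline by default, so on multi-line text the lookahead version needs `re.DOTALL` to agree with the plain checks.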

[1]:
[https://www.researchgate.net/publication/220994310_Succinctn...](https://www.researchgate.net/publication/220994310_Succinctness_of_the_Complement_and_Intersection_of_Regular_Expressions)

------
parhamn
I sometimes wonder what a syntactically clarified regex could look like. There
are three things that often confuse newcomers:

- What escapes are and what needs to be escaped?

- The <character-class><repetitions> structure of a regex.

- Syntax around things like capture (are the parens part of some matcher? what
to escape?)

We should have a version of regex that separates characters, character classes
and operators, or whatever the regex jargon for those things is. Half the
things I usually want to regex for, like parens on a function or dot accessors
need to be escaped!

A quick example for illustration purposes (please don't point out why this
grammar won't map to regex):

    
    
        <startofline>(['a' or 'b']<2,4,greedy>, captureAs="prefix")[number or '.']<2><endofline>
    
    

is definitely more approachable and easier to explain than the regex
equivalent (which I'm not writing out because I don't have time to test if I
got capture syntax right).

Maybe someone makes a wasm regex-simple transformer we can use in multiple
languages. Regex is too useful to have such a scary syntax for newcomers!
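
For comparison, one possible reading of the made-up grammar above as a real regex, sketched in Python. The mapping is a guess (in particular, "number" is taken to mean a single digit), and the sample strings are invented.

```python
import re

# ^ and $ for <startofline>/<endofline>, {2,4} for <2,4,greedy>,
# (?P<prefix>...) for captureAs="prefix", [0-9.] for [number or '.'].
pattern = re.compile(r"^(?P<prefix>[ab]{2,4})[0-9.]{2}$")

m = pattern.search("abba1.")
print(m.group("prefix"))          # abba
print(pattern.search("a12"))      # None: the prefix needs at least two of a/b
```

Without `re.MULTILINE`, `^` and `$` here anchor to the start and end of the whole string rather than of each line.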

~~~
yoz-y
I think most people just like to hate on regex syntax because when just
glanced over it looks like spilled tea leaves.

However I'd argue that it's not actually very hard to learn and its brevity
makes it easier to retain. (personally I did so using [https://www.regular-
expressions.info/tutorial.html](https://www.regular-
expressions.info/tutorial.html))

I agree that escaping is a problem, mainly because languages have often
different rules for this.

------
120bits
I started teaching Python to my GF (she is working from home and now has plenty
of time to do some extra learning). She is not a programmer and I have been
giving her small functions to write. Recently, we started with RegEx and she
finds it really hard to get into. She wants to see examples and follow along.
I think this will be perfect for her and anyone starting out to learn regex.

------
trevor-e
FWIW Cisco Umbrella is blocking/reporting your site as a security threat.

~~~
shreyasminocha
Very odd, no idea. Perhaps because the certificate is just a few days old?

------
appleflaxen
Hey this is great!

I noticed that in the "Escapes" chapter, the "Next" link at the bottom of the
page goes back to the introduction when it should go to "Groups".

I poked around in your repo to try and submit a pull request, but I can't tell
where the edit needs to go; the meta.json file seems to have an array with the
right chapter headings which was my guess about the problem.

Anyway, there is a typo. Sorry I can't be more helpful when you've put
together such a great resource.

[https://github.com/shreyasminocha/regex-for-regular-
folk/blo...](https://github.com/shreyasminocha/regex-for-regular-
folk/blob/c49e2fa5dcfb2ff55792b14733c7115ecb50f7c9/meta.json)

~~~
shreyasminocha
Yep! Sorry, I broke that late last night. I've fixed it with
[https://github.com/shreyasminocha/regex-for-regular-
folk/com...](https://github.com/shreyasminocha/regex-for-regular-
folk/commit/21c726bfe1c0118da9ed7883872165cabe11d312).

It was fairly unintuitive for someone unfamiliar with the source, so no
worries :D

------
asutekku
It seems the “next” button is not working on mobile safari on ios 13. It just
reloads the page you’re in. Tried with and without content blockers.

However, from what I quickly read from the links on the front page, the
tutorials themselves seem really high quality!

~~~
shreyasminocha
Fixed it! Thanks.

------
vongomben
Link saved for later. Looks really well done!

------
mycall
Recursive RegEx has always been confusing to me.

[https://www.rexegg.com/regex-recursion.html](https://www.rexegg.com/regex-
recursion.html)

~~~
asicsp
Railroad diagrams may help. I use a three step approach to present one example
of recursion, which includes showing the difference between two-level nesting
vs recursive version [1]. This is the railroad visualization for two-level
[2].

[1]
[https://github.com/learnbyexample/py_regular_expressions/blo...](https://github.com/learnbyexample/py_regular_expressions/blob/master/py_regex.md#recursive-
matching)

[2]
[https://www.debuggex.com/r/SMLRfiyt0Ag2hXu5](https://www.debuggex.com/r/SMLRfiyt0Ag2hXu5)

------
phillipseamore
Very clear and concise with simple presentation. Good job!

------
Pxtl
I've always wanted to introduce regexes to non-programmers because they have
always struck me as staggeringly useful. Just simple things like find/replace
in my filenames and find/replace in documents.

This guide, along with a simple web-based regex tester would be great for
this...

But it's missing a 3rd part: regex plugins for common non-programmers tools,
like for ms office, the windows explorer, etc.

------
scottfits
Incredible resource - I liked the structured approach as opposed to guess and
check regex which is what most tools offer

------
bane
This is a nice, non-threatening intro. One piece of advice that I've learned
from teaching people regex syntax is that it's much easier to keep it to three
basic topics at first (Repetition, Alternation, Concatenation), and then
describe the rest of the stuff (character classes, character escapes, etc.) as
"syntactic sugar" that makes those previous topics simpler, or provides more
power. I usually introduce groups pretty early but most people get them
notionally because they're kind of like algebraic parentheses. And then I'll
expand on groups as well to show more power, escapes, etc.

For example,

(a|c|d|e|f|g|...|z) uses only notional groups and basic alternation while
[abcdefghi...xyz] shows character classes, and [a-z] shows ranges - each step
builds on the previous step and shows how to make them easier. For the learner
this seems to act as building blocks rather than "separate things that are
kind of alike I need to learn"

This is similar to how you can talk about repetition as

`aaaaa`, then `aaaaaaaaaaaaa`, then `a*`, then `aa*`, then `a+`, then `a?`,
`a{0,1}`, `a{0,5}`, `a{1,5}`, `a{,10}`, etc. which simplifies, then generalizes
the idea of repetition from a very natural concept built on concatenation to an
opaque looking syntax that turns out to be both general and powerful.
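
A small Python sketch of the "each step is shorthand for the previous one" idea. The helper `same` is made up, and it only compares the patterns on a handful of sample strings, which is a spot check rather than a proof.

```python
import re

def same(p1, p2, samples):
    """True if both patterns accept exactly the same sample strings."""
    return all(bool(re.fullmatch(p1, s)) == bool(re.fullmatch(p2, s)) for s in samples)

samples = ["", "a", "aa", "aaaaa", "b", "ab"]
print(same(r"aa*", r"a+", samples))        # True
print(same(r"a?", r"a{0,1}", samples))     # True
print(same(r"(a|b|c)", r"[a-c]", ["a", "b", "c", "d", "ab", ""]))  # True
```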

After that, most of the time I need to explain how capturing works, and how to
turn it off and so on. Good tools help here and it starts to move away from a
whiteboard exercise into something more active. But if students have followed
you to this point it starts to make them feel very powerful as they're
suddenly parsing things apart and transforming them.

At the end I usually follow up with a bit on anchors (^ and $) and other odds
and ends (case insensitivity, global search, greedy and non-greedy, etc.) and
usually turn people loose after that. I've rarely found people who actually
need lookarounds and other advanced topics and those are usually covered one-
on-one as they need.

But this is fairly minor quibbling and is just rearrangement of what's here. I
think this is overall a nice clear explanation. Regex syntax is honestly
pretty simple once the syntax magic is explained.

What I think would be really helpful is a tool where somebody can type in a
regex, have it checked for syntax and then generate the list of strings that
would match it (within the constraints of limits on infinite repetition
operators, like turning * to {0,2} or something).
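
A brute-force Python sketch of such a tool. It does not rewrite `*` into `{0,2}`; instead it assumes a small alphabet and a length cap and simply tries every candidate string. The function name `matching_strings` is made up.

```python
import re
from itertools import product

def matching_strings(pattern, alphabet, max_len):
    """List every string over `alphabet`, up to `max_len` long, that the regex fully matches."""
    compiled = re.compile(pattern)   # raises re.error if the syntax is invalid
    found = []
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                found.append(candidate)
    return found

print(matching_strings(r"ab*", "ab", 3))   # ['a', 'ab', 'abb']
```

The number of candidates grows exponentially with the length cap, so this is only workable for teaching-sized alphabets and lengths.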

------
digitalmaster
Learned more in just the first few pages than I have in my many years of copy-
pasting regex from StackOverflow. Cheers

------
arthurofbabylon
In case it's useful for others, I found "Mastering Regular Expressions" by
"Jeffrey E. F. Friedl" to be very effective for becoming proficient with
Regex.

Also – reading it was just a useful look into systems mapping (which is what
language is!) with insights that apply in many contexts.

------
cjhveal
Great work! These examples are super clear.

One thought: it would be great to highlight a given match on hover. I know
that each match has its own undertie and it's explicitly mentioned early on
but it might help really drive it home if each match reacted individually when
hovered.

------
olq
Nice guide! As a complete regex-dyslectic i didn't know the slash '/' could be
used to make expressions more readable.

On another note, since this is supposed to be a book and all, is there a
simple way to get this on a single page and make it easier to print?

~~~
catblast
That's not really true. The guide seems to be JavaScript-centric. The slashes
and flags are not part of the regex; the slash is a delimiter. In certain
contexts like sed, perl, php, an arbitrary delimiter can be used to avoid
needing to escape slashes. If you pass a string containing a / to a regex
engine, it does what you would expect: match a literal /. For instance, python
and grep do not interpret slashes and flags. Those are pretty common.
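
In Python, for instance (sample strings made up):

```python
import re

# No delimiters: the pattern is just a string, so / is an ordinary character.
print(re.search(r"/usr/bin", "#!/usr/bin/env python").group())   # /usr/bin

# Flags are a separate argument rather than a suffix like /i.
print(re.search(r"regex", "RegEx for Regular Folk", re.IGNORECASE).group())  # RegEx
```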

~~~
olq
Thanks for the heads up, it seems i have some more reading to do :)

------
jamesrcole
Feedback: the “next” link on this page just takes you back to that same page
[https://refrf.shreyasminocha.me/chapters/introduction](https://refrf.shreyasminocha.me/chapters/introduction)

~~~
shreyasminocha
Yeah, I messed that up late last night. Fixed!

------
tuan
Related: I found this online tool to be very useful for debugging regex:
[https://www.debuggex.com/](https://www.debuggex.com/)

------
cryptoslug
Thank you so much for making this! at a certain point, on our team at least,
we have to compile regex resources into a guide and this is incredibly
helpful.

------
busterarm
We need this approach for more advanced RegEx and the regular language subject
in general for actual working programmers, from some of what I've seen.

------
Zhyl
Lots of praise here, so I won't re-iterate the good points (presentation,
pleasant tone, good structuring) and will head straight to the meat of my
issue with the title:

This is not a book for regular folk.

A regular HN reader, sure. A technically inclined interested party who wants
to break the ice with Regexes, sure. But not regular folk.

Here is what I'm talking about:

> Introduction

> Regular expressions (“regexes”) allow defining a pattern

Ok, with you so far. As a layman, though, I would very much be looking for
you to expand on what you mean by 'pattern'.

> and executing it against strings.

"Executing" gets a wrinkled brow. "Strings" gets a squinty eye. "executing
against strings" and you've lost me. There's now too much new information in
this sentence for me to be on board with it. If I knew what all those terms
meant and the context with which they are meaningful, I probably wouldn't be
trying to read 'RegEx for Regular Folk'.

> Substrings which match the pattern are termed “matches”.

As above, but it's also slightly confusing here that we're defining matches
and we haven't even talked about what a pattern is yet. As such, I can't even
visualise or conceptualise what I would be matching or similar. If I press on
regardless, this is just some unresolved debt that I will have to reconcile
later or I will just get frustrated and put the book down.

> A regular expression is a sequence of characters that define a search
> pattern.

Ah, good, we're defining a pattern _after_ we've already described a 'match'.

> Regex finds utility in:

>input validation

And straight out the bat we're hit with a term that is only going to be
relevant for techie people. Unless you are aiming this at techie people. But
aren't we aiming this at 'regular folk'?

The above is really just my long drawn out beef with 'x for the masses', 'y
for mere mortals' and the like. For me the best explanation of regular
expressions comes from Al Sweigart in 'Automate the Boring Stuff with Python'
[1]. He not only gives a pretty thorough explanation of pattern matching
before bringing in any domain-specific terms, but he also motivates why you
would want to pattern match in the first place. He gives context for
circumstances under which you might reach for regex as a tool.

I'm looking through the later pages of this book and as a techperson I'm
thinking 'this is beautiful. I can see the examples clearly, there is a clear
correlation between the visuals and the exercise.' I'm also thinking as a folk
person 'when the hell will I need a match? Under what circumstances am I
going to need to know that there is one 'p' in 'grape' but two 'p's in
'apple'? What use is writing a pattern to match against certain fruits and
utility items?'

So yes, basically, after all that I can summarise "good book, bad title".

[1] [https://automatetheboringstuff.com](https://automatetheboringstuff.com)

~~~
shreyasminocha
Valid points. I agree.

I'll try easing the curve, especially early on and make clearer the intended
audience.

------
romes
Hey! I'm wondering if in the first example of regex negation the (^) should
appear after the bracket ([).

~~~
shreyasminocha
Yes, it must appear immediately after `[`
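
A short Python sketch of the difference (the sample text is made up):

```python
import re

text = "abc^"
print(re.findall(r"[^ab]", text))   # ['c', '^']  caret first: negates the class
print(re.findall(r"[a^b]", text))   # ['a', 'b', '^']  caret elsewhere: a literal ^
```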

------
bogomipz
This is very nice looking. Especially so given that regexes are often not easy
to look at. Kudos.

------
iluvblender
This is looking great. Thank you.

~~~
iluvblender
Also, I rely on [https://regexr.com/](https://regexr.com/) for interactive RE
debugging.

~~~
shreyasminocha
Me too! There are links to RegExr next to each example. Glad they have query
string support.

------
johnnythunder
This is literally the best RegEx tutorial that I've ever gone through.

------
SPBesui
Maybe I missed it, but there doesn't seem to be any credit given for the xkcd
comic ([https://xkcd.com/208/](https://xkcd.com/208/)) shown on the Next Steps
page ([https://refrf.shreyasminocha.me/chapters/next-
steps](https://refrf.shreyasminocha.me/chapters/next-steps)). Does Randall
even require it?

~~~
shreyasminocha
He does require it per CC-BY-NC-2.5. I felt it was sufficient to permalink to
the version on his domain, but I shall make it more explicit.

