
Ask HN: How do I get started with regular expressions? - seanccox
I am not a programmer, I am a research investigator.

Thus, I have tinkered with code only on an as-needed basis, when I have had very specific problems that I knew a computer could do for me accurately and quickly – changing file names in a folder, performing repetitive editorial tasks, automatically filing downloaded sources with specific names. Most of these things were done in Applescript, and they are all good things for me, because the time they save me gets devoted to thinking that helps me build a case better. I can't hire an intern, because that's expensive, but my meager Applescripting I have done so far has made for a very reliable intern.

Recently, I wanted to write a script that would scan my documents, identify local currency figures, convert them to a global standard ($, €, or Bitcoin), and then insert them into my report copy. This seemed very reasonable to me, and I thought that it would be easy (since everything else I had done had been a simple matter of articulating the problem and finding the solution somewhere online). Time consuming, perhaps, but easy... like reading, copying, pasting, and then tinkering to customize it.

This was not easy.

Someone has suggested I try regular expressions to solve the problem. The thing is, that sentence doesn't mean anything to me, because I don't know where/how to begin, and I don't really know what regex is.

Where would you send someone like me to start learning regex?

Cheers,
-s
======
mindcrime
Regular Expressions, by themselves, don't solve your problem. Regular
expressions, implemented as part of a tool like sed or grep, or as part of a
programming language like Perl or Java, can.

Regular expressions are basically just a (powerful) way to match patterns in
text. If you've done "ls *.sh" in a directory to list only the .sh files,
you've used something very much like a regular expression. But that only
works on filenames; tools like grep and sed can look inside a file and match
patterns, and sed can then replace those matches with whatever you supply.
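
A minimal sketch of that difference, in Python purely for illustration. The file names and the sample text are made up, and "TRY" merely stands in for whatever local currency code shows up in your documents:

```python
import fnmatch
import re

filenames = ["backup.sh", "notes.txt", "deploy.sh"]

# Shell-style wildcard: "*" means "any run of characters".
shell_scripts = fnmatch.filter(filenames, "*.sh")

# The same idea as a regular expression: ".*" is "any run of characters",
# "\." is a literal dot, and "$" anchors the match at the end of the name.
pattern = re.compile(r".*\.sh$")
regex_scripts = [name for name in filenames if pattern.match(name)]

# Unlike a filename wildcard, a regex can also look inside text, the way
# grep does: keep only the lines containing digits followed by " TRY".
text = "line one\nrent was 1500 TRY\nline three"
matching_lines = [line for line in text.splitlines() if re.search(r"\d+ TRY", line)]

print(shell_scripts)
print(regex_scripts)
print(matching_lines)
```

Both filename lists come out the same; only the regex version carries over to searching inside text.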

If you just want to start off learning regexes, google "regex tutorial" and
start playing around searching through files with grep or something.

For your specific case, you should be able to do what you want to do with
sed[1], or with awk[2]. Or if you're more comfortable with a higher level
language, you could go with anything that has regex support, which should be
most mainstream languages.

[1]: <http://en.wikipedia.org/wiki/Sed>

[2]: <http://en.wikipedia.org/wiki/AWK>
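
If you go the higher-level-language route, the find-and-replace step might look roughly like the Python sketch below. Everything specific in it is an assumption for illustration: the exchange rates are made-up constants, `RATES_TO_USD` and `annotate_amounts` are hypothetical names, and it only recognises amounts written as a number followed by a three-letter code (e.g. "1,500 TRY"). Real documents will likely need a more forgiving pattern.

```python
import re

# Hypothetical exchange rates, hard-coded for illustration only.
RATES_TO_USD = {"TRY": 0.5, "GBP": 1.25}

# A number (digits, optional commas, optional decimals), a space, then a
# three-letter uppercase currency code.
AMOUNT = re.compile(r"\b(\d[\d,]*(?:\.\d+)?) ([A-Z]{3})\b")


def annotate_amounts(text):
    """Append a USD equivalent after each recognised local-currency amount."""

    def convert(match):
        number, code = match.group(1), match.group(2)
        if code not in RATES_TO_USD:
            return match.group(0)  # unknown code: leave the text unchanged
        usd = float(number.replace(",", "")) * RATES_TO_USD[code]
        return f"{match.group(0)} (${usd:,.2f})"

    # re.sub calls convert() once per match and splices in what it returns.
    return AMOUNT.sub(convert, text)


print(annotate_amounts("The contract was worth 1,500 TRY and 40 GBP in fees."))
# prints: The contract was worth 1,500 TRY ($750.00) and 40 GBP ($50.00) in fees.
```

The regex only finds the amounts; the conversion itself is ordinary arithmetic in the surrounding code, which is why the regex alone doesn't solve the problem.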

~~~
Sakes
The above is great advice. You will definitely want to play around with a
tutorial, any tutorial, so you can grasp the basics of what regular
expressions are, and how they are used. After you "get" what regular
expressions are, you can work on implementing them in your code.

After you understand the basics, you can follow these steps that I use every
time I need to program regular expressions.

1) Google a chart explaining the regular expression syntax for the language
you are using (something like "applescript regexp chart")

2) Create some test code to help refine your regular expression. Your code
should have

* a - REGEX STRING - the regular expression you are testing. You can store this value in a variable in your code, read it from a text file, or enter it manually every time you run your script. I like to give myself access to it outside the code: for a webpage, use a form with an input field for this value; for a scripting language, make it a parameter of your script, so you can play with it quickly without having to change any code.

* b - STRING DATA - this is the value (or values) you will be testing your regular expression against. You can store this in a variable, or read it from a text file. Try to think of all the cases that could potentially cause your script to break. For example, could the currency symbols you are trying to replace ever be used in ways where you would not want to replace them?

* c - RESULTS - you should output the results every time you run your script and keep tweaking it until you are satisfied with the output.

3) Implement your awesome REGEX in your code.
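
The test code from step 2 could be sketched like this in Python (my pick for illustration; the same shape works in any scripting language). The sample lines, the default pattern, and the name `run_pattern` are all made up for the example:

```python
import re
import sys

# b - STRING DATA: sample lines, including one that should NOT match.
SAMPLES = [
    "Rent was 1500 TRY per month",
    "TRY again later",
    "Paid 20 GBP on arrival",
]


def run_pattern(regex_string, samples):
    """Return (sample, list of matches) pairs for one regular expression."""
    pattern = re.compile(regex_string)
    return [(sample, pattern.findall(sample)) for sample in samples]


if __name__ == "__main__":
    # a - REGEX STRING: taken from the command line so it can be changed
    # without editing the script; falls back to a default for a first run.
    regex_string = sys.argv[1] if len(sys.argv) > 1 else r"\d+ [A-Z]{3}"
    # c - RESULTS: print what matched in every sample.
    for sample, matches in run_pattern(regex_string, SAMPLES):
        print(f"{sample!r} -> {matches}")
```

Saved under a name of your choosing, it can be run with the pattern as its only argument, quoted so the shell leaves it alone. Keep adding awkward samples to the list until the output stops surprising you.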

------
brudgers
_Learn Regex the Hard Way_

<http://regex.learncodethehardway.org/book/>

------
logn
<http://regular-expressions.info>

"Mastering Regular Expressions" by Friedl is the ultimate resource, but for
getting started the website above will let you learn the practical syntax.

------
taoufix
The tutorial that helped me is "sed" based:
<http://www.gentoo.org/doc/en/articles/l-sed1.xml>

Also, an online regex tester may be helpful: <http://regexpal.com/>

------
adrian_pop
Here's a first link. It's a visual matching tool for your regex and target
text, plus several regular expressions saved by the community:
<http://gskinner.com/RegExr/>

~~~
wcfields
This is amazing! (and amazingly helpful with the Community tab to show
examples!)

------
stefantalpalaru
Start with the Wikipedia page[1] and continue with the documentation of the
regex engine you'll be using (there are differences between implementations).

[1] <http://en.wikipedia.org/wiki/Regular_expression>

------
kgen
This is another plug, but people have told me that they've found RegexOne
(<http://regexone.com>) to be pretty useful in learning and practicing regular
expressions interactively.

------
saravanaram
I liked Eric Lippert's series of articles on regular expressions:
<http://blogs.msdn.com/b/ericlippert/archive/tags/regular+expressions/>

------
seanccox
Wow, thanks for all this feedback. Once I get a chance to wade through these
resources, I'll let y'all know what I found the most useful. Thanks again.

Cheers, -s

------
chiph
_Some people, when confronted with a problem, think "I know, I'll use regular
expressions." Now they have two problems.

\- Jamie Zawinski_

------
ulisesrmzroche
<http://rubular.com/> is another way to get a feel for them.

------
andyzweb
<http://regex.learncodethehardway.org/>

