

Parsing HTML with regexes - Artemis2
https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

======
armabiz
I think the worst thing is to parse HTML with regexes.

I did some research on this in the past. The catch is that a large share of
websites serve broken HTML, which leads to unexpected results when you parse it
with regexes.
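
To make that concrete, here is a minimal sketch in Python using only the standard library. The markup in `BROKEN_HTML`, the pattern `TAG_RE`, and the helper names are all made up for illustration; they are not taken from any real site or from the linked question. It compares a naive opening-tag regex against the tolerant `html.parser.HTMLParser` on input with an unquoted attribute, a `>` inside a quoted attribute value, and unclosed `<p>` tags.

```python
import re
from html.parser import HTMLParser

# Hypothetical broken markup: unquoted attribute value, a ">" inside a
# quoted attribute value, and <p> elements that are never closed.
BROKEN_HTML = '<p class=intro>Hello <a href="x.html" title="a > b">link</a><p>Second'

# A naive "opening tag" pattern: it assumes a tag ends at the first ">".
TAG_RE = re.compile(r'<([a-zA-Z][a-zA-Z0-9]*)[^>]*>')


def regex_tags(html):
    """Return the raw text of every opening tag the naive regex matches."""
    return [m.group(0) for m in TAG_RE.finditer(html)]


class TagCollector(HTMLParser):
    """Collect (tag name, attribute dict) for each start tag encountered."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def parser_tags(html):
    """Return the start tags found by the standard-library HTML parser."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print(regex_tags(BROKEN_HTML))
    print(parser_tags(BROKEN_HTML))
```

On this input the regex cuts the `<a ...>` tag off at the `>` inside the `title` value, while the parser still reports the full attribute. A real browser goes much further than `html.parser` in repairing markup, so treat this only as an illustration of the failure mode.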

The entire internet is a bit broken, and it's interesting that ALL browsers do
extra work, outside of the RFCs, to "fix it" and deliver content to the user
without issues.

