There is good reason to distinguish the two, since they have completely different characteristics. PCREs may use an unbounded amount of extra memory and may take exponential time. "Formal" regular expressions take linear time and memory, and can match only the regular languages. The two operate completely differently (a finite automaton vs. a stack-based pattern matcher).
Regular expressions can NOT match HTML, PCREs CAN. Conflating the two is not helpful.
It's more exciting than this, actually! An implementation of formal regexes doesn't have to simulate an automaton explicitly. In fact, it can use backtracking and still keep its linear running time. Such an implementation uses an explicit stack and keeps track of each state that has been visited at each position in the search text. If a state has already been visited at a given position, it doesn't redo that work.
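A minimal sketch of that idea in Python (my own toy code, not any real engine's): the NFA below is hand-built for the pattern `(a|aa)*`, a classic trap for naive backtracking, and the `visited` set of (state, position) pairs is exactly the bound described above, capping the work at len(text) * number-of-states.

```python
# Hand-built NFA for (a|aa)*. Edges are (label, target); label None
# is an epsilon (free) move. State 3 is the accept state.
NFA = {
    0: [('a', 1), ('a', 2), (None, 3)],  # loop head: take 'a', 'aa', or stop
    1: [(None, 0)],                      # finished an alternative, loop again
    2: [('a', 1)],                       # second 'a' of the 'aa' branch
    3: [],                               # accept
}
ACCEPT = 3

def matches(text):
    stack = [(0, 0)]        # explicit backtracking stack of (state, pos)
    visited = set()         # (state, pos) pairs already explored
    while stack:
        state, pos = stack.pop()
        if (state, pos) in visited:
            continue        # the bound: never revisit a (state, pos) pair
        visited.add((state, pos))
        if state == ACCEPT and pos == len(text):
            return True
        for label, nxt in NFA[state]:
            if label is None:
                stack.append((nxt, pos))
            elif pos < len(text) and text[pos] == label:
                stack.append((nxt, pos + 1))
    return False
```

Without the `visited` check, a string of n 'a's would be explored in exponentially many ways (every split of the 'a's between the two branches); with it, each (state, position) pair is tried once.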
A backtracking implementation can be faster than a full NFA simulation if: 1) you need to track submatch boundaries, 2) you have a small regex and 3) you have small input. (2) and (3) are necessary because of the memory required to keep track of the states you've visited (proportional to len(regex) * len(search text)). (1) is necessary because you could otherwise use a DFA, which, AFAIK, does not typically track submatch boundaries. Tracking submatch boundaries in a backtracking implementation is faster than simulating an NFA because you only need to keep one copy of the capture locations around at any point in time; the NFA simulation has to copy them around a lot.
(Of the regex implementations that guarantee linear running time, C++'s RE2, Go's `regexp` and now Rust's `regex` all implement a bounded backtracking algorithm... among others!)
Also, PCRE does stand for "Perl Compatible Regular Expression".
For example, the a^n b^n language can be recognised up to n <= 3 by the RE `^(|ab|aabb|aaabbb)$`.
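A quick demonstration of that unrolling with Python's `re` module (used here just as a convenient stand-in for whatever engine you prefer):

```python
import re

# The unrolled pattern: it accepts a^n b^n only for n = 0, 1, 2, 3,
# because each case is spelled out explicitly as an alternative.
bounded = re.compile(r'^(|ab|aabb|aaabbb)$')

print(bool(bounded.match('aabb')))      # n = 2: accepted
print(bool(bounded.match('aaaabbbb')))  # n = 4: rejected, not in the unrolling
```

Any finite bound on n keeps the language regular; it's only unbounded a^n b^n that no regular expression can match.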
WebKit and Blink both use 512.
Gecko uses 200.
Trident's limit is beyond what I've tested quickly (over 4096).
Presto uses 500.
Alternative title: you can use regex derivatives (PCRE, Ruby ::Regexp, Boost.Regex, ...) to match non-regular syntax, but you shouldn't.
The advice still holds. In the case of HTML, you should reach for an XML parser.
`(\w+) \d+ \1`
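For anyone wondering what that pattern shows: the `\1` backreference requires the captured word to appear again verbatim, a constraint no true regular expression can express. A small Python illustration (the strings are just made-up examples):

```python
import re

# (\w+) captures a word, \d+ matches digits, and \1 demands the *same*
# word again -- a non-regular constraint, courtesy of backreferences.
pat = re.compile(r'(\w+) \d+ \1')

print(bool(pat.fullmatch('foo 42 foo')))  # True: the word repeats
print(bool(pat.fullmatch('foo 42 bar')))  # False: \1 must equal 'foo'
```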
This is particularly confusing because you mean "derivative" in the sense of "etymologically/conceptually related", but there is actually a formal notion of the derivative of a regular expression (the Brzozowski derivative).