By default, RE2 matches directly against the UTF-8 encoding; see this neat state diagram:
https://swtch.com/~rsc/regexp/regexp3.html#step3
As for backtracking engines, Perl also stores strings internally as UTF-8 (on most platforms), and its regexp engine runs directly on them. It can also match "eXtended grapheme clusters" with \X.
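
To make "matching directly against the UTF-8 encoding" a bit more concrete, here is a rough sketch in Python (not RE2's actual code). It only illustrates the underlying idea that a range of code points can be expressed as a pattern over bytes: U+0080..U+07FF is exactly the set of two-byte sequences with a lead byte in C2..DF and a continuation byte in 80..BF. The names TWO_BYTE and matches_two_byte are made up for this example.

```python
import re

# Byte-level pattern for the two-byte UTF-8 range U+0080..U+07FF:
# a lead byte C2..DF followed by one continuation byte 80..BF.
# (Hypothetical name; this is an illustration, not RE2's implementation.)
TWO_BYTE = re.compile(rb"[\xC2-\xDF][\x80-\xBF]")


def matches_two_byte(cp: int) -> bool:
    """Return True if the UTF-8 encoding of code point cp fully matches TWO_BYTE."""
    return TWO_BYTE.fullmatch(chr(cp).encode("utf-8")) is not None


# Every code point in U+0080..U+07FF matches; its neighbours outside do not.
assert all(matches_two_byte(cp) for cp in range(0x80, 0x800))
assert not matches_two_byte(0x7F)   # encodes as a single byte
assert not matches_two_byte(0x800)  # encodes as three bytes
```

An engine that works this way never has to decode the input into code points first; it just steps through the bytes, which is what the state diagram linked above depicts.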