The tool you linked uses a GA (a genetic algorithm, not a NN) to search for a short regex that gives correct answers on whatever test cases you provide. The guarantee only goes as far as those test cases, though: the output is verified against them, so it's correct on every input you tested, but it can still behave arbitrarily on inputs you didn't cover.
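To make the difference concrete, here's a minimal sketch of the GA idea (my own toy version, not the linked tool's actual implementation): candidate regexes are scored by how many hand-picked positive/negative examples they classify correctly, and the fittest ones are kept and mutated. The test cases and token set below are made up for illustration.

```python
import random
import re

# Illustrative test cases (not from the linked tool): strings the
# regex should fully match vs. strings it should reject.
POSITIVE = ["cat", "car", "cap"]
NEGATIVE = ["dog", "cut", "bat"]

# The "genes": single characters plus a couple of regex constructs.
TOKENS = list("abcdefghijklmnopqrstuvwxyz") + [".", "[a-z]"]

def random_genome(length=3):
    return [random.choice(TOKENS) for _ in range(length)]

def fitness(genome):
    """Test cases answered correctly, minus a small length penalty."""
    pattern = "".join(genome)
    try:
        rx = re.compile(pattern)
    except re.error:
        return -1  # invalid regex: worst possible score
    score = sum(1 for s in POSITIVE if rx.fullmatch(s))
    score += sum(1 for s in NEGATIVE if not rx.fullmatch(s))
    return score - 0.01 * len(pattern)  # prefer shorter regexes

def mutate(genome):
    child = genome[:]
    child[random.randrange(len(child))] = random.choice(TOKENS)
    return child

def evolve(pop_size=60, generations=300):
    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]  # keep the best half
        population = elite + [mutate(random.choice(elite))
                              for _ in range(pop_size - len(elite))]
    return max(population, key=fitness)

best = evolve()
print("".join(best), fitness(best))
```

The key point is that `fitness` *checks* every candidate against the test cases, so whatever survives is verified on those inputs by construction; a NN that emits a regex in one shot gives you no such check.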
GPT-3 is known to be confidently wrong at things like basic arithmetic, so I really wouldn't trust it in scenarios like this. Maybe one day I'll be able to trust NN-generated code without testing it first, but we're not there yet.