That specific SQLite test suite (one of four [1]) has loads of generated SQL que...

chubot · on Oct 4, 2022

Ah OK that makes sense, because I can't see any way you'd fit builtin functions like string matching/transformation in ~3600 minus ~1000 lines, and I think a bunch of those are part of the SQL standard.

The numbers could be misleading, because the 5% of failing tests could cover a large part, or even most, of the core language (window functions?), while the 95% that pass are combinatorial tests that happen to take the exact same code path in an unoptimized engine.

i.e. with 4.2 million tests for ~2600 lines, it seems like you're going to have a lot of duplicate coverage, even accounting for state space explosion.
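
To make the point concrete, here's a toy sketch of how a per-test pass rate can look very different from a per-feature pass rate. The feature names and counts are made up for illustration and are not taken from the SQLite suite; the assumption is just that each test can be tagged with the one feature it exercises.

```python
# Hypothetical toy data: (feature exercised, did the test pass).
# Counts are invented to show the effect, not measured from any real suite.
tests = (
    [("select_where", True)] * 9000
    + [("join", True)] * 500
    + [("window_function", False)] * 300
    + [("string_builtin", False)] * 200
)


def pass_rate_by_test(tests):
    """Fraction of individual tests that pass."""
    passed = sum(1 for _, ok in tests if ok)
    return passed / len(tests)


def pass_rate_by_feature(tests):
    """Fraction of features for which every tagged test passes."""
    by_feature = {}
    for feature, ok in tests:
        by_feature[feature] = by_feature.get(feature, True) and ok
    return sum(by_feature.values()) / len(by_feature)


print(f"by test:    {pass_rate_by_test(tests):.0%}")
print(f"by feature: {pass_rate_by_feature(tests):.0%}")
```

With these invented numbers, 95% of tests pass but only half of the features do, because most of the passing tests pile up on the same couple of features.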