That specific SQLite test suite (one of four [1]) has loads of generated SQL queries, plus a long tail of more varied hand-written tests. The 95% that passes will be mostly generated queries that follow the same basic pattern of joins and projections with basic arithmetic and comparisons. See for example [2] and [3].
The generated tests are not designed to test a wide breadth of features of the SQL language, and passing them with a simple engine is very doable. A lot of the value of these tests is that the sheer volume of queries tends to find obscure problems in optimizers that would not easily surface otherwise. That is of course not a problem in a simple engine that does not have an optimizer.
Ah OK that makes sense, because I can't see any way you'd fit builtin functions like string matching/transformation in ~3600 minus ~1000 lines, and I think a bunch of those are part of the SQL standard.
The numbers could be misleading because the 5% of failing tests could be a large part or most of the core language (window functions?), while the 95% are combinatorial tests that happen to take the exact same code path in an unoptimized engine.
I.e. with 4.2 million tests for 2600 lines, it seems like you're going to have a lot of duplicate coverage, even accounting for state space explosion.
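To make the duplicate-coverage point concrete, here is a rough Python sketch that collapses queries into structural "shapes" by masking literals and identifiers, then counts how many distinct shapes remain. The sample queries and the `query_shape` helper are made up for illustration (they are not taken from the actual test files), and syntactic shape is only a crude proxy for which code path an engine takes.

```python
import re
from collections import Counter

def query_shape(sql: str) -> str:
    """Mask literals and lowercase identifiers so only the query structure remains."""
    shape = re.sub(r"'[^']*'", "?", sql)                   # string literals
    shape = re.sub(r"\b\d+(?:\.\d+)?\b", "?", shape)       # numeric literals
    shape = re.sub(r"\b[a-z_][a-z0-9_]*\b", "id", shape)   # assumes keywords are UPPERCASE
    return re.sub(r"\s+", " ", shape).strip()

# Hypothetical queries in the style of generated tests (not real test data).
queries = [
    "SELECT a+b*2 FROM t1 WHERE c>10",
    "SELECT d+e*7 FROM t1 WHERE a>3",
    "SELECT a-b FROM t1 WHERE c<d ORDER BY 1",
    "SELECT c-e FROM t1 WHERE a<b ORDER BY 2",
    "SELECT b+c*5 FROM t1 WHERE e>42",
]

shapes = Counter(query_shape(q) for q in queries)
for shape, count in shapes.most_common():
    print(count, shape)
```

Here five queries reduce to two shapes. If the real generated tests behave similarly, a pass rate measured per query says little about how much of the language is covered.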
[1] https://www.sqlite.org/testing.html#test_harnesses
[2] https://github.com/gregrahn/sqllogictest/blob/master/test/ra...
[3] https://raw.githubusercontent.com/gregrahn/sqllogictest/mast...