
I was hoping from the title that it aims for Postgres SQL compatibility, but I can't find that explicitly mentioned in the docs. It makes me think I really want something like sqlite://memory that completely disregards speed and even persistence. Instead you could say, for example, "open an in-memory database that behaves like Postgres 9" and run your tests against it. With typical fixtures of 10 or so rows, you wouldn't even need a query planner - just a naive filter+copy over whole tables for most operations.

Currently I'm using https://pypi.org/project/testing.postgresql/ which spawns a temporary server for each test suite, which has quite a lot of overhead.



> SQL Parser: We use the PostgreSQL parser that was repackaged as a stand-alone library. The translation to our own parse tree is inspired by Peloton.

Right at the bottom of this page:

https://duckdb.org/docs/why_duckdb.html


That's awesome news, thanks for that!


No problem, I was curious as well and hunted it down. I'm a huge fan of database systems like SQLite and H2 - there's something wonderful about being able to copy a simple file and share it with other developers - and this project seems rather interesting as well.


That’s what made Excel so popular.


One of the authors of DuckDB here: we use the PostgreSQL parser, and try to be generally compatible with the syntax used by SQLite, Postgres and MySQL. In some cases those are unfortunately mutually exclusive (e.g. null ordering - we have a PRAGMA to change this system-wide). Making the SQL dialect “as compatible as possible” with these systems is very much one of our goals, and if you find any incompatibilities feel free to open an issue :)


I'd note that when moving from pg to $other the thing that really trips me up isn't the syntax changes, it's the lack of ARRAY and ROWTYPE.

I'm not sure whether those are in scope for you but it'd be nice if the docs said "to be implemented" or "out of scope" somewhere ... and my apologies in advance if they do and I somehow missed it.


We already have support for LIST and STRUCT (which I think are equivalent to ARRAY and ROW, respectively). There is still some functionality missing there (notably storage, and several functions) - but the base functionality is there and the remainder definitely will be added.


I could only find a pull request on STRUCT, and nothing at all on LIST, because I just got a billion 'SELECT list' articles. There's also nothing in the docs.

STRUCT looks 10x uglier but is perfectly easy to generate, mind - it's just that the whole thing is non-trivial to find :/


I would just create a per-test-file database (and delete it the next time the test runs). The overhead is very small (compared to booting Postgres) and it works exactly like production because it is exactly production.
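
A minimal sketch of the per-test-file approach. The naming scheme here is made up for illustration; in practice you'd execute the DDL through your driver of choice (e.g. psycopg2) against a maintenance database:

```python
import re

def db_name_for(test_file: str) -> str:
    """Derive a valid, stable Postgres database name from a test file path."""
    name = re.sub(r"[^a-z0-9_]", "_", test_file.lower())
    return ("test_" + name)[:63]  # Postgres identifiers max out at 63 bytes

def recreate_statements(test_file: str) -> list[str]:
    """DDL to drop the previous run's database and create a fresh one."""
    db = db_name_for(test_file)
    return [
        f'DROP DATABASE IF EXISTS "{db}"',
        f'CREATE DATABASE "{db}"',
    ]

# Usage (assumption: run via a driver such as psycopg2 with autocommit on,
# since CREATE DATABASE cannot execute inside a transaction block):
for stmt in recreate_statements("tests/test_orders.py"):
    print(stmt)
```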

In general, I am not comfortable using an "almost equivalent" database for tests. Every database engine has its quirks, and if you don't know what they are, your program that passes its tests only "almost" works in production, which is annoying.


And if you need to run expensive migrations, use a template database.

https://www.postgresql.org/docs/current/manage-ag-templatedb...
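
The template trick boils down to a single statement per test database; a sketch (database names below are hypothetical):

```python
# Run migrations once against a template database, then stamp out cheap
# copies. CREATE DATABASE ... TEMPLATE does a file-level copy, so no other
# connections may be open on the template while copying.
TEMPLATE = "app_template"

def clone_statement(test_db: str) -> str:
    return f'CREATE DATABASE "{test_db}" TEMPLATE "{TEMPLATE}"'

print(clone_statement("app_test_1"))
# CREATE DATABASE "app_test_1" TEMPLATE "app_template"
```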


While PostgreSQL templates let you scaffold new databases (one per test spec) quite fast, it's even faster to have a warmed-up pool of databases ready: https://github.com/allaboutapps/integresql#background

Disclaimer: author.


Oh that looks great, and very close to what I had in mind for the next upgrade of our in-house testing framework. Definitely going to give that a try soon. Thanks for posting it!


I would also love this, in the past I’ve used the H2 in-memory database for tests with Spring boot applications.

The Hibernate ORM handles configuring the H2 schema but they’re not fully compatible, so it means I have to be careful to not rely on Postgres-specific features. I generally am not testing database logic, just need a working persistence layer to get the tests running, but an in-memory Postgres implementation would be amazing.


Although it's something I'd love to write, supporting it over a long time as pure FOSS would probably be a sad experience. I wonder what model could be used here. Maybe something like patreon or "sponsor a feature" with an open-source project? FOSS core + proprietary Java / .NET bindings because corps can pay?


Yeah, for sure this is the type of niche, high-effort work that would be used extensively by the closed-source world and yet struggle to find adequate support.


IIRC H2 supports a Postgres compatibility mode, though I'm not sure how feature-complete it is.

http://h2database.com/html/features.html


Man, I’ve wanted the “compatible SQL engine with only RAM storage” for testing for YEARS. Closest I got was some shenanigans with MSSQL’s LocalDB.


I use docker to spin up new postgresql DBs on the fly


Same here, and I run the same set of migrations that run in production. To be clear, this is only done once per test session, not for individual tests, and the tests are written in such a way that they don't interfere with each other.

The overhead is actually pretty small, less than 10s. I'd say too much for unit tests, but well within the tolerable range for integration/functional tests. Compared with the time I'd spend hacking together some brittle and unrealistic in-memory alternative, I much prefer to use a real database.


I do that too but it’s a noticeable overhead, I guess it wouldn’t be great for mass-testing scenarios.


RAM FS + fsync off?


Those solutions still have a high overhead. There's ACID compliance, serialisation in memory, maintaining indices, and many other layers. Compare that to an ideal testing solution with no initialisation cost, where an insert is literally: parse the query, add a new entry to a list, done.


But you want to be testing against something that is as close as possible to the deployment environment. So if that means ACID, indices, etc., then that's what it is.


You can still do those in a trivial way that behaves like production. For example: if some column has a unique index, look at all rows and compare the new value. You don't need an actual index for that (and definitely not a fancy concurrent-access B-tree). For transactions/MVCC you can literally make a copy of everything.
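
That naive approach fits in a few lines of plain Python. This is a toy, not a real engine: unique checks are linear scans, and a "transaction" is a copy of every table:

```python
import copy

class NaiveDB:
    """Toy in-memory 'database': tables are lists of dicts."""

    def __init__(self):
        self.tables = {}   # table name -> list of row dicts
        self.unique = {}   # table name -> set of columns with unique 'indexes'
        self._snapshot = None

    def create_table(self, name, unique_cols=()):
        self.tables[name] = []
        self.unique[name] = set(unique_cols)

    def insert(self, name, row):
        # "Unique index": just scan every existing row. With ~10 fixture
        # rows per table this is plenty fast for tests.
        for col in self.unique[name]:
            if any(r[col] == row[col] for r in self.tables[name]):
                raise ValueError(f"duplicate key violates unique constraint on {col}")
        self.tables[name].append(dict(row))

    # "MVCC": a transaction is literally a copy of everything.
    def begin(self):
        self._snapshot = copy.deepcopy(self.tables)

    def rollback(self):
        self.tables, self._snapshot = self._snapshot, None

    def commit(self):
        self._snapshot = None

db = NaiveDB()
db.create_table("users", unique_cols=["email"])
db.insert("users", {"id": 1, "email": "a@example.com"})
db.begin()
db.insert("users", {"id": 2, "email": "b@example.com"})
db.rollback()
print(len(db.tables["users"]))  # 1 - the rolled-back insert is gone
```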


Have you tried postgresql with libeatmydata?


I'd build this, but I'm not sure if I'd be able to get anyone to fund it.


You shouldn't rely on any "just like X" system for tests; you should use X exactly, with the same config and version as production.

What you want is a Postgres server, but optimized for tests. What I've done:

1. One instance per pipeline.
2. Parallelize by having separate databases per process.
3. Build the schema once.
4. Before each test, purge the database (or run tests in transactions so you can just roll back at the end).
5. Run in Docker, using tmpfs volumes for writes (no disk writes).

It runs fairly quickly.
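
The Docker-with-tmpfs part of a setup like this can be sketched as a command builder. The image tag, port, and settings below are common conventions for the official postgres image, not a prescription; adjust for your environment:

```python
def pg_test_command(name: str = "pg-test", port: int = 54321) -> list[str]:
    """Build 'docker run' arguments for a throwaway, test-optimized Postgres.

    --tmpfs mounts the data directory in RAM so nothing touches disk, and
    fsync/synchronous_commit are safe to disable because the data is
    disposable anyway.
    """
    return [
        "docker", "run", "--rm", "-d",
        "--name", name,
        "-p", f"{port}:5432",
        "-e", "POSTGRES_PASSWORD=test",
        "--tmpfs", "/var/lib/postgresql/data",
        "postgres:16",
        "-c", "fsync=off",
        "-c", "synchronous_commit=off",
        "-c", "full_page_writes=off",
    ]

# Usage: subprocess.run(pg_test_command(), check=True)
print(" ".join(pg_test_command()))
```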



