
Ask HN: SQL engine reclaiming space from DELETEs - twa927
Both PostgreSQL and MySQL are poor in that regard, especially PostgreSQL: after many DELETEs&#x2F;UPDATEs, the disk space is not released to the OS, and the re-use of the space by the DB engine is often not working well due to the fragmentation. The only option is rebuilding&#x2F;DROPping the whole table which requires downtime.<p>I&#x27;m essentially looking for an SQL engine that would be suitable as a storage for a queue server. I would like it to release the freed disk space to the OS immediately, even if this requires making extra I&#x2F;O.<p>I see that MariaDB comes with multiple engines, maybe some of them could work like that?<p>I know that Amazon&#x27;s Aurora is even worse than raw MySQL&#x2F;PostgreSQL.
======
toomuchtodo
Juggle tables to meet your use case. Switch queueing to new table, wrap up
work in old table, drop old table.

Or switch to RabbitMQ if you can (you mentioned queueing, which Rabbit is
designed for), which can journal to disk on a per queue basis if necessary.

------
void141star
[https://www.postgresql.org/docs/11/sql-
vacuum.html](https://www.postgresql.org/docs/11/sql-vacuum.html)

~~~
twa927
Regular VACUUM only marks space as available for reuse. VACUUM FULL returns
space to the OS but it requires an exclusive lock and a multi-hour run for a
large table.

~~~
salex89
Shouldn't plain vacuum be enough? The space is going to be reused for new
items anyway, it won't take more of the space not already reserved by the
database?

On another topic, do you require using a SQL database for your queue?

~~~
twa927
> Shouldn't plain vacuum be enough? The space is going to be reused for new
> items anyway, it won't take more of the space not already reserved by the
> database?

In theory, yes. In practice, I've seen a lot of bloat being preserved despite
using aggressive autovacuum settings. I'm guessing this is due to
fragmentation?

> On another topic, do you require using a SQL database for your queue?

I'd need some way to browse the queue and select items for processing using
arbitrary queries.

------
docuru
When delete or update records, the database engine basically marks the space
as available, then the next record will be filled in.

Because the file system only allows append or update content on a file. To
release the disk space, it will need to completely rewrite the whole data
file, which will cause much more things to handle and time.

So it is not a way to design a database engine

~~~
twa927
Yep, you're describing a typical design of a DB. I'm looking for an
alternative design. I'm guessing it could use multiple small files (a few MBs)
and merge them to avoid fragmentation.

~~~
zzzcpan
Maybe check out TokuDB, I haven't tried it myself, but it has things like
tokudb_cleaner_iterations [1] and tokudb_cleaner_period.

There is also MyRocks based on RocksDB fork of LevelDB, but LevelDB is piece
of garbage, not sure about how much RocksDB fork fixes it though.

[1] [https://www.percona.com/doc/percona-
server/5.6/tokudb/tokudb...](https://www.percona.com/doc/percona-
server/5.6/tokudb/tokudb_variables.html#tokudb_cleaner_iterations)

~~~
zepearl
TokuDB won't release the allocated blocks (it will reorganize its internal
blocks to achieve max compression but as far as I know and saw won't release
what it has allocated previously, but I might be wrong :) ).

~~~
twa927
And it looks like TokuDB is no longer developed?

~~~
zepearl
Noooooo, please tell me that it's not true - it's the foundation of many of my
experiments delivering huge performance and flexibility compared to anything
else...!

Where did you read/hear it?

~~~
zepearl
TokuDB still active - last bugfix was on 29.Nov.2018:

[https://www.percona.com/doc/percona-server/5.6/release-
notes...](https://www.percona.com/doc/percona-server/5.6/release-
notes/Percona-Server-5.6.42-84.2.html)

List of which TokuDB version is integrated in which MariaDB version:

[https://mariadb.com/kb/en/meta/tokudb-
versions/](https://mariadb.com/kb/en/meta/tokudb-versions/)

------
zzo38computer
In SQLite you can use the VACUUM command to minimize the disk space needed.
(And from another comment, it look like PostgreSQL also has a VACUUM command,
but I do not use PostgreSQL and do not know much about that)

------
natmaka
[https://www.cybertec-postgresql.com/en/introducing-
pg_squeez...](https://www.cybertec-postgresql.com/en/introducing-pg_squeeze-a-
postgresql-extension-to-auto-rebuild-bloated-tables/)

------
sansnomme
Vacuum?

