
Does FDW let you do performant `FULL OUTER JOIN`s and/or `NATURAL FULL OUTER JOIN`s? If so, then I would think that would be a decent place to start for remote DB diffs for PG. It might not be enough, of course, if the tables are huge, in which case taking a page from rsync and using some sort of per-row checksum, as TFA does, is clearly a good idea.

I'm not completely sure I understand your comment, so pardon me if I misunderstand. I don't think a foreign data wrapper would be fundamentally more efficient with whatever table is foreign, especially for an OUTER JOIN. Unless you're basically implementing something similar to data-diff with an OUTER JOIN over FDW, which seems possible.

If you're doing in-database diffs, however, a join-based approach will likely outperform data-diff.

Ideally, databases would support a standard MERKLE TREE INDEX so we could get extremely fast comparisons.
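To make the idea concrete, here is a rough Python sketch of checksum bisection in the spirit of a Merkle tree. It is not how any database or data-diff implements it: the tables are plain dicts keyed by an integer primary key, the function names and `leaf_size` are made up for illustration, and segment hashes are recomputed on every call, whereas a real Merkle tree index would store them.

```python
import hashlib

def row_hash(key, row):
    # Per-row checksum (assumption: repr() is a stable enough serialization).
    return hashlib.sha256(repr((key, row)).encode()).digest()

def segment_hash(table, keys):
    # Checksum of a key range: hash of the row hashes, in key order.
    h = hashlib.sha256()
    for k in keys:
        h.update(row_hash(k, table[k]))
    return h.digest()

def diff_keys(a, b, lo, hi, leaf_size=2):
    """Return the keys in [lo, hi) whose rows differ between tables a and b."""
    keys_a = sorted(k for k in a if lo <= k < hi)
    keys_b = sorted(k for k in b if lo <= k < hi)
    # Matching segment checksums: skip the whole range without comparing rows.
    if segment_hash(a, keys_a) == segment_hash(b, keys_b):
        return []
    # Small enough range: compare the rows directly.
    if hi - lo <= leaf_size:
        return sorted(k for k in set(keys_a) | set(keys_b) if a.get(k) != b.get(k))
    # Otherwise split the range and descend only into halves that differ.
    mid = (lo + hi) // 2
    return diff_keys(a, b, lo, mid, leaf_size) + diff_keys(a, b, mid, hi, leaf_size)
```

With stored internal hashes, only the branches that differ would need to be read, which is where the speedup over a full scan would come from.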

Indeed, a naive FULL OUTER JOIN is O(N), which is not efficient.

An RDBMS could implement something like the rsync algorithm, or history tables, etc., to speed up a FULL OUTER JOIN.

The point is that FULL OUTER JOIN is the SQL table source "diff" primitive. Thus it seems natural to use that and let the RDBMS optimize it.
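As an illustration of the "diff primitive" point, here is a small Python sketch that emulates a FULL OUTER JOIN on a primary key and keeps only the rows that differ. The dict-based tables and the function name are assumptions for the example, and a missing row is represented as `None`, so it cannot tell a missing row from a row stored as `None`.

```python
def full_outer_join_diff(left, right):
    """Emulate, roughly:
        SELECT * FROM a FULL OUTER JOIN b ON a.id = b.id
        WHERE a.id IS NULL OR b.id IS NULL OR a.val IS DISTINCT FROM b.val
    Tables are dicts mapping primary key -> row tuple.
    """
    diff = {}
    # The union of keys is what makes this a FULL outer join.
    for key in left.keys() | right.keys():
        l = left.get(key)   # None means "no such row on the left"
        r = right.get(key)  # None means "no such row on the right"
        if l != r:
            diff[key] = (l, r)
    return diff
```

Each entry in the result reads as `(left_row, right_row)`: `None` on the left means the row was added, `None` on the right means it was deleted, and two rows mean it changed. Like the naive join, this touches every row on both sides.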
