
How to delete large amounts of data from SQL Server - ed_elliott_asc
https://the.agilesql.club/2019/05/how-to-delete-a-lot-of-data-from-a-sql-server-database?h
======
SigmundA
One thing I never understood is why it logs the data deleted? It obviously
logged it on insert, it could just log the row ID's deleted.

They say its for rollback, but why wouldn't you just mark rows deleted during
transaction, then actually delete them later in the background?

Anyone know if other database do this differently, I think Postgres doesn't
log the full row on delete but haven't been able to find much on it.

This is always a big problem when I try and use SQL server to store things
like logs where you want to retain them for a period of time then delete them.
At high volumes the logging overhead becomes an issue especially in the back
up system. Usually have to do a separate DB in simple mode or start doing
partitioned tables with truncate, would be so nice not to have to think about
it much.

~~~
anarazel
Hi,

> They say its for rollback, but why wouldn't you just mark rows deleted
> during transaction, then actually delete them later in the background?

> Anyone know if other database do this differently, I think Postgres doesn't
> log the full row on delete but haven't been able to find much on it.

Yes, postgres doesn't do that. But that has substantial disadvantages too:
It's why postgres then needs to VACUUM that relation later - to remove all the
deleted and updated rows. Postgres cannot reuse space on the page with the
deleted rows, even if the the transaction that performed the DELETE also would
like to reuse the space (in some UNDO based systems the space won't be
reusable if _another_ transaction wants to reclaim it, because there needs to
be space for ROLLBACK to put the tuples back).

> This is always a big problem when I try and use SQL server to store things
> like logs where you want to retain them for a period of time then delete
> them. At high volumes the logging overhead becomes an issue especially in
> the back up system. Usually have to do a separate DB in simple mode or start
> doing partitioned tables with truncate, would be so nice not to have to
> think about it much.

Using partitioning usually is the right approach for this kind of load, in
postgres as well.

------
teddyuk
The thing about triggers being able to do anything, I have seen triggers make
webservice calls - it was horrible and caused so many issues it was
unbelievable

