sathishmg's comments | Hacker News

Good one. Quick question: if you store it hash-chained, how are you handling GDPR erasure requests? Isn't GDPR supposed to require erasure within 30 days rather than 180? Do you recreate the chain, use some sort of pseudonymisation, or something else?

Great question.

voxic11 is right that the AI Act creates a legal obligation that provides a lawful basis for processing under GDPR Article 6(1)(c).

To add to that, Article 17(3)(b) specifically carves out an exemption to the right to erasure where retention is necessary to comply with a legal obligation.

(So the defence works at both levels; you have a lawful basis to retain, and erasure requests don’t override it during the mandatory retention period).

That said, GDPR data minimisation (Article 5(1)(c)) still constrains what you log.

The library addresses this at write time today: the pii config lets you SHA-256-hash inputs/outputs before they hit the log and apply regex redaction patterns, so personal data need never enter the chain in the first place.

This enables the pattern of “Hash by default, only log raw where necessary for Article 12”.
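As a rough sketch of that "hash by default" write path (the field names, redaction pattern, and config shape here are my assumptions, not the library's actual API):

```python
import hashlib
import json
import re

# Hypothetical redaction pattern; a real config would likely support several.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def sanitize(text: str) -> str:
    """Redact obvious PII patterns, then SHA-256 the remainder."""
    redacted = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return hashlib.sha256(redacted.encode()).hexdigest()

def log_entry(prev_hash: str, decision_id: str, raw_input: str, raw_output: str) -> dict:
    """Append-only entry: only digests enter the chain, never raw content."""
    entry = {
        "decision_id": decision_id,
        "input_sha256": sanitize(raw_input),    # PII never hits the log
        "output_sha256": sanitize(raw_output),
        "prev_hash": prev_hash,                 # links to the previous entry
    }
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    return entry
```

Because the chain only ever sees digests, an erasure request has nothing to delete from the chain itself.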

For cases where raw content must be logged (e.g., full decision reconstruction for a regulator), we’re planning a dual-layer storage approach: the hash chain would cover a structural envelope (timestamps, decision ID, model ID, parameters, latency, hash pointers), while the actual PII-bearing content (input prompts, output text) would live in a separate, referenced object.

Erasure would then mean deleting the content object, and the chain would stay intact because it never hashed the raw content directly.

The regulator would also therefore see a complete, tamper-evident chain of system activity.
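A minimal sketch of that dual-layer pattern (all names here are illustrative assumptions; `content_store` stands in for a separate object store):

```python
import hashlib
import json

content_store = {}  # erasable layer: raw, PII-bearing content

def append_entry(chain: list, decision_id: str, raw_content: str) -> None:
    """Chain an envelope that references the content by hash, never inlines it."""
    content_hash = hashlib.sha256(raw_content.encode()).hexdigest()
    content_store[content_hash] = raw_content
    envelope = {
        "decision_id": decision_id,
        "content_ref": content_hash,  # pointer only, not the content
        "prev_hash": chain[-1]["hash"] if chain else "GENESIS",
    }
    envelope["hash"] = hashlib.sha256(
        json.dumps(envelope, sort_keys=True).encode()
    ).hexdigest()
    chain.append(envelope)

def erase(content_hash: str) -> None:
    """GDPR erasure: drop the content object; the chain is untouched."""
    content_store.pop(content_hash, None)

def verify(chain: list) -> bool:
    """The chain verifies from envelopes alone, even after erasures."""
    prev = "GENESIS"
    for e in chain:
        body = {k: v for k, v in e.items() if k != "hash"}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if e["prev_hash"] != prev or e["hash"] != expected:
            return False
        prev = e["hash"]
    return True
```

Deleting a content object leaves a verifiable gap: the regulator can still confirm that a decision happened and when, just not read the erased prompt.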


Thanks both for the replies. Can't you make it simpler: encrypt the data, store the encryption key separately, and move the raw data to cold storage. If a user wants erasure, delete the encryption key, avoiding a massive recompute from cold storage. Do you think this is a better approach? It's not efficient, but at large scale (petabytes) it could work. Developers make mistakes, though: if they miss encrypting something due to a bug in the code and then want to fix it, the hash chaining will be a problem.

IMO what you’re describing is essentially crypto-shredding.

It would definitely work (and when dealing with petabyte levels of data the simplicity of only having to delete the key is convenient).
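For illustration, a toy crypto-shredding sketch. The SHA-256 counter keystream below is a stdlib stand-in for a real AEAD cipher (in practice you'd use AES-GCM via the `cryptography` package), and every name here is an assumption, not anyone's actual implementation:

```python
import hashlib
import os

keys = {}  # per-user key store; deleting a key "shreds" that user's records

def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data against a SHA-256(key || nonce || counter) keystream (toy cipher)."""
    out = bytearray()
    for i in range((len(data) + 31) // 32):
        pad = hashlib.sha256(key + nonce + i.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[i * 32:(i + 1) * 32], pad))
    return bytes(out)

def store(user_id: str, record: bytes) -> tuple:
    """Encrypt a record; the ciphertext can go to cold storage as-is."""
    key = keys.setdefault(user_id, os.urandom(32))
    nonce = os.urandom(16)
    return nonce, _keystream_xor(key, nonce, record)

def erase_user(user_id: str) -> None:
    """Erasure request: drop the key; all that user's ciphertext becomes unreadable."""
    keys.pop(user_id, None)

def read(user_id: str, nonce: bytes, ciphertext: bytes) -> bytes:
    """Raises KeyError once the user's key has been shredded."""
    return _keystream_xor(keys[user_id], nonce, ciphertext)
```

The appeal is exactly what you describe: erasure is O(1) regardless of how much ciphertext sits in cold storage.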

We’re leaning toward the dual-layer separation I described, though (metadata separate from content), mainly because crypto-shredding makes every read (including regulatory reconstruction) depend on a key store.

In my view that’s a significant dependency for an audit log whose whole purpose is reliable reconstructability, whereas dual-layer lets the chain stand on its own.

Your point about developer mistakes is fair. It applies to dual-layer too, as your example shows, but I’d say crypto-shredding isn’t immune to mistakes either: deleting the key only works if the key and plaintext never accidentally leaked elsewhere, in logs, backups, etc.


thanks, that's a good approach!!

GDPR permits retention where necessary for compliance with a legal obligation (Article 6(1)(c)).

The AI Act qualifies as such a legal obligation.


I got the creativity back by asking more questions :)
