I wrote my own duplicate file finder way back in the days. I did the obvious tri... | Hacker News


magicalhippo on Aug 30, 2021 | on: Go Find Duplicates: A fast and simple tool to find...

I wrote my own duplicate file finder way back in the day. I did the obvious trick of binning by size before computing any hashes, and was mildly surprised to find how few of my ~million files had exactly the same size. For files with identical sizes I just did a full-file MD5; we only had HDDs back then, and we all know how much they like random access.
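
For readers who want to see the size-binning trick spelled out, here is a minimal Python sketch of the approach described in the comment above. It is not the commenter's original code: the function names `file_md5` and `find_duplicates` and the 1 MiB chunk size are assumptions made for illustration. Files are first grouped by size using metadata only, and a sequential full-file MD5 is computed only for files that share a size with at least one other file.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Hash the whole file sequentially, one chunk at a time.

    The 1 MiB chunk size is an arbitrary illustrative choice.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return sorted lists of paths under root whose size and MD5 both match."""
    # Pass 1: bin by size. This needs only file metadata, no content reads.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share a size with at least one other file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_md5(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling `find_duplicates("/some/dir")` returns one list per set of matching files. Reading each candidate file front to back keeps access sequential within a file, which suits spinning disks; a matching MD5 is treated as a duplicate here without a final byte-by-byte comparison.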

unnouinceput on Aug 31, 2021

I wrote one too, over 20 years ago. That .exe still works, even today. Unsurprisingly, I was using CRC32 too. When I look at the code now I cringe; it is such a mess. Oh well, everyone has to start somewhere.
