
Show HN: Record Linkage Resources - ropeladder
https://github.com/ropeladder/record-linkage-resources
======
fiatjaf
Thank you.

I don't understand half of this, but it seems very useful. It needs better
naming and explanation of use cases (if I don't understand it, maybe the
people who need these tools don't either).

~~~
ropeladder
Record linkage is conceptually pretty simple: it's just deduplicating records
(usually of people, in census or medical or commercial or other scenarios)
that don't have unique identifiers. It's tough because the naive approach of
comparing every record against every other one is O(N^2), and because you're
trying to combine a bunch of disparate information into a single matching
decision.
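
To make that concrete, here's a minimal sketch of the naive approach in
Python, using only the standard library. The records, field names, weights,
and threshold are all made up for illustration; real tools use more careful
comparison functions and usually avoid comparing every pair.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records with no shared unique identifier.
records = [
    {"id": 1, "name": "John Smith", "birth_year": 1980, "city": "Boston"},
    {"id": 2, "name": "Jon Smith", "birth_year": 1980, "city": "Boston"},
    {"id": 3, "name": "Mary Jones", "birth_year": 1975, "city": "Denver"},
]


def similarity(a, b):
    """Fold several disparate fields into one score between 0 and 1."""
    name_score = SequenceMatcher(
        None, a["name"].lower(), b["name"].lower()
    ).ratio()
    year_score = 1.0 if a["birth_year"] == b["birth_year"] else 0.0
    city_score = 1.0 if a["city"] == b["city"] else 0.0
    # Weights are arbitrary assumptions, chosen only for this example.
    return 0.6 * name_score + 0.2 * year_score + 0.2 * city_score


def find_matches(records, threshold=0.85):
    """Compare every pair of records: N*(N-1)/2 comparisons, i.e. O(N^2)."""
    matches = []
    for a, b in combinations(records, 2):
        if similarity(a, b) >= threshold:
            matches.append((a["id"], b["id"]))
    return matches


matches = find_matches(records)
```

With these toy records, only records 1 and 2 score above the threshold. The
pairwise loop is what blows up: 3 records need 3 comparisons, but a million
records need roughly 500 billion.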

It's helpful to hear how confusing it is. I stumbled around for literally
years trying to build my own solution until finally a kind soul on the Data
Science board on Stack Exchange directed me to the Record Linkage Wikipedia
page, and I figured out some of the right terms to search for.

I'll try adding some more examples and use case scenarios up top.

I'm also hoping to add some info to help differentiate the software packages
a bit more, but unfortunately most of them are barely past the
proof-of-concept stage in terms of usability.

~~~
fiatjaf
Thank you.

