
Ask HN: Data Matching and Reconciliation machine learning algorithms suggestions - maddy1512
I am trying to solve a data reconciliation problem using ML and need suggestions on which algorithm would be suitable.
Follow the link for more detail:
https://www.kaggle.com/questions-and-answers/171307
======
Imanari
In your example it seems the primary clue to find matches is the name, i.e.
'ABC' + Corp/Des/etc. So how about doing some fuzzy string matching? Once you
have done this you can identify edge cases and additionally group by dates or
whatever.
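
A minimal sketch of the fuzzy-matching step, assuming plain name strings and using only Python's standard `difflib`. The sample names, the `candidates` helper, and the 0.5 threshold are hypothetical and would need tuning on real data:

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidates(name, pool, threshold=0.5):
    """Return names from pool whose similarity to name meets the threshold,
    best match first. The threshold is an assumed starting point."""
    scored = [(similarity(name, other), other) for other in pool]
    scored.sort(reverse=True)
    return [other for score, other in scored if score >= threshold]


# Hypothetical names standing in for entries in S.
s_names = ["ABC Corp", "ABC Des", "XYZ Ltd"]
matches = candidates("ABC", s_names)
print(matches)
```

Anything that scores near the threshold is a natural edge case to review by hand or to disambiguate with dates.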

So you would have 'ABC' in L and a selection of matches in S. If not all of
the matches in S actually belong to the ABC in L, you are faced with the
Knapsack Problem[0], which you can solve with different methods (sorry, no
expert here).

[0] https://en.wikipedia.org/wiki/Knapsack_problem
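
To make the selection step concrete: picking which candidates in S net against one entry in L is the subset-sum special case of the knapsack problem. Below is a brute-force sketch, assuming amounts are stored as integer cents and that only a handful of candidates survive the name matching. The function name, the `max_size` cap, and the sample amounts are all hypothetical:

```python
from itertools import combinations


def find_netting_subset(target_cents, candidate_cents, max_size=4):
    """Return the indices of the first (smallest) subset of candidates whose
    sum cancels target_cents, or None if no subset up to max_size does."""
    for size in range(1, max_size + 1):
        for idx in combinations(range(len(candidate_cents)), size):
            if target_cents + sum(candidate_cents[i] for i in idx) == 0:
                return idx
    return None


# Hypothetical amounts in cents: one entry from L and its candidates from S.
l_amount = 100_000
s_amounts = [-25_000, -40_000, -60_000, -15_000]
result = find_netting_subset(l_amount, s_amounts)
print(result)
```

The search is exponential in the number of candidates, so this only works on small groups; for larger ones a dynamic-programming or solver-based approach would be the usual next step.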

------
doonesbury
You mean comparing data? For what purpose (to help assess solution) ... and
why ML? Surely a rules engine is much more practical.

~~~
maddy1512
Umm... not comparing data, but taking a data point and finding its nearest
data points whose amounts net to zero. A rule engine might work where the data
is not complex, but here there are a lot of complexities: for example, you
don't have exact matching features that would give a rule-based matching
engine enough certainty.

