
Ask HN: What are your small data problems? - smalldataguy
I'm looking into a data-quality-as-a-service idea I had, focused on "small data". Inspired a bit by https://www.neutrinoapi.com/ which describes itself as a "general purpose API".

The tech scene is all about data quantity, but I've found that a lot of small or medium businesses, not necessarily in tech, have what I call small data problems.

For example, CRMs are rife with duplicate data, but the risk of losing possible customers in the deduplication process makes them struggle along.

Address lists might not have great geographic coverage, so visualizing where your customers are can help you identify opportunities. That kind of stuff.

Anyone have frustrations with their data content or quality that they'd like to share? If you're willing to have a 15 minute voice conversation with me, I'll buy you a book from your amazon wishlist as a sign of appreciation.
======
staticautomatic
Cleaning and standardizing are easily the biggest "small" problems. With
small-ish data sets (<= a thousand records), I often find it's easier to have
a cheap contractor do the work by hand than to spend time writing a
programmatic solution.

------
minimaxir
Data cleanup is a contextual activity that depends on business needs and
capabilities. Using an API to heuristically clean up and analyze data will
very likely lead to misleading results, which is worse than having no results
at all.

~~~
smalldataguy
Can you help me understand what you mean by giving some concrete examples
you've experienced?

