
Ask HN: Would you buy/sell encrypted dataset for machine learning? - bartolomej
Hi, would you buy or sell quality datasets to train your AI on? The thing is that both the data and the model would be encrypted, so each would stay hidden from the other party. What pros/cons do you see in such an approach? Thanks, TB
======
verdverm
Encryption does not equal privacy. Some amount of metadata would be required
to know if the data is the type of data we want to make a model for. How would
data cleaning or feature selection work?

~~~
bartolomej
Thank you for the remark! I was thinking about this as a problem too, and it
needs to be addressed. I can imagine that some metadata would be public.
Keeping it simple: basic stats of the dataset, the number of items in each
category, and the names of the categories. But the dataset would still not
leak sensitive information about the exact data (like the identities of
people), so it could not be misused or stolen.

