

Ask HN: Are we collecting more data, more noise, or more rare events? - rmord

I posted this as a blog entry and on Quora, but I would also like to hear thoughts from the HN community.<p>We are in a data splurge. Everyone is interested in data and in developing data-driven products. We collect tons of data about our users.<p>But how does one decide what data is worth collecting? And how do you strike the balance between collecting ever more noise and capturing the events that are likely to give us the crucial insight?
======
codesuela
I would argue that it depends on your architecture. If you gather more data
than your data storage can handle and it starts slowing down your core
product, then you should stop. But if that is not a factor, you could
collect as much as possible and decide later whether it was or is worth
collecting, UNLESS, in a user-driven project, it starts raising privacy
concerns.

------
paulsutter
\- Well-chosen statistics can extract the needles from the haystack. So
collecting more may be better. This observation is so true that some people
prefer the phrase "extracting needles from needles".

\- Get familiar with the new Consumer Privacy Bill of Rights.

