HyperLogLog and MinHash (adroll.com)
47 points by ColinWright on Dec 7, 2014 | 7 comments



For those interested, here's a nice HyperLogLog implementation in Scala: https://github.com/twitter/algebird/blob/develop/algebird-co...


I love that every month or two someone who missed it the first time rediscovers this article, has the "holy crap!" moment, and posts it to HN. :)


Can you point me at earlier conversations? I'd love to see what the HN hive-mind has to contribute to this. I've only found one previous submission[0], and that has no discussion.

Thanks.

[0] https://news.ycombinator.com/item?id=6460048


Try HN search - https://hn.algolia.com/#!/story/forever/prefix/0/hyperloglog

But I take exception to hive-mind; this place really isn't.


Thanks for the link to search, but I've already done that, and you serve to confirm my point. Unless I'm mistaken, that search shows lots of items about HyperLogLog, but exactly one previous submission of this specific subject wherein the facilities of HLL are extended to include intersections. And that item has no discussion.

This submission is not about HyperLogLog - it's about an extension that has more capability. When Harimwakairi said:

    > I love that every month or two
    > someone who missed it the first
    > time rediscovers this article,
    > has the "holy crap!" moment,
    > and posts it to HN. :)
... they were, quite simply, wrong. Moreover, that comment may have led people to say "Oh, I've seen that before" and ignore this one.

  > But I take exception to hive-mind;
  > this place really isn't.
I guess this is a case of terms carrying different baggage in different places. In my circles the implication is "collective intelligence" rather than "group-think". I intended the former - it's a compliment.


We recently open-sourced the algorithm described in the blog post here:

https://github.com/AdRoll/cantor

Btw, we are hiring! Feel free to ping me at ville@adroll.com if you find topics like this relevant to your interests.


Streaming algorithms, or sketches, are a fascinating topic. Some of the results are virtually indistinguishable from magic.

I can recommend this blog for those interested: http://research.neustar.biz/tag/sketching/
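
For anyone who wants to see the flavour of the intersection trick mentioned upthread, here is a minimal sketch in Python. It is not the AdRoll/cantor implementation and it does not use HyperLogLog; as a simplifying assumption it uses a single bottom-k (k-minimum-values) sketch for both the union cardinality and the MinHash-style Jaccard estimate, then multiplies the two. All function names are made up for illustration, and building the sketch from a full set is only for brevity (a real streaming version would keep just the k smallest hashes as items arrive).

```python
import hashlib


def h(item):
    """Map an item to a pseudo-random float in [0, 1) via SHA-1."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def bottom_k(items, k):
    """Sketch = the k smallest distinct hash values, sorted ascending."""
    return sorted({h(x) for x in items})[:k]


def union_sketch(a, b, k):
    """The sketch of a union is the k smallest values of both sketches."""
    return sorted(set(a) | set(b))[:k]


def cardinality(sketch, k):
    """Estimate the number of distinct items behind a sketch."""
    if len(sketch) < k:
        return float(len(sketch))  # sketch was never full, so this is exact
    return (k - 1) / sketch[-1]    # standard k-minimum-values estimator


def jaccard(a, b, k):
    """Fraction of the union's smallest hashes that appear in both sketches."""
    u = union_sketch(a, b, k)
    if not u:
        return 0.0
    sa, sb = set(a), set(b)
    return sum(1 for v in u if v in sa and v in sb) / len(u)


def intersection_size(a, b, k):
    """|A ∩ B| is approximately Jaccard(A, B) * |A ∪ B|."""
    return jaccard(a, b, k) * cardinality(union_sketch(a, b, k), k)
```

Usage would look like `intersection_size(bottom_k(A, 1024), bottom_k(B, 1024), 1024)`. The estimate gets noisy when the intersection is small relative to the union, which is the usual caveat with this family of approaches.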



