
Show HN: Find HN threads about the page you're browsing - achairapart
https://github.com/pinoceniccola/what-hn-says-webext
======
joshstrange
Love the idea. I have thought about this before, and the main issue is training
myself to click an extension to check. If an icon flashed when there was an HN
submission (ideally only if I didn't come directly from HN, since I already
know) it would be way more useful.

That said, we all know the big issue with that is privacy. I don't want an
extension sending every URL I visit to any service (directly to the API or
through some third party). I've mulled over this issue before and I'm not sure
how much space it would take up to store a list of URLs that have been
submitted to HN (maybe keep 1 month plus all submissions that got over 100
upvotes or something) and check against that local list.

Then, and only then, when you get a match you can call out to the Algolia API
to get the HN url (or store that as well depending on size).

I have no idea, off the top of my head, what the storage requirements for this
look like but I don't think they would be huge. The other issue (which I want
to look into the source to see how this handles it) is the stupid social/ads
tracking params that are added to URLs. Maybe there is a good list of these
that you can remove (from both the current URL and the HN submission) so you
can see if it's the same base URL.
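
A minimal sketch of that normalization step, assuming a hand-picked and certainly incomplete list of tracking parameters. The `TRACKING_PARAMS` list and the `stripTrackingParams` name are made up for illustration and are not taken from the extension's source:

```javascript
// Hypothetical, non-exhaustive list of tracking parameters (an assumption).
const TRACKING_PARAMS = new Set(["fbclid", "gclid", "mc_cid", "mc_eid"]);

// Return the URL with utm_* and other known tracking parameters removed.
function stripTrackingParams(rawUrl) {
  const url = new URL(rawUrl);
  // Copy the keys first so deleting does not disturb the iteration.
  for (const key of [...url.searchParams.keys()]) {
    if (key.startsWith("utm_") || TRACKING_PARAMS.has(key)) {
      url.searchParams.delete(key);
    }
  }
  return url.toString();
}
```

Running it on both the current tab's URL and the stored HN submission URL, then comparing the two results, gives the "same base URL" check.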

~~~
Willamin
> I'm not sure how much space it would take up to store a list of urls that
> have been submitted to HN

I was recently wondering how much space this would take up myself. After a lot
of searching, I found this Reddit post, which links to an archive of
Hacker News. It contains data from HN from late 2006 until mid 2018 and totals
just over 2 gigabytes. These dumps contain all comments, job postings, polls,
poll options, and stories.

I did some super quick analysis of the 2018-05 archive (the latest provided by
this source). I found that there were 237,646 total items, and only 32,473 of
those are stories. That's only ~14%. Assuming the ratio of stories to non-
stories has been constant for the entire dataset, that's only 280 megabytes
for the entire 2006 to 2018 set.
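
The back-of-the-envelope estimate above, written out. It assumes the archive is roughly 2,000 MB and that the 2018-05 story share holds for every year; both are assumptions from the comment, not measurements:

```javascript
const totalItems = 237646;     // items in the 2018-05 archive
const storyItems = 32473;      // of which are stories
const archiveMegabytes = 2000; // "just over 2 gigabytes", rounded down

// Scale the whole archive by the share of items that are stories.
function estimateStoryMegabytes(stories, items, totalMegabytes) {
  return (stories / items) * totalMegabytes;
}

const storyShare = storyItems / totalItems;
const storyMegabytes = estimateStoryMegabytes(storyItems, totalItems, archiveMegabytes);
console.log(`${(storyShare * 100).toFixed(1)}% stories, about ${Math.round(storyMegabytes)} MB`);
```

This treats every item as the same size on average, which is probably pessimistic for stories if comments carry most of the text.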

That data can be shrunk further by removing extraneous information from each
story in the data. Mirroring the HN API, it has the following pieces of data
for each item: author username, id, date retrieved, score, time posted, title,
type, url, whether it's dead, how many descendants it has, and what items
are its kids. I didn't attempt to reduce the data to only contain links, but I
imagine it would significantly reduce the size.

Once you've reduced the data down to a list of urls, I imagine it can be
reduced even more by removing duplicate links.

Depending on the average size of the urls, it's not unreasonable to think that
taking a hash of each of the urls would result in a smaller set of data.
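
A rough sketch of the hashing idea, using 64-bit FNV-1a purely as an example hash (the thread does not name one). The `buildHashIndex` and `isProbablySubmitted` names are hypothetical:

```javascript
// 64-bit FNV-1a over the UTF-8 bytes of a string, returned as a BigInt.
function fnv1a64(text) {
  let hash = 0xcbf29ce484222325n; // FNV offset basis
  for (const byte of new TextEncoder().encode(text)) {
    hash ^= BigInt(byte);
    hash = (hash * 0x100000001b3n) & 0xffffffffffffffffn; // FNV prime, kept to 64 bits
  }
  return hash;
}

// Store only the hashes; the Set also removes duplicate links.
function buildHashIndex(urls) {
  return new Set(urls.map(fnv1a64));
}

// "Probably" because two different URLs could share a hash.
function isProbablySubmitted(index, url) {
  return index.has(fnv1a64(url));
}
```

Each URL costs a fixed 8 bytes of hash regardless of its length, at the price of no longer being able to recover the URL from the index.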

On top of that, there's wonderful text compression, but I don't have the
numbers on how much that would reduce the size of data.

~~~
tleb_
I was curious, so I downloaded a list of id-url pairs from here [0]. It's CSV
formatted and contains 1_960_207 entries (last update being 22 Feb 2019). It
is 134MiB uncompressed and 35MiB compressed using xz, so definitely storable
in a web extension.

IDs being integers smaller than 10_000_000, they can be stored in 3 bytes, and
a 64-bit hash function is enough (using this approximation [1] with
k=2_000_000 and N=2^64 gives p=1.08e-7). At 3 + 8 = 11 bytes per entry, that
accounts for 22MB for 2 million entries. Stats on duplicates would be needed
to know the impact of bundling identical hashes together. Definitely doable!
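
The two numbers above can be reproduced with a few lines. This is only a sketch of the arithmetic, assuming the approximation from [1] in the form p ≈ k(k-1)/(2N); the function names are made up:

```javascript
// Size of an index that stores one fixed-width ID and one hash per entry.
function indexSizeBytes(entries, bytesPerId, bytesPerHash) {
  return entries * (bytesPerId + bytesPerHash);
}

// Approximate probability of at least one collision among k random
// hashes of the given width, valid while the result is small.
function collisionProbability(k, hashBits) {
  const n = 2 ** hashBits;
  return (k * (k - 1)) / (2 * n);
}

console.log(indexSizeBytes(2000000, 3, 8) / 1e6, "MB");
console.log(collisionProbability(2000000, 64).toExponential(2));
```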

Keeping up to date would be harder; having a server query the API to
collect and distribute the day-by-day data to every extension user is probably
the best option.

[0]: [https://console.cloud.google.com/marketplace/product/y-combi...](https://console.cloud.google.com/marketplace/product/y-combinator/hacker-news)

[1]: [https://preshing.com/20110504/hash-collision-probabilities/](https://preshing.com/20110504/hash-collision-probabilities/)

------
coffeemug
This is awesome. I really love this idea, and love the implementation. Works
great and I think will be really useful or at least interesting (can't tell
yet).

One piece of feedback-- I'd love a mode where the extension notifies me there
are HN threads that pass a certain threshold (e.g. number of upvotes, number
of comments, etc.) for every page I visit. This is less privacy preserving,
but I'd be willing to make that trade-off in exchange for useful information
being surfaced to me opportunistically.
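
A sketch of what such a threshold check could look like once search results are in hand. It assumes hits shaped like the Algolia HN search results, with `points` and `num_comments` fields; the `notableThreads` name and the default thresholds are invented for illustration:

```javascript
// Keep only threads that clear at least one of the two thresholds.
function notableThreads(hits, { minPoints = 100, minComments = 50 } = {}) {
  return hits.filter(
    (hit) => (hit.points || 0) >= minPoints || (hit.num_comments || 0) >= minComments
  );
}

// Example with made-up data:
const sampleHits = [
  { objectID: "1", points: 250, num_comments: 12 },
  { objectID: "2", points: 3, num_comments: 0 },
  { objectID: "3", points: 40, num_comments: 80 },
];
console.log(notableThreads(sampleHits).map((hit) => hit.objectID));
```

The extension would only notify when this list is non-empty, so low-traffic submissions stay silent.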

Thanks for making it!!

------
adenozine
I'd like something like this in reverse.

Downvote if you will; all the same, I find most HN discussions to be of
relatively low value, and it's also not easy to vet whether or not someone's
credentials align with what they're writing. I come across interesting links
on HN all the time, and I wish I could have something to tell me "Oh, there's
a LtU user with hundreds of posts discussing this with links to papers and
proofs."

That's just my personal perspective: I don't see myself coming across
anything and thinking "Gee, I wonder what HN thinks about this."

I love the idea though. I wish browsers didn't suck so much. I wish Opera had
won more, and maybe we'd have lots of different browsers, infinitely
configurable like emacs/vim, with my whole little customized universal
browsing tool. Extensions are an adequate compromise; it's just that the kind
of person who thinks this stuff up could do so much MORE if browsers weren't so
limited.

~~~
localcdn
>there's a LtU user

What is LtU?

~~~
jaen
[http://lambda-the-ultimate.org/](http://lambda-the-ultimate.org/)

------
yalooze
I use Kiwi Conversations[0] for this which allows you to check HN, Reddit and
others if you want.

The design of What Hacker News Says is really nice though.

[0] [https://chrome.google.com/webstore/detail/kiwi-conversations...](https://chrome.google.com/webstore/detail/kiwi-conversations/pkifhlefpamigmobjmjjjnjglpebflhp)

~~~
cmbailey
Thanks! Sadly their Firefox extension appears to have been removed/blocked.
[https://addons.mozilla.org/en-US/firefox/addon/kiwi-conversa...](https://addons.mozilla.org/en-US/firefox/addon/kiwi-conversations/)

------
llimos
I made a bookmarklet for Firefox that opens the HN discussion in a new tab (if
there is one) and offers to submit if there isn't. It's very quick and dirty
but it does the trick. It opens the first result from the search API; it can be
modified to open all of them if you want.

    
    
      javascript:(()=>{const w=window.open();fetch(`https://hn.algolia.com/api/v1/search?tags=story&query=${encodeURIComponent(window.location.href)}`).then(a => a.json()).then(a=>{const c=a.hits.filter(b=>b.url===window.location.href)[0];if(c){w.location.replace(`https://news.ycombinator.com/item?id=${c.objectID}`)}else{w.confirm('Not on HN. Submit?') ? w.location.replace(`https://news.ycombinator.com/submitlink?u=${encodeURIComponent(document.location)}&t=${encodeURIComponent(document.title)}`):w.close();}})})()
    

Interestingly, I had to open the tab before getting the search results; it
seems there is an exemption to the popup blocker for bookmarklets, but only
synchronously.

Edit: It seems the backticks mess something up in HN formatting. Code here:
[https://gist.github.com/llimos/ee818bcb3060adc8469f4978c654a...](https://gist.github.com/llimos/ee818bcb3060adc8469f4978c654a184)

------
kioleanu
Haha, I had the exact same idea a few weeks ago:
[https://github.com/viorelsfetea/commenter](https://github.com/viorelsfetea/commenter)

Your design is much nicer tho

~~~
nishparadox
Hey. I've been using the commenter for a month now. Thanks for such a handy
extension. I use it on a daily basis... Appreciate it.

------
KenanSulayman
This is great!

I'm also using [0] which displays mentions of a site on reddit.

(And while you're at it, one [1] that replaces YouTube comments with Reddit
comments from the subreddit threads where a video was posted.)

[0] [https://chrome.google.com/webstore/detail/reddit-check/mllce...](https://chrome.google.com/webstore/detail/reddit-check/mllceaiaedaingchlgolnfiibippgkmj)

[1] [https://chrome.google.com/webstore/detail/karamel-view-reddi...](https://chrome.google.com/webstore/detail/karamel-view-reddit-comme/halllmdjninjohpckldgkaolbhgkfnpe)

------
maxbaines
I also think it would be great to see an indicator of whether or not there are
hits, perhaps not a flash, as that's pretty invasive, but a number of hits, kinda
like the SMS or email count on icons.

The privacy thing is also making me flinch. An idea could be to disable it unless
clicked; when I find an interesting page, product or application, I often
wonder if it's featured on HN.

------
alexpi
Simple and great! I consume a lot of content from HN and bookmark posted links
often. As everyone here knows, HN comments sometimes contribute more to the
topic than the article itself (so I add them to favorites). Now both of them
are linked. Thanks

------
ekzy
Your extension looks good! I made a similar extension with ClojureScript 4
years ago, using the Algolia API too. It's not intrusive and only looks up when you
click. Check out the code here:
[https://github.com/jazzytomato/hnlookup](https://github.com/jazzytomato/hnlookup)

[https://chrome.google.com/webstore/detail/hacker-news-lookup...](https://chrome.google.com/webstore/detail/hacker-news-lookup/ekfmfhhfalhmiacchemmhapffjaolffo)

------
XCSme
That's a really cool idea! I added it. I am mostly curious to learn more about
sites using HN as a way to market their products or to know more about the
context in which a product is discussed.

------
eatonphil
One weird thing to me about this extension and Kiwi Conversations is that they
don't search comments, only submissions.

------
arendtio
'Looped in' is a very similar extension and a few years older:

[https://news.ycombinator.com/item?id=16316374](https://news.ycombinator.com/item?id=16316374)

[https://github.com/jdormit/looped-in](https://github.com/jdormit/looped-in)

------
SilasX
Love it! One suggestion, at risk of promoting feature creep/visual bloat:
maybe go into those threads and pull the top comments (ideally, the top
comments over all discussions), and have those pop up as the first thing I see
on the drop-down, instead of just links to the discussions?

------
commonturtle
Nice, I've wanted something like this for a while. HN often has substantive
comments on writing on the internet, so I often find myself checking if
something interesting I read has been submitted to HN before.

------
ikedaosushi
I'm using a similar extension which lets you see comments and threads.
[https://github.com/doublemarket/hnpopup](https://github.com/doublemarket/hnpopup)

------
johnnujler
To me HN Algolia has been a one-stop shop for everything related to search in
HN. I always have a browser tab that has HN Algolia opened up for any kind of
research. I’d love if this extension could be extended to include HN Algolia
too.

------
ivan_ah
Here is a similar web extension for reddit discussions
[https://thredd.io/](https://thredd.io/)

------
lukeplato
off-topic: even if it's not necessary for a blog, someone needs to hook up
Paul Graham with an SSL certificate.

------
swyx
relatedly - a browser extension that shows you twitter convos about the page
you are browsing [https://github.com/round/Twitter-Links-beta](https://github.com/round/Twitter-Links-beta)

------
zingermc
I feel compelled to point out that this extension sends the URLs of _all_ open
tabs to algolia.com when you click the extension (at least on Chrome).

I would much prefer if it only looked up the current tab.

A more private design might fetch the top N results from algolia.com and only
search through them locally.
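
One way to read that suggestion, as a sketch only: request a generic batch of stories that does not mention the page at all, then compare URLs on the client. It assumes the public HN Algolia search endpoint and its `hitsPerPage` parameter; the function names are hypothetical and `lookupPrivately` needs an environment with `fetch`:

```javascript
// Request URL for the first N story results; it never contains the page URL.
function topStoriesEndpoint(n) {
  return `https://hn.algolia.com/api/v1/search?tags=story&hitsPerPage=${n}`;
}

// Purely local comparison against the fetched results.
function findLocalMatches(hits, currentUrl) {
  return hits.filter((hit) => hit.url === currentUrl);
}

// Fetch the batch, then match locally. Not called here.
async function lookupPrivately(currentUrl, n = 100) {
  const response = await fetch(topStoriesEndpoint(n));
  const { hits } = await response.json();
  return findLocalMatches(hits, currentUrl);
}
```

The trade-off is coverage: anything outside the fetched batch is invisible, so this only catches pages that are currently prominent.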

That being said, this is cool! Thanks for sharing.

~~~
achairapart
>I feel compelled to point out that this extension sends the URLs of all open
>tabs to algolia.com when you click the extension (at least on Chrome).

Wait, how's that possible? The extension doesn't even have permission to get
urls from tabs that are not the active one...

~~~
zingermc
Your comment made me dig in a little more. I was wrong, it is only fetching
the current tab, although it wouldn't need more permissions to see all the
tabs.

In popup.js[1]:

    
    
        chrome.tabs.query({active:true,currentWindow:true}, function(tabs){ ... })
    

These `active` and `currentWindow` parameters to query() [2] restrict the
results to the current tab. If I remove those parameters and run in DevTools,
I seem to get a full tab listing.

[1]: [https://github.com/pinoceniccola/what-hn-says-webext/blob/ma...](https://github.com/pinoceniccola/what-hn-says-webext/blob/master/popup.js#L104)

[2]: [https://developer.chrome.com/extensions/tabs#method-query](https://developer.chrome.com/extensions/tabs#method-query)

~~~
achairapart
Even without the `active` and `currentWindow` parameters, the extension cannot get
URLs and titles from other tabs because it has only the `activeTab`[1]
permission declared in the manifest. You need a more powerful permission for
that.

I think with the `activeTab` permission you still get an object for every
tab other than the active one, but without access to the `url`, `title` and
`favIconUrl` properties.
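
A small sketch of how one could check this from the popup's DevTools console. `readableTabUrls` is a hypothetical helper; it just keeps the tabs whose `url` the browser actually exposed:

```javascript
// Given the array passed to the chrome.tabs.query callback, return only
// the URLs the extension was allowed to see.
function readableTabUrls(tabs) {
  return tabs.filter((tab) => typeof tab.url === "string").map((tab) => tab.url);
}

// In the extension popup (not runnable outside a browser extension):
// chrome.tabs.query({}, (tabs) => console.log(tabs.length, readableTabUrls(tabs)));
```

If the description above is right, the logged tab count would cover every open tab while the URL list would hold only the active one.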

Thanks for checking it out anyway. I built this tool especially because all of
the others already available were a privacy nightmare.

[1]: [https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...](https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/manifest.json/permissions#activeTab_permission)

