
Show HN: Chrome extension to avoid noisy domains in the Hacker News feed - mathiasrw
https://github.com/mathiasrw/no-noise-hacker-news/blob/master/README.md
======
im_dario
From the source code:

Noise list generate with data from
[https://bigquery.cloud.google.com/dataset/bigquery-public-data:hacker_news](https://bigquery.cloud.google.com/dataset/bigquery-public-data:hacker_news)
per november 2015

All domains with more than 2500 stories with more than 60% of stories having 3
or less votes

Sites with usergenerated content was manually filtered out.

[...]

Sites with usergenerated content:

    
    
      - blogspot.com
      - youtube.com
      - wordpress.com
      - tumblr.com
      - google.com
      - wikipedia.org
      - github.com
      - bit.ly
      - typepad.com
      - reddit.com
      - stackoverflow.com
      - quora.com
      - youtu.be
      - stackexchange.com
    

Result:

    
    
      - techcrunch.com
      - nytimes.com
      - arstechnica.com
      - wired.com
      - bbc.co.uk
      - wsj.com
      - businessinsider.com
      - forbes.com
      - cnn.com
      - venturebeat.com
      - mashable.com
      - theverge.com
      - thenextweb.com
      - cnet.com
      - washingtonpost.com
      - theatlantic.com
      - readwriteweb.com
      - gigaom.com
      - theguardian.com
      - economist.com
      - reuters.com
      - bloomberg.com
      - yahoo.com
      - guardian.co.uk
      - zdnet.com
      - engadget.com
      - slate.com
      - technologyreview.com
      - theregister.co.uk
      - posterous.com
      - bbc.com
      - gizmodo.com
      - npr.org
      - businessweek.com
      - itworld.com
      - fastcompany.com
      - huffingtonpost.com
      - telegraph.co.uk
      - networkworld.com
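
The selection rule quoted above (more than 2500 stories, more than 60% of them with 3 or fewer votes, user-generated-content sites excluded by hand) can be sketched roughly as below. This is only an illustration, not the author's actual query: the function names, the `(url, score)` input shape, and the shortened exclusion set are assumptions, and subdomains are not collapsed to a registered domain.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Thresholds taken from the quoted source comment.
MIN_STORIES = 2500
LOW_VOTE_MAX = 3
LOW_VOTE_SHARE = 0.60

# Abbreviated for the sketch; the full manual exclusion list is quoted above.
USER_GENERATED = {"blogspot.com", "youtube.com", "wordpress.com", "github.com"}


def domain_of(url):
    """Return the host of a URL without a leading 'www.' (simplified)."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def noisy_domains(stories, min_stories=MIN_STORIES):
    """stories: iterable of (url, score) pairs. Returns a sorted list of domains."""
    total = defaultdict(int)
    low = defaultdict(int)
    for url, score in stories:
        domain = domain_of(url)
        if not domain:
            continue
        total[domain] += 1
        if score <= LOW_VOTE_MAX:
            low[domain] += 1
    return sorted(
        d for d, n in total.items()
        if n > min_stories
        and low[d] / n > LOW_VOTE_SHARE
        and d not in USER_GENERATED
    )


# Tiny made-up sample, with the story threshold lowered so it triggers.
sample = [
    ("https://www.example.com/a", 1),
    ("https://example.com/b", 2),
    ("https://example.com/c", 50),
    ("https://good.org/x", 100),
    ("https://good.org/y", 1),
    ("https://good.org/z", 200),
]
print(noisy_domains(sample, min_stories=2))  # ['example.com']
```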

~~~
douche
Hmm, filtering out wikipedia, blogspot and wordpress would eliminate some of
the more interesting content.

Surprised medium isn't on here. I'm sort of done with them, signal-to-noise is
just not very good.

~~~
nprescott

      > Sites with usergenerated content was manually filtered out.
    

I read that as Wikipedia, blogspot and wordpress are not being included in
"noisy domains", which you can verify by looking at:
[https://github.com/mathiasrw/no-noise-hacker-news/blob/master/src/script.js](https://github.com/mathiasrw/no-noise-hacker-news/blob/master/src/script.js)

I agree with you about Medium, though. I might go through the last several
months and see if there were any legitimately interesting articles I read from
Medium.

The extension is interesting but ultimately I think too set in stone to be
generally interesting. I think it would have broader appeal if each user could
edit the filtered domains by default.
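
A user-editable filter along those lines could look something like the sketch below. It is not how the extension works today: `DomainFilter` and its methods are hypothetical names, and the example URLs are made up.

```python
from urllib.parse import urlparse


class DomainFilter:
    """A user-editable set of blocked domains (hypothetical helper)."""

    def __init__(self, blocked=()):
        self.blocked = {d.lower() for d in blocked}

    def add(self, domain):
        self.blocked.add(domain.lower())

    def remove(self, domain):
        # discard() does nothing if the domain was never blocked.
        self.blocked.discard(domain.lower())

    def is_blocked(self, url):
        host = (urlparse(url).hostname or "").lower()
        # Match the domain itself or any subdomain of it.
        return any(host == d or host.endswith("." + d) for d in self.blocked)

    def visible(self, urls):
        """Return the URLs that are not on a blocked domain, in order."""
        return [u for u in urls if not self.is_blocked(u)]


user_filter = DomainFilter(["forbes.com", "techcrunch.com"])
user_filter.remove("techcrunch.com")  # the user wants TechCrunch back
user_filter.add("medium.com")         # ...and would rather not see Medium
```

Matching on the host suffix with a leading dot keeps `blog.medium.com` blocked without also catching an unrelated host such as `notmedium.com`.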

~~~
zentiggr
The author does say he's aware of this, and PRs are welcome... I like the idea
of being able to flag certain domains as well... although I'm rarely looking
at newest anyway.

------
jasonkostempski
I have a handful of these sites in custom uBlock filters that hide the links
to the sites. The sites either require ad blockers be disabled or require a
login to use so I'd rather not ever see them. I use a similar list to filter
my QuiteRSS feeds. If anyone is interested:
[http://pastebin.com/T1gxeWtj](http://pastebin.com/T1gxeWtj)

------
Uehreka
Hmm, I feel like a lot of these websites are good for the news feed. If
something big happens in the news and HN wants to talk about it, I'd rather
the featured article be from nytimes or wsj than some random person's blog.

I also think there is a total lack of consensus about what constitutes "noise"
on HN or how bad the problem is. I find the political discussions here to be
interesting, but I know a lot of people don't (they just want to talk about
business and tech). I like talking about "hip" JS frameworks, but a lot of
people here aren't web developers or aren't interested in this area of web
development. I'm not a super "businessy" person, so a lot of "A merged with B
and now their preferred equity is underwater" articles/discussions go over my
head. I often find Medium articles thought-provoking, but some of the stuff
posted there is garbage, and many HN readers have only been exposed to that
side.

Perhaps a better way to attack this problem is to look at the topics commonly
posted about on HN and find a way to filter by topic. There are
precise/difficult and greedy/easy ways to go about this, but that might be the
best way to satisfy people who have different opinions about what is "noise".
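
For what it's worth, the greedy/easy end of that spectrum might be nothing more than keyword matching on titles, roughly as sketched here. The topic names and keyword sets are invented for illustration, and a real list would need curation to avoid false positives.

```python
import re

# Hypothetical topic -> keyword map, purely illustrative.
TOPIC_KEYWORDS = {
    "politics": {"election", "senate", "congress"},
    "javascript": {"javascript", "react", "npm"},
    "finance": {"merger", "equity", "ipo"},
}


def topics_of(title):
    """Return the topics whose keywords appear as whole words in the title."""
    words = set(re.findall(r"[a-z0-9]+", title.lower()))
    return {topic for topic, keywords in TOPIC_KEYWORDS.items() if words & keywords}


def hide_topics(titles, muted):
    """Drop titles that match any muted topic, keeping the original order."""
    muted = set(muted)
    return [t for t in titles if not (topics_of(t) & muted)]
```

Each reader would then mute their own set of topics, e.g. `hide_topics(titles, ["finance"])`, rather than sharing one fixed list of domains.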

------
molecule
_> I am talking about the noise from big websites trying to push their content
to HN hoping for a bigger audience._

Does this really happen? If so, it's hard to imagine even 10% of the sites
listed actually doing this. It doesn't seem like there would be any overlap
between "big websites" and sites where traffic from HN would be noticeable.

~~~
tedunangst
I am somewhat skeptical the economist is pimping their content on HN hoping
for a bigger audience.

~~~
jessaustin
We're _thought leaders_, dammit!

------
bryanlarsen
I'm not sure why "noisy" domains rated X would be any worse than other domains
with the same rating X.

(rating does not equal # of points; hacker news takes a lot of other factors
into account, resulting in a position on the front page and amount of time on
the front page)

I'd trust the continuously tweaked hacker news algorithms. We do know that
certain domains are penalized.

If you want to avoid the cruft, just use some sort of service that only shows
you the best posts on hacker news. hckrnews.com is one that I like; use the
filtering in the top right.

~~~
zokier
I think this is for browsing /newest where noisiness is a more significant
factor

------
koolba
Typo and/or non-native speaker?

> See src/script.js for more details on how the domains where found.

Probably s/where found/were selected/.

~~~
mathiasrw
Thanks - fixed...

