
Hacker News for Hackers - PaulHoule
http://ontology2.com/essays/HackerNewsForHackers/
======
grzm
Impressive. The aspect I find of particular interest is dog piling. When I
first heard about Prismatic[0] (RIP), I thought that would be a great way of
discovering articles on topics I was interested in from sources I didn't
already know. I ended up being overwhelmed by articles dog piling without a
useful way to filter them for either (a) the original source of the news or
(b) articles written on the topic from sources who added valuable context or
insight into the topic. I'm glad to see someone tackling this particular
issue.

One aspect the author touched on is "I'd like to be surprised by relevant
things that I don't know about." It's not immediately clear how to
serendipitously discover items outside the things you're already interested
in, which would therefore already be in your feed. This is an area I'd like to
know more about.

I'm also really keen on someone taking the time to automate curation of their
own feed. One can still fall prey to the more negative aspects of Daily Me[1],
but at least you're better aware of what's going into producing the feed
you're reading and have the tools to update it if you find it's not serving
your best interests.

[0]:
[https://en.wikipedia.org/wiki/Prismatic_(app)](https://en.wikipedia.org/wiki/Prismatic_\(app\))

[1]:
[https://en.wikipedia.org/wiki/Daily_Me](https://en.wikipedia.org/wiki/Daily_Me)

~~~
CPLX
Perhaps off-topic but I was quite fond of Prismatic. Is there a good
replacement out there you've found?

~~~
grzm
No, unfortunately. I haven't really searched much. My patience for trying out
new tools isn't all that high. The only two I've really used have been
NetNewsWire (which I stopped using I think back around the time NewsGator
acquired it) and Prismatic, which I never really found useful. The closest
thing I use now is HN. I think likely the next tool I'll use in this space is
roll-my-own as the author has or some app that provides the building blocks or
plugin architecture for customization along those lines.

------
napa15
Are you not setting this up as a website? I was really hoping that this would
turn out to be an announcement of an aggregator of an aggregator. Personally I
don't understand either why Apple news and 'Why is X down' links don't get
banned, these are the most obvious offenders. I was just about to make a list
of posts on the top page that I personally wouldn't allow on a serious
Hacker News site and I stopped because it would include almost every single
link and at that point I feel like I'm just grandstanding. The number one rule
that I would have for news that I don't want to read is that if this is some
political decision, I don't want to hear it. Politics change all the time, it's
messy, it's opinionated and frankly populated by stupid people with stupid
commentators. Absolutely no value gets generated by reading some post about
how solar power is now 20% more affordable in Florida or how France just
banned Monsanto crop nr. 4 or how country X wants independence. These daily
occurrences will happen over the course of thousands of years, I don't care. You
would have to keep reading and following that particular industry/country in a
research-like capacity to be up to date at all. Other rules are of course any
kind of 'do this to be a better programmer' post. Look, I've heard it all, I don't
need to hear it for the next 30 years too, this is all too similar to politics
and they are always blatantly opinion pieces instead of objective looks at the
landscape.

~~~
petercooper
You might like
[http://www.hackernewsletter.com/](http://www.hackernewsletter.com/) \- a guy
whittles down the stories but on a weekly basis.

~~~
dmd
[http://n-gate.com/](http://n-gate.com/) is better

~~~
petercooper
Well.. I'd agree with _funnier_ :-D

------
jpl56
Pop-ins: here is what I do

1) I found here, on Hacker News, a nice "kill sticky" javascript function to
put in my shortcuts

2) In Firefox, I gave it a shortcut keyword

3) In my AutoHotKey script, I assigned a key to "type the shortcut keyword in
the address bar" to run it quickly

4) when a pop-in occurs, I reflexively press my kill key

The script is

    
    
      javascript:(function()%7B(function%20()%20%7Bvar%20i%2C%20elements%20%3D%20document.querySelectorAll('body%20*')%3Bfor%20(i%20%3D%200%3B%20i%20%3C%20elements.length%3B%20i%2B%2B)%20%7Bif%20(getComputedStyle(elements%5Bi%5D).position%20%3D%3D%3D%20'fixed')%20%7Belements%5Bi%5D.parentNode.removeChild(elements%5Bi%5D)%3B%7D%7D%7D)()%7D)()
    

Enjoy!

~~~
Retr0spectrum
For those curious, this is what that script is doing:

    
    
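    // Collect every element under <body>. querySelectorAll returns a static
    // list, so removing nodes while looping over it is safe.
    var i, elements = document.querySelectorAll('body *');
    for (i = 0; i < elements.length; i++) {
        // Remove anything pinned to the viewport (sticky headers, pop-ins).
        if (getComputedStyle(elements[i]).position === 'fixed') {
            elements[i].parentNode.removeChild(elements[i]);
        }
    }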

------
megaman22
> Videos, Podcasts, etc.: If I am rating 100+ articles a day I just can't
> spend the time it takes to look at time-based media.

It makes me sad when I see an interesting looking post, and it turns out that
it is only a video or podcast, with no proper writeup. I just don't have time
to watch an hour-long video, for content that I could read at my own pace in
ten/fifteen minutes.

~~~
minimaxir
Usually those submissions are affixed with a _[video]_ or _[audio]_ tag.

------
minimaxir
This is certainly a _different_ take on Hacker News data.

One of the things I honestly dislike about comments on Hacker News is the
advocacy of the No True Scotsman-esque definition of "Hacker," where the
_only_ thing that matters is code and how it's used. In 2017, there's more to
being a "Hacker" than just what low-level language is being used.

~~~
mikegerwitz
> In 2017, there's more to being a "Hacker" than just what low-level language
> is being used.

Always has been:

[https://stallman.org/articles/on-hacking.html](https://stallman.org/articles/on-hacking.html)

------
Nomentatus
I'm a bit more interested in business (although a programmer first); so my
article choices would have been somewhat broader. One of the things I value
about Hacker News is that it helps introduce people who might be coders right
now to business, market and management info; which will prepare them for the
future, but also help ensure their coding decisions fit the business that
hired them. I know that's not the primary mission of Hacker News, but I do
think it's a valuable contribution.

There's an historical pendulum with a cycle of many decades re monopoly/market
power regulation which seems to be reversing itself right now. IMHO, this will
likely be the most important factor shaping the tech industry over the
next decade, especially re small startups. So I don't turn up my nose at "Big
Tech Companies Behaving Badly" articles, although I know there's a fair bit of
repetition. I do want to know where that pendulum is, and whether it's really
swinging back. Yes, it's law that will be the instrument of that change, but
it's public perception that will necessitate changes in the law (say re patent
misuse) and its implementation.

------
garysieling
That's an interesting project. I've been exploring this from a different
angle, in the form of generated email newsletters
([https://www.findlectures.com/form?type=alert](https://www.findlectures.com/form?type=alert)).
This uses the contents of the articles as well - I'm crawling links on
programming subreddits.

If you start from keywords you can use NLP (Word2vec) to measure how close
articles are to your interests, so for instance "python, machine learning"
gives you stuff on tensorflow, scikit learn, etc whereas "java, machine
learning" gives you articles on spark, Deeplearning4j.

You can also measure how similar articles are to each other - getting the most
dissimilar articles avoids the "piling on" problem the article mentions.
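
As a rough illustration of the two ideas above (closeness to interest keywords, and picking dissimilar articles), here is a minimal sketch. It is not the findlectures.com implementation: the three-dimensional vectors are made-up stand-ins for real Word2vec embeddings, and `toyVectors`, `embed`, `rankByInterest` and `pickDiverse` are hypothetical names chosen for this example.

```javascript
// Toy 3-dimensional vectors standing in for real Word2vec embeddings.
// The numbers are invented for illustration only.
const toyVectors = new Map([
  ["python", [0.9, 0.1, 0.0]],
  ["tensorflow", [0.8, 0.3, 0.1]],
  ["java", [0.1, 0.9, 0.0]],
  ["spark", [0.2, 0.8, 0.2]],
  ["politics", [0.0, 0.1, 0.9]],
]);

// Cosine similarity between two vectors of equal length.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Average the vectors of the known words; null if no word is known.
function embed(words) {
  const known = words.filter((w) => toyVectors.has(w));
  if (known.length === 0) return null;
  const sum = new Array(toyVectors.get(known[0]).length).fill(0);
  for (const w of known) {
    toyVectors.get(w).forEach((x, i) => { sum[i] += x; });
  }
  return sum.map((x) => x / known.length);
}

// Sort articles by similarity to the interest keywords, best first.
// Articles with no known words get a score of 0.
function rankByInterest(interestWords, articles) {
  const target = embed(interestWords);
  return articles
    .map((a) => {
      const v = embed(a.words);
      return { title: a.title, score: target && v ? cosine(target, v) : 0 };
    })
    .sort((x, y) => y.score - x.score);
}

// Greedy diversity: start with the first vector, then repeatedly add the
// one whose closest already-chosen neighbour is least similar to it.
function pickDiverse(vectors, k) {
  if (vectors.length === 0) return [];
  const chosen = [0];
  while (chosen.length < Math.min(k, vectors.length)) {
    let best = -1, bestScore = Infinity;
    for (let i = 0; i < vectors.length; i++) {
      if (chosen.includes(i)) continue;
      const maxSim = Math.max(...chosen.map((j) => cosine(vectors[i], vectors[j])));
      if (maxSim < bestScore) { bestScore = maxSim; best = i; }
    }
    chosen.push(best);
  }
  return chosen; // indices into `vectors`
}
```

With real embeddings the vectors would have hundreds of dimensions, but the ranking and the greedy selection would look the same.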

~~~
pmoriarty
Many years ago, back when Bayesian spam filtering was the hot new thing, I
wrote up a little program that would classify articles in RSS feeds based on
whether I found similar articles to be interesting or not interesting in the
past.
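
For readers who have not seen this kind of classifier, a minimal sketch of the idea follows. It is not the program described here: it assumes a plain naive Bayes model over title words with add-one smoothing, and the `NaiveBayes` class, the `tokenize` helper and the two labels are hypothetical names for this example.

```javascript
// Split text into lowercase alphanumeric tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

class NaiveBayes {
  constructor() {
    this.docCounts = { interesting: 0, boring: 0 };
    this.wordCounts = { interesting: new Map(), boring: new Map() };
    this.totalWords = { interesting: 0, boring: 0 };
    this.vocab = new Set();
  }

  // Record one labeled example; label must be "interesting" or "boring".
  train(text, label) {
    this.docCounts[label] += 1;
    const counts = this.wordCounts[label];
    for (const word of tokenize(text)) {
      counts.set(word, (counts.get(word) || 0) + 1);
      this.totalWords[label] += 1;
      this.vocab.add(word);
    }
  }

  // Log prior plus summed log likelihoods, with add-one (Laplace) smoothing.
  // Assumes at least one example has been trained.
  score(text, label) {
    const totalDocs = this.docCounts.interesting + this.docCounts.boring;
    let logProb = Math.log(this.docCounts[label] / totalDocs);
    const denominator = this.totalWords[label] + this.vocab.size;
    for (const word of tokenize(text)) {
      const count = this.wordCounts[label].get(word) || 0;
      logProb += Math.log((count + 1) / denominator);
    }
    return logProb;
  }

  // Pick whichever label has the higher score.
  classify(text) {
    return this.score(text, "interesting") >= this.score(text, "boring")
      ? "interesting"
      : "boring";
  }
}

// Usage: train on a few rated titles, then classify a new one.
const classifier = new NaiveBayes();
classifier.train("Rust compiler internals", "interesting");
classifier.train("Lisp macros explained", "interesting");
classifier.train("Apple announces new iPhone", "boring");
classifier.train("Why is GitHub down", "boring");
console.log(classifier.classify("New Rust macros"));
```

Every misclassification has to be fed back through `train`, which is the retraining chore discussed in the rest of this comment.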

The ultimate goal was to save me time and effort in manually classifying them,
as I and everyone else do when we scan through what we come across on a daily
basis. Instead of manually doing that, the hope was the program could do it
for me, and I could just focus on reading the interesting articles and not
even have to deal with the uninteresting ones.

From that experiment I learned a few things:

\- First, that I'd have to manually scan through all the articles anyway, just
in case the classifier made a mistake and maybe dumped an article I found
intensely interesting in the uninteresting pile.

\- Second, that having to consciously think about which article was
interesting or uninteresting in order to do the training, about whether the
classifier was working or not and which articles needed to be reclassified,
about having to re-train it when it messed up, and so on was a hell of a lot
more work than just scanning through my RSS feed manually and deciding on
which articles I found interesting or not myself.

\- Third, my interests were not static things that the algorithm could learn
and classify on correctly from then on out. My interests were constantly
changing. Sure, maybe there were a handful of things I always found
interesting or uninteresting -- but overall what I found interesting or not
changed from day to day. It was also kind of unpredictable, even to myself.

The third point kind of argues towards the approach of the HN front page,
which is un-classified and un-tagged. I've read all sorts of great articles on
HN that if I'd been going by some pre-written list of interests that I had, I
would never have read. I do still often wish for tagging on HN anyway,
just because there are certain types of articles that I really never ever want
to read, and I'd love to be able to exclude them. But the vast majority of HN
articles aren't of that kind (or I wouldn't be here).

That experiment with Bayesian classification turned out to be rather
short-lived, as I found the whole thing way too much of a bother to maintain
and to retrain when it misclassified articles. I'm still reading RSS feeds the
old-fashioned way today, and am a little suspicious about any
AI/machine-learning-like approaches to article classification.

~~~
crznp
> First, that I'd have to manually scan through all the articles anyway, just
> in case the classifier made a mistake and maybe dumped an article I found
> intensely interesting in the uninteresting pile.

But I'm sure there are interesting articles that didn't get classified at all.
It doesn't seem like the end of the world if you lose a little wheat with the
chaff.

> overall what I found interesting or not changed from day to day

This seems like the problem. If you rated an article highly yesterday, it
doesn't mean you want to read the same article again today. "Interesting"
largely means "novel" and it is hard to find that by looking at similarities
to what was new in the past.

~~~
pmoriarty
_"It doesn't seem like the end of the world if you lose a little wheat with
the chaff."_

But is it losing a little or a lot? No way to tell without looking.

Even if it misses just a little, that little bit might have been crucial. If
it misclassifies "NYC NUKED!!!" as uninteresting that's a single mistake that
could make me oblivious to a hugely consequential event.

Of course, as a human classifier, I'll doubtless be making my own mistakes, and
maybe the AI classifier could help me out by pre-classifying articles for me,
but it could also be misleading and will cost me in terms of spending time on
training and maintenance. I'm not really sure what the right solution is here.

~~~
PaulHoule
If NYC got nuked, I'd probably hear about it some other way.

------
rrggrr
Can't find link to source code. Is there source available on this?

------
ivm
Just a keyword filter works great for reducing the noise. I wrote one for
hckrnews.com:

[https://gist.github.com/ivmirx/66a0015884d44297ea05a8c54d935...](https://gist.github.com/ivmirx/66a0015884d44297ea05a8c54d93566d)
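
For anyone who wants to try the idea without reading the gist, a keyword filter over titles can be sketched roughly like this. This is an illustration, not the linked script: the keyword list and the `buildFilter` and `escapeRegExp` helpers are hypothetical.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a predicate that returns true for titles worth keeping.
// Keywords match whole words only, case-insensitively.
function buildFilter(keywords) {
  if (keywords.length === 0) return () => true; // nothing blocked
  const pattern = new RegExp(
    "\\b(" + keywords.map(escapeRegExp).join("|") + ")\\b",
    "i"
  );
  return (title) => !pattern.test(title);
}

// Hypothetical block list and sample titles.
const keep = buildFilter(["apple", "bitcoin"]);
const titles = [
  "Apple announces new iPhone",
  "Pineapple farming with Rust",
  "Show HN: A Bitcoin tracker",
];
const filtered = titles.filter(keep);
console.log(filtered);
```

Matching on word boundaries keeps "Pineapple" from tripping the "apple" rule. In a browser userscript the same predicate could decide which story rows to hide.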

------
laurent123456
> Off-Topic: ... If they'd cover it on TV news, it's probably off-topic.

That would be nice but unfortunately in practice all the big news makes its
way to the top of HN, despite the fact that we've already read and heard about
them from countless other sources. For example some of the top posts at the
moment are:

\- Catalan parliament declares independence from Spain

\- New Zealand to ban foreigners from buying existing houses

\- How to Read the JFK Assassination Files

~~~
danso
“All the big news”? That’s quite a stretch. If it were true, it would mean
most of the other 27 stories on the HN front page could conceivably be found
on a newspaper front page, which is clearly untrue. I don’t think it’s
unreasonable that some news is so big — such as the potential creation of a
new European country — that HN users want to discuss it.

~~~
TeMPOraL
Indeed. For years now I've been using HN as a proxy news filter, based on the
assumption that if something _actually important_ happens - like a war or a
deadly earthquake - it'll get on HN, and all the rest of the news can be
safely discarded. It generally works pretty well for me.

------
detaro
Interesting project, and nice to see some data on it. I've been mulling
something like this as well.

One issue I have with pure keyword filters when I apply them to e.g. RSS feeds
is that they can't capture details well - e.g. I'd like to see a detailed
article about how V8 works and wouldn't want to exclude an otherwise
interesting thing just because it is in JS, so you end up spending quite a lot
of time fine-tuning the filters.

~~~
PaulHoule
OP here: the keywords-in-title approach definitely has limitations as a
feature set. Once I get more training examples (enough to characterize what
"interesting" means) I will definitely look at the full text, HTML metadata, etc.

As you say, there is interesting deep stuff going on in the JS space, but
there is too much average stuff for me to look at right now.

------
11thEarlOfMar
I don't have time to read the whole thing, so maybe it's already there, but,
....

I'd like to be able to credit comments that got me to re-think or even change
my position on a topic. Upvotes basically say, 'that's a good point', or 'I
agree with you'. But I'd like to see a way to gauge 'influence', in terms of
affecting what people think in a positive way.

Can that be worked in?

~~~
danso
What about stating this in a reply?

~~~
mikestew
I've often prefaced such replies with "Because sometimes an upvote isn't
enough...", and those get downvoted about 40% of the time. Not that I care, I
bathe daily in karma points, it's just a tiny bit disheartening.

I would also argue that an upvote is not "I agree with you", but "this
contributes to the conversation in a positive manner". I frequently upvote
posts with which I disagree. Conversely, a downvote in my book says, "hey,
you're kind of being a dick and dragging the conversation down" rather than "I
disagree". I mean, I'll downvote something that is just demonstrably wrong,
but more often than not it's "quit being a dick".

------
jasonkostempski
I've been using newsbeuter with filters for the undesirable domains and I
probably don't spend more than 1 minute a day skipping over stuff I don't
want. The biggest win is using RSS just so I don't have to rescan things I've
already skipped.
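
The two pieces of that workflow (dropping blocked domains, and never showing an item twice) can be sketched in a few lines. This is not newsbeuter's configuration syntax, just an illustration of the logic; the domains, the item shape `{ title, link }` and the `makeReader` helper are assumptions for this example.

```javascript
// Extract the hostname from a link, ignoring a leading "www.".
// Returns null for links that are not valid absolute URLs.
function hostnameOf(link) {
  try {
    return new URL(link).hostname.replace(/^www\./, "");
  } catch {
    return null;
  }
}

// Create a reader that remembers which links it has already returned.
function makeReader(blockedDomains) {
  const seen = new Set();
  return function unread(items) {
    const fresh = [];
    for (const item of items) {
      const host = hostnameOf(item.link);
      if (host === null || blockedDomains.has(host) || seen.has(item.link)) {
        continue; // invalid, blocked, or already shown
      }
      seen.add(item.link);
      fresh.push(item);
    }
    return fresh;
  };
}

// Usage with hypothetical domains: the second fetch only yields the new item.
const unread = makeReader(new Set(["spam.example"]));
const firstFetch = unread([
  { title: "Good post", link: "https://blog.example/good" },
  { title: "Clickbait", link: "https://www.spam.example/bait" },
]);
const secondFetch = unread([
  { title: "Good post", link: "https://blog.example/good" },
  { title: "Another post", link: "https://blog.example/another" },
]);
console.log(firstFetch.length, secondFetch.length);
```

A real feed reader would persist the `seen` set between runs; here it only lives in memory.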

