Predicting Hacker News article success with neural networks and TensorFlow (intoli.com)
87 points by foob on May 23, 2017 | 30 comments

"Rust Rust Rust Rust Rust" seems to be the optimal number of Rusts with 97.6% success probability.

EDIT: This beats it with 99.4%:

  "  Rust Rust Rust Rust Rust                 "

You can do even better - "YC PG Rust Rust Rust" gets 99.7%!

"YC YC YC YC YC" gets you 99.9%!!

edit: "YC YC YC YC YC YC" -> 100%

"YC YC YC Rust Rust" is also 100% :). Gotta have some Rust.


YC YC YC YC YC YC golang is better than Rust

100% :)

Live longer by coding at YC and Rust

Interesting article.

Ignoring the outlier, stories have at best a 1 in 3 chance of succeeding on HN. This means 2/3 of the interesting stories passing through HN are lost, and I could potentially triple my high quality reading material. It is amazing how much good content there is out there that I will never find.

Don't fall victim to FOMO (fear of missing out). If something is important enough you'll hear about it anyway; if not, then it's whimsy.

Think of it like going to a restaurant: sometimes you choose a meal, then notice other people eating things that look more interesting, which you are now too full to try. It's possible you'll never be able to eat at that restaurant again (e.g. if you're on vacation, or it's the restaurant's last night of operation; I've had quite a lot of favorite restaurants shut down, taking many fond gastronomic and social memories with them).

But really, this is a misunderstanding of opportunity cost. Those other meals may have been delicious, but so was yours; you would still only have been able to comfortably eat one meal, rather than the whole menu; and if you had had what your neighbor was eating, then you might have regretted not having your own meal.

Game theorists and economists explore regret minimization frameworks for decision-making, and that's valuable, but you have to weigh the extra work involved in applying the framework against the real opportunity cost. A surfeit of choice (whether via advertising or in reality) can lead to overestimation of opportunity cost by tricking you into imagining you could enjoy all the alternatives, whereas in reality your selection was going to be limited anyway.

You may find it interesting to think about the psychology of collecting, and how it differs from usage. Collections can themselves have considerable value (scientific, cultural, etc.), but some collections are the result of acquisition gone wrong, tipping into hoarding without any enjoyment of the collected objects.

> you are now too full to try

That's your mistake right there.

This model doesn't match for resubmission of stories, so I am assuming that the reality of the stories you are potentially deprived of may be quite different.

I see what you did there with your own title: https://imgur.com/a/SthQV

Pretty interesting: the highest score I've been able to find right now, apart from the extreme examples, is 49.6% success, 5.3% flag ... with the title "Predicting Hacker News article success with neural networks and TensorFlow".

Did the author choose that title on purpose? :)

It's fun to try a title, and then add "Ask HN:" or "Show HN:" in front of it and see the probability change dramatically, or remove the (YC ...) from the extreme examples and see the prediction change.

Try 'YC has died'.

Predicting the success of comments is way easier. Just lean left-wing for positive points and right-wing for negative points. I have been testing this myself for a while.

Libertarian economic policy is typically right wing and it will very often get upvoted.

Libertarian social policy is left wing and will typically get upvoted.

I'd say if you want a better than random chance to have positive upvotes lean Libertarian not left. Except when the Libertarian view is left in which case you are doing both.

Also, please don't bring politics into a non-political story.

I agree that HN is more likely filled with Libertarians than liberals. Also, as many are INTJ, saying just the garbage INTJs like hearing will get you more love.

I did something similar ("50 terms most predictive of a submission making it to the front page") some time ago: https://news.ycombinator.com/item?id=10893677 .

One realization was that it was easier to predict if a title/keyword would NOT make it to the front page than if it would. That is, it's clearer what to avoid (startup, app, business, product, mobile, marketing, etc) than what to do.

Why do people assume posts on the front page are ranked fully automatically, plus some secret sauce for points and what have you?

Why wouldn't YC just use a human (or two) to bump/nudge posts they would like to climb or expose according to their agenda/internal policy?

If you monitor the hot page, there is a clear political bias, a topical bias, and temporal peaks in movement/ranking indicating that this is the case, too complex for any current ML to predict. Just my 2 cents... keep it simple.

HN is a hive mind that has some complex transfer function. This project is experimentally (using past data) determining that transfer function.

I upvoted this story because the title scored well in the model.

Show/Ask HN seem to do pretty well. I suppose I'd have expected that, given the community feel of the site. I'd say HN is a pretty good place to test a prototype or ask for guidance, and overall people do seem to constructively and thoughtfully try to help each other out.

I absolutely love that the author includes both a live demo and an explanation of how they did it! Bravo.

Although, something seems odd when whitespace affects the score. It may have been a good idea to normalize the whitespace.
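
For what it's worth, here is a minimal sketch of that kind of normalization in Python, assuming titles arrive as plain strings. `normalize_title` is a hypothetical helper name, not something taken from the article's code, and I don't know how the author's preprocessing actually works.

```python
def normalize_title(title: str) -> str:
    """Trim leading/trailing whitespace and collapse inner runs to one space.

    Hypothetical helper: str.split() with no arguments splits on any run of
    whitespace and drops empty pieces, so joining with a single space gives
    a canonical form of the title.
    """
    return " ".join(title.split())


if __name__ == "__main__":
    padded = "  Rust Rust Rust Rust Rust                 "
    # The padded and unpadded titles map to the same string, so a model fed
    # the normalized form could not score them differently.
    print(repr(normalize_title(padded)))
```

Running something like this on titles both at training time and at prediction time would presumably make the padded and unpadded "Rust" titles upthread score identically.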

This was a very interesting read though most of the article simply went over my head (because I'm just a lowly web dev). What would be the shortest path to learning the things described in this article?

"Predicting Hacker News article success with neural networks and TensorFlow" has success probability of 49.6%. I am wondering if OP tried several options to come up with this title.

This is a neat tool! I had too much fun identifying minimal titles, particularly my coming post "(YC S0000009)" with 99.3% probability of success and a 0% flag probability.

I love the way putting in a title gives a score, and then putting a space on the end sometimes changes it. Sometimes for the better, sometimes for the worse.

So: Consider the trailing space ...

Don't forget to send some of your karma back to the author in between spending it on champagne and yacht rental.

"Universal Basic Income, Neural Networks, Rust" does pretty well at 50.6%/3.2%.

"A* explained" should probably have a low probability of being flagged.

"HN considered harmful"

