In case you're wondering why this obviously brilliant article doesn't get much discussion, or many votes, it's because some people here have seen it before. Here are some of the previous submissions:
God DAMN you put more work into meta-summary than most of us put into actual comments ;)
I know there's been talk about it, but has anyone hacked together a quickie-scraper tool or mini-app that would, given a HN submission, find similar submissions, and generate a plaintext list like the parent comment, but maybe with some metadata (such as date of submission, number of comments, number of upvotes)? I'll put that on my side projects list but I bet someone more OCD than me has already done something like this.
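For what it's worth, the formatting half of such a tool is small. Here is a minimal sketch in Python, assuming the submission data has already been fetched somehow; the `Submission` record, its field names, and `format_submission` are all made up for illustration, and nothing here talks to HN.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Submission:
    """Hypothetical record for one earlier submission (field names are assumptions)."""
    item_id: int
    submitted: date
    points: int
    comments: int


def format_submission(sub: Submission, today: date) -> str:
    """Render one submission as a short plaintext entry with its metadata."""
    age_days = (today - sub.submitted).days
    return (
        f"{sub.submitted.isoformat()} (~{age_days} days ago)\n"
        f"{sub.points} points and {sub.comments} comments\n"
        f"https://news.ycombinator.com/item?id={sub.item_id}"
    )


# item_id=0 is a placeholder, not a real submission id.
example = Submission(item_id=0, submitted=date(2012, 1, 24), points=389, comments=45)
print(format_submission(example, today=date(2014, 5, 25)))
```

The hard part (finding the similar submissions in the first place) is deliberately left out.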
I have a bunch of half-finished, partially working, user-hostile tools. I can't bundle or release them; it would take too long to write up what they do and don't do. I could do something like you say, but again, as you say, I'd bet someone else has already mostly done it.
I wonder if I could cobble together from what I have something that I could make public. Next time I cross-reference stuff I'll leave a trail of bread-crumbs to see if I can regularize/semi-automate it.
If memory serves, someone built this with a semi-automated account, got some (to him) surprisingly heated criticism from the community, and stopped.
The search box at the bottom of the page does a pretty good job, given a title :)
Depends on what you mean by "similar submissions", though. Similar titles (similar in topic? Sentiment? Levenshtein distance?), identical links, similar linked content (similar how?), etc.
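To make the Levenshtein option concrete, here is a rough sketch of title similarity based on edit distance. The `title_similarity` normalization (casefolding, dividing by the longer length) is just one assumption about how to turn a distance into a score; it says nothing about topic or sentiment.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character inserts, deletes, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def title_similarity(a: str, b: str) -> float:
    """Score in [0, 1]; 1.0 means identical after casefolding (an assumed normalization)."""
    a, b = a.casefold().strip(), b.casefold().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

This catches retitled or slightly edited submissions, but two titles about the same article with different wording will still score low.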
Frequently you need to pick a single word, or perhaps two, from the title in order to find related articles. Using something like TF-IDF on submission titles (perhaps after mods have "normalized" them) then works well on the search. Then you want to make sure the articles really are the same, so you need to download the article, extract the relevant text, and do a similar sort of comparison.
Or just eyeball it. Or allow for mistakes.
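A minimal sketch of the TF-IDF idea on titles, in plain Python with no external libraries. The smoothed IDF formula, the tokenizer, and the sample titles are all assumptions picked for illustration; the download-and-compare step for the article text itself is not shown.

```python
import math
import re
from collections import Counter


def tokenize(title: str) -> list[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", title.lower())


def idf_table(titles: list[str]) -> dict[str, float]:
    """Smoothed inverse document frequency for every term seen in the titles."""
    n = len(titles)
    doc_freq = Counter()
    for title in titles:
        doc_freq.update(set(tokenize(title)))
    return {term: math.log((1 + n) / (1 + df)) + 1.0 for term, df in doc_freq.items()}


def tfidf_vector(title: str, idf: dict[str, float]) -> dict[str, float]:
    """Term frequency times IDF; terms unknown to the table are dropped."""
    counts = Counter(tokenize(title))
    return {term: tf * idf[term] for term, tf in counts.items() if term in idf}


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_titles(query: str, titles: list[str]) -> list[tuple[float, str]]:
    """Return (score, title) pairs, best match first."""
    idf = idf_table(titles)
    q = tfidf_vector(query, idf)
    scored = [(cosine(q, tfidf_vector(t, idf)), t) for t in titles]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up titles, purely for illustration.
titles = [
    "A history of programming languages",
    "Show HN: my weekend project",
    "Ask HN: best programming books",
    "History of the loom",
]
for score, title in rank_titles("brief history of programming languages", titles):
    print(f"{score:.2f}  {title}")
```

Rare words dominate the score, which is roughly why picking one or two distinctive words from a title works when searching by hand.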
Then there's the question of deciding whether it's worth having a cross-reference at all. Heuristics about number of comments, age of submission, etc, might be relevant.
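As a sketch of what such a heuristic could look like: the function below and both of its thresholds (`min_comments`, `min_age_days`) are invented for illustration and would need tuning against real submissions.

```python
from datetime import date


def worth_cross_referencing(
    comments: int,
    submitted: date,
    today: date,
    min_comments: int = 10,   # assumed threshold, not taken from the thread
    min_age_days: int = 30,   # assumed threshold, not taken from the thread
) -> bool:
    """Guess whether an earlier submission deserves a cross-reference.

    Only links to discussions that had some substance and are old enough
    that readers may have missed or forgotten them.
    """
    age_days = (today - submitted).days
    return comments >= min_comments and age_days >= min_age_days
```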
2012-01-24 (~852 days ago)
389 points and 45 comments
Top comments:
- bitops [27 points]: Only two minor quibbles: 1) did not mention Clojure. 2) broke the amusing narrative a bit in the middle by including a true story (Perl). Really funny otherwise. [https://news.ycombinator.com/item?id=3504001]
- f4stjack [26 points]: "1996 - James Gosling invents Java. Java is a relatively verbose, garbage collected, class based, statically typed, single dispatch, object oriented language with single implementation inheritance and multiple interface inheritance. Sun loudly heralds Java's novelty. 2001 - Anders Hejlsberg invents C#. C# is a relatively verbose, garbage collected, class based, statically typed, single dispatch, object oriented language with single implementation inheritance and multiple interface inheritance. ... [https://news.ycombinator.com/item?id=3503969]
- perfunctory [24 points]: My favourite is actually: 1972 - Dennis Ritchie invents a powerful gun that shoots both forward and backward simultaneously. Not satisfied with the number of deaths and permanent maimings from that invention he invents C and Unix. [https://news.ycombinator.com/item?id=3504067]
2009-05-08 (~1844 days ago)
299 points and 14 comments
Top comments:
- tumult [52 points]: Lambdas are relegated to relative obscurity until Java makes them popular by not having them. Alright, this was surprisingly amusing. Thanks. [https://news.ycombinator.com/item?id=599281]
- jfarmer [33 points]: My favorite: 1972 - Dennis Ritchie invents a powerful gun that shoots both forward and backward simultaneously. Not satisfied with the number of deaths and permanent maimings from that invention he invents C and Unix. [https://news.ycombinator.com/item?id=599308]
- visitor4rmindia [17 points]: Oh my God! That was the best belly laugh I've had in a long time. Very fun read. > 1995 - Brendan Eich reads up on every mistake ever made in designing a programming language, invents a few more, and creates LiveScript. Later, in an effort to cash in on the popularity of Java the language is renamed JavaScript. Later still, in an effort to cash in on the popularity of skin diseases the language is renamed ECMAScript. [https://news.ycombinator.com/item?id=599345]
2013-05-12 (~378 days ago)
209 points and 33 comments
Top comments:
- ldubinets [26 points]: It took me a couple minutes too... The first comment sorts it out though. Turns out that Jacquard's loom was multi-threaded after all. [https://news.ycombinator.com/item?id=5697144]
- mercuryrising [17 points]: I like this. I like this a lot. I think far too often people take the concepts we study too seriously. Things get challenging, things get precise, but the moment the humor leaves, the creativity is gone. Think of how easy it would be to learn something if you make a joke every 5 minutes while learning it (about the subject). In your mind, you turned this abstract concept into something else, something funny, something with pathways and connections that weren't expected. You manipulated it, chan... [https://news.ycombinator.com/item?id=5696773]
- Symmetry [17 points]: Reminds me of C being described as "a language that combines all the elegance and power of assembly language with all the readability and maintainability of assembly language". [https://news.ycombinator.com/item?id=5696368]
https://hn.algolia.com/?q=brief+incomplete#!/story/forever/p...
Of course, it may again get lots of discussion and lots of up-votes. We'll see.
https://news.ycombinator.com/item?id=7263243
https://news.ycombinator.com/item?id=7149634
https://news.ycombinator.com/item?id=6953863
https://news.ycombinator.com/item?id=6504217
https://news.ycombinator.com/item?id=6234361
https://news.ycombinator.com/item?id=5804668
https://news.ycombinator.com/item?id=5728844
https://news.ycombinator.com/item?id=5728843
https://news.ycombinator.com/item?id=5695816
https://news.ycombinator.com/item?id=5377944
https://news.ycombinator.com/item?id=5129062
https://news.ycombinator.com/item?id=4586462
https://news.ycombinator.com/item?id=3507566
https://news.ycombinator.com/item?id=3503896
https://news.ycombinator.com/item?id=1475826
https://news.ycombinator.com/item?id=1327746
https://news.ycombinator.com/item?id=1310127
https://news.ycombinator.com/item?id=599164