
Why HN Should Use Randomized Algorithms - signa11
http://danluu.com/randomize-hn/
======
Udo
I think it's a great idea that solves an actual problem and it would probably
help smooth out voting rings that slip past the detector.

However, that would probably mean HN couldn't optimize or cache the front
page. On the other hand, this is a feature for logged-in users only, so it
might be feasible.

~~~
1200bps
As far as I can see, HN already has randomized algorithms. I just had to
create a new account because my six-month-old 3,000+ karma account just got
slowbanned and silentbanned for apparently no reason.

Out of nowhere, anywhere from 6 to 14 seconds to load the front page (or any
other HN page) when logged into that account.

~~~
dserban
Interesting.

The HN community appreciates diversity and unbiased, well-reasoned debate.

Maybe you were too strong of an advocate of a certain technology stack /
platform to the detriment of all others?

What is your old account?

Edit: typos.

~~~
sologrrl
Good guess
[https://www.hnsearch.com/search#request/all&q=300bps+microso...](https://www.hnsearch.com/search#request/all&q=300bps+microsoft)

~~~
300bps
So I had 555 comments, 42 of which contained the word Microsoft. I had 5
submissions, none of which were about Microsoft.

Is your conclusion that I was slowbanned because 0% of my submissions were
about Microsoft and 7.5% of my comments mentioned Microsoft?

If someone who doesn't work at Microsoft and has no affiliation with them
whatsoever other than to use their products is banned for mentioning Microsoft
so infrequently, then this place is a ridiculous echo chamber.

EDIT: My last comment mentioning Microsoft was over 3 weeks ago and honestly
it's more about stock performance of multiple companies than it is about
anything to do with Microsoft.

[https://news.ycombinator.com/item?id=6745233](https://news.ycombinator.com/item?id=6745233)

------
kens
This sounds like a cool idea that solves a real problem, but I'm not convinced
it will work. The main blocker for an article is getting the first two or
three votes fast enough to get from the new page to the front page. If you
look at my article on HN scoring [1], most successful articles shoot up
quickly and then slowly drop, so randomization is mostly going to just put
declining articles back on the front page for a bit, which doesn't really help
the reader or articles.

To be effective, the randomization needs to be focused on newish articles. I
like danmaz's idea of adding a random new article to the front page. Best
would be combining ideas and using weighted randomness to add a new article to
the front page, so a vote or two would boost an article's chance of getting
the random slot.
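A minimal sketch of that weighted random slot, assuming the new page is available as a list of `(title, points)` pairs. The function name, the data shape, and the `1 + points` weighting are all illustrative assumptions, not anything HN actually does:

```python
import random

def pick_random_slot(new_articles, rng=random):
    """Pick one article from the new page, weighted by its votes.

    new_articles: list of (title, points) pairs (hypothetical data shape).
    Each article gets a base weight of 1, so a story with no votes yet can
    still win the slot, while every extra vote raises its chances.
    """
    if not new_articles:
        return None
    weights = [1 + points for _, points in new_articles]
    title, _points = rng.choices(new_articles, weights=weights, k=1)[0]
    return title

# Example with made-up stories: "B" has three votes, "A" has none.
stories = [("A", 0), ("B", 3)]
print(pick_random_slot(stories))
```

With these weights a vote or two roughly doubles or triples a story's chance of landing in the slot, which is the boost described above.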

(Another thing that would help is if the More link didn't time out after a
couple minutes, so readers could go to the second page more reliably.)

[1] [http://www.righto.com/2013/11/how-hacker-news-ranking-really...](http://www.righto.com/2013/11/how-hacker-news-ranking-really-works.html)

------
danmaz74
Giving a randomness to the whole sorting of the articles could have a "noisy"
effect on users who update the home page often.

On the other hand, PG could just add a line at the end of the home page to show
one post randomly selected from the new articles. This way we could all
contribute to the selection of new articles, instead of only relying on those
brave souls who regularly wade through the "new" page.

~~~
pmiller2
>Giving a randomness to the whole sorting of the articles could have a "noisy"
effect on users who update the home page often.

Yes, but not much. Provided the amount of noise is small enough, the order of
articles on the front page may change a bit, but the articles themselves won't
change much. I don't know much about the typical distribution of scores of
articles on the front page at any given moment, but I suspect even the order
of most of the page probably won't be affected.

Think about it this way: pushing an article that's already high up on the home
page up or down a little in ranking won't push it off the front page. It's
only the bottom of the page (and the top of page 2) that would even be
materially affected.
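A rough way to check this claim, as a sketch with made-up numbers: the scores, the page size of 30, and the uniform noise are all assumptions, since the real score distribution isn't known here.

```python
import random

def front_page(scores, size, noise=0.0, rng=random):
    """Return the ids of the top `size` articles after adding uniform noise.

    scores: dict of article id -> score (hypothetical data shape).
    noise:  each score is shifted by a random amount in [-noise, +noise].
    """
    noisy = {a: s + rng.uniform(-noise, noise) for a, s in scores.items()}
    return sorted(noisy, key=noisy.get, reverse=True)[:size]

# Hypothetical scores: article i scores 100 - i, so neighbors differ by 1.
scores = {i: 100.0 - i for i in range(60)}
plain = front_page(scores, size=30)
jittered = front_page(scores, size=30, noise=2.0, rng=random.Random(1))

# Articles that entered or left the front page because of the noise.
changed = sorted(set(plain) ^ set(jittered))
print(changed)
```

With noise this small relative to the score gaps, an article can only swap places with near neighbors, so only articles close to the page-1 cutoff can enter or leave the page.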

~~~
danmaz74
Agreed on that, but you know, just not knowing if you are "really" #3 or not
is bad, if you have your post on the home page. IMHO it would detract some
from the value of the website.

------
tehwalrus
It is unfortunate that you have to read the footnotes to understand the
article - on a first read of the text I was completely lost.

This is, though, a good idea - at the moment, many interesting articles are
lost because not enough people visit the _new_ page.

~~~
ColinWright

      > It is unfortunate that you have to read the footnotes
      > to understand the article - on a first read of the text
      > I was completely lost.
    

I'm not the author, but I'm intensely interested in effective communication.
You've taken the trouble to comment, and I was wondering if you'd be willing
to walk through the item and explain where you got to in the main body that
you thought was unclear, and where you were lost. I assume the first paragraph
was self-evident - where did you find it started to become impossible to
follow?

    
    
      > ... at the moment, many interesting articles are lost
      > because not enough people visit the new page.
    

I've always done this, but it's becoming increasingly pointless. My impression
is that the number of really interesting articles remains roughly constant,
but the total volume increases. It's getting more and more depressing to wade
through the uninteresting (to me) to find the rare gems. I still do it though.

~~~
tehwalrus
The first three are in English, sure, I'm talking about the last two
paragraphs. I understood the problem, almost from the title - I was interested
mostly in the proposed solution.

In particular, the last 2-3 sentences where a complex procedure is defined in
a few inline code snippets - and then the author continues as if the result is
obvious.

It reminds me very much of when I read overly mathematical scientific
publications (which I do a lot, being a PhD student.) In such papers, the
authors often expect you to stop and spend 5 minutes reading a few symbols
(perhaps reading a couple of Wikipedia articles along the way) before
continuing. This makes the articles highly unreadable - I sometimes spend 2-3
days slowly working through the particularly poor ones. (Obviously, I only do
this for highly relevant/useful papers!)

This article isn't at the same level of awful, obviously, but it did give me
the same "I'm lost now, I should go back" feeling - which I thought was worth
feeding back to the author, especially since they _had_ provided the
explanation, but they'd hidden it in a footnote.

~~~
ColinWright
The fact that you're a PhD student might help to explain this. My experience
is that most people just won't care, but will be reassured by seeing real code
in the text. They will skim without understanding, but get the idea that
there is a concrete process being explained.

You (somewhat like me in this instance) will want to understand what's
actually happening. For that case the author has in fact provided more detail,
relegated to a footnote. You decry this, but you (and I) are in a significant
minority in this instance. We want to understand it properly, but I suspect
the vast majority of readers won't. Or if they do, not yet.

Your comment is useful, and I will keep it in mind next time I write
something. For reference, when I want to tuck away technical details I do it
in side-bars so it's there to be read alongside the main text.

Here:

* [http://www.solipsys.co.uk/HowHighTheMoon.pdf](http://www.solipsys.co.uk/HowHighTheMoon.pdf)

* [http://www.solipsys.co.uk/new/TheBirthdayParadox.html](http://www.solipsys.co.uk/new/TheBirthdayParadox.html)

~~~
tehwalrus
Your paper (the first one is as far as I got) is much better, since it just
states a simple equation which is typeset separately ("in display mode", in
LaTeX-speak) with the derivation for the equation in a sidebar and a note
inline in the text pointing it out.

This reads like a normal human speaking, who was asked a question and
elaborated. That's fine.

The problem with the OP's article was that I _couldn't_ skim read it, because
the code didn't make obvious sense in a "reading aloud" kind of way. It would
have worked much better if the footnote had replaced, or been included as
padding _around_ the code. The skim readers would still have been happy,
especially if there was a carefully placed paragraph break so they could "skip
over the details," and I/we would have been happy too.

Also, two code snippets out of context are _anything but_ real code, at least
in my brain :)

------
Houshalter
It's a good idea, though I wonder if there is a better way to do this without
randomness.[1]

I mean it's just an exploitation vs exploration problem. You want to maximize
the number of articles the average user will upvote (essentially avoiding
wasting our time.) You want to use the time and number of votes to predict the
probability that a user will upvote it. But you also want to do "experiments"
to find even better articles even if it means wasting a few people's time.

I'm certain there is an elegant math equation that does this perfectly but I
can't figure it out.

[1] [http://lesswrong.com/lw/vp/worse_than_random/](http://lesswrong.com/lw/vp/worse_than_random/)
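One deterministic candidate from the multi-armed bandit literature is UCB1, which ranks each option by its observed success rate plus a bonus that shrinks as the option gets more exposure. The sketch below applies it to articles, assuming the site could count how many times each article was shown (`views`). That counter, the data shape, and the function names are hypothetical, and this is not a claim about what HN does or should do:

```python
import math

def ucb_score(upvotes, views, total_views):
    """UCB1-style score: observed upvote rate plus an exploration bonus."""
    if views == 0:
        return math.inf  # an article nobody has seen yet is tried first
    bonus = math.sqrt(2 * math.log(total_views) / views)
    return upvotes / views + bonus

def rank(articles):
    """Order article ids best first.

    articles: dict of id -> (upvotes, views) (hypothetical data shape).
    """
    total = sum(views for _, views in articles.values())
    return sorted(
        articles,
        key=lambda a: ucb_score(articles[a][0], articles[a][1], total),
        reverse=True,
    )

# Made-up numbers: a well-known story, a barely seen one, and an unseen one.
print(rank({"old": (50, 1000), "new": (1, 2), "unseen": (0, 0)}))
```

The exploration bonus plays the role of the "experiments": little-seen articles are pushed up until enough people have looked at them to estimate their upvote rate, with no random numbers involved.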

~~~
yummyfajitas
I definitely agree with this. I'm not sure what HN's algorithm is actually
supposed to optimize, but apart from that I think it's pretty straightforward
how to optimize it with non-random systems.

[http://www.bayesianwitch.com/blog/2013/why_hn_shouldnt_use_r...](http://www.bayesianwitch.com/blog/2013/why_hn_shouldnt_use_randomized_algorithms.html)

------
aragot
...or we could have a much larger home page, so we don't have to deal with the
end-of-page-1 discontinuity, and we rely on the end-of-reader's-attention-span
discontinuity, which is much smoother.

------
alan_cx
Great idea. Would be interesting to see it trialed.

------
baby
I've put some thought into it as well. By doing that you create a new problem:
pushing crappy stories to the front page and scaring new users away. For a
website like HN that doesn't need or want new users that's a good thing. But
for a reddit, hypem, etc... the best solution, IMO, is to trust people
browsing /r/new to do the work for the others.

------
read
The idea is not new. For building systems, tlb argues there should be a random
component in all arbitrary decisions.

[http://tlb.org/tlbImages/thesis.ps](http://tlb.org/tlbImages/thesis.ps)

------
taspeotis
Previous discussion:
[https://news.ycombinator.com/item?id=6498992](https://news.ycombinator.com/item?id=6498992)

    
    
        86 points by luu 58 days ago | flag | 41 comments

------
jedberg
I think this is the wrong solution. I wrote my thoughts here:
[https://news.ycombinator.com/item?id=6834705](https://news.ycombinator.com/item?id=6834705)

------
netforay
Instead of randomness, rank should be used as probability of getting selected
on the first page. Not sure how to do it, and how to maintain the cache.
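One possible way to do it, as a sketch: sample the page without replacement using the Efraimidis-Spirakis weighted sampling trick, with a weight that decreases with rank. The `1/rank` weight, the function name, and the input shape are assumptions for illustration only:

```python
import random

def sample_front_page(ranked_ids, size, rng=random):
    """Sample `size` distinct articles, favoring better-ranked ones.

    ranked_ids: article ids ordered best first (hypothetical input).
    The weight 1/rank is an assumption; any decreasing function would do.
    """
    keyed = []
    for position, article in enumerate(ranked_ids, start=1):
        weight = 1.0 / position
        # Efraimidis-Spirakis key: larger weights tend to give larger keys.
        key = rng.random() ** (1.0 / weight)
        keyed.append((key, article))
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    chosen = {article for _, article in keyed[:size]}
    # Show the chosen articles in their original rank order.
    return [article for article in ranked_ids if article in chosen]

page = sample_front_page(list(range(100)), size=30, rng=random.Random(0))
print(page)
```

As for the cache, one untested idea is to seed the generator from a coarse time bucket, for example `random.Random(int(time.time()) // 60)`, so every request within the same minute gets the same sampled page.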

