

"Unknown or expired link." - Why? - bradharper
http://www.google.com/search?q=hacker+news+Unknown+or+expired+link.&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
Why would such a valuable site continue to allow itself to be plagued with this type of repulsive usability issue?
======
jgrahamc
It's an artefact of the way in which news.arc (actually srv.arc) uses
functions for links. Here's the key code:

    
    
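      ; Message shown when a function ID has no registered handler.
      (= dead-msg* "\nUnknown or expired link.")

      ; Handler for /x: look up the closure named by the fnid argument and
      ; call it on the request, or print the dead message if it is missing.
      (defop-raw x (str req)
        (w/stdout str
          (aif (fns* (sym (arg req "fnid")))
               (it req)
               (pr dead-msg*))))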
    

If the fnid (function ID) isn't in the fns* table then you get the dead
message.

    
    
      ; Wrap f in a closure, register it under a new fnid, and return its URL.
      (def flink (f)
        (string fnurl* "?fnid=" (fnid (fn (req) (prn) (f req)))))
    

In many places in the code closures are used to handle requests (see the flink
code there). If the fns* table is cleared (say, because news.arc is restarted
or harvest-fnids removes the entries) then you'll get the message.

The use of closures in this manner means that the code needed to handle say a
form submission is really compact and set up when the form itself is
generated.
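
For readers who don't know Arc, here is a rough Python sketch of the same idea. It is not the news.arc code: the names `handlers`, `fnid`, `flink`, `dispatch` and `more_link` are made up for illustration, and the random ID scheme is an assumption.

```python
import secrets

DEAD_MSG = "\nUnknown or expired link."

# Maps a function ID to the closure that will handle the request
# (a stand-in for fns*; hypothetical name).
handlers = {}

def fnid(handler):
    """Register a closure and return a fresh, hard-to-guess ID for it."""
    key = secrets.token_urlsafe(8)
    handlers[key] = handler
    return key

def flink(handler, base_url="/x"):
    """Build a link whose target is the registered closure."""
    return f"{base_url}?fnid={fnid(handler)}"

def dispatch(request):
    """Call the closure named by the request, or return the dead message."""
    handler = handlers.get(request.get("fnid"))
    return handler(request) if handler else DEAD_MSG

def more_link(page):
    # The closure captures `page`, so no state appears in the URL itself.
    return flink(lambda req: f"page {page + 1}")

url = more_link(1)
key = url.split("fnid=")[1]
print(dispatch({"fnid": key}))   # handled by the closure
handlers.clear()                 # simulate a restart
print(dispatch({"fnid": key}))   # the closure is gone, so the dead message
```

The link only works for as long as the server still holds the closure, which is the whole problem in a nutshell.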

~~~
piinbinary
I would be interested to know how much state is in those closures. If it is
less than 200 or so bytes, it would not be impractical to encode it (b64) in
the url for the next page (rather than a reference to the state).
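
A minimal sketch of what I mean, in Python rather than Arc. The state dict and the `/x?state=` URL shape are invented for the example, and I'm assuming the state is small and JSON-friendly.

```python
import base64
import json

def encode_state(state):
    """Serialise a small dict to URL-safe base64 text."""
    raw = json.dumps(state, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_state(token):
    """Reverse encode_state. The result is untrusted input."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))

state = {"op": "more", "page": 2}      # hypothetical closure state
token = encode_state(state)
url = f"/x?state={token}"              # hypothetical URL shape
print(url, len(token))
print(decode_state(token))
```

Nothing has to stay in server memory for this link to keep working, but the server must treat whatever comes back as user input.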

~~~
JoachimSchipper
You don't want to execute code from URLs. (Yes, you can use cryptography to
"sign" URLs you create. Don't try that at home unless you know the difference
between MACs and hashes, and how to avoid timing attacks.)

~~~
Dylan16807
Whoa, whoa. Putting the _state_ from the closure in the url is not the same as
putting the _closure_ in the url.

~~~
dspillett
But you are still likely to want to sign the state so you can tell if it has
been corrupted (or deliberately doctored) and reject it if so.
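
Roughly like this, as a Python sketch using an HMAC from the standard library. The key, the `payload.mac` layout and the function names are assumptions for illustration, not anything HN does.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice it would be random and private.
SECRET_KEY = b"replace-with-a-random-server-side-secret"

def _mac(payload):
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()

def sign(payload):
    """Append an HMAC-SHA256 tag to the payload."""
    return f"{payload}.{_mac(payload)}"

def verify(signed):
    """Return the payload if the tag matches, otherwise None."""
    payload, sep, tag = signed.rpartition(".")
    if not sep:
        return None
    # compare_digest takes the same time wherever the first mismatch is.
    ok = hmac.compare_digest(tag.encode("utf-8"), _mac(payload).encode("ascii"))
    return payload if ok else None

signed = sign("page=2")
print(verify(signed))                               # the original payload
print(verify(signed.replace("page=2", "page=9")))   # tampered, so rejected
```

This uses a keyed MAC rather than a bare hash, and a constant-time comparison, which are the two points raised above.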

~~~
Dylan16807
Sign or sanity check, whichever you prefer. Personally in a simple interface
like this site I'd rather sanity check the few simple parameters.
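
By sanity check I mean something like the following Python sketch. The parameter (a page number) and the limits are made up; the point is only that a handful of simple values can be validated directly.

```python
def parse_page(raw, max_page=100):
    """Return a valid page number, or None if the input is not acceptable."""
    # Accept only short strings of ASCII digits; reject everything else.
    if not isinstance(raw, str) or not raw.isascii() or not raw.isdigit():
        return None
    if len(raw) > 4:
        return None
    page = int(raw)
    return page if 1 <= page <= max_page else None

print(parse_page("2"))
print(parse_page("2; drop table"))
```

A check like this says nothing about who produced the value, only that it is harmless to act on.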

~~~
wnight
It's not a druthers kind of thing. If you need to trust that it hasn't been
tampered with you _must_ sign it.

And if you don't care you might as well not add authentication because without
signing it's just a fancy CRC - ie, totally replicable by an attacker. As
cookies and links are sent over TCP there should be vanishingly few errors in
transmission - you're far more likely to introduce false positives with buggy
code, and ...

You need to sanity check your inputs anyways. Just do it. This is also how you
avoid bugs normally.

~~~
Dylan16807
I'm not sure what your argument is.

Yes if you want to trust it you have to sign it and make sure you implement
all the crypto correctly. But I don't see a need for that here.

Also TCP's checksum sucks.

~~~
wnight
You said "sign or sanity check it", as if you can do whichever you want. But
in the area suggested they have vast differences and security implications.

How many corrupted web pages do you see because of CRC failure in TCP?

------
extension
You get that when your session expires, which takes just slightly less time
than writing a well thought out comment.

~~~
wvenable
It doesn't just happen when writing comments; it can happen almost anywhere on
the site if you linger too long. It's the most glaring fault with the software
behind this site and would be completely impracticable if this site weren't
populated by technology-minded people who aren't bothered by error messages.

~~~
FaceKicker
Agreed. I've pretty much gotten used to refreshing before I click to the next
page every time...

~~~
ericd
I just open up each page I'm interested in reading in its own tab. I've never
really had a problem with the error message unless I take too long on a
comment.

------
brlewis
PG is using a really cool programming technique that I'm afraid is ahead of
its time relative to current hardware. An upgrade to the HN server should
allay the problem.

To see the potential, look at this code snippet from an academic paper on the
topic. The web server presents a form asking for a number, then presents a
form asking for another number, then displays their product. This technique
makes event-driven web applications feel (to the programmer) like sequential
imperative programs.

    
    
      ;; main body
      `(html (head (title "Product"))
             (body
               (p "The product is: "
                  ,(number->string (* (get-number "first") (get-number "second"))))))
    

The paper:
[http://cs.brown.edu/~sk/Publications/Papers/Published/khmgpf...](http://cs.brown.edu/~sk/Publications/Papers/Published/khmgpf-impl-use-plt-web-server-journal/)
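
If Scheme isn't your thing, the flavor can be approximated in Python with a generator. This is only an analogy under my own assumptions: `product_flow`, `sessions`, `start` and `submit` are invented names, and a generator can only be resumed once, so it does not handle the back button the way real continuations do.

```python
def product_flow():
    """Reads like a sequential program, but pauses at each form."""
    first = yield "Enter the first number:"
    second = yield "Enter the second number:"
    yield f"The product is: {first * second}"

# Hypothetical in-memory store: session id -> paused generator.
sessions = {}

def start(session_id):
    flow = product_flow()
    sessions[session_id] = flow
    return next(flow)                 # run up to the first form

def submit(session_id, number):
    flow = sessions.get(session_id)
    if flow is None:
        return "Unknown or expired link."
    try:
        return flow.send(number)      # resume where the program paused
    except StopIteration:
        del sessions[session_id]      # the flow already finished
        return "Unknown or expired link."

print(start("s1"))
print(submit("s1", 6))
print(submit("s1", 7))
```

The paused generator lives in server memory between requests, so losing it produces the same kind of dead link this thread is about.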

~~~
pg
It's not so much that it's ahead of its time relative to hardware as it is
something you do in the early versions of a program.

Using closures to store state on the server is a rapid prototyping technique,
like using lists as data structures. It's elegant but inefficient. In the
initial version of HN I used closures for practically all links. As traffic
has increased over the years, I've gradually replaced them with hard-coded
urls.

Lately traffic has grown rapidly (it usually does in the fall) and I've been
working on other things (mostly banning crawlers that don't respect
robots.txt), so the rate of expired links has become more conspicuous. I'll
add a few more hard-coded urls and that will get it down again.
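
To make "hard-coded url" concrete, here is a sketch in Python rather than Arc. The `/news?p=N` shape, the story list and the page size are all invented for the example and are not what news.arc does.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

STORIES = [f"story {n}" for n in range(1, 91)]   # hypothetical data
PER_PAGE = 30                                    # hypothetical page size

def more_url(page):
    """The link carries the page number itself, not a reference to a closure."""
    return "/news?" + urlencode({"p": page + 1})

def handle(url):
    """Rebuild everything needed from the URL alone."""
    query = parse_qs(urlsplit(url).query)
    raw = query.get("p", ["1"])[0]
    valid = raw.isascii() and raw.isdigit() and len(raw) <= 4
    page = max(int(raw), 1) if valid else 1
    start = (page - 1) * PER_PAGE
    return STORIES[start:start + PER_PAGE]

print(more_url(1))
print(handle(more_url(1))[0])
```

Such a link can't expire, at the cost of writing the parsing and validation by hand for each kind of link.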

~~~
biot
Over the last week the home page appears to be cached longer than the arc
timeout, no doubt due to the spike in traffic. As I throw away cookies when
closing the browser, I need to log in daily. It's been impossible to log in from
the HN home page because of this. Refreshing the page doesn't help; I've had
to click through to a story to be able to log in.

You should hard-code that one too.

~~~
pg
The problem there is that we switched to a new deliberately slow hashing
function for passwords.

Edit: I investigated further, and actually you're right, the problem was due
to caching. It should be better now because we're not caching for as long. But
I will work on making login links not use closures.

~~~
tptacek
What'd you go with, and how much of a pain was it to get working in Arc?

I ask because I'd love to be able to make a claim like "even Hacker News,
which is written in a Lisp, managed to implement a modern password hash".

~~~
pg
We use bcrypt. Rtm did it. I never looked at the code till now; it's about a
page of Scheme.

~~~
tptacek
Thanks!

------
mooism2
The HN server uses a table of closures to implement those links (the id code
for the closure is the bit after fnid= in the url).

When the HN server starts running out of memory, it drops entries from this
table. When your browser asks for an entry that is no longer in this table,
you get the "Unknown or expired link" error.
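
A toy Python version of that behavior, assuming a fixed-size table that drops the oldest entries first. The real eviction policy may well differ, and `ClosureTable` and its methods are names I made up.

```python
from collections import OrderedDict

class ClosureTable:
    """Bounded table of closures; the oldest entries are dropped when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.next_id = 0

    def register(self, handler):
        self.next_id += 1
        key = str(self.next_id)      # sequential only to keep the sketch short
        self.entries[key] = handler
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the oldest closure
        return key

    def call(self, key, request):
        handler = self.entries.get(key)
        if handler is None:
            return "Unknown or expired link."
        return handler(request)

table = ClosureTable(capacity=2)
first = table.register(lambda req: "page 2")
table.register(lambda req: "page 3")
table.register(lambda req: "page 4")   # pushes out the first closure
print(table.call(first, {}))
```

The more pages the server generates per minute, the faster old entries are pushed out, which is why the error gets worse when traffic goes up.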

This is a crazy design, but unless someone would like to patch the source code
and get PG to accept it, we're stuck with it.

~~~
gnaritas
It's not crazy at all, it greatly simplifies development to use callbacks for
actions rather than manually encoding the necessary state into the URL.
Techniques such as this are what enable a single developer to be so productive
by automating boring and time consuming stuff.

~~~
srdev
Except it doesn't appear to work robustly, which makes it poor design.
"Automating boring and time consuming stuff," is all well and good if it
actually produces a functional system, but that concern is secondary to
robustness.

~~~
blahedo
It'd work robustly enough if the links didn't expire, and if we believe other
posts on this page, the links are expiring due to memory limits on the system.
(The other possibility is a timeout, I guess, which is easily fixed.) If it's
running out of memory to store the closures it would run out of memory to
store the interaction state.

In other words, there's a problem here, but it's not the programming model
that pg chose.

~~~
srdev
Except the links do expire, so it's not robust. I expect that when I visit a
web page, I can let it sit for an extended period of time before moving on to
the next page and have it work. HN doesn't work.

Furthermore, the technique of holding important state authoritatively in
memory like this is not a good web-development practice for various reasons.
Doubly so if it's state data that can be round-tripped. Links should not break
when the web server or cache (I'm not sure which one it is) runs low on
memory. So yes, there is a problem with the programming model that pg chose.

~~~
gnaritas
If he hadn't used that technique, there would be no hacker news for you to use
at all. You're entirely missing the point that this is a technique to make
hobby programming more fun, it's not about being robust or best practice, it's
about making programming simpler so pg finds it worth his time to build this
site in the first place.

~~~
srdev
blahedo's point was that the technique was not fundamentally a problem from a
robustness point of view, and I disagree with that point. It is a problem, and
I was pointing that out.

Your point seems to be that since Hacker News is a "hobby project," that we
may forgive sacrificing a bit of robustness to make the programming exercise
more pleasant. That point was not clear to me from your original posting.
Rather, the point seemed to be that the technique was good because it was
clever and fun, and I disagreed with that sentiment.

PG seems to be saying elsewhere that it was used as a rapid prototyping
technique. That seems to be a fair justification of the technique, in my
estimation.

------
icebraining
If you clicked on the links, you'd know why.

The software stores the current state in a closure. The closure gets cached.
When the cache is full, the older closures get flushed, hence the error
message.

------
tomcreighton
I find it interesting that most discussions I've seen about this exact topic
are about Arc and closures... instead of about the fact that this may well be
an interesting programming thing to do but it's a moronic user experience
thing to do.

~~~
pg
Your comment is in a sense its own refutation, because the ultimate test of
user experience is whether users continue to use the software.

Getting user experience right depends on the users. I wouldn't use this
technique in an online store. Random online shoppers would be confused by
expired links, and you'd lose sales. But HN users aren't confused by them.
What HN users care about is the quality of the stuff on the site.

Since I can't work full time on HN, I focus on the things that matter most.
What I spend my time thinking about is e.g. detecting voting rings. Those
affect what you see on the frontpage, which is what users of this site care
most about.

~~~
mfjordvald
I think you underestimate how annoying the issue is. It's one of those things
you put up with because of the content, but which are annoying enough that
they detract from the site experience.

So far I'd rate the user experience of the site around 3/5 and the content
5/5. You don't need to work any more on the content unless it starts dropping!

~~~
tptacek
I've been here a long time and seen this expired linky thing happen roughly
every other week; on a very few occasions it's been an annoyance, but mostly
it makes me smile; after reading news.arc, it's a reminder of what a hack HN
is.

I couldn't care less if this issue got fixed. I have never once felt, "man, I'd
definitely jump to another site if it didn't have this expired linky thing
happen".

(Now, politics stories on the front page, on the other hand... I've _often_
wished for a site with as good a crowd as HN but without the politics...)

------
kristopher
I, too, wanted to ask this question, but feared that someone would respond
with something akin to "submit a patch" or "grep the source code"

Obviously, it makes sense for someone who is versed in the news.arc internals
to fix the problem; nonetheless this issue certainly bugs me.

------
JohnsonB
Good question, but in my opinion it's somewhat rhetorical. Given bugs like
these, the ongoing optimization battle, and fairly reasonable feature requests
(see the huge HN topic on that), isn't it about time Paul Graham hired someone
full or part time to work all these issues out? Given how important HN is to
YC, I would think it's worth it. Are any of these tasks _really_ things pg has
to do or are the best use of his time?

~~~
brlewis
It's good to do things you enjoy. _Best_ use of his time? I don't think anyone
can definitively answer that.

------
breckinloggins
Steps to reproduce:

1\. Open Hacker News

2\. Go to lunch

3\. Come back from lunch and click next

Every. Time.

~~~
przemoc
Refrain from eating, then.

~~~
przemoc
Hm... Didn't know that obviously jesting comments are ill-favoured on HN.
Hackering is a serious business, apparently.

------
smackfu
The first complaints are from 1575 days (over 4 years) ago, including about
the more button breaking, so I am guessing pg has no interest in fixing it.

~~~
Tichy
Hacker News is open source, so it seems as if nobody is interested in fixing
it. Or are there fixes and pg has rejected them?

~~~
wvenable
Or it's so ingrained in the architecture of the software that a fix isn't
possible without completely rewriting it and changing the entire design
philosophy.

~~~
pg
Sort of yes, sort of no. It's a rapid prototyping technique. Essentially you
fix it case by case, by taking individual bits of code that use this technique
and replacing them with the uglier and less flexible but more efficient
alternative of a hard-coded url.

~~~
bkmartin
Paul, I really do not mean any disrespect here because you are truly a class
act and first-rate player in the startup world. You are also a great hacker
that loves to push the limits. You've created an amazing community here that I
have been able to learn a ton from.

I have to ask, and I'll probably get downvoted to hell because I'm naive or
something, but what is so elegant about a coding technique that breaks under
normal usage conditions? If I put out a customer facing piece of code,
especially after 4 years, wouldn't it make sense to use an "uglier and less
flexible but _more efficient_ alternative" that doesn't break?

I understand your previous explanations of why this happens and of rapid
prototyping etc. But at what point does the architecture actually get changed
to eliminate this _bug_?

~~~
pg
It doesn't make sense to call any specific amount of traffic "normal
conditions."

What's good about this technique, and about rapid prototyping in general, is
that you can write an initial version quickly in very little code, then
gradually make it more efficient as the demands on the app increase.

The rate of expired links says more about how busy I personally have been
lately than about the desirability of storing state in closures.

~~~
bkmartin
I'm not referring to any specific amount of traffic. I'm referring to how
users expect a website to work. If the user sees a link, especially a More or
Login link then the user expects it to do just what it says. When those don't
work I would call that a bug. I'm in agreement that this technique can be
useful for rapid prototyping, but I also think this site is probably the most
active and mass used prototype I've ever seen. ;)

My goal for a web site or web app is to have 0 expired links. Sometimes stuff
you link to outside your site will go dead, and it must be fixed or removed or
whatnot. But for your own internal stuff... I don't know... something doesn't
feel right about an architecture that allows that systematically. How much
time could you save if you didn't even have to worry about fixing any expired
links? Any idea on what the ROI on your time would be?

Anyway, just thinking out loud. Thanks again for the site though. I do indeed
enjoy it very much regardless.

------
paulkoer
This is apparently an old problem that hasn't been fixed yet. As you can
imagine, pg has a lot to do these days ;)

See here <http://news.ycombinator.com/item?id=28944>

------
davidcollantes
This happens to me a lot while reading HN. I hit "More" and by the time I am
done reading a few comments on a handful of entries, the next "More" has
expired, and so has the current.

This only happens on HN (at least to me).

~~~
d1b
Actually _right now_ I cannot click "more" (on the first page) without hitting
the "Unknown or expired link" page ... so I cannot go past the first page :/
-- Someone should submit a patch :)

------
blauwbilgorgel
I thought this was to combat cross-site request forgery attacks?

Else couldn't one set up malicious scripts to up- or downvote many stories, or
post comments under someone else's name?

~~~
dasil003
Those are orthogonal issues.

------
odobenus
Everyone who has literally answered the question "Why?" has completely missed
the point. What would PG say about a primary site feature that is so
completely broken that it drives users to complain actively, and maybe stop
using the site? That it is their problem because they don't understand the
technical details? Well, obviously, no one is losing any money here, so maybe
that's the answer after all.

------
smountcastle
I'm getting this error when using the login link right now!

EDIT: The only way I was able to log in was to use the 'add comment' button on
this post.

~~~
emu
I also just tried to log in about 6-7 times in a row (clicking the "login"
link on the front page, reloading the front page in between attempts), and I
repeatedly received the "expired link" page.

It also reliably happens clicking the "next page" link on the bottom of the
front page; by the time I'm done reading the front page the next page link
usually expires.

Please fix?

------
abdulhaq
The problem could be made less painful by including a link back to
<http://news.ycombinator.com> on the "Unknown or expired link" page. That
would save me fishing around with the mouse and the back button to get a new
start.

~~~
VBprogrammer
I would go one step further and just send me back to the homepage.
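
Both suggestions are small changes. A Python sketch of each, assuming a handler that returns either an HTML string or a (status, headers, body) tuple; the function names are hypothetical and this is not how news.arc builds its responses.

```python
import html

HOME_URL = "http://news.ycombinator.com"

def dead_link_page(home_url=HOME_URL):
    """Dead-link page that also offers a way back to the front page."""
    safe = html.escape(home_url, quote=True)
    return (
        "<html><body>"
        "<p>Unknown or expired link.</p>"
        f'<p><a href="{safe}">Back to the front page</a></p>'
        "</body></html>"
    )

def dead_link_redirect(home_url=HOME_URL):
    """Alternative: skip the page and send the browser straight home."""
    return 302, {"Location": home_url}, ""

print(dead_link_page())
print(dead_link_redirect())
```

The redirect is less friction but gives no hint about what went wrong, so a comment in progress would vanish without explanation.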

------
Silhouette
I've been wondering about this for a long time, but just as a data point if
anyone cares, it has reached the point recently that HN is basically unusable
for me a lot of the time, and I really am starting to give up on trying and
spend more time elsewhere instead.

Perhaps one visitor is no great loss -- I'm hardly the personality around here
that someone like patio11 is -- but I hope my contribution is constructive,
and my comment scores have always suggested so.

However, subjectively, it seems like the quality of posting and voting has
taken a sharp nosedive since the "Unknown or expired link" problems have
become a several-times-per-session occurrence over the past few weeks. I can't
help wondering whether long-standing regular contributors are being put off as
a result. If positive contributors can't even log in to refute an objectively
incorrect post with a verifiable link or downvote Redditesque diversions, a
downward slide seems inevitable, and then the loss of high quality posting and
voting becomes a self-sustaining decline.

~~~
Tichy
When writing comments, always make a copy of your text before hitting submit
(CTRL+A, CTRL+C). A good strategy for any text form on the web.

~~~
6ren
"go back one page" (alt-left arrow) recovers your text on HN

~~~
thwarted
This is browser dependent (although many modern browsers do keep form content
in the history).

------
socialmediaking
It seems kind of ironic to have such an egregious bug on a site dealing with
coding and technology...

------
adamrmcd
Silly question, but if HN is purportedly open source, where can I download the
source code?

~~~
mrb

      1. http://arclanguage.org/install
      2. Extract arc3.tar
      3. See news.arc
    

However, AFAIK pg forked from the latest public news.arc, so the current
Hacker News platform is _not_ open source.

------
gbaygon
Yes please, fix it already. If this problem were on another site, we would be
posting complaints from random blog posts over and over again.

------
drivebyacct2
I think some pages are generated statically and expire. Why the login/logout
pages "expire", I can't really guess at.

Wow, not sure why this is so deserving of downvotes. Trying to find a source,
but I thought there was a previous discussion of many HN pages being
statically generated and served quickly with links that expire after a certain
time (or become invalid because of what may happen on the server side of
things). But oh well.

------
sadfasdfads
One reason might be that posts/users are getting axed. Would be nice if
everything except spam was unmoderated, imo. Also, the current rate limiters
to keep spam out are blocking those that would otherwise be more active.

