
Ask HN: Is the Hacker News Team Actively Developing Arc? - Kinnard
Is the Hacker News team at YC actively developing Arc? If so, what important developments have they made?
======
dang
Nearly all our development has been on the HN application rather than on Arc
itself. Of course, in a Lisp, the line between application and language is
blurred. Many of our changes have a language-like quality, but because they're
implemented in macros, they haven't involved modifying Arc itself.

We did make one change to Arc proper: we added syntax for binding thread-local
variables in function signatures, analogous to how optional arguments are
declared. This adds a limited kind of dynamic scoping to Arc. That is useful
when you have some 'top level' state (e.g. the basic parameters involved in
making an HN web request: username, item id, etc.) that you'd like not to have
to pass down every chain of function calls that's going to need it at some
point. We originally added this as an experiment to see whether it would
simplify or complicate the code. It simplified it very nicely, so we kept it.
I can describe this in more detail later if anyone's interested.

~~~
jcr
> _"I can describe this in more detail later if anyone's interested."_

Yes, please.

BTW, that was some of the best nerd sniping I've seen.

[https://xkcd.com/356/](https://xkcd.com/356/)

~~~
dang
Ok. What I'll describe is a common problem in web programming (and other kinds
of programming too) that most people here will probably recognize. We ended up
modifying Arc to make the code that has this problem simpler. We weren't sure
how well it would turn out, but in the end it was a net win and we kept it.

Consider the following function from
[https://github.com/arclanguage/anarki/blob/master/lib/news.a...](https://github.com/arclanguage/anarki/blob/master/lib/news.arc),
which is similar to what HN looked like before our change. I've added comments
beneath the code to explain the relevant bits.

    
    
      (def user-name (user subject)
        (if (and (editor user) (ignored subject))
             (tostring (fontcolor darkred (pr subject)))
            (and (editor user) (< (user-age subject) 1440))
             (tostring (fontcolor noob-color* (pr subject)))
            subject))
    
      ; Print colored usernames for editors, plaintext otherwise.
      ; 'user' is the account that sent this request, or nil if logged out
      ; 'subject' is the user whose name we need to print out
    

For example, if you've requested the front page, then 'user' is your account
name at the top right of the page (or nil if you're not logged in), and
'subject' is the account name of a submitter of a story on the page. If we're
printing a story that says "218 points by tokenadult", then 'subject' is
"tokenadult".

Now, go back and notice how the only thing the function 'user-name' does with
its argument 'user' is pass it to another function, 'editor'. That function
looks like this:

    
    
      (def editor (u)
        (and u (or (admin u) (> (uvar u auth) 0))))
    
      ; Return whether user 'u' has editor privileges
    

'user-name' calls 'editor' to find out whether the currently logged-in user
should see usernames in special colors that non-editors don't get to see.

This example is typical of HN's code. Every request to the site involves a set
of standard variables, like 'user', that last for the lifetime of the current
page request. (Two others are the page being requested and the requesting IP.)
HN has lots of functions (like 'editor') that need one or more of these
variables, but it has even more functions (like 'user-name') which don't need
any of them, but make calls to other functions that do. In the code above,
both types of argument—the "transient" one ('user') that is merely passed down
the chain, and the "real" one ('subject') that is actually used—appear in the
function signature.

This can lead to confusing function signatures, since it isn't clear which
arguments are the "real" ones. There's even an example of such confusion in
the code above. The function is called 'user-name' and its first argument is
'user', so you'd naturally assume—I always assumed, reading the code—that the
name to be printed is that of 'user'. Wrong! It's 'subject'. This kind of
thing can, and did, lead to bugs.

Many web frameworks bundle all the request properties (like 'user') into a
Request object, and pass the Request around everywhere or keep it around as a
field. That works fine; there are lots of ways of solving this problem. Still,
it would be nice to know from a function's signature exactly what state it
does and doesn't use.
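
As a rough illustration of that Request-object style (in Python rather than Arc, and not taken from any particular framework), here is a minimal sketch. The `Request` class, its fields, and the `ADMINS` set are all made up for the example:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the site's admin list.
ADMINS = {"pg"}


@dataclass(frozen=True)
class Request:
    """Bundle of per-request state; the field names are assumptions."""
    user: Optional[str]  # logged-in account, or None if logged out
    page: str
    ip: str


def editor(req: Request) -> bool:
    # This function actually uses the request state.
    return req.user is not None and req.user in ADMINS


def user_name(req: Request, subject: str) -> str:
    # This one only forwards `req`; its signature doesn't say which
    # fields of the request end up mattering.
    return f"<b>{subject}</b>" if editor(req) else subject
```

Every function now takes a single `req` instead of several transient arguments, but the signature of `user_name` still can't tell you that only `req.user` is ever consulted.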

We decided to take advantage of the fact that in the HN code, each page
request gets its own (green) thread, and Racket, which Arc is implemented in,
has thread-local variables. At the beginning of processing a request, we save
all the request parameters—including the currently-logged-in user, which we
call 'me'—as thread-local vars. We made two changes to Arc for accessing
these.

The first change we made was to add a form 'the' that takes a symbol and
returns the value of the thread-local variable with that name, or nil if there
isn't one. So, to get the currently logged-in user, we just say:

    
    
      (the me)
    

... an odd phrasing, but charmingly succinct.
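
For readers who don't know Arc or Racket, the behavior of 'the' can be approximated in Python with `threading.local`. This is only an analogue under stated assumptions: Python threads are OS threads rather than Racket's green threads, and `_request_state` and `handle_request` are hypothetical names, not anything from the HN code.

```python
import threading

# One thread-local namespace for the whole program; each thread sees
# only the attributes that it set itself.
_request_state = threading.local()


def the(name):
    """Return this thread's value for `name`, or None if it has none."""
    return getattr(_request_state, name, None)


def handle_request(me, results):
    # Hypothetical request handler: stash the parameter at the start of
    # the request, then read it back through `the` instead of an argument.
    _request_state.me = me
    results.append(the("me"))


results = []
threads = [threading.Thread(target=handle_request, args=(u, results))
           for u in ("pg", "dang")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each handler thread reads back its own `me`, while the main thread, which never set one, gets `None` from `the("me")`.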

However, we were nervous about scattering forms like '(the me)' and '(the
page)' and '(the ip)' willy-nilly through the code. That's basically
programming by global variable. ("Thread-local" variables could also be called
"thread-global".) Accessing them randomly would wreck the discipline of
knowing which variables are used where—a good thing that having to declare
them explicitly in function signatures imposes on you.

To solve that, we made a second change to Arc: we added a new kind of optional
argument to let functions declare which of these variables they'll use.
Optional arguments work like this in Arc:

    
    
      (def foo ((o bar 123)) ...
    

If you supply the arg explicitly as in '(foo 99)', then 'bar' is 99. If you
omit the arg, as in '(foo)', then 'bar' is 123. If you don't supply a default,
'bar' is nil.

By analogy with 'o' for optional, we added 't' for thread-local, allowing you
to declare a function like this:

    
    
      (def foo ((t me)) ...
    

Now you can call '(foo)' without an argument and the variable 'me' will be
bound to the current logged-in user. If you need more than one thread-local
value, that's fine:

    
    
      (def foo ((t me) (t ip)) ...
    

And you can mix ordinary function arguments in along with these:

    
    
      (def foo (bar (t me) (t ip)) ...
    

Say you want a function to take an optional argument and use a thread-local
value if the argument isn't supplied. You can do it like this:

    
    
      (def foo ((t u me)) ...
    

That allows you to call 'foo' with an explicit arg, e.g. '(foo "pg")', making
'u' be "pg", or without an arg, '(foo)', making 'u' be the currently logged-in
user.

We can use that trick to make a new version of 'editor':

    
    
      (def editor ((t u me))
        (and u (or (admin u) (> (uvar u auth) 0))))
    

Now we can call it without an arg, '(editor)', to find out if the currently-
logged-in user is an editor. And that is what allows us finally to simplify
'user-name':

    
    
      (def user-name (subject)
        (if (and (editor) (ignored subject))
             (tostring (fontcolor darkred (pr subject)))
            (and (editor) (< (user-age subject) 1440))
             (tostring (fontcolor noob-color* (pr subject)))
            subject))
    

We've gotten rid of the transient argument 'user', because it was only
'editor' that needed it, and it no longer needs it from us.

This was not a sophisticated language change. Under the hood, the 'def editor'
signature above simply translates to:

    
    
      (def editor ((o u (the me)))
    

But it did require changing the Arc implementation to make it work.

This change to the language allowed us to simplify hundreds of cases where
arguments were passed through functions that didn't actually need them. It
also allowed us to clarify the code in many places. For example, now that
'user-name' no longer has any ambiguity about which user is which, we don't
need to call its argument 'subject'. It can just be 'user':

    
    
      (def user-name (user)
        (if (and (editor) (ignored user))
             (tostring (fontcolor darkred (pr user)))
            (and (editor) (< (user-age user) 1440))
             (tostring (fontcolor noob-color* (pr user)))
            user))
    

All these simplifications were nice indeed, but it wasn't clear at first
whether the change was a good idea overall. There's still something magical
about pulling thread-local vars out of thin air when you need them. It still
feels uncomfortably globalesque. But in the end, two considerations won us
over.

First, it turns out that this solution strikes a nice balance between
convenience and discipline. You still have to declare which variables you plan
to use in each function signature, so it's possible to look at the function
and know exactly what state it depends on. That's a lot more systematic than
loosey-goosey random access would be. And there's something pleasingly crisp
about declaring state only at the place where it's about to be used, even
though as a style it took a little getting used to (e.g. it's a bit weird that
'(editor)' tells you something about the currently logged-in user). On the
whole, it was a significant win for expressiveness. Here, being able to modify
the language to suit the application was the big win. The new language
construct is much more lightweight than any code we might have written purely
at the application level, even with Lisp macros.

Second, these request parameters like 'user' that we smuggle around in thread-
local variables are a genuine design invariant for HN. They're always present,
their meaning is always the same, and they never even change value once
assigned. This decreases the risk of having them universally accessible
through the code. That's the main thing we were worried about, and I was
surprised to find how little of a problem it turned out to be. It hasn't led
to any bugs. In fact I don't believe it has caused any trouble at all.

There is one downside. Some functions that no longer take arguments they don't
need are no longer so easy to call from the REPL. For example, to find out how
the username "dang" is printed when pg is logged in, we used to be able to
evaluate this from the REPL:

    
    
      (user-name "pg" "dang")
    

But now there is no way to pass in "pg" as the logged-in user, and if I
evaluate

    
    
      (user-name "dang")
    

that's going to show me how "dang" is printed when nobody is logged in. To get
around this, you must set up a thread-local value before evaluating the form.
We have macros for this:

    
    
      (w/me "pg" (user-name "dang"))
    

It's not as convenient as just passing an argument to a function, but still
nicer than having to, e.g., set up a stub Request object. In practice it's not
too annoying.
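
A Python context manager gives a rough idea of what a 'w/me'-style helper does: bind the value, run the body, restore the previous value. This is a sketch of the idea, not the HN macro; `w_me`, `_state`, `ADMINS`, and the simplified `user_name` are hypothetical.

```python
import threading
from contextlib import contextmanager

_state = threading.local()
ADMINS = {"pg"}  # hypothetical admin list


def the(name):
    """Return this thread's value for `name`, or None if it has none."""
    return getattr(_state, name, None)


@contextmanager
def w_me(user):
    """Bind `me` for the duration of the block, then restore the old value."""
    previous = the("me")
    _state.me = user
    try:
        yield
    finally:
        _state.me = previous


def user_name(user):
    # Stand-in for the real function: mark the name up only for editors.
    return f"<b>{user}</b>" if the("me") in ADMINS else user
```

At a Python prompt, `with w_me("pg"): print(user_name("dang"))` plays the role of the REPL form above.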

The last thing I'll mention is that the above was a good hack for the HN
codebase but probably wouldn't make a good extension for Arc in general,
because it's clearly a poor man's dynamic scope, and proper dynamic scope
shouldn't be that hard to add to Arc. That would subsume the above hack and
give us something both more standard and more powerful. Lexical scope by
default, dynamic scope when appropriate is what the Lambda Papers advocated,
what Common Lisp does (also Racket, in its verbose way), and by far the best
thing for practical Lisp programming.
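
Python isn't a Lisp, but its standard `contextvars` module gives a feel for what general dynamic binding looks like: any variable can be rebound for the extent of a block, nested rebindings shadow outer ones, and callees see the innermost binding. The helper `binding` and the variable `current_user` below are made-up names for this sketch.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# A dynamically bound variable whose value is None when nothing binds it.
current_user = ContextVar("current_user", default=None)


@contextmanager
def binding(var, value):
    """Rebind `var` for the dynamic extent of the block."""
    token = var.set(value)
    try:
        yield
    finally:
        var.reset(token)  # restore whatever was bound before


def greet():
    # Reads the innermost active binding, not a lexical variable.
    return f"hello, {current_user.get()}"


with binding(current_user, "pg"):
    outer = greet()
    with binding(current_user, "dang"):
        inner = greet()
    restored = greet()
```

Unlike the signature-based hack, nothing here is specific to request parameters, and nothing forces `greet` to declare that it depends on `current_user`.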

~~~
akkartik
This was great! I wish there were more case studies like this.

I think you're selling it short: this is a truly novel "third way" of managing
state, after lexical scope and dynamic scope. The key idea is the hook into
the function signature that allows overriding without the push/pop dynamic
that characterizes lexical and dynamic scope.

I'm curious, does your system raise a warning if a function uses a thread-
global variable without declaring it in the function arguments? Do you still
feel the need for directly using _the_ anywhere in the HN codebase?

~~~
jcr
Dan's implementation is fascinating, but I'm unsure if it qualifies as "truly
novel" since I _think_ I've seen something like this before. I wish I could
remember where, and a few quick searches have turned up no results. As he
mentioned, thread-local storage (the other "TLS") in Common Lisp is provided by
dynamically scoped variables [1], but here similar functionality is being
accomplished by other means. Though this way of getting TLS may lack some of
the benefits of full dynamically scoped variables, I've got a wild,
unsubstantiated hunch that it may perform better than full-blown dynamic
scoping due to less overhead.

The fun part is, this way of doing things (i.e. without dynamic scoping built
in) is essentially doing "user-directed scoping" that is vaguely similar to
things like "SoftTyping" [2] and "SuccessTyping" [3,4] which are basically
"user-directed typing".

[1] [https://en.wikipedia.org/wiki/Thread-local_storage#Common_Li...](https://en.wikipedia.org/wiki/Thread-local_storage#Common_Lisp_.28and_maybe_other_dialects.29)

[2] [http://c2.com/cgi-bin/wiki?SoftTyping](http://c2.com/cgi-bin/wiki?SoftTyping)

[3]
[http://www.it.uu.se/research/group/hipe/publications.shtml](http://www.it.uu.se/research/group/hipe/publications.shtml)

[4]
[http://www.it.uu.se/research/group/hipe/publications.shtml](http://www.it.uu.se/research/group/hipe/publications.shtml)

~~~
akkartik
Thanks for those links. Could you check #4?

After I posted my comment I connected up this idea with
[https://en.wikipedia.org/wiki/Google_Guice](https://en.wikipedia.org/wiki/Google_Guice)

~~~
jcr
The #4 link is fine, but the person who posted it failed miserably. ;)

I wanted to give you the direct link to the SuccessTyping paper listed on the
url given in #3 and #4.

[http://www.it.uu.se/research/group/hipe/papers/succ_types.pd...](http://www.it.uu.se/research/group/hipe/papers/succ_types.pdf)

------
Kinnard
See also:
[http://arclanguage.org/item?id=19464](http://arclanguage.org/item?id=19464)

