
Urbit language tutorial, chapter 0 - urbit
http://urbit.org/docs/dev/hoon/tutorial/0-nouns
======
MichaelGG
Really should lead with an end example then deconstruct how it works. But I
get the feeling they are trying to be intentionally obtuse, either to show off
cleverness or as performance art. Perhaps that's a bit cynical.

~~~
gclaramunt
>But I get the feeling they are trying to be intentionally obtuse, either to
show off cleverness or as performance art

Same here. I really don't see the point of renaming well known and defined
concepts

~~~
pcmonk
We're not trying to be intentionally obtuse. In general, we try to only create
new names when the old names are for a sufficiently different concept. Perhaps
we go slightly too far in doing this, but much of the "the terms are alien"
feeling comes from the concepts actually being different than you're used to.

This is the start of a bottom-up style tutorial, but if you're interested in a
top-down tutorial, may I suggest this one from one of our open-source
contributors:
[http://hidduc-posmeg.urbit.org/home/pub/hoon-intro/](http://hidduc-posmeg.urbit.org/home/pub/hoon-intro/)

~~~
cwyers
> In general, we try to only create new names when the old names are for a
> sufficiently different concept.

You renamed the godforsaken PUNCTUATION, man. What exactly is the different
concept behind an equals sign in Urbit that you have to call it tis? Or the
colon, what conceptually did you change about the colon that it's got to be a
col or the comma that it's a com?

~~~
urbit
How many times a day do you want to say "percent equals" or "tilde ampersand,"
rather than "centis" or "sigpam"? Worse, you may not spend a lot of time
reading code out loud, but when you read your brain still thinks the sounds.

Of course, you're free to prefer keywords to digraphs. But sometimes I worry
that there's a reason the 787 looks just like the 707 [1], and it's not at all
about airplanes...

[1]
[http://idlewords.com/talks/web_design_first_100_years.htm](http://idlewords.com/talks/web_design_first_100_years.htm)

~~~
JoshTriplett
> How many times a day do you want to say "percent equals" or "tilde
> ampersand,"

Zero if I can help it. Even if I'm talking about code aloud, I normally do so
semantically, saying what the code is _doing_, because people are more than
capable of reading the code itself. (Same premise as commenting code.) If I'm
actually reading the exact symbols aloud, a rare occurrence, then I'll say
things like "mod equal" or "mod eq", "tilde and", etc. If I'm reading
"#include <stdio.h>", I'll say "include standard I O", not "pound include
space angle s t d i o dot h angle newline".

> Worse, you may not spend a lot of time reading code out loud, but when you
> read your brain still thinks the sounds.

No, it absolutely doesn't. Learning to _avoid_ reading-aloud internally is one
of the first things taught in speed-reading of prose, and the same concept
applies to programming.

> Of course, you're free to prefer keywords to digraphs. But sometimes I worry
> that there's a reason the 787 looks just like the 707 [1], and it's not at
> all about airplanes...

That may be, but what you've done here is on par with APL saying "you need a
completely non-standard keyboard and character set to program this language",
but worse.

If you want to introduce new names for symbols, that's a separate exercise
from trying to make a new programming language; the latter is quite hard
enough.

~~~
urbit
But _worse_?

I assure you that we're not shipping Urbit with non-standard keyboards printed
with "com" rather than ",". If these syllables rub you the wrong way, you're
perfectly free to pronounce them any way you like.

Most people aren't speed-readers. I am, but only because I learned at an early
age. I don't know anyone who speed-reads code in the sense you mean.

~~~
eru
The grandparent post's first argument about semantic reading was much more
important than the speed reading bit.

~~~
urbit
This works fine for C, which is a keyword language.

What you'll find if you look at Hoon is about 100 combinator runes, which are
digraphs like "%=" ("centis") or "|-" ("barhep"). (There is of course a power-
law usage distribution -- a few common runes dominate.)

If we had to invent and remember semantic names for each of these runes, that
binding would really be a lot of work to remember. We could say "mutate"
instead of "centis" and "loop" instead of "barhep". Remembering these bindings
would be quite a lot of work.

Of course, we could use reserved words directly; you can experiment with
translating Hoon into reserved words. (The earliest experiments used keywords
with a sigil, like ":loop".) To my eye the keywords look hideously verbose and
are easily confused with symbols.

You also lose the ability to classify, say, "|=" and "|-", as closely related
runes. This is especially useful with unusual, rarely used runes; what is
"|/"? You might not know, but you know that it starts with "|", so it makes a
core.
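
The naming scheme described in this subthread can be sketched in a few lines of Python. This is only an illustration: the table holds just the glyph syllables quoted in this thread (the real Hoon table is larger), and `rune_name` and `rune_family` are made-up helper names, not part of any Urbit tooling.

```python
# Glyph syllables attested in this thread only; an incomplete table.
GLYPH_NAMES = {
    "=": "tis",
    ":": "col",
    ",": "com",
    "%": "cen",
    "~": "sig",
    "&": "pam",
    "|": "bar",
    "-": "hep",
}


def rune_name(rune: str) -> str:
    """Pronounce a rune by concatenating the syllables of its glyphs."""
    return "".join(GLYPH_NAMES[glyph] for glyph in rune)


def rune_family(rune: str) -> str:
    """Classify a rune by its first glyph, as the comment above describes."""
    return GLYPH_NAMES[rune[0]]


print(rune_name("%="))  # centis
print(rune_name("|-"))  # barhep
print(rune_name("~&"))  # sigpam
print(rune_family("|=") == rune_family("|-"))  # True: both are "bar" runes
```

The point of the sketch is that the spoken name is derived mechanically from the glyphs, so there is one small table to learn rather than one name per rune.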

~~~
JoshTriplett
> If we had to invent and remember semantic names for each of these runes,
> that binding would really be a lot of work to remember.

So is remembering what all the runes do; if the names and the functions
relate, they're both easier to remember.

~~~
urbit
The runes (a) help you remember the combinators, and (b) are two characters
long.

To get to reasonably precise, meaningful keyword bindings for this set of
combinators in particular, perhaps you'd be replacing ":-", ":_", ":+", ":^",
":*", and ":~" with ":pair", ":reverse-pair", ":triple", ":quadruple",
":tuple", and ":list" respectively. This is a pretty benign and easy case, and
you may have to trust me on what kind of hash this search-replace alone would
turn a Hoon file into. Then again, I suppose it all looks like hash to you.

Language design is hard; you can't please everyone. Probably the best-case
scenario is that you please some people, and the rest are dragged kicking and
screaming. And that's a best case.
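
For anyone who wants to try the search-replace described above, here is a rough Python sketch. The rune-to-keyword table is taken from the comment; the `expand_runes` helper and the sample line are hypothetical, and the sample is a made-up token sequence rather than code from a real Hoon file.

```python
import re

# Keyword spellings proposed in the comment above for six ":" runes.
RUNE_KEYWORDS = {
    ":-": ":pair",
    ":_": ":reverse-pair",
    ":+": ":triple",
    ":^": ":quadruple",
    ":*": ":tuple",
    ":~": ":list",
}


def expand_runes(line: str) -> str:
    """Replace whole-token runes with keywords, leaving whitespace untouched."""
    return re.sub(
        r"\S+",
        lambda match: RUNE_KEYWORDS.get(match.group(0), match.group(0)),
        line,
    )


sample = ":-  a  :+  b  c  d"  # hypothetical input, for illustration only
print(expand_runes(sample))    # :pair  a  :triple  b  c  d
```

Matching whole whitespace-delimited tokens keeps the sketch from rewriting a rune that happens to appear inside a longer token; a real translator would need an actual parser.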

~~~
JoshTriplett
No, I actually think that having combinators like that makes sense; it's a
perfectly reasonable (and very APLish) design choice to Huffman-code those
kinds of operations as combinators. I've worked with Haskell code that makes
extensive use of symbolic combinators, and once you get used to a given set,
code using them can become much _more_ readable. And using symbol pairs where
related operations have related symbols makes even more sense.

I just don't think it makes sense to invent fanciful names for them based
entirely on their symbology, rather than their function.

It's your language; you can call things whatever you want. But I've seen many
languages that I would like to succeed fail or grow very slowly because they
failed to engage well with prospective users. And when you already have a
nearly-asymptotic learning curve, why add _more_ to it?

~~~
urbit
We're at a pretty shallow level of disagreement here, because you'd just add
descriptive names to the combinator definitions. I'm not sure this would
improve the learning curve much if at all, but it couldn't hurt much either.

As I always say, there's a difference between real and apparent learning
curves. I don't think your perception of the apparent learning curve is wrong.
But some things are easier than they look; one of those things is remembering
associations. I've seen a good number of people walk the real learning curve
for this language, and it's not that steep. Hopefully reality also gets a
vote.

------
ajkjk
I'm finding this tutorial really hard to follow. It doesn't track my mental
state very well; each successive sentence that qualifies an idea doesn't
follow from the last one to me.

I'm struggling to describe this phenomenon exactly, but I want to try because
I'm interested in quantifying it. So here's an attempt:

When I'm reading a tutorial to a new idea, a qualification needs to follow
naturally. If you tell me "this is an A", and then follow it with "note: it's
not a B", that qualification only works if I would have thought, in my head
"oh, is it also a B?" after the first sentence. Otherwise, I'm left in a tough
spot: I _believe_ you that it's an A, but I don't trust that I know what an A
is, because I'm apparently supposed to have realized that that suggested it
could have been a B as well - because you just jumped in and pointed out that
it wasn't!

So such a qualification is not just a clarification; it's also narrowing down
the possible meanings of A. But I didn't know about all those meanings,
because all I have is the word 'A' right now, and I don't understand the
complexity of that yet.

Here's a specific example:

"A noun is an atom or a cell. An atom is any unsigned integer. A cell is an
ordered pair of nouns.

The noun is an intentionally boring data model. Nouns don't have cycles
(although a noun implementation should take advantage of acyclic graph
structure)."

As of sentence #3, it looks like a noun can contain cycles. I mean, it can be
a cell, and a cell can contain nouns, so that seems like a cycle. Then, in
sentence #5, I'm told they don't have cycles. What? Now I don't know what to
believe. I thought I understood nouns, but I don't anymore. Then, in the
parens, even more concepts are thrown in. Now we're talking implementations -
but I don't know what we're implementing. And we're taking advantage of acyclic
graph structures. Taking advantage? To do what? And the word 'although' shows
up. They 'don't' have cycles, 'although' we should take advantage of their not
having cycles? Either this is mis-worded or there's even more context I'm
missing.

.. anyway, that got wordy. Point is, almost every sentence in this article
confuses me further, instead of confusing me less. There are no 'ah-ha'
moments. It's just "huh?" after "huh?".

~~~
urbit
Thanks for the detailed feedback!

_As of sentence #3, it looks like a noun can contain cycles. I mean, it can
be a cell, and a cell can contain nouns, so that seems like a cycle. Then, in
sentence #5, I'm told they don't have cycles. What?_

If we used the word "tree" and "leaf" here instead of "noun," would that help?
"A tree is a leaf or a cell. A leaf is any unsigned integer. A cell is an
ordered pair of trees."

It doesn't strike me that anyone reading this definition, even not knowing
what a "tree" and a "leaf" are, would assume that a tree can have cycles. And
then, two sentences later, if we say that a tree doesn't have cycles, I'd hope
that that would settle the matter.

It's probably a mistake to bring implementation issues (like the fact that you
can share a pointer rather than copying a subtree, thus creating a DAG [1])
into a discussion of this level, though. All kinds of people have different
kinds of CS backgrounds...

[1]
[https://en.wikipedia.org/wiki/Directed_acyclic_graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph)
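
The pointer-sharing remark can be made concrete with a small Python sketch. It assumes nouns are modelled as Python ints (atoms) and 2-tuples (cells); `tree_cells` and `stored_cells` are hypothetical helpers, and nothing here reflects how the actual interpreter stores nouns.

```python
def tree_cells(noun) -> int:
    """Count cells in the abstract tree, visiting a shared subtree each time."""
    if isinstance(noun, int):
        return 0
    head, tail = noun
    return 1 + tree_cells(head) + tree_cells(tail)


def stored_cells(noun, seen=None) -> int:
    """Count distinct cell objects in memory, visiting each object once."""
    seen = set() if seen is None else seen
    if isinstance(noun, int) or id(noun) in seen:
        return 0
    seen.add(id(noun))
    head, tail = noun
    return 1 + stored_cells(head, seen) + stored_cells(tail, seen)


shared = (1, 2)
tree = (shared, shared)       # the same subtree is referenced twice

print(tree[0] is tree[1])     # True: one object, two references
print(tree_cells(tree))       # 3 cells in the tree as a value
print(stored_cells(tree))     # 2 cells actually stored
```

As a value the noun is still a plain tree; the sharing is visible only to an implementation that looks at object identity, which is why the stored structure is a DAG while the noun itself is not.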

~~~
ajkjk
So, on this noun/cell/thing..

As I'm reading it,

    noun := atom | cell
    atom := integer
    cell := (noun, noun)

So when you say "nouns don't have cycles", my problem is that I don't know
what you mean by "don't". Do you mean they're not able to have cycles? (Am I
unable to write A = (A, A)?) Or do you mean that you prohibit it? (That you
validate nouns to avoid cycles?) Or do you mean that the model prohibits
cycles? (It appears to allow for A = (A, A), so I don't see that.)

It helps to tell me it's a tree. It's okay to use the word 'noun' if that's
the word for it, but telling me "nouns make a tree, where the leaves are
called atoms and the non-leaves are called cells" - that's far more useful
than describing properties of trees without using the word tree.

When you mention the implementation details in "although a noun implementation
should take advantage of acyclic graph structure", do you mean to say
"therefore" instead of "although"? Because if nouns don't have cycles it makes
sense that you can take advantage of that. The word "although", however,
signals an exception, and the "although.." clause should say something that is
somehow in opposition to the first part of the sentence. The nod to
implementation details is probably unnecessary, but it would be harmless if not
for that confusing word "although" that makes me think I've misunderstood
something.

.. though, I'm really sleep-deprived today, so perhaps I am struggling to
understand things that I would otherwise be able to parse. I dunno.

~~~
urbit
"Nouns don't have cycles" in two senses.

Abstractly, a noun as defined is a tree and trees don't have cycles. So there!
:-)

Concretely, the Nock interpreter, which Hoon compiles itself to, has no way to
construct a "noun with cycles," like your A == [A A]. I sense that this is
what's really bugging you. :-)

Of course, Nock is written in C and uses a C data structure. When we
manipulate this data structure from C, we can certainly construct all the
degenerate "nouns" we like. This generally results in a quick trip to the noun
hospital.
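
A loose analogy in Python, for readers who want to see why "no way to construct" follows from immutability. This is a sketch under assumptions: the `Cell` class is hypothetical, atoms are modelled as plain ints, and none of it describes the real Nock interpreter.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Union


@dataclass(frozen=True)
class Cell:
    """An ordered pair of nouns; frozen, so it cannot be edited once built."""
    head: "Noun"
    tail: "Noun"


Noun = Union[int, Cell]  # atoms are modelled as non-negative ints

a = Cell(1, 2)
b = Cell(a, a)  # fine: both children already exist

# A = [A A] cannot be written: the name is unbound until Cell(...) returns.
try:
    knot = Cell(knot, knot)
    rebind_failed = False
except NameError:
    rebind_failed = True

# Tying the knot afterwards is blocked too, because the cell is frozen.
try:
    b.head = b
    mutate_failed = False
except FrozenInstanceError:
    mutate_failed = True

print(rebind_failed, mutate_failed)  # True True
```

A cell can only be built from nouns that already exist and can never be changed afterwards, so there is no moment at which a cell could be made to point back at itself. Python still lets you cheat with `object.__setattr__`, which plays roughly the role of poking at the data structure from C.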

I'm genuinely surprised at how few acyclic programming languages exist. True,
there are plenty of algorithms with faster asymptotic runtime given
cyclic/mutable data structures. There are lots of ways to manage this problem,
though.

And both for persistence and network transmission, acyclic structures are just
much easier to handle. You'll notice that all major network data models (XML,
JSON) are acyclic, as are SQL and NoSQL databases alike. Cyclic databases
exist [1] [2], but they're very much the exception.

And of course, acyclic data models mean you don't need a tracing garbage
collector. Are doubly linked lists, etc, really worth a tracing garbage
collector? Anyway, in a language/OS where transmitting and persisting state
are core features, acyclic data seems like a pretty easy choice.

[1]
[https://en.wikipedia.org/wiki/Network_model](https://en.wikipedia.org/wiki/Network_model)

[2]
[https://en.wikipedia.org/wiki/Object_database](https://en.wikipedia.org/wiki/Object_database)
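
The transmission point is easy to try with Python's standard `json` module. The sketch below assumes nouns are modelled as ints and 2-tuples; `from_json` is a hypothetical helper, and this is not how Urbit serializes anything.

```python
import json


def from_json(value):
    """Rebuild a noun (int or 2-tuple) from decoded JSON (int or list)."""
    if isinstance(value, int):
        return value
    head, tail = value
    return (from_json(head), from_json(tail))


noun = (1, (2, 3))
wire = json.dumps(noun)  # tuples are written as JSON arrays
assert from_json(json.loads(wire)) == noun

cyclic = [1]
cyclic.append(cyclic)    # a list that contains itself
try:
    json.dumps(cyclic)
    cycle_rejected = False
except ValueError:       # json refuses circular references by default
    cycle_rejected = True

print(wire, cycle_rejected)  # [1, [2, 3]] True
```

The acyclic value round-trips with a ten-line decoder, while the cyclic one cannot be written to the wire at all without inventing some extra convention for back-references.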

~~~
stcredzero
_Abstractly, a noun as defined is a tree and trees don't have cycles. So
there! :-)_

You shouldn't require the reader to make this inference. The 2nd order
implications are far less obvious to the reader than to the writer. A good
writer should never assume the reader will get it on the 1st mention. In fact,
I'd hazard a guess that most of the time, when you've made such inferences
while reading, an author has subtly primed you to do so beforehand.

------
openasocket
Two pieces of feedback:

First, at the end of this I'm still rather confused about what molds are for,
and it's implied that all will be explained in the next chapter. This is a
tutorial, not the next great American novel: do not end chapters on
cliffhangers! After finishing a chapter, the reader should be able to go back
and fully understand every concept introduced in the text. Either expand this
chapter to include a complete explanation of molds, or move molds into a
separate chapter.

Two, you include this example of a mold:

    ++  span
      $%  [%atom p=@tas]
          [%cell p=span q=span]
          [%cube p=* q=span]
      ==

This is chapter 0 of a tutorial on Hoon: do not just include Hoon syntax with
no explanation! I have some idea of what this function does from context
clues, but I have no way of knowing if my intuition is correct. This is why
pseudocode was created: use it! Here's my best guess of what that function
does:

    function span_mold(noun n)
      if n is of the form (p,q)
          return [%cell span_mold(p) span_mold(q)]
      else if n is of the form @a
          return [%atom %a]
      else if n is of the form %b
          return [%cube b span(b)]

Is that right? I have no clue!

~~~
urbit
It is right! You do have clue!

This is excellent feedback. Normally I'd agree with it. Ideally, this will be
the last cliffhanger in the tutorial...

------
SolarNet
Long story short, this quote:

> Lisp masters beware: Hoon [a b] is Lisp (a . b), Lisp (a b) is Hoon [a b ~].
> ~ means nil, with value zero. Lisp and Hoon are both pair-oriented languages
> down below, but Lisp has a layer of sugar that makes it look list-oriented.
> Hoon loves its "improper lists," ie, tuples.

along with the knowledge that the people writing this like really strange
names (tile, gate, vase, fish) and 2 part symbols (++ += =^) is all you need
to know.
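
For anyone who finds the quoted Lisp comparison easier to read as code, here is a small Python sketch. It assumes nouns are modelled as ints and 2-tuples and that a Hoon tuple such as [a b c] nests to the right as [a [b c]]; `lisp_list` and `hoon_tuple` are made-up helper names.

```python
NIL = 0  # per the quote: "~ means nil, with value zero"


def lisp_list(*items):
    """Lisp (a b c): right-nested pairs terminated by nil, i.e. Hoon [a b c ~]."""
    noun = NIL
    for item in reversed(items):
        noun = (item, noun)
    return noun


def hoon_tuple(first, second, *rest):
    """Hoon [a b c]: right-nested pairs with no terminator, i.e. Lisp (a b . c)."""
    items = (first, second, *rest)
    noun = items[-1]
    for item in reversed(items[:-1]):
        noun = (item, noun)
    return noun


print(hoon_tuple(1, 2))       # (1, 2)       Hoon [1 2], Lisp (1 . 2)
print(lisp_list(1, 2))        # (1, (2, 0))  Lisp (1 2), Hoon [1 2 ~]
print(hoon_tuple(1, 2, NIL))  # (1, (2, 0))  the same noun as the list above
```

The only difference between the two constructors is whether the last slot holds nil or the final element, which is the whole content of the "improper lists" remark.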

~~~
kazinator
> _42 and 0x2a are actually the same noun, because they're the same number.
> [...] But semantically, 42 has a decimal span and 0x2a hexadecimal, so they
> print differently._

A babbling load of twaddle which in and of itself wouldn't be a crime were the
word "semantically" not involved as an abuse victim.
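
Whatever one thinks of the wording, the behaviour the quoted passage describes is simple to sketch in Python: one value, with a separate tag that only affects printing. The `render` function and the tag strings are hypothetical stand-ins for the span, not Hoon's actual mechanism.

```python
def render(atom: int, span: str) -> str:
    """Print the same atom differently depending on a formatting tag."""
    if span == "decimal":
        return str(atom)
    if span == "hexadecimal":
        return hex(atom)
    raise ValueError(f"unknown span: {span}")


assert 42 == 0x2a                # one number, hence one noun
print(render(42, "decimal"))     # 42
print(render(42, "hexadecimal")) # 0x2a
```

In this reading the tag is metadata about how to present the atom; it never changes which atom it is.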

------
theseoafs
All of this can be neatly summarized as "a noun is either an unsigned integer
or a pair of two nouns". The authors apparently love to give things funny
names and weird syntax, but there's nothing here that Lisp didn't do 50 years
ago. Let's see what happens in chapter 1, though.

~~~
urbit
Yes! Although there are some things Lisp did 50 years ago that aren't here.
But this is less an improvement on Lisp than a tribute to Lisp.

Chapter 1 should remind you less of Lisp. Although I do expect to hear "that's
just caddadr"...

------
ThrustVectoring
The "navigate to top" links in the upper left kept appearing and disappearing
as I moved my mouse and scrolled around, and it's extraordinarily distracting
to me.

~~~
incepted
Ditto. I even moved my mouse cursor on the right side of the window, hoping it
would stop.

It didn't.

~~~
ThrustVectoring
I just remembered that you can right-click it in Chrome and delete it from the
DOM.

~~~
incepted
Keep forgetting that, thanks for the reminder.

------
xixixao
Tududum, anyone's got it cached (Google has some JS only)?

%exit

/~zod/home/~2015.10.17..20.34.57..3a55/arvo/eyre/:<[1.104 9].[1.106 38]>

[%bad-beam %home]

/~zod/home/~2015.10.17..20.34.57..3a55/arvo/eyre/:<[1.103 9].[1.106 38]>

/~zod/home/~2015.10.17..20.34.57..3a55/arvo/eyre/:<[1.102 9].[1.106 38]>

/~zod/home/~2015.10.17..20.34.57..3a55/arvo/eyre/:<[1.101 9].[1.106 38]>

/~zod/home/~2015.10.17..20.34.57..3a55/arvo/eyre/:<[1.100 9].[1.106 38]>

~~~
state
Sorry for the interruption. Everything is being served from an Urbit. We have
done okay under HN load in the past but may have just screwed up a hotfix.

For the time being, here's a paste of the markdown:
[http://pastebin.com/G3zMqZN8](http://pastebin.com/G3zMqZN8)

~~~
urbit
Back up.

------
stevegh
If you want to know what he is actually talking about, this may help:
[https://popehat.com/2013/12/06/nock-hoon-etc-for-non-
vulcans...](https://popehat.com/2013/12/06/nock-hoon-etc-for-non-vulcans-why-
urbit-matters/)

------
mcnamaratw
"FP for street programmers" sounds like a win.

Unfortunately I couldn't really understand anything else. My Lisp is rusty,
but dotted pairs aren't something I usually have so much trouble following.

Is there a tutorial that's mostly code examples and not so much text?

~~~
urbit
The next tutorial is a lot less wordy, and should hopefully shed a good amount
of light backward on this one.

------
powera
So basically this is a lot of syntax to add something resembling a type-system
on top of a fundamentally un-typed language?

~~~
urbit
Yeah, that's pretty accurate.

~~~
powera
Possibly a silly question: did you consider using the complex integers (either
as a + bi, or as (r, 2*pi/T)) instead of the natural numbers as the basis for
Nock?

~~~
urbit
Cells make you think of a+bi a little bit, but no.

(Actually I was very pleased when I realized that signed integers have no
place in a fundamental interpreter.)

------
kzrdude
It looks like urbit isn't really “public” yet; they claim to be in “semi-
closed alpha” still.

~~~
state
It's actually more open than it looks. The code
([https://github.com/urbit/urbit](https://github.com/urbit/urbit)) is
available to anyone interested. Anyone can build and run a comet.

~~~
RexM
Can you explain what a comet is, exactly? If I were to set up my own "comet"
would that mean that I'd have my own network, detached from the one the urbit
developers are a part of?

I'm interested in playing with urbit, but I'd also like to try it with some
people to talk to, and not on my own island.

~~~
state
A comet is an Urbit with a self-signed certificate. You're a part of the same
network that everyone else is on. Comets are one of 2^128 possible addresses,
so they're anonymous and disposable. For the time being it's possible to use a
comet to come and talk with everyone. We're also glad to answer questions over
email.

~~~
RexM
Thanks for explaining this. So invites are only for "planet" names?

I'm still having difficulties understanding the reasoning behind the different
levels, explained in the white paper.

I've got it running, thanks for letting me know that I could just compile it
and spin it up. I've been getting the newsletters for urbit and have been
wanting to play with it. I didn't realize I could do that without an invite.

------
gclaramunt
Curious: what's the underlying computational model for the language? The
untyped lambda calculus? A universal Turing machine? Or is there something new?

~~~
urbit
Nock: [http://urbit.org/docs/dev/nock](http://urbit.org/docs/dev/nock)

------
dragonbonheur
Some languages are so disconnected from programmer needs and real life
usefulness that they make me say "I'd rather learn assembler".

------
tel
How in the world do typeclasses work in Hoon?

~~~
loqi
I don't think Hoon's type system actually has a typeclass analogue. As far as
I could tell, what it has is something like generics, but in a world where
everything is ultimately a noun. IIRC, a previous iteration of the docs
explained that "wet gates" (generic functions) are actually required to
compile down to the same Nock when "instantiated" at a particular argument,
modulo dead code elimination. Didn't look much like ad-hoc polymorphism to me.

A good example is Hoon's maps[1]. They're parameterized on type, but I'm
pretty sure those types can't affect the runtime behavior by, say, specifying
their own hash or comparison function. Instead, the map implementation[2]
hardwires a couple of specific comparison functions[3] that effectively toss
the type information and work in terms of the raw underlying nouns.

It is kind of weird that every type carves out a subset of nouns, even
function types (or function "molds" or "spans" or whatever... I'll stop
calling them types when the Urbit people stop calling it a type system).
Hoon's C-flavor really shows when it makes the likes of strlen((char*)strlen)
expressible in a purely functional way.

[1]:
[https://github.com/urbit/urbit/blob/7186219/urb/zod/arvo/hoo...](https://github.com/urbit/urbit/blob/7186219/urb/zod/arvo/hoon.hoon#L421-L422)

[2]:
[https://github.com/urbit/urbit/blob/7186219/urb/zod/arvo/hoo...](https://github.com/urbit/urbit/blob/7186219/urb/zod/arvo/hoon.hoon#L2530-L2751)

[3]:
[https://github.com/urbit/urbit/blob/7186219/urb/zod/arvo/hoo...](https://github.com/urbit/urbit/blob/7186219/urb/zod/arvo/hoon.hoon#L1006-L1034)

