
Why the superfluous syntax? I think remembering syntax like this is orthogonal to the goal of being easy to learn.

The syntax is basically Ruby + Python + Haskell. Each of those languages has a lighter, more intuitive and memorable syntax.

Why would the syntax be:

    data BinTree:
      | leaf
      | node(value, left, right)
Instead of just

    data BinTree = leaf | node(value, left, right)
The whole colon thing in Python is a mistake; it should never have been in Python, and it definitely should not be repeated in other languages.

I would encourage you to view the syntax as "Python, but repaired", for the following reasons:

1. Pyret comes out of Shriram's group's expertise with pinning down exactly what Python and other dynamic scripting languages do right, and (mostly) do wrong. Check out their recent paper Python: The Full Monty: A Tested Semantics for the Python Programming Language http://cs.brown.edu/~sk/Publications/Papers/Published/pmmwpl... for more context.

2. Ruby is far and away not the originator of 'end' to end blocks -- this comes from Pascal and is in other languages that have nothing to do with Ruby, like Lua.

3. Languages that aren't Haskell have ADTs too; it happens that they've lifted an ML-style syntax for defining them.

4. Python is widely used pedagogically, so for better or for worse, students are already being familiarized with the colon, which I agree is a bit anomalous.

> Ruby is far and away not the originator of 'end' to end blocks -- this comes from Pascal and is in other languages that have nothing to do with Ruby, like Lua.

The use of 'end' comes from Algol, not Pascal.

Thanks for the link to that paper. Super interesting to see actual semantics for Python come out. It would be nice if this leads to better tooling for analysis of the language. Currently, it is pretty hard to bootstrap anything for the language in comparison with what you can do for the JVM.

You'd first have to fix Python's broken notion of scope. If you want to read only one, simple, page, read the last page (appendix 2 on variable renaming). Pyret was created to not have such problems by design.

You can actually use ";" as an alias for "end" in Pyret, so you could write:

    data BinTree: | leaf | node(value, left, right);
We use "end" or ";" in order to have unambiguous delimiters for the ends of syntactic forms and avoid needing to depend on whitespace (we added some discussion on pyret.org about our philosophy on indentation and why we don't want to depend on whitespace).

Making that leading pipe optional is a good idea. I made an issue for it, we'll think about if it'll break or confuse anything and add it if it doesn't:


I realize it's a trade-off -- but I think having ";" as an alias for "end" (or vice-versa) is a pretty bad idea. I personally prefer Python's indentation-for-blocks syntax, but I understand why you want semantics to be decoupled from indentation. But when using explicit end-markers, I'd prefer to match them to the start, maybe even introducing some verbosity, like: end-case, end-if etc -- maybe taking it further and allowing/demanding naming of blocks:

      4 + 5 is 9
      1 / 3 is 2 / 6
      9 - 3 is 6
      5 > 4 is true
Then becomes, not wrapped in end-check, but:

      4 + 5 is 9
      1 / 3 is 2 / 6
      9 - 3 is 6
      5 > 4 is true
Or something similar. This makes mismatched "end"s (either typos or artifacts from cut'n'paste coding) explicit errors that are easy to spot and identify.

It would add a lot of verbosity, of course. As for ";"/"end", consider (the presumably valid):

      4 + 5 is 9
      1 / 3 is 2 / 6
      9 - 3 is 6
      5 > 4 is true
That trailing ";" is going to trip someone up. Also consider (cut'n'paste-with-quick-edit):

      4 + 5 is 9
      1 / 3 is 2 / 6;
      9 - 3 is 6
      5 > 4 is true;
Is this valid?

I have wrestled with "uniform closing" vs "non-uniform closing" designs for ages. For those not clear on what I mean here's an example: Lispy syntaxes have a uniform close (it's always ")"), whereas XML syntaxes don't (you have to put the name of the opening tag in the closing token).
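To make the trade-off concrete, here is a rough Python sketch of a closer checker for a toy token stream. It is not Pyret's parser: the opener set `OPENERS`, the `end-<kwd>` spelling, and the error wording are all assumptions made up for illustration.

```python
# Hypothetical set of block-opening keywords for a toy language.
OPENERS = {"fun", "check", "cases", "if"}

def check_closers(tokens):
    """Return a list of error messages for a flat list of tokens.

    A bare "end" is a uniform closer: it closes whatever block is innermost.
    "end-<kwd>" is a named closer: it must match the innermost opener.
    Tokens that are neither openers nor closers are ignored.
    """
    stack = []   # (keyword, position) for every block still open
    errors = []
    for pos, tok in enumerate(tokens):
        if tok in OPENERS:
            stack.append((tok, pos))
        elif tok == "end":
            if stack:
                stack.pop()
            else:
                errors.append(f"token {pos}: 'end' with nothing open")
        elif tok.startswith("end-"):
            if not stack:
                errors.append(f"token {pos}: '{tok}' with nothing open")
                continue
            opened, opened_pos = stack.pop()
            if opened != tok[len("end-"):]:
                errors.append(
                    f"token {pos}: '{tok}' closes '{opened}' "
                    f"opened at token {opened_pos}"
                )
    # Anything left on the stack was never closed.
    for opened, opened_pos in stack:
        errors.append(f"token {opened_pos}: '{opened}' never closed")
    return errors

# The closer of the inner "check" block was lost in a cut'n'paste.
print(check_closers(["fun", "check", "end"]))      # uniform closers
print(check_closers(["fun", "check", "end-fun"]))  # named closers
```

With uniform closers the only complaint is about the outer `fun`, far from the actual mistake; with named closers the first complaint points at the closer that does not match.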

So e12e is making an argument for XML-y over Lispy. However, choosing good words is really, really hard. If you pick a different word for each construct, there's a needless mental burden; if you pick a uniform strategy ("end-<kwd>") you potentially get less readable code. And either way it's far more verbose, as you point out.

Some of our target audiences are middle- and high-school students, some of whom have weak typing skills (we know from numerous workshops we run). Increasing even the raw number of characters is a real problem for them.

My preference is to use ; for one-liners, but end where you have a multi-clause entity and you want to be able to clearly tell where it ends. If we find that there is general agreement on this, we can make it a context-sensitive check, à la indentation. Then, a program generated by a tool can do something consistent and ignore all these checks, whereas human-facing environments would enforce them (by checking or correcting).

As for your examples, to my Pyretical eye, the third of your examples (with the dangling ";") looks just wrong. However, your fourth example is syntactically incorrect.

> Some of our target audiences are middle- and high-school students, some of whom have weak typing skills (we know from numerous workshops we run). Increasing even the raw number of characters is a real problem for them.

I certainly understand this argument, and I generally favour simple syntax over smart tools -- but how much of a difference would this make in an editor/IDE that automatically inserts the closing-tag? (you type case, the editor appends esac (but below the point you're typing)):

    1:   |   #your cursor at |

    2: case|
    3: case:
         |     # you hit enter or something, ready to fill in
       esac    # editor has closed block/statement
I imagine typing skills aren't much of an issue when editing/copying text -- as opposed to typing in new code?

An express design goal is to not bake in assumptions about an editor. Programmers really like their editing tools. I've lived through the waves from vi to Emacs to vim to Sublime Text to what-have-you. We'd really, really like to make the language pleasing to work with without depending on an editor (indeed, each of us seems -- perhaps by accident -- to be using a different editor for Pyret, which may be influencing our decision).

When editing/copying, it's not typing skills but editing skills, which are arguably subtler and even harder.

Your point about programmers having favourite editors (which I consider valid) contradicts the point you made about an important part of your audience being middle/high schoolers, who I don't think have such strong preferences towards one editor or another. It seems to me that you are making compromises to cater to the needs of an audience which is much broader than the one you have identified for the language.

It would be a shame to compromise the design of an interesting language just because the target audience is not clearly defined.

I have many target audiences. I mentioned middle- and high-schoolers in response to various other questions, but I don't anticipate them being the largest audience. I expect the biggest audiences to be introductory college-level education and second/third-year courses on "paradigms" (blecch) that want a flexible language to illustrate several concepts. Finally, I am also consciously building a language and book to appeal to the working programmer who has not learned to think of computation as primarily producing values, and would like to quickly get up to speed with that idea.

All of these are educational audiences. I don't see how one can define the audience more clearly than that, given that I have no control over any of these audiences.

Having ; on a new line is valid, but having both ; and end on a block is invalid.

I think the colon isn't a mistake; it clearly delineates an indented block, and I think it's a great marker that indented languages might want to standardize on. Perhaps the syntax issue is that it has both a colon and an "end" marker? The former indicates indentation-based syntax, the latter is often used in whitespace-agnostic contexts. The result of using both is confusion.

Pyret doesn't actually have an indentation-based format (so far as I know); it just happens to look like Python.

That's correct. There is a preferred indentation (enforced by the Emacs mode and our online editor built on CodeMirror), but whitespace is only needed to separate things - the actual amount of whitespace never matters.

So, for example, you can move chunks of code from one place to another, select it all, and re-indent - something that you can't do (in general) in Python. It also makes it easier to programmatically generate code.

When you have both braces/keywords and whitespace, it seems to me that you're essentially storing the same information about the program structure in two places, of which one is read by the machine and the other is read by humans. The "re-indent" operation reads the master copy and updates the other one. (And if you forget to re-indent, your human readers will be interpreting an outdated copy and be very confused.)
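A minimal Python sketch of such a "re-indent" operation, for a toy language only: it assumes that a line ending in `:` opens a block and that a line consisting of just `end` closes one. Pyret's real grammar (pipes, `;` closers, one-liners) is not handled, and the function name `reindent` is made up here.

```python
def reindent(source, width=2):
    """Recompute indentation purely from the ':' / 'end' delimiters.

    The existing leading whitespace is thrown away, which is the point:
    the delimiters are the master copy, the indentation is derived.
    """
    depth = 0
    out = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            out.append("")
            continue
        if line == "end":
            # A closer belongs at the indentation of its opener.
            depth = max(depth - 1, 0)
        out.append(" " * (width * depth) + line)
        if line.endswith(":"):
            depth += 1
    return "\n".join(out)

messy = "fun f(x):\n      y = x * x\nx - y\n   end"
print(reindent(messy))
```

Running it on the badly indented `messy` string gives back the body indented by two spaces between `fun f(x):` and `end`, whatever the input whitespace was.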

In Python, on the other hand, the information only exists in one place: the indentation. This is read by both machines and humans. There is no "re-indent" operation as the indentation is the only place the information exists, so there is nothing to sync.

I have heard people bring up programmatically generated code before, but as another commenter put it "where Ruby usually denotes code blocks with a `do` and an `end`, Python denotes code blocks with a `:` and an outdent" – so is that really a problem? Isn't it just a slightly different way of encoding the same information?

(As to moving chunks of code in the editor: If the editor has a command to re-indent a selected block, it probably also has a command to shift it left or right.)

But if indentation matters, and it's present, then why do you need the colon?

Well, the colon can be used without indentation if the block that follows is just a single line:

  def embiggen(x): return x * 2
  for i in range(10): print(embiggen(i))
Also, it just makes things read more naturally to us humans, I think.

I think you mean that it's in opposition to their goal of being easy to learn, not that it's orthogonal, as that would mean "it doesn't interfere".

Anyway, it seems to me that the only real difference between your code examples is that the second one lacks an ending designator and uses an equals-sign instead of a colon. However, in order to make the lack of an ending designator work, you need a more complex parser (it needs to infer block-ends from indentation or some other context). Leaving off ending designators also increases mental overhead for the user, as they must keep block-end inference rules in mind when writing code. Using colons versus using equals-signs is simply a matter of taste (not weight, as you seem to claim).

That said, I do see some waste in their datatype definition syntax. First, I'd prefer curly braces over the colon/end pairs that they went with, as that saves a few characters. Second, newlines are a fine separator, so why also require pipes? This requires the user to type three characters (newline, pipe, space after pipe) when just one would have done fine.

Newlines are not a fine separator; what if you have a definition that you intend to span multiple lines? In any case, the pipes go all the way back to EBNF, since what you are really doing when you specify an ADT is specifying a grammar of types.
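Read as a grammar, the BinTree definition from the top of the thread is roughly `BinTree ::= leaf | node(value, BinTree, BinTree)`. A hedged Python rendering of the same idea, with one class per alternative; the class names, the `Any` type for `value`, and the `size` helper are assumptions for illustration, not anything Pyret generates:

```python
from dataclasses import dataclass
from typing import Any, Union

@dataclass(frozen=True)
class Leaf:
    """The `leaf` alternative: carries no data."""

@dataclass(frozen=True)
class Node:
    """The `node(value, left, right)` alternative."""
    value: Any
    left: "BinTree"
    right: "BinTree"

# The pipe in the data definition becomes a union of the alternatives.
BinTree = Union[Leaf, Node]

def size(tree: BinTree) -> int:
    """Count the nodes, with one case per alternative of the grammar."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + size(tree.left) + size(tree.right)

tree = Node(1, Leaf(), Node(2, Leaf(), Leaf()))
print(size(tree))
```

Each function over the type ends up with one branch per pipe-separated alternative, which is why the pipe reads naturally as "or".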

I admit to also being a brace weenie; I would very much prefer if all the languages I had to use had them. However, Pyret exists in a tradition of many successful, braceless languages like Python, ML, and Lua.

I feel that the argument for the need for multiline definitions here is a bit weak. You are already putting each definition on its own line, so the definitions are actually fairly close to how their actual usage will look. I feel that if you really need them to be all that long, you are already in weird-style territory, so it isn't such a bad idea to just say "deal with wide files". Overall, it improves ergonomics for the vast majority of use cases, with the only cost being a possible aesthetic annoyance for users who are already writing aesthetically-annoying code.

At the other end are definitions of which several fit on a single line:

    data Color = Red | Black

Lines really aren't much of a thing: they are a single keypress, which generates a newline character and some autogenerated indentation from your editor. There's no real difficulty in breaking things up across two lines versus typing any other extra character (this touches on another issue, which is that I am of the opinion that the concept of code brevity estimation by line count comparison is fundamentally flawed, as code with fewer wider lines is often just as cumbersome to type as code with more narrower lines).

That said, please consider the following:

data Color = Red | Black

data Color { Red; Black }

data Color = Red | Black | Green

data Color { Red; Black; Green }

I think the brace + semicolon-newline style actually compares pretty favorably on single-line width (it wins out on line width to an increasing degree as the line gets longer, which is important). However, it is visually more complex, which matters a lot for shorter lines. For this reason, I think that allowing both syntaxes would be ideal. The pipe-based syntax could be encouraged for single lines (newline-termination would sidestep block-end inference issues), with the curly brace based syntax being encouraged for multi-line definitions. There is a slight disadvantage in that now a user would have to know both syntaxes, of course.


Of course, since Pyret's syntax requires that ADT parameters be specified in parentheses, you can actually omit the pipes in single-line mode, too, and opt to use space-juxtaposition.

I can't respond directly to e12e's comment ("Please don't use equal for assignment" -- https://news.ycombinator.com/item?id=6704229) so I'll slightly abuse responding by doing so at this level.

We don't EVER use = for assignment. For us, = is binding. If you write

  fun f(x):
    y = x * x
    y = x + 2
    x - y
Pyret will say

  I'm confused: y is defined twice
and point to the two bindings.

The goal here is to make the common case work fine, where you create a bunch of distinct local bindings; but when you try to mutate, you have to do it explicitly:

  fun f(x):
    var y = x * x
    y := x + 2
    x - y
    f(10) is -2 # the test passes
The first line inside f says "y is _currently_ this, but it's variable, so look out!", and subsequent mutations use := to change y. Unlike JavaScript (say), mutating a variable that hasn't previously been defined is an error, rather than quietly adding the variable to the set of defined names.
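For readers who think better in code, here is a small Python model of those rules. It is only a sketch, not how Pyret is implemented: the `Scope` class, its method names, and the error messages are invented here, and rejecting `:=` on a name that was not declared with `var` is an assumption (the thread only says that rebinding and mutating an undefined name are errors).

```python
class Scope:
    """Toy model of one scope with bindings (=) and assignments (:=)."""

    def __init__(self):
        self._values = {}
        self._mutable = set()

    def bind(self, name, value, var=False):
        """Model `name = value`, or `var name = value` when var=True."""
        if name in self._values:
            raise NameError(f"{name} is defined twice")
        self._values[name] = value
        if var:
            self._mutable.add(name)

    def assign(self, name, value):
        """Model `name := value`."""
        if name not in self._values:
            # Unlike JavaScript, this does not quietly create the name.
            raise NameError(f"{name} is not defined")
        if name not in self._mutable:
            # Assumption: only `var` bindings may be mutated.
            raise TypeError(f"{name} was not declared with var")
        self._values[name] = value

    def lookup(self, name):
        return self._values[name]

def f(x):
    scope = Scope()
    scope.bind("y", x * x, var=True)  # var y = x * x
    scope.assign("y", x + 2)          # y := x + 2
    return x - scope.lookup("y")      # x - y

print(f(10))  # -2, matching the test in the Pyret example
```

Calling `bind` twice with the same name raises, which corresponds to the "y is defined twice" complaint in the first example.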

Thank you for distinguishing bindings and assignments. Please consider `var y := x * x` for consistency (note the ':=').

We considered it and decided against it for a few reasons.

- We want := to mean "change". The initial binding is not a change. This is precisely the "bindings and assignments" distinction you're talking about.

- A variable may in fact not be mutated.

- A small point is that we want to localize editing when making a change.

So, the way I read

  var y = x * x
is, "y is currently bound to x * x. However, y is a variable, so this binding is not guaranteed to last. Be careful about assuming you know the value of y." If you then see

  y := x * z
it reads as "y's value changes to that of x * z".

So: "=" means "introduces a name with this value"; "var" means "but this binding may change"; ":=" means "and yup, it really _did_ change".

From the very beginning we have debated whether or not to make the initial stick optional. We wanted at least a semester of code to review before we made a decision. I personally lean toward making it optional and just have to persuade the others (-:. With that, you would be able to write

data Color: Red | Black | Green;

We did also initially discuss using = instead of : in places where we were defining things (functions, data). That's still not completely out of the question. That would get you to

data Color = Red | Green | Black;

We need a closing delimiter to avoid creating huge ambiguities in the grammar.

Please don't use equal for assignment, unless you do something along the lines of prolog etc:

    (a, 2a) = (1, b)
    > b == 2
