Pyret: A new programming language from the creators of Racket (pyret.org)
396 points by sergimansilla on Nov 9, 2013 | 283 comments

"""Pyret makes testing a natural part of the programming process. Functions can end in a where: clause that holds unit tests for the function. These assertions are checked dynamically."""

Fantastic idea! I'll keep that in mind; it should be fairly easy to extend Lisps or other AST-Macro enabled languages (Elixir, Julia, Python) with such functionality. I really like that. It makes it easy to work on a function and write its tests without switching contexts (files, workspaces, etc.)

It clutters the code.

Tests are a form of documentation, but too much in the code, like too many comments, obscures. Higher level unit tests (acceptance tests) can be quite long, especially if there's a lot of setup - unlike their example code. Literate programming tried embedded documentation, but didn't catch on (even with Knuth's backing). Embedding tests makes them easier to keep in sync, but tests are already kept in sync by (hopefully) failing when not. However, their idea of automatically running tests is interesting, since there's no infrastructure to set up, run, etc. (though you'd want to be able to disable them).

Nice for teaching.

I don't think the story of literate programming can be of any help predicting how well this idea here can work out.

Literate programming is not "embedding documentation". The main idea was to separate the order in which the code is read from the order in which the compiler sees it, and it embeds the code in the documentation, not vice versa. It was a very idiosyncratic thing; it's hard to imagine a team of programmers in a typical current commercial setting developing a nice LaTeX essay around the actual code of the next social-network web app. As the program grows larger, it also gets harder and harder to maintain the "story" around it.

That's not the "main idea". Originally, it was a part of the suggested way to accomplish the goal ... but the main idea is this:

    > Let us change our traditional attitude to the construction of programs: 
    > Instead of imagining that our main task is to instruct a computer what 
    > to do, let us concentrate rather on explaining to human beings what we 
    > want a computer to do.
    >                          — Knuth
If your program can be read more as an enlightening explanation of the job-to-be-done, and is more oriented toward the human reader of the code than the machine that will execute it, then you've succeeded.

Hopefully, it's clear how a program written in such a way could potentially be beneficial to a large team, trying to understand, modify, and improve it.

Right, but it's important to emphasize what stiff was talking about: that the structure of the narrative needs to be oriented towards the human instead of the machine, because so many people don't bother to do this and decide that literate programming just means having LaTeX or docbook comments in your code. That total inversion (the code is in the documentation, not the documentation is in the code) is important. (That said, I admit that most modern languages are much more flexible than C or Pascal with regard to the ordering of code.)

It's like saying the main idea in C was to allow writing programs. At this level of generality it is indistinguishable from tens of projects with similar goals, including also earlier ones.

It's also much less necessary with packages, where you can trivially change the order in which most of the code is read.

You don't have to put the tests right in the middle of code. The `where` form goes with the definition, but a `check` form can float freely. So if you prefer a <Lang>Unit style of testing, where your tests reside in a separate file, you can do that with `check` blocks just fine.

However, the `where` tests play into the type-inference story for the language!

I think that Python's doctests have shown this to be a greyer area than this. Not all tests can fit nicely next to function definitions, but a few well-chosen ones are both fantastic documentation and not too cluttery.

Doc tests are a poor solution precisely because they are difficult to edit. They solved a non-problem and made writing documentation AND tests more difficult.

Instead of writing tests in strings, the documentation tool should have been modified to render that code _in_ the documentation.

Doc tests are butt-ugly in several ways. Code shouldn't be written in strings, and they take twice as much vertical space as they need (which is a problem because monitors are typically used much wider than they are tall).

I agree with you in that managing real code in a place that gets less frequently executed is a bear, but I wanted to emphasize the value of having tests very near to the primary documentation for a function. That's a complete win, I feel.

Which it totally is! That is why all the functions I test are in pairs suitable for nose or py.test

    def test_foo():
        assert foo(21) == 42, "test code for foo"

    def foo(farb):
        return farb * 2
Tests should absolutely be next to the code. The test should make it into the documentation. Code is documentation and should make it into the generated documentation, not the first thing you see but it should be there along with commit history, etc.

If that ended up nicely formatted beneath the definition of `foo` in the documentation I would recommend it wholeheartedly. In practice it doesn't happen, sadly.

Thank you! I never had any idea this kind of thing existed in Python... I can already see some code that would benefit from this (it would be insane to put it everywhere, of course).

On the other hand, inline tests do make it easier to write tests while writing the function. Besides, many well-written programs already contain inline documentation (not comments) that can be a lot longer than the functions they describe.

Code folding in an IDE goes a long way toward making this bearable.

> Code folding in an IDE goes a long way toward making this bearable.

Code folding in IDEs is an indication that we are doing it wrong -- that is, we human beings are programming computers "wrong."

I'm not saying that code folding itself is bad. I'm saying that code folding shows how primitive our means of managing code is. It's as if the Mesopotamians had somehow invented computers, and, because of tradition, all code had to be written as cuneiform on wet clay tablets, then fired in ovens before being read. At least text files in directories are digital, but they are as static and behavior-less as clay tablets, and all of the important relationships therein are expressed as implicit correspondences which programmers have to keep track of in their heads.

Code Bubbles is a beacon in the direction we should go.


Code Bubbles were invented one floor above me (by another team in computer science at Brown University). So, we're not at all unaware of it. And yes, it's great stuff.

But we also had to make a conscious decision about IDEs, and we decided to be IDE agnostic -- IDEs should enhance the language but not be the only means of working with it, because programmers are really attached to their editors, and a language predicated on tearing people away from their editors will struggle. Also, having been part of DrRacket from version 0.0, I know how long it takes to make one that's any good.

Code Bubbles were invented one floor above me (by another team in computer science at Brown University).

I wrote a significant fraction of a Smalltalk class browser.

IDEs should enhance the language but not be the only means of working with it, because programmers are really attached to their editors, and a language predicated on tearing people away from their editors will struggle.

So, basically you made the pragmatic decision. What you say is true, but I suspect this tendency is ultimately holding us back.

I've had this argument for twenty years. I had my a-ha moment in grad school when, after I'd put a fair bit of work into a build system and showed it around, my advisor asked me, "What is a file?" And then I was enlightened. So I can play this song back with fairly good fidelity.

If I could get a reliable, efficient, cross-platform, extensible, quickly-ported-to-new-platforms environment that gave me rich editing, I'd grab it. I don't know of such a platform. So we made the "pragmatic" decision, yes. That's because this isn't the problem we're trying to solve.

We are actually trying to innovate in the programming environment space, and have been running with that experiment for the past semester internally at Brown. At some point, after it's been knocked around a bit more, we'll make it public. So we're not avoiding the environment space. But this is not a battle we're seeking to pick in that space.

I wish you luck with it, and hope we can build on the work of people like you.

While in some sense I agree with you, stuff like code bubbles is just an incremental step from folding. If you want better interfaces for your coding, you write bidirectional transformations between your representation and the text; the point being that the underlying representation does not particularly matter. Having underlying text, however, means you have a good fallback in case your tools do fail.

I would say the more potent point is that it should not matter at all if the tests are in the function or in a different directory: the IDE should be able to fold them in either way, and in some sense, IDEs already do have partial support for this.

> While in some sense I agree with you, stuff like code bubbles is just an incremental step from folding.

I said it was a "beacon in the direction we want to go," not a lighthouse at the destination!

> the point being that the underlying representation does not particularly matter. Having underlying text, however, means you have a good fallback in case your tools do fail.

Sure, do this, so long as the representation doesn't limit the objective capabilities of the environment and so long as it doesn't limit the ways tool makers think about code. Unfortunately, our current representations clearly do both.

Code bubbles looks cool, but I don't see the connection with your point about clay tablets. Any digital data is, by itself, flat and behavior-less, whether it's in the form of traditional files or some specially-designed backing store for something like Code Bubbles. I don't see any point in really trying to hide that base reality.

I am in favor of giving links between data a more prominent place in our storage systems. I envision a system where the data is mostly fine-grained trees, like sexprs, where hard-links between trees are first-class entities. But at the end of the day it's just an abstraction over a bunch of bytes.

> Any digital data is, by itself, flat and behavior-less, whether it's in the form of traditional files or some specially-designed backing store for something like Code Bubbles.

I know this is false. You can have a system where everything is an object and carries behavior around with it. We've had those for over 40 years.

> I am in favor of giving links between data a more prominent place in our storage systems.

A "more prominent place?" Isn't this a bit bass-ackwards? Aren't relationships in code where the primary value is?

The behavior carried around by those objects is ultimately in the form of flat digital data, interpreted by other, more general programs, executed on a CPU. Behavior, when it comes down to it, is a property of silicon, not data. This might seem like a useless distinction, but I don't think it is. I like to keep track of what things are underneath the abstractions.

I think you're completely backwards on this.

Code folding in an IDE is an example of a way in which text files aren't necessarily static and behavior-less. I see no reason to move away from an underlying representation as text.

I see no reason to move away from an underlying representation as text.

A big reason is that otherwise people won't move away from the paradigm of static text files. The best they'll do is text files with little gimmicks attached to them.

> Literate programming… didn't catch on

I would argue literate programming is experiencing a renaissance, thanks to docco:


I think literate coffeescript might be a better example of popularizing literate programming:


There also seems to be some traction for literate Haskell:


I experimented a bit with literate programming in an introductory programming course using Java. It's an interesting experience -- it's very easy to produce a very readable document that is still pretty poor Java code, as it becomes so easy to split out fragments and "procedures" rather than follow the more common Java class-oriented, object-oriented way of doing things.

Not really; I'd never heard of docco, and I have yet to see an RFP for a node.js project from my employer's enterprise customers.

Chances are much higher that you've seen the output of Docco or one of the multitude of imitations for other languages.

You would think we'd have editors by now that could, say, hide all the tests to declutter the code while you're reading it.

I think you are both right. D even takes it to the next logical step: present the unit tests as examples in the API documentation (like Doxygen, Javadoc). It means the API examples are kept in sync.


If the tests are pages long, that might be a sign that the function could be refactored. Also, the optional type annotations and refinements seem to decrease necessary testing quite a bit, compared to the typical scripting language. I would use this all the time.

And as other people mentioned, you can always use the check statement out of band, too.

> It clutters the code.

Modern IDEs/code editors can support code folding as well as projections fairly easily. I don't see a problem.

Making tests part of language syntax is an interesting idea.

Of course it's not equivalent, but Python's doctest module at least allows you to keep tests close to the code. Here's the Pyret example, ported to doctest:

  from functools import reduce  # a builtin in Python 2; needed on Python 3

  def sum(l):
      """
      >>> sum([])
      0
      >>> sum([1, 2, 3])
      6
      """
      return reduce(lambda x, y: x + y, l, 0)

  if __name__ == "__main__":
      import doctest; doctest.testmod()

> Making tests part of language syntax is an interesting idea.

It's been tried many times before and it has always failed. A few reasons:

- It clutters the code

- If you need simple tests, asserts suffice

- If you need more sophisticated tests, write functional tests, separately

The approach offered by Pyret (along with similar ones, such as design by contract) is a compromise that gives the worst of both worlds.

It's actually widely used in Racket already. Yesterday a lead programmer at a major financial institution told me that he liked that style so much, he'd incorporated it into his company's OCaml system, and they use it in their production systems (lots of high-volume trading). Hardly "failures".

Nevertheless, perhaps you could provide some pointers to the "many" projects that have tried it before?

Assertions are not tests. This is a fundamental misunderstanding of the difference.

Pyret's `check` blocks exist to write complex tests, separate tests from definitions, or write tests that cross multiple functional units. So they offer the power of regular testing frameworks.

Thus, `where` is the bonus, not a "compromise".

This might work better for pure functions. I can't imagine it working very well for complex, stateful code which requires mocking and resetting the state between tests.

Given that Pyret is functional, there's a good chance there will be an emphasis on simple, pure functions.

Beyond that, I'd agree completely. Doctests, for instance, are quite terrible when you can't fit all of the logic into one line, but then I think it's usually a sign of a well-designed function when it has a good enough interface to get a few pithy doctests in.

Though Pyret is not pure, our emphasis is heavily on "functional first", even for objects. We believe state should be used carefully and for good reasons. HtDP (www.htdp.org) has two whole chapters on state at the end, giving design recipes for their use, and I'm revising some of these in PAPL (papl.cs.brown.edu/2013/).

This is why a test block is a block. Writing inline tests for stateful functions in Racket was a bit painful because it is so heavily expression-oriented. We want to make this feel more natural.

(Another virtue of test blocks: you can write little local helper functions. There's an example of this on the Pyret home page.)
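For illustration, here is roughly what a block-style test for stateful code might look like, sketched in Python; the `Counter` class and `check_counter` helper are invented examples, not Pyret or any library:

```python
# A check-block-style test for stateful code: the test is an ordinary
# block, so it can define local helpers and reset state between scenarios.
class Counter:
    def __init__(self):
        self.n = 0

    def bump(self):
        self.n += 1
        return self.n

def check_counter():
    def fresh():            # local helper: a fresh Counter per scenario
        return Counter()

    c = fresh()
    assert c.bump() == 1
    assert c.bump() == 2
    assert fresh().bump() == 1   # state does not leak between scenarios

check_counter()
```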

They didn't say one way or the other whether the language was pure FP, but I don't think any of the examples featured mutability.

Pyret is not pure. We have mutation. But mutation has its own syntax, different from regular binding. Variables (as opposed to identifiers) are preceded by "var", and variable mutation uses ":=". [Similarly for fields.] This is to make clear to all forms of readers -- programmers, compilers, IDEs -- when something is mutable.

Have you thought about adding some kind of QuickCheck-ish model where test cases can be randomly generated for various types?

Yes, of course, we've thought about it. Equally importantly, we have a "satisfies" keyword for stating checks about properties. (See the example on the Pyret home page.) We use test oracles extensively in teaching; see these two assignments: http://cs.brown.edu/courses/cs019/2012/assignments/sortacle , http://cs.brown.edu/courses/cs019/2012/assignments/oracle .
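As a sketch of the property-testing idea in Python (all names below are invented for illustration; Pyret's actual `satisfies` is a language form, not a function):

```python
import random

def is_sorted(lst):
    """Property: every element is <= its successor."""
    return all(a <= b for a, b in zip(lst, lst[1:]))

def check_satisfies(fn, gen, prop, trials=200):
    """QuickCheck-flavored sketch: generate random inputs and check
    that fn's output satisfies a property."""
    for _ in range(trials):
        x = gen()
        assert prop(fn(x)), f"property failed on input {x!r}"

random.seed(0)  # reproducible runs
check_satisfies(
    sorted,
    lambda: [random.randint(-100, 100) for _ in range(random.randint(0, 20))],
    is_sorted)
```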

This exists in Haskell's doctest library

    -- | Documentation goes here
    --
    -- >>> 2 + 2 == 4
    -- True
    --
    -- prop> \lst -> reverse (reverse lst) == lst

> Yesterday a lead programmer at a major financial institution told me that he liked that style so much, he'd incorporated it into his company's OCaml system, and they use it in their production systems (lots of high-volume trading).

Hmm...Jane Street?

Racket has a very powerful contract system that you might want to check out. It also has support for higher-order contracts on functions (contracts that have delayed checks instead of being just assertions), and there is also a typed version of Racket (Typed Racket) built using the contract system.

Some of the Pyret designers are part of the Racket community, and know these ideas well. Where Pyret differs is that Racket has two different languages (for the purposes of this discussion), Racket and Typed Racket, and a program has to live in one or the other. Whereas Pyret takes a different "gradual" philosophy to type annotations, so a program doesn't have to move between the two languages.

I was just trying to point out that Pyret is not inventing the contract-checking and that Racket, the language Pyret owes a lot to, already has the contract system terhechte was looking for. Maybe I should have been more clear.

That said, I'm going to take the opportunity to ask a question: does Pyret's gradual typing cover parametric polymorphism and mutable data structures? Those are usually especially tough to do in fully gradual systems, so I'm curious what you guys have done about that.

Sorry, lost track of what you were responding to.

Re. your other question, yes, this is a wide-open problem, and we're trying to not get bogged down in research issues like this. We actually have one answer about what to do, presented in a paper we wrote some time ago (http://cs.brown.edu/~sk/Publications/Papers/Published/gmfk-r...), but we don't really want to do that. But because we're working on a (curious) type inference story, we may be able to carve up this problem in a very different way.

Sorry for the non-answer. You're right, and we're working on it.

Thanks, in a way the non-answer is more useful than a real answer for me :)

I'll read between the lines and just say, "Send us a link to your paper when it's done" (-:.

In fact, we'd really love it if gradual typing folks would tell us what set of primitives they miss in a language. They aren't constrained to what Racket or Python or whatever provides; there may be a primitive that makes sense that Pyret can add to enable space-efficient, semantically-meaningful contracting or gradual typing in the presence of such operations.

As one of those gradual typing folks, I'm not sure what you mean by "primitives we miss". The things I'd want for space efficiency are things like a contract language with decidable 'and', and for regular efficiency, a compiler that is good at seeing through wrappers.

Well, there's a preliminary answer!

Though I'm not enamored of the Typed Racket style of wrapping everything at the boundary (even as I appreciate why it's there). Not even hypothetical: we actually began developing Pyret in TR but, with great sadness, ended up having to move it out because of the enormous performance hits.

But neither of those is a "primitive". The first is a major restriction on contracts (and not a restriction Pyret enforces, AFAICT). The second isn't a language feature at all. How would having a new language help give me either of these?

And yes, soundness + interoperation + structural types = heavy performance cost. You can give up on one of those easily, but the combination can't be free.

But the best way to make TR programs not pay that cost is to not cross the boundary all the time, which it sounds like is what you were doing. Why was that necessary?

Not even remotely in just about any respect.

Really? The concept of a block that specifies a contract sounds quite similar to the design-by-contract capabilities of Eiffel. Care to enlighten us why not?

Perhaps you think I was referring to something other than the quote in the comment I replied to.

Contracts are fundamentally different from tests. Contracts are abstract specifications about values while tests are concrete specifications about values. The concrete vs abstract distinction is fundamental, as I hope is clear.

That's why Pyret has both tests and contracts (for now, in the form of refinements). This is a distinction that has been explored extensively in Racket also, all fully aware of Eiffel DBC.

Those refinements are nice. Pre- and post-conditions are illustrated; is there any way to describe an invariant?

Also, maybe you should have called them something else, since a lot of people will think of something completely different when they hear this word... (http://www.ruby-doc.org/core-2.0.0/doc/syntax/refinements_rd...)

re: refinements nomenclature...

Rebol got there first though! - http://www.rebol.com/r3/docs/datatypes/refinement.html

Matz is known to like Rebol - https://twitter.com/matz_translated/status/25061436079433318...

Eiffel contracts are relatively primitive compared to the more modern Racket and Pyret contracts. Eiffel contracts are basically limited to assertions and don't support contracts on higher order function arguments, refinements, etc.

There is also the whole dynamic typing thing. Eiffel is still fully statically typed.

None of which changes what I wrote earlier.

In terms of my opinion of Racket/Pyret, though, there's been a number of Ruby projects to do similar stuff to this over the years, since it's trivial to add (just add a class method that redefines a method and wraps it in code to execute the contracts), and a lot of Ruby newbies try to find ways of adding back the restrictions they've lost when moving from statically typed languages.
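That wrapping trick is just as easy to sketch in Python; `contract` and `square_root` below are invented names, illustrating the bolt-on approach being described rather than any particular library:

```python
import functools

def contract(pre=None, post=None):
    """Bolt-on design by contract: redefine a function by wrapping it
    in code that executes the contracts. `pre` checks the arguments,
    `post` checks the return value. Invented illustration."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapped(*args, **kwargs):
            if pre is not None:
                assert pre(*args, **kwargs), f"precondition failed: {args}"
            result = fn(*args, **kwargs)
            if post is not None:
                assert post(result), f"postcondition failed: {result!r}"
            return result
        return wrapped
    return decorate

@contract(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def square_root(x):
    return x ** 0.5
```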

Invariably they've ended up dying or not getting much interest, largely because a large part of the appeal of dynamic languages for a lot of people is that the code reads simple. In the Ruby world at least, aesthetics is a big deal. And putting the tests inline destroys that aesthetic and makes it harder to read the code.

Especially because tests tends to be trite and repetitive and contain a lot of stuff that is obvious to a human when you have the code right in front of you.

Especially when they go beyond documenting contracts that are intended to bind subclasses to an interface.

That code readability is important.

I have a prototype for my own Eiffel inspired language with DbC support, and my own experience with it was that as much as I love the idea, even Eiffel level contracts are bordering on being too verbose, for these reasons.

Pull them out into separate files, and they're a lot more acceptable, but then what is the difference from any other unit testing framework?

This "static" vs "dynamic" distinction is largely artificial. Every program has parts that make sense to think of as "static" and parts that will be very "dynamic", and the ratio of these two differs from program to program (and even within a program, from day to day).

The same goes for making statements about programs.

Therefore, in Pyret, we recognize that there are lots of different levels at which we make statements about our programs: from tests through types to specifications, spanning both static and dynamic aspects. They not only aren't in conflict, they're even related. Most languages include only a subset of them or make them appear to be disjoint. In contrast, in Pyret, we're already working on linking these to one another, and are going to keep pushing in this direction.

Languages that leave out one or more of these end up forcing users to reinvent these wheels for themselves (because they're each good ideas that people inevitably want). But by forcing people to reinvent a wheel, they end up giving users ad hoc versions of these tools, and also introduce friction between these parts. (E.g., as Robby Findler found several years ago, just about every single bolted-on DBC system had significant limitations or errors.)

D has something like this[1]. It might have come from a still-earlier language.

[1] http://dlang.org/unittest.html

My biggest gripe with these is the inability to name the tests. What exactly is a unittest block testing for? You need to rely on comments or read the test code where none exists. Doesn't seem like Pyret fixed that issue.

Everything is fixable in a new language (-:. Say more!

One of my favorite things about tests is that they let me read a new codebase more interactively. If I don't understand why a function is built like this and not that I can rewrite it the new way, run the tests. Without names I'd know I broke something, but with a good name I'll often realize why that something matters in the big picture.

(I think about tests a lot: http://akkartik.name/post/tracing-tests)

I understand your point. One thing I pushed for initially, dropped because we had bigger fish to fry, and will revisit later, is the idea that you should be able to name everything: not just tests but an individual variant in your data definition, lines of code, etc. And this should be intrinsic to the language.

That said, I couldn't see what your post had to do with this concept; if I've missed something, please explain. (As an aside, I believe this -- http://cs.brown.edu/~sk/Publications/Papers/Published/mcskr-... -- generalizes the idea in your tracing post.)

Thanks. My post wasn't about names, I was just looking for feedback. Which I got; thanks.

I would like it if the `where` tests supported DataTables for a slightly more declarative style. Here's an example from Scala/Specs2: https://github.com/snowplow/snowplow/blob/master/3-enrich/ha...

The fact that they don't now doesn't mean they can't in future!

One of the advantages of having a separate block for tests is we can do things in that block that won't necessarily pervade the whole language where it might not make sense (which we already do: eg, the keywords "is" and "satisfies"). We've all done some sort of "here's a list of tuples, now map the prefix of this as arguments and the suffix as the expected value over this function", so it is something on our mind.

But I've struggled a bit to find a good way of writing it. We've written a few things where it looked truly "naked" to have these tuples without the surrounding function call. So it's as much a matter of syntax design as anything else.
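For reference, the "list of tuples" pattern being described might be sketched in Python like this (`power` and `cases` are invented examples):

```python
# Table-driven tests: each row is (args..., expected); the prefix is
# applied as arguments and the suffix compared against the result.
def power(base, exp):
    return base ** exp

cases = [
    (2, 3, 8),
    (5, 0, 1),
    (10, 2, 100),
]

for *args, expected in cases:
    result = power(*args)
    assert result == expected, (
        f"power{tuple(args)} == {result}, expected {expected}")
```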

I must say that the code on that page does not look like something I want to copy; it looks like a good compromise, but with emphasis on both words. We don't need to compromise.

If you can come up with a cleaner syntax for test data tables, that would be awesome. Thanks, Shriram.

Could the tests be more declarative, defining properties over a space?

Absolutely! Pyret's test section has a "satisfies" keyword for property tests. E.g.:

    insert(a-value, a-balanced-tree) satisfies is-balanced
or even

    tree-examples = [ ... ]
    values = [ ... ]
    for each2(t in tree-examples, v in values):
      insert(v, t) satisfies is-rbtree
    end

That's an advantage in my book... it is testing the unit (function in this case) and it is testing all the corner cases... do you have a name for each of them? Description through comments is much more suitable IMHO.

I really hate some things about the Rspec/Cucumber suite, but providing a way to output what the tests are testing as they run is a major advantage. I like being able to run my test suite and see "it does what it's supposed to do in this case" output.

At a glance, the Pyret default appears to be that tests run silently. How do I know the right tests were run?

Corner cases? Tests test the behaviour of the function. Functions, depending on their complexity, have a range of behaviour. And yes, you should (usually) be able to come up with a name for it. As a bonus, you'll get immediately an idea of what broke upon test failure.

in clojure:

    (defn my-function [x y]
      (+ x y))

    (is (= 4 (my-function 2 2)))
    (is (= 7 (my-function 3 4)))

AST-Macro enabled languages (... Python)

What?! You can't write macros in Python? (right? have I missed something?)

I'm not sure I like having tests in the same file with the code. In real code bases it's very common to just read the code, and cluttering that with huge swaths of unit-tests is counter-productive.

That said, Pyret appears to be aimed at education so this may make more sense.

This is bouncing against the perpetual "code as text" problem.

Put the tests and docs in with the code and you end up going into "scanning" mode. The actionable information density per character is lower.

Put the tests and docs outside the code and you (or that obnoxious guy who just changed > to >=) don't update the tests or the docs. In the worst case (cough functional web tests) people tend to disable the tests because it's too much work to get them to conform to the changes and everyone is on fire and babies are dying and this needs to be done in production now.

Then you can put your tests in a `check` block and move them elsewhere. You don't have to use `where` except when it makes sense.


Every time you write a refinement on a type, you're actually writing a contract. You can gracefully move between types and contracts this way, and also between thinking of them as checked assertions vs tested ones. The goal is to make this migration more and more seamless.

Yes; although I have used a multitude of dynamic languages in my career, I tend to lean toward statically typed languages with type inference.

That really caught my eye!

Does anyone have any experience with workflows like this? Are tests next to code a pragmatic technique?

Python has them in the form of doctests. They are generally frowned upon, as they tend to significantly clutter up code and documentation when actually used to provide comprehensive testing. That said, Pyret's format may escape that fate, as the tests are given their own section, which most modern editors should be capable of folding.

Doctests: http://docs.python.org/2/library/doctest.html

Haskell also has a library for doctests, though I don't see them used that frequently:


I use this in Clojure, where I am interacting with the code from the REPL. run-tests will execute tests for function defns wrapped with the with-test macro. This is fast, and no external file or namespace needs to load; just type run-tests: http://richhickey.github.io/clojure/clojure.test-api.html

Interesting. Could you propose such an extension by posting a code sample e.g. in scheme?

Sure. These are basically simple contracts. Contracts are implemented in Racket: http://docs.racket-lang.org/reference/contracts.html
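You can sketch the Pyret where: idea in plain Python without macros; everything below (the `where` decorator and its name) is invented for illustration, not an existing library:

```python
def where(*tests):
    """Attach inline unit tests to a function.

    Each test is a predicate taking the decorated function; all of them
    run at definition time, loosely like Pyret's where: clause.
    """
    def decorate(fn):
        for i, test in enumerate(tests):
            if not test(fn):
                raise AssertionError(f"{fn.__name__}: inline test #{i} failed")
        return fn
    return decorate

@where(
    lambda f: f(2, 3) == 5,
    lambda f: f(-1, 1) == 0,
)
def add(x, y):
    return x + y
```

Running the tests at import time is the crude part; a real version would want a switch to disable them, as the parent comment notes.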

I was thinking exactly the same thing after I read that :)

I took CS019 with Shriram a few years back and immediately guessed the authors based on paradigms like making testing a natural part of the language and the encouragement to use annotations. In the class, we were taught a design process for Racket that will work very well for Pyret:

1. Identify the data - create data definitions (You are given x and expected to produce y)

2. Write concrete examples of the data (This is hard and takes time)

3. Write contract, purpose, header for functions (contract and header are annotations in Pyret, purpose should be a commented statement)

4. Write concrete examples of the function (This is hard and takes time. This means test cases!)

5. Write the template (This may only apply to recursion in Racket, but the idea is if you're dealing with a cons, you always have the same structure of checking if a cons? or empty? and must recur)

6. Fill in the template (ie, complete the function)
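As a sketch of how the recipe transfers outside Racket/Pyret, here's a hypothetical Python walk-through for summing a list, with the steps as comments:

```python
# 1. Data definition: a NumList is either [] (empty)
#    or [first_number, *rest_of_NumList].

# 2. Concrete examples of the data:
#    [], [5], [1, 2, 3]

def list_sum(nums):
    # 3. Contract: list of numbers -> number.
    #    Purpose: add up every number in nums.
    # 5./6. The template for recursive data, filled in: one branch
    #    per variant of the data, recurring on the rest.
    if not nums:
        return 0
    first, *rest = nums
    return first + list_sum(rest)

# 4. Concrete examples of the function, i.e. the test cases:
assert list_sum([]) == 0
assert list_sum([5]) == 5
assert list_sum([1, 2, 3]) == 6
```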

There is a book about this approach (http://htdp.org/), as well as a MOOC on Coursera (https://www.coursera.org/course/programdesign)

The second edition of How to Design Programs is not finished yet, but I recommend it anyway: http://www.ccs.neu.edu/home/matthias/HtDP2e/index.html

htdp was my first programming book, and it is really helpful.

Can't suggest these enough for any programming student.

Well done, T.B.! The same idea applies to recursive data in any language; you'd use exactly this to write a tree-walker in Java, for instance. And likewise in Racket. E.g.:

  fun f(l :: List&lt;T&gt;):
    cases (List) l:
      | empty => ...
      | link(f, r) => ... f ... f(r) ...
    end
  end
Note that because you can put type annotations in the cases statement, you can remind yourself of the type of the locals:

  fun f(l :: List&lt;T&gt;):
    cases (List) l:
      | empty => ...
      | link(f :: T, r :: List&lt;T&gt;) => ... f ... f(r) ...
    end
  end
which further nudges you towards a possibly recursive solution.

Pyret looks like there's a lot going on, based on the examples. It has implicit returns, special-purpose syntax for data structures, iterators, assertions, refinements, etc. Having support for more stuff makes it less suitable as a teaching language, not more. Pyret looks like it has the good bits of Python plus a whole bunch of other cool stuff. But the language is pretty complex as a result.

Cool? Yup! Good for teaching? Probably not.

Take a look at the grammar and judge for yourself: https://github.com/brownplt/pyret-lang/blob/master/src/lang/...

Thanks for your comments!

I have two responses:

1. You're right that exposing too much, too soon is a recipe for disaster. People can only fit so many new concepts in their head at once, and blasting them with refinement types from day one isn't a great idea. However, there's two things at work here: the role of the curriculum and the role of the language. Pyret is careful to allow a gradual transition to more and more complex features. Annotations are not necessary on day 1, and neither are iterators. We write untyped, recursive-descent functions on lists in the beginning, and build up to types and the pleasant terseness of "for" loops. If Pyret required introducing these concepts to write any programs, that would indeed be a problem, but we've put thought into the dependencies between curricular concepts and the needed language constructs to mitigate exactly this concern.

2. I think that some of the features you listed are actually a huge necessity in early programming, particularly special purpose syntax for data structures. Teaching the definition and implementation of a balanced binary tree to folks without language experience in Python or Java requires a huge amount of boilerplate explanation. You need to describe classes, fields, and constructors just to do the equivalent of Pyret's "data". The alternative (at least in Python) is to encode everything in lists and dictionaries, but hopefully in 2013 we've moved beyond writing all our data structures that way :-)
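To make the boilerplate point concrete, here's a rough Python sketch of what Pyret's one-line BinTree `data` definition costs, even with dataclasses (all names mine):

```python
from dataclasses import dataclass

# Pyret:  data BinTree: | leaf | node(value, left, right) end
# Python needs a base class plus one class per variant:

class BinTree:
    pass

@dataclass
class Leaf(BinTree):
    pass

@dataclass
class Node(BinTree):
    value: int
    left: BinTree
    right: BinTree

def size(t: BinTree) -> int:
    # The "cases" dispatch has to be spelled out with isinstance checks.
    if isinstance(t, Leaf):
        return 0
    return 1 + size(t.left) + size(t.right)
```

Explaining this to a beginner means introducing classes, inheritance, fields, and decorators before you can even talk about trees.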

Joe, any plans to push this out in Kathi's intro course at WPI?

Not our decision! Tell the WPI folks about it and why (if) they should consider using it.

Language complexity may not matter much.

"He speculated that the size of programming languages might confuse many students trying to learn to program. Java, the teacher’s programming language of choice at the end of the 20th century and at the beginning of the 21st, is defined loosely but at great length in several hundred pages.4 Natural Deduction, on the other hand, can be defined precisely on a single side of a single A4 sheet, in a normal font size. It is small and conceptually simple: surely it can be taught! Well, up to a point it can, but the results were depressingly familiar. Figure 14 shows the results from an in-course exam on formal logical proof. Unlike the examinations of section 3, students were strongly discouraged from guessing, with negative marks for wrong answers in multiple-choice questions and extensive unseen proof problems. As a consequence, two humps are clearly visible, with one at 33-40 (about 50%) and another at 65-72 (about 85%). (This data was collected long before Dehnadi’s test was developed, so we can’t analyse it further)."

Saeed Dehnadi - The camel has two humps (working title): http://mrss.dokoda.jp/r/http://www.eis.mdx.ac.uk/research/Ph...

Actually Brown teaches Intro to Programming Languages using Pyret. http://cs.brown.edu/courses/cs173/2013/software.html

By the looks of the course number, this is a third- or fourth-year elective for undergrads. It seems that what is being taught here is the design and engineering of programming languages, not programming itself.

That's correct - 173 is definitely not an intro course. But, we also are teaching an intro course with it:


They also teach the Accelerated Intro to Computer Science (the first cs course some students take) using Pyret. http://cs.brown.edu/courses/cs019/2013/

The grammar can be described with a single page of actual BNF. That's AMAZING. The only languages with shorter grammars are core lisp/scheme. Ruby is fundamentally impossible to describe with a BNF.

> That's AMAZING. The only languages with shorter grammars are core lisp/scheme.

It's nice, but I'm not sure I'd go with amazing. Most or all of Niklaus Wirth's languages have shorter grammars, for example. I'd imagine quite a few others do as well.

E.g. Oberon-2 has 33 EBNF productions: http://en.wikipedia.org/wiki/Oberon-2_(programming_language) - that fits fully on screen for me with my default font size...

Regarding Ruby, I agree. I'm working on a Ruby compiler, and "decoding" the MRI parser into something closer to optimal will still leave an awful mess no matter what you do. I love programming Ruby, but implementing it is hell (and one of the reasons I'm trying it...)

If you think parsing is the hard part, wait for the semantics. See, for instance, the Ruby examples on the Pyret home page.

I don't see any difficult semantics in those examples.

The hard part of parsing Ruby is mostly trying to conform to MRI rather than to a formal grammar. In comparison the semantics are reasonably simple.

It's hard to compile Ruby efficiently, though, but for reasons unrelated to those examples.

(Incidentally, I don't know if that scoping example is intentionally ignoring the fact that Ruby does support lexical scoping (EDIT: for blocks), or if it's a lack of understanding of what "def" is.

The Ruby way of achieving what that example is trying to do is:

    def f(x)
      g = proc do |y|
        x + y
      end
      g[2]
    end
(g[2] is shorthand for g.call(2))

"def" explicitly introduced a method-level scope of the innermost wrapping class scope. Nesting 'def's is not in any way idiomatic Ruby. It's not necessarily very useful to nest them given those semantics. But "def" is really just syntactic sugar for #define_method, and so it is logical for it to work where #define_method does.

I might be inclined to agree that the different Ruby ways of defining blocks, methods and closures could do with some simplification. Allowing "def" to bind surrounding variables would remove a lot of the need for class instance variable or class variables, for example. )
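For contrast, a quick sketch of the nested-definition version in Python, where an inner def does close over the enclosing scope:

```python
def f(x):
    # g closes over x from the enclosing call of f;
    # in Ruby, a nested def would not capture x this way.
    def g(y):
        return x + y
    return g(2)

# f(3) evaluates g(2) with x bound to 3.
print(f(3))  # 5
```

This is the expectation people carry over from Python, JavaScript, ML, etc., and why the Ruby behavior trips them up.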


Also, the currying example would look like this in Ruby:

    o = Object.new
    def o.my_method(x)
      self.y + x
    end
    def o.y
      # ... (body elided)
    end
    method_as_fun = o.method(:my_method)
You can argue for implicit currying if you want, but to me that's far more confusing. That said, it is extremely rare to find currying used in Ruby code.

Yes, I spoke too loosely. What I meant to say is that you'll have hard time making Ruby fast. I assumed implicitly that the point of writing a compiler was to optimize performance, though maybe not.

Thanks for the feedback.

Re: scope, you're right that using "def" the way I did isn't very idiomatic Ruby, but I always run afoul of it because it looks so similar to what I'd write in another language. I took that example down for the moment; I still have a personal gripe with it, but my "fundamentally broken" language was a bit strong. If I think of a more illustrative example, I'll put something back up.

The currying example I still think is weird in Ruby, and it's because it interacts bizarrely with the syntactic choice about optional argument lists. The thing that is wrong with JavaScript and Ruby is that they both have dot expressions and application expressions; o.m and f(x). It looks like

    o.m(x)

should be a composition of dot lookup followed by an application, since both of those raw expressions make sense on their own. But in neither language does the decomposition actually work:

    m = o.m
    m(x)

There's no state or funny mutation going on here, but a simple kind of substitutability isn't working. JavaScript does it especially poorly, and Ruby has this awkward inability to decompose because of its choices about application syntax. Now, in Ruby I'm aware that with or without parens are actually both method calls, so it's not like there's a field access and an application in the underlying language model. But Ruby then adds the syntactic convenience of no arguments to make it look like access is possible, but that syntactic convenience is a bit of a leaky abstraction.

I should note that Python actually gets this nicely right IMO, and dot lookup curries self so this works out.
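A small sketch of that (class and names invented): attribute lookup on a Python instance yields a bound method with self already curried in, so the dot and the call decompose cleanly:

```python
class Obj:
    def __init__(self, y):
        self.y = y

    def m(self, x):
        return self.y + x

o = Obj(10)

# Dot lookup alone produces a bound method...
m = o.m
# ...so applying it later behaves exactly like o.m(x).
assert m(2) == o.m(2) == 12
```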

The underlying thing that irks me and I'm calling out here is the non-compositionality of what looks like two expressions that should compose. This is something we felt like figuring out and getting consistent for Pyret.

> Incidentally I don't know if that scoping example is intentionally ignoring the fact that Ruby does support lexical scoping (EDIT: for blocks), or if it's a lack of understanding of what "def" is

I also don't think it is fair to say that ruby "failed to nail lexical scope in fundamental ways".

I like how it enables lexical scoping with blocks, instead of enabling it in the perhaps more common way, with embedded functions. To me, the ruby way is more intuitive.

Also, 'breaking scope' with block (that acts as anonymous function) instead of defining new (embedded) function feels more explicit wrt real scoping intentions.

I've done a fair amount of programming in Ruby (two reasonably large Rails applications, spanning about two years of development time). When switching to Ruby from just about anything else that I program in from JavaScript to Python to Racket to ML, I get tripped up on exactly this issue because it doesn't match my expectations at all.

My gripe is that the simplest, cleanest construct in the language for defining named functional abstractions---"def"---doesn't support closing over variables, which other languages do, so I have to warp my thinking a bit to program in Ruby. The analogous program in the four languages I mentioned above (and Pyret, and more) works and creates a closure. Maybe I just don't think well in Ruby, because I end up feeling frustrated that Ruby seems to not like functions.

All that said, "in fundamental ways" is a little harsh. At least in this example, Ruby hasn't done anything to violate the integrity of static scoping the way, say, JavaScript's "with" does. I'll try to see if I can come up with a better example that's a clearer problem and less a personal dislike.

agreed, having decent semantics is key. :)

> The only languages with shorter grammars are core lisp/scheme.

Smalltalk would like to have a word with you: http://chronos-st.blogspot.be/2007/12/smalltalk-in-one-page....

DrRacket pioneered the notion of "language levels" that grow with the student's needs. Pyret will end up taking some variant of this route. It's just too early to design those levels yet. Once we have more experimental data about what kinds of inadvertent mistakes students make, we'll be in a position to say, "Oh, yeah, that was a bad thing to release on day 1", etc.

The language is in use in introductory courses, with a gradual introduction of these topics. It's no different from using any other modern programming language in education.

Our goal is to eventually create language levels, pioneered by DrRacket, based on what we learn from observation. This is a research project in addition to a development one, where the research is into human factors.

Why replace Racket with Pyret? Is this purely to stop students from being able to complain about syntax or is there a deeper strategic reason behind this?

Refinement types are nice. I really wish those were more common.

Your second paragraph answered your first paragraph (-:.

1. We have a different view of static typing than Racket.

2. We have a different view of the type language than Racket.

3. Long term, we are working on smoothly integrating testing, types, and specification [as a spectrum -- perhaps even as a cycle -- rather than as three different things]. Refinements are an instance of this. Another is our plan for how to do type inference.

4. I am also convinced that parenthetical syntax has real problems that I can't completely ignore. Of course, it doesn't hurt if it can get people to stop complaining. (It's not only students: the bigger opposition often comes from teachers. Unfortunately the teachers control the path to the students, so their views _really_ matter!)

I am also convinced that parenthetical syntax has real problems that I can't completely ignore.

What brought you to this conclusion? Experience from Bootstrap (in which I found arithmetic and parenthesis counting to be the biggest stumbling blocks) or just personal reflection?

All of the above, plus the research that Guillaume Marceau, Kathi Fisler, and I did.

My understanding is that it's a relatively simple place to learn about how to manipulate data (as in algebraic data type style, FP data) and has all of the conveniences of syntax that allows someone to think fluently about that concept.

So its syntactic complexity might be high, but its conceptual complexity is rather low.

Edit: deleting snark.

Oz is one of the best languages ever for learning CS.

By that logic, assembly is the best language to learn programming.

Presumably it is easier to learn about concepts with a language that has them.

Actually, there's a (devil's advocate) argument for teaching assembly first. After all, TAOCP is all assembly for good reason. Many programmers who never learn any assembly language lack a kind of procedural literacy, as well as a mechanical sympathy, which are a big part of the craft of programming.

But, as you state, there are a lot of higher-level programming concepts you can't effectively learn in assembler (one imagines a bizarrely circuitous route around learning to build compilers in assembly (via Forth perhaps) in order to teach lexical scoping or similar).

Obviously I'm not suggesting that assembler is the best teaching language. I could also take your logic and use it to conclude that C++ is the best programming language to learn first because with C++ you can learn about the implications of heap corruption when improperly handling exceptions in constructors. C++ allows functional, procedural and OO and many more programming styles. It supports manual and automatic memory management. It can be used to teach high level and low level programming constructs. But I think we're all in agreement that C++ is not a good first language to learn despite (or because of) its versatility.

I think Python is a very accessible programming language that still has enough depth to go into a lot of interesting CS subjects. That Python doesn't have manual memory management or support for true multiple inheritance means that you can't teach those things effectively with Python, and I'm completely OK with that trade-off.

Defining basic data structures is a pain. See the example on the Pyret homepage.

I am already seeing a trend in Python tutorials where they just stay away from complex data, and stick to what is easy to encode in the data structures that Python does provide (lists, dictionaries, arrays).

We've seen this movie before: it's what happened when languages like Fortran, Pascal, and C dominated programming education, and the fact that they made some data (particularly of the linear kind) really easy and others a nuisance meant that curricula diverted towards the path of least resistance.

It's harder now to spot the pattern because it's better hidden. But it's still there.

I don't see why a complex language can't be good for teaching if it's designed in such a way that the complexity is optional, and you can use a focussed subset for pedagogical purposes (especially when the language supports using different focussed subsets depending on the specific focus of pedagogy).

I mean, even if you look at how language like Scheme/Racket, or Python, or pretty much any other language is used in teaching, you don't usually use the whole thing in any one teaching context.

This is true of all educational settings -- except that the language subset then becomes an unenforced (and hence very leaky) abstraction. Students inadvertently step outside the subset and get either bizarre errors or, worse, their program runs and produces strange results.

DrRacket pioneered the notion of "language levels" that grow with the student's needs. Pyret will end up taking some variant of this route. It's just too early to design those levels yet.

Pyret has lots of nice ideas, looking forward to see it evolve.

That said, the title of this post is a bit misleading so I wanted to correct it. The group of people who develop Racket and Pyret are mostly disjoint. I'm not trying to diminish Pyret at all, but wanted to make sure the right people get the credit. You can find out who develops Pyret here: http://www.pyret.org/crew/

Why the superfluous syntax? I think remembering syntax like this is orthogonal to the goal of being easy to learn.

The syntax is basically Ruby + Python + Haskell. Each of those languages has a lighter, more intuitive and memorable syntax.

Why would the syntax be:

    data BinTree:
      | leaf
      | node(value, left, right)
    end
Instead of just

    data BinTree = leaf | node(value, left, right)
The whole colon thing in Python is a mistake; it should never have been in Python, and it should definitely not be repeated in other languages.

I would encourage you to view the syntax as "Python, but repaired", for the following reasons:

1. Pyret comes out of Shriram's group's expertise with pinning down exactly what Python and other dynamic scripting languages do right, and (mostly) do wrong. Check out their recent paper Python: The Full Monty: A Tested Semantics for the Python Programming Language http://cs.brown.edu/~sk/Publications/Papers/Published/pmmwpl... for more context.

2. Ruby is far and away not the originator of 'end' to end blocks -- this comes from Pascal and is in other languages that have nothing to do with Ruby, like Lua.

3. Languages that aren't Haskell have ADTs too; it happens that they've lifted an ML-style syntax for defining them.

4. Python is widely used pedagogically, so for better or for worse, students are already being familiarized with the colon, which I agree is a bit anomalous.

> Ruby is far and away not the originator of 'end' to end blocks -- this comes from Pascal and is in other languages that have nothing to do with Ruby, like Lua.

The use of 'end' comes from Algol, not Pascal.

Thanks for the link to that paper. Super interesting to see actual semantics for python come out. It would be nice if this leads to better tooling for analysis of the language. Currently, it is pretty hard to bootstrap anything for the language in comparison with what you can do for the JVM.

You'd first have to fix Python's broken notion of scope. If you want to read only one, simple, page, read the last page (appendix 2 on variable renaming). Pyret was created to not have such problems by design.

You can actually use ";" as an alias for "end" in Pyret, so you could write:

    data BinTree: | leaf | node(value, left, right);
We use "end" or ";" in order to have unambiguous delimiters for the ends of syntactic forms and avoid needing to depend on whitespace (we added some discussion on pyret.org about our philosophy on indentation and why we don't want to depend on whitespace).

Making that leading pipe optional is a good idea. I made an issue for it, we'll think about if it'll break or confuse anything and add it if it doesn't:


I realize it's a trade-off -- but I think having ";" as an alias for "end" (or vice-versa) is a pretty bad idea. I personally prefer python's indentation-for-blocks syntax, but I understand why you want semantics to be decoupled from indentation. But when using explicit end-markers, I'd prefer to match them to the start, maybe even introducing some verbosity, like: end-case, end-if etc -- maybe taking it further and allowing/demanding naming of blocks:

    check:
      4 + 5 is 9
      1 / 3 is 2 / 6
      9 - 3 is 6
      5 > 4 is true
    end
Then becomes, wrapped not in a bare "end" but in "end-check":

    check:
      4 + 5 is 9
      1 / 3 is 2 / 6
      9 - 3 is 6
      5 > 4 is true
    end-check
Or something similar. This makes mis-matched "end"s (either typos or artifacts from cut'n'paste coding) explicit errors that are easy to spot and identify.

It would add a lot of verbosity, of course. As for ";"/"end", consider (the presumably valid):

    check:
      4 + 5 is 9
      1 / 3 is 2 / 6
      9 - 3 is 6
      5 > 4 is true
    ;
That trailing ";" is going to trip someone up. Also consider (cut'n'paste-with-quick-edit):

    check:
      4 + 5 is 9
      1 / 3 is 2 / 6;
      9 - 3 is 6
      5 > 4 is true;
Is this valid?

I have wrestled with "uniform closing" vs "non-uniform closing" designs for ages. For those not clear on what I mean here's an example: Lispy syntaxes have a uniform close (it's always ")"), whereas XML syntaxes don't (you have to put the name of the opening tag in the closing token).

So e12e is making an argument for XML-y over Lispy. However, choosing good words is really, really hard. If you pick a different word for each construct, there's a needless mental burden; if you pick a uniform strategy ("end-<kwd>") you potentially get less readable code. And either way it's far more verbose, as you point out.

Some of our target audiences are middle- and high-school students, some of whom have weak typing skills (we know from numerous workshops we run). Increasing even the raw number of characters is a real problem for them.

My preference is to use ; for one-liners, but end where you have a multi-clause entity and you want to be able to clearly tell where it ends. If we find that there is general agreement on this, we can make it a context-sensitive check, à la indentation. Then, a program generated by a tool can do something consistent and ignore all these checks, whereas human-facing environments would enforce them (by checking or correcting).

As for your examples, to my Pyretical eye, the third of your examples (with the dangling ";") looks just wrong. However, your fourth example is syntactically incorrect.

> Some of our target audiences are middle- and high-school students, some of whom have weak typing skills (we know from numerous workshops we run). Increasing even the raw number of characters is a real problem for them.

I certainly understand this argument, and I generally favour simple syntax over smart tools -- but how much of a difference would this make in an editor/IDE that automatically inserts the closing-tag? (you type case, the editor appends esac (but below the point you're typing)):

    1:   |   #your cursor at |

    2: case|
    3: case:
         |     # you hit enter or something, ready to fill in
       esac    # editor has closed block/statement
I imagine typing-skills isn't much of an issue when editing/copying text -- as opposed to typing in new code?

An express design goal is to not bake in assumptions about an editor. Programmers really like their editing tools. I've lived through the waves from vi to Emacs to vim to Sublime Text to what-have-you. We'd really, really like to make the language pleasing to work with without depending on an editor (indeed, each of us seems -- perhaps by accident -- to be using a different editor for Pyret, which may be influencing our decision).

When editing/copying, it's not typing skills but editing skills, which are arguably subtler and even harder.

Your point about programmers having favourite editors (which I consider valid) contradicts the point you made about an important part of your audience being middle/high schoolers, who I don't think have such strong preferences toward one editor or another. It seems to me that you are making compromises to cater to the needs of an audience which is much broader than the one you have identified for the language.

It would be a shame to compromise the design of an interesting language just because the target audience is not clearly defined.

I have many target audiences. I mentioned middle- and high-schoolers in response to various other questions, but I don't anticipate them being the largest audience. I expect the biggest audiences to be introductory college-level education and second/third-year courses on "paradigms" (blecch) that want a flexible language to illustrate several concepts. Finally, I am also consciously building a language and book to appeal to the working programmer who has not learned to think of computation as primarily producing values, and would like to quickly get up to speed with that idea.

All of these are educational audiences. I don't see how one can define the audience more clearly than that, given that I have no control over any of these audiences.

Having ; on a new line is valid, but having both ; and end on a block is invalid.

I think the colon isn't a mistake; it clearly delineates an indented block, and I think it's a great marker that indented languages might want to standardize on. Perhaps the syntax issue is that Pyret has both a colon and an "end" marker? The former indicates indentation-based syntax, while the latter is often used in whitespace-agnostic contexts. The result of using both is confusion.

Pyret doesn't actually have an indentation-based format (so far as I know); it just happens to look like Python.

That's correct. There is a preferred indentation (enforced by the emacs mode and our online editor built on CodeMirror), but whitespace is only needed to separate things - the actual amount of whitespace never matters.

So, for example, you can move chunks of code from one place to another and select it all and re-indent - something that you can't do (in general) in Python. It also makes it easier to programmatically generate code.

When you have both braces/keywords and whitespace, it seems to me that you're essentially storing the same information about the program structure in two places, of which one is read by the machine and the other is read by humans. The "re-indent" operation reads the master copy and updates the other one. (And if you forget to re-indent, your human readers will be interpreting an outdated copy and be very confused.)

In Python, on the other hand, the information only exists in one place: the indentation. This is read both by machines and by humans. There is no "re-indent" operation, as the indentation is the only place the information exists, so there is nothing to sync.

I have heard people bring up programmatically generated code before, but as another commenter put it, "where Ruby usually denotes code blocks with a `do` and an `end`, Python denotes code blocks with a `:` and an outdent" -- so is that really a problem? Isn't it just a slightly different way of encoding the same information?

(As to moving chunks of code in the editor: If the editor has a command to re-indent a selected block, it probably also has a command to shift it left or right.)

But if indentation matters, and it's present, then why do you need the colon?

Well, the colon can be used without indentation if the block that follows is just a single line:

  def embiggen(x): return x * 2
  for i in range(10): print(embiggen(i))
Also, it just makes things read more naturally to us humans, I think.

I think you mean that it's in opposition to their goal of being easy to learn, not that it's orthogonal, as that would mean "it doesn't interfere".

Anyway, it seems to me that the only real difference between your code examples is that the second one lacks an ending designator and uses an equals sign instead of a colon. However, in order to make the lack of an ending designator work, you need a more complex parser (it needs to infer block ends from indentation or some other context). Leaving off ending designators also increases mental overhead for the user, as they must keep block-end inference rules in mind when writing code. Using colons versus equals signs is simply a matter of taste (not weight, as you seem to claim).

That said, I do see some waste in their datatype definition syntax. First, I'd prefer curly braces over the colon/end pairs that they went with, as that saves a few characters. Second, newlines are a fine separator, so why also require pipes? This requires the user to type three characters (newline, pipe, space after pipe) when just one would have done fine.

Newlines are not a fine separator; what if you have a definition that you intend to span multiple lines? In any case, the pipes go all the way back to EBNF, since what you are really doing when you specify an ADT is specifying a grammar of types.

I admit to also being a brace weenie; I would very much prefer if all the languages I had to use had them. However, Pyret exists in a tradition of many successful, braceless languages like Python, ML, and Lua.

I feel that the argument for the need for multiline definitions here is a bit weak. You are already putting each definition on its own line, so the definitions are actually fairly close to how their actual usage will look. I feel that if you really need them to be all that long, you are already in weird-style territory, so it isn't such a bad idea to just say "deal with wide files". Overall, it improves ergonomics for the vast majority of use cases, with the only cost being a possible aesthetic annoyance for users who are already writing aesthetically-annoying code.

At the other end are definitions of which several fit on a single line:

    data Color = Red | Black

Lines really aren't much of a thing: they are a single keypress, which generates a newline character and some autogenerated indentation from your editor. There's no real difficulty in breaking things up across two lines versus typing any other extra character (this touches on another issue, which is that I am of the opinion that the concept of code brevity estimation by line count comparison is fundamentally flawed, as code with fewer, wider lines is often just as cumbersome to type as code with more, narrower lines).

That said, please consider the following:

    data Color = Red | Black

    data Color { Red; Black }

    data Color = Red | Black | Green

    data Color { Red; Black; Green }

I think the brace + semicolon-newline style actually compares pretty favorably on single-line width (it wins out to an increasing degree as the line gets longer, which is important). However, it is visually more complex, which matters a lot for shorter lines. For this reason, I think that allowing both syntaxes would be ideal: the pipe-based syntax could be encouraged for single lines (newline termination would sidestep block-end inference issues), with the curly-brace-based syntax encouraged for multi-line definitions. There is a slight disadvantage in that a user would now have to know both syntaxes, of course.


Of course, since Pyret's syntax requires that ADT parameters be specified in parentheses, you can actually omit the pipes in single-line mode, too, and opt to use space-juxtaposition.
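
A sketch of what that might look like (this is the commenter's hypothetical, not actual Pyret syntax; the Shape variants are made up for illustration):

    # hypothetical: parentheses already delimit each variant,
    # so the pipes could be dropped in favor of juxtaposition
    data Shape = circle(radius) square(side)

    # the equivalent pipe-separated form
    data Shape = circle(radius) | square(side)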

I can't respond directly to e12e's comment ("Please don't use equal for assignment" -- https://news.ycombinator.com/item?id=6704229) so I'll slightly abuse responding by doing so at this level.

We don't EVER use = for assignment. For us, = is binding. If you write

  fun f(x):
    y = x * x
    y = x + 2
    x - y
  end
Pyret will say

  I'm confused: y is defined twice
and point to the two bindings.

The goal here is to make the common case work fine, where you create a bunch of distinct local bindings; but when you try to mutate, you have to do it explicitly:

  fun f(x):
    var y = x * x
    y := x + 2
    x - y
  where:
    f(10) is -2 # the test passes
  end
The first line inside f says "y is _currently_ this, but it's variable, so look out!", and subsequent mutations use := to change y. Unlike JavaScript (say), mutating a variable that hasn't previously been defined is an error, rather than quietly adding the variable to the set of defined names.
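
A minimal sketch of that distinction (the error description in the comment is illustrative, not Pyret's actual message):

    fun g(x):
      var y = 0   # binds y and marks it as mutable
      y := x      # fine: y was declared with var
      z := x      # error: z was never defined, so it cannot be assigned
    end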

Thank you for distinguishing bindings and assignments. Please consider `var y := x * x` for consistency (note the ':=').

We considered it and decided against it for a few reasons.

- We want := to mean "change". The initial binding is not a change. This is precisely the "bindings and assignments" distinction you're talking about.

- A variable may in fact not be mutated.

- A small point is that we want to localize editing when making a change.

So, the way I read

  var y = x * x
is, "y is currently bound to x * x. However, y is a variable, so this binding is not guaranteed to last. Be careful about assuming you know the value of y." If you then see

  y := x * z
it reads as "y's value changes to that of x * z".

So: "=" means "introduced a name with this value"; "var" means "but this binding may change"; ":=" means "and yup, it really _did_ change".

From the very beginning we have debated whether or not to make the initial pipe optional. We wanted at least a semester of code to review before we made a decision. I personally lean toward making it optional and just have to persuade the others (-:. With that, you would be able to write

data Color: Red | Black | Green;

We did also initially discuss using = instead of : in places where we were defining things (functions, data). That's still not completely out of the question. That would get you to

data Color = Red | Green | Black;

We need a closing delimiter to avoid creating huge ambiguities in the grammar.

Please don't use equal for assignment, unless you do something along the lines of prolog etc:

    (a, 2a) = (1, b)
    > b == 2

I really love the surface design decisions taken here. Particularly

* ML-like syntax for algebraic datatypes and matching. ML got it right; it always seems a bit off when languages try to make ADTs look like some other syntactic construct

* the cyclic and graph declarations

* accessing datatype variants using an OO-like syntax. simply brilliant.

* non-significant whitespace. for all the pros and cons, autoindenting is something i hate to give up.

Python-like syntax with pattern matching and recursive ADTs!? What a great idea!


This looks a lot like the language I've been dreaming about. I actually like the explicit `end` keyword to materialize the end of blocks. The lambda expression syntax, as illustrated by the filter/map/fold example, looks like Ruby blocks with a much simpler syntax. A couple of questions to the crew:

Why advertise it as a teaching language? As a working programmer, this looks very appealing to me. Are there limitations that keep it from being a good language for practical programming (other than the fact that it's not ready yet, of course)?

Are you planning to provide a tutorial more approachable than the language reference?

Thanks for the questions!

As to why we're advertising it as a teaching language: Our group has a design sense of how to do this, and Pyret's success in the classroom is what we're able to study, measure, and focus on improving. That doesn't mean we won't be thinking about managing million-line codebases or getting great performance out of the runtime, but those concerns won't necessarily be the main guide of our design decisions. We may very well end up with something that's a superb general-purpose language, but I'm not yet willing to give folks that expectation, or do so at the expense of the learning experience. Make sense?

EDIT to add: One thing that the Racket community has very successfully done is separate teaching languages from the main Racket language. A probable future for Pyret is splitting it into teaching and professional versions. I'd love to hear feedback about what more you'd like to see in Pyret, with this split in mind.

The getting started guide (http://www.pyret.org/getting-started/) and tour (http://www.pyret.org/tour/) are the best references for getting started right now. If you didn't see those, they probably need more prominent placement.

I'd disagree with the "end" keyword. You could add more whitespace, which our brains naturally deal with very well, or you can add more "cruft", which we have to actively think about. In typography, white space is what makes good layout work. And here we're going back to replacing white space with a word. So from those perspectives, I think the "end" keyword is a regression.

Having said that, it's the first language that I prefer from a typographical perspective to python. So very well done on everything else.

We tried very hard to do without an ending delimiter. But it boils down to a simple matter: either make whitespace significant or make the grammar hugely (and perhaps irredeemably) ambiguous. Our position on whitespace is:

- it's very important for readability

- it should not be semantic

- it should be possible for a program to reindent your code

That is, if I copy-and-paste code from email or a Web page, my environment should be able to "move it into place". Significant whitespace means that that's no longer possible.

Therefore, we want whitespace to be a context-sensitive check that is layered atop the language. You could even see turning this off for code that is machine-generated, sent over a wire, etc. We are looking at actual usage patterns before we decide on the precise rules of indentation.

To this end, we needed some way of indicating that a block had ended. Ergo "end" and its alias ";", preferred for one-liners.
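
For illustration, the two forms might look like this (a sketch based on the syntax shown elsewhere in this thread):

    # block form, closed with an explicit end
    fun square(n):
      n * n
    end

    # one-liner, closed with the ";" alias
    fun double(n): n * 2;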

Let me add one more argument in favor of an ending keyword: quality error messages. The more complex the grammar, and hence the parsing job, the more likely parse errors are to be very complex. What we've learned over a few years of user-study research into error messages is that ultimately there is a complex algorithm running underneath, and the more we can minimize surprise the better, because the user (especially the beginning programmer) simply has no mental model for what that algorithm is doing.

It still has some performance issues that I'd say would make it hard to put into an industry environment.

I think the teaching language part is in reference to Captain Teach, an IDE++ that the developers are using in the classes to support code review, automatic backups, and more.

Pyret is being designed with the goal of encouraging good code, with tests being an integral part of every function, and clear differentiation between mutation and simple let bindings. I would teach this language just for those features, so my students would see how easy it is to get lost in spooky action at a distance, or in debugging the wrong portion of your code.

There's another response to this that I didn't mention. Shriram is writing a textbook that includes all the material for our advanced intro course and our programming languages course. There are a lot of good examples there:


It's not my language, so I can't really speak to those questions. I think it's fair to say that the Racket group has always focused on teaching languages, not to exclude "production languages", but because it's their personal interest and lets them focus on certain aspects of the language to the exclusion of others. They don't need to build performant standard libraries; they need to cleanly introduce good programming concepts.

Did you know about Elixir? It has a Ruby-like syntax and pattern matching, is functional, and has macros, so you can build DSLs with it. Most importantly, it runs on the BEAM VM (Erlang's VM).


If you're looking for a more mature language that feels like that, I very much recommend Scala.

Scala is statically typed through-and-through. Pyret is designed to always offer a dynamic account of the language. We do have a static type checker under development (it's all-but-ready to release), but it will always work over a language where type annotations are optional. We also have a radically different idea about type inference, which we are currently trying out. In short, Pyret and Scala are already quite different and will grow even more so soon.

I'd be interested in seeing some more explicit examples of the differences, because I get a very scala-like vibe from all the examples in the article (I accept that dynamic vs static is a fundamental difference that doesn't really show up in syntax).

Scala is a great language with a rich excursion into type system design (and other things). Pyret is designed to reinforce specification (more general than types) and to do so through a combination of static and dynamic means. That much I've already said.

Let me point to testing as an example of where we differ. Testing is really important to us. As you've seen, we have lightweight, in-place test cases. In addition:

1. We have a growing family of language support for writing tests well (e.g., see "satisfies" on the Pyret home page). I expect this to grow richer and richer.

2. We are working on some interesting ideas about the scope of testing blocks. We don't say this on the home page, but you can also write "test" blocks for data definitions, not only for functions. But now the functions that operate over those data are going to want to use instances of the data. We're working on a "scope conduit" to take definitions from the data definition to its functions.

3. As some readers have noted, in most languages, you simply cannot write unit tests for a function nested inside another, because you can't even name it. In Pyret, because definitions can include tests, this creates no problem at all.

4. We are now working on a type-inference approach that uses tests, rather than code, to infer types. The code is then checked against these inferred types.

Essentially, testing is a kind of metaprogramming, and metaprogramming works best when you integrate it into the language (what Racket has shown) rather than as ad hoc external tools. Each of the above four examples is really just a specific instance of this general issue. Since testing is an especially important form of (dynamic) metaprogramming, integrating it well into the language from the very beginning is likely to lead to expressiveness, flexibility, and cleanliness that may be harder to achieve from the outside.
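
Point 3, for instance, might look like this (a sketch; the helper and the .map method are illustrative names, not taken from Pyret's actual library):

    fun process(lst):
      # a nested helper; in most languages it could not be
      # unit-tested from outside, because it cannot even be named
      fun double(n):
        n * 2
      where:
        double(2) is 4
        double(0) is 0
      end
      lst.map(double)
    end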

This is part of a more general philosophy. To me, testing is a form of _specification_, and we want to see rich descriptions of program invariants expressed with a gentle blending of static and dynamic methods. We don't want to be so grandiose as to call Pyret a "testing-oriented" programming language, but in our minds it is, with testing having both the base role of making sure we (as programmers) didn't screw up but also the exalted role of providing an alternate, independent description of the program we're trying to write. (This is something I emphasize a lot in my teaching: eg, http://cs.brown.edu/courses/cs019/2012/assignments/sortacle, http://cs.brown.edu/courses/cs019/2012/assignments/oracle).

This is a philosophical position, and so hard to quantify through individual bits of code, but I already see small influences of this (as above) and I expect it to guide us to a very different point in the design space over time. It feels safe to say this is quite different from Scala's viewpoint.

That's pretty fascinating. (Static) type inference based on tests sounds like an ill-posed problem, or at least one that can't produce human-meaningful annotations. That is to say, if one function has spec A for a generic parameter, and another function has spec B, it doesn't seem possible in general to generate the spec for their composition in "closed form".

Like, what's the type spec of "map" where the monad argument is a list of even numbers and the function to be lifted maps numbers to uppercase strings? It doesn't seem like we can meaningfully talk about the "type" of the function's result with any sort of specificity beyond List[String] or maybe List[String[Uppercase]], even though the real type is quite a bit more specific. 

How does inference work downstream of this point? Can we get reasonable errors before run-time?

Note: what follows is all highly experimental.

Our philosophy is that the typed Pyret language is an explicitly-typed one. Inference is just a convenience to save you some amount of typewritering. We want to do a good enough job of it, and if you want to get more specific, you do it yourself. For instance, in the foreseeable future we do not plan to ever infer a refinement. So I think that addresses your example.

Furthermore, we aren't going to play the cute trick of having an inference process that is so consistent with the type-checker that we can get rid of a type-checker. One nice thing about old fashioned recursive-descent type-checkers is that they give simple and familiar error messages; I don't know of any research papers that have been written about better error reporting by them (at least in the past few decades). That's a feature. [NB: There's always an exception. Hopefully my point is clear.]

So, whatever type we infer is one that gets fed to the type-checker to check against the function. Typically, assuming the function has passed its tests, the function should be consistent with its inferred type, unless the tests didn't cover the code properly.

What happens with code that can apply to multiple types, i.e., parametrically polymorphic code? We made an explicit decision to not have union types; had we had them, this would have been more sticky. But because we don't, we simply assume you mean to use the function in a polymorphic fashion, and that's the type we "infer". [Put differently, if you have a polymorphic function, you should test it on at least two different types!]
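
As a sketch, "testing on at least two different types" might look like:

    fun identity(x):
      x
    where:
      # exercising two types signals the polymorphic intent,
      # so the "inferred" type generalizes rather than specializes
      identity(5) is 5
      identity("five") is "five"
    end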

For reporting errors, we intend to make a distinction between declared types and inferred ones -- a distinction that I don't believe is commonly made in other type inference error presentations. For instance, inconsistency between two inferred types should perhaps be treated a bit differently than inconsistency between two declared types or between an inferred and a declared type.

More broadly, types are specifications. The point of a spec is to provide a redundant statement of program behavior, that can be checked against an implementation. From that perspective, inferring the type from the implementation is a strange idea: one shouldn't be inferring specs from code. [Yes, I know, it's been fashionable to infer "specs" from code for over a decade now. I would not call those specs -- they're more like summaries of the code.]

But tests, to my mind, are a lightweight form of specification, so it does make sense to infer one specification from another [just as people have gone the other way and used specifications to derive tests: the area of specification-driven test-generation]. Especially since it's being backed up by a true type-checker.

As I said, these are all pretty novel and possibly heretical ideas. But this is the experiment we're trying.

I also like the fact that you can put unit tests for helper functions that are inside of other functions. This means you can use the inputs to your outer function as part of the tests for an inner function, which has been hard to do in the past.

Wow, I didn't think of that. That is actually amazing.

This project seems really promising, but I have one concern. Lisp's syntax has been one of its strengths for beginners: easy to learn, easy to solve common errors. I don't see the point of having Python-like syntax, really; users could learn other syntaxes after they've understood programming.

Syntax is tricky and contentious, and our team members do love our Schemes. Members of the Pyret team have actually done some research on the issue of parenthetical expressions in introductory programming, and seen that parens aren't necessarily the best:


One issue is that it's too regular: since open-paren means so many different things (start of a "defun", start of a function application, start of an argument list in a "defun", start of a syntactic form like "cond" or "if", start of a clause of a "cond", the list goes on), a typo can easily and drastically change the kind of error message you get.

More closely matching syntactic forms to the type of behavior the expression has will hopefully let us improve error messages and grokkability of the difference between concepts. We're collecting data about common syntax errors and actively asking what we can do better in syntax design based on what we observe about Pyret's use.

I wish this could be upvoted so much more.

As someone who hasn't spent much time in the Lisp world, most of the time when I come across some supposedly elegant implementation of an algorithm in Lisp, what my brain sees is impenetrable parenthesis soup. One incredible win of Python is the resemblance of its syntax to pseudocode, and how easy that has apparently been for new developers to absorb. I'm glad to see your team has taken that to heart for a language that's supposed to be pedagogical. And that is beautifully said about how a misplaced parenthesis can lead to all sorts of errors.

Error reporting, debugging, and documentation are in that "meta" tier of programming ergonomics that few people care to reason about, and yet they are oh so important to allowing actual human beings to learn programming languages and use them to produce rereadable and maintainable code.

Did you ever do any research/tests/courses with Smalltalk as the first language?

Nope. Virtually nobody uses Smalltalk as a first language any longer anyway, to the best of my knowledge.

Well, I'm guessing not that many are using Pyret as a first language (yet) either... (Actually, I think a few people are still using Squeak to teach kids programming, so that's not even true.)

Anyway, I was more curious if you'd done such a study because

a) It's a very concise and consistent syntax, and

b) with all the great work that appears to go into Pharo Smalltalk, it would seem to be a viable option (again).

And yes, I do mean for computer science (not "just" programming):

http://www.lukas-renggli.ch/blog/petitparser-1 http://www.squeaksource.com/OMeta/

(This is in addition to the nice design of Smalltalk-80 - and no, of course it isn't perfect; is any language?)

Sorry, my point was not to get into a popularity pissing match. I do believe popularity and quality are largely unrelated.

What I meant is, since most people have stopped teaching with Smalltalk, it's really hard to do the kind of research I'm talking about! We had no trouble doing it for Racket because we were able to get lots of data and from it measure for statistical significance.

No worries, I understood that. I just thought you were in a great position to actually perform such a study, if you were willing to run it on a group of students :-)

I can certainly understand why you wouldn't "just try Smalltalk on a class or two", though!

Yep, that's just not feasible. It would require me to first master Smalltalk, then find suitable textbook material, figure out how to adapt it to the material I'm teaching, etc. (Eg, where are the really good intermediate-level algorithms textbooks for Smalltalk?) The list goes on. So it's really just not feasible at all.

My jaw dropped slightly to see someone suggest Lisp syntax would be a better starting place for beginners than Python's syntax. Is it just me or is that a fairly unorthodox point of view?

I'd say the real trouble is teaching any syntax at all to beginners, because their first language really taints them for life. Most people will think of successive languages in terms of their first, until they learn many and are able to think more generally.

Lisp has the advantage that it really teaches the general semantics of programming languages, because its syntax is just the syntax tree, and it basically has only one form, which is function application. There are some "special forms" built into the implementation of a Lisp to make it useful, but those shouldn't be confused with syntax. In fact, the parentheses and the spaces between function and arguments shouldn't be considered the Lisp syntax either - they're just one way to represent a tree structure in linear text.

I think the real issue is the confusion in teaching is between Computer Science and Computing as a vocation. Nobody teaches the former any more, because it's less useful in the real world. As a result, we have languages which try to make a distinction between "programmer" and "programming language implementer", where the programmer generally knows less about what he is doing because someone has imposed a specific, narrow set of ideas on him.

I've programmed in and taught Lisp syntax for 24 years, much of it exclusively in that syntax. I've also extensively researched errors in parenthetical languages.

The simplicity of parens is widely regarded as a strength, but I believe it is also a weakness. There is too much regularity in the syntax, resulting in numerous errors. Programmers, especially beginning programmers, need more "guideposts" in their programs. Additionally, parsers also benefit from this when producing errors.

The developers of Pyret are drenched in Lisp syntax; we know it profoundly well. Pyret is a conscious attempt to fix what we regard as flaws with that syntax.

I disagree about Lisp's syntax being "just the syntax tree"; it is instead, as you say a little later on, one specific way of writing linear strings of characters that correspond to syntax trees. Of course, any other unambiguous grammar is also just one specific way of doing that. Lisp's syntax is not special because it somehow magically corresponds to parse trees where other syntaxes do not; rather, in the case of Lisp the correspondence is simpler than for other languages. You make it sound like Lisp does not need to be parsed, which is clearly false; it's just easier than most (but not all) other languages.

At Brown (the place this research is coming from) one of the main CS intro courses is taught in Racket. It's worked really well and people enjoy the simplicity in getting programs running quickly. People with previous programming experience have more trouble than true beginners at that point in the course.

The course later moves into 3 other languages (OCaml, Scala, and Java) to try and get the beginners not focused on any particular language, though there is some debate on whether that effectively teaches the beginners any language particularly well.

Pyret would definitely be a candidate for replacing OCaml in that sequence as anything with better error messages would be very welcome.

I am not a fan of that sequence of languages, and the fact that it seems to keep changing every so often shows that the course(pair) hasn't yet found a canonical, good answer.

However, I don't see any reason why Pyret needs to be the _second_ language. OCaml is brought in to introduce types (amongst other things) because Racket doesn't have them. Pyret eliminates that need.

There are other reasons, too. But this sounds like a debate we should be having in a hallway, not on the Web. (-:

The key word is beginners. The people who have the most trouble with Lisp syntax are those already accustomed to other languages. Beginners pick it up with no trouble at all.

This claim has rarely been formally studied, and what data we do have does not entirely bear this out. I have taught Lisp syntax for 24 years, and it was my experience with doing so that led me to design Pyret.

Lisp syntax is (<operator> <args...>) with a few special forms (def, if, cond etc.). This is really easy to understand and allows the user to go straight forward and learn about the semantics and general programming practice immediately.

That same regularity also means there are few guideposts as a student programs, and this makes it much harder to recover from mistakes. We've looked at a lot of data on this (and have taught Racket at every level for decades).

This reminds me a bit about a similar problem in Haskell. The syntax is very whitespacey so lots of things that would usually be syntax errors in other languages end up being interpreted as invalid function applications, leading to complex type-errors.

Absolutely. For instance, if you leave off an argument in a recursive call, Haskell will quietly think you meant to curry the called function, which can manifest itself as a type error in some entirely different place.

If you need guideposts, use an editor with paredit support and rainbow delimiters. This makes Lisp feel far more intuitive and tree-like than normal.

You seem to be under the impression I haven't programmed in Lispy languages. And our data are from students using DrRacket, which very much has paredit support. These are just inconvenient truths.

DrRacket is a very big language. How does it compare with using a much smaller language such as the Scheme used in SICP?

DrRacket is an IDE that supports many languages. The curricula that Shriram has used at Brown, and that others have used in many other places uses a sequence of smaller languages designed for teaching.

Our studies were on students using Beginning Student Language, the smallest of the languages built into DrRacket. It is much smaller than the Scheme used in SICP.

No, that's fairly normal. It's the whole reason Scheme, the second most popular Lisp, was created. It's lost favor to Python, Java, and C++ because those languages have more of a foothold outside academia and might be more useful to students who only take a couple of CS courses.

MIT used to teach intro to CS using Scheme, I thought. It's significantly simpler than Python.

How many people taking intro to CS at MIT do you think are beginners who have never programmed before?

Quite a few, actually, which is why the Scheme class ended up being awful in practice -- you'd get a bimodal distribution of students where some are already familiar with multiple languages (usually more mainstream than Lisp) and some aren't familiar with any, and the class, by necessity, targeted the trough right between those. So it was too fast for the students who never programmed before, and either too slow or too _different_ for the students who had, and served neither group very well.

The new (Python-based) classes handle this a little bit better, from what I hear.

The new course is so different in content from the old course that I don't think any pat comparison is meaningful.

At any rate, I went from being a staunch believer in the simplicity of Scheme's syntax to starting to think -- based on lots of observational data, and some preliminary studies (http://cs.brown.edu/~sk/Publications/Papers/Published/mfk-va...) -- that it's too simple (a failure of the "everything should be made as simple as possible, but not simpler" maxim).

But MIT's decisions are their own, and potentially peculiar to their specific curricular needs. Pyret was not influenced by them. Brown is proud to teach Racket and does so very successfully. Not every university is a dedicated follower of fashion!

I've seen an MIT Scheme class video, and it was painfully slow and the guy seemed to explain everything in painstaking detail. I've heard about the bimodal distribution, which I believe I witnessed among fellow students when I was taking classes. But I'm having trouble seeing how Scheme made it any worse or how the class was too fast.

And, Pyret looks like a terrible beginner language! I like PLT Scheme, but I can't see how anyone would ever want to use that.

Scheme makes it worse, I think, because the way in which you usefully teach Scheme to someone who's never programmed before is different from the way you teach Scheme to someone who already knows C++ or Java or something pretty well.

Scheme makes it difficult because it more quickly gets you to teaching difficult ideas. This is an issue about SICP that most people simply never grok. Most courses don't cover a third of the concepts (if that) of SICP in the same amount of time. That's the book, not the language (and the only "fault" of the language is that it lets the book get that far that soon).

Yeah, 6.001 never covered the bulk of what the book covers.

Still, I think Scheme is sufficiently different from mainstream programming languages that it's... like teaching git to a crowd composed half of SVN power users and half of people who have never copied a folder to folder.old in their life. You're balancing teaching "oh, here's how Scheme's different" from "oh, here's how you should have thought about it all along".

In particular I remember the OO system being pretty confusing at first, since my background, from high school, was C++. It makes a lot of sense that Scheme gets out of your way and lets you implement a nifty OO system using just closures, but to someone who expects a language to treat objects as first-class and lambdas not, you're inevitably going to be trying to learn the SICP OO system by comparison to the C++ or Java one.

All that said, I'm not blaming the language at all. I think the quickest fix would have been for MIT to offer either a placement exam or voluntary registration for two classes, one for students who were more-or-less new to programming and one for students with a strong background in something like C++ or Java. Both classes could have used SICP as a text and worked fine.

I completely agree. In fact, this sort of two-track approach is what universities like Brown, WPI, etc. now do. Using the same ("weird") language at both levels works fine so long as you separate the crowd. Otherwise, as you very astutely point out, the two groups are asking completely different questions and answering them clearly and consistently is absurdly hard. Too bad MIT didn't think of this (or, presumably, did and rejected it for whatever good reason, that somehow doesn't seem to be applying to other places like Brown).

I don't know about this class in particular, but the speed of a course does matter. I guess programming is one of the harder courses to teach?

I've programmed in and taught Lisp syntax for 24 years, much of it exclusively in that syntax. I've also extensively researched errors in parenthetical languages.

The simplicity of parens is widely regarded as a strength, but I believe it is also a weakness. There is too much regularity in the syntax, resulting in numerous errors. Programmers, especially beginning programmers, need more "guideposts" in their programs, and parsers benefit from those guideposts too when producing error messages.

Like it or not, universities have started to move away from teaching Scheme to introductory CS students. You may remember that much was made of MIT and Berkeley switching away from Scheme for Python. I think Pyret comes out of a precedent of Python as the new standard, so in that light, it's certainly an improvement.

That's because they stopped teaching CS; it's not that they replaced Scheme with Python in CS courses. How can you possibly teach metacircular interpreters, constraint-propagation languages, ambiguous computation, and similar constructs using Python, when you need to be able to easily define completely different language semantics in the language at hand? In that sense, Pyret will allow them to teach a bit more CS than is possible using ordinary Python, but it is still in no way fit for something on the level of SICP.

The textbook that accompanies Pyret (http://papl.cs.brown.edu/2013/) covers many of these topics: the second half of this book is a full-blown programming languages text (formerly a stand-alone book known to some as PLAI).

I fully agree that you can't cover these things well in Python, and most Python textbooks don't, because of the poverty of datatypes and the difficulty of creating new structured ones.

However, while I think SICP is the greatest computer science book ever written, it has its own share of blind spots (for a simple example, see [the lack of] types or pervasive specification and testing). So the above book takes a somewhat different take on these issues. But it hews closer to SICP than any Python book I've seen.

The university where I graduated from in 1999, still has lots of CS lectures, including lots of compiler design lectures.

We never used Lisp based languages on our lectures, rather Caml Light (nowadays OCaml) and Prolog.

Somehow, it looks a lot more like Lua with a bunch of type checks and tests added than like Python. No named parameters, no generators, no comprehensions that I can see, no focus on iteration in general, no significant indentation, etc.

Instead: "end" syntax, unified number type, all blocks produce new scopes (not just functions), local variables are explicit instead of default, etc.

That's a useful comparison, thanks. While we were certainly inspired by Pythonic syntax, we have ended up somewhere slightly different.

The note about local variables is particularly important; we actively don't want Python's model of variables and scope.
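
The Python behavior being rejected here is presumably the combination of function-level scoping and late-binding closures (my reading of the comment, not something the authors spell out); a small illustration:

```python
# Python's function-level scoping: the loop variable leaks out of the loop,
# and every closure created inside the loop shares that one binding.
fns = []
for i in range(3):
    fns.append(lambda: i)

print([f() for f in fns])  # [2, 2, 2], not [0, 1, 2]
print(i)                   # 2 -- i is still in scope after the loop ends
```

In a language where each block introduces a fresh scope and locals are declared explicitly, each iteration's binding would be distinct and the variable would not survive the loop.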

Note: this wasn't a criticism of your language: I rather prefer Lua's way of doing numbers, co-routines, and even scoping to some degree. I was more criticizing your comparison.

The language itself looks pretty slick, and I am a sucker for gradual/optional typing. Do the type annotations result in performance/compiler optimization, as in Julia?

> Do the type annotations result in performance/compiler optimization, as in Julia?

We've been designing the static system with exactly this in mind. The current implementation doesn't do anything fancy yet, but that's just because we've been getting off the ground and focusing on ergonomics and curriculum first. Getting performance in return for types is absolutely where we're going.

It's shocking to me that so many languages designed after Scheme get variable scoping wrong.

Indeed, but apparently, it doesn't have multiple return values, and I don't know if there is an equivalent to metamethods.

It's easy to obtain the equivalent of multiple return values by using a literal object.

I've been burned by multiple return values numerous times in Scheme and Racket. They're very subtle and hard to make performant. It's one of those language features that very much does _not_ pay its own way, so you have to really, really need it to want to put it in your language. My view is that its uses are not many, and most (all?) can easily be achieved with literal objects. That's why Pyret doesn't have them and is unlikely to get them.

Can you say more about how multiple return is "very subtle and hard to make performant" and leads to getting burned?

As far as I can tell the main advantage of multiple-return over literal objects is that callers who only care about the primary value can call the function the normal way, whereas if it's wrapped in an object they have to "pay" for it by explicitly extracting it. That's a nice-to-have but you seem to be saying that it isn't worth the trouble because the trouble is considerable. I'd like to hear more about why. (A friend is working on a language where this is an issue.)

p.s. This thread is just awesome. Thank you for engaging so extensively here.

1. Designing a performant multiple return values mechanism is very subtle. For instance, see J. Michael Ashley, R. Kent Dybvig: An Efficient Implementation of Multiple Return Values in Scheme. LISP and Functional Programming 1994: 140-149.

2. Write me the identity function. (No, really. Stop. Try. Then read on.)






I hope you didn't say it's

fun id(x): x;

because if F returns multiple values and G consumes them, I can't refactor G(F(...)) into G(id(F(...))), which is pretty much the definition of an identity function. You should see the true identity function in R5+RS Scheme....

3. The way to make simple functions like that continue to work is to make the return values not be multiple values but a single one (i.e., some kind of tuple). At which point...

4. Pyret's objects are very lightweight. They don't carry baggage à la those in JavaScript, etc.

5. If you have multiple values, every single library function has to be rewritten to work with them. What should map do? filter? fold? And the hundred other library functions? It's also really hard to remember this when writing every line of library code (see #2).

6. We'd need a new binding construct in the language to bind the multiple return values. That's yet more syntax design, but also, it's yet more opportunity for typos to turn into indecipherable error messages ("Expected two values, got one" is a horrible thing to read for someone who doesn't even know a function can return two values!).

So, I claim this is simply not worth the frequency with which you actually need precisely this, as opposed to just returning some sort of tuple or object. Also, once I've bound the return value (say to r), saying r.x and r.y instead of just x and y is really not a big deal.
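
To make the "just return a tuple or object" alternative concrete, here is a sketch in Python (standing in for Pyret, which isn't runnable here): because the composite result is one value, generic functions like identity keep composing:

```python
from collections import namedtuple

# One composite value plays the role of "multiple return values".
DivResult = namedtuple("DivResult", ["quotient", "remainder"])

def div(a, b):
    return DivResult(a // b, a % b)

r = div(17, 5)
print(r.quotient, r.remainder)  # 3 2

# Because the function returns a single value, the plain identity
# function works, and G(id(F(...))) behaves exactly like G(F(...)).
def ident(x):
    return x

assert ident(div(17, 5)) == div(17, 5)
```

Library functions like map and filter also need no special cases: a function returning a namedtuple is just a function returning one value.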

> 2. Write me the identity function. (No, really. Stop. Try. Then read on.)

In Lua:

    function id(...) return ... end
    a, b = id(1, 2) --> a == 1, b == 2
    c, d = id(3)    --> c == 3, d == nil
    e = id(4, 5, 6) --> e == 4, the rest is discarded.
I use it all the time :-)

The vararg system is built on a stack mechanism, and it is efficient.

Regarding map and other higher level functions, I'd only take the first return value.

If you have tuples it is indeed less of a concern. Better still if you have destructuring assignment, à la Julia.

I hope you didn't say it's fun id(x): x;

Well, I did. :)

That's an interesting example. My first reaction was that one might be able to construct a similar anomaly by abusing optional arguments, but perhaps not: optional arguments to F don't get magically passed on to G. Still, I feel like I've seen function-signature constructs that would break the identity function similarly to how multiple-return does in your example. Maybe some twisted aspect of Common Lisp...

No doubt. Things with optional arguments, keyword arguments, all sorts of contraptions like that. That's why we've studiously avoided such things.

Still, there's an asymmetry, at least to our minds, between calls and returns, so the argument put forth for multiple values ("a function can take multiple arguments, why can't it return multiple answers?" -- made especially compelling in a CPS context, because returns turn into calls) doesn't quite play out in practice.

Ergo, no multiple return values.

PS: Fun exchange for me, too!

The Ruby examples are unfair.

Ruby intentionally makes parentheses optional. So `do_something` is `do_something()`.

For the first example,

  - method_as_fun = o.my-method
  - method_as_fun(5) # not reached
  + method_as_fun = o.method(:my_method)
  + method_as_fun.call(5) # or method_as_fun[5]
And for the lexical scope thing,

  def f(x)
    g = ->y {x + y}
Or, explicitly use class variables,

  def f(x)
    @x = x
    def g(y); @x + y; end

If you have to rewrite code like this, you're making our point.

The Ruby way is:

  1. common (not all) things are simple / elegant.
  2. Advanced things are doable, usually require a slightly more complex syntax.
For example, Ruby allows one "block" (anonymous function) per method. Why not 2 or more blocks? Because Matz studied the Common Lisp standard library (likely; maybe something else) and noticed that in ~97% of cases (maybe not an exact number), one anonymous function is enough. But you can use more anonymous functions using a different syntax.

And here is the same case. People call methods much more often than they retrieve the method object. Therefore Ruby makes calling the method the default, but doesn't stop you from getting the method object. It's just unfair to say Ruby cannot do these things.

Thanks for the feedback! There are tradeoffs here, and I appreciate that it's a little glib to say things are necessarily right or wrong. I took down the scope example for now, but in relation to the method example, I posted some thoughts about why I disagree on this post:


See if that helps clarify my position for you.

I can't find any documentation on the type system. In particular, I'm interested in how objects are typed (that they're both structurally typed and may be abstract is novel), and the capability of the type refinements.

I loathe template-based OO due to the possibility of monkey-patching but the fact that objects are immutable looks like it might ease this. (Generating efficient code might still be difficult.)

I don't see the use cases for the cyclic structure as presented. If your data at all resembles a database relation, then (a) you have scalar keys anyway so you don't need language support, and (b) you probably need to key off of more than one column.

The "method-as-fun" semantics seem weird. Given this code:

    o = { x(self): self.y end, y: 10 }
    f = o.x
f() returns 10, as per the examples on the front page. But then:

    p = o.{y: 15}
    q = { x: o.x, y: 15 }
Clearly p.x() should return 15, but what does q.x() return? It doesn't seem clear whether it should return 10 or 15. (Without reading through the reference manual (it's late), my guess would be that method definition is special-cased, so that the answer would be 10.)
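
Python's bound methods give an analogous answer (a sketch of the analogy only, not of Pyret's actual semantics): extracting a method closes over the object it came from, so copying it into a new container does not rebind `self`:

```python
class O:
    def __init__(self, y):
        self.y = y
    def x(self):
        return self.y

o = O(10)
f = o.x           # extracting the method binds it to o
print(f())        # 10

# There is no functional-update syntax like Pyret's o.{y: 15} in Python,
# but building a new container from the extracted method shows the issue:
q = {"x": o.x, "y": 15}
print(q["x"]())   # 10: the copied method stays bound to o, ignoring q["y"]
```

If Pyret's method extraction works similarly, that would support the guess that `q.x()` returns 10.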

The type system hasn't been published yet.

We don't have monkey-patching. We've been burned too much by JavaScript and the like.

Cyclic structures: think about trying to teach graph algorithms. Let's say you believe it's important to write tests. Many graph algorithms aren't inherently mutational. But you need mutation just to create examples of your data. The graph construct gets around this problem entirely, so mutation and graph algorithms become orthogonal topics. In general, I'm a big believer in eliminating unnecessary curricular dependencies. Having taught graph algorithms this way for a few years now, I can't imagine going back.

See the notes on graphs in PAPL (papl.cs.brown.edu/2013/).
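
For contrast, a sketch in Python of why example data normally forces mutation: a cycle cannot be written as a single bottom-up expression, so the links must be patched in after the fact.

```python
# Without a special construct, closing a cycle requires mutation:
class Node:
    def __init__(self, name):
        self.name = name
        self.edges = []   # must be filled in *after* both nodes exist

a = Node("a")
b = Node("b")
a.edges.append(b)  # the cycle is closed by assignment, not by construction
b.edges.append(a)

assert a.edges[0] is b and b.edges[0] is a
```

A construct like Pyret's `graph:` lets the two mutually referencing nodes be declared in one expression, so test data for graph algorithms never needs assignment.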

But you need mutation just to create examples of your data.

OK, I'll buy this.

> Most “scripting” languages don't support annotations for checking parameters and return values

PEP 3107 introduced function annotations to Python 3. The following syntax is valid:

    >>> def square(n: int) -> int:
    ...     return n * n
    >>> square(3)
    9
Nothing is done with annotations by default. Here's an article discussing this "unused feature": http://ceronman.com/2013/03/12/a-powerful-unused-feature-of-...
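
Nothing stops you from consuming the annotations yourself. Here is a minimal, first-order-only runtime checker; `checked` is a hypothetical helper, not part of the standard library:

```python
import functools
import inspect

def checked(fn):
    """Check annotated parameters and the return value at call time.
    First-order only: annotations must be plain classes usable with isinstance."""
    sig = inspect.signature(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            ann = sig.parameters[name].annotation
            if ann is not inspect.Parameter.empty and not isinstance(value, ann):
                raise TypeError(f"{name} must be {ann.__name__}")
        result = fn(*args, **kwargs)
        ret = sig.return_annotation
        if ret is not inspect.Signature.empty and not isinstance(result, ret):
            raise TypeError(f"return value must be {ret.__name__}")
        return result

    return wrapper

@checked
def square(n: int) -> int:
    return n * n

print(square(3))  # 9; square("x") would raise TypeError
```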

Unless I'm missing something, all the examples in that PEP are first-order. There's no discussion of what the semantics is in the higher-order case. Pyret's annotations are perfectly well-defined and draw on a long history of research of getting these very subtle cases right (starting with Findler and Felleisen's ICFP 2002 paper). There's much more to this than just syntax.
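
To see why the higher-order case is subtle, here is a Python sketch of the contract idea from that line of work: a function-typed argument cannot be checked up front, only wrapped so that each call is checked when it happens (`arrow` is an illustrative name, not any real API):

```python
def arrow(dom, cod):
    """Contract for a one-argument function of type dom -> cod.
    The check cannot be performed eagerly; the function is wrapped and
    each call is checked as it occurs (a crude sketch of the idea)."""
    def attach(f):
        def wrapped(x):
            if not isinstance(x, dom):
                raise TypeError(f"argument not {dom.__name__} (blame the caller)")
            y = f(x)
            if not isinstance(y, cod):
                raise TypeError(f"result not {cod.__name__} (blame the function)")
            return y
        return wrapped
    return attach

# twice expects g : int -> int; all an annotation checker can do is wrap g.
def twice(g):
    g = arrow(int, int)(g)
    return g(g(1))

print(twice(lambda n: n + 1))  # 3
```

A full treatment also has to track blame correctly through nested wrappings, which is exactly what the contract literature works out.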

A lot more Ruby-like than Python. Not sure why it has both ":" and "end", and the "|" is kind of ugly the way it's used.

This doesn't seem like it would be easy for novices to grasp, though it looks like a solid language. For teaching, I think Quorum (http://www.quorumlanguage.com) seems like a stronger alternative (disclosure, I am a member of a CS senior project team building their web system).

