Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: Risp – Lisp in Rust (stopa.io)
314 points by stopachka on May 2, 2019 | hide | past | favorite | 84 comments



> `List(Vec<RispExp>)`

The big question when implementing a runtime for a Lisp-like language in Rust is how the interpreter will interact with the GC. It looks like this project has avoided most of the complexity by not implementing actual lists, and allowing multiple references to the same value to exist only in restricted circumstances, hewing close to Rust's ownership model rather than a typical Lisp's. I wonder, is the plan to stay on this side of the space, as a lightweight language with potentially smooth FFI interoperation with other low-level/Rust code? Or to rewrite almost everything to support the high-level semantics of list-processing languages, with their emphasis on shared structure?


The way I’ve approached it in my compiler [1] is that garbage collection can never occur while Lisp/RFI (Rust function interface) code is executing. Once the program returns to the event loop GC can occur with a small number of known roots. This works because there will be no way to block without waiting on a future which can then return to the event loop. I’m considering adding a `conditional-gc` macro for long running computations that’s internally implemented as waiting on a future that’s immediately resolved.

This means that RFI code can treat GCed values as having a `'static` lifetime. One caveat is that RFI code cannot capture values except for a few special runtime functions. This works naturally with the language’s pure functional design, however.

As far as data structures go its impossible to support performant Lisp code by implementing lists as native vectors; the guarantees and idioms are too different. My runtime [2] provides Rust bindings for data types that e.g. allow you to create an `Iterator` from a list or construct a list from one. That should allow transparent interoperability with idiomatic Rust patterns and data structures.

[1] https://arret-lang.org/

[2] https://rustdoc.arret-lang.org/arret_runtime/index.html


> I wonder, is the plan to stay on this side of the space, as a lightweight language with potentially smooth FFI interoperation with other low-level/Rust code?

I personally hope this is the case; I feel like Lisps in that space are scarce. Strange interactions between the GC and "alien" functions are one of the things that make FFI painful even when using a Lisp with otherwise-thorough FFI support (like Chicken Scheme or most Common Lisps).


I think ultralight scripting languages are an interesting quadrant of language space, but I'm not sure the strengths of Lisp would be evident in a language without lists.


Fair point. Perhaps "Lisp" would be a misnomer, but I do think there's value in an S-expression-based scripting/configuration language even if that language doesn't insist on revolving entirely around linked lists and the semantics/quirks thereof. Seems like vectors/tuples/structs should be just as valid as lists for a program to represent itself.


Great question firethief! I think both approaches -- a) sticking close to rust and supporting beautiful interop, or b) creating our own GC and using interesting data structures (I've wanted to implement persistent vector tries for a while) -- could be very cool.

For now, this was more of a toy project. If you have an interest in pushing either of these directions, would love to talk!


> Finally, I have to say, I loved using Rust. It’s the least mental overhead I’ve had to maintain with a systems language, and it was a blast to use.

This mirrors my experience learning Rust while working on the C2Rust translator (HN discussion https://news.ycombinator.com/item?id=17381946). Great read!


Awesome, thanks for the kind words dataking! :)


Passing both a &[String] and an index into the function is duplicative, because a slice already has an index in it (the start point). It would be cleaner to just use `tokens.split_first()` to yank the first token off the list, which gives you both that first token plus the remainder, then you can pass the remainder into subsequent calls, and return the remainder at the end when you're done. Or, heck, you could even take a `&mut &[String]` instead, allowing you to mutate the slice directly and not have to return anything.


This is a great suggestion eridius! I'll update it to use split_first.

Note: updated the essay and left a thanks to you :)


I love the article, I actually tinkered with something like this a few moths ago.

I tried to navigate back to HN using the back button, however it was inoperable. I found that passing every section heading on this article adds a line in your browsing history. Please don't mess with my browser history while I'm just scrolling. If I click on something that's fine, but not while I'm scrolling. I already have a history of where I am in your article, it's called the SCROLLBAR!


This is a fairly recent addition to medium I think, but I also hate it a lot. This kind of behavior can make sense on news sites that load another article when you get to the bottom of the page, but adding history entries for each header you haven't interacted with is beyond words.


re: article -- thank you for the kind words TehCorwiz :).

re: scrollbar -- oi -- definitely did not intend that. Will look into this (I am hosted medium, so don't have much ability to move things around, but will see what I can do)


Thanks for the response. Yeah completely understand.


I would’ve went with “Lust”...


I find humor in the choice of “risp” given the obvious alternative.


Sorry, that name is reserved for the next step in the project - a parentheses-heavy input syntax for Rust, or more plausibly, for the simplified representation thereof known as MIR. Its biggest impact on the tech industry will actually be to significantly broaden the appeal of the Rust Evangelism Strike Force; indeed, it's expected that very few programs will ever be written in Lust, and that even the programs that are written in the language will be largely toy examples! But the overall effect has nevertheless been considered to be highly worthwhile, so the project will go forward.


Just curious: Isn't MIR not a moving target with lots of changes going on? Perhaps a direct translation to Rust syntax would be a better choice.


MIR is not stable, that’s correct.


eyes widen...nods -- ah, amazing name!


Not too late to rename it, hmm?


blush -- my defense is that, I had to create a loot of gists on github to publish on medium, and it would take a while to change the name xD


I'm partial to scheme.rs

[edit] Which apparently does exist: https://github.com/isamert/scheme.rs


R7.RS


I am positive someone would complain and ruin the fun for the vast majority of us normal people.


It would definitely hinder adoption in business.


It's long been a dream of mine to make some software package that would be immensely useful and offer it for free with the catch being that the name is something childish and embarrassing and offensive.

And then maybe offer a version of it with a nice, normal, appropriate name, but charge money for that version.


"git"


"fruity loops" had to rename to FL Studio


BeautifulSoup!

It even comes with more "enterprisey" class-name aliases, if you need them.


What's wrong with "BeautifulSoup"? Is it a euphemism for another term with the initials "BS"?


“Google”


I worked on a product sold in the financial industry, and we wanted to ship it with a built-in 3rd-party debugger. The execs complained, so we had to buy and repackage the debugger source as "debug utility".


I thought this was just going to be another forgettable "X implemented in Y", but this is actually a Rust-flavored version of Peter Norvig's classic "(How to Write a (Lisp) Interpreter (in Python))", which is an absolute treasure. Glad I didn't just skip over this!


Thank you for the kind words kibwen :)


If you like Rust and Lisp this is also a cool project: https://github.com/carp-lang/Carp.


oo, very interesting! will take a deeper look, thanks! :)


I did almost the exact same thing (follow Norvig's Lisp tutorial, but in Rust) at one point, though I didn't get as far: https://github.com/eamonnmr/atmoslisp


Awesome EamonnMR! Hope it helps kick it back up :}


Embedding Lisp as a scripting language is a nice way to balance performance vs. expressive power from my experience.

I recently started working on something similar [0] in Go.

[0] https://github.com/codr7/g-fu


Cheers codr7!

You're project looks great -- you even got channels! Can't wait to go deeper.


I sort of got them for free :)

g-fu adds task isolation though, more along the lines of Erlang than Go.

Remixing Lisp is one of the most interesting things you can do in software as far as I'm concerned.


There is also quite older, but still maintained other implementation https://github.com/murarth/ketos


I've used ketos for https://shopify.github.io/shadowenv. It was relatively nice to work with and has pretty good sandboxing/resource limiting support.


this looks quite fully featured! thanks for the link :)


A bit of feedback to the author: I cringe really, really hard when I see things like:

  #[derive(Debug)]
  enum RispErr {
    Reason(String),
  }
One of the beauties of Rust's type system is that you can combine meaningful error types with the From trait to build zero-effort error handling. When you turn every error into a String, you lose information that future programmers (including yourself!) will be glad to have. I know it can be a little bit painful to get used to, but once you do, meaningful error handling will be at your fingertips.

To be helpful to readers of this comment and not just critical of the author, I recommend you don't overthink your error types when you're first starting to write your program. Think of the contents of the enum like a "scratchpad" where you list every failure mode you've encountered so far. While you're still developing, you can greedily match on every variant in your match statements, and it's fine. As you do this, you'll begin to notice themes that will inform the eventual refactor. Best of all, once you do refactor, you won't have to re-understand your code to tease out the various kinds of failures - you'll already have a list of them!


Because I'm on my soapbox, I'll also recommend implementing ToString[1] right off the bat, too. That way you can print out useful error messages immediately.

For example, if your errors are:

  enum RispError {
      /// Syntax error returning line number and char
      SyntaxErr(u32, u32),
      /// Parens not balanced; contains number of parens needed
      UnbalancedParens(usize),
  }
You can impl ToString straight away:

  impl ToString for RispErr {
      fn to_string(&self) -> String {
          match self {
              RispErr::SyntaxErr(l,c) => format!("syntax error at line {}, col {}", l, c),
              RispErr::UnbalancedParens(n) => format!("use your % key more! add {} more paren", n),
          }
      }
  }
Then, when handling the error, you can

  match do_thing() {
      Ok(v) => { ... },
      Err(e) => { // TODO handle error
          println!(e)
      }
  }
where do_thing's signature looks like

  fn do_thing(...) -> Result<T, RispErr>;
It takes the same amount of code, but all your error messages are in the same place, you haven't lost information about them, and refactoring to handle them becomes super easy.

(n.b. I dashed this comment off without actually compiling the above, so please forgive any dumb errors =] )

[1] https://doc.rust-lang.org/std/string/trait.ToString.html


Instead of implementing `ToString` directly, you should implement `Display`. A `ToString` implementation will automatically be added, and you get all the formatting goodness out of the box.


That's a good idea! I don't often impl Display so it didn't even occur to me.


As a Lisp programmer, the way most other languages handle errors make me cringe. How could I even make a decent ice cream sundae without error recovery? But I do enjoy that 'make a Lisp' is a like a sport in every other language.

http://www.lispworks.com/documentation/HyperSpec/Body/f_cerr...

http://www.lispworks.com/documentation/HyperSpec/Body/m_rst_...


Oof, I love this — thank you thenewwazoo!

I gotta get to sleep, but will either update the post or add a note in the morning.

And of course, will do this in all future rust projects :D


Glad to help! I am a huge huge fan of Rust. I'm almost always hanging out on ##rust and ##rust-beginners on Freenode if you want to ask questions synchronously. :)


Use the "?" operator. That way, you only handle errors one time (probably at the root of your program).

See here: https://github.com/omarabid/rust-starter/blob/master/src/mai...


included this as a note in the essay. I think a quick blog post about this approach would be quite useful actually, if you're up for it thenewwazoo!


YAGNI, KISS. Lots of code never ends up caring about differentiating between different error conditions. A lot doesn't even care if the code it called failed - maybe it'll log, but the caller can continue regardless. And for a parser, I'm more likely going to want a struct with file/line info (or a full blown list of context) than a bunch of hardcoded enum cases. Having the enum in place to easily identify refactoring points is nice, but having a bunch of mostly unused variations inside the enum will mostly just distract me, and slow me down, giving me more code to re-understand regardless.

In the cases where I do want to distinguish between error cases, I'd still rather wait until I have the code consuming those error enums to help guide what errors are meaningfully different - evidenced by needing different error handling paths - instead of making every caller decide which of the dozens or hundreds of edge cases they want to be handled which way(s).

There are cases where your approach of distinguishing cases up front would be better - as an example, if you're limited in your ability to refactor due to the interface being part of a public API with semver restrictions to your refactoring - but there's plenty of cases where the author's approach is the correct one, where you should be cringing if you don't see things kept as simple as RispErr.


I feel there's a special type of irony in criticizing a Lisp implementation for not being sufficiently type-safe.

(Or maybe it's just a generic irony ... I can't tell yet.)


Oh, I know. :) I mostly spoke up because of my own journey with Rust, and knowing how important articles like these could be to people new to the language. I totally understand why Stringifying errors is appealing in Rust but breaking the mental habit is harder than learning good habits straight away.


Thanks for the feedback thenewwazoo!

This sounds quite cool -- if you have the time, would you mind giving me a quick example? Want to make sure I have a full understanding of what you mean.


Whenever I've read into error writing and handling I inevitability get lost in the weeds. A tl;dr summary like you provided below would be a really helpful writeup in general, especially if it showed how it's just as easy as string errors and showed how much you get for free when doing things the right way.


One interesting way to extend this is to implement ‘call with current continuation’ or call/cc from Scheme. It’s a great challenge. I’ve done it in the past with rewriting the interpreter to continuation passing style. This also make it easy to implement breakpoints and a whole debugger for the interpreter. I still have it running here: https://csokavar.hu/projects/mini-scheme/


Great work encse -- love that you have a playground online for your language!


> It’s the least mental overhead ...

Arguments to values in Risp:

  fn env_for_lambda<'a>(
    params: Rc<RispExp>, 
    arg_forms: &[RispExp],
    outer_env: &'a mut RispEnv,
  ) -> Result<RispEnv<'a>, RispErr> {
    let ks = parse_list_of_symbol_strings(params)?;
    if ks.len() != arg_forms.len() {
      return Err(
        RispErr::Reason(
          format!("expected {} arguments, got {}", ks.len(), arg_forms.len())
        )
      );
    }
    let vs = eval_forms(arg_forms, outer_env)?;
    let mut data: HashMap<String, RispExp> = HashMap::new();
    for (k, v) in ks.iter().zip(vs.iter()) {
      data.insert(k.clone(), v.clone());
    }
    Ok(
      RispEnv {
        data,
        outer: Some(outer_env),
      }
    )
  }
Arguments to values in TXR Lisp (interpreter), written in C:

  static void do_eval_args(val form, val env, val ctx,
                           val (*lookup)(val env, val sym),
                           struct args *args)
  {
    for (; form; form = cdr(form))
      args_add(args, do_eval(car(form), env, ctx, lookup));
  }


They're not comparable. The C version contains no error handling, and off-loads most of its work to another function.


What work does it offload to another function that the Rust function doesn't also? We can make a table:

  C function   Rust analog
 
  args_add     data.insert
  do_eval      eval_forms
Rather, the Rust function actually off-loads the evaluation work to another function, eval_forms. I should be comparing that one to do_eval_args:

  fn eval_forms(arg_forms: &[RispExp], env: &mut RispEnv) -> Result<Vec<RispExp>, RispErr> {
    arg_forms
      .iter()
      .map(|x| eval(x, env))
      .collect::<Result<Vec<RispExp>, RispErr>>()
  }
That's a lot smaller, but still very noisy. My eyes bleed!


FYI, you don't need to use the turbofish to hint `rustc` what time you need to `collect` here because it takes it from the return type:

    fn eval_forms(arg_forms: &[RispExp], env: &mut RispEnv) -> Result<Vec<RispExp>, RispErr> {
      arg_forms
        .iter()
        .map(|x| eval(x, env))
        .collect()
    }
You could also make the stylistic change (but not necessarily good change) of wrapping eval to make it less wordy:

    fn eval_forms(arg_forms: &[RispExp], env: &mut RispEnv) -> Result<Vec<RispExp>, RispErr> {
      let eval = |x| eval(x, env);
      arg_forms
        .iter()
        .map(eval)
        .collect()
    }
At that point, it all fits in one line

    fn eval_forms(arg_forms: &[RispExp], env: &mut RispEnv) -> Result<Vec<RispExp>, RispErr> {
      let eval = |x| eval(x, env);
      arg_forms.iter().map(eval).collect()
    }
And you could make it so that you don't need to call `iter` inside of the fn on `arg_forms` and potentially accept many things besides slices:

    fn eval_forms(
      arg_forms: impl Iterator<Item=RispExp>,
      env: &mut RispEnv,
    ) -> Result<Vec<RispExp>, RispErr> {
      let eval = |x| eval(x, env);
      arg_forms.map(eval).collect()
    }
:)


Try this instead:

    fn eval_forms(arg_forms: &[RispExp], env: &mut RispEnv) -> Result<Vec<RispExp>, RispErr>> {
      arg_forms.iter().map(|x| eval(x, env)).collect()
    }
Or, if you don't like return value polymorphism, then:

    fn eval_forms(arg_forms: &[RispExp], env: &mut RispEnv) -> Result<Vec<RispExp>, RispErr>> {
        let mut exps = vec![];
        for x in arg_forms {
            exps.push(eval(x, env)?);
        }
        Ok(exps)
    }
It looks no more noisy than your C code, and I can't actually tell whether your C code is doing error handling. Is it? If not, try adding it. Now which is noisier?


It's at least twice as noisy. It has more operators, more different punctuators, < > brackets in addition to ( ) and { }.

Not sure what error handling you have in mind, but it's robust. We can interactively call it from gdb with some garbage values:

  (gdb) r
  Starting program: /home/kaz/txr/txr-dbg 
  This is the TXR Lisp interactive listener of TXR 215.
  Quit with :quit or Ctrl-D on empty line. Ctrl-X ? for cheatsheet.
  1> (raise 5)

  Program received signal SIGTRAP, Trace/breakpoint trap.
  0x00132416 in __kernel_vsyscall ()
  (gdb) p do_eval_args(9, 9, 9, 9)
  Too few arguments in function call.
  (gdb) p do_eval_args(9, 9, 9, 9, 9)
  ** car: 2 is not a cons
  ** during evaluation at expr-1:1 of form (raise 5)
  ** run with --backtrace to enable backtraces
  2> _
The function relies on args having enough room for all the values; both callers ensure that. The lookup function can't be wrong, either.


What a strange comment. When compared to my first Rust snippet in my previous comment, your C code has more parens, the same number of brackets, more semi-colons, more asterisks and more inscrutable function pointer syntax. Of course, I don't actually find C's syntax all that noisy, but I don't find Rust's syntax particularly noisy either. Probably because I know both languages. But at least be fair and try to look at both with virgin eyes instead of only one of them.

The fact that it does error handling is not at all clear from its type signature, in contrast to Rust's function signature. So that's going to contribute "more noise" from your perspective, but on the flip side, it also conveys more information. Based on your demonstration, it looks like your function just aborts the program on an error, but the Rust function is a bit more versatile. It gives the caller a choice of how to deal with an error. Otherwise, I could just write this instead:

    fn eval_forms(arg_forms: &[RispExp], env: &mut RispEnv) -> Vec<RispExp> {
      arg_forms.iter().map(|x| eval(x, env)).collect()
    }


Since we're working on Lisp internals, we can accept a few parentheses.

The code has two asterisks, both in parameter declarations, indicating pointers.

It has exactly one binary operator in the body, the assignment = denoting the one local side effect (stepping the iteration variable of the simple for (;;) loop).

All else is simple function calls. Except for the function pointer declaration, and perhaps not knowing car and cdr, an Awk or JS programmer might grok this.

The program doesn't abort; the exception was caught in the REPL. We were thrown right out of the GNU Debugger where we caused the problem, and back into the Lisp REPL. We can demonstrate that in other ways, like this:

  (gdb) r
  Starting program: /home/kaz/txr/txr-dbg 
  This is the TXR Lisp interactive listener of TXR 215.
  Quit with :quit or Ctrl-D on empty line. Ctrl-X ? for cheatsheet.
  1> (catch (raise sig-trap) (error (x) (put-line `caught error: @x`)))

  Program received signal SIGTRAP, Trace/breakpoint trap.
  0x00132416 in __kernel_vsyscall ()
  (gdb) p car(9)
  caught error: car: 2 is not a cons
  t
  2> (+ 2 2)
  4
  3> _
gdb gets confused here, though:

  3> (exit 0)
  [Inferior 1 (process 8529) exited normally]
  The program being debugged exited while in a function called from GDB.
  Evaluation of the expression containing the function
  (car) will be abandoned.
Quite understandably, it doesn't understand the exception handling and didn't notice that we jumped out; it still thinks we are executing the car function. Oh well!


Lisp in Rust? Have we reached peak HN?


Nah, that will happen once someone implements a Javascript compiler written in a Scheme dialect with a VM which is written in Rust, with the entire stack compiling to WASM so it can run in the browser.


`tokenize` function looks weird. Why use replace and then split? How would you extend this to stirngs (which might contain "()" chars)? It might be OK in a toy example, but it would have been better if the author wrote an actual Reader (not familiar with Rust, but I guess it should have some readers that allow peeking 1 char ahead OR something like Java's PushbackReader).

Then `parse_list_of_floats`, if I am reading it right, this function is fail-safe, not fail-fast, right? It won't panic upon the very first reading error?

Why eval makes environment lookups? Shouldn't environment itself know how to perform a lookup?


It's basically a clone of the tokenizer from this python implementation https://norvig.com/lispy.html which was referenced in the introduction.

The same code has been seen in a bunch of simple/similar lisp interpreters in different languages.

To be honest I'm not sure if that was the first documented tokenizer using this simple approach, but it is definitely a common pattern for such things - and as you say it is naive at best, and buggy at worst. That said it is easy to get it working, and later fix it properly.


RispEsp is a tree using value semantics. How will RispExp evolve to support e.g. circularly linked lists? How will that interact with Rust's ownership requirements?


Glad to see the rule being proved, albeit intentionally :)

> Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule


In Lisp notation that would be (In List Rust) or maybe (In Rust List).


Heck yeah Bill :) -- you know what, will update the title on medium!


(reduce Rustify Lisp)


Rust is the best lisp ever invented. Why bother another lisp inside it?


I wonder why not call it Lisp + Rust


Perhaps because it's ... boring?


Should've gone with Lust it's way better than Risp.


Why is this named Risp? Lust is much better...


Hahahaha Lust !




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: