A Programmable Programming Language (acm.org)
283 points by samth on Feb 24, 2018 | 114 comments

I have been building a new not-exactly-a-programming-language using Racket and it has been an absolute pleasure, and the community has been extremely friendly and helpful.

One stumbling block I encountered along the way is that I could not find documentation covering the full setup process for creating a new packaged language (I have been known to fail hard at finding the right documents, so this may just be my problem). https://beautifulracket.com/ has excellent coverage of the general process, but the differences between #lang br and #lang racket/base are just large enough that I couldn't fill in the gaps without looking through multiple repos to see how others had laid out theirs, and only then could I intuit which setup commands I needed to run.

If I find time I intend to write up my findings and submit them to the documentation. The short version is here in case someone finds it useful.

  ├── .git
  ├── my-lang       ; run `raco pkg install` here
  │   └── info.rkt  ; meta package depends on my-lang-lib
  └── my-lang-lib   ; run `raco pkg install` here
      ├── info.rkt  ; include actual dependencies
      └── my-lang   ; implementation goes here
          └── main.rkt  ; in theory you can put your whole implementation here
Once that setup is complete you should be able to use `#lang my-lang`.

Thank you for the link. Reading the example on the creation of stacker.rkt (a toy reverse polish language) was dramatically more enlightening than the main article. As I understand, Racket = source-to-source interpreter with a lot of helpful shorthand for defining syntax of {insert language here} or arbitrary language you make on the spot. Do they already have libraries for major languages like c++, JS, python? Would be nice to write in a syntax I like (perl) and turn it into c++ that compiled.

Racket isn't really a source-to-source compiler/interpreter (you could try to write a compiler that converts Racket into your language of choice, but that can often be quite difficult). Everything ultimately has to pass through Racket's language semantics, so matching implementation details requires a lot of work if the underlying language spec is large. The #lang C example down below sidesteps these issues because it acts as a pass-through to the system C compiler and then handles returning those results into Racket; essentially, it delegates the language semantics to the C compiler and only deals with how to map the results of calling C code back onto Racket objects. The FFI has a similar issue: if you want to translate from one language to another, you still have to go through the process of mapping language semantics.

Two examples where languages/parsers have been implemented in Racket:

Algol60 is the best and most complete example: https://docs.racket-lang.org/algol60/index.html. A Python parser has been implemented: https://github.com/dyoo/ragg/blob/master/ragg/examples/pytho...

Thank you for the clarification. Even after two articles I obviously still didn't fully get it. Appreciate the help, and the example links.

Racket uses macros to transform your language’s syntax into valid racket code, which is then interpreted.

No, this is far too glib, and the last part is wrong. See https://www.hashcollision.org/brainfudge/ instead.

Try reading https://www.hashcollision.org/brainfudge/ and see if that helps.

More than just general purpose languages, the article highlights how creating small languages that integrate well into the host language is a powerful way to layer your system.

I've experienced this first hand in making a DSL for sophisticated video editing and "style authoring" [1] in the context of muvee, and it was all much inspired by Racket (then known as MzScheme). You can tell from the fact that the docs were written in yet another embedded language, "scribble", that's available with Racket/MzScheme.

The speed of iteration and ease of discovery and abstraction of patterns in lisp/scheme is hard to do justice to in a descriptive article. The Racket team has taken that to a whole new level that you'll be hard pressed to find in any other "batteries included" system.

[1] https://srikumarks.github.io/muvee-style-authoring/ (disclaimer: I used to work for muvee)

I also found the stacker example to be the clearest, because it really tackles the basics. I am however also confused as to whether Racket can transform any type of language. Indeed, how about interpreted-only languages such as for example J? Would the Racket JIT not be an obstacle in such a case?

(Shameless plug for my askHN on this exact question I have been wondering about: https://news.ycombinator.com/item?id=16456562)

Racket includes a lazy language/library, and you can see some of the side-effects from the compiler there: #lang lazy is incredibly slow compared to #lang racket.

Racket probably can transform any kind of language, but some seem to require extra work to get them to run smoothly, by which I mean you may have to write your own compiler as well as the rest, before you see real performance.

However, #lang datalog is pretty much a pure interpreter, and it does have acceptable performance. So it should depend a bit on the language's fundamentals.

Speaking of J, there is some work on getting it to play nice with Racket [0], though it is incomplete, and I have no idea if the work is ongoing.

[0] https://docs.racket-lang.org/j/index.html

You're missing a reader, I think.

That was the 'in theory' bit. You can create a reader module inside main.rkt using (module+ reader (provide read-syntax)) where read-syntax can be defined in main.rkt or required from elsewhere. Good point though since I actually don't know exactly where to put a reader.rkt file to get the equivalent behavior of defining it as a submodule inside of main.rkt.

You put it in a `lang` subdirectory (so `my-lang/lang/reader.rkt`).

I don't think I've seen any convincing examples yet of how writing a new language or DSL is superior to the already-very-flexible idea of user-defined functions, supported by almost all languages. Isn't a DSL just syntactic sugar in most cases?

Functions are verbs, usually named in problem domain terms, that can operate on a huge variety of nouns (data) also namable in the problem domain. So I don't buy that existing languages can't represent the problem domain, or force programmers to only do things in machine or traditional-language terms.

Don't want to be a spoilsport either, glad people are researching and experimenting with new approaches - just not sure I'm convinced yet that the "make a DSL for everything" idea is all that promising or different from what we already do as programmers.

> Isn't a DSL just syntactic sugar in most cases?

I don't think so. Maybe with things like LINQ, but in many cases a DSL lets you abstract things to such an extent that you get a new way of thinking about the problem. That's why DSLs are almost always very high level languages.

SQL and regex are the obvious counterexamples, where you don't describe how to return the data you're looking for or how to store it on disk, you just describe that data and the language does the work for you. That's a completely different way of thinking about a problem compared to procedural logic. The languages allow you to describe very complex data in an explicit and unique manner, and in doing so allow you to abstract what the data is from how the data is maintained. Obviously, that can bite you if you structure your DB poorly, but on the flip side it allows people to learn how to use very complex descriptions of what are essentially abstract concepts instead of needing to know details that are extremely easy to screw up. Declarative programming is extremely powerful because of this, and it allows for extremely complex systems to be built on top of this new layer of abstraction.
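To make the declarative-versus-procedural contrast concrete, here is a minimal sketch using Python's standard sqlite3 module. The table, column names, and sample rows are invented for illustration; the point is only that the SQL version states which rows are wanted and leaves the access strategy to the engine.

```python
import sqlite3
from typing import List, Tuple

Person = Tuple[str, str, int]  # (fname, lname, age), a made-up schema


def older_bobs_procedural(rows: List[Person]) -> List[str]:
    """Procedural: spell out how to walk, filter, and order the data."""
    result = []
    for fname, lname, age in rows:
        if fname == "bob" and age > 30:
            result.append(lname)
    result.sort()
    return result


def older_bobs_sql(rows: List[Person]) -> List[str]:
    """Declarative: describe the rows wanted; the engine decides how."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE people (fname TEXT, lname TEXT, age INTEGER)")
        conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT lname FROM people WHERE fname = ? AND age > ? ORDER BY lname",
            ("bob", 30),
        )
        return [row[0] for row in cursor]
    finally:
        conn.close()
```

Both functions return the same list for the same input; only the second one leaves room for the engine to pick an index or reorder the work.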

A shell is another counterexample where being "just" syntactic sugar is really oversimplifying it. You're in a fixed context with a default function, so you get a lot of acceleration over having to spawn all those processes by hand. You're stripping out so much overhead code that you're into a new level of abstraction where -- unless you need to -- you don't think in terms of procedural logic.

For a less straightforward answer, a generic programming language would be like a hex editor, whereas a DSL would be like PhotoShop (or even Paint). Sure, a hex editor can modify any type of file in any way, but the power of PhotoShop comes from all the power you get out of being domain specific. It's not just the additional built in functions the program gives you, it's also the manner in which the data are presented that allows you to see things which you simply would miss at the hex editor level. Here the abstraction is allowing you to see what the data actually represent instead of just seeing how the data exist in the computer.

Let's take one of your "obvious counterexamples":

Exhibit A:

Exhibit B:

    cat(or(range("A","Z"), range("a", "z")), zero_or_more(alphanum()), "(", one_or_more(digit()), zero_or_more(cat(",", one_or_more(digit()))), ")")
Exhibit A is absolutely syntactic sugar for Exhibit B.

Regex is great if someone has already implemented a regex engine for you. But a key part of the post you're responding to is that it's talking about writing a DSL yourself.

I can write the functions to run Exhibit B in probably 3 hours. I'd challenge you to write a reasonably language-integrated Regex engine in under 3 days, even assuming you have a good understanding of the domain. That difference in implementation time is going to take a while to pay off in composing actual regular expressions. And regular expressions are a very simple DSL. A more complex DSL would take longer.

Exhibit B is still a DSL - just expressed in a syntactic manner that is compatible with C-style programming languages. In your example, each function like cat, or, and range would create an AST node, and you would pass the result of the expression to an interpreter that executed the regex according to RE semantics.

In this sense, the function call mechanism of the host language isn't being used in its usual sense - that is for actually doing something. Rather, it's a way of declaring an AST similar to S-expressions (though slightly clumsier as you have to define each of these node-creating functions separately).
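As a rough sketch of that reading of Exhibit B, the Python functions below only build AST nodes (plain tuples), and a separate backtracking interpreter gives them regex semantics. The names alt and char_range stand in for or and range, which clash with a Python keyword and a builtin; the node layout and the matcher are assumptions for illustration, not anyone's actual implementation.

```python
from typing import Iterator, Union

Node = Union[str, tuple]  # a literal string or a tagged tuple


# Node constructors: calling them does no matching, it only declares an AST.
def cat(*parts: Node) -> Node:
    return ("cat", parts)

def alt(*options: Node) -> Node:  # stands in for `or`
    return ("alt", options)

def char_range(lo: str, hi: str) -> Node:  # stands in for `range`
    return ("range", lo, hi)

def zero_or_more(node: Node) -> Node:
    return ("star", node)

def one_or_more(node: Node) -> Node:
    return cat(node, zero_or_more(node))

def digit() -> Node:
    return char_range("0", "9")

def alphanum() -> Node:
    return alt(char_range("A", "Z"), char_range("a", "z"), digit())


def match_at(node: Node, text: str, i: int) -> Iterator[int]:
    """Yield every index where `node` can stop matching when started at i."""
    if isinstance(node, str):
        if text.startswith(node, i):
            yield i + len(node)
        return
    kind = node[0]
    if kind == "range":
        if i < len(text) and node[1] <= text[i] <= node[2]:
            yield i + 1
    elif kind == "alt":
        for option in node[1]:
            yield from match_at(option, text, i)
    elif kind == "cat":
        parts = node[1]
        if not parts:
            yield i
        else:
            for j in match_at(parts[0], text, i):
                yield from match_at(("cat", parts[1:]), text, j)
    elif kind == "star":
        yield i  # zero repetitions
        for j in match_at(node[1], text, i):
            if j > i:  # skip empty matches so the recursion terminates
                yield from match_at(node, text, j)


def full_match(node: Node, text: str) -> bool:
    return any(end == len(text) for end in match_at(node, text, 0))


# Exhibit B, with the two renamed functions.
pattern = cat(
    alt(char_range("A", "Z"), char_range("a", "z")),
    zero_or_more(alphanum()),
    "(",
    one_or_more(digit()),
    zero_or_more(cat(",", one_or_more(digit()))),
    ")",
)
```

With these definitions, full_match(pattern, "foo1(12,3)") succeeds and full_match(pattern, "1foo(1)") fails. A real engine would compile the tree to an NFA/DFA rather than backtrack over it.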

> Exhibit B is still a DSL - just expressed in a syntactic manner that is compatible with C-style programming languages.

True, I suppose. But I think I can agree that you could also call the functions in Exhibit B a library. The difference, in my mind, is that Exhibit B gets you most of the benefits of Exhibit A without most of the implementation difficulty, because it avoids the trap of getting too caught up in syntax over semantics.

There is a point where DSLs are used enough that implementing the syntax becomes worth it: there's a reason people implemented SQL and Regex.

> In this sense, the function call mechanism of the host language isn't being used in its usual sense - that is for actually doing something.

Eh, I don't know about that. If you recognize that functions can be used in this way, it can quickly become a very powerful tool, even more so if you're working in a language where those functions can return closures. Lots of code is written in this style. Of course nobody writes these specific functions because most languages support regexes. :)

A recursive descent parser to convert between A and B is trivial.

But there are implementation details of functions vs code writing macros that are important. This is not about syntax at all.

If recursive descent parsers are so trivial, why do we have so many DSLs to generate parsers? :P

Also, a recursive descent parser isn't going to spit out an AST that matches the needed NFA/DFA, so there's going to be a transformation step needed.

I think, taken to the extreme, all programming languages would then "just" be syntactic sugar over lambda calculus

Greenspun's tenth rule: Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

That's true, but I think if you look at my example you'll see that it's not that extreme. There's a 1-to-1 correspondence between each function call in Exhibit B and each semantic element in Exhibit A. Such a correspondence doesn't exist between most programming languages and assembly, let alone lambda calculus.

DSLs are usually written for domains where the procedural style has dominated for a while already and there’s a concrete set of concepts that have been well researched.

So I believe your point is relatively useless: for example, one of the use-cases of a DSL is one’s ability to change the program at run-time, having run-time predictions and also the proof of termination - all of which are a sign of a well-behaved product from the customer’s point of view.

The problem with the claims you're making about DSLs is you're not backing up anything you say with concrete examples where the implementation cost of the DSL is actually worth it. You're not responding to my point.

> for example, one of the use-cases of a DSL is one’s ability to change the program at run-time, having run-time predictions and also the proof of termination

Returning the program as a closure with different values closed at runtime achieves the same theoretical benefit and is supported by a wide variety of popular languages with no cost of implementing the syntax.
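A minimal sketch of that closure approach, in Python. The discount rule, its parameters, and the integer-cents convention are hypothetical; the only point is that the values are fixed at runtime when the closure is created, with no new syntax involved.

```python
from typing import Callable


def make_discount_rule(threshold_cents: int, percent_off: int) -> Callable[[int], int]:
    """Return a pricing rule whose parameters are closed over at runtime."""
    def rule(order_total_cents: int) -> int:
        if order_total_cents >= threshold_cents:
            # Integer arithmetic keeps the result exact.
            return order_total_cents * (100 - percent_off) // 100
        return order_total_cents
    return rule


# The "program" can be swapped while the system is running, for example
# after reading new values from a config file or an admin screen.
current_rule = make_discount_rule(threshold_cents=10_000, percent_off=10)
holiday_rule = make_discount_rule(threshold_cents=5_000, percent_off=25)
```

This covers swapping behaviour at runtime. It does not, by itself, give the host program a way to inspect or rewrite what the rule does.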

It does not, however, allow me to re-implement the methods based on the values passed into them.

The greatest example of this is SQL: it takes the set of statements and transforms them into a set of steps a database needs to run in order to give you a result (and there are many ways of doing so). So essentially we end up with an optimisation problem that has to be solved (and many problems in computing are optimisation problems where the question can be answered in many different ways).

You could, however, argue that you could also implement an SQL-like language with the language of your choice; but then you would have just re-implemented a DSL.

And the last point against you is the fact that we are currently at the point where writing plugins for the IDEs is getting easier, so quite frankly the overhead of writing a new DSL is becoming less of an issue.

Both of these examples would require implementation of a regex engine to provide the same set of functionality.

You’ve just given the syntactic sugar next to the AST parse; parsing a regex into an AST representation is fairly easy and can be done with recursive descent (it is trivial).

But this is supposed to be about DSLs, not parsing. The DSL is either one, because you’ve just given the DSL in two different forms (one with sugar, one without).

Yeah, but in that sense, C is "just syntactic sugar" for Assembly. You keep using "syntactic sugar" in a disparaging sense when everything short of machine code essentially qualifies. After all, what is a compiler but a program to strip out and distill the abstractions of using a higher level language into machine code?

Like I don't need functions or methods, either, but that doesn't mean that I should avoid their use. Shouldn't I decide when implementing a function or DSL is worth the effort? Yes, it's time consuming, but abstractions provide real value in programming.

All you're really saying is "I can implement a regex engine in C." I mean, no shit. But you're implying that means DSLs are somehow inappropriate or unworthy of use because of that. There are big flaws in that logic.

> Yeah, but in that sense, C is "just syntactic sugar" for Assembly.

That's not true in anywhere near the same sense. There's not a 1 to 1 correspondence that lets you transform C semantics into assembly. Assembly has no counterpart to C types, for example. And if you recognize that your initial argument is flawed, the rest of what you're saying falls apart.

> Like I don't need functions or methods, either, but that doesn't mean that I should avoid their use.

If you can point out where I said you should avoid the use of DSLs I'll be very surprised.

What I said is that the cost of implementing a DSL is rarely worth it, given a library will often provide 90% of the benefit of a DSL with 10% of the effort.

> Yes, it's time consuming, but abstractions provide real value in programming.

Yep. And functions are an abstraction which has a very low cost in most programming languages and provides real value.

> All you're really saying is "I can implement a regex engine in C."

If that's what you think I said, you should read what I said again.

As others have said, regex only works well because you have a sophisticated implementation underneath.

SQL is fine and good, but the syntax part is easy - hundreds of ORMs, and LINQ, let you express SQL in native-language syntaxes, e.g.:

Select(columns=['fname', 'lname'], table='People', where=Equals('fname', 'bob'))

The challenge is in the under-the-hood query optimization, finding the best index to use, caching results in memory, using highly efficient on-disk data structures etc. That stuff is usually written in a procedural language (though it could perhaps be done functionally etc.). So while Racket might make the parsing and transformation to Racket constructs easier, that's just a tiny aspect of making an efficient DBMS. It's not some DSL you'd just cook up anytime you need it.
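For what it's worth, the syntax half really is small. Below is a hedged Python sketch of how a call like the Select(...) above might render to parameterized SQL; the class layout and the to_sql method are assumptions rather than any real ORM's API, and identifiers are interpolated unchecked, which a real library would not do. None of the hard parts (planning, indexing, caching) appear here.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class Equals:
    column: str
    value: Any

    def to_sql(self) -> Tuple[str, List[Any]]:
        # The value travels as a bound parameter, never as SQL text.
        return f"{self.column} = ?", [self.value]


@dataclass
class Select:
    columns: List[str]
    table: str
    where: Optional[Equals] = None

    def to_sql(self) -> Tuple[str, List[Any]]:
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        params: List[Any] = []
        if self.where is not None:
            clause, params = self.where.to_sql()
            sql += f" WHERE {clause}"
        return sql, params


query = Select(columns=['fname', 'lname'], table='People', where=Equals('fname', 'bob'))
sql_text, sql_params = query.to_sql()
```

Here sql_text is "SELECT fname, lname FROM People WHERE fname = ?" and sql_params is ["bob"], which is a purely mechanical translation.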

Your hex / photoshop example is about the farthest thing from syntactic sugar imaginable - "syntactic sugar" implies a trivial, usually purely lexical transformation between two ways of expressing something (like the SQL example above).

>Isn't a DSL just syntactic sugar in most cases?

Maybe. But you probably have a programming language that you like to use when building Web apps, and another that you like to use when building command-line utilities. The differences aren't usually just the libraries available, but how easy it is to express certain solutions.

Nailed it. As you work longer on a project you'll realise patterns emerging. It's the same behavior repeated again and again. Take Rails for example, it is where it is because someone saw the patterns for a webapp and made a framework that conceptualizes it. It's something that's not appreciated very well. The concepts serve a vital purpose in organising your thinking in the problem domain.

In a sense a DSL lets buried implicit knowledge become enshrined more formally (even if it is still implicit in a way) so that people don't keep making the same mistakes over and over because they are presented with tools where all the footguns haven't been safely dealt with. Consider how many people make the same mistakes over and over and over again because they go back to HTML, CSS, and JS, which give ZERO constraints on what one ought to do with them to make a usable website. Now imagine that we had been growing the language for writing websites so that new users literally could not write code that would footgun themselves into oblivion.

Your examples amuse me because CSS & HTML are essentially DSLs :) I have to agree with the parent, well designed abstractions and functions in my opinion go 99% of the way towards providing the same benefits as a DSL without the additional complexity.

Yes DSLs for constructing and styling a hypertext document, but as soon as we go beyond the simple document model of the web they are like handing someone a chainsaw and asking them to sculpt a chair that will be comfortable to sit in. Some people can do it with just the chainsaw and a lot of practice. However, in order to get most people to not damage themselves or the people who may someday try to sit in the chair, you have to create a robot to control the chainsaw. What you probably wanted was not a chainsaw attached to a robot, but rather a lathe, a plane, and some chisels, and some other hand tools.

I believe what you’re trying to say is that every product has its own lifetime.

Not really lifetimes, even a stone hammer like one used 30k years ago is still useful in the right context (I still find html and css quite useful for what they were originally intended for). More that even very powerful tools have a niche or set of use cases where they excel, and outside that niche other tools are likely to be more empowering to their wielder.

I'm starting to look at it like this -- imagine writing an entire web app with Rails. Then, imagine re-writing it from scratch as a single page app, say with Angular. Then, imagine rewriting it _from scratch again_ with React. (Obviously this is just a thought experiment, but an interesting one).

_Something_ in your brain would remain the same through all of those projects. By discovering and articulating that something, you have arrived at more clarity on your domain. You've also made an asset that is more "fundamental" than the 3 web apps themselves. This asset would help you if you ever needed to port to other technologies in the future.

You would also get incredible mileage out of this asset. It would represent the "skeletal structure" behind everything else you need to build, which means you'd have something like a, eh, 40% head start on most things. Maybe a 40% head start on automated tests, a 40% head start on validation code, etc.

This would most likely lead you down the path of code generation, aka "everything is a compiler."

It sounds promising and interesting, but I cannot "prove" the ROI. The time it would take to discover and articulate that "true core" could be significant (though in reality it wouldn't require rebuilding the same product 3 times).

It seems primitive that we typically insist on articulating what we've understood about the domain in the same general purpose programming language that we intend to implement the solution in, and that we do so almost without being aware we're making that assumption.

I'm taking a class with the first author of this paper known as "Hack Your Own Language", where we read the draft of this paper as our first assignment. While I won't do it full justice I'm sure, I'll try to detail some advantages, though not all of them:

1. You can override built-in language features to better suit the domain of your problem.

2. You can control scope and evaluation of variables. While they often have good defaults, there are cases where you want to override them.

3. You could easily add partial or full typing as you desire to a language. Not runtime typing, but compile time with proper error messaging, and it will be a heck of a lot easier than hand rolling it trying to do it with functions.

4. You can change the very shape of the syntax. Imagine specifying a graph with only identifiers and ascii arrows and just listing them out, then performing DFS. No more Node/Edge class structures, you just work at the abstract level. That applies to any domain. If you're working with functions, you can't change the syntax of the language, which leads to very awkward code in some cases.
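As an illustration of point 4, here is a small Python sketch of a graph written as identifiers and ASCII arrows, followed by a DFS. The arrow syntax is a guess at the general idea, not the syntax of the course's graph language linked below, and parsing a string at runtime is only a stand-in for what a Racket #lang would do at compile time.

```python
from typing import Dict, List


def parse_graph(source: str) -> Dict[str, List[str]]:
    """Read lines such as 'a -> b -> d' into an adjacency list."""
    graph: Dict[str, List[str]] = {}
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        nodes = [name.strip() for name in line.split("->")]
        for name in nodes:
            graph.setdefault(name, [])
        for src, dst in zip(nodes, nodes[1:]):
            if dst not in graph[src]:
                graph[src].append(dst)
    return graph


def dfs(graph: Dict[str, List[str]], start: str) -> List[str]:
    """Iterative depth-first search; returns nodes in visiting order."""
    seen, order, stack = set(), [], [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Reverse so the first-listed neighbour is visited first.
        stack.extend(reversed(graph[node]))
    return order


program = """
a -> b -> d
a -> c
c -> d
"""

visit_order = dfs(parse_graph(program), "a")
```

No Node or Edge classes appear in the "program" itself; visit_order comes out as ['a', 'b', 'd', 'c'].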

The graph language: http://www.ccs.neu.edu/home/matthias/4620-s18/network-syn-la...

The graph language sample program: http://www.ccs.neu.edu/home/matthias/4620-s18/network-syn-la...

The language is simple, but as is the point, can be extended by anyone. That's powerful.

DSLs are a great solution where the problem domain is difficult to express in anything less. Matlab and Make are both good examples. My experience has been that they also bring enough documentation and maintenance burdens along with the flexibility that you should avoid them without a very good reason though.

What does Matlab do that is not easy to express in say Python with scientific libraries?

Setup of build rules is nice in Make, but if one accepts stringly-typedness one could reimplement something just as nice in say Python/Ruby/JS

Let's say a language A is syntactic sugar over a language B if A can be translated to B by macro expansion. In that case, if B is sufficiently expressive, e.g., has first class functions, then in many cases A will be syntactic sugar over B. On the other hand, there are many language features that are orthogonal to first class functions, for which the translation is no less complicated than the frontend of most compilers. For example:

- Delimited continuations

- Related to this, algebraic effects and handlers.

- Implicit types.

- Join patterns for concurrency.

- Linear/Affine resource management.

- Related to this, type state and session types.

- Probabilistic programming.

- Differential programming.

And many more. The point is that you do not want a language that includes all of these extensions at the same time, since many are mutually exclusive. For example, differential programming does not mix with state or first-class functions. Linear types ensure safe resource management, but you need an escape hatch both for efficiency and your sanity. Algebraic effects or delimited continuations in a language which already has imperative features will lead to very fun bugs as soon as someone unfamiliar with the language implementation tries to mix the two features.

On the other hand, for every single example in this list I can point to an application domain where the additional expressivity is beneficial.

E.g., delimited continuations are extremely nice when implementing some complicated backtracking search. A DSL with (pure functions and) first class support for delimited continuations can express such a search function very naturally and later on - in the compiler for the DSL - we can decide how to implement this feature.

This allows for additional optimizations, which would otherwise be implemented in an ad-hoc manner. For example, several linear return continuations can be implemented by having several return addresses and stack pointers and restoring one of them, rather than using a cactus stack. Another example, rarely used return continuations can be tracked out of line as in "zero-cost exception handling".

Mixing these implementation details with the implementation of your search function is just a bad idea, since they are orthogonal to the problem you actually want to solve. And, as we all know, mixing continuations into an existing imperative language just gives continuations a bad name, which is precisely why you want a DSL. :)

Thank you for writing this. I wanted to say something like this yesterday, but couldn't find the words and decided to wait for today to try. But I would have never put it this well even if I spent a year.

I recall going to a talk when I was at Northeastern, and at one point the speaker said something like, "But why would you create a new programming language just to solve one problem?"

Matthias Felleisen (the guy in the video) interrupted him from the audience to respond, "I know several people who have done exactly that, and it worked out quite well for them!"

As he says in the video, Northeastern teaches students how to program in a systematic way, rather than by copy-paste-modify from other examples. They certainly teach a powerful way of thinking, and it has served me well over the years.

If anyone out there is reading this and is thinking about going to college for computer science, definitely check out Northeastern. I could not be more thankful that I went there.

You can read the freshman text book online: "How to Design Programs"[1].

[1] http://htdp.org/

I had a similar experience at Waterloo. The first semester of class we used Racket, HTDP (linked above) and even Haskell. I had already some experience with programming but it was like starting all over again to build solid foundations. If you are in the business of teaching introductory computer science material, I suggest you consider "How to design programs". Fun, good times.


> "I know several people who have done exactly that, and it worked out quite well for them!"

I've seen it done several times. It tends to work better in the short run than the long run compared to alternative approaches.

The problem is that the several people he knows are not the average architecture astronaut. It takes a lot of taste and experience to design a usable language.

And it takes a lot of organizational dedication and work to support the language properly: training, documentation, packaging, regression testing (1), editor support, debugger support, libraries, and the list goes on. I doubt a single individual could do all that for more than a short while, so you need an organization that wants to do all that right for the sake of the language. Most companies won't really care that your language isn't in fedora upstream.

(1) Lots of popular languages get the community to do their regression testing for them, but if you are the sole user of your language, you need to do all the testing yourself somehow.

I can just imagine the learning curve when encountering legacy code in a language that's only used for the project, product, or company. I can't begin to explain how many times I have just Googled a question in a particular programming language and very quickly found an answer. When I work in new or unpopular languages, I often can't find an easy answer.

The difference in time is staggering. The learning curve for a popular language means that learning something new takes me 5 minutes, but in an unknown language learning something new could take me all day if there are not a lot of things like Stack Overflow posts.

If a project specific custom language deviates too far from the norms, then it means that incoming programmers will have a harder time maintaining it.

Having seen all kinds of abuse of C macros or Python metaprogramming I can't even imagine the mess this could create. Some people just seem to love the idea of creating an undebuggable Frankenstein macro system just so they can save 3 characters of code when writing foo.bar instead of foo["bar"]. Python libraries depending on @ annotations everywhere is another favourite. You save 1 measly line of code by not having to call or register the function somewhere but you lose so much more as you can't control initialization order and your program usually is forced to store its data as global state.

I do see the powers and benefits this can give but with great power comes great responsibility. It must be truly isolated to places where you absolutely need it instead of just for fun.

Oh my god - this. I have recently started writing my own library that requires registering functions for the task runner. At first I tried to copy the @ approach, but then realised what an unmaintainable mess that becomes.

> I can just imagine the learning curve when encountering legacy code in a language that's only used for the project, product, or company.

I've seen some pretty extensive class libraries that required a lot of study to understand, so I'm not sure a DSL makes a difference.

One aspect of a DSL is its relation to the way business folks talk about their jobs and the jargon they use. A DSL can get you pretty close to matching how a process is described by the business in business terms. I am reminded of an old Dr. Dobbs Journal article[1] about Botanical Latin. It seems like from an understanding and conversation view a DSL has a pretty good fit at a lot of businesses.

At the end of the day, a lot of programmers are trying to implement business solutions and not trying to create new web products.

1) http://collaboration.cmc.ec.gc.ca/science/rpn/biblio/ddj/Web...

There is a third circle of hell: kdb. Unpopular and what is there is unsearchable as the syntax consists of items like ,:/.

I like kdb and the docs are okay, but my word...

Not easy to get into indeed but I find it a pleasure with some experience. Personally I find k/q nicer to work with than say Ruby as used in Rails.

"Lisp is a programmable programming language." — John Foderaro, CACM, September 1991.

Yes our title intentionally alludes to CL.

Instead of having to think about every possible way a language might be used and account for it up front, languages with good metaprogramming facilities let the users extend them to fit their needs. This way the core language can stay small and focused while remaining flexible.

While you don’t want everybody building their own language, userland extensibility can have a huge impact on the overall cleanliness of the language because new ideas can be tried out as libraries. If these ideas don’t pan out, the community moves on. Projects using these libraries will keep working, but the core language doesn’t end up accumulating cruft. I wrote about that in detail here: https://yogthos.net/posts/2017-10-03-MovingForwardByLettingG...

...which is why Scheme has, like, 37 different, incompatible object systems.

Everything has costs.

You say that, but SLIB [0] is the canonical place to go with Scheme, which has the "Macroless Object System" [1], based on CLOS from Common Lisp.

SLIB is included with Guile, and compatible with almost all Schemes. If a Scheme has an object system inbuilt, it is likely to be built around SLIB's.

[0] https://people.csail.mit.edu/jaffer/SLIB

[1] https://people.csail.mit.edu/jaffer/slib/Object.html#Object

That's Scheme. Lisp, meanwhile, has one: CLOS. One is excellent for learning about things like hand-rolled object systems, and one is excellent for building large systems.

Common Lisp has CLOS. I'm aware some purists don't think Scheme is a Lisp but, regardless of one's stance on this, there are most definitely other Lisps than Common Lisp.

Having called the result of Lisp standardization ANSI Common Lisp instead of just ANSI Lisp doesn't change the fact that the whole effort was all about standardizing Lisp.

Moreover, why not just say Scheme or Clojure when talking about Scheme or Clojure?

If ANSI C had been called ANSI Common C, would Java fans never cease to insist on calling Java a dialect of C?

> the whole effort was all about standardizing Lisp

There are more Lisps now than ever, so perhaps that was not as successful as was hoped.

> why not just say Scheme or Clojure when talking about Scheme or Clojure?

Why not just say "Common Lisp" when you mean Common Lisp, and "Lisp" when making true general statements that include other things that have always been considered Lisps? How are we to refer to Emacs Lisp without tying ourselves in knots if we refuse to say that "Lisp" includes it?

Sure, it's not a guarantee that the language will stay small and focused, but it provides a path towards that. Clojure has been around for a decade now, and the core language has grown very little. I think it helps that Clojure is a BDFL driven language as opposed to being a design by committee.

> languages with good metaprogramming facilities let the users extend them to fit their needs.

No amount of metaprogramming can fix the intractability of proving programs correct in the base language, especially if the metaprogramming facilities leak identifiers like a sieve (unhygienic macros).

> userland extensibility can play a huge impact on the overall cleanliness of the language because new ideas can be tried out as libraries.

Instead of (uselessly!) trying new language features, how about trying to find usable proof principles for the language features that you already have?

There's absolutely no evidence that formal methods approaches are effective in practice. These ideas have been around for decades, and they've only been found to be effective in a few niche areas such as compilers.

That's not empirical evidence that formal methods are more effective than other approaches. It just shows that the culture at these companies is conducive to using them. Walmart https://www.youtube.com/watch?v=av9Xi6CNqq4 and Boeing https://www.youtube.com/watch?v=iUC7noGU1mQ etc use different approaches. :)

You claimed "There's absolutely no evidence that formal methods approaches are effective in practice." and said the only real use was for compilers. I pointed out that they are effective in practice in domains completely unrelated to compilers. Your response is to bring up a totally different point ("more effective than other approaches" — a statement that can't even be falsified), talk mumbo about "culture", and point to Clojure talks, which is neither here nor there. In short, you are perfectly suited for "conversations" on the Web.

How do you define effectiveness? A company using a particular approach does not automatically imply that it's an effective way to do it. You can make any approach work if you invest enough resources into it.

To determine whether a particular approach is effective, it has to be compared against the available alternatives. So, I'm a bit baffled by you saying that's a totally different point.

So, perhaps you can explain the metric that you're using here to define effectiveness.

"No amount of metaprogramming can fix the intractability of proving programs correct in the base language"

It's actually rare people need to do proofs. If they do, they can also prove the programs correct with a tool suited for it like Isabelle/HOL or Coq then extract them to the desired language. It's worked for extractions to ML, Haskell, Scheme, and recently C (see below).


There's also been verified LISP's (Scheme48, LISP 1.5) and metaprogramming constructs. So, it could be done in or for a LISP-like language with metaprogramming. I even found two metaprogramming schemes for Coq recently.



That brings me back to an old idea I was just discussing with someone about SPARK. I've pushed for using all high-value/low-cost methods on highly-assured programs before using proof to avoid wasting time on stuff easy to prove wrong. Heavy hitters from Verisoft to Data61 have been coming around to that idea using Alloy and property-based testing of specs respectively. Many of these formal languages also have crappy tooling on FOSS side versus mainstream stuff. So, I suggested embedding them in something like Racket to: get the IDE and REPL benefits for productivity; use combo of Cleanroom-style development, Design-by-Contract, contract-based testing, combinatorial testing up to 6-way (3-way minimum), and fuzz testing with contracts running; extracting to Rust, C and/or SPARK w/ their static analyzers or model checkers run on that code; fix any problems in logic in original with repeats of that process as much as necessary; final result extracted to something with certifying compiler to knock out that risk. I called the concept Brute-Force Assurance given one just throws CPU power and low-cost labor at the problem without much specialist skill required past what DbC or SPARK needs.

That combined with basic methods for high-availability or fault-tolerance plus recovery should make for one hell of a high assurance system at a fraction of the cost of formal verification. And if you wanted proof, then that system would already be nearly flawless to begin with in a structure (i.e. Cleanroom) amenable to easy proof with ability to extract to semantics of your choosing. If not Racket, I had K Framework used for KCC Executable Semantics for C in mind for formal alternative. Regardless, every tech I just cited (except SPARK) has low cost, lowers defects, and was easy to pick up. The combo should really bottom out defects without slowing time-to-market down too much to be useful. That's what formal proof did almost every time it was tried in normal development, esp if using 3rd party libraries.
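As a rough illustration of the "fuzz testing with contracts running" step only, here is a small Python sketch. It is not the Racket/SPARK pipeline described above; the `contract` decorator, the `clamp` function and the `fuzz` driver are hypothetical stand-ins, and the contracts are plain assertions.

```python
import random
from functools import wraps


def contract(pre, post):
    """Wrap a function with a precondition on its arguments and a
    postcondition on its result (both are ordinary predicates)."""
    def decorate(fn):
        @wraps(fn)
        def checked(*args):
            assert pre(*args), f"precondition failed: {fn.__name__}{args}"
            result = fn(*args)
            assert post(result, *args), f"postcondition failed: {fn.__name__}{args}"
            return result
        return checked
    return decorate


@contract(
    pre=lambda lo, hi, x: lo <= hi,
    # The result stays in range, and is x itself whenever x was already in range.
    post=lambda r, lo, hi, x: lo <= r <= hi and (r == x or x < lo or x > hi),
)
def clamp(lo, hi, x):
    return max(lo, min(hi, x))


def fuzz(fn, trials=1000, seed=0):
    """Throw random valid inputs at fn; any contract violation raises."""
    rng = random.Random(seed)
    runs = 0
    for _ in range(trials):
        lo = rng.randint(-100, 100)
        hi = rng.randint(lo, 200)  # keeps the precondition lo <= hi true
        x = rng.randint(-500, 500)
        fn(lo, hi, x)
        runs += 1
    return runs


checked_runs = fuzz(clamp)
```

The point of the combination is that the test driver needs no expected outputs: the contracts act as the oracle, so the only cost of more coverage is CPU time.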

We've worked on Racket + Alloy for ~8 years now, including using the Racket REPL for similar purposes. And built several tools with it (e.g., for networking analysis). Email me (sk@cs.brown.edu) if you want to chat more.

I knew I recognized you: typing your name brought your page up immediately. I had previously posted your ADsafe work in a lot of places interested in how to do JS better. Just posted Pyret on Lobste.rs with them enjoying that. Didn't know you worked on that, Racket, and theorem provers. Neat stuff in your publications. :)

Yeah, I'll probably email you this week or the next.

Thanks kindly. Will look forward to chatting.

> It's actually rare people need to do proofs. If they do, they can also prove the programs correct with a tool suited for it like Isabelle/HOL or Coq then extract them to the desired language. It's worked for extractions to ML, Haskell, Scheme, and recently C (see below).

It baffles me that people think they can solve the problem of the intractability of systems by piling up more complexity on top.

> There's also been verified LISP's (Scheme48, LISP 1.5) and metaprogramming constructs.

It doesn't make sense to “verify language features”. What you verify is programs. Of course, the Lisp crowd is fundamentally incapable of distinguishing definitions from implementations, which leads to this regrettable sort of mistake.

> So, it could be done in or for a LISP-like language with metaprogramming. I even found two metaprogramming schemes for Coq recently.

This fails to address my original comment. Have you actually understood it? The point was (and still is) this: “Metaprogramming might write half of your program for you, but it won't help one iota with the proof of correctness unless the base language is already something mathematically nice to work with.” The excessively operational definition of pretty much every Lisp dialect in existence makes it prohibitively expensive to formally analyze the humongous swathes of code generated by Lisp macros. The problem with Coq is, surprisingly for something coming from the type theory crowd, exactly the same: You can't really understand a Coq proof script just by reading it. Instead, you have to replay it. Operational thinking does not scale.

> That brings me back to an old idea I was just discussing with someone about SPARK. (...)

This sounds insanely complicated. The only justification I can come up with for this kind of thing is manager thinking: “This process has lots of steps and makes people's lives miserable, hence I must be doing something right!”

"It baffles me that people think they can solve the problem of the intractability of systems by piling up more complexity on top."

The systems that ran for over a decade without downtime weren't done by formal proof: they had lots of review, checks and redundancies that came from prior experience with failures. Tandem NonStop comes to mind with VMS clusters getting pretty far. Identify a problem, develop a mitigation that works for all inputs, review it, test it, and deploy. Integrate with careful review and thorough testing. Whereas, a number of systems that were "proven" have failed after basic review or testing. Proof is just one tool among many instead of anything to think of as necessary or perfect.

Far as empirical evidence, those prior systems plus current ones with the most uptime or correct functioning in industry didn't have formal proof or use languages you'd recommend. That shows they're not necessary to achieve that goal. Formal proof with realistic models indeed showed the highest level of correctness once the tools matured. The drawbacks were: cost was insanely high, the models required overly-simplistic implementations, and time-to-market put them behind competition which can kill a company in markets. So, I recommend most don't do that kind of stuff. Whereas, I recommend CompSci do a mix of developing productivity/maintenance boosters (eg Racket, refactoring aids), optimizations, push-button tooling for verifying apps in popular languages (eg Pathfinder, Astree Analyzer) and clean-slate, verified stuff like miTLS or CompCert that can see real-world use without requiring specialists. Gotta balance across what people care about.

"The excessively operational definition of pretty much every Lisp dialect in existence makes it prohibitively expensive to formally analyze the humongous swathes of code generated by Lisp macros. "

The simplicity of it should make a SPARK-style analysis easy if the meta eventually decomposes into simple functions or operations with contracts describing intended structure or value ranges. The macros themselves can also be proven to be certified generators. Anyway, I think it's easier to address your point by comparing the base language to LISP in terms of ability to write a formal semantics or easily analyze it. Most mainstream languages fail immediately. A subset of LISP is doable given ACL2, VLISP and Myreen's work. SML seems to win overall since it was designed for this but loses on many practical fronts like performance, developer tooling, and library ecosystem. OCaml subset is closest to the goal for ML's but still behind mainstream stuff on factors more important to developers. Maybe Haskell, too, given verification tooling for it I've seen but I don't know its semantics so can't say more.

In any case, you'd be trading everything that gives developers velocity on software development for the ability to formally prove something more easily in the language which 99.999% of developers won't do. They can reach higher than their QA goals without it using lightweight methods with empirical proof of effectiveness above. Also, 99+% of proof engineers won't care about provability of base language either since they prefer to use dedicated tools with an extraction (eg CompCert) or equivalence check (i.e. seL4). If real-world deployment, people wanting a proven component can also prove it down to ML, C or ASM to wrap with a FFI (eg miTLS pieces in Firefox) or hide it behind an API (eg MULTOS CA in SPARK/Ada/C++ mix). Things like linking types by Patterson and Ahmed will make that safer over time.

"This sounds insanely complicated. The only justification I can come up with for this kind of thing is manager thinking: “This process has lots of steps and makes people's lives miserable, hence I must be doing something right!”"

I already gave you the reason: get confidence or defect rate close enough in practice to formal proof that software will meet its specification or fail safe taking developer right to error... without costs or time-to-market impact of formal verification. At most, we might be talking one well-trained, median programmer added per team instead of a proof engineer. Managers barely care about software quality on average much less formal proof. Selling them on improving defect rates at low cost in ways that makes extensions or maintenance easier is much more likely to succeed. Good example below by a Lockheed person on Design-by-Contract comparing it to other methods:


Did you ever progress that ‘old idea’; seems worth pursuing with some r&d grant?

Thanks. I didn't since I was still heavily researching the components I might use to design it. Previously, both temporal and concurrency errors were a problem to the point that I might have had to integrate Java tooling, some commercial. That Rust knocked GC out of the equation makes it a lot more practical now for system-level code if whoever builds it can ensure the Rust and C code it generates are equivalent. The results on capability-secure, distributed languages such as Pony made me wonder if I should pause again to figure out a parallel or distributed by default version. However, I think we can get tons of value out of something like a safe C with Racket expressiveness with the distributed stuff getting figured out over time.

So, I'd like someone to try to build it with all I asked for being credit for the idea on top of a FOSS or free as in beer implementation for wide availability. We don't need another five-digit tool only enterprises can buy that taxpayer-funded R&D often goes to. (sighs) I might take a stab at it in the future, though. If not Racket, I was also looking at PreScheme, Red, and Nim for base languages. Main syntax, like Nim's, might need to be pleasant, though, so it will get adoption in the first place with picky programmers. ;) Also, these days I recommend seamless compatibility with C data types or calling conventions so one can plug it into the C ecosystem without abstraction gaps or performance hits.

Can you mail me or I you? I believe I can make this work as open source and with enough funding.

Sounds good. I got your email. I'll respond within the week. :)

Great article. I have in the past (when I wanted to use Scheme) used either Gambit (great support from the author!) or Chez Scheme (fast!), and only have played around with Racket a bit (perhaps 50 hours in the last 5 years).

I have been thinking of switching to Racket, and this article 'pushed me over the edge' on that decision.

Language-oriented programming sounds like metaprogramming with DSL’s just with a new toolkit. Language-oriented programming might be a more approachable term for that, though. If I heard it, the first things I’d think of were tools such as Rascal and Ometa that let one arbitrarily define, operate on, or use languages. That covers language part. Far as integration, a number of techs supporting DSL’s… probably a lot of LISP’s and REBOL… had a powerful language underneath one could drop into.

So, this seems like a new instance of old ideas they’re giving a new name. I do like how they’ve integrated the concept into a general-purpose, GUI-based, programming environment instead of it being its own thing like the others. You get best of both worlds.

An old idea I had was researchers should do more experiments in building things like Rascal or Ometa alternatives in Racket leveraging what they already have to see how far one can stretch and improve it.



Indeed, though I admit my first thought when reading things like this is of FORTH rather than Lisp. I think there's a lot of value in DSL and implementing them in some languages is a lot easier than others.

Language oriented programming is far from a new term. See the history of jetbrains MPS and intentional software. There was a strong meme in the early 2000's this was the future, it never seemed to take off.

I even bought into that meme a bit in the early 2000s. It turns out that most "language" designers (including myself) aren't that good at it, and that designing a language that isn't just yet another leaky abstraction takes a lot of work. Your "interface" is the entire syntax and semantics of your "language."

Yes - there is definitely a hierarchy of difficulty: user code, library, macros, full syntax extension. The trouble is that designing new languages takes a lot of _taste_ and experience. We don't know the full domain, so DSLs tend to have bits added to them as knowledge of the domain grows - and that's the scary point in language evolution. So, despite these things being tremendous fun, I tend more towards correctness and clarity at the expense of sheer expressiveness. (An example of ad-hoc language design is CMake, which makes my eyes bleed. Kitware should have used an existing language - even tcl would have been better.)

I wonder how this approach scales to large code bases maintained by global teams. Instead of the usual discovery that each contractor or "rockstar" programmer created modules using their preferred language or in vogue technology, you might now find they derived their own obscure language flavors that read right to left and use "=" for function calls.

In large projects, readability and maintainability trump expressibility.

> In large projects, readability and maintainability trump expressibility.

and neither is a substitute for proper governance...

Naughty Dog seem like a relatively large team. If you're thinking a megacorp tech team, there seem to be few examples of truly large successful Lisp projects, though it will be interesting to see how some of the growing Clojure companies get on in the next 5-10 years.

Racket is awesome. I originally used scheme48 but was impressed with Racket's tooling (when it was known as PLT Scheme). After it branched out as Racket and added a large standard library with batteries included, it became much more practical to use. I am experimenting with using it for smart contracts, and having a library where you can make your own language and work on anything from program synthesis to web development is really awesome!

> Racket eliminates the hard boundary between library and language, overcoming a seemingly intractable conflict

I don't buy it.

A language extension means the syntax wouldn't ever be valid in the base language. The fact that Racket defines its 'for' construct using macros/functions/whatever is cute, but I see no reason to pretend it's a language extension.

> I don't buy it.

> A language extension means the syntax wouldn't ever be valid in the base language.


     (struct pt ([x : Real] [y : Real]))
That's not valid Racket, but is valid Typed Racket.

A more extreme example:

    int main(int argc, char** argv) { return 0; }
That's C! Oh wait, it's the C #lang [0]...

Which means you can do things like:


    #lang C

    float f( float x1, float x2 ) {
      return x1 / x2;
    }

    #lang racket

    (require base)

    f(10.2, 1.8)
Those syntaxes aren't compatible, by any stretch. The magic is, Racket gives you access to powerful pre-processing, in the form of some amazing reader macros.

What syntax you use doesn't really matter, so long as you provide that bridge. They're incompatible syntaxes, with a custom-made parser allowing data to flow both ways.

[0] https://github.com/jeapostrophe/c

Shouldn't that second example be

    (f 10.2 1.8)

Yep. My bad.

Where does the conversion between C's simple "float" and Scheme's notion of numbers take place? Does the call from "#lang racket" check and convert the arguments before passing to a simple f(), or does a complex f() check and convert its parameters?

It depends on the implementation, that's for the expander to handle.

In the case of the implementation I listed, it passes it on literally, and compiles it and runs it, and then takes the output and passes it back to Racket. [0]

[0] https://github.com/jeapostrophe/c/blob/master/c/lang/runtime...

It shells out to the C compiler (hard-coded as "cc") for every function? Oof. That's a pretty limited FFI.

Or maybe I wasn't clear. If Scheme tries to pass a double/bignum/whatever to the C function that expects the float, where is the error caught?

It's not Scheme that does the passing, just to be clear, it's the library.

In this case, it's just throwing strings around without changing anything. No passing to transform a bignum, no bounds checking. Just tries to run it. (Racket does have a decent FFI [0], but this library is just for classroom example stuff. Not serious work.)

[0] https://docs.racket-lang.org/foreign/index.html

The base language is not "racket". The base language is "#%kernel".

The language "racket" is implemented in "racket/base" that is implemented in "pre-base" that is implemented in #%kernel. If you want to use `for` you must use "racket" or "racket/base" or "pre-base", because it isn't defined in #%kernel. If you want to use `match` you must use "racket" because it isn't defined in the other three languages.

The language "#%kernel" has only a minimal set of syntactic forms defined, so it's painful to use.

> A language extension means the syntax wouldn't ever be valid in the base language.

Well, this works:

     #lang base-without-for
     (require the-for-library)
     ...use for here...
Of course if the author of `for` has made a `base-with-for` language, then I can just use:

     #lang base-with-for
     ...use for here...

I’m curious how this compares with the similar built in tools for building grammars, etc in Perl 6, and also if the syntax flexibility possible is worth the trade off vs using more native (if obscured or downplayed) syntax eg the sorts of DSLs you can build with Ruby.

Wasn't "Script X" Apple's failed attempt at this back in the early 90's? People found creating a custom language for every problem was unmaintainable. Each developer was encouraged to customize their own personal language, and work with syntax translators to work with group written code. Nightmare.

I presume you are talking about ScriptX created by an Apple spin-off Kaleida Labs? I'm reading about it right now, and I don't see any specific features useful for language-oriented programming, not even Lisp-like macros. From what I can see it doesn't seem to encourage this approach to software development at all, it looks like some kind of Smalltalk derivative with multiple inheritance bolted on.

Forgive me, but I recall, with some decades of practice, a major point in FORTH is that one constructs a vocabulary and syntax specifically for solving the problem at hand. Compiler primitives are just as first class as regular word (aka function) definitions, including how to compile itself as well as what to do when executed (DOES> keyword, which can itself be redefined). FORTH may have a clunky and primitive reputation, but it seems to me it qualifies as a programmable programming language.
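A toy sketch of the "grow a vocabulary for the problem" idea, written in Python rather than Forth. It only models colon definitions (`: name ... ;`) on top of three primitives; it is an assumption-laden miniature and does not attempt compile-time words, DOES>, or anything else a real Forth offers.

```python
def run(source, stack=None, words=None):
    """Interpret a whitespace-separated, Forth-like program.

    `: name body ;` adds a new word to the vocabulary; after that the new
    word is looked up exactly like a built-in one.
    """
    stack = [] if stack is None else stack
    words = {} if words is None else words
    primitives = {
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
        "dup": lambda s: s.append(s[-1]),
    }
    tokens = source.split()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == ":":
            # Definition: remember the body under the new name, do not run it.
            end = tokens.index(";", i)
            words[tokens[i + 1]] = " ".join(tokens[i + 2:end])
            i = end + 1
            continue
        if tok in words:
            run(words[tok], stack, words)  # user-defined word
        elif tok in primitives:
            primitives[tok](stack)         # built-in word
        else:
            stack.append(int(tok))         # number literal
        i += 1
    return stack


result = run(": square dup * ; : cube dup square * ; 3 cube")
```

Here `cube` is defined in terms of `square`, which is itself user-defined, so the program at the end reads in the problem's own terms.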

Sure. But the degree of clunkiness matters. Forth doesn't have the ability to build linguistic _abstractions_ the way Racket does. But if you want to add Forth to the evidence of the paper's thesis, that's fine, more the better.

This is offtopic, but seeing ACM made me think about my long-lapsed membership.

Are you a member of the ACM? What do you get out of your membership?

A side note: React and jQuery are not really complementary libraries dealing with separate concerns, as suggested by the article, unless I have fundamentally misunderstood both.

Tcl can do wonders in this area. uplevel is in some way more powerful than Lisp macros, since it happens at runtime.

Uplevel and its ilk are related to the Lisp FEXPR and NLAMBDA mechanisms. Mainstream Lisp discarded these things decades ago; they didn't survive into ANSI standardization.

Isn't "uplevel" essentially Tcl's way to choose dynamic scope? Lisp has ways to do that, too.

It's more like "eval" in the stack frame you select. This, added to the fact that Tcl, like Lisp, has the same form for all the commands, allows you to do anything you could conceive, and it will look as if it's part of the language. For instance you can do a looping construct that works in a different way depending on the calling function name or state or any other crazy stuff like that.
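For readers who don't know Tcl, a loose Python analogue of "eval in the stack frame you select" can be sketched as below. This is not how Tcl implements `uplevel`; the `uplevel` function here is a hypothetical helper, it relies on the CPython-specific `sys._getframe`, and it only reads the selected frame's variables, whereas Tcl's version can also modify them.

```python
import sys


def uplevel(level, expression):
    """Evaluate `expression` in the frame `level` calls above whoever
    called uplevel (level=1 means that function's caller)."""
    frame = sys._getframe(level + 1)
    return eval(expression, frame.f_globals, frame.f_locals)


def describe():
    # `name` and `count` are not defined here; they are looked up in
    # whichever function called describe().
    return uplevel(1, "f'{name} has {count} items'")


def caller_a():
    name, count = "cart", 3
    return describe()


def caller_b():
    name, count = "queue", 7
    return describe()


a = caller_a()
b = caller_b()
```

The same `describe()` call produces different results depending on where it is called from, which is the property that lets Tcl procedures behave like built-in control structures.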

Title should be edited: Lanugage -> Language.

Thanks! Updated.

Typo in the headline: lanugage
