Ask HN: Writing DSL in Python

apgwoz · on Dec 19, 2008

Study up on context managers, decorators, and meta-classes, which will assist in creating DSLs. However, you don't really need to use these features to build DSLs.

The use of method chaining (think jQuery) can lead to interesting languages.

Or, take Django's chaining for making queries. User.objects.filter(first_name='Andrew').order_by('last_name')
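
To make the chaining idea concrete, here is a minimal sketch of how such an interface can be built. This is not Django's implementation; the Query class, its methods, and the sample data are all hypothetical and work on a plain list of dicts. Each call returns a new object, which is what lets the calls be strung together.

```python
class Query:
    """Hypothetical chainable query over a list of dicts (not Django's API)."""

    def __init__(self, rows, steps=()):
        self._rows = rows
        self._steps = tuple(steps)

    def _chain(self, step):
        # Return a new Query so each call leaves the original untouched.
        return Query(self._rows, self._steps + (step,))

    def filter(self, **conditions):
        def step(rows):
            return [r for r in rows
                    if all(r.get(k) == v for k, v in conditions.items())]
        return self._chain(step)

    def order_by(self, key):
        return self._chain(lambda rows: sorted(rows, key=lambda r: r[key]))

    def all(self):
        # Nothing runs until here: the recorded steps are applied in order.
        rows = list(self._rows)
        for step in self._steps:
            rows = step(rows)
        return rows


users = [
    {"first_name": "Andrew", "last_name": "Zed"},
    {"first_name": "Bob", "last_name": "Young"},
    {"first_name": "Andrew", "last_name": "Abel"},
]

result = Query(users).filter(first_name="Andrew").order_by("last_name").all()
```

Recording the steps and running them only in all() is one design choice among several; it keeps every intermediate Query cheap and reusable.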

Or, you can get more complicated and do something like described at: http://code.activestate.com/recipes/534150/

EDIT: s/be vital/assist/ on line 1.

russell · on Dec 19, 2008

apgwoz has an excellent suggestion. Python has a clean syntax, so you may not need to invent your own. Meta-classes allow you to do the DSL magic behind the scenes. It's less fun than inventing a whole new language, but you can get something working in a couple of hours. Even more useful are the mind-stretching exercises inherent in understanding meta-classes and their friends. You will see programming in a whole new light.
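
As a rough illustration of the "magic behind the scenes", here is a small sketch of a metaclass that turns plain class attributes into a declarative record definition. Everything in it (Field, RecordMeta, Record, Server) is a made-up name for the example, and it uses the Python 3 metaclass syntax.

```python
class Field:
    """Hypothetical field declaration: a type to coerce to, plus a default."""

    def __init__(self, kind, default):
        self.kind = kind
        self.default = default


class RecordMeta(type):
    def __new__(mcls, name, bases, namespace):
        # Pull the Field declarations out of the class body ...
        fields = {k: v for k, v in namespace.items() if isinstance(v, Field)}
        for key in fields:
            del namespace[key]
        cls = super().__new__(mcls, name, bases, namespace)
        # ... and keep them as metadata on the new class.
        cls._fields = fields
        return cls


class Record(metaclass=RecordMeta):
    def __init__(self, **values):
        unknown = set(values) - set(self._fields)
        if unknown:
            raise TypeError(f"unknown fields: {sorted(unknown)}")
        for name, field in self._fields.items():
            raw = values.get(name, field.default)
            setattr(self, name, field.kind(raw))  # coerce to the declared type


# The "DSL": the user only writes declarations.
class Server(Record):
    host = Field(str, default="localhost")
    port = Field(int, default=8080)


s = Server(port="9000")
```

The class body of Server reads like a config schema, while the metaclass does the bookkeeping. This sketch only collects fields declared directly on each class, not inherited ones.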

cabalamat · on Dec 19, 2008

> Study up on context managers, decorators, and meta-classes, which will be vital to creating DSLs. However, you don't really need to use these features to build DSLs.

Make your mind up. Either they're vital or they're not. You can't have both.

apgwoz · on Dec 19, 2008

No. I won't make my mind up. I'm not providing a recipe for DSLs in Python. I'm simply stating ingredients and saying "now go create."

petercooper · on Dec 20, 2008

I think he/she was more criticizing that you effectively said "X, Y and Z are vital" and then "But X, Y and Z are not vital."

If you'd said "useful" instead of "vital" it'd have been fine logically, but "vital" implies necessity.

apgwoz · on Dec 20, 2008

Yeah. I ended up editing the original post when I realized it later.

bayareaguy · on Dec 19, 2008

I would recommend taking a quick look at how SCons and Conary use Python. You may also want to consider implementing your DSL using a Python parser generator such as Ply.

SCons - http://www.scons.org/doc/HTML/scons-man.html#lbBB

Conary - http://wiki.rpath.com/wiki/Conary:Recipe_Structure

Ply - http://www.dalkescientific.com/writings/NBN/parsing_with_ply...

amethyst · on Dec 19, 2008

I still haven't quite grasped the concept behind what a DSL really is, and what makes your code a DSL versus any other program or API. Any insight that can be offered, especially in the context of something other than Ruby and Lisp? I just want to know what all the hubbub is about, and why it matters to me?

russell · on Dec 19, 2008

A Domain Specific Language is optimized for performing tasks within a limited problem area. Usually the syntax is simplified to strip out features that don't pertain to the problem. Examples are shell scripting languages like bash or the Windows shell, awk, sed, SQL, HTML, TeX, or JSON. Imagine trying to typeset a document in raw Java.

There used to be lots of little declarative DSLs. Think make. Now they have been co-opted by XML. Ant uses XML, but would be much prettier if it had its own DSL. Another example: you can hand code a parser, but it is much easier to code and debug if you write the grammar in BNF and use a parser generator (YACC or Bison).

neilc · on Dec 19, 2008

> There used to be lots of little declarative DSLs. Think make. Now they have been co-opted by XML. Ant uses XML, but would be much prettier if it had its own DSL.

XML is orthogonal: it is just another way of expressing the syntax of a DSL. Ant's build file format may be ugly, but it is also a DSL (it is just syntactically encoded using XML).

jerf · on Dec 19, 2008

There are multiple definitions, so what follows is not authoritative, but IMHO it will help you understand the field better, as well as why so many people seem to be talking past each other as they silently adopt one definition while arguing with someone based on another.

A strong DSL is a specialized language designed for a specific task, with its own parser and syntax not directly based on another language. Ideally, it is not Turing complete, because that opens a huge can of worms, and much of the point of a strong DSL is to avoid this can of worms. The DSL should be carefully designed in conjunction with the eventual users, who most likely will not be programmers. I've heard of impressive results with this approach, but I've never witnessed them firsthand. This definition is favored by Martin Fowler, and much less importantly/impressively, me.

Weak DSLs are basically APIs written such that in their target containing language, the calls into the API read conversationally and with minimal boilerplate not directly related to the domain of the problem. In general (and I emphasize that I am generalizing), they can't be used to their full power without understanding the containing language, so you still need to be a programmer, albeit perhaps with less experience. You may see demos on the home page for the DSL about how it is possible to use without knowing the containing language, but it tends to be demoware; much of the gain is intrinsically that you still have a full Turing-complete language backing you up.

People will bikeshed about punctuation and claim that Ruby can build better weak DSLs, but in my experience, like I said, you still have to understand the containing language to really use the language, so the virtue of dropping off punctuation marks is oversold; the syntactic constructs are still there and you still need to understand them, so whether you actually see them in source is much more a matter of taste than a matter of "goodness".

Mixing people who use different definitions of DSL is a recipe for much heat and no light. It should also be pointed out that the strong DSL people make certain claims about why DSLs are good, along with the reasons why these things are true, and these claims should not be carelessly translated to weak DSLs.

While I use the terms "strong" and "weak", they should be understood as descriptive, not a claim of goodness, much like "strong AI" and "weak AI".

If you look at a weak DSL and wonder what the deal is, it is because rather than a magical technology, it is merely a style of API... there really isn't much "there" there, but it can be useful in some cases. I would never write a weak DSL for a DSL's sake, though; just evolve the API and see what happens.

Personally, I think it is a continuum between "API" and "weak DSL", not a binary distinction, but for "strong DSL", I would say that if you're backing to a full Turing complete language like Ruby, you do not have a strong DSL and should not expect to reap the benefits. (Which is good, since you're not paying the price, either. Nothing is free.)

12ren · on Dec 20, 2008

One example of a successful strong DSL (i.e. standalone) is James Gosling's project for satellite control (I think it was), which he built so that users could make their own changes without bothering him (I doubt details are online; he mentioned it in an interview as a virtual machine that was a precursor to Java).

Our own pg also had a DSL for the Yahoo Store (was viaweb), so users could customize it. From the docs, it looks like it is implemented as Lisp with isomorphic friendlier syntactic sugar.

On weak DSLs (i.e. within a host language), I still think it's a virtue to try to make them as self-contained as possible. Mini-languages are surprisingly common: regular expressions; printf format syntax; XPath within XSLT. I saw a great comment recently [1] suggesting domain-specific error messages. I think this is an oft-overlooked aspect of the abstraction (it's not just syntax!), and making the abstraction as non-leaky as possible helps everyone (programmer and non-programmer alike).

[1] http://debasishg.blogspot.com/2008/04/external-dsls-made-eas...

amethyst · on Dec 20, 2008

Thank you. That's the type of response I was hoping for.

begemot · on Dec 19, 2008

I've also been interested in DSLs and found some very cool code samples implementing an EDSL (embedded DSL) for describing shapes. I have looked at it countless times for ideas when I'm writing my own. Although the code is in Haskell, I think you can learn a thing or two from it.

http://www.cs.chalmers.se/Cs/Grundutb/Kurser/afp/lectures.ht...

cabalamat · on Dec 19, 2008

A DSL is a language that is specific to a domain. DSLs may or may not be Turing-complete, but typically you wouldn't write the whole of a big project in one.

The difference between a DSL and an API? An API uses the host language's syntax, but a DSL introduces its own syntax, which must then be parsed.

apgwoz · on Dec 19, 2008

> The difference between a DSL and an API? An API uses the host language's syntax, but a DSL introduces its own syntax, which must then be parsed.

Well, this is a conundrum then. You can't just create new syntax in languages that don't support a way to extend the syntax (see Lisp macros, and I even get to mention TCL here). So, most of what we normally call "DSLs," (i.e. method chaining languages, overloading operators) aren't.

So, I guess we should start calling them Domain Specific APIs--DSAPIs, if you will.
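
For anyone who hasn't seen the operator-overloading flavor, here is a minimal sketch. Expr and Column are hypothetical classes, not taken from any ORM; the operators build an expression string instead of computing a value. The syntax is still plain Python, which is the point being argued above.

```python
class Expr:
    """Hypothetical expression node that renders to a SQL-like string."""

    def __init__(self, text):
        self.text = text

    def _binary(self, op, other):
        # Nested expressions are spliced in; plain values are quoted via repr().
        right = other.text if isinstance(other, Expr) else repr(other)
        return Expr(f"({self.text} {op} {right})")

    # Note: overriding __eq__ like this makes instances unhashable.
    def __eq__(self, other):
        return self._binary("=", other)

    def __gt__(self, other):
        return self._binary(">", other)

    def __and__(self, other):
        return self._binary("AND", other)

    def __or__(self, other):
        return self._binary("OR", other)

    def __str__(self):
        return self.text


class Column(Expr):
    pass


age = Column("age")
name = Column("name")

# Parentheses are required: & binds tighter than > and == in Python.
where = (age > 18) & (name == "Andrew")
print(where)
```

The need for those extra parentheses is a small example of the host language leaking through: you have to know Python's operator precedence to use the "language" correctly.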

cabalamat · on Dec 19, 2008

> So, most of what we normally call "DSLs," (i.e. method chaining languages, overloading operators) aren't.

I don't call these things DSLs. Other people might.

wilkes · on Dec 19, 2008

http://www.martinfowler.com/bliki/DomainSpecificLanguage.htm... distinguishes between Internal and External DSLs. Internal DSLs are a style of programming, usually non-idiomatic. External DSLs require parsing.

apgwoz · on Dec 19, 2008

> Other people might.

Other people do. Maybe it's not the proper terminology, but it's too common to ignore it.

tocomment · on Dec 19, 2008

I think it just means writing a programming language, but a little one. So use pyparsing as your parser. The May 2008 Python magazine has a good article on making an interpreter/compiler for brainf*ck using pyparsing.

wilkes · on Dec 19, 2008

I've really enjoyed using PyParsing. O'Reilly has a ShortCut book on it that is worthwhile. http://oreilly.com/catalog/9780596514235/

cabalamat · on Dec 19, 2008

I have implemented two languages in Python, one a DSL, the other a general-purpose language. Both times I used the SPARK parsing framework http://pages.cpsc.ucalgary.ca/~aycock/spark/ which I recommend.

jlc · on Dec 19, 2008

I ran into this a while back:

http://blog.brianbeck.com/post/53538107/python-dsl-i

nostrademons · on Dec 19, 2008

I used yapps2 for the DSL parts of GameClay. It's probably not the best Python parser generator out there, but it was installable via apt, simple to use, and familiar to people who have used ANTLR. I had a simple DSL written in 2 days.

Also, if it's just a quick-n-dirty hack, don't underestimate the power of regexp/split() parsing. Yeah, everyone says you shouldn't do it, and you probably shouldn't if you're planning on maintaining things for a while, but you can get going really fast without learning any additional tools. And for maybe 80-90% of the things you'd use a DSL for, it's perfectly adequate.
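
A quick sketch of what that kind of regexp/split() parsing can look like, using only the standard library. The little "verb target key=value ..." language, the parse function, and the sample script are all invented for the example.

```python
import re

# One command per line: a verb, a target, then optional key=value pairs.
LINE = re.compile(r"^(?P<verb>\w+)\s+(?P<target>\w+)(?:\s+(?P<args>.*))?$")


def parse(source):
    """Parse a hypothetical line-oriented DSL into (verb, target, options) tuples."""
    commands = []
    for lineno, raw in enumerate(source.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = LINE.match(line)
        if match is None:
            raise SyntaxError(f"line {lineno}: cannot parse {raw!r}")
        options = {}
        for pair in (match.group("args") or "").split():
            key, sep, value = pair.partition("=")
            if not sep:
                raise SyntaxError(
                    f"line {lineno}: expected key=value, got {pair!r}")
            options[key] = value
        commands.append((match.group("verb"), match.group("target"), options))
    return commands


script = """
# build rules
compile main opt=2 debug=yes
link main
"""

commands = parse(script)
```

It falls over as soon as you want quoting, nesting, or values containing spaces or '#', which is roughly where a real parser generator starts to pay for itself.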

gruseom · on Dec 20, 2008

Although I got excited about it at first, I've come to the conclusion that this DSL stuff is seriously overrated. I've seen few successful examples and many silly ones.

There are certainly good little specialized languages -- Common Lisp's LOOP is an example, as are regular expressions -- but these are for programmers, not domain experts.

ivank · on Dec 19, 2008

You could look at SQLAlchemy, Storm, Schevo, or Nevow Stan. Or even BeautifulSoup's element-finding.

thorax · on Dec 20, 2008

No joke: take a look at the LOLpython source for an example of creating a variant language that can compile to Python bytecode. It's one of many methods to create a DSL, though you often don't need access to all of Python in the DSL itself.

intellectronica · on Dec 20, 2008

You can customize the behaviour of Python objects in ways which may allow you to implement DSLs (for example, by responding to arbitrary messages, or by constructing callables).

A good example to look at might be https://storm.canonical.com/ which utilizes every trick in the book to allow you to write SQL-like queries in Python.
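
As a small sketch of "responding to arbitrary messages" and "constructing callables": the hypothetical Builder below answers any attribute lookup with a callable Tag, which gives a tiny markup-building notation. None of these names come from Storm or any other library; only the escaping helpers are from the standard library.

```python
from xml.sax.saxutils import escape, quoteattr


class Markup(str):
    """Marks a string as already-rendered markup so it is not escaped again."""


class Tag:
    def __init__(self, name):
        self.name = name

    def __call__(self, *children, **attrs):
        attr_text = "".join(
            f" {key}={quoteattr(str(value))}" for key, value in attrs.items())
        body = "".join(
            child if isinstance(child, Markup) else escape(str(child))
            for child in children)
        return Markup(f"<{self.name}{attr_text}>{body}</{self.name}>")


class Builder:
    def __getattr__(self, name):
        # Only called when normal lookup fails, so any unknown name becomes a tag.
        if name.startswith("_"):
            raise AttributeError(name)
        return Tag(name)


h = Builder()
page = h.ul(h.li("one"), h.li("a < b"), id="items")
print(page)
```

One limitation of this approach is that Python keywords such as class cannot be passed as keyword arguments, so a real builder needs some workaround for attributes with those names.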

jgalvez · on Dec 21, 2008

http://github.com/galvez/gae-rest/tree/master/xmlbuilder.py