Ask HN: Writing DSL in Python
26 points by oltmans on Dec 19, 2008 | 29 comments
Hey Hackers,

I was reading this blog article http://weblog.jamisbuck.org/2006/4/20/writing-domain-specific-languages

and I've no idea how one can go about writing a DSL in Python. I have rudimentary knowledge of Python and I'm interested in writing a DSL just for fun and knowledge. Can anyone please point out what I need to learn to be able to write a DSL in Python? Any help is appreciated.




Study up on context managers, decorators, and meta-classes, which will assist in creating DSLs. However, you don't really need to use these features to build DSLs.
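To make the decorator and context manager ingredients concrete, here is a minimal sketch of a build-script style mini-language. Every name in it (task, section, run, TASKS, LOG) is made up for illustration and does not come from any library; it is one possible shape, not a recipe.

```python
from contextlib import contextmanager

# Hypothetical registry: task name -> (function, names of tasks it depends on).
TASKS = {}
# Hypothetical log of what happened, so the result is easy to inspect.
LOG = []

def task(*depends_on):
    """Decorator that registers a function as a named build step."""
    def register(func):
        TASKS[func.__name__] = (func, depends_on)
        return func
    return register

@contextmanager
def section(title):
    """Context manager that brackets a group of steps in the log."""
    LOG.append(f"begin {title}")
    try:
        yield
    finally:
        LOG.append(f"end {title}")

def run(name, done=None):
    """Run a task after its dependencies, each at most once."""
    done = set() if done is None else done
    if name in done:
        return
    func, depends_on = TASKS[name]
    for dependency in depends_on:
        run(dependency, done)
    func()
    done.add(name)

# The "DSL" part: declarations that read like a build file.
@task()
def compile_sources():
    LOG.append("compiling")

@task("compile_sources")
def run_tests():
    LOG.append("testing")

with section("build"):
    run("run_tests")
```

The decorator handles declaration and the with block handles nesting; the rest is ordinary Python, which is rather the point.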

The use of method chaining (think jQuery) can lead to interesting languages.

Or, take Django's chaining for making queries. User.objects.filter(first_name='Andrew').order_by('last_name')
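A chaining interface like that can be sketched in a few lines. This is not how Django implements it; the Query class below is a hypothetical in-memory stand-in that only shows the trick, which is that every method returns a new query object.

```python
class Query:
    """Hypothetical in-memory query builder; each call returns a new Query."""

    def __init__(self, rows, filters=(), sort_key=None):
        self._rows = rows
        self._filters = tuple(filters)
        self._sort_key = sort_key

    def filter(self, **conditions):
        # Returning a fresh object keeps earlier queries reusable.
        return Query(self._rows, self._filters + (conditions,), self._sort_key)

    def order_by(self, field):
        return Query(self._rows, self._filters, field)

    def all(self):
        # Nothing is evaluated until the chain is finished.
        rows = [
            row for row in self._rows
            if all(row.get(key) == value
                   for conditions in self._filters
                   for key, value in conditions.items())
        ]
        if self._sort_key is not None:
            rows.sort(key=lambda row: row[self._sort_key])
        return rows

users = Query([
    {"first_name": "Andrew", "last_name": "Smith"},
    {"first_name": "Beth", "last_name": "Jones"},
    {"first_name": "Andrew", "last_name": "Adams"},
])
result = users.filter(first_name="Andrew").order_by("last_name").all()
```

Here result holds the two Andrew rows, sorted by last name.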

Or, you can get more complicated and do something like described at: http://code.activestate.com/recipes/534150/

EDIT: s/be vital/assist/ on line 1.


apgwoz has an excellent suggestion. Python has a clean syntax, so you may not need to invent your own. Meta-classes allow you to do the DSL magic behind the scenes. It's less fun than inventing a whole new language, but you can get something working in a couple of hours. Even more useful are the mind-stretching exercises inherent in understanding meta-classes and their friends. You will see programming in a whole new light.
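As a rough illustration of that behind-the-scenes magic, here is a small sketch in Python 3 syntax. Field, ModelMeta, Model and User are invented names, not taken from any framework, and base-class fields are deliberately ignored to keep it short.

```python
class Field:
    """Hypothetical field declaration that records the expected type."""
    def __init__(self, kind):
        self.kind = kind

class ModelMeta(type):
    """Metaclass that collects the Field attributes declared in a class body."""
    def __new__(mcls, name, bases, namespace):
        fields = {key: value for key, value in namespace.items()
                  if isinstance(value, Field)}
        cls = super().__new__(mcls, name, bases, namespace)
        cls._fields = fields  # assumption: inherited fields are not merged in
        return cls

class Model(metaclass=ModelMeta):
    def __init__(self, **values):
        # Validate each declared field against the keyword arguments.
        for name, field in self._fields.items():
            value = values[name]
            if not isinstance(value, field.kind):
                raise TypeError(f"{name} must be {field.kind.__name__}")
            setattr(self, name, value)

# The declarative part: the class body reads like a schema.
class User(Model):
    name = Field(str)
    age = Field(int)

user = User(name="Ada", age=36)
```

The user of the mini-language only ever writes the User class; the metaclass turns those declarations into validation logic.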


> Study up on context managers, decorators, and meta-classes, which will be vital to creating DSLs. However, you don't really need to use these features to build DSLs.

Make your mind up. Either they're vital or they're not. You can't have both.


No. I won't make my mind up. I'm not providing a recipe for DSLs in Python. I'm simply stating ingredients and saying "now go create."


I think he/she was more criticizing that you effectively said "X, Y and Z are vital" and then "But X, Y and Z are not vital."

If you'd said "useful" instead of "vital" it'd have been fine logically, but "vital" implies necessity.


Yeah. I ended up editing the original post when I realized it later.


I would recommend taking a quick look at how SCons and Conary use Python. You may also want to consider implementing your DSL using a Python parser generator such as Ply.

SCons - http://www.scons.org/doc/HTML/scons-man.html#lbBB

Conary - http://wiki.rpath.com/wiki/Conary:Recipe_Structure

Ply - http://www.dalkescientific.com/writings/NBN/parsing_with_ply...


I still haven't quite grasped the concept behind what a DSL really is, and what makes your code a DSL versus any other program or API. Any insight that can be offered, especially in the context of something other than Ruby and Lisp? I just want to know what all the hubbub is about, and why it matters to me?


A Domain Specific Language is optimized for performing tasks within a limited problem area. Usually the syntax is simplified to strip out features that don't pertain to the problem. Examples are shell scripting languages like bash or the Windows shell, awk, sed, SQL, HTML, TeX, or JSON. Imagine trying to typeset a document in raw Java.

There used to be lots of little declarative DSLs. Think make. Now they have been co-opted by XML. Ant uses XML, but would be much prettier if it had its own DSL. Another example: you can hand-code a parser, but it is much easier to code and debug if you use BNF (YACC or Bison).


> There used to be lots of little declarative DSLs. Think make. Now they have been co-opted by XML. Ant uses XML, but would be much prettier if it had its own DSL.

XML is orthogonal: it is just another way of expressing the syntax of a DSL. Ant's build file format may be ugly, but it is also a DSL (it is just syntactically encoded using XML).


There are multiple definitions, so what follows is not authoritative, but IMHO it will help you understand the field better, as well as why so many people seem to be talking past each other as they silently adopt one definition while arguing with someone based on another.

A strong DSL is a specialized language designed for a specific task, with its own parser and syntax not directly based on another language. Ideally, it is not Turing complete, because that opens a huge can of worms, and much of the point of a strong DSL is to avoid this can of worms. The DSL should be carefully designed in conjunction with the eventual users, who most likely will not be programmers. I've heard of impressive results with this approach, but I've never witnessed them firsthand. This definition is favored by Martin Fowler, and much less importantly/impressively, me.

Weak DSLs are basically APIs written such that in their target containing language, the calls into the API read conversationally and with minimal boilerplate not directly related to the domain of the problem. In general (and I emphasize that I am generalizing), they can't be used to their full power without understanding the containing language, so you still need to be a programmer, albeit perhaps with less experience. You may see demos on the home page for the DSL about how it is possible to use without knowing the containing language, but it tends to be demoware; much of the gain is intrinsically that you still have a full Turing-complete language backing you up.

People will bikeshed about punctuation and claim that Ruby can build better weak DSLs, but in my experience, like I said, you still have to understand the containing language to really use the language, so the virtue of dropping off punctuation marks is oversold; the syntactic constructs are still there and you still need to understand them, so whether you actually see them in source is much more a matter of taste than a matter of "goodness".

Mixing people who use different definitions of DSL is a recipe for much heat and no light. It should also be pointed out that the strong DSL people make certain claims about why DSLs are good, along with the reasons why these things are true, and these claims should not be carelessly translated to weak DSLs.

While I use the terms "strong" and "weak", they should be understood as descriptive, not a claim of goodness, much like "strong AI" and "weak AI".

If you look at a weak DSL and wonder what the deal is, it is because rather than a magical technology, it is merely a style of API... there really isn't much "there" there, but it can be useful in some cases. I would never write a weak DSL for a DSL's sake, though; just evolve the API and see what happens.

Personally, I think it is a continuum between "API" and "weak DSL", not a binary distinction, but for "strong DSL", I would say that if you're backing to a full Turing complete language like Ruby, you do not have a strong DSL and should not expect to reap the benefits. (Which is good, since you're not paying the price, either. Nothing is free.)


One example of a successful strong DSL (i.e. standalone) is James Gosling's project for satellite control (I think it was), which he built so that users could make their own changes without bothering him. (I doubt details are online; he mentioned it in an interview as a virtual machine that was a precursor to Java.)

Our own pg also had a DSL for the Yahoo Store (formerly Viaweb), so users could customize it. From the docs, it looks like it is implemented as Lisp with isomorphic friendlier syntactic sugar.

On weak DSLs (i.e. within a host language), I still think it's a virtue to try to make them as self-contained as possible. Mini-languages are surprisingly common: regular expressions; printf format syntax; XPath within XSLT. I saw a great comment recently [1] suggesting domain-specific error messages. I think this is an oft-overlooked aspect of the abstraction (it's not just syntax!), and making the abstraction as non-leaky as possible helps everyone (programmer and non-programmer alike).

[1] http://debasishg.blogspot.com/2008/04/external-dsls-made-eas...


Thank you. That's the type of response I was hoping for.


I've also been interested in DSLs and found some very cool code samples implementing an EDSL (embedded DSL) for describing shapes. I have looked at it countless times for ideas when I'm writing my own. Although the code is in Haskell, I think you can learn a thing or two from it.

http://www.cs.chalmers.se/Cs/Grundutb/Kurser/afp/lectures.ht...


A DSL is a language that is specific to a domain. DSLs may or may not be Turing-complete, but typically you wouldn't write the whole of a big project in one.

The difference between a DSL and an API? An API uses the host language's syntax, but a DSL introduces its own syntax, which must then be parsed.
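For a sense of what "its own syntax, which must then be parsed" involves, here is a minimal sketch of a hand-written tokenizer and recursive-descent evaluator for a toy arithmetic language. The grammar and the names tokenize and evaluate are chosen for illustration only; a real DSL would have a domain-specific grammar and probably better error reporting.

```python
import re

# One token is a number or a single operator/parenthesis, after optional spaces.
TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def evaluate(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():    # expr := term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():    # term := factor (('*' | '/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():  # factor := NUMBER | '(' expr ')'
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if token == "(":
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if token in {"+", "-", "*", "/", ")"}:
            raise SyntaxError(f"unexpected token {token!r}")
        return float(token)

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result
```

With this, evaluate("2 + 3 * (4 - 1)") gives 11.0. The language has no loops or variables, so it is nowhere near Turing-complete.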


> The difference between a DSL and an API? An API uses the host language's syntax, but a DSL introduces its own syntax, which must then be parsed.

Well, this is a conundrum then. You can't just create new syntax in languages that don't support a way to extend the syntax (see Lisp macros, and I even get to mention TCL here). So, most of what we normally call "DSLs," (i.e. method chaining languages, overloading operators) aren't.

So, I guess we should start calling them Domain Specific APIs--DSAPIs, if you will.
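Whatever you call them, the operator-overloading variety looks roughly like the sketch below. Column, Literal, BinaryOp and compile_where are hypothetical names, not the API of any real ORM; the idea is that operators build an expression tree instead of computing a value.

```python
class Expr:
    """Base class: & and | combine expressions into a larger tree."""
    def __and__(self, other):
        return BinaryOp(self, "AND", other)

    def __or__(self, other):
        return BinaryOp(self, "OR", other)

class Column(Expr):
    def __init__(self, name):
        self.name = name

    # Comparisons return tree nodes rather than True or False.
    def __eq__(self, value):
        return BinaryOp(self, "=", Literal(value))

    def __gt__(self, value):
        return BinaryOp(self, ">", Literal(value))

    def __lt__(self, value):
        return BinaryOp(self, "<", Literal(value))

    def render(self, params):
        return self.name

class Literal(Expr):
    def __init__(self, value):
        self.value = value

    def render(self, params):
        # Values travel separately as parameters, never inside the SQL text.
        params.append(self.value)
        return "?"

class BinaryOp(Expr):
    def __init__(self, left, op, right):
        self.left, self.op, self.right = left, op, right

    def render(self, params):
        left = self.left.render(params)
        right = self.right.render(params)
        return f"({left} {self.op} {right})"

def compile_where(expr):
    """Walk the tree and return (sql_text, parameters)."""
    params = []
    return expr.render(params), params

age = Column("age")
name = Column("name")
sql, params = compile_where((age > 18) & (name == "Ada"))
```

Python's and/or cannot be overloaded, which is why this style uses & and |, and why the parentheses around each comparison are required: & binds more tightly than > and ==.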


> So, most of what we normally call "DSLs," (i.e. method chaining languages, overloading operators) aren't.

I don't call these things DSLs. Other people might.


http://www.martinfowler.com/bliki/DomainSpecificLanguage.htm... distinguishes between Internal and External DSLs. Internal DSLs are a style of programming, usually non-idiomatic. External DSLs require parsing.


> Other people might.

Other people do. Maybe it's not the proper terminology, but it's too common to ignore it.


I think it just means writing a programming language, but a little one. So use pyparsing as your parser. The May 2008 Python magazine has a good article on making an interpreter/compiler for brainf*ck using PyParsing.


I've really enjoyed using PyParsing. O'Reilly has a ShortCut book on it that is worthwhile. http://oreilly.com/catalog/9780596514235/


I have implemented two languages in Python, one a DSL, the other a general-purpose language. Both times I used the SPARK parsing framework http://pages.cpsc.ucalgary.ca/~aycock/spark/ which I recommend.



I used yapps2 for the DSL parts of GameClay. It's probably not the best Python parser generator out there, but it was installable via apt, simple to use, and familiar to people who have used Antlr. Had a simple DSL written in 2 days.

Also, if it's just a quick-n-dirty hack, don't underestimate the power of regexp/split() parsing. Yeah, everyone says you shouldn't do it, and you probably shouldn't if you're planning on maintaining things for a while, but you can get going really fast without learning any additional tools. And for maybe 80-90% of the things you'd use a DSL for, it's perfectly adequate.
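A quick-n-dirty regexp/split() parser of that sort might look like the sketch below. The line format ("name = value" or "command arg1 arg2") and the sample script are invented for illustration.

```python
import re

# Hypothetical line format: "<name> = <value>" or "<command> arg1 arg2 ...".
ASSIGNMENT = re.compile(r"^(\w+)\s*=\s*(.+)$")

def parse(source):
    """Turn a small config-like script into a list of tuples."""
    program = []
    for raw_line in source.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = ASSIGNMENT.match(line)
        if match:
            program.append(("set", match.group(1), match.group(2)))
        else:
            command, *args = line.split()
            program.append((command, *args))
    return program

script = """
# move a sprite around
speed = 3
move left 10
jump
"""
program = parse(script)
```

There is no quoting, nesting, or error reporting here, which is exactly the trade-off: it works until the language grows.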


Although I got excited about it at first, I've come to the conclusion that this DSL stuff is seriously overrated. I've seen few successful examples and many silly ones.

There are certainly good little specialized languages -- Common Lisp's LOOP is an example, as are regular expressions -- but these are for programmers, not domain experts.


You could look at SQLAlchemy, Storm, Schevo, or Nevow Stan. Or even BeautifulSoup's element-finding.


No joke: take a look at the LOLpython source for an example of creating a variant language that can compile to Python bytecode. It's one of many methods to create a DSL, though you often don't need access to all of Python in the DSL itself.


You can customize the behaviour of Python objects in ways which may allow you to implement DSLs (for example, by responding to arbitrary messages, or by constructing callables).

A good example to look at might be https://storm.canonical.com/ which utilizes every trick in the book to allow you to write SQL-like queries in Python.
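Responding to arbitrary messages and constructing callables can be sketched with __getattr__. The TagBuilder class below is hypothetical and is not how Storm (or any other library named in this thread) works; it only shows the mechanism.

```python
from html import escape

class TagBuilder:
    """Any attribute access yields a callable that renders that tag."""

    def __getattr__(self, tag_name):
        # Called only when normal attribute lookup fails, i.e. for every tag.
        def render(*children, **attributes):
            attrs = "".join(
                f' {key}="{escape(str(value))}"'
                for key, value in attributes.items()
            )
            # Children are assumed to be already-rendered markup, so they
            # are joined as-is rather than escaped.
            return f"<{tag_name}{attrs}>{''.join(children)}</{tag_name}>"
        return render

tags = TagBuilder()
page = tags.ul(
    tags.li("first"),
    tags.li(tags.a("second", href="/two")),
)
```

No tag names are defined anywhere; tags.ul, tags.li and tags.a all come into existence when they are asked for.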




