Typographic Programming Language (joshondesign.com)
107 points by nickmain on Aug 22, 2014 | 73 comments

I've been doing some of this in my latest prototype; check out http://research.microsoft.com/en-us/people/smcdirm/managedti... (look at the last few examples; it is not ready so please don't submit it to HN).

A few notes:

We convert () into a box for viewing, but de-convert it back to () when the line is selected for editing.

Unicode (and non-Unicode special characters) is used as a rendering for ASCII-based operators. So >= is rendered as "≥"; however, for editing to work, we add a dot at the end so the character count remains the same, so it's actually "≥∙". If you delete the dot, you are basically deleting the ASCII "=", so the character becomes ">".
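A rough sketch of the padding trick described above, in Python. The glyph table and the `render` name are assumptions made for illustration, not the prototype's actual code; the only point is that rendered and source text stay the same length, so every displayed column maps back to one ASCII character.

```python
# Hypothetical operator table; the real prototype's mapping is not known.
PAD = "∙"
GLYPHS = {">=": "≥", "<=": "≤", "!=": "≠", "->": "→"}

def render(source: str) -> str:
    """Replace two-character ASCII operators with a glyph plus padding dot."""
    out = []
    i = 0
    while i < len(source):
        two = source[i:i + 2]
        if two in GLYPHS:
            # One glyph plus one pad keeps the width at two characters.
            out.append(GLYPHS[two] + PAD * (len(two) - 1))
            i += 2
        else:
            out.append(source[i])
            i += 1
    return "".join(out)

line = "if a >= b: return a"
shown = render(line)
print(shown)
assert len(shown) == len(line)  # columns still line up with the source
```

Because the lengths match, deleting the displayed dot at column n can be applied to the source as deleting column n, which is the "=".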

I thought about colors and graphics, but they create huge real estate issues, and they aren't widely applicable. Also, people get what "red" is while the literal color red is quite ambiguous.

Rather than find a new symbol for multiplication, it probably makes more sense to render the single x variable differently (as well as other single variable identifiers like i and j). I haven't implemented this yet, but it's on my list.

If you look at the videos, you'll see that static and run-time errors are rendered underneath the tokens they are related to. The idea is to keep the feedback as close to the code as possible.

Isn't this really just clever syntax highlighting? Emacs, for instance, can already draw boxes around things. I spent about 5 minutes and got close[1]. If you had something like "pretty-symbols" installed [2] then you could probably make the quote characters themselves not show up at all. You could probably also do it with some font-lock code.

If it's more than syntax highlighting, then how do you make the "quoted" text? What keypress do you use to start and end the markup that signifies quotes? Maybe the quote key ;-)?

[1] http://i.imgur.com/BNO5fNJ.png

[2] https://github.com/drothlis/pretty-symbols

I agree with this (the second part). You HAVE to somehow make that string a string. So maybe you will only hit one key rather than 2 (quote, and quote again), but honestly you have macros that allow you to hit it only once.

I think one big part of all of this is speed of execution: both maximum speed and the learning curve. And I'm not at all sure that we can do better than typing text. Hell, we've been trying to get rid of the keyboard for years, and yet nothing is faster than a VIM ninja.

If the source file format for the language is not plain text but already a tree-like file format, then no, it's not just syntax highlighting but the natural way to handle this file.

I could imagine a source file format where the nodes have different types for different literals. There would be far fewer parsing problems than with plain text; of course, it would need a special editor.

Needing a special editor is probably an insurmountable problem. There's a reason why textual representation of source code is so convenient.

Also, there must be a better example than quotes and strings for this "visual" representation to make sense. There is nothing hard about either reading or typing them. It's not even a particularly nerdy concept; readers and writers everywhere know how to handle quotes.

Quotes are already visual by the way. They simply aren't color-based. Which probably makes colorblind readers glad.

Already in a tree format like lisp? Or some binary structure? Even if it's some fancy binary format, it's still going to have to be parsed by the compiler and the editor (and thoroughly checked for syntax errors), making it no better than what we already have now (just different).

And unless those node types are "green round-rectangle", then I posit that it's still syntax highlighting, since the editor is deciding how to display "string node".

Wouldn't storing and editing source code as an AST make a lot of syntax errors impossible?

User-created syntax errors, possibly. But it can't completely eliminate them:

    5 ↤ "hello"
Now you could argue that the editor might detect that and not allow it, but if your abstract syntax tree data structure can support it, then you will be handed it at some point and so you should detect it.

But more importantly, the abstract tree itself has to be stored. Whatever the format (binary or otherwise), that has to be checked, too (unless you want buffer overflows or code execution exploits in your compiler/editor).
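To make the point concrete, here is a minimal sketch in Python. The node classes and the `check` function are invented for illustration: the data structure happily represents `5 ↤ "hello"`, so whoever consumes the tree still has to validate it.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Name:
    ident: str

@dataclass
class IntLit:
    value: int

@dataclass
class StrLit:
    value: str

@dataclass
class Assign:
    target: "Expr"
    value: "Expr"

Expr = Union[Name, IntLit, StrLit, Assign]

def check(node: Expr) -> list[str]:
    """Return the structural errors found in the tree (empty list if none)."""
    errors = []
    if isinstance(node, Assign):
        # The tree type allows any node as a target, so this must be checked.
        if not isinstance(node.target, Name):
            errors.append(f"cannot assign to {type(node.target).__name__}")
        errors.extend(check(node.value))
    return errors

bad = Assign(IntLit(5), StrLit("hello"))    # 5 ↤ "hello"
good = Assign(Name("x"), StrLit("hello"))   # x ↤ "hello"
print(check(bad))
print(check(good))
```

A stricter type for `target` would rule this particular case out by construction, but the file on disk can still contain anything, so the loader ends up doing the same check.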

All in all, it sounds like the same stuff compilers already have to deal with, so I don't think it would be a win there.

Or JSON. Images could even be embedded using base64.

It's the opposite of syntax highlighting, it's highlighting which is syntax.

I guess I'm saying I don't see the distinction.

How would you store that highlighting to disk?

In a .pdf?

In seriousness, I think this kind of extreme highlighting could be an important step in making programming more accessible. There are only so many characters like ,;:.()[]{}<> available to use in code, and often most of them indicate completely different things in different places.

Done the right way, and combined with live evaluation à la Bret Victor or the Swift playground, I think this has the potential to be the next paradigm of accessible programming, after Excel.

Mathematica/Wolfram Language enables this with quite a bit of success, but it's by no means a silver bullet. The reality is this does not change semantics.


There are definitely things like LISPs with AST editors, or Squeak and the blocks, but the closest to what you're on about would be http://colorforth.com/ which is "niche" but has a strong heritage.

APL uses/used non-ASCII characters. I don't think most people think of that as one of its strong points.

What seems to have caught on instead is editors that use syntax highlighting/formatting to show ascii source code in not-pure-ascii ways. It would be interesting to take this further.

JavaScript also supports non-ascii characters in variables. You can make some pretty beautiful code involving mathematics (if the formula happens to be simple enough, that is). E.g.:

{ C: r => 2 * π * r }

I believe these visual cues work against thinking in code and delay learning, though the concept has use as a debugging layer or transitory programming aid. That said, truly visual and/or AI-assisted programming, well executed, could blow the doors off programming productivity and accessibility.

Related: http://tratt.net/laurie/blog/entries/an_editor_for_composed_... (posted a few days ago here: https://news.ycombinator.com/item?id=8201707).

There's a link inside to their paper on Language Boxes, which formalize the whole idea. It uses an incremental parser where nodes of the syntax tree can be opaque "boxes" which could contain any other nested AST, as the languages are parsed separately. It does a nice job of tackling the syntax composition problems, of which the proposals seem like a special case.

As Tratt points out though, the ideas are not particularly novel - they've been around for decades, but are usually met with resistance as they break existing workflows or programmer expectations. The design philosophy behind Eco, their prototype editor for language boxes, is that it should look and behave almost exactly like a traditional text editor. It does support highlighting of inner languages when they're selected, but the highlighting is usually not visible. I'm pretty sure we could turn them always on and get the kind of appearance this post is looking for.

Am I completely wrong in thinking that this isn't really e.g. "removing delimiters", but effectively changing the delimiters into invisible markup? It seems to me kind of an arbitrary distinction whether the way you represent a string to a compiler is with quotes or markup that renders a green box.

As far as the concatenation thing surely that's solvable by a grammar that interprets a single line list of expressions as an implicit concatenation. It doesn't seem like something that requires a particularly smart compiler - or a new IDE. Maybe I'm overlooking something inherently hard about the problem though.
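A minimal sketch of that idea in Python, assuming a toy grammar where a line is just a run of double-quoted literals (no escapes) and identifiers. The `eval_line` name and the `env` lookup are invented for the example.

```python
import re

# A token is a double-quoted literal (no escapes, for brevity) or an identifier.
TOKEN = re.compile(r'\s*(?:"([^"]*)"|([A-Za-z_]\w*))')

def eval_line(line: str, env: dict) -> str:
    """Evaluate adjacent expressions on one line as an implicit concatenation."""
    parts = []
    pos = 0
    line = line.rstrip()
    while pos < len(line):
        m = TOKEN.match(line, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at column {pos}")
        literal, name = m.group(1), m.group(2)
        # Literals contribute their text; identifiers are looked up in env.
        parts.append(literal if literal is not None else str(env[name]))
        pos = m.end()
    return "".join(parts)

print(eval_line('"hello, " name "!"', {"name": "world"}))
```

Nothing here needs a smart compiler; it is an ordinary juxtaposition rule, and the hard part is only deciding how it interacts with the rest of the grammar.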

I think the deeper point is not about the highlighting, but the ability to interact with the code semantically. If your editor was completely aware of the semantic structure of your code, it could provide very powerful refactorings that respected line-associated comments and things like that.

For example, you would type a string and wouldn't need to worry about whether it is represented as a multi-line string or how to properly escape newlines and special characters. Your editor would know that you are typing a string and do the appropriate thing.

That said, many advanced IDEs do have very sophisticated syntax parsers which can do many of these things already. Light Table and Lisps evolve this even further. I would imagine that a language designed with this use case in mind would enable even tighter integration with the development environment though.

The languages that first became popular were restricted to the inflexible orthography of pre-computer punched-card accounting systems, but they aren't the whole story. Other early programming languages took advantage of teletypewriter I/O, and used features like two-color ribbons (e.g. with comments in red) and half-line motions for superscripts (exponentiation) and subscripts. Some, like COLASL, MADCAP, and the Klerer-May system, even accepted expressions typed in two-dimensional form.

This (free PDF) article contains some examples of programming with typed two-dimensional input:

M. Klerer and J. May, A user oriented programming language, The Computer Journal (1965) 8 (2): 103-109. doi:10.1093/comjnl/8.2.103 http://comjnl.oxfordjournals.org/content/8/2/103.abstract

It's definitely an idea I've thought about a lot (and would gladly work on). I really think strictly using plain text as the interface to the compiler is a dead end for language design. Language designers base features around what symbols are available in ASCII... What would be needed of course would be library support for extending this interface. For example, an image library, and a complementary IDE plugin for managing image resources.

One huge thing plain text representation has going for it is the diversity of text editors, and the fact they can be trivially swapped. In my opinion, "visual" source code representations will also be a dead end if they mean proprietary, mutually incompatible IDEs.

It might be nice to read code like this, but what is the big idea? I had to use LabVIEW and out of comfort opted for the script engine to implement as much as it could, because the graphical building blocks still represent the same syntax.

It's not particularly hard to implement a code viewer for the specific examples, and the parser is one of the less complicated parts of a compiler, I suppose. Instead of quotes as delimiters you get e.g. XML tags that the editor generates.

Seems like everyone who spends 20 minutes designing a programming language comes up with this idea or similar. Yawn. Syntax isn't what makes programming hard.

It is not about writing, but about reading. We actually spend most of our time doing the latter rather than the former.

A nice syntax that makes things easier to read is a nice to have. It doesn't make writing or reading programs substantially easier, though. IMO.

Did you contradict yourself or am I missing something between the first and second sentence?

Maybe the word 'substantially'

Fair enough. I guess you don't really believe in UX design either, since how something looks and feels is orthogonal to its functionality.

The shallower the problem you're solving, the more important UX issues will be, relatively. When you're writing complex software, putting the text in boxes instead of quotes matters approximately not at all.

Programming language designers are basically UX designers for the task of programming. It's all very important, or Lisp would be much more popular than it is.

This reminds me of Scratch. Indeed, it seems that if you would develop this idea further you would end up with something resembling Scratch.

Hi. I'm Josh, the original author. Sorry, I didn't realize someone posted this to Hacker News. I'm happy to answer any questions you have.

I've also posted a follow up to my blog, this time focusing on fonts.


Hi Josh,

You might want to check out this LTU thread:


You can find much old work on code typography linked there, like




I'm kind of disappointed (but not surprised) that you didn't try a proportional font for code...it is my mission in life to banish fixed-width fonts from the earth.

While we are listing "typographic" languages, Fortress from Sun deserves a mention. Its syntax was designed to be renderable as LaTeX to get nice typography. Here is one small example: https://software.intel.com/en-us/articles/first-impressions-...

edit: more extensive example: http://imgur.com/a/grrzl (from http://stephane.ducasse.free.fr/Teaching/CoursAnnecy/0506-Ma... )

In my mind, the ideal way to accomplish this would be:

- Encode the program in a data format like JSON or XML. Ideally, this should be readable (though verbose) on its own.

- Create an IDE that renders the JSON/XML using the typographical stylistic flourishes, and that possibly allows the definition of new styled "blocks". This could even be done with CSS.
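The two steps above can be sketched roughly as follows, in Python. The node shape (`type`, `fn`, `args`, `value`) is made up for the example, and the corner brackets ˹ ˼ stand in for whatever box or color a real IDE would draw.

```python
import json

# Hypothetical program tree; a real format would need far more node types.
program = {
    "type": "call",
    "fn": "print",
    "args": [
        {"type": "str", "value": 'say "hi"'},
        {"type": "num", "value": 42},
    ],
}

# Step 1: the verbose but readable on-disk form.
stored = json.dumps(program, indent=2)

def render(node: dict) -> str:
    """Step 2: turn a stored node into the text the IDE would display."""
    kind = node["type"]
    if kind == "str":
        # No quote characters in the display, so nothing needs escaping.
        return "˹" + node["value"] + "˼"
    if kind == "num":
        return str(node["value"])
    if kind == "call":
        args = ", ".join(render(arg) for arg in node["args"])
        return f'{node["fn"]}({args})'
    raise ValueError(f"unknown node type: {kind}")

print(render(json.loads(stored)))
```

The JSON file carries the escaping (`\"`) so the displayed form doesn't have to; the cost is that every tool touching the file needs to understand the node format.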

Creating a language that is sufficiently connected to its IDE that it can define new syntax highlighting, autocompletion, etc. in the code itself would go a long way toward making something like this practical. DrRacket (http://racket-lang.org/) sort of does this already; it even has image literals!

Why use JSON or XML instead of storing it as code? The IDE can parse it even if it's normal code. In fact, all IDEs already parse and understand the code to a degree.

This is the idea behind colorForth: http://www.colorforth.com/cf.htm wherein color has semantic meaning (i.e., it's not just syntax highlighting).

I believe these ideas are separate. I don't think the author is proposing that format be a part of language semantics, just that we should format better.

I dunno, colored boxes around strings instead of quotes sounds an awful lot like format being part of semantics.

This blog post misunderstands the concept of "sufficiently smart compiler". He even links to the c2 wiki page for it, yet if he actually read it he would realize it has nothing to do with what he is writing about...

At the end of the article the author mentions IDEs, but also says "if only our compilers were sufficiently smart". Why? What do compilers have to do with the pretty-fied visual representation of the code, and corresponding input methods? Isn't this all about the IDE?

Also, what's the big deal with removing quotes? I didn't know they were hard to read or understand. Are we removing them from books as well?

I'd welcome a nice rendering of mathematical formulas, sure. But that's merely presentation.

The deal with having string literals not be delimited by actual characters is that you don't have to escape characters. What you see is what you get.

Still completely unnecessary. The editor can display the string minus the escape characters with whatever color, highlighting etc. you like, and still keep the escape characters in the code as stored on disk for the compiler's benefit.
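As a sketch of that split between display and storage (Python, with a deliberately tiny escape table; `to_disk` and `to_display` are names invented here):

```python
# Minimal escape tables for illustration; a real language has more cases.
ESCAPES = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\t": "\\t"}
UNESCAPES = {"\\": "\\", '"': '"', "n": "\n", "t": "\t"}

def to_disk(text: str) -> str:
    """Turn what the user sees in the editor into a quoted, escaped literal."""
    return '"' + "".join(ESCAPES.get(ch, ch) for ch in text) + '"'

def to_display(literal: str) -> str:
    """Turn a quoted literal read from disk back into the text shown on screen."""
    body = literal[1:-1]  # strip the surrounding quotes
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            # Decode a two-character escape; unknown escapes keep the character.
            out.append(UNESCAPES.get(body[i + 1], body[i + 1]))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

seen = 'C:\\tmp says "hi"'
on_disk = to_disk(seen)
print(on_disk)
assert to_display(on_disk) == seen  # the round trip is lossless
```

The compiler and version control only ever see `on_disk`; the box, color, or highlight is purely the editor's business.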

Basically, if you want an editor that displays code with colors and boxes instead of delimiters, that's fine; writing one would be a practical project. It is neither necessary nor desirable to attempt to design a new language, write a new compiler, version control system and half a dozen other ancillary tools at the same time as part of the same project.

Along the lines of some things I have proposed before like in this post http://lambda-the-ultimate.org/node/4998 or this one http://lambda-the-ultimate.org/node/3033 (need to read most of my comments in there to get to the details of the idea).

I like the idea of an intermediary step between textual programming and visual programming. However, the author only discusses how the symbols are rendered. What about input methods? I can think of a few options: use keyboard shortcuts, use the mouse, implement some kind of modal editor (vim-like), or use a special keyboard (the APL keyboard comes to mind).

That's simple... to start a string literal, you type a ", and as you type the green box will continue to expand to contain what you're typing until you type another " to end the literal. This is, somehow, better...

Simply render "xxxx" differently and you are about done; e.g. render it as ˹xxxx˼ with some background highlight.
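For the batch case that is close to a one-liner. A sketch in Python, assuming double-quoted literals with backslash escapes; `box_strings` is a hypothetical name, and the background highlight is left to whatever draws the text.

```python
import re

# A double-quoted literal whose body may contain backslash escapes.
STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')

def box_strings(source: str) -> str:
    """Swap the quote characters around each literal for corner brackets."""
    return STRING.sub(lambda m: "˹" + m.group(1) + "˼", source)

print(box_strings('greet("hello", "wor\\"ld")'))
```

Escapes inside the literal are left untouched here, and the regex knows nothing about comments, so a real renderer would work from the language's own lexer instead.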

Incremental text input is a different problem. If you only have to deal with a batch renderer, then no change is needed; if you have an interactive IDE, you might want to complete the closing " so feedback remains sane.

What I don't see is how the compiler is supposed to distinguish strings from int and byte array literals in those examples.

I am skeptical; it is yet another level of indirection. You write ASCII code with control sequences that is compiled to this semi-graphical representation, and then that is compiled to running code. You can't write the semi-graphical code directly anyway.

It would make more sense with a custom keyboard perhaps, like APL's.

Re. units, see the dimensional library in Haskell. It encodes units at the type level.

Re. encoding data like images at the program level; this is not suitable for general purpose languages (IMHO) because it forces an implementation on the programmer. It's fine for relatively narrow-purpose languages like Mathematica.

I normally hate visual programming languages, but I really like this idea. Still symbolic, but adding a representation where you can... it makes a lot of sense to me.

I could imagine an IDE where it still serializes/deserializes to normal text, but you edit it in a mode like this.

For an example of how this can go horribly wrong, look at the keyboard required to code in APL (https://en.wikipedia.org/wiki/APL_(programming_language)).

I don't think that's all that horrible. If specialist tools require some training to use and have user interfaces that are extremely productive nobody bats an eye. A specialist keyboard for a programming language with very high productivity would make good sense.

Remember that when APL was created these things were not set in stone and what seems wrong to you in retrospect made perfectly good sense at the time (and in fact still makes perfectly good sense today, it's just that the world has moved on from APL to languages that are more verbose and/or that do not require symbols like these).

Math is another such language, and there is no keyboard suitable for entering mathematical expressions so we use software like LaTeX instead.

At least with the APL keyboard the link between input and display was very direct, with LaTeX much less so.

You're equating syntactic abstractions one-to-one with pieces of plastic.


You would need new input methods, because with ASCII, the source code shows what you need to input. But how do you input a colour? There's no standardised method. Also, all the other text-based channels need to be upgraded at the same time.

What we need is a generic tree editor for directly editing your code's parse tree.

Like… Lisp with a structure editor¹?

1) https://en.wikipedia.org/wiki/Structure_editor

Woah, just imagine the endless metaprogramming capabilities!

Imagine... A tree editor that could edit itself...


I actually wrote one of those a while back:


I remember watching your video dude, I like your work

Not necessarily. Could be done with a normal syntax highlighter + word processor. Think of something like WordStar with an RTF kind of text format.

Could this help us avoid the ugliness of nesting quotes and backslash-escaping hell?

Backslash-hell can be solved in ASCII-based languages too. E.g. in ruby there are several ways to have a string literal:

    "hello world"
    "hello, #{name}!" # interpolation    
    %{hello world}
    %{hello, #{name}!} # interpolation
    %{hello, "#{name}"} # interpolation with quotes
    %|hello, "#{name}"| # you can use other kinds of surrounding "brackets" if you don't like curly ones

Seriously, try emacs. You want to see what color you have put in a css field?[1] Want to see the image you are referencing in a repl?[2]

More amazingly, want to combine TeX, lisp, a markdown like language, and python/whatever in a single document? Try org-mode.[3][4]

[1] http://ergoemacs.org/emacs/i/emacs_xah_css_mode_2014-04-22.p...

[2] http://www.nongnu.org/geiser/geiser_3.html#Seeing-is-believi...

[3] http://orgmode.org/worg/org-screenshots.html

[4] https://www.youtube.com/watch?v=1-dUkyn_fZA

The mac app Soulver gets pretty close to this, although it's more of a math scratch pad than a programming language:


Mathematica has a few of these features. If you paste in an image from a previous operation, then the image is shown inline in the piece of code you're typing.

That'll be painful to merge, especially in big teams.

That's why it wasn't presented as a solution, rather as an inspiration. IMO it makes more sense to read code like this, editing it can still be done with the usual delimiters. I'd see an editor that by default shows the formatted code but for lines with an active cursor, it switches to plaintext.
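A small sketch of that editor behaviour in Python. `pretty` is a stand-in renderer and `view` a made-up name; the file on disk stays plain text, so diffs and merges are unaffected.

```python
def pretty(line: str) -> str:
    """Stand-in renderer (an assumption): swap a few ASCII operators for glyphs."""
    return line.replace(">=", "≥").replace("!=", "≠")

def view(lines: list[str], cursor_row: int) -> list[str]:
    """Show every line formatted, except the one holding the cursor."""
    return [
        line if row == cursor_row else pretty(line)
        for row, line in enumerate(lines)
    ]

source = ["if a >= b:", "    if a != 0:", "        return a"]
for shown in view(source, cursor_row=1):
    print(shown)
```

Only the view changes as the cursor moves; `source` is never rewritten, which is what keeps the usual text tooling working.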

I was thinking in terms of, say, doing a diff between previous versions of the files stored in svn, p4, git, etc., and then keep in mind that these might be diffed/merged/viewed/reviewed in tons of different ways: from automatic tools checking certain coding rules, or expanding $fields$ (perforce/rcs), or reviewing online (ReviewBoard), or who knows what.

So far, text files (ASCII, UTF-8, UTF-16, etc.) have made the most sense for source code when it comes to huge groups of people (100+).
