

Ask HN: Thoughts for a new language? - aarongough

Hey all!
A few months ago I decided that I needed to learn the nitty-gritty of how programming languages work much more intimately. With that in mind last week I started the journey toward creating my own programming language called Koi.

The idea so far is that Koi will be a simple, object-oriented language with a syntax much like Ruby's, but built from the ground up to avoid blocking IO and using Fibers for concurrency. On a whim I decided to start from the bottom and am currently working on a VM prototype for the language. The VM is currently written in Ruby for ease of development/experimentation, but will be ported to C once the toolchain/architecture is somewhat stable.

I am interested in hearing any suggestions/tips that you have for the language's design and implementation. At this point it's impossible to tell if the language will be useful, or just a toy. But I would like to give it the best chance possible at succeeding and learn as much as possible in the process.

For those interested, the experimental VM implementation is here: http://github.com/aarongough/koi-vm (Note that it's very incomplete, slow and changing quickly...)
======
chipsy
Try writing a type inference engine. I did one over the last few days as the
first big step in a game scripting language I'm working on - and at least in
my case the implementation turned out to be only a bit convoluted; the main
hurdles are in the details of coercions and casts. Once you have the engine
running you can get lots of "bang" out of it in terms of helpful error
messages and syntactical conveniences.

I represent the different types and coercions as graph nodes (casts are direct
connections between types), and then cache all the possible paths for
inference by walking the graph from each node with a depth-first search.

Once the graph is set up and the paths are assigned, then I can run tests to
see if a coercion path is possible, whether additional coercion or casting
steps are needed, and if there is an ambiguity in the input or output types at
any point. Today I retrofitted my engine to include multiple arguments in
coercions, so that many->one functions can be included in the graph.

A side effect of resolving ambiguities is that I have to include hinting for
both which argument of the coercion is used for input, and for the output
type, if multiple output types are possible.
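
A minimal sketch of the cast-graph idea described above, in Python. It is an illustration, not chipsy's engine: the `CoercionGraph` class, its method names, and the "shortest path wins, ties are ambiguous" rule are all assumptions, and multi-argument coercions and hinting are left out.

```python
from collections import defaultdict


class CoercionGraph:
    """Types are nodes; each directed edge is one cast (hypothetical design)."""

    def __init__(self):
        self.edges = defaultdict(list)  # source type -> directly reachable types
        self.paths = defaultdict(list)  # (source, target) -> all cast paths

    def add_cast(self, src, dst):
        self.edges[src].append(dst)
        self.edges.setdefault(dst, [])  # make sure the target is a node too

    def build_cache(self):
        """Depth-first walk from every node, caching every cycle-free path."""
        self.paths = defaultdict(list)
        for start in list(self.edges):
            self._dfs(start, start, [start])

    def _dfs(self, start, node, path):
        for nxt in self.edges[node]:
            if nxt in path:  # already visited on this path: skip the cycle
                continue
            new_path = path + [nxt]
            self.paths[(start, nxt)].append(new_path)
            self._dfs(start, nxt, new_path)

    def resolve(self, src, dst):
        """Return the cast path, None if impossible; raise if ambiguous.

        Assumption: the shortest path wins, and a tie counts as ambiguous.
        """
        if src == dst:
            return [src]
        candidates = self.paths.get((src, dst), [])
        if not candidates:
            return None
        shortest = min(len(p) for p in candidates)
        best = [p for p in candidates if len(p) == shortest]
        if len(best) > 1:
            raise ValueError(f"ambiguous coercion {src} -> {dst}: {best}")
        return best[0]


graph = CoercionGraph()
graph.add_cast("int", "float")
graph.add_cast("float", "string")
graph.build_cache()
print(graph.resolve("int", "string"))
```

An ambiguity error from `resolve` is the point where a real engine would ask for a hint about which intermediate type to use.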

I should describe the two goals of my language while I'm at it:

1\. To allow the game engine to treat its entities and components as types, so
that the scripts never have to deal with the difference between a "Monster"
archetype and a "Collision" component attached to the monster - where the
collision data is, and how it's accessed, are just part of the type system.
Thus the syntax will let you say something like move(me(),vec2D(3,3)); without
explicitly resolving me() into "the collision component of the entity of the
calling script."

2\. The language includes constructs for timing and tweening; events are
atomic transactions with applicative/imperative abilities, but they hold a
time value, and yield execution after processing "everything that happened"
during a single update timeslice; the script can jump to different moments in
time to loop a cycle of actions; and tweening operations like fades or bounce
effects can be queued to run on every update with new parameters, so that
there is no more timer bookkeeping going on.
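
To make the "no more timer bookkeeping" point concrete, here is a rough Python sketch of queued tweens that advance on every update. The `Tween` and `TweenQueue` names, the linear easing, and the callback style are assumptions for illustration only; the language described above is not specified in this much detail.

```python
class Tween:
    """Linear interpolation from start to end over `duration` seconds.

    Hypothetical helper; `duration` must be greater than zero.
    """

    def __init__(self, start, end, duration, apply):
        self.start = start
        self.end = end
        self.duration = duration
        self.apply = apply  # callback that receives the current value
        self.elapsed = 0.0

    def update(self, dt):
        """Advance by dt seconds; return True while the tween is still running."""
        self.elapsed = min(self.elapsed + dt, self.duration)
        t = self.elapsed / self.duration
        self.apply(self.start + (self.end - self.start) * t)
        return self.elapsed < self.duration


class TweenQueue:
    """Holds active tweens and drops each one once it has finished."""

    def __init__(self):
        self.active = []

    def add(self, tween):
        self.active.append(tween)

    def update(self, dt):
        # Run every tween for this timeslice and keep only the unfinished ones.
        self.active = [tw for tw in self.active if tw.update(dt)]


alpha_values = []
queue = TweenQueue()
queue.add(Tween(0.0, 10.0, 2.0, alpha_values.append))
for _ in range(3):
    queue.update(1.0)  # one call per update timeslice
print(alpha_values)
```

The script only queues the tween; the per-frame timing lives in the queue, which is the bookkeeping the scripts no longer carry.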

I still have to nail down all the details of the runtime model, and then the
syntax. But so far it's looking pretty good.

------
mahmud
Try to implement the flash AVM2 virtual machine:

<http://mahmud.arablug.org/avm2-opcodes-complete.txt>

I started work on it in Lisp and made some minor progress when Adobe decided
to cockblock and released Tamarin under the GPL.

Bit more stuff:

<http://mahmud.arablug.org/avm2-undocumented-opcodes.txt>

Feel free to pick my brains on the flash platform, if you want your toy to
support multimedia. I got a gut-full of it with absolutely no possible use :-/

[Edit:

Should you go the flash route, I should be able to pull some IRC logs of
interesting discussions as well.]

~~~
mahmud
Let me add that language design != language implementation.

Try to prototype your language design in a very high-level language,
preferably one with macros. So you can iron out the kinks rapidly without
being bogged down by machine limitations. Also, try to use a homoiconic syntax
for easy parsing, so you're not stressing over lexical analysis and other
string manipulation crap.
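
As a rough illustration of why a homoiconic (S-expression) syntax keeps parsing cheap, a complete reader fits in a couple of dozen lines of Python. The `tokenize` and `parse` names and the sample expression are made up for this sketch; strings, comments, and quoting are not handled.

```python
def tokenize(source):
    """Split source text into parentheses and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return one nested-list expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens and tokens[0] != ")":
            expr.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing ')'
        return expr
    if token == ")":
        raise SyntaxError("unexpected ')'")
    try:
        return int(token)  # integer literal
    except ValueError:
        return token  # anything else is a symbol


print(parse(tokenize("(define (square x) (* x x))")))
```

The parse result is already the program's data structure (nested lists), so an evaluator or macro expander can work on it directly.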

Once you have the high-level semantics ironed out, you can choose a suitable
intermediate representation, one that can handle several phases of
optimization. A register- or graph-based IR, such as one in SSA form, will
allow you to capture code and data flow, and it presents an underlying
"machine" architecture with abundant resources (cheap infinite registers for
starters).

Stack-based IRs are compact and excellent for virtual interpretation; they
also capture lexical scope and procedure calls very well, but you will have to
expend a bit more effort should you want to compile them for a native
processor.
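
For a feel of how little machinery a stack-based IR needs, here is a toy interpreter in Python. The opcode names (`push`, `add`, `mul`, `jump_if_zero`, `halt`) are invented for this sketch and are not the Koi VM's or AVM2's instruction sets.

```python
def run(program):
    """Interpret a list of (opcode, *operands) tuples and return the final stack."""
    stack = []
    pc = 0  # index of the next instruction to execute
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "jump_if_zero":
            # Pop the condition; jump to the target index when it is zero.
            if stack.pop() == 0:
                pc = args[0]
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# (2 + 3) * 4
print(run([("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]))
```

Operands never name a register, which is what keeps the encoding compact; it is also why a native compiler has to do extra work to map stack slots onto real registers.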

The easiest language to implement would probably be a block-structured Algol
dialect (say, Oberon or Pascal) but without runtime heap allocation. That way
you're not messing with garbage collection. You can allocate your application
memory at startup or compile time and it will sit and stay in there. See
Chapter 10 of this:

[http://homepages.cwi.nl/~steven/pascal/book/pascalimplementa...](http://homepages.cwi.nl/~steven/pascal/book/pascalimplementation.html)

If you want first-class environments and closures you will need GC and a heap.
You will also need a heap if you want first class objects or dynamic exception
handling.

The language design stuff is boring, and you can be inspired by various
designs out there. It's the runtime implementation that's tricky, but very
exciting.

Good luck.

~~~
aarongough
I like the idea of prototyping the language syntax in a high-level language;
it reflects nicely what I'm currently doing with the VM. I'll definitely look
into that.

Clearly I still have a lot more reading/learning to do! Should be fun :-p

