

C as a Portable Intermediate Language - stephth
http://cybertiggyr.com/ick/

======
russell
I can attest; it's easy to do, and quite fruitful if you are playing around
with language design. The major pitfall is the libraries. Unless you find a
set that magically fits your semantics, you will have to roll your own. Think
garbage collection, as well as hashes, lists, and strings. That can be way
more work than you want to get into. I suggest picking a target language that
better fits the semantics of your language: Java, C++ or the like for a
statically typed language, or Python for a dynamically typed language. Or
Haskell, or Lisp.
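
To give a sense of the "roll your own" burden, here is a minimal sketch of the runtime support that even a single dynamically typed `+` needs when the target language does not share your semantics. The names `Value`, `make_int`, `make_str` and `rt_add` are hypothetical and not from the article; a real runtime would also need memory management, hashes and lists on top of this.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical tagged value for a dynamically typed source language.
struct Value {
    enum class Tag { Int, Str };
    Tag tag = Tag::Int;
    long i = 0;      // meaningful when tag == Tag::Int
    std::string s;   // meaningful when tag == Tag::Str
};

Value make_int(long v) {
    Value r;
    r.tag = Value::Tag::Int;
    r.i = v;
    return r;
}

Value make_str(const std::string& v) {
    Value r;
    r.tag = Value::Tag::Str;
    r.s = v;
    return r;
}

// Runtime '+': the operand types are only known when the program runs,
// so the generated code calls this instead of using the built-in operator.
Value rt_add(const Value& a, const Value& b) {
    if (a.tag == Value::Tag::Int && b.tag == Value::Tag::Int)
        return make_int(a.i + b.i);
    if (a.tag == Value::Tag::Str && b.tag == Value::Tag::Str)
        return make_str(a.s + b.s);
    throw std::runtime_error("type error: unsupported operands for +");
}
```

Generated code for a source expression like `a + b` would then be a call such as `rt_add(a, b)`, and every other operator and container needs the same treatment.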

As the article mentions, pay particular attention to debugging, even to
compiler errors. Put out line numbers, even if just as comments. Write the
source statements as comments, because a syntax error in generated code can be
hell on wheels to track down. I know whereof I speak. I once wrote an application
generator that generated embedded SQL, which in turn was translated into C
with mountains of library calls (before the days of ODBC). A syntax error in
the ultimate C was often nasty.
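
One way to act on that advice, as a rough sketch: have the generator emit the original statement as a comment plus a standard `#line` directive before each translated statement, so compiler diagnostics point back at the source file. The helper name `emit_statement` and its parameters are hypothetical. The sketch assumes the source statement fits on one line and does not end in a backslash, and that the file name needs no escaping.

```cpp
#include <sstream>
#include <string>

// Hypothetical generator helper: emit one translated statement together
// with enough information to trace it back to the original source.
std::string emit_statement(const std::string& source_file, int source_line,
                           const std::string& source_text,
                           const std::string& generated_code) {
    std::ostringstream out;
    // The original statement, kept as a comment for whoever reads the output.
    out << "// " << source_text << "\n";
    // #line N "file" makes the compiler treat the *next* line as line N of
    // that file, so the directive has to sit directly above the code.
    out << "#line " << source_line << " \"" << source_file << "\"\n";
    out << generated_code << "\n";
    return out.str();
}
```

For example, `emit_statement("prog.src", 12, "x = y + 1", "x = y + 1;")` yields a comment line, then `#line 12 "prog.src"`, then the statement. Compilers that honor `#line` also record those line numbers in debug information, which can help when stepping through the generated code in a debugger.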

If you want a debugger, be prepared to roll your own, unless your target
language has serious support for generated languages.

~~~
stephth
I have no experience building languages, but as someone who has to build
relatively performant code and longs for a clean, DRY language, I completely
agree with you about picking a target language that closely fits the
semantics. Rolling your own libraries and runtime seems like a gargantuan
endeavor too likely to fail, and portability (not to mention performance) will
likely suffer.

I think the CoffeeScript approach is the way to go: syntactic sugar that
generates human-readable code. I'd be happy to have an elegant language that
would compile to C++, simply adding some DRYness and elegance (for example not
having to write headers), but without trading it for performance/portability
(for example without abstracting pointers or new/delete). I would happily use
a subset of C++ if required.

This in C++:

    
    
        // in file MyClass.h //////////////
    
          class MyClass: public ParentClass
          {
      
          public:
    
            int myInt, myInt2;
            float myFloat;
            Arr *myArr;
    
            MyClass();
            void doStuff();
    
          };
    
        //in file MyClass.cpp  //////////////
    
          #include "MyClass.h"
          MyClass::MyClass()
          {
            this->doStuff();
            int poop = 321654987;
            myInt = 112233 * poop;
            {
              // new scope here
            }
          }
    
          void MyClass::doStuff()
          {
            int myNum = 123;
            myInt *= myNum;
          }
    
    
    

Could relatively easily be generated from this syntactic sugar:

    
    
        class MyClass < ParentClass
      
          int myInt, myInt2
          float myFloat
          Arr *myArr
      
          void initialize(args)
            self.doStuff()
            int poop = 321654987
            myInt = 112233 * poop
              // new scope here
          
          void doStuff()
            myNum := 123
            myInt *= myNum
    
    

And wouldn't generating such readable code bypass the need for a debugger?
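
The header half of that translation could be sketched roughly as below. This is only an illustration under assumptions: `ClassDecl` and `emit_header` are hypothetical names, the sugar is assumed to be already parsed into this struct, and the constructor declaration is emitted unconditionally.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parsed form of one class written in the sugared syntax.
struct ClassDecl {
    std::string name;
    std::string parent;                // empty if there is no base class
    std::vector<std::string> fields;   // e.g. "int myInt, myInt2"
    std::vector<std::string> methods;  // e.g. "void doStuff()"
};

// Emit the C++ class declaration that would otherwise be written by hand
// in the header file.
std::string emit_header(const ClassDecl& c) {
    std::ostringstream out;
    out << "class " << c.name;
    if (!c.parent.empty()) out << ": public " << c.parent;
    out << "\n{\npublic:\n";
    for (const std::string& f : c.fields) out << "  " << f << ";\n";
    out << "  " << c.name << "();\n";  // constructor declaration
    for (const std::string& m : c.methods) out << "  " << m << ";\n";
    out << "};\n";
    return out.str();
}
```

Because the output is ordinary, readable C++, it can be inspected or compiled like a hand-written header.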

~~~
sea6ear
I think Cython/Pyrex might be cool for this. Python(ish) syntax that generates
C code when run through the compiler. I wonder how hard it is for that
generated code to be portable, or if it is already optimized for the
architecture that the Cython/Pyrex compiler is run on.

~~~
stephth
If I understand correctly, Cython generates C code that still depends on a
Python runtime.

~~~
sea6ear
Ah, it looks like you are correct. Perhaps shedskin would work better for this
idea, except for the fact that it produces C++ rather than C.

Added: Also, I missed that you were talking about creating human readable C++.
I'm not at all certain that shedskin produces C++ that is readable.

~~~
stephth
I could live without human-readable code; that was more about bypassing the
need for an intermediate debugger. Readable code is also good proof that the
language doesn't try to do too much.

Shedskin looks like a very cool project, and it's quite fast [1]; thanks for
the tip. Alas, it depends on a garbage collector (not to mention a substantial
library), which reduces flexibility and portability: you can't, for example,
build a portable C++ library with it.

[1] although not as fast as C, or even V8 JS:
<http://attractivechaos.github.com/plb>

