
Let's make a Teeny Tiny compiler - pavehawk2007
http://austinhenley.com/blog/teenytinycompiler1.html
======
dang
[https://news.ycombinator.com/item?id=23441767](https://news.ycombinator.com/item?id=23441767)

~~~
azhenley
That was before it was completed :(

~~~
dang
That's one problem with these multipart blog posts. It becomes ambiguous
whether the submission is "part 1" or the head of the list ("part 1" "part 2"
"part 3").

If you want to make a single long version of the page that includes all three
parts, I'd be happy to arrange a repost. It would be best to email
hn@ycombinator.com about it.

------
kerkeslager
I'm sorry to be negative here, but this article, like most articles about
compilers, is bike shedding.

For those not aware of the bike shedding metaphor, it's the assertion that
when discussing the design of a nuclear power plant, everyone will want to
discuss the color of the shed where the workers store their bikes because they
understand it. Meanwhile, nobody will want to discuss the nuclear reactor
itself, because that's complicated and they don't really understand what's
going on with it.

In compilers, parsing is the bike shed, and code emitters are the nuclear
reactor. I've read probably hundreds of articles on "compilers" at this point,
and they're all actually just about parsers. I can't point to a single one
that actually emitted working assembly.

~~~
azhenley
Part 3 is the emitter that produces C:
[https://web.eecs.utk.edu/~azh/blog/teenytinycompiler3.html](https://web.eecs.utk.edu/~azh/blog/teenytinycompiler3.html)

~~~
kerkeslager
Ah, okay. It looks like you're taking a fairly C-like language and
transforming it to C--I'd call this a transpiler rather than a compiler.
You're not wrong to call it a compiler, but it's a pretty noncentral example
of a compiler.

Congrats on at least having an emitter, but I'm still searching for an article
that shows how to emit assembly of any kind.

~~~
jermaustin1
Instead of outputting C, could you not just output the equivalent assembly?

So instead of

    self.emitter.emitLine("printf(\"" + self.curToken.text + "\\n\");")

you do something like

    self.emitter.emitLine("STRING DB '" + self.curToken.text + "', '$'")
    ...
    self.emitter.emitLine("LEA DX,STRING")
    self.emitter.emitLine("MOV AH,09H")
    self.emitter.emitLine("INT 21H")

~~~
kerkeslager
Well, what you just posted already highlights one difficulty: when you come to
the PRINT statement, you have to emit the string and the instructions to two
different places, so we're already talking about having two different
emitters, or some other way of handling this. And we need to generate
different non-conflicting names/addresses depending on architecture.
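
One way to handle the two-destination problem, as a minimal sketch rather than anything from the article: keep one buffer for data declarations and one for instructions, and hand out labels from a counter so names never collide. The class and method names (`SectionedEmitter`, `new_label`, `emit_print`, `render`) are hypothetical, the DOS-style instructions are copied from the snippet above, and segment directives and an entry point are left out.

```python
class SectionedEmitter:
    """Hypothetical emitter that keeps data and code in separate buffers."""

    def __init__(self):
        self.data = []    # string constants and other declarations
        self.code = []    # instructions, in program order
        self.next_id = 0  # counter used to build unique labels

    def new_label(self, prefix="STR"):
        # Each call returns a fresh name: STR0, STR1, ...
        label = f"{prefix}{self.next_id}"
        self.next_id += 1
        return label

    def emit_print(self, text):
        # Assumes text contains no single quotes; escaping is not handled.
        label = self.new_label()
        self.data.append(f"{label} DB '{text}', '$'")
        self.code.append(f"LEA DX, {label}")
        self.code.append("MOV AH, 09H")
        self.code.append("INT 21H")

    def render(self):
        # Data first, then code; a real assembler would also need
        # segment directives around each part.
        return "\n".join(self.data + self.code)


emitter = SectionedEmitter()
emitter.emit_print("hello")
emitter.emit_print("world")
print(emitter.render())
```

The parser still makes a single call per PRINT statement; the emitter decides where each piece goes.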

You said in your other post that this can be done with minor modifications,
but I can already foresee a few modifications that would need to be made which
aren't minor.

And then there's the problem that you may want to target more than one
architecture. We can write two completely different code generators, but it
would be nice if there were an architecture that could share some of the code.
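
A rough sketch of what sharing could look like, under the assumption that the front end only ever asks a backend "how do I print this string?": label allocation and statement order stay in one shared function, and each target supplies a small object. `CBackend`, `Dos16Backend`, `print_string`, and `generate` are made-up names for illustration, not part of the article's code.

```python
class CBackend:
    """Hypothetical backend that emits C, as the article's emitter does."""

    def print_string(self, label, text):
        # C needs no separate declaration, so the data list is empty.
        return [], [f'printf("{text}\\n");']


class Dos16Backend:
    """Hypothetical backend that emits DOS-style 8086 assembly."""

    def print_string(self, label, text):
        data = [f"{label} DB '{text}', '$'"]
        code = [f"LEA DX, {label}", "MOV AH, 09H", "INT 21H"]
        return data, code


def generate(strings, backend):
    """Shared part: allocate labels and collect output in program order."""
    data, code = [], []
    for i, text in enumerate(strings):
        d, c = backend.print_string(f"STR{i}", text)
        data.extend(d)
        code.extend(c)
    return data + code


print("\n".join(generate(["hi"], CBackend())))
print("\n".join(generate(["hi"], Dos16Backend())))
```

This only covers PRINT of a string literal; expressions, variables, and control flow would each need their own backend method, which is where the non-minor work lies.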

~~~
jermaustin1
I honestly can't tell if you are just trolling now, and I'm falling for it,
but you seem to think this 2000-word set of tutorials on the basics of compiler
design (lexing, parsing, emitting) is supposed to be the one and only document
you will ever need to create the next C++.

> You said in your other post that this can be done with minor modifications

And it probably can, depending on the flavor of assembly you want to use,
there are dozens (hundreds?) of them, and I'm sure some will allow you to
inline the string declaration. The example I gave probably doesn't even work,
since I haven't programmed in 8086 in close to 20 years and I don't even
remember how to set up data blocks and code blocks in it anymore.

> And then there's the problem that you may want to target more than one
> architecture.

This is a toy compiler written by a professor of computer science meant to
teach you the basics of building a compiler (lexing, parsing, emitting). This
isn't a tutorial on building the next GCC.

~~~
kerkeslager
> I honestly can't tell if you are just trolling now, and I'm falling for it,
> but you seem to think this 2000-word set of tutorials on the basics of
> compiler design (lexing, parsing, emitting) is supposed to be the one and
> only document you will ever need to create the next C++.

That's a fair criticism.

I'm frustrated with the lack of material on emitting assembly, but it wasn't
right of me to take that out on the author of this post. I apologized in a
different post.

> And it probably can, depending on the flavor of assembly you want to use,
> there are dozens (hundreds?) of them

How about one I can run on my machine? I can think of maybe 5 useful
targets:

    * x86 or ARM (depending on your machine)
    * LLVM
    * GCC RTL
    * WebAssembly
    * Parrot? Maybe the JVM has some low-level bytecode?

