Show HN: A compiler for a small language into x86-64 assembly (github.com/mauricegit)
74 points by EllipticCurve on April 30, 2020 | 22 comments



This is my first compiler, so I expect there to be several things I can improve :)

I tried to follow the Assembly calling conventions the best I could.

I am looking forward to any feedback!


Sorry I won't be able to give you a lot of feedback.

As a matter of preference I really like colons and semicolons.

That said, your work is amazing. This is a true example of simplicity. I don't think most people realize how difficult it is to keep things simple.

Everything is clear. I can read the source code without asking myself "what is that?"; everything makes sense.

Thanks for presenting your work.

Do you have any constraints, like "no metaprogramming", "generated libraries should be as compatible with C as possible", "it should have one-pass optimization" or even "the compiler must be embeddable in as many places as possible"?


Nothing wrong with semicolons. Everyone has a different preference anyway.

Thanks for saying all that. It was a huge amount of work and getting appreciation makes it all worth it!

No hard constraints as of now. But I don't think I want to include metaprogramming or a preprocessor (I don't really like them, to be honest). I do want to keep it compatible with C internally, at the assembly level. One idea is to create a file with function headers/definitions that are then dynamically linked and can just be used.

I used some C standard library functions that way for debugging (printf, ...). And since I follow the standard calling conventions, the compiler should automatically generate compatible code.
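For readers wondering what "following the standard calling conventions" involves on Linux x86-64: the System V AMD64 ABI passes the first six integer or pointer arguments in rdi, rsi, rdx, rcx, r8 and r9, the first eight floating-point arguments in xmm0 to xmm7, and everything else on the stack. Below is a minimal Python sketch of that assignment. The helper name `assign_args` and the "int"/"float" kind labels are made up for illustration and are not taken from the compiler in this thread; structs and other aggregates are deliberately left out.

```python
# System V AMD64 ABI argument registers (used on Linux x86-64).
INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
FLOAT_REGS = [f"xmm{i}" for i in range(8)]


def assign_args(kinds):
    """Return where each argument goes, given a list of "int"/"float" kinds.

    Hypothetical helper for illustration only. Structs and other aggregate
    types follow more involved classification rules and are not handled.
    """
    locations = []
    next_int = next_float = next_stack = 0
    for kind in kinds:
        if kind not in ("int", "float"):
            raise ValueError(f"unsupported argument kind: {kind!r}")
        if kind == "int" and next_int < len(INT_REGS):
            locations.append(INT_REGS[next_int])
            next_int += 1
        elif kind == "float" and next_float < len(FLOAT_REGS):
            locations.append(FLOAT_REGS[next_float])
            next_float += 1
        else:
            # Registers of this class are used up: the argument goes on the stack.
            locations.append(f"stack slot {next_stack}")
            next_stack += 1
    return locations


# printf("%d %f\n", 3, 1.5): format string pointer, int, double
print(assign_args(["int", "int", "float"]))  # ['rdi', 'rsi', 'xmm0']
```

For variadic functions such as printf, the caller additionally sets al to an upper bound on the number of vector registers used, which is easy to forget when calling into libc from generated code.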

With this, it would also be possible to write OpenGL code. That would be really awesome :)

As for usage of my language: not sure yet. Up until now, the road was more the goal than the finished language.


Congratulations! It looks like a useful and practical design. How is the performance of the generated code? What are you thinking about doing for memory management? Have you thought about using an intermediate representation to make optimization and retargeting easier?


Thanks :) That's where I was trying to get to. I did some smaller performance comparisons against C (with -O0), where I was at about 90% of its speed. But there is a lot of performance to gain if I optimize the resulting assembly. There are lots of cases where I push and then pop directly afterwards because of the general expression code generation (no real knowledge of the broader context). So I expect removing those to help a lot. Also, things like jump tables for simple switches are on the table.
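The push-directly-followed-by-pop pattern described here is a classic target for a peephole optimizer. A minimal sketch in Python, assuming instructions are plain strings in NASM/YASM-style Intel syntax with simple register or immediate operands; the function name `peephole` is hypothetical and this is not the author's code:

```python
def peephole(instructions):
    """Collapse directly adjacent `push X` / `pop Y` pairs.

    push X ; pop X  -> removed entirely
    push X ; pop Y  -> mov Y, X

    Only two-token forms such as `push rax` or `push 5` are handled, so
    memory operands like `push qword [rbp-8]` are left untouched. A label
    between the two instructions breaks adjacency, so a pair that is a
    jump target in the middle is never merged.
    """
    out = []
    for ins in instructions:
        parts = ins.split()
        if len(parts) == 2 and parts[0] == "pop" and out:
            prev = out[-1].split()
            if len(prev) == 2 and prev[0] == "push":
                out.pop()  # drop the push
                src, dst = prev[1], parts[1]
                if src != dst:
                    out.append(f"mov {dst}, {src}")
                continue  # the pop itself is dropped as well
        out.append(ins)
    return out


code = ["push rax", "pop rbx", "add rbx, rcx", "push rdx", "pop rdx"]
print(peephole(code))  # ['mov rbx, rax', 'add rbx, rcx']
```

A single left-to-right pass like this only sees neighbouring instructions, which is exactly the "no knowledge of broader context" limitation; anything smarter needs some form of register or liveness tracking.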

Yes, I thought about going for LLVM or another representation but decided to do it once myself (without any ready-made performance optimizations or the like), with room for improvement.


Very cool! Reading the "Why" section resonated with me, as I think creating one's own language is something every programmer should do for the experience.

The syntax flows well, in fact it feels very intuitive for my taste. I like the type inference and no semicolons. I wonder if the latter posed any trickiness, for example, with the next line starting with an expression "(" or operator like "+".

I'm also curious about use cases, what is possible with this language. I guess anything assembly can do - which is..everything? :) Would it be suitable to run on microcomputers like Raspberry Pi?

EDIT: The Pi and Arduino are typically ARM, it seems, with a different instruction set. Well, that shows how little I know.


:) I (now) agree. In fact, each part isn't even really that hard. It's mostly just a lot of work. And then code generation. I had some headaches with multiple return values and keeping the calling conventions intact... And then with structs as well.

That language flow and general simplicity was one of my most important goals. Thanks for noticing :)

No, I had no problems regarding that. What you mention ('+', '(') are all part of simple expressions when parsing. And I strictly parse right-recursively and reorder the expressions later (for operator precedence). So that was not an issue. Most of these problems I solved by giving my parser a lookahead of more than 1. In a few cases, there is a lookahead of 3 to determine what exactly should be parsed.

I guess anything that can run an x86-64 ELF executable? ;) Although there is still a lot missing for it to be taken seriously. Starting with strings, files, input, ... But that's for another time or whenever I need it, I guess.
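The thread does not say how the reordering step is implemented, so the following Python sketch shows just one possible way to do "parse right-recursively first, fix precedence afterwards". It assumes only binary, left-associative operators and no parentheses; the names `parse_right_recursive`, `attach` and `reorder` are invented for this example.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse_right_recursive(tokens):
    """expr := operand [op expr], deliberately ignoring precedence.

    Leaves are strings, inner nodes are (op, left, right) tuples.
    """
    left = tokens[0]
    if len(tokens) == 1:
        return left
    return (tokens[1], left, parse_right_recursive(tokens[2:]))


def attach(left, op, tree):
    """Build the tree for `left op <tree>`, where `tree` is already ordered.

    If `op` binds at least as tightly as the root of `tree`, it may only
    take the leftmost part of `tree`, so it is pushed down the left spine.
    """
    if isinstance(tree, tuple) and PRECEDENCE[op] >= PRECEDENCE[tree[0]]:
        root_op, tree_left, tree_right = tree
        return (root_op, attach(left, op, tree_left), tree_right)
    return (op, left, tree)


def reorder(tree):
    """Fix precedence and associativity of a right-recursive parse tree."""
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    return attach(left, op, reorder(right))


raw = parse_right_recursive(["a", "*", "b", "+", "c"])
print(raw)           # ('*', 'a', ('+', 'b', 'c'))
print(reorder(raw))  # ('+', ('*', 'a', 'b'), 'c')
```

The appeal of this split is that the parser itself stays trivial, and all knowledge about operator precedence lives in one small table and one function.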


Thank you for this. I like that it's small enough to be motivational instead of being complex and intimidating.

The world needs more of these for other complex topics.


That is really motivating, thank you for saying that!


Very cool. What resources did you use to learn how to write the compiler? The "turn code into ASM" step has always mystified me, and I'd like to learn more about how that part of the process works.


Those are some of my currently open tabs :) Lots of Google use on top of that. The parser is actually quite straightforward. The much harder part (for me) was the code generation afterwards (I had no experience with assembly before this).

- General compiler design - This was one of the main resources. https://www.tutorialspoint.com/compiler_design/compiler_desi...

- https://www.lua.org/manual/5.3/manual.html#9

- https://www.godbolt.org/

- Linux system call table: http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x...

- A bit on floating point: https://cs.fit.edu/~mmahoney/cse3101/float.html

- Assembly https://www.cs.yale.edu/flint/cs421/papers/x86-asm/asm.html

- More assembly: https://www.complang.tuwien.ac.at/ubvl/amd64/amd64h.html
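On the "turn code into ASM" question itself: one common beginner-friendly approach is to treat the CPU like a stack machine. The code generator walks the expression tree and emits instructions that always leave the result of a subexpression on top of the stack. The Python sketch below illustrates the idea for integer arithmetic only. It is not the author's code; the name `gen_expr` and the tuple-based tree format are assumptions made for this example, and the output uses NASM/YASM-style Intel syntax.

```python
# Instruction that combines the left operand (rax) with the right one (rbx).
OPS = {"+": "add rax, rbx", "-": "sub rax, rbx", "*": "imul rax, rbx"}


def gen_expr(node, out):
    """Append instructions to `out` that leave the value of `node` on the stack.

    `node` is either an int literal (assumed to fit in 32 bits, since
    `push` takes a sign-extended 32-bit immediate) or a tuple
    (op, left, right).
    """
    if isinstance(node, int):
        out.append(f"push {node}")
        return
    op, left, right = node
    gen_expr(left, out)      # left result ends up below ...
    gen_expr(right, out)     # ... the right result on the stack
    out.append("pop rbx")    # right operand
    out.append("pop rax")    # left operand
    out.append(OPS[op])      # rax = rax <op> rbx
    out.append("push rax")   # result back on the stack


asm = []
gen_expr(("+", 1, ("*", 2, 3)), asm)  # 1 + 2 * 3
print("\n".join(asm))
```

The output contains a `push rax` immediately followed by a `pop rbx`, which is the kind of redundant push/pop pair that generic expression code generation produces and that a later optimization pass can clean up.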


Very cool. I'm writing a compiler and language myself and I was on the fence about anonymous structs / tuples vs multiple returns, and seeing multiple returns in your examples nudged me that way.

I'm also influenced by Lua and I picked it up in your grammar right away : )


It's a bit of a pain to implement in some cases, but worth it :)

Let me know if you have questions or need hints on where to find some relevant details in my code.


You can try FASM (http://flatassembler.net) besides YASM and ld.


Well, this is funny. Originally (a number of years ago) I looked at using MASM, then I was told to switch to NASM.

Then (last year or something) I found out YASM is actually intended to be an improvement on NASM, which I've used for about 6 years. And now I find out FASM is an improvement on YASM? How deep does this go??

To clarify: I had heard of all of these different assemblers and at some point looked at their websites but it was a very, very long time ago :)


Curious if you looked into ANTLR? That's what was used most recently as part of my compilers course at Georgia Tech.


Yes, I did (at least a bit). That, and I read that Go has a built-in yacc as well.

But I decided that for my first compiler, I wanted to do all the steps manually myself, for the best possible understanding of the general topic.


What would be your advice to someone who's about to write a compiler? I'm planning to start by reading the Dragon book.


I never got on with the Dragon Book despite several attempts.


You both should try "Compiler Construction: Principles and Practice" [0]. Theory is interleaved with substantial examples and exercises. You create an entire compiler for a tiny language called TINY (hah.) You write it in C and generate code for a portable virtual machine -- the book also goes into detail on the VM, with source code.

Lastly, the book's appendix has guidance for writing a compiler for a subset of the C language.

[0] https://www.cs.sjsu.edu/~louden/cmptext/


I didn't really read any books on that topic, but I did lots of general research about compiler stages. I also posted a few links a few comments up that helped a lot.

Go for it! Start small and build up. Seeing a program in your own language output something makes it all worth it :)


http://t3x.org/t3x/index.html#t3x9

If you plan to buy the book, you will have to go to Lulu.com and search for it, as order links seem to be down after a site redesign.



