Ask HN: Where to start when creating a programming language?
11 points by cannibalXxx 3 months ago | 15 comments
I've been a programmer for more than 5 years, and in that time I've built a few small things. Now I want to understand in depth what a programming language is and how it works, but I don't even know where to start. If anyone can help me, I'd really appreciate it.



Structure and Interpretation of Computer Programs is a classic book to start with, and it's where many people started.

https://mitp-content-server.mit.edu/books/content/sectbyfn/b...

There are video lectures for the course as well:

https://ocw.mit.edu/courses/6-001-structure-and-interpretati...

MIT keeps a lot of course materials available online. I like to go through the syllabi for ideas on which texts to read, for example:

https://people.csail.mit.edu/feser/pld-s23/

https://ocw.mit.edu/courses/6-035-computer-language-engineer...

Let's Build a Compiler

https://compilers.iecc.com/crenshaw/

Introduction to Compilers and Language Design

https://www3.nd.edu/~dthain/compilerbook/

The Dragon Book is where I started, but it's getting long in the tooth. Still a classic.

https://www.amazon.com/Compilers-Principles-Techniques-Tools...


amazing! i had a quick look and i liked it. Thanks for the content


I'd also recommend the free online book Crafting Interpreters by the lead designer of the Dart language at Google.


First you should decide what kind of language you want to create. You'll need to decide on syntax, compiled or interpreted, target platforms, etc.

Most resources are going to be focused almost entirely on parsing, which is almost all you need if you're just going to create a toy language that doesn't aspire to be performant and you have no interest in optimization, code generation, linking, etc. The difficulty of parsing depends almost entirely on the syntax of your language though.

A pre- or post-fix language is far simpler to parse than an infix language, to the point of being trivial and making 90%+ of the parsing material in most books unnecessary, so if you want to focus on low-level stuff and don't care about the syntax as much, choose one of those. If you are interested in lower-level stuff but want a more conventional syntax, use a parser generator (lex/yacc, ANTLR, tree-sitter, etc).
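To give a feel for how little parsing a prefix syntax needs, here is a minimal sketch of a reader and evaluator for parenthesized prefix arithmetic. The names `tokenize`, `parse`, `evaluate` and the `OPS` table are made up for this illustration and do not come from any of the books mentioned.

```python
import operator

# Hypothetical operator table for this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def tokenize(source):
    """Pad parentheses with spaces, then split on whitespace."""
    return source.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Consume tokens from the front; return a float, a symbol, or a nested list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens and tokens[0] != ")":
            node.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing ")"
        return node
    if token == ")":
        raise SyntaxError("unexpected ')'")
    try:
        return float(token)
    except ValueError:
        return token  # anything that is not a number is treated as a symbol

def evaluate(node):
    """Evaluate a parsed tree: numbers are values, lists are (op arg arg ...)."""
    if isinstance(node, float):
        return node
    if isinstance(node, str):
        raise NameError(f"unbound symbol: {node}")
    op, *args = node
    func = OPS[op]
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = func(result, value)  # fold left over the arguments
    return result

print(evaluate(parse(tokenize("(+ 1 (* 2 3))"))))
```

There is no precedence or associativity to resolve here because the parentheses already spell out the tree, which is the whole point of the comparison with infix syntax.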

Compilers and interpreters aren't much different in terms of difficulty or complexity, especially if you're concerned about optimizing performance, but an interpreter is much quicker to reach a minimum level of functionality, which can be very good for keeping your interest up in the early portion of a project. The difference in complexity between a "transpiler"-style simple compiler and a simple interpreter is minimal though.
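As a rough illustration of how close a simple interpreter and a "transpiler"-style compiler can be, the sketch below walks the same tiny tree twice: once to compute a value, once to emit source text. The tuple-based AST and the choice of Python expressions as the output language are assumptions made for this example only.

```python
# Hypothetical AST shape for this sketch:
#   ("num", value) | ("add", left, right) | ("mul", left, right)

def interpret(node):
    """Walk the tree and compute the result directly."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind not in ("add", "mul"):
        raise ValueError(f"unknown node kind: {kind}")
    left, right = interpret(node[1]), interpret(node[2])
    return left + right if kind == "add" else left * right

def transpile(node):
    """Walk the same tree, but emit target-language source instead of a value."""
    kind = node[0]
    if kind == "num":
        return repr(node[1])
    if kind not in ("add", "mul"):
        raise ValueError(f"unknown node kind: {kind}")
    left, right = transpile(node[1]), transpile(node[2])
    symbol = "+" if kind == "add" else "*"
    return f"({left} {symbol} {right})"

ast = ("add", ("num", 1), ("mul", ("num", 2), ("num", 3)))
print(interpret(ast))
print(transpile(ast))
```

Both functions have the same recursive shape; only the thing produced at each node differs.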

For target, choose what you know and are interested in. If you want to target hardware and do code generation or optimization, choose a RISC or less complex ISA at first unless you're already familiar with x86 - the principles are the same but the details are much less finicky.


Do you have any idea of what kind of language (what are your influences, and where do you want to innovate?) you'd like to build? For which ecosystem? In those 10* years, how much have you worked with trees and graphs (both reducing them down to other datatypes, and expanding other datatypes out into them)?

Some things that are much smaller than a language (easy études — KISS):

- a desk calculator

- a template system

- a source-to-source transpiler for a single feature

- a simple cpu emulator

- a metacircular evaluator

- an esolang

- a build system

- a regex engine

EDIT: * according to c-t.d; please use this list to aid in writing a language, not a language-learning post.
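For the first item in the list, a desk calculator, a minimal sketch could use a hand-written recursive-descent parser that evaluates as it parses. The grammar, the `Calculator` class and the `calculate` helper are assumptions made for this illustration.

```python
import re

# A token is a number or one of the single-character operators/parentheses.
TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input near column {pos + 1}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Calculator:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):  # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):  # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):  # factor := NUMBER | '(' expr ')' | '-' factor
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token == "-":
            return -self.factor()
        if token is None or token in ("+", "*", "/", ")"):
            raise SyntaxError(f"unexpected token {token!r}")
        return float(token)

def calculate(text):
    calc = Calculator(text)
    value = calc.expr()
    if calc.peek() is not None:
        raise SyntaxError(f"unexpected token {calc.peek()!r}")
    return value

print(calculate("(1 + 2) * 3 - 4 / 2"))
```

Each grammar rule becomes one method, and operator precedence falls out of which rule calls which.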


ooh! do you know https://chat-to.dev? in fact i wanted to open a chat room there to talk about this topic which i think would help a lot of people to really understand what programming is. would you like to join?


I'm sure there will be lots of people recommending great books, so i want to give you some general advice:

sadly, a lot of language related libraries are in C/C++, so if using those sounds like a pain to you, just ignore them for now.

get something simple working soon, such as an interpreter for Forth [0] written in any language you already know.
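as a rough idea of the scale involved, here is a minimal sketch of a Forth-like interpreter. it covers only integers, a handful of stack words and colon definitions, so it is nowhere near a real Forth; the `Forth` class and its method names are made up for this illustration.

```python
class Forth:
    """A toy Forth-like interpreter: integers, a data stack, colon definitions."""

    def __init__(self):
        self.stack = []
        self.output = []      # values printed by "." are collected here
        self.user_words = {}  # name -> list of tokens from ": name ... ;"
        self.builtins = {
            "+": lambda: self.binary(lambda a, b: a + b),
            "-": lambda: self.binary(lambda a, b: a - b),
            "*": lambda: self.binary(lambda a, b: a * b),
            "dup": lambda: self.stack.append(self.stack[-1]),
            "drop": lambda: self.stack.pop(),
            "swap": self.swap,
            ".": lambda: self.output.append(self.stack.pop()),
        }

    def binary(self, fn):
        b = self.stack.pop()  # top of stack is the right-hand operand
        a = self.stack.pop()
        self.stack.append(fn(a, b))

    def swap(self):
        self.stack[-1], self.stack[-2] = self.stack[-2], self.stack[-1]

    def execute(self, token):
        if token in self.user_words:
            for inner in self.user_words[token]:
                self.execute(inner)
        elif token in self.builtins:
            self.builtins[token]()
        else:
            try:
                self.stack.append(int(token))
            except ValueError:
                raise NameError(f"unknown word: {token}") from None

    def run(self, source):
        tokens = source.lower().split()
        i = 0
        while i < len(tokens):
            if tokens[i] == ":":
                # ": name body... ;" stores the body without running it.
                end = tokens.index(";", i)
                self.user_words[tokens[i + 1]] = tokens[i + 2:end]
                i = end + 1
            else:
                self.execute(tokens[i])
                i += 1
        return self.output

forth = Forth()
print(forth.run(": square dup * ; 3 square 4 + ."))
```

the "parser" is just `split()`, which leaves all the attention for the stack and the dictionary of words.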

do the parsing later, there is so much more to learn about compilers and language runtimes!

when it's finally time for parsing, i recommend parser combinators [1]. they are pretty easy to implement yourself once you have understood the concept, and they are very flexible.
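a minimal sketch of the idea, assuming a parser is just a function that takes `(text, pos)` and returns `(value, new_pos)` on success or `None` on failure. the combinator names (`char`, `literal`, `seq`, `alt`, `many`, `fmap`) are chosen for this illustration and are not taken from the linked article.

```python
def char(predicate):
    """Match one character for which predicate(ch) is true."""
    def parser(text, pos):
        if pos < len(text) and predicate(text[pos]):
            return text[pos], pos + 1
        return None
    return parser

def literal(expected):
    """Match an exact string."""
    def parser(text, pos):
        if text.startswith(expected, pos):
            return expected, pos + len(expected)
        return None
    return parser

def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parser(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parser

def alt(*parsers):
    """Return the result of the first parser that succeeds."""
    def parser(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parser

def many(p):
    """Apply p zero or more times and collect the values."""
    def parser(text, pos):
        values = []
        while True:
            result = p(text, pos)
            if result is None or result[1] == pos:  # stop on failure or no progress
                return values, pos
            value, pos = result
            values.append(value)
    return parser

def fmap(p, fn):
    """Transform the value produced by p."""
    def parser(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return fn(value), new_pos
    return parser

# Grammar for this sketch: total := number (('+' | '-') number)*
digit = char(lambda ch: ch in "0123456789")
number = fmap(seq(digit, many(digit)), lambda v: int(v[0] + "".join(v[1])))
signed = alt(
    fmap(seq(literal("+"), number), lambda v: v[1]),
    fmap(seq(literal("-"), number), lambda v: -v[1]),
)
total = fmap(seq(number, many(signed)), lambda v: v[0] + sum(v[1]))

print(total("12+30-2", 0))
```

the last few lines read almost like the grammar they implement, which is most of the appeal.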

do not forget about proper error messages with line/col information.
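one way to get there, sketched under the assumption that the parser tracks a 0-based character offset into the source: convert the offset to a line and column only when an error needs to be reported. `line_col` and `format_error` are hypothetical helper names for this illustration.

```python
def line_col(source, offset):
    """Convert a 0-based character offset into a 1-based (line, column) pair."""
    line = source.count("\n", 0, offset) + 1
    last_newline = source.rfind("\n", 0, offset)  # -1 if offset is on the first line
    column = offset - last_newline
    return line, column

def format_error(source, offset, message):
    """Build a message with the offending line and a caret under the column."""
    line, column = line_col(source, offset)
    text = source.split("\n")[line - 1]
    caret = " " * (column - 1) + "^"
    return f"{line}:{column}: error: {message}\n{text}\n{caret}"

source = "let x = 1\nlet y = @\n"
print(format_error(source, source.index("@"), "unexpected character '@'"))
```

the caret lines up only if the source line contains no tabs; a fuller version would need to account for that.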

any kind of performance optimisation is strictly forbidden until you know what you are doing.

[0] https://en.m.wikipedia.org/wiki/Forth_(programming_language)

[1] https://www.theorangeduck.com/page/you-could-have-invented-p...


1. Know your primary design goals. These should be extremely few, as in MVP.

2. Write a parser.

3. Write an interpreter. Do not bother with a compiler yet. Writing an interpreter is a challenge and your final proof of MVP.

4. Refactor and optimize your prior logic.

5. Write documentation. Write with extreme empathy because if the level of effort remains too high nobody will look at this. Identify current issues and shortcomings. Identify next steps.

6. Socialize the work product.

7. Now, write a compiler.

---

I have thought about writing a ridiculously scaled-down, minimal JavaScript-like language with TypeScript-like type annotations and named procedures.


+1 This works. The first step is, "Don't bother with a compiler at the beginning."


A good place to start is "Crafting Interpreters" by Bob Nystrom. Then, after you gain some experience, you could try tackling Nora Sandler's "Writing a C Compiler".


Why is this downvoted?


Clever name, icon or mascot, then the t-shirt.


the Python SLY library looks good. it's by David Beazley, who created PLY earlier.

parser combinators also look interesting, like another commenter said. i haven't understood them fully yet, but i have just started looking into them. it seems that describing the grammar in code using parser combinators can closely parallel the EBNF grammar for one's language.

i also saw this one recently:

https://github.com/google/compynator


Write some examples of usage. What's different about your language? How does it look in various situations? For example: coding an HTTP API endpoint, parsing text data, doing complex math...


idk maybe Type Theory



