Hacker News
Show HN: Going into freshman year, figured I should build an interpreter
138 points by liamilan on Aug 27, 2023 | 79 comments
Hi all!

I'm going into my freshman year, and figured that the best way to prepare for the intro to programming Racket course would be to implement my own garbage-collected, dynamically typed, functional programming language in C ;)

Anyways... here's the repo: https://github.com/liam-ilan/crumb

I started learning C over the summer, so I still have a whole lot to learn... Any feedback would be greatly appreciated! :D




Here's perhaps some unexpected and unsolicited advice... so pretty clearly you aren't going to have a hard time with the CS curriculum (assuming that's what you intend to study) so my $0.02 is - find a second or even third major to augment your skill set. If you are already able to get to this level on your own, you'll likely breeze right through a typical undergrad CS course of study. So consider other courses of study that might be a bit more of a challenge / might give you an even broader skill set four years from now when you enter the workforce or consider grad school.


I don't know the specific curriculum but I would caution that being a competent programmer does not directly translate into academic success. Yes, the author will have a massive head start but that does not necessarily mean they will find studying easy.

At least that was my experience as a self-taught programmer. The first weeks were super boring for me but also lulled me into being complacent until I suddenly found myself in deep trouble. Just because you understand the practical side does not mean you will automatically grok the academic side of things.


Fully agree on that. In my first year I started arguing with the professor during a lecture that Java strings surely are not immutable, because you can still access the memory behind them, and set about proving it to him, while completely missing the point of the concept of an immutable object and the fact that he was fully aware of all that.


i did this and regret it (38 now and have a technical CS startup), i studied electrical/computer engineering but what i should have done was go to a better CS school or get approved for accelerated course work. the main problem for 18 yo me was lack of visibility into what a good CS program looked like, as opposed to just knowing more than the CS teachers at an eng school not ranked for CS. also, take as much math as you can.

you should be studying: lisp ocaml haskell, interpreters (SICP), compilers, type systems, transaction processing, effect systems, FRP, concurrency NOT java guis python SQL databases webdev gamedev .. whatever


Can you expand upon why you chose those particular things to focus on? I’m genuinely interested and don’t have a traditional CS background so I’d like to know what I’m missing.

(Serious question. I’m not being snarky)


As a self-taught seeker who spent 20 years in search of a better way, this is where I ended up for the heart of computer science. Most software engineering topics I picked up on the job — even distributed systems — but the actual computer science aspects I had almost zero exposure to at work. I had to seek those out, which meant rejecting the commercial methods/doctrines/thinking and escaping from Conway's Law which has infected anything touched by money. "It Is Difficult to Get a Man to Understand Something When His Salary Depends Upon His Not Understanding It"

I now see actual CS exposure in industry as broadly rare (you'd have to work in a research org, which is both rare and also requires a PhD or other credential for one to be selected for the opportunity). Furthermore, the bulk of the CS literature & papers I encountered is embedded in those three programming languages. Now editorializing: I think Haskell is like "the periodic table of computation" as well as basically "math notation for computational structures." These deep science-y topics are hard to learn outside of school, the material is dense and there's no clear and accessible trajectory to get there, and to even identify such a trajectory you need role models and teachers of a kind industrial programmers aren't exposed to.

In conclusion, I'd likely have gotten to where I am today at age 30 instead of 38 and regret the lost time wandering in the swamp of silicon valley arrogance. FWIW my startup is a CRUD Spreadsheet, we apply functional programming research to user interfaces and web development as per https://github.com/hyperfiddle/electric


> lisp ocaml haskell

Covers all of the major programming languages in a deep way.

> interpreters (SICP), compilers, type systems, transaction processing, effect systems, FRP, concurrency

These subjects are the core of almost any system that you'll encounter, in practice - academic or industrial.


if only the industry recognized this – the-world-if.jpg. We get Conway's Law instead, which I'll cynically restate as "the org is a reflection of its leadership" or "the org exists to lever up its leadership, recursively" and that unfortunately has little to do with science


Thoughts from a Stanford CS graduate:

https://www.youtube.com/watch?v=4SiFgB1lGxw


Laughing, funny video. I don't agree with the thesis though (is he being sarcastic? I only skipped around), what the guy seems to want is a 2 year apprenticeship in software engineering aka bootcamp and internships. So in the end comes off to me as a childish and ignorant perspective


Can't agree more with this. I ended up double-majoring Philosophy and it was fantastic and has surprisingly helped several times in my career


I still honestly believe coding should be a requirement in all philosophy courses. A system for laying out logical structures efficiently is something philosophy to this day sorely lacks, with formal logic being practically unreadable. And computer science has literally spent 50 years working on how to do that well.


I actually ended up getting to teach a 20 minute segment on neural networks to my fellow philosophy students during a senior seminar on phenomenology, was great fun


Reading this comment and its replies, I think it's evident that there's no "rule" as far as what you should or shouldn't do goes.

For example, I had a second major in theatre, but I had no delusion of making a living in theatre that could compare with the kind of career that a CS degree typically delivers. It was and is a personal passion of mine, and to be able to deepen my understanding of it in an academic setting was valuable and rewarding to me personally.

Studying theatre paid dividends in other ways as well. Managing a production is a beast, and you learn a lot about how to manage people and get things done. Dramaturgy, often called literary analysis, was quoted by my professor (a scholar on the topic) as "what are the parts, and how do they go together?" -- a shorthand that I still use to this day when thinking about a new system.

If it wasn't clear already, I had a very positive experience in my studies at a Liberal Arts school, where the connections between fields, and the value they can offer each other, are a big focus.

Just taking on a second major won't necessarily pay off, but keeping in mind the holistic point of your education may create an outcome where the sum is substantially larger than its parts.


I'd suggest the opposite. I originally double majored in the arts before switching to tech. For me, the classes in all cases were less useful than self-study with projects, more expensive, and oftentimes outdated. The degree opened the door to more jobs, but the classes themselves weren't worth the expense in my opinion. Your mileage may vary by university, of course.


I agree with this, but have a note about what the second major should be.

I went into mechanical engineering because I had a great natural understanding of basic physics, materials, design, mechanics, blah blah. Even UC Berkeley engineering basically taught me things I knew or would have easily understood if I had to look up that specific thing. I learned very little from my mechanical engineering degree.

I ALMOST think people shouldn't go into a major based on their strongest aptitude, but I'm not sure yet.

But definitely 100% add another focus since you'll probably breeze through the main course work.

But I don't think it should be philosophy or art or calligraphy like Jobs would recommend. I wouldn't pick a second major that would be a foolish main major.

Pick something that is a hard skill that is useful to learn and understand.

Maybe electrical engineering. Maybe industrial design. Hell, I think tons of people would be well served with what they learned with a second major in accounting.


The part of the advice that suggests acquiring more skills is good.

The other part, that suggests doing more courses of study is somewhat questionable.

Uni courses are designed partially to help you learn skills. But also for other requirements.

If OP wants to learn more skills, they should go for that. Whether in uni or outside. If OP wants to relax and have fun, that would also be a good use of time in your twenties.


Whether done as a minor or second major, or not, I'd like to "second" what dustingetz said: take as much math as you can. You can, IMO, basically never take too many math courses.


Take the grad courses.


My main advice: never trust C, C++, or their compilers. Undefined behavior bites. Even if you just care about the latest 64-bit Ubuntu Linux LTS, compiler updates or flag changes bite as well.

That's the main difference between C and C++ and almost any other language. Not the syntax, not the perceived "low-levelness", not manual memory management. In any other language, if you make a mistake, the program behaves badly, but in a predictable way. In C and C++ it may behave reasonably, it may crash in another file altogether a few seconds after the erroneous line is executed, it may produce the correct answer. Or it may crash only sometimes. Or it may silently produce an incorrect answer. Different behaviors on different systems, and even on the same system with different compiler flags (e.g. optimization level), of course.

That makes reliable experiments with C and C++ impossible. Even if you have just five lines of code, you almost never know if they're valid C or C++ for sure (separately, whether they do what you want with modern/legacy compilers). It's still fun and everything (congrats on making your own language, that should've been fun!), but you never truly know whether what you wrote is not going to break in a few years (or months) with zero code modifications just because there was another UB lurking around.

See https://evan.nemerson.com/2021/05/04/portability-is-reliabil... if you're curious about more practical implications.
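A small illustration of the point above, as a sketch only. The function names are made up for this example. The first function triggers signed integer overflow when called with INT_MAX, which is undefined behavior, so a compiler is allowed to assume it never happens and may reduce the function to "always true". The second expresses the same intent without ever overflowing.

```c
#include <limits.h>
#include <stdbool.h>

/* Looks reasonable, but x + 1 overflows when x == INT_MAX. Signed
   overflow is undefined behavior, so the result for that input may
   differ between compilers and optimization levels. */
bool next_is_larger_ub(int x) {
    return x + 1 > x;
}

/* Well-defined for every int: compare against INT_MAX instead of
   performing the addition that could overflow. */
bool next_is_larger_checked(int x) {
    return x < INT_MAX;
}
```

Calling next_is_larger_checked(INT_MAX) reliably returns false. What next_is_larger_ub(INT_MAX) returns is not something the language guarantees.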


I suppose this is one of those things where people’s experiences vary widely, but FWIW after 20 years of writing C/C++, I can count the number of times this has bitten me on one finger. And even then, I still think it was more likely just an optimizer bug in the god-awful compiler.


Yes, I think it heavily depends on what you're doing and your mindset.

E.g. if you work with a single compiler on a single platform all the time and expect the compiler to "just compile the code", then you're likely to learn all its quirks, avoid them automatically, use proper abstractions/contracts/invariants to guard against incorrect code, etc. Moreover, you likely know how exactly the most popular UB shows itself with your compiler.

If you juggle compilers all the time and work with not-so-high-quality code (legacy or not), especially code with few tests, I expect UB to pop up a lot. The same goes if you just test on one platform and do final runs on another, which happens all the time in competitive programming or home assignment grading, especially with beginners.


Man, your blog (https://www.liamilan.com/) is wonderful. I wish you had an RSS feed, hint, hint.


I agree.

@OP: I recommend submitting your posts here when you have new ones, and posting the older ones slowly (like one per week) so people don't get annoyed.


Congratulations, this is great. Have you thought about spending some time writing down what you had to learn and how you went about figuring out what to learn in order to do this? I think it'd be even more interesting than the language itself.


Yeah actually, stay tuned! :D


For furthering your skills, I can suggest a related topic (that is never taught officially though): debugging/reverse-engineering an interpreter. It does not have to be JVM-sized; most games have lots of interpreters in them too.. (although there might be multiple layers of interpreters - which is an order of magnitude harder)

You may start with reversing compiled code, but an interpreter is much, much more.. interesting. Especially if it is for an unknown (simple) language.


Love it! I learned so much from building projects like this! None of mine had cool logos like yours - this looks really pro.

Have you tried teaching Crumb to GPT4? I bet it could surprise you with what it can do.


clickable link to repo: https://github.com/liam-ilan/crumb


Well done, I had a degree and 5 years professional experience under my belt before doing that.

Make sure you foster that curiosity continuously until the day you die, because it'll serve you really well


Learning C controlling some shitty robot in my first year was soo much fun. I was waking up and going to sleep thinking about it. So I hope your first year is also fun.

Obviously you are much more advanced, so my advice would be to keep tinkering with everything u can get your hands on.., and keep spreading to adjacent disciplines (electronics, cad, startupy stuff, and even non tech stuff) until u take over the world haha.


Ambitious! What were some of the educational resources you used or that inspired you before and throughout the project? Textbooks, tutorials etc..?


Robert Nystrom's Crafting Interpreters (https://craftinginterpreters.com/) is an awesome resource :D

Though I didn't implement his language exactly, a lot of what he talks about carries over anywhere. It especially helped me figure out how to handle the memory management and scoping.

Vincent Jacques' DrawGrammar (https://jacquev6.github.io/DrawGrammar/) was also super helpful. Getting to see the syntax definition visually made writing it so much easier :).


This is so good for your age! I don't think I made anything beyond a calculator or a website before entering college.


I wish I was this talented when I was going into college. Nice job though. :)


Did you write your own garbage collector? Is it a moving one or a non-moving one?


Crumb is garbage collected (there is no need to manually allocate/deallocate memory)... though there is no background "garbage collector" process running... The interpreter for Crumb is a tree-walk interpreter, and it just frees memory whenever it can... Crumb frees memory in the following cases:

1) When a function is finished, all memory related to the scope of that runtime is freed.

2) When a value is not returned out of a statement, or assigned to a variable, said value is freed.

3) When a function is applied, if an argument has no reference (it is not stored in a variable), it is freed.

4) Additionally, if the function itself has no reference (such as in the case of an immediately invoked function), it is freed.

Hope that clarifies things a bit :D
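Rule 1 from the list above could look roughly like this in C. This is only a sketch under assumptions: Var, Scope, scope_set, scope_free and the live_vars counter are hypothetical names invented here, and none of it is taken from Crumb's source.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types, not Crumb's actual data structures. */
typedef struct Var {
    char *name;          /* owned copy of the variable name */
    double value;        /* stand-in for a real value type */
    struct Var *next;
} Var;

typedef struct {
    Var *vars;           /* singly linked list of bindings */
} Scope;

static int live_vars = 0;  /* bindings currently allocated */

/* Bind name to value in the scope. Returns 0 on success and -1 if
   an allocation fails. */
int scope_set(Scope *s, const char *name, double value) {
    Var *v = malloc(sizeof *v);
    if (v == NULL) return -1;
    v->name = malloc(strlen(name) + 1);
    if (v->name == NULL) { free(v); return -1; }
    strcpy(v->name, name);
    v->value = value;
    v->next = s->vars;
    s->vars = v;
    live_vars++;
    return 0;
}

/* Rule 1: when the function finishes, free everything in its scope. */
void scope_free(Scope *s) {
    Var *v = s->vars;
    while (v != NULL) {
        Var *next = v->next;
        free(v->name);
        free(v);
        live_vars--;
        v = next;
    }
    s->vars = NULL;
}
```

An interpreter built this way would create a Scope when a function is applied and call scope_free once the body has been evaluated, after copying out whatever value is returned.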


Sorry if I got the syntax wrong, but in something like

  f = {
     x = (list 1 2 3)
     y = (list x x)
     z = (get x 1)
     <- y
  }
How does the compiler decide whether it must free the memory used by x?


All lists are passed by value and X isn't the return value, would be my guess


It's an interpreter rather than a compiler, and looking at the code it seems to use ref counting


Ref counting makes a lot of sense with the description of the OP, and if there are no cycles it's good.
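For anyone unfamiliar with the technique, here is a minimal reference-counting sketch in C. It is a generic illustration with hypothetical names (Obj, obj_new, obj_retain, obj_release, live_objs), not a description of how Crumb actually does it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object header: a count of owners plus a payload. */
typedef struct {
    int refs;
    int payload;
} Obj;

static int live_objs = 0;  /* objects currently allocated */

/* Create an object owned by the caller (refs == 1). */
Obj *obj_new(int payload) {
    Obj *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->refs = 1;
    o->payload = payload;
    live_objs++;
    return o;
}

/* Register one more owner. */
void obj_retain(Obj *o) {
    assert(o->refs > 0);
    o->refs++;
}

/* Drop one owner; the last owner to leave frees the object. */
void obj_release(Obj *o) {
    assert(o->refs > 0);
    if (--o->refs == 0) {
        free(o);
        live_objs--;
    }
}
```

The caveat about cycles follows directly: if two objects each held a retained pointer to the other, neither count could ever reach zero, and both would leak.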


It should. Everything is copied and by-value, there are no references/pointers or even closures, cycles are impossible. Think Pascal that has garbage collection for strings and dynamic arrays.
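To make the by-value idea concrete, here is a rough C sketch of copying a list instead of sharing it. IntList, int_list_copy and int_list_free are hypothetical names for illustration and are not taken from Crumb's source. Because the copy owns its own storage, no value can end up pointing back at itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical list of ints that owns its items array. */
typedef struct {
    size_t len;
    int *items;
} IntList;

/* Copy src into dst with freshly allocated storage.
   Returns 0 on success and -1 if the allocation fails. */
int int_list_copy(const IntList *src, IntList *dst) {
    dst->len = src->len;
    dst->items = NULL;
    if (src->len == 0) return 0;
    dst->items = malloc(src->len * sizeof *dst->items);
    if (dst->items == NULL) { dst->len = 0; return -1; }
    memcpy(dst->items, src->items, src->len * sizeof *dst->items);
    return 0;
}

/* Release the storage owned by a copied list. */
void int_list_free(IntList *l) {
    free(l->items);
    l->items = NULL;
    l->len = 0;
}
```

Changing an element of the copy leaves the original untouched, which is the property that rules out shared references.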


This is seriously impressive. I’m also just about to enter college for software engineering and projects like this make me concerned about how I’m even gonna compete with such skilled peers.


If it helps I would absolutely struggle to do this as a dev with 9 years experience. You’ll never totally shake imposter syndrome in a field where you can easily see such excellent projects routinely.

Hasn’t stopped me from finding interesting jobs that pay well!


The skills of work are only partially overlapping with coding skills, so you’ll have new kinds of angst later.


I love this! Amazing work!


Thanks!


Amazing!


Nicely done.

You're a freshman, picked up C over the summer, and have already built an interpreter for a language that you haven't officially been taught yet, and have a blog with a couple dozen posts.

I'll be honest, people with achievements like yours make me feel pretty worthless.


Ya, but would you rather spend the summer enjoying summer or writing C


This assumes you don't enjoy writing C


This is them enjoying the summer. Aspie supremacy.


It depends. When I was young I could spend hours coding and it felt amazing and like a great use of my time. Now in my 30s? No way, writing detailed code and spending all my time in front of the screen feels like I’m throwing away my life. All those minutiae, whether useful or useless, are repulsive.


That's pretty much how I feel as well. In my 30s now, but unemployed and working a bit day to day on a project while sending out the odd job application in an abysmal market (in the same city as the author), but it's a gradual process and only after enjoying the sunny days as much as possible, socializing, fitness etc..

It's not that there's no time for coding in a hobbyist manner, but I'm careful about moderating the amount of time I allocate to it.

That said, I wasn't at even remotely the same level of execution ability as the author, and am probably still not, but I was starting to work with freelance clients in PHP and Angular.js, building some small technical experiments end to end, and relentlessly trying to get better at technical stuff. I could definitely still do that, just not at the expense of other things.


I'm in my 20s and I'm disappointed in myself for not spending time with my friends during school instead of wasting time making hacked together programs.

I came to realise I don't even like "real" programming only random quickly put together scripts.

When my friends explored all kind of things I was so stuck on coding and web dev and php. I didn't check out any other fields of interest or professions. When I read all the threads here about parents teaching their 4 year old BASIC to "indoctrinate them into programming", I cringe hard. They're stealing their kids' childhood and their drive to explore what they'd actually love.


As I've entered my 30s, I've noticed that people have also squandered their opportunities to get reps at friendbuilding outside of school, instead focusing on work, and everything has a cost. Now they're more employable than me, but I know exactly how to go about establishing strong friendships, and many of them are hopelessly stuck to either their spouse or their high school buddies and the prospect of moving elsewhere to advance the opportunity in the field they invested in would be quite difficult for some.

Regarding parents, what's made me cringe lately has been the shutting down of opportunities for their kids to form new bonds and play. There's a park I go to regularly that seems to be attended by well-off parents who go in groups. Last time I overheard a kid who must have been 5 playing with some other kid she just met on the swings. She ran over to her mother and said "This is ___ she's my new friend" and the mother basically wrote it off and said that's great but the other one has to go and then they left. The same outgoing kid had earlier wanted to try climbing this artificial rock I was doing, and the parent just showed absolutely no enthusiasm for letting their kid do it, she had somewhere else to be. I get that people are busy, but let your fucking kids play.


Since you're going into your freshman year I'll offer my college advice as a current senior. The moment that you feel a class will be covering something you already know, reach out to your advisor and the professor. I personally wasted far too much time in intro classes that could've been easily bypassed. While easy A's are great for the GPA, it's better to spend your time and money actually learning.


[flagged]


I'd say C is actually a pretty good choice for an educational project like this. Having to write out all of your data structures by hand and manage your memory manually is a good learning experience, and since you're not writing serious production code, you don't have to worry too much about making mistakes.


_someone_ has to learn to build runtimes


_someone_ “else” :)


Given OP's (presumed) age, curiosity and enthusiasm, my advice to OP would be to not worry about what language you are going to specialize in at all. That can come later. In fact, do the opposite: learn every language under the sun and prioritize weird languages with unique ideas that are outside of the mainstream. Learn Rust, Haskell, Lisp (OP is already learning Racket so that's covered), Prolog, Forth, assembly, Smalltalk, ... (admittedly I haven't learned every language on this list; there are some that I wished I learned when I had more time for it).

OP is at a stage in life where fluid intelligence is still very high and crystallized intelligence is growing rapidly. This is the time in your life where you are perhaps most able to absorb new ideas and ways of thinking, before you get set in your ways and less receptive to different ways of thinking.


C is a very popular language to use in undergrad courses, all our entry classes (intro to programming, data structures and algorithms, OS fundamentals, etc.) were in pure C - exactly because you have to deal with memory from the get-go.

If that's the case at the Uni OP is studying at, then he should become fluent in C - because from experience as a TA, it's not necessarily the CS knowledge/fundamentals that tend to be lacking when people struggle with classes, but rather that they're fighting a new language. Even if they have years of previous experience...especially in these days, when Python or JS tend to be the first (and only) language for many freshmen.


This is bad advice.

> several better options these days

Every other option requires you to buy into some additional paradigm-of-the-day.

You can easily lose decades going down the wrong programming rabbit holes.


No. Paradigms are few (OO/logic/functional/procedural/a few more) and learning those is essential and very transferrable.

> You can easily lose decades going down the wrong programming rabbit holes

total rubbish.


Why do you think the industry keeps circling back on old stuff?

Look at how much time we lost with OO. I am only recently coming to grips with how bad it is. C++ was the "thing you are interested in today" back then. And it wasn't well thought through.

What were people doing before Rust?

C is great.


C is great, agreed. That's the procedural paradigm I mentioned. But the others work very well if used appropriately.

> Why do you think the industry keeps circling back on old stuff?

What does this even mean? Can you give some examples?

> Look at how much time we lost with OO

Okay, what do you mean by 'lost' because I find OO to be extremely useful (if used appropriately). Ditto functional, ditto logic.

Are you suggesting we should scrap all of these and just use code procedurally and nothing else? Because that would be a massive step backwards in my view, and I do have decades of experience so I can't understand why you're so negative about anything else but procedural. And maybe C++ was just a bad implementation in hindsight. But it took us a long way.


> What does this even mean?

There is a lot of good wisdom from the 60s, 70s, 80s that people forget and end up re-inventing. I don't think I'm the first person to make this comment.

> OO

It's very subjective I'll admit. But I don't like OO, and it took me a long time to realize why. There are many people who feel the same way. The difficulty is in explaining it. Because you have to look at large systems, explore the evolution of these systems, and pluck enough examples where things don't work.


You say so but give no actual evidence against it.

I can't learn from that.


There are a ton of resources online about "why OO is bad". I think knowing that this viewpoint exists is a good lesson in itself.


You don't consider the fact that I've worked successfully with OO many times on various systems of various sizes to be of any significance, then tell me "there are a ton of resources online" which translates to "now go and google for viewpoints that support grumblingdev" instead of actually giving me any data.

You've given me nothing to work on, nothing to learn from, nothing I can even oppose because your objection is so nebulous. I can't even knock down your claims because you haven't really made any except "muh, OOP baaad". I haven't learnt anything, equally you haven't learnt anything... what was the bloody point in you saying anything at all?

> I think knowing that this viewpoint exists is a good lesson in itself

I'll skip the sarcastic reply and just say I draw rather different conclusions.


They built an interpreted language in C over a summer, I think they are beyond “getting into programming” - they’ve clearly been programming for years.


This kid has plenty of time to dive into other languages. C is a great choice.


C is like a high level assembly language. It maps very closely to how the computer actually runs programs. That, and its simplicity, make it a great way to learn.

It’s not usually a great idea for production code anymore, but for learning it’s fantastic.


That old chestnut. C is nothing remotely like an assembly language, of any kind. It is a low-level programming language, but only compared to most languages. Disassemble some compiled C code sometime -- it's another world entirely.


I’ve programmed in both assembly and C, among other things. I stand by it, C is much closer to the machine than most languages.


I think Lisp won that battle. The core syntax was named after actual cpu registers (cdr/car).


Seriously? :)

Which language is the closer match to lambda calculus, and which to the PDP instruction set?


C may not be the same as assembly language but the value of learning it is that in general it is the lowest level that we get to for most software. So, most programming languages are either compiled to C or have a virtual machine that is written in C. Many of the things our programs rely on are written in C, for example operating systems and databases. So in a way we can say that "the metal" is C and by understanding it we can potentially understand how our software works from top to bottom.


Maybe it is on a PDP-11 . . .



