Show HN: I wrote a BASIC interpreter in Go (github.com)
144 points by stevekemp 5 months ago | 76 comments

A while ago I started making a game (https://www.youtube.com/watch?v=GwBiJR_rj_w) that was going to be about typing in program listings from magazines and books (like "back in the day")... but the "twist" was going to be that the whole world turned out to be alterable and scriptable with BASIC. The WIP title was "BASIC Instincts".

The problem was I spent way too much time having fun making the interpreter (in JavaScript) and then Else.HeartBreak() was released and it took the wind out of my sails (because it was great!)... I might go back to it now though, writing interpreters is really enjoyable!

I still have fond memories of the various BASICs. I first used BBC BASIC at school (unofficially), and it seemed like magic. I had an Atari ST and got a copy of STOS at home.

Whilst BASIC encourages all kinds of spaghetti, I'm still convinced it is a better language than Python to start on. Control flow is so obvious to non-programmers when it has line numbers.

Not all BASIC dialects have line numbers; there are some very elegant versions of BASIC out there, GFA BASIC and BBC BASIC for instance. Both have named functions and do not use line numbers.

Not sure which BBC basic you used, but it certainly DID have line numbers when I used it. (BBC Model B)

However there were user defined functions and procedures as I recall.

Yes, you are 100% right, the line numbers were there in BBC Basic, but mainly for editing rather than control purposes (though you could use the dreaded GOTO statement if you absolutely had to). Keep in mind that this was just before the days of the full screen editor embedded in your favorite programming language.

GFA did away with them entirely.

The BBC had an odd editor: it was partly line and partly screen based, allowing you to move a secondary cursor across the screen with the arrow keys, and then a 'copy' key to copy whatever character was under the second cursor to the input cursor as though that key had been pressed. It sounds terrible but it was actually quite quick in actual use.

What you’re describing there was pretty standard across all of the 8bit micro computers. A few machines had the copy feature (eg the CPC464) and the control flow in the BBC BASIC wasn’t anything that couldn’t be done in pretty much every other dialect from that era (be it Microsoft BASIC, Locomotive or whatever).

Personally I think it’s a bit of a stretch to say line numbers were mainly there for editing. Even if you ignore GOTOs the most common method of running pseudo-functions on the BBC Micro was GOSUB.

The BBC personal computer (that ran BBC Basic) also had a cool feature that allowed you to drop down into 6502 assembly language right in the middle of your BASIC program, just by enclosing your assembly language statements in [ and ].

I remember trying it out at a British Council library which had one such machine available for members to use. Although I had done a little 6502 assembly language programming earlier (using both the PEEK and POKE approach as well as using an assembler), just for fun, on other home computers, I never got to do a lot of it, since by then, I moved on from BASIC and assembly to languages like Pascal and then C. Good times.

There were actually a few machines that ran BBC BASIC. I’ve got a BBC Micro model B hooked up in my “man cave” and I still play on that from time to time.

I’ve also got a few Amstrad CPCs - another range of British 8bit micros. They ran Locomotive BASIC which was largely inspired by BBC BASIC so there is a hell of a lot in common between those two dialects.

>There were actually a few machines that ran BBC BASIC.

Interesting, didn't know that.

>I’ve also got a few Amstrad CPCs - another range of British 8bit micros.

Cool. Had read about them in computer mags.

And the ARMs as well.

> Even if you ignore GOTOs the most common method of running pseudo-functions on the BBC Micro was GOSUB.

I don't even recall a single time when I used that, the usual control structure for that was

DEF PROCprocedurename(parameters)


See: http://www.bbcbasic.co.uk/bbcwin/tutorial/chapter16.html

Ditto 'proper' functions using DEF FN with parameters and a return value using the '=' sign.

BBC BASIC was pretty advanced but most people’s exposure to it was in schools so if kids wrote anything, it was usually using a common syntax that they’d learned on their Sinclair, Commodore or whatever (BBC Micros weren’t cheap from what I recall).

So I think while the FN / PROC approach was arguably better, I’d be surprised if it was more commonly used than GOTO or GOSUB. But who knows, maybe your circle of friends did things differently?

Well, my 'Circle of friends' was rather limited, basically just three guys that taught each other programming and lots of books to read.

This one:


and the BBC manuals were pretty good. So was Newman and Sproull as well as David Levy's Computer Gamesmanship. And now that we are talking about this I do remember when I used the line numbers. I came to the BBC from the TRS-80 Color Computer/Dragon 32, and that one had the good old Microsoft ROM BASIC, which did not have such luxuries.

It really clicked for me when writing a musical score editing program, that I tried to put together using the 'old' way and then tried it again using the 'structured' way. The second time around the program made it to completion and it was lots easier to understand so I never really looked back after that. So to me that became the 'normal' way of doing things but I totally see how it could be that this experience was not typical.

You are also right that they were not cheap, my fully decked out 'beeb' (double drives, 256K RAM expansion) cost me more than a year's worth of savings. But looking back it was most probably the best investment I ever made. It gave me a career.

Interesting stuff. Thank you for sharing. I do think BBC Micros (or at least 8bit BASIC machines in general) were responsible for a great many careers - mine included.

I never used a Dragon in the 80s but do have a Dragon 64 packed away, waiting for me to hook up and play. I keep meaning to get it out but between work, family, and the other retro hardware, it sadly never gets a look. I’ve heard Dragons were nice machines though.

>BBC Micros weren’t cheap from what I recall

Yes, I think I paid £400 for mine.

Yes, I read that there were some more structured-programming-oriented BASICs that came out some time after the structured programming movement started [1]. IIRC, one such BASIC was by a company founded by John Kemeny [2], one of the creators of the first BASIC. It might have been called TrueBASIC or PureBASIC or some such name. Think I read about it in some computer magazine.

[1] https://en.wikipedia.org/wiki/Structured_programming

Aside: Just looked up Kemeny in Google.

[2] https://en.wikipedia.org/wiki/John_G._Kemeny

Interesting snippet from his Wikipedia page:

[ Kemeny's family settled in New York City where he attended George Washington High School. He graduated with the best results in his class three years later.[2] In 1943[1] Kemeny entered Princeton University where he studied mathematics and philosophy, but he took a year off during his studies to work on the Manhattan Project in Los Alamos National Laboratory. His boss there was Richard Feynman. He also worked there with John von Neumann. Returning to Princeton, Kemeny graduated with his BA in 1947, then worked for his Doctorate under Alonzo Church, also at Princeton. He worked as Albert Einstein's mathematical assistant during graduate school.[1] ]

And the company and BASIC dialect he created later were both called True BASIC:


That was a lot of other eminent people he was associated with :)

Kemeny was a really interesting guy. His obituary[1] gives some good background on him. I read his 1972 book, "Man and the Computer"[2], in the early 90s. Some of his predictions about how computers would be used in the future (ubiquitous networking, online shopping, etc) were spot-on. His idea of "mainframes" providing centralized services seemed backwards to me in the early 90s, when I was just getting my PC onto the Internet for the first time, but it appears that he got that one right too.

[1] http://www.columbia.edu/~jrh29/kemeny.html

[2] https://archive.org/details/mancomputer00keme

Thanks for the info. I read his obituary that you linked to. It really is interesting, as you said. He had great vision. And did a lot practically too, of course.

I know, and they (like Python) all make excellent second steps. The line numbers are bad for structure, but good for stage 1 learning is what I am saying. In fact, letting people find the limitations of line numbers and GOTO statements is an excellent way to show them why you would want something more structured, rather than just telling them.

I do not know about that. Teaching people something bad first only to show them a better way later still wastes their time and ingrains bad habits. Personally I prefer teaching the better way right from the beginning (assuming I know what the better way is).

So we should skip Scratch for kids and go straight to Java?

Static types and Object Oriented Programming are better ways, I think most of us will agree?

I want to give noobs a taste of the magic of programming immediately

    10 Print "hello"
    20 GOTO 10
Is something anyone who can read could understand.

Even the relatively simple

    while True:
Has several concepts to explain before you can understand it.

> Static types and Object Oriented Programming are better ways, I think most of us will agree?


I'm with you on static typing but disagree with OO. I know other developers I strongly respect with every permutation.

Personally, I probably wouldn't object to OO so much if Java weren't such a commonly taught first language, which gets right back to the discussion at hand.

I do not think the Scratch vs Java comparison is a fair one here. There is nothing wrong with Scratch.

But bad Scratch vs good Scratch or bad Java vs good Java would be a fair comparison and in those cases I would advocate for teaching the right thing from the start.

> Static types and Object Oriented Programming are better ways, I think most of us will agree?

I'm pretty sure you won't find near universal agreement on either.

After I got the tokenizer working the first thing I did was implement "PRINT" and "GOTO". Specifically so that I could write:

   20 GOTO 10
Everything else was added after that, which is why the PRINT function is the oldest and worst code (until this evening).

I still have a BBC basic re-implementation in C floating around here, code written in 1989, would that help you? If so I'll be happy to tar it up and send it to you.

I think seeing the code would be interesting, not just for me but for others too.

Where should I mail the archive?

Ideally post to github so everybody can read.

But failing that my domain is steve.fi, and my email address is steve@.

I do not do github, but I will mail you the archive later today when I am behind my desk again.

Sent! Have fun with it.

Programming without language-level support for structure isn't “something bad”, and in practice it resulted in learning how structure is implemented at a low level.

Plus, mandatory line numbers is what enables having a REPL that's also a program editor, which is a big win.
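
To make the REPL-as-editor point concrete, here is a minimal sketch in Go. It is not taken from the linked interpreter; `Program`, `Enter` and `List` are hypothetical names. The idea is that one prompt serves both purposes: input that starts with a number edits the stored program, and anything else is handed back to be executed immediately.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Program maps line numbers to source text (hypothetical structure).
type Program map[int]string

// Enter handles one line typed at the prompt. A leading number stores or
// replaces a program line, or deletes it when no text follows the number.
// Anything else is returned as an immediate command for the caller to run.
func (p Program) Enter(input string) (immediate string, stored bool) {
	input = strings.TrimSpace(input)
	i := 0
	for i < len(input) && input[i] >= '0' && input[i] <= '9' {
		i++
	}
	if i == 0 {
		return input, false
	}
	n, err := strconv.Atoi(input[:i])
	if err != nil {
		return input, false
	}
	text := strings.TrimSpace(input[i:])
	if text == "" {
		delete(p, n)
	} else {
		p[n] = text
	}
	return "", true
}

// List returns the program in line-number order, as a LIST command would.
func (p Program) List() []string {
	nums := make([]int, 0, len(p))
	for n := range p {
		nums = append(nums, n)
	}
	sort.Ints(nums)
	out := make([]string, 0, len(nums))
	for _, n := range nums {
		out = append(out, fmt.Sprintf("%d %s", n, p[n]))
	}
	return out
}

func main() {
	p := Program{}
	inputs := []string{`20 GOTO 10`, `10 PRINT "hello"`, `15 REM oops`, `15`, `PRINT 1`}
	for _, in := range inputs {
		if cmd, stored := p.Enter(in); !stored {
			fmt.Println("run immediately:", cmd)
		}
	}
	for _, line := range p.List() {
		fmt.Println(line)
	}
}
```

Lines can be typed in any order, retyping a number overwrites that line, and a bare number deletes it, which is all the "editor" there is.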

BASIC dialects with support for structured programming without line numbers, and compilation to native code, were already available on CP/M.

Great, and I'm saying that was worse for beginners than line numbers (like the BASIC in the linked article)

Letting people crawl is a great way to teach them the advantages of walking.

What I meant is that people were able to crawl, walk and run without switching toolchain.

Does anyone remember some installs shipped with a game called gorilla.bas? A fun little game where you had two gorillas across a cityscape and the whole idea was to take turns throwing bananas at each other to see who can get a critical hit.

What made it hard is you had to enter both an angle and speed and hope you calculated the right combination to hit the other player.

Eventually I wrote my own version of the game in Flash/ActionScript... but it’s long gone.

When I was first learning about programming, I had hours and hours of enjoyment just from modifying that game even though I didn't quite know what I was doing. Just things like changing the banana speed or making bananas spawn other bananas when they hit.

Another fun one was called Labyrinth (I think it was spelled LABRNTH.BAS or something). I liked it so much that years later I started making two remakes (one in HTML canvas/JS, one in DCPU-16).

QBasic was a great introductory language for a kid wanting to learn programming IMO.

Here's a js/html5 port: http://www.kylem.net/stuff/gorilla.html

Yep that looks like a legit clone of it. Thanks for the link.

Somebody on github wrote a version for CP/M-80, and I ended up getting it running on my Z80-emulator-inside-Unreal-Engine project:



Original file looks like it's available here: https://gist.github.com/caffo/1326838

That's a very common game or example often with tanks. I like this gorilla/banana theme.

Yes! This was the best BASIC game I ever knew.

Peter Norvig did one for Python, and it's pretty cool: https://github.com/norvig/pytudes/blob/master/ipynb/BASIC.ip...

Interesting abuse of a regex as the tokenizer. Cool.

How is it an abuse? I thought using regex as the tokenizer was the main approach

I've never seen a tokenizer as one big regex that finds all the types/functions/keywords all at once.

To get a bit theoretical, one way to write a lexer is to use a state machine called a deterministic finite automaton - a fancy term for a state machine that is finite and always gives the same output given an input. Each new character moves the state, and there are terminal states when you reach e.g. a space which will spit out what kind of token you just read.

You can show that the things a regex can compute are the same things that this kind of state machine can compute. I believe that most regex under the hood use such a state machine, so it's very natural to use regex to tokenize.
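
A hand-rolled version of that state machine might look like the following Go sketch. The token kinds and the `Lex` function are my own invention for illustration, not code from the linked project. Strictly speaking, the "step back one character" trick makes it a DFA with one character of lookahead, which is how most hand-written lexers work.

```go
package main

import "fmt"

// Token is a lexeme plus its kind (hypothetical type for this sketch).
type Token struct {
	Kind string
	Text string
}

type state int

const (
	start state = iota
	inNumber
	inIdent
	inString
)

func isDigit(c byte) bool  { return c >= '0' && c <= '9' }
func isLetter(c byte) bool { return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') }

// Lex walks the input one byte at a time. Each byte either keeps the
// machine in its current state or ends the token and returns to start.
// An unterminated string is silently dropped in this sketch.
func Lex(src string) []Token {
	var toks []Token
	st, begin := start, 0
	src += " " // sentinel so the last token is flushed by the normal rules
	for i := 0; i < len(src); i++ {
		c := src[i]
		switch st {
		case start:
			begin = i
			switch {
			case isDigit(c):
				st = inNumber
			case isLetter(c):
				st = inIdent
			case c == '"':
				st = inString
			case c == ' ' || c == '\t':
				// whitespace separates tokens and is otherwise ignored
			default:
				toks = append(toks, Token{"OP", string(c)})
			}
		case inNumber:
			if !isDigit(c) {
				toks = append(toks, Token{"NUMBER", src[begin:i]})
				st = start
				i-- // look at c again from the start state
			}
		case inIdent:
			if !isLetter(c) && !isDigit(c) {
				toks = append(toks, Token{"IDENT", src[begin:i]})
				st = start
				i--
			}
		case inString:
			if c == '"' {
				toks = append(toks, Token{"STRING", src[begin+1 : i]})
				st = start
			}
		}
	}
	return toks
}

func main() {
	for _, t := range Lex(`10 PRINT "hello"`) {
		fmt.Printf("%s(%s) ", t.Kind, t.Text)
	}
	fmt.Println()
}
```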

You can't do it in general, e.g. to tokenize Python you'd need a layer that'll count tabs (because INDENT and DEDENT are tokens). That is, unless your regex dialect is Turing complete, and if my memory serves, some of them are? I can't really remember. In general though, you can't do it with "normal" regex.
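
The extra layer is small, for what it's worth. Here is a rough Go sketch of the usual stack-of-indent-widths approach. `IndentTokens` is a made-up name, it only counts leading spaces (tabs are ignored to keep it short), and it does not report a dedent to a width that was never opened.

```go
package main

import (
	"fmt"
	"strings"
)

// IndentTokens turns leading spaces into INDENT/DEDENT markers using a
// stack of open indentation widths, which a plain regex cannot track.
// Each non-blank line is emitted as "LINE:<text>".
func IndentTokens(lines []string) []string {
	var out []string
	stack := []int{0} // widths of the currently open blocks
	for _, line := range lines {
		body := strings.TrimLeft(line, " ")
		if body == "" {
			continue // blank lines do not affect indentation
		}
		width := len(line) - len(body)
		if width > stack[len(stack)-1] {
			stack = append(stack, width)
			out = append(out, "INDENT")
		}
		for width < stack[len(stack)-1] {
			stack = stack[:len(stack)-1]
			out = append(out, "DEDENT")
		}
		out = append(out, "LINE:"+body)
	}
	// Close any blocks still open at end of input.
	for len(stack) > 1 {
		stack = stack[:len(stack)-1]
		out = append(out, "DEDENT")
	}
	return out
}

func main() {
	src := []string{
		"while True:",
		"    x = 1",
		"    if x:",
		"        y = 2",
		"z = 3",
	}
	for _, t := range IndentTokens(src) {
		fmt.Println(t)
	}
}
```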

It's the way I always approach it when parsing a DSL, but always line-by-line, rather than a regex across the whole script.

I know it isn't going to be the most performant approach, but if you're comfortable with regular expressions it's really simple, and performant enough for most cases.
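
For anyone curious what the line-by-line regex approach looks like in Go, here is a minimal sketch. The pattern and the `TokenizeLine` name are assumptions of mine, covering only a small BASIC-like token set. Go's `regexp` package is automaton-based and documents matching time linear in the input size, so it fits the state machine discussion above.

```go
package main

import (
	"fmt"
	"regexp"
)

// tokenRe is one alternation covering every token type. Order matters:
// Go picks the leftmost alternative that matches, so "<=" must come
// before the single-character operators.
var tokenRe = regexp.MustCompile(
	`\d+(?:\.\d+)?` + // numbers
		`|"[^"]*"` + // string literals (no escapes in this sketch)
		`|[A-Za-z][A-Za-z0-9]*\$?` + // keywords and variables like A$
		`|<=|>=|<>` + // two-character operators
		`|[-+*/=<>(),;:]`, // single-character operators and punctuation
)

// TokenizeLine splits one source line into token strings. Characters
// that match nothing (including whitespace) are silently skipped.
func TokenizeLine(line string) []string {
	return tokenRe.FindAllString(line, -1)
}

func main() {
	fmt.Printf("%q\n", TokenizeLine(`10 PRINT 3+4*5`))
	fmt.Printf("%q\n", TokenizeLine(`20 IF a<=10 THEN GOTO 10`))
}
```

The silent skipping is the main weakness: a real tokenizer would want to report unrecognised characters rather than drop them.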

A fun project! I had a buddy who tried this, wrote it in C++ over a week. Had console, graphics, formatting and networking (I think).

The trick was first a parser, then use C++ objects for each construct and instantiate one for each parsed element. The objects had a 'list' and a 'run' method. They were linked in programmatic order. So run the first one, which returns the next one to run (since it may vary with IF, GOTO etc)
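
That design translates fairly directly to other languages. Here is a small Go sketch of it, not the C++ original: the `Stmt` interface and the three statement types are invented for illustration. Each statement can list itself and run itself, and `Run` returns whichever statement should execute next.

```go
package main

import (
	"fmt"
	"strings"
)

// Env holds interpreter state: integer variables and captured output.
type Env struct {
	Vars map[string]int
	Out  strings.Builder
}

// Stmt is one parsed construct. Run returns the next statement to
// execute, or nil when the program ends.
type Stmt interface {
	List() string
	Run(env *Env) Stmt
}

// Print writes a fixed string, then falls through to Next.
type Print struct {
	Text string
	Next Stmt
}

func (s *Print) List() string { return fmt.Sprintf("PRINT %q", s.Text) }
func (s *Print) Run(env *Env) Stmt {
	env.Out.WriteString(s.Text + "\n")
	return s.Next
}

// Incr models LET name = name + 1.
type Incr struct {
	Name string
	Next Stmt
}

func (s *Incr) List() string { return fmt.Sprintf("LET %s = %s + 1", s.Name, s.Name) }
func (s *Incr) Run(env *Env) Stmt {
	env.Vars[s.Name]++
	return s.Next
}

// IfLess jumps to Then while name < Limit, otherwise continues at Next.
type IfLess struct {
	Name       string
	Limit      int
	Then, Next Stmt
}

func (s *IfLess) List() string { return fmt.Sprintf("IF %s < %d THEN <jump>", s.Name, s.Limit) }
func (s *IfLess) Run(env *Env) Stmt {
	if env.Vars[s.Name] < s.Limit {
		return s.Then
	}
	return s.Next
}

// Execute runs statements until one of them returns nil.
func Execute(first Stmt, env *Env) {
	for s := first; s != nil; s = s.Run(env) {
	}
}

func main() {
	// 10 PRINT "hello" : 20 LET i = i + 1 : 30 IF i < 3 THEN GOTO 10
	jump := &IfLess{Name: "i", Limit: 3}
	incr := &Incr{Name: "i", Next: jump}
	first := &Print{Text: "hello", Next: incr}
	jump.Then = first // the backward jump closes the loop

	env := &Env{Vars: map[string]int{}}
	Execute(first, env)
	fmt.Print(env.Out.String())
}
```

The interpreter loop ends up being a single `for` statement, since all control flow lives in what each `Run` returns.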

Here it is! http://www.moondew.com/basic/

Sounds like interpreting from an instance of the AST. Pretty easy but there are gotchas, like handling break/continue in loops. It’s an interesting exercise everyone ought to try IMO.

I once wrote a Logo interpreter in BASIC... For my Computer Studies GCSE (UK national exams at 16) final project. It never actually worked, but it was a heroic effort!

I once wrote an assembler in BASIC. We extended the instruction set of a microprocessor by some external circuitry, so the standard assembler was not very useful. BASIC was fine for the job. And it actually worked :)

I think BASIC is something that everyone has to write at least once in their lifetime.

My personal take is here (over 7 year old code, beware): http://tech.nimbus.fi/minibasic/

And I wrote that just to make this functional: http://tech.nimbus.fi/c128/ (Try the "help" command.)

Yep, and here was mine: https://github.com/darius/parson/blob/master/eg_basic.py written to exercise a parsing library. It's super-slow.

Could someone that has the ability to compile this check the output of

PRINT 3+4*5


  $ echo 'PRINT 3+4*5' > test.bas
  $ go run main.go test.bas

  $ echo 'PRINT (3+4*5)' > test.bas
  $ go run main.go test.bas

  $ echo 'PRINT (3+4)*5' > test.bas
  $ go run main.go test.bas
No idea where it's getting 34 from though.

Grr. Expressions are hard:

    10 LET a = 3+4*5
    20 PRINT a, "\n"
Produces the expected outcome:

    -> 23
Bug filed:


Yes, but it should not require that LET statement. The expression evaluator looks subtly broken.
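
For reference, the textbook way to get precedence right is a recursive descent evaluator with one function per precedence level. The Go sketch below is not the project's code, just an illustration with made-up names (`parser`, `Eval`), and it handles integers only. Multiplication binds tighter than addition because `expr` asks `term` for its operands.

```go
package main

import (
	"fmt"
	"strconv"
)

type parser struct {
	src string
	pos int
}

// peek skips spaces and returns the next byte, or 0 at end of input.
func (p *parser) peek() byte {
	for p.pos < len(p.src) && p.src[p.pos] == ' ' {
		p.pos++
	}
	if p.pos < len(p.src) {
		return p.src[p.pos]
	}
	return 0
}

// expr := term (('+' | '-') term)*
func (p *parser) expr() (int, error) {
	v, err := p.term()
	for err == nil {
		op := p.peek()
		if op != '+' && op != '-' {
			break
		}
		p.pos++
		var r int
		if r, err = p.term(); err != nil {
			break
		}
		if op == '+' {
			v += r
		} else {
			v -= r
		}
	}
	return v, err
}

// term := factor (('*' | '/') factor)*
func (p *parser) term() (int, error) {
	v, err := p.factor()
	for err == nil {
		op := p.peek()
		if op != '*' && op != '/' {
			break
		}
		p.pos++
		var r int
		if r, err = p.factor(); err != nil {
			break
		}
		switch {
		case op == '*':
			v *= r
		case r == 0:
			err = fmt.Errorf("division by zero")
		default:
			v /= r
		}
	}
	return v, err
}

// factor := number | '(' expr ')'
func (p *parser) factor() (int, error) {
	c := p.peek()
	switch {
	case c == '(':
		p.pos++
		v, err := p.expr()
		if err != nil {
			return 0, err
		}
		if p.peek() != ')' {
			return 0, fmt.Errorf("expected ')' at offset %d", p.pos)
		}
		p.pos++
		return v, nil
	case c >= '0' && c <= '9':
		begin := p.pos
		for p.pos < len(p.src) && p.src[p.pos] >= '0' && p.src[p.pos] <= '9' {
			p.pos++
		}
		return strconv.Atoi(p.src[begin:p.pos])
	}
	return 0, fmt.Errorf("unexpected input at offset %d", p.pos)
}

// Eval evaluates a whole expression and rejects trailing input.
func Eval(src string) (int, error) {
	p := &parser{src: src}
	v, err := p.expr()
	if err != nil {
		return 0, err
	}
	if p.peek() != 0 {
		return 0, fmt.Errorf("trailing input at offset %d", p.pos)
	}
	return v, nil
}

func main() {
	for _, src := range []string{"3+4*5", "(3+4*5)", "(3+4)*5"} {
		v, err := Eval(src)
		fmt.Println(src, "=>", v, err)
	}
}
```

With this structure `3+4*5` and `(3+4*5)` both evaluate to 23 and `(3+4)*5` to 35, regardless of which statement the expression appears in.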

Bug filed, as per the edited comment above. I definitely agree this is wrong!

Don't know if it's the same bug, but "PRINT 1/0" returns 1. "PRINT (1/0)" prints nothing. Using "LET a=1/0" raises a div by zero error. Also, "LET a=1%0" panics.

The issues with "PRINT 1/0" and "PRINT (1/0)" are the same as the previous one.

LET a = 1/0 SHOULD produce a division by zero error, so I think that's correct. But 1%0 shouldn't panic. I'll fix that.
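
On the panic: in Go, integer division and remainder by zero both cause a run-time panic, so an interpreter has to check the right-hand operand itself if it wants a BASIC-level error instead. A minimal sketch of that guard, assuming integer operands (I have not checked how the project represents numbers); `Apply` and `ErrDivByZero` are hypothetical names:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDivByZero is returned for both a/0 and a%0.
var ErrDivByZero = errors.New("division by zero")

// Apply evaluates a binary integer operation. The zero check covers
// '/' and '%' together, since Go panics on either with a zero divisor.
func Apply(op byte, a, b int) (int, error) {
	switch op {
	case '+':
		return a + b, nil
	case '-':
		return a - b, nil
	case '*':
		return a * b, nil
	case '/', '%':
		if b == 0 {
			return 0, ErrDivByZero
		}
		if op == '/' {
			return a / b, nil
		}
		return a % b, nil
	}
	return 0, fmt.Errorf("unknown operator %q", op)
}

func main() {
	for _, op := range []byte{'/', '%'} {
		v, err := Apply(op, 1, 0)
		fmt.Printf("1 %c 0 => %d, %v\n", op, v, err)
	}
}
```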

Thank you! At the airport and no access to my desktop so I could not run the code but I suspected that was going to cause a problem. Deskchecking ftw ;)

In the first case, maybe it's treating the '+' as string concatenation - so 3+4 is giving 34? Then it's just dropping the 5 because it can't multiply a string and a number.

Regardless, I really like this project. :)

Ahh, you're correct.

   go run main.go /dev/tty
   PRINT 1+2*5,"\n"

I wrote a BASIC interpreter in VAX assembly language (in 1986). I feel old..

I enjoyed Atari 800 BASIC, so it was modeled after that, tokenized during line entry so that it would not allow you to enter lines with bad syntax.

VAX assembly had instructions for spanning over characters given in a set or scanning to a character in a set, so string processing was very nice.

You should use floating point for the line numbers so that you can always insert more statements between existing lines :-)

Oooh wonderful! I wrote hundreds of games as a kid in QBasic and have had the idea for years to write a QBasic to JavaScript transpiler so I could actually showcase some of them online. With Go adding its WASM support, I’ve actually been considering a Go interpreter as an option.

This is super inspiring. Nice work.

That's how Bill Gates started

Technically, I believe they (Paul Allen & Gates) ported the BASIC from the Harvard DEC-10 using a cross compiler. This was before there were licenses in code.

No. They wrote a fresh version, but they did use an 8080 assembler on a Dec-10.

The first program he wrote as a professional was a payroll program. Altair BASIC came later.
