Microsoft BASIC for 6502 – Original Source Code (1978) (pagetable.com)
250 points by dezgeg on Jan 13, 2015 | 70 comments



My father bought our family an original IBM PC (4.77MHz) when I was around 15 years old, after I spent 2 years riding my bike to the local Radio Shack and sitting, literally, in the display window of the store hacking on a TRS-80 and saving my "work" on a cassette drive.

I became good friends with the salesmen (yes they were all men back then) and they liked me working in the window...it helped them sell the system if they could show a 12-year-old kid writing programs on it. "How hard can it be if..."

The first real program I ever wrote was in MS ROM-BASIC (that's what I remember calling it, because it booted into it if you didn't put the DOS floppy into one of the two (yes two, we were lucky!) 360k floppy drives it had).

The program was a blackjack simulation and trainer for counting cards. My older brother became infatuated with the topic and I remember reading the books on it and wondering if the math and systems were real or just casino trickery.

So I learned this version of BASIC and went to town, building decks of cards and configurable card-counting systems, and proved to myself that, yes, under certain conditions, BJ counters do actually have a long-term advantage over the house.

It's fascinating to see the actual code that ran my simulations.


@cubano oh, man, your story struck a nerve with me. I did exactly the same thing. A friend and I would ride our bikes to a shopping mall about 5 miles away to hack on Commodore 64s and others up in Hudson's department store. Hudson's used to have a small electronics area where they mostly sold stereo equipment, but they had 3-4 tables set up with computers on them. We saved our programs to cassette tapes. I was bitten by the computer bug the summer WarGames came out. The local movie theater was auctioning off a Commodore 64, so the same friend and I went to the movie theater every day, took home a stack of contest entry forms, filled them out, went back and stuffed them in the box. Never won anything. It was probably another year before my paper route gave me the $300 I needed to buy my first computer.

As I read this post I thought the same thing as you; how fascinating it was to see the actual code that ran many of my programs.

In my time, I got into BBS software and assembler, and I found myself copying the BASIC Emulator out of ROM and studying it.

Anyway, it's just so cool to read posts by so many people that thought the same way.


I used to sell computers and electronics at Hudson's (in Harper Woods, MI). Commodores were a bit before my working days. I was selling IBM PS/2, Epson and some Macs. Good times! Sorry, a bit off topic. But this is where I started my software development career. :-)


Harper Woods, MI... oh, jeez, I grew up in the Downriver area; the mall was Southland Mall in Southgate, MI.


>Anyway, it's just so cool to read posts by so many people that thought the same way.

Just FYI, I felt the same way reading your post. Thanks for that.

Of course I remember the C-64, and its "advanced" sprite handling routines...it sure beat PEEKing and POKEing memory in BASIC.


"I’d befriended a number of Tandy store managers in and around Amsterdam, they let me play with the bigger machines"

Tandy was the RadioShack brand in Europe.

From: http://jacquesmattheij.com/computers-I-have-known

I wonder how many kids got their start with computers because of friendly RS employees.

Thank you Gidon & Gert Jan, wherever you are.


I did the same, with ageing Commodore 64s and Amstrad CPC 464s - and one Acorn Archimedes, although that was the superior BBC BASIC - on display in a variety of independent computer shops around Galway, Ireland.


This is a fun coincidence, considering I was just about to assemble BASIC for a custom 6502 system. I suppose it is just that, a coincidence, but it gives me an excuse to rant about my silly project for a moment.

To keep myself sane lately, I've been working on a fun hobby project. It's an HTML5 game. You pilot a spacecraft, but the only way to control the spacecraft is through a 6502 based computer. So all flight controls have to be coded yourself on the machine. There are hardware modules for radar, engines, weapons, etc, but you need to write software to control them all. I haven't fleshed out all the ideas yet, but I get a strange kick out of the idea of piloting a spacecraft with a Commodore 64.

So far I've got an asm.js, cycle-accurate implementation of a 6502, a bus system which is roughly a combination of the C64 and Apple IIe, and a 320x200 4-color display. Just started working on the radar hardware subsystem, and was going to try assembling C64 BASIC for it today. Which is why seeing this article on Hacker News was a neat surprise.


To what degree would you say your idea was influenced by Notch's cancelled 0x10C? Not a criticism, it's actually a game I truly want to see made.


Oh significantly so, but more recently because of watching the BBS Documentary by Jason Scott (highly recommend). But I don't intend to finish this project as a game; it's just a creative outlet. I've always been fascinated by the magical simplicity of the 6502 and related machines, even if they were a few years before my time. A few years ago I even implemented a logic accurate 6502 in Verilog for an FPGA devboard. Working with the 6502 is peaceful in a way; it lies in stark contrast to the towering complexity we as engineers deal with in the present computing age.

I wonder what became of 0x10C after Notch dropped it. Has anyone picked it up and kept working on it? Alternatively, I know Space Engineers is similar in kind, and was quite a bit of fun to play. It just doesn't (or didn't?) have any computing component.


>I wonder what became of 0x10C after Notch dropped it. Has anyone picked it up and kept working on it? Alternatively, I know Space Engineers is similar in kind, and was quite a bit of fun to play. It just doesn't (or didn't?) have any computing component.

There are some successor projects. Perhaps the most famous is Trillek at trillek.org. The forums are dead, but IRC is pretty active at freenode #project-trillek

Actually we have a full computer working with a more realistic behaviour, using the TR3200 CPU (a RISC-like 32-bit CPU), with screen, keyboard, floppy drive, timer, RTC, beeper, hardware "random" number generator, etc. There are two assemblers for it, WaveAsm and VASM. Also, SmallerC can output assembly code for it, and there is another C compiler that would have a TR3200 backend (qcc, a C compiler written in D).

A BASIC interpreter for it would be very much appreciated (or FORTH, or any other easy-to-learn language).

Links :

Computer specs : https://github.com/trillek-team/trillek-computer

Computer implementation library with a not very friendly emulator and a few tools (I'm working on a user-friendly emulator using the lib): https://github.com/trillek-team/trillek-vcomputer-module

WaveAsm : https://github.com/Meisaka/WaveAsm

VASM : http://sun.hasenbraten.de/vasm/

SmallerC : https://github.com/alexfru/SmallerC

Toolkit compilation (WIP) : https://github.com/trillek-team/computer-toolkit

PS: I forgot to mention that we have a simple firmware that detects devices, prints some information, tries to boot from floppy, and has a machine code monitor that is a clone of Woz's monitor: https://github.com/Zardoz89/trillek-firmware


I wrote a simple FORTH-like compiler/interpreter for a 0x10C CPU simulator. It was a really fun little project, and completely doable with my tiny amount of prior assembly experience. Very peaceful, like you say.


Community is working on a few successor projects. Search for Trillek


Do you have a link to see it? I'm curious.

Also, you should check out FUZIX (a Unix clone for 8-bit machines); there is some work under way to get it running on a 6502 CPU.


Some context for this code can be found here [1] in Lammers' great book.

Gates specifically says that "not a line of code went out that I didn't look over" for the BASIC 6502 product. At the time (1986) he said he considered BASIC for the 8080 his "greatest achievement ever in programming" and admitted that he no longer programmed himself but was still looking at code and discussing algorithms with his 160 Microsoft engineers.

[1] https://programmersatwork.wordpress.com/bill-gates-1986/


So I called Microsoft in 1979 after purchasing an OSI C1P and asked some bozo named Ballmer what the cost of the source code to MS Basic for the OSI was. He rustled some paper and said $50,000. My college dreams were destroyed, and I never liked Microsoft after that.


Awesome (well, awesome you called Ballmer direct back in the day, not so much the loss of college dreams...)

My dad had one of those OSI machines. Just a few years ago he found it again in the basement and booted it up, and we were greeted w/ the message saying something like 'OSI Basic, (c) Microsoft 1979'

There was also a pretty cool hack he did, where the OSI supported 40 columns of text and connected to a B&W TV screen. Somehow my dad hacked the motherboard to show 80 columns of text, which worked perfectly on the TV screen, fully readable with characters at half width. Not sure how he added the extra memory for the text, but it worked super well. He even mounted a small switch on the motherboard to flip between 40/80 mode. (EDIT - it may have really been a doubler from 24 columns to 48, as I'm reading up on old OSI specs right now)

I think that's the machine I learned to code on, when I was 6 years old. My dad showed me how to write a BASIC program to count, and I was hooked ever since. (Though I shortly migrated to a TRS-80 for my main childhood hacking).


The TRS-80 can flip between 64 or 32 characters per line, accessed by SHIFT and the right-arrow key (->). The 32-character mode was the only way I could see enough to program on my old 5 inch Russian portable TV. Fantastic days.

Back on topic, the TRS-80 also had BASIC supplied by Microsoft (but Z80 code). I spent a lot of time reverse engineering some of it to add new functionality to the disk BASIC commands, which were unused if you only had cassette tape storage.

We used to "protect" BASIC programs by poking a dummy line number -1 at the start of the program. The program would still work, but refused to list.


I had the TRS-80 Model 1 with the expansion interface. I never remember seeing 32 column mode, my video was always 64 cols by 16 rows.

Though I thought the way it handled graphics was pretty ingenious, where each character could be split into 2x3 blocks, and these 64 combinations were represented in the upper ASCII characters, for 128x48 resolution that is really just 'text'.
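For anyone who wants to see the arithmetic behind that 2x3 trick, here is a rough Python sketch. It assumes the graphics characters are codes 128..191 and that bit 0 is the top-left block, bit 1 the top-right, and so on down to bit 5 at the bottom-right; `new_screen` and `set_block` are made-up names for illustration, not anything from the ROM.

```python
# Sketch of TRS-80 style block graphics: every text cell holds a 2x3 grid
# of blocks, so a 64x16 text screen behaves like 128x48 "pixels".
# ASSUMPTION: graphics characters are codes 128..191, with
# bit 0 = top-left, bit 1 = top-right, ..., bit 5 = bottom-right.

COLS, ROWS = 64, 16      # text cells on screen
GFX_BASE = 128           # first graphics character code (assumed)

def new_screen():
    """Return a screen filled with spaces (character code 32)."""
    return [[32] * COLS for _ in range(ROWS)]

def set_block(screen, x, y):
    """Turn on the block at pixel coordinate (x, y), 0 <= x < 128, 0 <= y < 48."""
    if not (0 <= x < COLS * 2 and 0 <= y < ROWS * 3):
        raise ValueError("block coordinate out of range")
    col, sub_x = divmod(x, 2)    # which text column, left/right block
    row, sub_y = divmod(y, 3)    # which text row, top/middle/bottom block
    bit = sub_y * 2 + sub_x      # bit index 0..5 inside the character
    cell = screen[row][col]
    if cell < GFX_BASE:          # plain text cell: start from an empty graphics char
        cell = GFX_BASE
    screen[row][col] = cell | (1 << bit)
```

Six blocks per cell gives 2^6 = 64 combinations, which is exactly the 64 graphics characters mentioned above.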

That's the machine I really learned how to code on, in Microsoft Level 2 Basic, armed with some David Ahl Creative Computing compilations.

How did you figure out the negative line number hack, just playing around to see what happened?


The 32/64 column trick was a bit obscure. I learned it from the sales guy in the Tandy store. I think from memory that printing CHR$(23) would also invoke it in software.

I cut my programming teeth on the old TRS-80 too. Learned Basic, then Z80 assembler. Still have mine with the interface and pair of disks.

I came across the line number hack when writing a line renumber program in assembler. While getting it working, I screwed up quite a lot and ended up with all kinds of values in there. Screw-ups can be a great way to learn, or especially trying to fix them!

The line number was held as two bytes at the start of each line, which could also be massaged by using the POKE command. When listing the program, Basic expects the line numbers to increase in value and stops listing as soon as it comes to one that's smaller than the last one (presumably thinks it's reached the end of the program). Poking 65535 (255,255) into the first line does the trick.

Surprisingly, it doesn't affect the running of the program. When looking for targets of GOTO or GOSUB, it seems to scan the whole program and not stop at the hacked line number. Weird.
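A toy Python model of that behaviour, for the curious. It is only a sketch of the description above, not of the actual ROM code: the program is a plain list of (line number, text) pairs, LIST is assumed to stop at the first line whose number is smaller than the previous one, and GOTO is assumed to scan every line for an exact match. All function names are invented for the example.

```python
# Toy model of the "unlistable program" trick.
# ASSUMPTIONS (taken from the description above, not from the ROM source):
#   - LIST stops at the first line number smaller than the previous one
#   - GOTO/GOSUB scan the whole program for an exact line-number match

def poke_line_number(program, index, value):
    """Return a copy of the program with one line number overwritten."""
    hacked = list(program)
    _, text = hacked[index]
    hacked[index] = (value, text)
    return hacked

def list_program(program):
    """Mimic LIST: stop as soon as the line numbers go backwards."""
    shown, last = [], -1
    for number, text in program:
        if number < last:
            break
        shown.append(f"{number} {text}")
        last = number
    return shown

def find_line(program, target):
    """Mimic a GOTO lookup: scan every line, ignoring ordering."""
    for index, (number, _) in enumerate(program):
        if number == target:
            return index
    return None

program = [(10, 'PRINT "HI"'), (20, "GOTO 40"), (30, "END"), (40, 'PRINT "BYE"')]
hacked = poke_line_number(program, 0, 65535)
```

In this model `list_program(hacked)` gives up after the poked first line, while `find_line(hacked, 40)` still finds its target, which matches the "won't list but still runs" observation.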

A friend of mine used to "protect" his basic programs against being line printed by poking 200 page feed characters into one of the early lines of the program. Really made the paper shoot out of the old Epson MX80 dot matrix printer :-)

Cheers! [Edit typo]


That's awesome, thanks for the description!

Sadly I never got into assembly on the Z80, I guess I didn't realize it would be accessible :-(

Cheers for reminding me of the Epson MX80 which I also had. I loved the programming manual for it, embedded with funny comments like how to forge the Mona Lisa.


Well, if it makes you feel any better, I heard he got fired ... eventually.


After several decades? :D


$50k? Just for MS Basic?

Heck, Bill and Steve bought QDOS from Tim Paterson for that exact amount, and resold it to IBM for use as PC-DOS.

Maybe they needed your $50k to help finance the purchase.


Well according to Gordon Letwin who was there at the time "DOS was a one-time throw-away product intended to keep IBM happy so that they'd buy our languages" ( http://en.wikipedia.org/wiki/DOS )


I never liked microsoft for no particular reason


My first assignment at my first professional programming job (way back in 1980) was to write a 6502 macro assembler that ran on an Ohio Scientific machine. We took those machines and built our own OS and a three-user travel agency automation package that worked remarkably well, given the constraints of 48K of RAM and 2 floppy disks.

I did some research into hashing, simulated a couple of hashing functions in FORTRAN (I was still in school), and found a good way to optimize the function to avoid collisions on the core set of 6502 opcodes. This earned me an immediate raise from $7/hour to $9/hour.
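For a flavour of what that kind of experiment might look like today, here is a minimal Python sketch. It is not the original FORTRAN and not the actual hash function from that assembler: it simply packs each three-letter mnemonic into a base-26 number and looks for the smallest table size where `key % size` has no collisions over the 56 documented 6502 mnemonics. `mnemonic_key` and `find_collision_free_size` are hypothetical names.

```python
# Search for a collision-free hash table size for the 6502 mnemonics.
# ASSUMPTION: a simple "base-26 key modulo table size" hash, chosen only
# for illustration; the original assembler's function is not known here.

MNEMONICS = (
    "ADC AND ASL BCC BCS BEQ BIT BMI BNE BPL BRK BVC BVS CLC CLD CLI CLV "
    "CMP CPX CPY DEC DEX DEY EOR INC INX INY JMP JSR LDA LDX LDY LSR NOP "
    "ORA PHA PHP PLA PLP ROL ROR RTI RTS SBC SEC SED SEI STA STX STY TAX "
    "TAY TSX TXA TXS TYA"
).split()

def mnemonic_key(mnemonic):
    """Pack a three-letter mnemonic into a unique base-26 integer."""
    key = 0
    for ch in mnemonic:
        key = key * 26 + (ord(ch) - ord("A"))
    return key

def find_collision_free_size(mnemonics):
    """Return the smallest table size where key % size never collides."""
    keys = [mnemonic_key(m) for m in mnemonics]
    # The search always terminates: with size = max(keys) + 1 the modulo
    # leaves the (distinct) keys unchanged.
    for size in range(len(keys), max(keys) + 2):
        if len({k % size for k in keys}) == len(keys):
            return size
    raise AssertionError("unreachable for distinct keys")

table_size = find_collision_free_size(MNEMONICS)
```

The smaller the table size that comes out, the less memory the lookup table wastes, which is presumably the kind of trade-off that mattered in 48K of RAM.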

The assembler ran blazingly fast and I still have the listings for it. As soon as it was able to self-assemble, I added macros for all of the opcodes that rightfully belonged in the 6502 ISA but were not there. Mostly a set of loads and transfers that made the ISA almost fully orthogonal.


    DEFINE    BCCA(Q),<    BCC    Q>    ;BRANCHES THAT ALWAYS BRANCH
    DEFINE    BCSA(Q),<    BCS    Q>    ;THESE ARE USED ON THE 6502 BECAUSE
    DEFINE    BEQA(Q),<    BEQ    Q>    ;THERE IS NO UNCONDITIONAL BRANCH
Huh? The 6502 totally does have a regular unconditional branch instruction, as JMP. http://e-tradition.net/bytes/6502/6502_instruction_set.html


Yes, but JMP uses 3 bytes (absolute addressing) and BXX uses only 2 (it's relative).


That is right. JMP is a jump because it takes a full 16-bit address. Branches were all relative (-128 .. +127 bytes relative to the current PC). I am not sure, but it is also very probable that branches are faster -- at least the processor had to fetch one byte less. That should be the other reason that newer processors had the unconditional versions.

That was one of the hacks in those days: finding flags that are guaranteed to be set or cleared to use as the condition for a branch instruction. I remember that I always tried to avoid JMPs.


On 6502, JMP uses 3 cycles. A taken branch uses 3 cycles if no page boundary is crossed, and 4 cycles if a page boundary is crossed. So there, JMP is "faster".
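To make the size/speed trade-off concrete, here is a small Python sketch of the numbers being discussed. It uses the standard NMOS 6502 timings (absolute JMP: 3 bytes, 3 cycles; relative branch: 2 bytes, 2 cycles if not taken, 3 if taken, 4 if taken across a page boundary). The function name `branch_cycles` is just for illustration.

```python
# Cycle and size comparison for JMP vs. a relative branch on the 6502.
# The page-crossing check compares the address of the instruction after the
# branch (pc + 2) with the branch target.

JMP_BYTES, JMP_CYCLES = 3, 3   # JMP absolute
BRANCH_BYTES = 2               # opcode + signed 8-bit offset

def branch_cycles(pc, target, taken=True):
    """Cycles used by a relative branch located at address pc."""
    if not taken:
        return 2
    next_pc = (pc + BRANCH_BYTES) & 0xFFFF
    crosses_page = (next_pc & 0xFF00) != (target & 0xFF00)
    return 4 if crosses_page else 3
```

So an always-taken branch saves a byte over JMP and costs the same 3 cycles, unless it happens to cross a page, in which case it is one cycle slower.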


OK, on the 6502 you might be right (if your numbers are correct). But the 3 cycles might have been because a decision based on the flags had to be made -- so a newer processor could have made an unconditional branch faster.


Perhaps wasting a single byte was regarded as a bad programming choice... :)


There is also another reason for not wasting "a single byte":

Branches are the only jumps with a condition. So, when you have a far destination that is more than 127 bytes away, you have to do this:

   Want to do:
            BCS far_dest

   Have to do:
            BCC no_jump
            JMP far_dest
   no_jump: ...
So, when you have wasted too many "single bytes", you may end up adding 3 bytes and at least 3 cycles to your conditional jumps.
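Here is roughly how an assembler (or a careful human) could make that choice, as a Python sketch. The offset is measured from the address after the 2-byte branch and must fit in -128..+127; otherwise the condition is inverted and a JMP is placed behind it. `emit_conditional_jump` and the output format are invented for the example.

```python
# Pick between a short branch and the "inverted branch + JMP" sequence.
# Branch offsets are relative to the instruction following the branch.

INVERSE = {
    "BCC": "BCS", "BCS": "BCC", "BEQ": "BNE", "BNE": "BEQ",
    "BMI": "BPL", "BPL": "BMI", "BVC": "BVS", "BVS": "BVC",
}

def branch_offset(pc, target):
    """Signed distance from the end of a 2-byte branch at pc to target."""
    return target - (pc + 2)

def emit_conditional_jump(op, pc, target):
    """Return (instructions, size_in_bytes) for 'op target' placed at pc."""
    if -128 <= branch_offset(pc, target) <= 127:
        return [f"{op} ${target:04X}"], 2
    skip_to = pc + 5   # skip over the 2-byte branch and the 3-byte JMP
    return [f"{INVERSE[op]} ${skip_to:04X}", f"JMP ${target:04X}"], 5
```

For a branch at $1000, a target of $1050 stays a 2-byte `BCS`, while a target of $2000 turns into `BCC $1005` followed by `JMP $2000`, 5 bytes in total.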


When you are space constrained, there is no ":-)" on that.

-- veteran of several game cartridges for the 6502


That is correct. Also, consider how little space was available. On Commodore machines, BASIC had to fit into 8K+ (together with the "Kernel" it had 16K of ROM, and the Kernel needed ~7K) -- so space was really scarce, and a bigger ROM would have made the computers more costly.


Additionally, a branch (relative jump) is useful for making code that can be relocated in memory. You can't do that with an absolute jump.


As I recall it was written in CROSS (http://pdp10.nocrew.org/its/its.os.org/ai/info/cross.doc)


There's a whole pile of references to the "MACRO-10 assembler" such as this:

"Paul Allen wrote the macro package for the MACRO-10 assembler"

In the part below it.


I read those sentences as well, but as I recall (and this was a long time ago!) it was written in CROSS which ran on the PDP-10 and borrowed a lot of syntax from Macro 10 but Macro 10 wasn't actually used to run CROSS. But I accept that I might be confusing it with Altair BASIC which was the progenitor.


While it's interesting to get the original comments, it didn't take long before there were a number of books published with full disassemblies of both the BASIC and kernel ROMs with extensive comments, full of disclaimers about how they were "just for reference". It's kinda hard to prevent extensive reverse engineering when each of the two ROMs was just 8KB...


I did this for the 6809 version of Microsoft Basic as used in the Dragon 32 (color computer clone made in the UK). It took the better part of a year before I fully understood how it worked. Reverse engineering is hard work, not as much as writing something but still quite a bit of effort went into it.

I learned a lot from that, on a high level how a language implementation works, on a lower level how things like arrays, tokenization, memory management, strings and statement dispatch worked. The hardest part to get my head wrapped around by far was the expression evaluator.

But once understood you could do all kinds of nifty things such as extend the language and call bits and pieces of the interpreter from assembly to perform some task, it instantly became like a library accessible from any machine language program (no such thing as upgrading your ROM).


Commodore famously licensed Microsoft BASIC on a "pay once, no royalties" basis, and independently kept developing it for its 8-bit line of computers. If you're interested in its history David Viner has posted the source for version 4.75 from 1982 with several enhancements:

http://www.davidviner.com/cbm9.html


Interesting. Perhaps this BASIC reached as far as the C=64, 128 and Plus/4 too?

BTW AmigaBASIC must have been a very interesting affair too. I think Commodore's agreement with Microsoft was somehow broken after release; it was never updated, it was never really integrated into the system - Intuition, the co-processors, the famous HAM mode, etc. So it left everyone unsatisfied. And it was slow and bloated. There was a compiler for it, the AC-Basic, which I successfully used for a short while, until AmigaBASIC was literally eclipsed by AMOS.


Yes, BASIC 3.5 (264 machines) and BASIC 7.0 (C128) were further developments, as well as the unreleased 3.6 (LCD) and 10.0 (C65): http://en.wikipedia.org/wiki/Commodore_BASIC#Released_versio...

AmigaBASIC was indeed a bit of a lame duck. I used HiSoft BASIC myself, which was backwards compatible but added a lot of features, including a compiler.


What is the provenance for this? How did it get released to the public?


From the bottom of the article under "Origin of the File":

  The source was posted on the Korean-language blog
  6502.tistory.com without further comment, in a marked-up
  format:...
  [ lots of interesting background here ]
  ...
  Given all this, it is safe to assume the file with the
  Microsoft BASIC for 6502 source originated at Apple, and
  was given to David Craig together with the other source
  be published.
So my guess is that it's potentially safe to look at, but that's about it. If you have any doubts, I'd recommend not reading the original post.


It should be safe to look at from an archaeological perspective. You're not going to get any competitive advantage out of this or achieve anything by publishing a derived work unless you have access to a time machine.


You know, in case you are working on any competing BASIC for some hot new microcomputer coming into the market later this year.


Or you know, you decide to bundle it in an emulator for fun not profit, then get sued by Microsoft.


distributing ≠ reading


I was responding to johansch's new train of thought. He is implying more than reading.


I didn't read it as implying more than reading. Reading alone could get you into trouble if you end up making something similar from scratch. AIUI/IANAL, there's not a legal requirement to implement competing products in a clean room fashion, but it's easier to defend against a lawsuit if your implementers never read the competing code.


Yes I would agree. There is reading as in opening the link and thinking what the hell, glad I don't have to code in assembler. (What I did).

Then there is reading as in opening the link, examining the code, deeply understanding it, then going off and writing your own one 'from scratch'.


I learned how floating point arithmetic worked by disassembling the ROMs on my Commodore PET as a child. There I learned that they used a Taylor series expansion to compute the values of transcendental functions. It was fun to see the coefficients in the source.
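As a small illustration of the idea (not the ROM's actual routine or its coefficient table), here is a Python sketch that evaluates sin(x) from a handful of Taylor coefficients using Horner's rule, which is the usual way to evaluate such a polynomial with few multiplications. The coefficients are computed on the fly with `math.factorial`; the function names are invented for the example.

```python
import math

# Evaluate sin(x) from Taylor coefficients with Horner's rule.
# ASSUMPTION: plain Taylor coefficients (-1)^k / (2k+1)!; the coefficients
# stored in the actual BASIC ROM are not reproduced here.

def taylor_sin_coeffs(terms):
    """Coefficients c_k of sin(x) = x * (c_0 + c_1*x^2 + c_2*x^4 + ...)."""
    return [(-1) ** k / math.factorial(2 * k + 1) for k in range(terms)]

def poly_sin(x, coeffs):
    """Horner evaluation in x^2, then one final multiply by x."""
    x2 = x * x
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x2 + c
    return x * acc

COEFFS = taylor_sin_coeffs(6)   # six terms, up to x^11
```

With six terms the result agrees with `math.sin` to well under one part in a million for inputs between -pi/2 and pi/2; a real implementation would first reduce the argument into that range.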

When I wrote a set of 80 bit IEEE Temporary Real floating point functions for the 6502, I used an early version of Maple running on a VAX that I, ahem, "acquired" an account for running at a local university to compute the coefficients for my functions.

I grew up with this code; it's so awesome to finally see the source!


    LDWDI	WORDS		;MORE BULLSHIT.


Looks like there is a PHP instruction.

    TABER:  PHP                 ;REMEMBER IF SPC OR TAB FUNCTION.
            JSR     GTBYTC      ;GET VALUE INTO ACCX.
            CMPI    41
            BNE     SNERR4
also it appears right in the middle

    FRETMP: STWD    INDEX       ;GET LENGTH FOR LATER.
            JSR     FRETMS      ;FREE UP THE TEMPORARY DESC.
            PHP                 ;SAVE CODES.
            LDYI    0           ;PREP TO GET STUFF.
            LDADY   INDEX       ;GET COUNT AND
http://www.obelisk.demon.co.uk/6502/reference.html#PHP


Here's a nice overview of all the 6502 instructions: http://www.obelisk.demon.co.uk/6502/instructions.html


Did you try searching for "6502 php"? The first hit would have given you a pretty good overview of its operation. Assembler mnemonics are usually unsearchable without including the architecture in the query.


Push processor status (so saves processor flags on stack)


Push the processor status onto the stack.


Basic is the first programming language I learned!


me too, on a ZX Spectrum +3


me too. c64


Bill wrote well-commented, well-factored code.


Is the source for the 6809 available anywhere?


A friend of mine may have an annotated reversed source for the 6809. If you really need it I can put you in touch. You'll probably need to change the code a bit to assemble it on a 'standard' 6809 assembler, we rolled our own. He's a worse data packrat than I am, his house probably qualifies as a computer museum and I wouldn't be at all surprised if he still had a collection of files from those days.


Thanks, I think I read that source back in the day. Just asking out of nostalgia.


KIMROM=1

Hehe, that's been a while.

http://en.wikipedia.org/wiki/KIM-1


Ah - I've always ignored the existence of a 6501.



