That is kool. I was really into Forth in 1978 when I bought an Apple II, serial number was 72. I was going to port Forth to my Apple but two guys at Computer Land in San Diego dissed the idea, saying there were a few ports already. I was young and stupid and actually cared about what other people said. I ended up six months later going deep in every Lisp language implementation I could get access to, and that changed my tech life. Anyway, Konilo looks cool!
I remember typing a program into GraFORTH on my friend's Apple ][ as he read it to me. The resulting animation of the Earth and the space shuttle opening its cargo doors was amazing back then.
I give a little chuckle whenever someone mentions Esperanto. It's the official language in the TV show Red Dwarf, and Rimmer always struggles to use it. It's a bit of an ongoing joke.
100r have really set the bar for what stack language/VM I would actually play with if I had the time. I see how it can immediately be useful for writing GUI/music/etc apps. And the instructions/guides are sooo good.
OOI, this bytecode VM uses 30 instructions while something like Freeforth has 55 Forth instructions -- is it actually worth having a bytecode VM under the Forth VM?
Arguments for: you could migrate processes more easily, and I/O gets a standard interface. Against: it's not native code.
I'm struggling with defining the instruction set for my own forth-style VM.
I think the instruction set (the minimum number of primitives you need to implement in the "host", not in the language itself) is fascinating.
I've looked at a lot of implementations that have a DROP definition like this, for example:
: drop ( x y -- x ) dup - + ;
That works UNLESS your stack has only a single value on it. Things like this get simpler if you have a variable pointing to the top of the stack where you can just change that - but otherwise you end up having to implement DROP as a built-in.
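A toy model of why this matters, with the stack as a Python list (illustrative only, not any particular implementation): the built-in DROP only touches the stack pointer, while the `dup - +` trick has to read a second cell that may not exist.

```python
stack = []

def dup():  stack.append(stack[-1])
def add():  b = stack.pop(); a = stack.pop(); stack.append(a + b)
def sub():  b = stack.pop(); a = stack.pop(); stack.append(a - b)

def drop_primitive():
    # as a built-in, DROP is just a stack-pointer decrement
    stack.pop()

def drop_composite():
    # : drop  dup - + ;  -- only works with a second cell underneath
    dup(); sub(); add()

stack[:] = [10, 20]
drop_composite()
two_deep_result = list(stack)   # behaves exactly like DROP: leaves [10]

stack[:] = [10]
try:
    drop_composite()
    underflowed = False
except IndexError:
    underflowed = True          # the trick fails with one item on the stack
```

With two or more cells on the stack the composite version is indistinguishable from the primitive; with exactly one, the final `+` has nothing underneath to add to.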
You can obviously implement "<" in terms of ">", and if you can multiply by -1 you can implement subtraction in terms of addition, but at the same time writing "-" is no harder than writing "+", if you're not targeting a small CPU like a Z80.
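Those derivations sketched out (hypothetical helper names, Forth-style flags where -1 is true and 0 is false):

```python
def gt(a, b):
    # primitive > with Forth-style flags: -1 true, 0 false
    return -1 if a > b else 0

def lt(a, b):
    # : <   swap > ;
    return gt(b, a)

def negate(a):
    # multiply by -1
    return -1 * a

def add(a, b):
    return a + b

def sub(a, b):
    # : -   negate + ;
    return add(a, negate(b))
```

On a register machine both `-` and `+` are single instructions, so deriving one from the other saves nothing; on a chip without hardware subtract the derivation earns its keep.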
I love to compare implementations, to see which core primitives they've chosen, and explore the consequences. These days I think my interest is more in implementing FORTH than actually using it, but I suspect I'm not alone in that regard!
Obviously instructions can be chosen for efficiency, else we'd all be using the 7 instructions of sectorforth. In fact freeforth has additional instructions for register renaming (to avoid SWAP instructions).
So what's a good set? Most seem to have 25-30 (Mako, Retro, Jones, Stoneknife)
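To make the size trade-off concrete, here's a toy bytecode interpreter with a deliberately tiny opcode set (the names and numbering are hypothetical, not Retro's or Mako's actual instructions). Real kernels grow toward 25-30 opcodes mostly so that common words don't cost several dispatches each:

```python
LIT, DUP, DROP, SWAP, ADD, JZ, JMP, HALT = range(8)

def run(code):
    """Interpret a flat list of opcodes and inline operands."""
    stack, ip = [], 0
    while True:
        op = code[ip]; ip += 1
        if op == LIT:                     # push the next cell as a literal
            stack.append(code[ip]); ip += 1
        elif op == DUP:
            stack.append(stack[-1])
        elif op == DROP:
            stack.pop()
        elif op == SWAP:
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == ADD:
            stack.append(stack.pop() + stack.pop())
        elif op == JZ:                    # pop flag, branch if zero
            target = code[ip]; ip += 1
            if stack.pop() == 0:
                ip = target
        elif op == JMP:
            ip = code[ip]
        elif op == HALT:
            return stack

arith = run([LIT, 2, LIT, 3, ADD, LIT, 4, ADD, HALT])      # (2+3)+4
branch = run([LIT, 0, JZ, 7, LIT, 1, HALT, LIT, 2, HALT])  # takes the jump
```

Eight opcodes are enough to compute with, but note there's no fetch/store, no call/return, and no I/O yet; adding those is how you drift from 8 toward 30.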
I'm not sure I'd quite count this as Forth; it's more of a very simple monitor for loading and running code. It's not a bad approach for use with a host Forth though.
Not the intended audience, I guess. I wonder whether one can use this to learn Forth. The free-to-evaluate SwiftForth comes with the Starting Forth book. Could there be a tutorial like that? Those commands do not run here (and SwiftForth is 32-bit, so it doesn't run on my Arm MacBook Air 15).
Konilo is not an ANSI Forth. It's a unique dialect with similarities to my older RetroForth system, which is also not a traditional model.
In Konilo, that example would look like:
:variable d:create #0 comma ;
and using it would appear as:
'ORANGES variable
While much of the system is complete, documentation has been lacking, partly due to me not investing much time on this while things were evolving. Now that it's stable, I am working on this, and have added some information to the manual (see https://konilo.org/manual) to help clarify that it's not a traditional Forth model.
Thanks. It is great, as I can run it under my iPad with both Pythonista and even iSH. But the changed basic syntax will be hard for beginners. May I suggest adding some tutorials? In particular for both external use (say, vim or any external shell script) and internal use (how to use the editor and save the current work in progress). I found it hard even to navigate your hypertext, sorry …
I'm hoping to have the documentation considerably improved over the remainder of the year. Apart from the current work on the hypertext manual, a larger book format manual and tutorial are being worked on, though they're both very incomplete at present.
On the internal/external bits, Konilo is largely intended to be a self contained environment, but it is possible to use in some contexts with outside tools. I'll write up something on this in the next few days and will post a reply in this thread with a link once it's done.
It's interesting to contrast a "Forth OS" with what might be considered a more conventional one. By that I mean early small-machine systems like CP/M and its contemporaries.
At a fundamental level, you had a kernel of some kind, and free space within which programs could be loaded and executed. With a Forth, you could get that with about 6K of total overhead.
One of the big contrasts is that with Forth, at the time, programs were loaded from source code, whereas everyone else used binary code. But another interesting aspect was that in Forth, when you were done with a program, you had to free up the space, if necessary, to run another program. In something like CP/M, when a program was exited, the OS reclaimed the space, and, to run it again, the program would have to be reloaded from storage.
The next big difference was the file system (or lack thereof). Forth does not have a file system; it simply organizes storage as a sequence of 1K blocks. It was common to have "table of contents" screens to track where programs were stored on disk, and these all had to be manually maintained.
CP/M had a much better file system, but it's interesting to contrast it with the UCSD P-System, which had a very crude one. While it did have named files, it required them to be contiguous, and really only allowed one file to be written at a time (not exactly true, but only one file could grow freely at a time -- the file "at the end"). Consequently, reorganizing and compacting free space was a routine task under the P-System.
The Forth system being rock simple, it was quite easy to make a mistake and lose data: if you were relocating source code to make new space, for example, you could easily overwrite something accidentally, since you were responsible for the actual disk locations of the operation.
The final nice feature of the Forth system, speaking of disk blocks, was the crude virtual memory system. Here, disk blocks were arbitrarily mapped to memory buffers, with a simple LRU system, and dirty buffers automatically being flushed back to disk. This made persistent memory mechanics quite easy to use.
Of course, for most micro users this was not as hazardous, as most of the work was done at the "floppy" level in contrast to working on a hard disk with 5000 screens. But that was the reality of native Forth systems of the day, and folks just worked with it, and leveraged it to get work done.
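The block/buffer mechanics described a couple of paragraphs up (BLOCK, UPDATE, SAVE-BUFFERS) can be sketched roughly like this -- a simplified Python model, not the code of any real Forth, which would use a fixed array of 1K buffers:

```python
from collections import OrderedDict

BLOCK_SIZE = 1024

class BlockCache:
    """Toy model of Forth block buffering: LRU eviction, dirty write-back."""
    def __init__(self, disk, nbufs=2):
        self.disk = disk              # "disk": dict of block number -> bytes
        self.nbufs = nbufs
        self.bufs = OrderedDict()     # block number -> [bytearray, dirty]

    def block(self, n):
        """BLOCK: map block n into a buffer, evicting the LRU buffer."""
        if n in self.bufs:
            self.bufs.move_to_end(n)
        else:
            if len(self.bufs) >= self.nbufs:
                old, (odata, dirty) = self.bufs.popitem(last=False)
                if dirty:
                    self.disk[old] = odata        # flush dirty buffer
            data = bytearray(self.disk.get(n, bytes(BLOCK_SIZE)))
            self.bufs[n] = [data, False]
        return self.bufs[n][0]

    def update(self, n):
        """UPDATE: mark block n's buffer as modified."""
        self.bufs[n][1] = True

    def save_buffers(self):
        """SAVE-BUFFERS: flush all dirty buffers to disk."""
        for n, entry in self.bufs.items():
            if entry[1]:
                self.disk[n] = entry[0]
                entry[1] = False

disk = {}
cache = BlockCache(disk, nbufs=2)
buf = cache.block(5)
buf[0:5] = b"hello"
cache.update(5)
cache.block(1)
cache.block(2)    # evicts block 5's dirty buffer, flushing it to "disk"
```

The persistence is automatic in exactly the way described: touching enough other blocks causes a dirty buffer to be written back without any explicit save.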
Scaling those basic ideas up to a modern machine with GB of RAM and enormous hard disks is an interesting curiosity. It certainly gives you a lot of room to be sloppy. But such a system wouldn't happen: the frugal nature of those early Forth systems is no longer required, so adding complexity and space to make the userland more friendly is worth the investment.
This is an interesting idea but I'm not sure fitting in L1 caches would work that well for threaded Forths. Most CPUs align on word boundaries based on their pipeline size and unless you organize your Forth around both the L1 size and the instruction pipeline, you're gonna get huge pipeline stalls and pay branch predictor penalties. Also with modern processors you're gonna pay a huge hit with stack operations as opposed to using registers.
As far as using OO Forth, you can also just use specific words that offer a safe interface over more dangerous raw memory words. Old Forths probably worked directly with the filesystem more because of code size limitations than any real abstraction penalty.
The Forth base I was thinking of is Freeforth, which is an STC Forth that assigns registers to the top two stack elements to avoid SWAP instructions.
I take your point about caches, but surely a Forth like this that optimises the words during compile (inlining etc) would help?
I guess that this hypothetical STC Forth would have to have abstracted I/O and registers -- any spare ones outside the stack management could be used for data passing, maybe as the top of the data stack.
Of note, this cache friendliness would dictate the kernel wordset, so choosing it would be dependent on your cpu targets.
Really the OO is in support of vocabulary modules and data spaces. Freeforth has a cross-compiler environment for MSP430 CPUs -- as a module this would make hygienically writing to new targets easy.
I took a look at Freeforth. It's a pretty innovative Forth.
> which is an STC Forth that assigns registers to the top two stack elements to avoid SWAP instructions.
> I take your point about caches, but surely a Forth like this that optimises the words during compile (inlining etc) would help?
Certainly. The way Freeforth seems to be doing things is by storing the swapped state and by testing against that to make a branch. I'm curious how this works with branch predictor penalties, but if swaps are done in loops then it might actually work out very favorably. You need to be careful that your code doesn't use too much ROT or DUP but as long as it's mostly SWAP oriented it should be okay.
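My (possibly wrong) reading of the renaming idea, sketched below: the compiler tracks which of two registers currently holds top-of-stack, so SWAP flips a compile-time mapping and emits no instruction at all. This is an illustration of the technique, not Freeforth's actual code generator, and the register names are hypothetical:

```python
class Renamer:
    """Compile-time register renaming for the top two stack cells."""
    def __init__(self):
        self.tos, self.nos = "eax", "edx"   # hypothetical register names
        self.out = []                       # emitted instructions

    def swap(self):
        # SWAP costs nothing: just flip the compile-time mapping
        self.tos, self.nos = self.nos, self.tos

    def add(self):
        # ADD folds next-on-stack into top-of-stack using the current mapping
        self.out.append(f"add {self.nos}, {self.tos}")
        self.tos = self.nos                 # result register becomes TOS
        # (stack-depth bookkeeping is omitted in this sketch)

no_swap = Renamer()
no_swap.add()                # emits: add edx, eax

with_swap = Renamer()
with_swap.swap()             # emits nothing
with_swap.add()              # emits: add eax, edx
```

The runtime-branch flavour mentioned above only becomes necessary where two control-flow paths meet with different mappings; in straight-line code the rename is free.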
> I guess that this hypothetical stc Forth would have to have abstracted I/O and registers- any spare ones outside the stack management could be used for data passing, maybe as the top of the data stack.
Forth implementations are notoriously architecture specific for exactly this reason and the differences in registers and stack semantics between architectures. A lot of recreational Forths target abstract VMs exactly to be able to target a single architecture and not care much about performance, but if you want to adapt Freeforth to different architectures you may need to make deep changes. I'm not sure how FASM works but maybe FASM decreases some of the pain.
chuck moore is running colorforth as a process under windows these days because it's gotten too hard to write the damn device drivers
before that, though, he was using ram as his 'block file' (i think having reduced from 1024-byte blocks to 256-byte blocks) and only using the disk to make changes persistent. also he still recompiles programs from source every time he switches context from one program to another; the program he's switching from gets discarded, and the one he's switching to gets recompiled
i remember having to defragment my floppy under the p-system in order to make a big enough piece of free space for the file i wanted to copy onto it. i have the vague memory that cp/m supported non-contiguous files; hdos certainly did
while i don't intend to question your assertion that forth users at the time maintained their tables of contents blocks manually, it seems like you could implement a fat-like directory-maintenance system in less than one screen of code. like cp/m, you could maintain file sizes in blocks rather than in bytes (but 1024-byte blocks, of course); like some dec tape formats (?), you could use four bytes for the filename with a weird encoding (forth's native base 36 instead of dec radix-50). suppose you have a double-sided disk with 40 tracks of 9 512-byte sectors per side; that's 720 sectors in total, or 360 forth blocks. you can't fit a block number into one byte, so the simplest thing is to use two. so if you want to make fat-style linked lists, you need 720 bytes of block numbers, one entry per block holding the number of the following block in that file's chain. that leaves you 304 bytes for filenames and starting block numbers; if filenames are 4 bytes and starting block numbers are 2 bytes, you have room for 50 files, which is probably okay
so i feel like maintaining a table of contents automatically would have cost about 1% of the disk space (1 block for the code and 1 block for the directory and allocation table), so it would have been a good idea
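to check the arithmetic, and to model the chain walk that `index` would do, here's a quick sketch in python (same hypothetical disk geometry as above; the toy fat is just a dict):

```python
BLOCK = 1024
sectors = 2 * 40 * 9                 # 720 sectors of 512 bytes each
blocks = sectors * 512 // BLOCK      # 360 forth blocks
fat_bytes = blocks * 2               # one 2-byte entry per block
dir_bytes = BLOCK - fat_bytes        # what's left for the directory
max_files = dir_bytes // (4 + 2)     # 4-byte name + 2-byte start block

def index(fat, start, u1):
    """index ( d u1 -- u2 ): follow u1 links from the file's start block."""
    b = start
    for _ in range(u1):
        b = fat[b]                   # next block in this file's chain
    return b

fat = {7: 12, 12: 3}                 # toy chain: blocks 7 -> 12 -> 3
third = index(fat, 7, 2)             # third block of the file
```

so the whole allocation table plus directory really does fit in a single 1024-byte block, with 304 bytes left over for 50 six-byte directory entries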
the interface might look something like
open ( d u -- flag ) create file named d, size u blocks, returning 0 on success
extent ( d -- u ) return the number of blocks in file d, or 0 if nonexistent
delete ( d -- ) delete file named d if it exists
filename ( c-addr u -- d ) convert ascii filename to a double-precision number
.filename ( d -- ) print the filename d
file ( u -- d ) return the name of the uth file in the directory or 0 if none
maxfiles ( -- 50 ) return the maximum possible file index to pass to file
index ( d u1 -- u2 ) return the block number for block u1 of file d
and from there you can use the usual forth block words to read or modify the contents of the files, such as block, update, load, and save-buffers. things like refill and thru, which presuppose semantics for the sequencing of block numbers, might benefit from file-aware versions. for example, you might supplement thru with a word old which loads each block of a file, whatever sequence they're in on disk:
2variable oldfile
: old ( d -- ) 2dup oldfile 2! extent 0 ?do oldfile 2@ i index load loop ;
or i guess that, if you can trust the code being loaded to not leave crap on the stack between blocks, it might be justifiable to just write
: old ( d -- ) 2dup extent 0 ?do 2dup i index load loop 2drop ;
you might also want to list the directory of files:
: stat ( d -- ) 2dup or if 2dup .filename ." " extent . cr else 2drop then ;
: ls-l ( -- ) maxfiles 0 do i file stat loop ;
on further consideration, i think that interface requires about one and a half blocks of code to implement. this interface would be more useful and have a simpler implementation:
open ( d -- flag ) create file d, size 0 blocks, returning 0 if ok
extent ( d -- u ) return the number of blocks in file d, or -1 if none
delete ( d -- ) delete file named d if it exists
extend ( d -- flag ) append a block to file d, returning 0 if ok
filename ( c-addr u -- d ) convert ASCII filename to a double number
.filename ( d -- ) print the filename d
file ( u -- d ) return name of the uth file in the directory or 0
maxfiles ( -- 50 ) return the maximum valid file index
index ( d u1 -- u2 ) return the block number for block u1 of file d
In the Forth golden age there was a convention of grouping blocks in triads -- you could print one per page. E.g. when I proposed to rewrite the "full-screen-style" editor at FORTH, Inc., they were like "if you think you can get it down from 4 blocks to 3, then go ahead" so it wouldn't be wasting 2/3 of a triad. (You might remember this, it was the more-conventional editor they didn't use, contributed by a customer.)
So maybe make this filesystem work on triad units: an index fits in a byte, more space for longer filenames. Not sure what to do with the extra block in the filesystem triad.
that makes sense; i had seen forth printed that way but had forgotten about it. i think the code was printed side by side with the shadow screens (the comments), requiring 132 columns (4 of padding i suppose)
i think that, if you want to put the metadata into multiple blocks, you need to have some control over the order of buffer saving. one way to do this would be to have two copies of the metadata that you alternate between, which would be desirable anyway in case of power loss during metadata writes. maybe once every 30 seconds, say. after your last modification to replica tweedledee and before your first modification to replica tweedledum, you would call save-buffers, ensuring that briefly tweedledee is also consistent on disk (tweedledum already having been consistent), even if buffer evictions later put tweedledee into a torn state
then the question is how, upon startup, to know whether to use tweedledee or tweedledum? maybe an update serial number at the beginning and end of the block that has to match?
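one way the matching-serials idea could work, sketched in python (this ignores serial-number wraparound, which a real implementation would have to handle):

```python
def pick_replica(tweedledee, tweedledum):
    """each replica is (head_serial, payload, tail_serial).

    a torn write leaves the serials at the two ends of the block
    mismatched, so choose the newest replica whose serials agree."""
    def untorn(r):
        return r[0] == r[2]
    candidates = [r for r in (tweedledee, tweedledum) if untorn(r)]
    if not candidates:
        raise IOError("both metadata replicas are torn")
    return max(candidates, key=lambda r: r[0])

newest = pick_replica((5, "dee", 5), (6, "dum", 6))      # both fine: take 6
fallback = pick_replica((5, "dee", 5), (6, "dum", 4))    # dum torn: take 5
```

writing the serial last (after the payload) is what makes the head/tail match a usable commit marker under the ordering guarantees from save-buffers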
using a cluster size larger than the forth block size seems like it might complicate the calling interface somewhat
I wasn't proposing keeping file metadata in multiple blocks -- just, might as well consider if any functionality might be worth adding in the remaining block of the filesystem triad, if it's gonna be a triad (source code, metadata, empty). (This makes the larger minimum size of these triads a bit of an attractive nuisance from a "Forth Zen" pov...)
oh, yeah, that makes sense. you could probably fit most of the standard cp/m or unix filesystem utilities into the remainder; cp, rm, mv, cat, head, and so on. maybe nsweep
ilo could likely be made to work, though with some limitations.
From a quick reading of the linked system:
- ilo is little endian (for block & rom format); if you wanted to keep external compatibility, this would need to be dealt with. It's not difficult to do this; e.g., the 68k Mac version does this: fossils.retroforth.org:8000/ilo/file?name=vm/ilo-68k-mac.c&ci=tip
- this system presents some memory limits that would necessitate changes to ilo. Two aspects stand out to me.
First, separate code & data sections, with a limit of 64K each. ilo's standard memory area is a flat 65,536 32-bit words, needing 256K. This can be reduced, but will then not work with a standard Konilo rom. (The memory layout in the Konilo rom can be edited by modifying the assembly, but the overall reduction in available space will be very limiting for a full Konilo system. It'd be fine for using just the basic wordset in the rom or for assembly programs).
Second, memory is divided into 2K banks, so a bit of work in ilo may be needed to deal with that. The previously linked Mac version can deal with this as well.
- performance will be slow.
ilo is internally a 32-bit system. I've built & run it on an old 8088 MS-DOS system, but it's very slow there. I suspect it'd also be quite slow here as this is another 16-bit system.
I don't have the timings for this, but starting up a minimal Konilo system on an emulated Mac Plus (mini vMac @ 1x, emulating an 8MHz 68k CPU) takes 1m8s. It took quite a bit longer to start up on the DOS system.
Though again, this is when running Konilo. It'll be much faster with smaller assembly programs or things that do less i/o.
There is some limited graphics support. The underlying system (ilo) has a branch for x11 providing a limited framebuffer and pointing device support, and the native x86 system optionally supports a low resolution display (but no mouse yet). My son & I are working on expanding the graphics functionality, though we aren't expecting to have this work finished until later this year.
You might want to take a look at ForthStation, which runs on an ESP32 and outputs to VGA.
It's relatively simple to switch to SPI LCDs, as the underlying esp32forth compiles under the Arduino IDE.
That was more of a thing in earlier days (before my time) when Forth typically ran on the bare hardware, replacing the OS. Later, blocks vs files became a big debate in the Forth world, but with Forth now usually hosted under a conventional OS, files have mostly won out except among some die-hards. The ROM code for the GreenArrays processors, though, is still written using blocks. You can download it as a nicely formatted HTML file from greenarrays.com.
Here's a Usenet post from 2003 that discusses blocks:
The central thing Forth depends on is KISS. We've consistently rejected
complicated solutions when simpler solutions work. The result is fewer
co-adaptations, where you can't change one thing because it's built into
a bunch of other things. We do have some of that but not very many.
Giving up blocks removed a lot of them. Blocks gave an integrated
editor, and fast integrated compiling, and a sort of paged virtual
memory, and something you could easily turn into a somewhat inferior
database, etc. It wasn't the best solution for very much but it gave an
easy adequate solution for a whole lot. I read that using blocks was a
fundamental cause of the Valdocs failure so many years ago. They
started out with blocks and then ran into limits and instead of throwing
them out and starting over they tried to design around them and ran out
of time. Now we use blocks when the problem demands them instead of
wherever they look good enough. It takes longer and we probably get
better solutions. -- Jonah Thomas, comp.lang.forth, 2003-07-18: https://groups.google.com/d/msg/comp.lang.forth/7gWtjMoQXk8/Wr0yBdbZMWsJ
Filesystems introduce complexity that may not be needed, especially on smaller targets.
I'm personally not averse to file systems. I've implemented (but not yet published) a couple for testing purposes. It's probable that one or more of these will eventually be included in Konilo. But that won't happen until I write a program that needs this and have an implementation that I'm happy with.
There is a version that runs natively on x86 hardware (though it's a 32-bit system, aimed at older hardware; my newest x86 hardware is over a decade old and I've not had time to work much on a 64-bit system & drivers). See konilo.org/x86
As someone born after C and its derivatives took over the world, Forth has always been this "weird cool old language". It seems like Forth was this very beloved language by a small set of enthusiasts, but I'm not entirely sure why.
I also cannot quite tell if it's a high-level or low-level language, though maybe that's the point? It definitely seems like it might be worth learning for the retro computing development world. Is my understanding correct?
It's a high level language compared to Assembler. Like a lot of those early experimental languages it's worth learning a bit. It's also a pretty much complete failure other than a few standout uses (the main one I can think of is that the boot loader on Sun SPARC machines was written in Forth).
To me it fits in an odd place. It was much hyped/replicated on Micros in the 1980s as it made the Basic implementations on those machines seem very old fashioned. For most users who were already working on minicomputers in languages like C or Lisp, Forth was just a weird riff on stack based languages that had already been considered and discarded. So the general impression on whether Forth was any good or not depended entirely on what else you had programmed in.
Having said that, stack based languages like Forth, PostScript or even xslt are challenging and thought provoking.
Personally, if I wanted to play with a stack based language I'd play with PostScript or xslt as at least you'd have half a chance to use it professionally.
edit: yeah xslt isn't a stack based language but I always thought it helped when learning it to consider it that way
I would call it critically endangered, but would never call it a failure. Forth landed on a comet, and ran most astronomy labs and telescope control systems back in the day. For a couple decades there was a large amount of productive and unsexy work getting done in Forth, much of it under the radar in control systems for manufacturing, scientific tooling, waldos. It wasn't really until the mid 80s that it started drying up.
I've wanted to write some retro computer games for a while, but I really hate mucking with Assembly language. It would be cool to have something faster than BASIC but higher level than raw assembly.
There are new versions of the classic 80s computers, such as the Commander X16, ZX Spectrum Next, and C65. They're pricey, but I think at least two of them have Forths. It might all be easier to go with an emulator too.
As of right now, I'm happy enough to play with emulators and/or the MiSTer. I was initially excited about the Commander X16, but I've been a little disappointed with the concessions that the 8BitGuy made, and I feel I'll have enough fun with emulation.
There was a really neat looking FORTH for the C64 focused on game programming: White Lightning[0]. Oh, it runs on the ZX Spectrum too... I always wanted to play with it as a kid, but never got around to it :-/
i think the great appeal of forth is that it's a great repl for interacting with hardware and assembly code. c kind of sucks at this even though c is a better language for programming as such
it's the minimal amount of extra stuff you need to add on top of assembly language to give you recursive subroutines with parameters; structured control flow constructs like if-then-else (ahem, if-else-then), for-next (uh, do loop), and while; user-defined data types; closures (does>); turing-complete compile-time metaprogramming; embedded domain-specific languages; multitasking; and paged virtual memory. all this costs you about 4k of ram and (usually) a 5× slowdown compared to assembly. nowadays some of those are commonly provided by hardware, but forth has them even on hardware and operating systems that don't
in forth you can very easily invoke your assembly code interactively, look at the results, hex-dump parts of memory, script your assembly code, script your hex dumping, script your scripts, and so on. and it makes a fantastic macro assembler if you can get used to the syntax. the built-in repl gives you a built-in user interface, and it also serves as a sort of interactive debugger; you can't step through your code line by line, but you can invoke it word by word, which is pretty close to the same thing
i did this screencast of debugging an integer square root routine in gforth three months ago https://asciinema.org/a/621404 but i should offer two cautions:
- i'm no forth expert
- if you actually need square roots in gforth you should probably just use fsqrt. 1024 0 d>f fsqrt f>d d. gives you 32 as ahura mazda intended
i haven't tried the 'three-instruction forth' approach myself (which i had misremembered as being due to brad rodriguez rather than frank sergeant) but that paper was definitely a major factor in forming my viewpoint that forth is not really a programming language but rather an interactive environment that happens to have a programming language (a language the 'three-instruction forth' omits!)
you probably do need a fourth 'instruction' in practice: reset, or at least some kind of interrupt that returns you to the interpreter when you're in an infinite loop. usually there's a separate reset pin, though, which leaves memory intact so you can debug what went wrong — or, at worst, set up the thing so you can power-cycle it. usually that goes without saying, but sometimes hardware is physically far away these days...
Forth's niche was embedded control using small 8- and 16-bit computers with 8 or 16 kilobytes up to 64 kilobytes of memory. Its heyday was the 1970s and 80s, when those were the largest, most powerful computers that were economical for embedded control applications. As larger, more powerful computers with more elaborate software became affordable, Forth's niche shrank away and the language became an obscure curiosity.
In its heyday Forth delivered impressive functionality on very small systems.
You could have an interactive development system with a command interpreter, editor, assembler, compiler and the application on an 8 bit microcomputer with less, sometimes much less, than 64 kilobytes of memory.
Playing with it on a Commodore 64 emulator, I can kind of see the appeal. Writing equivalent programs in BASIC vs White Lightning Forth, the Forth version is substantially faster, without being that much more difficult than BASIC once you figure RPN out.
I used an HP calculator in high school and use an HP emulator on my iPhone, so I'm already pretty good with RPN, so the weird syntax is coming fairly natural to me, and it's substantially easier than doing everything in raw assembly.
It's a very good language to play with to get a glimpse of "just the computer". It's still an interpreter environment, but it's one that crashes spectacularly because you accidentally swapped your arguments and wrote to address 3.
The most fundamental difference in feel when programming Forth for an application vs C is that you don't have a stack frame. A stack frame is what lets you throw 10 arguments and 20 local variables into a function, because the compiler binds them all to names and you fire away and then it's cleaned up automatically when you're done.
In Forth, you do not want to do that, because every stack variable you use, you have to manually bring to the front with stack manipulation words (e.g. SWAP to flip the positions of the top-most and second items, ROT and -ROT to shuffle the top three). It's a pretty big deal, and if you let that define your application, you're cooked. You can define temporary buffer or register spaces instead and do some loads and stores between those and the stack to get through the nastier bits of computation.
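The shuffle words just mentioned, modeled with the end of a Python list as the top of the stack (illustrative only; a real Forth does this in place on its data stack):

```python
def swap(s):
    # SWAP: a b -- b a
    s[-1], s[-2] = s[-2], s[-1]

def rot(s):
    # ROT: a b c -- b c a  (third item comes to the top)
    s.append(s.pop(-3))

def minus_rot(s):
    # -ROT: a b c -- c a b  (top item buried two down)
    s.insert(-2, s.pop())

a = [1, 2];    swap(a)
b = [1, 2, 3]; rot(b)
c = [1, 2, 3]; minus_rot(c)
```

Getting the third argument of a word into position already takes a ROT; a fourth requires still more shuffling (or the return stack), which is why deep stack pictures are the thing to design away.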
But in exchange for that, you get some fascinating possibilities to do tacit (point-free) programming instead. That is, because you've removed names from everything except word definitions (which are just sequences of calls to other words), your stack operations now compose together without needing any additional ceremony. And because you can instruct the interpreter to switch modes between "compile" and "execute" at will, you have a Lisp's degree of metaprogramming, so things like strings and formatted numbers are implemented as modes that jump out of the regular interpreter and wait for an end character to show up in the buffer. This extends even to stuff like IF ELSE THEN (control flow is negotiated with the help of a separate stack).
So it's wildly powerful in particular ways, but boring algorithmic code is often just as hard to write as the metaprogramming stuff, because every time, you have to work through your stack positions and try to break it down to a reasonable composition. And by the time you do that, you have a ton of confidence in having written a very exact spec, but also, the thing you did was probably something like "print a fixed point number".
So it's a very good language to use to make you detail-oriented and build up a personalized way of working. It really doesn't lend itself to flinging things around and telling the rest of the team "the code is self-documenting".