Have you made or plan to make any contributions to Mezzano (https://github.com/froggey/Mezzano) or are you mainly interested in seeing how far you can take this thing on your own?
I didn't know about Mezzano until now and have never contributed to it. Massive respect to them for what they accomplished. I don't think I have enough knowledge to contribute to a real operating system project like that right now. My experience with Linux drivers is just one small user space driver for my laptop's keyboard LEDs. So for now I think I'll see how far I can take lone.
What’s the minimum kernel version required? (No need for an exact answer, I just want to know if it’s “in the last 3 years” vs “in the last 10 years” etc.)
And is it possible to resolve network names or do anything network related? (Or is it planned?)
I’m always looking for some way to create portable Linux binaries, and I happen to like Lisps. Right now, my best bets are Janet compiled against musl libc or maybe ECL… or just use Python (distributed as source)…
These are the only headers that lone currently requires. When lone is built, a script will read all the system calls defined in those headers and create a table mapping symbols to system call numbers. This is so that you can write symbolic code such as:
(system-call 'write fd buffer length)
Instead of:
(system-call 1 fd buffer length) ; write is system call 1 on x86_64
Once compiled, however, it should work with literally any version of Linux. The system call primitive can issue any system call given its number and its parameters. If an unsupported system call is made, Linux will simply return -ENOSYS which can be handled at runtime and should not crash the program.
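A minimal C sketch of that symbol-to-number table (hand-written here for illustration — lone generates its real table from the UAPI headers at build time, and the names below are hypothetical):

```c
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>   /* SYS_read, SYS_write, SYS_getpid, ... */

/* Hypothetical entry type; the generated table maps symbol names
   to the system call numbers of the target architecture. */
struct syscall_entry { const char *name; long number; };

const struct syscall_entry syscall_table[] = {
    { "read",   SYS_read   },
    { "write",  SYS_write  },
    { "getpid", SYS_getpid },
};

/* Resolve a symbol such as 'write to its syscall number; -1 if unknown. */
long syscall_number(const char *name)
{
    for (size_t i = 0; i < sizeof(syscall_table) / sizeof(*syscall_table); ++i)
        if (strcmp(syscall_table[i].name, name) == 0)
            return syscall_table[i].number;
    return -1;
}
```

A number the running kernel doesn't implement simply comes back as -ENOSYS, which the caller can handle like any other error.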
> I’m always looking for some way to create portable Linux binaries, and I happen to like Lisps.
I have this vision in my mind: embedding lone modules into sections of the lone ELF and shipping it out. Zero dependencies, self-contained.
Linux passes processes a pointer to their own ELF header via the auxiliary vector. Lone already grabs this data and puts it into a nice lisp object. Now I just need to code that feature in somehow: parse the header, find the sections, read them into a lone module and evaluate...
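A hedged sketch of that lookup, using glibc's getauxval for brevity (lone itself reads the auxiliary vector directly from the initial stack, without libc):

```c
#include <elf.h>        /* Elf64_Ehdr, ELFMAG, SELFMAG */
#include <string.h>
#include <sys/auxv.h>   /* getauxval, AT_PHDR */

/* Locate our own in-memory ELF header via the auxiliary vector.
   Assumes the program header table immediately follows the ELF header
   (e_phoff == sizeof(Elf64_Ehdr)), which holds for typical toolchains.
   Returns NULL if the magic bytes don't check out. */
const Elf64_Ehdr *find_own_elf_header(void)
{
    const char *phdr = (const char *) getauxval(AT_PHDR);
    const Elf64_Ehdr *ehdr = (const Elf64_Ehdr *) (phdr - sizeof(Elf64_Ehdr));

    if (phdr && memcmp(ehdr->e_ident, ELFMAG, SELFMAG) == 0)
        return ehdr;

    return NULL;
}
```

From the header you can walk the section headers, find the embedded module sections, and hand their bytes to the reader.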
SBCL lets you dump core images which, if you set up your system properly, can be made executable using sbcl as the interpreter, much like /bin/sh and shell scripts.
> Lone is a freestanding Lisp interpreter designed to run directly on top of the Linux kernel with full support for Linux system calls. It has zero dependencies, not even the C standard library.
Cool project! Not sure if I'm going to start using it any time soon, but cool nonetheless.
This is interesting. Although, I think it's always useful to point out that runtimes that directly make system calls will have portability issues in other *nix systems. For instance, OpenBSD has recently restricted system calls to a single offset of their libc in order to reduce ROP attacks.
If Lone wishes to be portable, it will need to consider a libc dependency as an option. If not, it can probably get away with direct syscalls on Linux unless Linux kernel developers decide to add support for pinning system calls. I doubt that this would ever be a hard requirement in the kernel, as it would break userland all over, but I could see this being an option for certain distributions as a ROP mitigation. Many features like these flow from OpenBSD to other OSes.
You're absolutely right, it is not portable to other operating systems. I have written about this portability and system call stability here on HN a few times, at some point I decided to compile all the information into a post on my website.
I started lone (and previously a liblinux library) in order to make applications directly targeting the Linux system call binary interface. I chose Linux precisely because it's the only one with a stable interface.
I currently have no plans to make lone a portable programming language.
Many projects start this way. But, as per my comment, the assumption that direct syscall support will be maintained in future Linux distros is also risky.
I worry about that risk as well. I assume that even if Linux were to introduce a mechanism for system call authentication, it would be something lone would be able to use to mark its system call primitive as allowed.
That's a subtle point though. The kernel can't change defaults that break userland, nor can it change or eliminate features that would cause a breakage in userland. But, the kernel can certainly add an optional feature, like syscall pinning, that distributions can enable -- in userland -- to restrict userland. We see this already with seccomp policies meant to restrict and potentially break userland programs that misbehave.
All that Linus guarantees is that, by default, the Linux kernel has no regressions that impact user code. If distributions enable breaking changes through syscalls or sysctls, that doesn't violate any of the rules imposed on Linux. syscall pinning -- if that becomes a thing in Linux -- is something that distributions would enable in order to mitigate ROP attacks.
I think this decision from OpenBSD will more likely discourage developers from even considering it a supportable platform for software that originates in Linux land.
Direct syscall access is not something that is guaranteed in Unix derivatives. Linux is rare in that it provides a stable syscall API. Source compatibility is often only guaranteed when linked against libc or an equivalent low-level runtime library.
And this is generally a bad pattern unless those libc equivalents are services you call (like syscalls) and not a library you have to import or call through FFI. Requiring importing a library, probably from another language, is not a good alternative to syscalls.
A bad pattern according to whom? Most language runtime libraries import other system libraries as needed. For better or for worse, libc is typically considered to be a system library. It's something that every distribution or Unix flavor provides that is guaranteed to work within the POSIX standard for interfacing with the operating system. It's up to the distribution maintainers to make that happen, even if they tweak things to support syscall pinning or seccomp rules.
Userland directly calling a stable syscall API is a rare thing outside of Linux, and there is no guarantee that it will last forever even in Linux given the latest attacks. With modern ROP mitigations like syscall pinning, it will in fact be more dangerous to make syscalls directly -- if allowed in your distribution -- than it would be to call the minimal footprint of libc required to bootstrap a high level language runtime.
Of course, with special pleading, it could be possible for distribution or OS maintainers to carve out an exception for syscall pinning for a particular language runtime. Ask Go how that's going for their OpenBSD port.
The problem with system libraries requiring importing them as C libraries isn't new and doesn't seem to be going away. It has caused all sorts of problems for alternative languages over the years; it seems like an alternative would give all of computing a giant boost by allowing different models that don't work well with C. Stabilizing and standardizing the syscall interfaces would be one way to accomplish this and is the closest thing we have to it now. Implementing syscalls as a separate service might also work, but then you have the IPC overhead. That might not be as bad though, as we'll end up with something like that anyways as requirements ramp up for C to have its own runtime (eg. https://dslab.epfl.ch/research/cpi/).
For most language runtimes, the minimal requirement for libc integration is to cover the standard Unix calls (unistd) which don't require specific memory management and typically just pass buffers directly from the caller to the kernel. For most of the system calls in which a high level language runtime would be interested, the libc code is largely a direct pass-through already. As such, either directly using the system calls or calling them through libc will have negligible impact on how the high level language chooses to model these things.
libc isn't really getting in the way here.
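To illustrate the pass-through point: writing a buffer through glibc's write() and through the raw system call hands the same bytes to the kernel and yields the same result (a sketch; error handling elided):

```c
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* write, syscall, pipe */

/* Write the same buffer to fd twice: once through the libc wrapper,
   once via the raw system call. Both paths pass the buffer straight
   to the kernel; returns the byte count if they agree, -1 otherwise. */
ssize_t write_both_ways(int fd, const char *buf, size_t len)
{
    ssize_t via_libc = write(fd, buf, len);              /* libc wrapper  */
    ssize_t via_raw  = syscall(SYS_write, fd, buf, len); /* direct call   */
    return (via_libc == via_raw) ? via_libc : -1;
}
```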
Perhaps POSIX might come up with an alternative library to wrap system calls in the future, but I would suspect that it would probably be written in C on most platforms, or at least using a C compatible ABI. So, even if a platform chose to use Rust with a large littering of unsafe all over to make it work with the kernel, it would still have to be able to be linked with C userland.
It depends. For the standard set of system calls, the libc is pretty great. For Linux-specific features, it could take years for glibc to gain support, if it ever does. All the libcs will get in the way if you try to use something like the clone system call:
My obsession with Linux system calls started years ago when I read about an episode where glibc literally got into Linux's way. LWN wrote extensively about the tale of the getrandom system call and the quest to get glibc to support it:
> maybe the kernel developers should support a libinux.a library that would allow us to bypass glibc when they are being non-helpful
That made a lot of sense to me. I took that concept and kind of ran with it. Started a liblinux project, essentially a libc with nothing in it but the thinnest possible system call wrappers. Researched quite a bit about glibc's attitude towards Linux to justify it:
The more I used this stuff, the more I enjoyed it. I was writing freestanding C and interfacing directly with the kernel. The code was so clean. No libc baggage anywhere, not even errno. And I knew this could do literally anything when I wrote a little freestanding program to print my terminal window dimensions. When I did that I knew I could write code to mount disks too if I really wanted to. I was talking to the kernel.
Eventually I discovered Linux was already doing the same thing with their own nolibc.h file which they were already using in their own tools. It was a single file back then, by now it's become a sprawling directory full of code:
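That terminal-dimensions program boils down to a single ioctl system call. Here is a hedged illustration using glibc's syscall() wrapper rather than truly freestanding code (lone issues the call without libc, but the request constant and struct come straight from the UAPI headers either way):

```c
#include <sys/ioctl.h>     /* TIOCGWINSZ, struct winsize */
#include <sys/syscall.h>   /* SYS_ioctl */
#include <unistd.h>        /* syscall, STDOUT_FILENO */

/* Ask the kernel for the terminal size via a raw ioctl system call.
   Returns 0 on success, -1 on failure (e.g. ENOTTY when stdout is
   redirected to a file or pipe instead of a terminal). */
int terminal_size(unsigned short *rows, unsigned short *cols)
{
    struct winsize ws;

    if (syscall(SYS_ioctl, STDOUT_FILENO, TIOCGWINSZ, &ws) < 0)
        return -1;

    *rows = ws.ws_row;
    *cols = ws.ws_col;
    return 0;
}
```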
> there is no guarantee that [syscall stability] will last forever even in Linux given the latest attacks
That's true, but what of it? Linus won't last forever, Linux won't last forever, computers won't last forever, and Homo sapiens won't last forever. Everything needs maintenance sooner or later. "The Rockies may crumble / Gibraltar may crumble / They're only made of clay."
What you say is true, but you've inserted an inaccurate context with the quote.
There is no guarantee that _direct access to system calls_ will last forever...
Stability in the syscall API exists because Linux is a kernel that supports multiple distributions. Not because random applications could call it. The latter is an emergent feature, but not one that distribution maintainers will necessarily respect. POSIX only guarantees access to functions in libc that can perform these calls. As mentioned elsewhere in this thread, there are specific reasons why future direct access to system calls in user code could be restricted. Whether they will or not comes down to how distribution maintainers decide to deal with syscall related ROP gadgets.
To be fair, linking to kernel32 is a bit different than linking to msvcrt, but yeah, it’s Linux who’s the slightly insane person in the room, not the other way around.
So I guess the difference from Janet is no C standard library dependency; is this being targeted for hyper-slim and embedded uses? Because Janet seems like a good choice for most desktop and server cases.
I think it would be really interesting to load the init system (scheme-based GNU Shepherd in this case) directly from the kernel instead of ever loading a shell environment. I bet we could factor out a lot of cruft from the current GNU OS implementation that way, especially if you are managing your system environment/configuration declaratively via scheme/lisp.
I've heard about it and read discussions about it here on HN. I've never used it or learned the system.
> load the init system directly from the kernel instead of ever loading a shell environment
That's essentially what I want to accomplish with lone, and what I wanted to inspire others to do.
Can I help with that endeavor somehow? It's my understanding that GNU has a huge focus on portability, I assumed they would not be receptive to my Linux first approach. Perhaps the requirements are different for GUIX?
I haven't really given GUIX an honest try yet. So far, I've been content to use NixOS. I would definitely recommend either, so long as you are patient enough to learn their unique idiosyncrasies.
GUIX does claim very broad compatibility, particularly with kernels. You can even use HURD instead of Linux if you really want to.
On the other hand, the GUIX project is ideologically opposed to proprietary software, so you won't find much help in that arena.
It's certainly what I had in mind when I started the project. Writing a full Lisp operating system is extremely hard, better to take advantage of Linux and its drivers so as to avoid spending an entire lifetime recreating them.
It's my understanding that a true Lisp machine would have hardware support for Lisp abstractions at the instruction set level, so I don't think the concept would apply to lone. I would be seriously honored if people considered it one though, even if only in spirit.
Well, if by lisp machine we understand a processor that can run native lisp, of course not. But I was dreaming of a modern lisp machine, and this is the best that can practically be made.
Very cool project. Not commenting on that. Lisp and its derivatives often hit the front page of HN and I always wonder why. What is it about lisp that is so powerful? So much so that some see it as the platonic ideal of programming languages, or so it seems?
I have a lot of respect for lisp and its heritage. Respect is certainly one reason why I chose to write a lisp.
Other reasons include simplicity, practicality and ease of implementation. Lisp has a very simple syntax and it is relatively easy to parse it and implement a basic interpreter. I wrote the lexer and the parser by hand, there was no need to mess around with parser generators.
Another reason is I've come to see lisp as something of a frontend for C data structures. I have a byte buffer, encoded text, linked lists, resizable arrays, hash tables... Lisp is the language that binds them all together.
Another reason is that I knew how powerful lisp was despite the simplicity. Despite being a small project, lone is already metaprogrammable with FEXPRs. It turned out I needed exactly one bit in order to give lone macros.
if (function->function.flags.evaluate_arguments) {
    arguments = lone_evaluate_all(lone, module, environment, arguments);
}
It just doesn't evaluate the arguments if it's a macro. The function gets the lists that represent the expressions of each argument instead of the final value they compute to. And those lists can be manipulated just like any other list.
I think that was the moment I got the fabled enlightenment they say lisp programmers experience. It just brings a smile to my face.
Well, Y Combinator, the domain of HN, is intimately related to Lisp, as is the founder, Paul Graham. That is how I landed on this site to begin with, and I assume it's so for many around here. That could explain it a little bit.
Lisp code is made of lists. And lists are the main data type for lisp programs. So you can naturally produce and transform lisp code using lisp code. Not all lisps allow that, though. I think that's the main differentiator, and lisp is unique in that aspect.
So it's like generating JS code text from JS, but working with JS code as a text is much more confusing compared to working with lisp code as a list.
Beautiful. Also, after 30+ years of writing C and C++, I learned one thing by casually browsing the source code: you can use a preprocessor macro in an #include statement. Thanks.
I used it to include architecture-specific source code and also a generated C file containing a table of Linux system calls defined by the Linux UAPI lone is compiled against.
The makefile defines those macros by passing flags:
Oh so it's a GCC-specific thing... that explains it. In any case, congratulations on the lisp! It's beautiful code. I wish my current codebase was that beautiful. I mean, it's a lot of us contributing to it so the beauty is a kind of "meeting of the minds" situation... but still :)
PS: My current project implements a lisp... rendered as JSON. Some other angle on beauty...
Writing your software to directly use the syscalls of a specific kernel does not make it "zero dependency", it makes it "one dependency" - and non-portable.
TBH I have mixed feelings about this approach. It's true that this is more or less what Go or Cosmopolitan libc do, but the motivation in their case is to maximize portability (by making cross-compilation trivial). However, when you #include <linux/...>, you not only make your software non-portable, you also make it a PITA to cross-compile as you need the kernel headers on the host machine.
In contrast, with Go or cosmo, I can trivially build a tiny /sbin/init for amd64 Linux, pack it up with cpio, and run it with qemu-system-x86_64 -initrd - all from a Mac/arm64 host.
> However when you #include <linux/...>, you not only make your software non-portable, you also make it a PITA to cross-compile as you need the kernel headers on the host machine.
Yes, that is certainly a problem that I need to solve.
I added some support for cross compilation in the makefile. It currently requires clang for that.
ifdef TARGET
ifndef UAPI
$(error UAPI must be defined when cross compiling)
endif
TARGET.triple := $(TARGET)-unknown-linux-elf
override CC := clang -target $(TARGET.triple)
else
TARGET := $(shell uname -m)
endif
With this, I was able to cross compile lone for x86_64 from within the Termux environment of my aarch64 smartphone. All I had to do was obtain the Linux user space API headers for x86_64. Getting those headers was somewhat annoying but doable; there are packages for them in many Linux distributions.
I made a Termux package request for multiplatform Linux UAPI headers specifically so I could cross compile lone but unfortunately it was rejected.
Surely it's more than one dependency. For example, you'll need a processor. Not just any processor, but a processor for which a C compiler has been written.
Like sibling comment points out, the CPU and the rest of the universe can be considered indirect dependencies. Once you have everything you need to boot the Linux kernel (e.g. laws of physics, paid the power bill...), you're good to go ;)
For that matter you’ll need a universe, the physics of which must allow both semiconductive metals and the eventual evolution of multicellular biochemistry.
At that point we really get into semantics... But now that you mention it, I would be interested to see if it could be built with APE to benefit from their bare metal support.
Yes. It has zero runtime dependencies but development currently assumes GNU tools. The test suite for example is entirely written in bash and uses GNU coreutils. I submitted a patch to coreutils to allow env to set argv[0] of programs specifically so I could use it in my test suite.
Currently lone is a single C source file. It could easily be compiled manually if necessary.
I've started reorganizing the repository though so that's likely to change.
Is there a plan to write a lone compiler so as to eventually have lone bootstrapping/compiling itself so that you no longer have to sully your hands with C?
I would like that. I wanted to create the simplest possible reference C implementation first so the language can always be bootstrapped with a C compiler. After that, I'll probably make a better one. I'm considering a Rust implementation as well.
At least that's what I tell myself. Just this simple interpreter has already generated a lifetime of work. It's making me wish I had infinite time to work on it.
I understand what you mean. By dependencies I meant user space libraries such as glibc and musl.
The language itself is fully self-contained. It initializes itself with nothing but a static array of bytes.
Could be possible to modify lone to run on bare metal instead. Perhaps by replacing the Linux system call code with BIOS I/O functions and replacing the Linux process entry point code with boot code that initializes the CPU and hardware.
>Could be possible to modify lone to run on bare metal instead. Perhaps by replacing the Linux system call code with BIOS I/O functions and replacing the Linux process entry point code with boot code that initializes the CPU and hardware.
Just don't make the mistake golang made. In most systems, the "stable interface to the kernel" isn't the syscall, but the c library.
I have always found it absurd that Linux insists on stable syscall ABI, and yet does not have a standard driver API.
Hopefully, the world will migrate to a better system at some point. It will be exactly the opposite: It will not provide a stable syscall ABI, and it will have a standard driver API.
Incidentally, it'll be microkernel, multiserver.
Many articles will then be written about the maintenance burden Linux had, and how we should have done this much earlier.
Unstable kernel ABI gives the Linux kernel enormous leverage: device makers either upstream their drivers under the GPL or they get left behind.
Stable userspace ABI gives people like me the complete freedom to build anything on top of Linux. I can just discard stuff like glibc and build my lisp user space for the fun of it. Rust programmers can do the same.
>> Assuming that you are in protected mode and not using the BIOS to write text to screen, you will have to write directly to "video" memory.
... is bypassing the BIOS and interacting directly with hardware. Which is a thing you can do in some circumstances, but it's very limited -- especially if you want to do anything beyond simple console I/O.
I was hoping to polish it up a bit more so it would be worthy of a Show HN, never thought someone else would submit it here. Really made my day!