A standalone zero-dependency Lisp for Linux (github.com/lone-lang)
191 points by keepamovin on Nov 6, 2023 | 89 comments



Hello everyone! I started this project, happy to answer any questions.

I was hoping to polish it up a bit more so it would be worthy of a Show HN, never thought someone else would submit it here. Really made my day!


Have you made or plan to make any contributions to Mezzano (https://github.com/froggey/Mezzano) or are you mainly interested in seeing how far you can take this thing on your own?


I didn't know about Mezzano until now and have never contributed to it. Massive respect to them for what they accomplished. I don't think I have enough knowledge to contribute to a real operating system project like that right now. My experience with Linux drivers is just one small user space driver for my laptop's keyboard LEDs. So for now I think I'll see how far I can take lone.


What’s the minimum kernel version required? (No need for an exact answer, I just want to know if it’s “in the last 3 years” vs “in the last 10 years” etc.)

And is it possible to resolve network names or do anything network related? (Or is it planned?)

I’m always looking for some way to create portable Linux binaries, and I happen to like Lisps. Right now, my best bets are Janet compiled against musl libc or maybe ECL… or just use Python (distributed as source)…

Edit to add: I love the idea. Great work!


> What’s the minimum kernel version required?

In order to build lone, any version after the Linux UAPI headers split should do.

https://lwn.net/Articles/507794/

https://stackoverflow.com/q/18858190/512904

https://github.com/torvalds/linux/tree/master/include/uapi

These are the only headers that lone currently requires. When lone is built, a script will read all the system calls defined in those headers and create a table mapping symbols to system call numbers. This is so that you can write symbolic code such as:

  (system-call 'write fd buffer length)

Instead of:

  (system-call 1 fd buffer length) ; write is system call 1 on x86_64

Once compiled, however, it should work with literally any version of Linux. The system call primitive can issue any system call given its number and its parameters. If an unsupported system call is made, Linux will simply return -ENOSYS which can be handled at runtime and should not crash the program.
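To make the symbol-to-number idea concrete, here is a minimal sketch in C. It is not lone's code: the table is written by hand instead of being generated from the UAPI headers, it goes through libc's syscall() wrapper (which lone itself avoids), and the names syscall_table, syscall_number and system_call are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical hand-written table; lone generates its own from the headers. */
struct syscall_entry {
    const char *name;
    long number;
};

static const struct syscall_entry syscall_table[] = {
    { "read",   SYS_read   },
    { "write",  SYS_write  },
    { "close",  SYS_close  },
    { "getpid", SYS_getpid },
};

/* Returns the system call number for a name, or -1 if the name is unknown. */
long syscall_number(const char *name)
{
    size_t count = sizeof(syscall_table) / sizeof(syscall_table[0]);

    for (size_t i = 0; i < count; ++i)
        if (strcmp(syscall_table[i].name, name) == 0)
            return syscall_table[i].number;
    return -1;
}

/* Issues a three-argument system call by name.
 * Returns the kernel's result, or a negated errno value such as -ENOSYS. */
long system_call(const char *name, long a, long b, long c)
{
    long number = syscall_number(name);
    long result;

    if (number < 0)
        return -ENOSYS; /* unknown name: report it the way the kernel would */

    result = syscall(number, a, b, c);
    return result < 0 ? -errno : result;
}
```

With this, system_call("write", 1, (long) "hi\n", 3) would write to standard output, and any failure comes back as a negative number the caller can inspect instead of crashing.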

> I’m always looking for some way to create portable Linux binaries, and I happen to like Lisps.

I have this vision in my mind: embedding lone modules into sections of the lone ELF and shipping it out. Zero dependencies, self-contained.

Linux passes processes a pointer to their own ELF header via the auxiliary vector. Lone already grabs this data and puts it into a nice lisp object. Now I just need to code that feature in somehow: parse the header, find the sections, read them into a lone module and evaluate...
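For anyone curious what reading that data looks like, here is a rough sketch in C. It assumes a libc that provides getauxval() (lone reads the auxiliary vector itself), and it walks the program header table advertised through AT_PHDR and AT_PHNUM, which describes segments and not sections. The function name count_program_headers is made up for illustration.

```c
#include <link.h>      /* ElfW() and the ELF program header types */
#include <stddef.h>
#include <sys/auxv.h>  /* getauxval(), AT_PHDR, AT_PHNUM */

/* Counts the program headers of the running executable that have the given
 * type, using the table the kernel advertises through the auxiliary vector. */
size_t count_program_headers(unsigned long type)
{
    const ElfW(Phdr) *headers = (const ElfW(Phdr) *) getauxval(AT_PHDR);
    size_t count = (size_t) getauxval(AT_PHNUM);
    size_t matches = 0;

    if (headers == NULL)
        return 0; /* the entry was missing from the auxiliary vector */

    for (size_t i = 0; i < count; ++i)
        if (headers[i].p_type == type)
            ++matches;
    return matches;
}
```

Calling count_program_headers(PT_LOAD) reports how many loadable segments the running binary has. Embedded modules would need a similar walk to locate wherever the module data ends up.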


SBCL lets you dump core images which, if you set up your system properly, can be made executable by using sbcl as the interpreter, like /bin/sh and shell scripts.


The dumped images are not portable, right? I thought I read they can only reliably be run by the exact same build of SBCL.


Executable images bundle the runtime with the image.


> Lone is a freestanding Lisp interpreter designed to run directly on top of the Linux kernel with full support for Linux system calls. It has zero dependencies, not even the C standard library.

Cool project! Not sure if I'm going to start using it any time soon, but cool nonetheless.


Thank you!


This is interesting. Although, I think it's always useful to point out that runtimes that directly make system calls will have portability issues in other *nix systems. For instance, OpenBSD has recently restricted system calls to a single offset of their libc in order to reduce ROP attacks.

If Lone wishes to be portable, it will need to consider a libc dependency as an option. If not, it can probably get away with direct syscalls on Linux unless Linux kernel developers decide to add support for pinning system calls. I doubt that this would ever be a hard requirement in the kernel, as it would break userland all over, but I could see this being an option for certain distributions as a ROP mitigation. Many features like these flow from OpenBSD to other OSes.


You're absolutely right, it is not portable to other operating systems. I have written about portability and system call stability here on HN a few times; at some point I decided to compile all the information into a post on my website.

https://www.matheusmoreira.com/articles/linux-system-calls

I started lone (and previously a liblinux library) in order to make applications directly targeting the Linux system call binary interface. I chose Linux precisely because it's the only one with a stable interface.

I currently have no plans to make lone a portable programming language.


Considering the title and use case it seems to be intentionally pretty specific to Linux


Many projects start this way. But, as per my comment, the assumption that direct syscall support will be maintained in future Linux distros is also risky.


I worry about that risk as well. I assume that even if Linux were to introduce a mechanism for system call authentication, it would be something lone would be able to use to mark its system call primitive as allowed.


Perhaps. To be fair, I'm not aware of anything on the horizon, other than the fact that OpenBSD has been showing off their pinning implementation.

As long as you know it's a possibility, then the point of my original comment is met.

Good luck on this project. I look forward to seeing it progress.


Thank you!


Why is it risky? Linus is adamant that the greatest sin is breaking userland.


That's a subtle point though. The kernel can't change defaults that break userland, nor can it change or eliminate features that would cause a breakage in userland. But, the kernel can certainly add an optional feature, like syscall pinning, that distributions can enable -- in userland -- to restrict userland. We see this already with seccomp policies meant to restrict and potentially break userland programs that misbehave.

All that Linus guarantees is that, by default, the Linux kernel has no regressions that impact user code. If distributions enable breaking changes through syscalls or sysctls, that doesn't violate any of the rules imposed on Linux. syscall pinning -- if that becomes a thing in Linux -- is something that distributions would enable in order to mitigate ROP attacks.


I think this decision from OpenBSD will more likely discourage developers from even considering it a supportable platform for software that originates in Linux land.


Direct syscall access is not something that is guaranteed in Unix derivatives. Linux is rare in that it provides a stable syscall API. Source compatibility is often only guaranteed when linked against libc or an equivalent low-level runtime library.


And this is generally a bad pattern unless those libc equivalents are services you call (like syscalls) and not a library you have to import or FFI. Requiring importing a library, probably from another language, is not a good alternative to syscalls.


A bad pattern according to whom? Most language runtime libraries import other system libraries as needed. For better or for worse, libc is typically considered to be a system library. It's something that every distribution or Unix flavor provides that is guaranteed to work within the POSIX standard for interfacing with the operating system. It's up to the distribution maintainers to make that happen, even if they tweak things to support syscall pinning or seccomp rules.

Userland directly calling a stable syscall API is a rare thing outside of Linux, and there is no guarantee that it will last forever even in Linux given the latest attacks. With modern ROP mitigations like syscall pinning, it will in fact be more dangerous to make syscalls directly -- if allowed in your distribution -- than it would be to call the minimal footprint of libc required to bootstrap a high level language runtime.

Of course, with special pleading, it could be possible for distribution or OS maintainers to carve out an exception for syscall pinning for a particular language runtime. Ask Go how that's going for their OpenBSD port.


The problem with system libraries requiring importing them as C libraries isn't new and doesn't seem to be going away. It has caused so many problems for alternative languages over the years that it seems like an alternative would give all of computing a giant boost by allowing different models that don't work well with C. Stabilizing and standardizing the syscall interfaces would be one way to accomplish this and is the closest thing we have to it now. Implementing syscalls as a separate service might also work but then you have the IPC overhead. That might not be as bad though as we'll end up with something like that anyways as requirements ramp up for C to have its own runtime (eg. https://dslab.epfl.ch/research/cpi/).


For most language runtimes, the minimal requirement for libc integration is to cover the standard Unix calls (unistd) which don't require specific memory management and typically just pass buffers directly from the caller to the kernel. For most of the system calls in which a high level language runtime would be interested, the libc code is largely a direct pass-through already. As such, either directly using the system calls or calling them through libc will have negligible impact on how the high level language chooses to model these things.

libc isn't really getting in the way here.

Perhaps POSIX might come up with an alternative library to wrap system calls in the future, but I would suspect that it would probably be written in C on most platforms, or at least using a C compatible ABI. So, even if a platform chose to use Rust with a large littering of unsafe all over to make it work with the kernel, it would still have to be able to be linked with C userland.


> libc isn't really getting in the way here.

It depends. For the standard set of system calls, the libc is pretty great. For Linux-specific features, it could take years for glibc to gain support, if it ever does. All the libcs will get in the way if you try to use something like the clone system call:

https://sourceware.org/bugzilla/show_bug.cgi?id=10311

> Ulrich Drepper 2009-06-22 19:35:54 UTC

> If you use clone() you're on your own.

My obsession with Linux system calls started years ago when I read about an episode where glibc literally got in Linux's way. LWN wrote extensively about the tale of the getrandom system call and the quest to get glibc to support it:

https://lwn.net/Articles/711013/

A kernel hacker wrote in an email:

> maybe the kernel developers should support a libinux.a library that would allow us to bypass glibc when they are being non-helpful

That made a lot of sense to me. I took that concept and kind of ran with it. Started a liblinux project, essentially a libc with nothing in it but the thinnest possible system call wrappers. Researched quite a bit about glibc's attitude towards Linux to justify it:

https://github.com/matheusmoreira/liblinux#why

The more I used this stuff, the more I enjoyed it. I was writing freestanding C and interfacing directly with the kernel. The code was so clean. No libc baggage anywhere, not even errno. And I knew this could do literally anything when I wrote a little freestanding program to print my terminal window dimensions. When I did that I knew I could write code to mount disks too if I really wanted to. I was talking to the kernel.
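The terminal dimensions program boils down to a single ioctl. Here is a minimal sketch in C of that call. It is not the original freestanding program: it goes through libc's syscall() wrapper and errno for brevity, and the function name terminal_size is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/ioctl.h>    /* struct winsize, TIOCGWINSZ */
#include <sys/syscall.h>
#include <unistd.h>

/* Asks the kernel for the window size of the terminal behind fd.
 * Returns 0 on success or a negated errno value on failure. */
int terminal_size(int fd, unsigned short *rows, unsigned short *columns)
{
    struct winsize size;

    if (syscall(SYS_ioctl, fd, TIOCGWINSZ, &size) < 0)
        return -errno;

    *rows = size.ws_row;
    *columns = size.ws_col;
    return 0;
}
```

Passing file descriptor 0 from an interactive shell should fill in the rows and columns. A descriptor that is not a terminal yields a negative error code instead.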

Eventually I discovered Linux was already doing the same thing with their own nolibc.h file which they were already using in their own tools. It was a single file back then, by now it's become a sprawling directory full of code:

https://github.com/torvalds/linux/tree/master/tools/include/...

Even asked Greg Kroah-Hartman on reddit about it once:

https://old.reddit.com/r/linux/comments/fx5e4v/im_greg_kroah...

Since the kernel was already developing their own awesome headers, I decided to drop liblinux and start lone instead. :)


> there is no guarantee that [syscall stability] will last forever even in Linux given the latest attacks

That's true, but what of it? Linus won't last forever, Linux won't last forever, computers won't last forever, and Homo sapiens won't last forever. Everything needs maintenance sooner or later. "The Rockies may crumble / Gibraltar may crumble / They're only made of clay."


What you say is true, but you've inserted an inaccurate context with the quote.

There is no guarantee that _direct access to system calls_ will last forever...

Stability in the syscall API exists because Linux is a kernel that supports multiple distributions. Not because random applications could call it. The latter is an emergent feature, but not one that distribution maintainers will necessarily respect. POSIX only guarantees access to functions in libc that can perform these calls. As mentioned elsewhere in this thread, there are specific reasons why future direct access to system calls in user code could be restricted. Whether they will or not comes down to how distribution maintainers decide to deal with syscall related ROP gadgets.


No, it’s fine. And common, macOS and Windows do the same.


To be fair, linking to kernel32 is a bit different than linking to msvcrt, but yeah, it’s Linux who’s the slightly insane person in the room, not the other way around.


I wouldn’t say insane exactly, but definitely overly restrictive.


What decision from OpenBSD?

Note that Linux is the odd one here. Every other relevant system out there is not like Linux on this.


Very cool! In a similar vein there’s Zuo: https://github.com/racket/zuo


Zuo is interesting because (as far as I know) its primary user is the Racket project itself, for bootstrapping the compiler.


That is correct


So I guess the difference from Janet is no C standard library dependency; is this being targeted for hyper-slim and embedded uses? Because Janet seems like a good choice for most desktop and server cases.


Yes. My current goal is to boot Linux directly into Lisp and bring up the rest of the system from there. Perhaps even create a Lisp user space.


Have you looked into GUIX?

I think it would be really interesting to load the init system (scheme-based GNU Shepherd in this case) directly from the kernel instead of ever loading a shell environment. I bet we could factor out a lot of cruft from the current GNU OS implementation that way, especially if you are managing your system environment/configuration declaratively via scheme/lisp.


> Have you looked into GUIX?

I've heard about it and read discussions about it here on HN. I've never used it or learned the system.

> load the init system directly from the kernel instead of ever loading a shell environment

That's essentially what I want to accomplish with lone, and what I wanted to inspire others to do.

Can I help with that endeavor somehow? It's my understanding that GNU has a huge focus on portability, I assumed they would not be receptive to my Linux first approach. Perhaps the requirements are different for GUIX?


I haven't really given GUIX an honest try yet. So far, I've been content to use NixOS. I would definitely recommend either, so long as you are patient enough to learn their unique idiosyncrasies.

GUIX does claim very broad compatibility, particularly with kernels. You can even use HURD instead of Linux if you really want to.

On the other hand, the GUIX project is ideologically opposed to proprietary software, so you won't find much help in that arena.


I'm new to Lisp - would this then be a modern Lisp machine?


It's certainly what I had in mind when I started the project. Writing a full Lisp operating system is extremely hard, better to take advantage of Linux and its drivers so as to avoid spending an entire lifetime recreating them.

It's my understanding that a true Lisp machine would have hardware support for Lisp abstractions at the instruction set level, so I don't think the concept would apply to lone. I would be seriously honored if people considered it one though, even if only in spirit.


Well, if by lisp machine we understand a processor that can run native lisp, of course not, but I was dreaming of a modern lisp machine, and this is the best that can be practically made.


Just compile regular Lisp with cosmopolitan. Then the same binary will run on windows, linux, mac, and BIOS. /s

This has been done with Lua, see: https://github.com/jart/cosmopolitan/issues/61


That reminds me I should go debug the patch I sent to cosmopolitan, it ran on my machine but failed continuous integration...


Very cool project. Not commenting on that. Lisp and its derivatives often hit the front page of HN and I always wonder why. What is it about Lisp that is so powerful? So much so that some see it as the platonic ideal of programming languages, or so it seems?


> Ask HN: Why is everyone here so obsessed with Lisp?

https://news.ycombinator.com/item?id=20697493


Ahh okay. Paul Graham and the hackability of the language itself. Got it.


And On Lisp is an intellectual pleasure to read, regardless of how you regard the Silicon Valley economics.


I have a lot of respect for lisp and its heritage. Respect is certainly one reason why I chose to write a lisp.

Other reasons include simplicity, practicality and ease of implementation. Lisp has a very simple syntax and it is relatively easy to parse it and implement a basic interpreter. I wrote the lexer and the parser by hand, there was no need to mess around with parser generators.

Another reason is I've come to see lisp as something of a frontend for C data structures. I have a byte buffer, encoded text, linked lists, resizable arrays, hash tables... Lisp is the language that binds them all together.

Another reason is that I knew how powerful lisp was despite the simplicity. Despite being a small project, lone is already metaprogrammable with FEXPRs. It turned out I needed exactly one bit in order to give lone macros.

  if (function->function.flags.evaluate_arguments) {
    arguments = lone_evaluate_all(lone, module, environment, arguments);
  }

It just doesn't evaluate the arguments if it's a macro. The function gets the lists that represent the expressions of each argument instead of the final value they compute to. And those lists can be manipulated just like any other list.
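To show the effect of that one bit outside of lone, here is a self-contained toy in C. Only the flag name evaluate_arguments comes from the snippet above; the expression type, apply and count_unevaluated are hypothetical stand-ins and far simpler than real Lisp lists.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ARGUMENTS 8

/* Hypothetical toy expression: a literal number or an unevaluated sum. */
struct expression {
    bool is_sum;
    long left, right; /* a literal stores its value in `left` */
};

/* Reduces an expression to a literal. */
static struct expression evaluate(struct expression e)
{
    struct expression value = { false, e.is_sum ? e.left + e.right : e.left, 0 };
    return value;
}

struct function {
    bool evaluate_arguments; /* true: ordinary function, false: FEXPR */
    long (*body)(const struct expression *arguments, size_t count);
};

/* Applies a function, evaluating the arguments first only if the flag is set. */
long apply(const struct function *f, const struct expression *arguments, size_t count)
{
    struct expression evaluated[MAX_ARGUMENTS];

    if (count > MAX_ARGUMENTS)
        count = MAX_ARGUMENTS; /* toy limit, extra arguments are ignored */

    if (f->evaluate_arguments) {
        for (size_t i = 0; i < count; ++i)
            evaluated[i] = evaluate(arguments[i]);
        arguments = evaluated;
    }
    return f->body(arguments, count);
}

/* Example body: reports how many arguments arrived as unevaluated sums. */
long count_unevaluated(const struct expression *arguments, size_t count)
{
    long sums = 0;

    for (size_t i = 0; i < count; ++i)
        if (arguments[i].is_sum)
            ++sums;
    return sums;
}
```

The same body sees different things depending on the flag: with it set, every argument arrives as a plain value; with it cleared, the body receives the original expressions and can inspect or rewrite them.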

I think that was the moment I got the fabled enlightenment they say lisp programmers experience. It just brings a smile to my face.


Well, Y Combinator, the domain HN lives on, is intimately related to Lisp, as is its founder Paul Graham. That is how I landed on this site to begin with, and I assume the same goes for many people here. That could explain it a little bit.


Lisp code is made of lists. And lists are the main data type for lisp programs. So you can naturally produce and transform lisp code using lisp code. Not all lisps allow that, though, but I think that's the main differentiator and it's unique in that aspect.

So it's like generating JS code text from JS, but working with JS code as a text is much more confusing compared to working with lisp code as a list.


It doesn't hurt that HN is written in LISP.


Beautiful. Also, after 30+ years of writing C and C++, I learned one thing by casually browsing the source code: you can use a preprocessor macro in an #include statement. Thanks.


Yes! GCC calls it computed includes.

https://gcc.gnu.org/onlinedocs/cpp/Computed-Includes.html

I used it to include architecture-specific source code and also a generated C file containing a table of Linux system calls defined by the Linux UAPI lone is compiled against.

The makefile defines those macros by passing flags:

  -D LONE_ARCH_SOURCE='"$(ARCH.c)"' -D LONE_NR_SOURCE='"NR.c"'
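Here is a tiny standalone illustration of the same trick in C. The macro name DEMO_LIMITS_HEADER is made up for this example; lone's real macros are the ones shown above. As far as I know, the macro-expanded form of #include is part of the standard C preprocessor, and "computed includes" is simply the name the GCC manual gives it.

```c
/* The build system may override this, for example with
 *   -D DEMO_LIMITS_HEADER='<limits.h>'
 * If it does not, fall back to a default header. */
#ifndef DEMO_LIMITS_HEADER
#define DEMO_LIMITS_HEADER <limits.h>
#endif

/* Computed include: the macro is expanded before the header is looked up. */
#include DEMO_LIMITS_HEADER

/* Uses a definition that only exists if the computed include worked. */
int char_bits(void)
{
    return CHAR_BIT;
}
```

The quoting in the makefile line matters for the same reason: the macro has to expand to a complete "file" or <file> token sequence, quotes or angle brackets included.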


Oh so it's a GCC-specific thing... that explains it. In any case, congratulations on the lisp! It's beautiful code. I wish my current codebase was that beautiful. I mean, it's a lot of us contributing to it so the beauty is a kind of "meeting of the minds" situation... but still :)

PS: My current project implements a lisp... rendered as JSON. Some other angle on beauty...


Thanks, I mean it. I've been programming alone for a long time, it really means a lot to read that.

> My current project implements a lisp... rendered as JSON. Some other angle on beauty...

Hey, feel free to share it. I'm curious about your project.


Props to the author, if the code is as well designed and written as the README, this is looking good.


Great project. I love your style of coding and documenting.


Thank you. I've been a lone programmer for a long time, so it means a lot to me to read that.


Nice. I can see something like this being used with circle (https://github.com/rsta2/circle) to run bare metal on a Pi, too.


A quick glance shows it uses GNU Make but that's only a build time dependency, I guess.


All software has dependencies at build time; saying "no dependencies" always refers to runtime.


I'm surprised the Kernel doesn't count here.


It does.

Writing your software to directly use the syscalls of a specific kernel does not make it "zero dependency", it makes it "one dependency" - and non-portable.

The author elaborates on their rationale and the technical details in a blog post: <https://www.matheusmoreira.com/articles/linux-system-calls>

TBH I have mixed feelings about this approach. It's true that this is more or less what Go (or Cosmopolitan libc) does, but the motivation in their case is to maximize portability (by making cross-compilation trivial). However, when you #include <linux/...>, you not only make your software non-portable, you also make it a PITA to cross-compile, as you need the kernel headers on the host machine.

In contrast, with Go or cosmo, I can trivially build a tiny /sbin/init for amd64 Linux, pack it up with cpio, and run it with qemu-system-x86_64 -initrd - all from a Mac/arm64 host.


> However when you #include <linux/...>, you not only make your software non-portable, you also make it a PITA to cross-compile as you need the kernel headers on the host machine.

Yes, that is certainly a problem that I need to solve.

I added some support for cross compilation in the makefile. It currently requires clang for that.

  ifdef TARGET
    ifndef UAPI
      $(error UAPI must be defined when cross compiling)
    endif

    TARGET.triple := $(TARGET)-unknown-linux-elf
    override CC := clang -target $(TARGET.triple)
  else
    TARGET := $(shell uname -m)
  endif

With this, I was able to cross compile lone for x86_64 from within the Termux environment of my aarch64 smartphone. All I had to do was obtain the Linux user space API headers for x86_64. Getting those headers was somewhat annoying but doable; there are packages for them in many Linux distributions.

I made a Termux package request for multiplatform Linux UAPI headers specifically so I could cross compile lone but unfortunately it was rejected.

https://github.com/termux/termux-packages/issues/16069


Surely it's more than one dependency. For example, you'll need a processor. Not just any processor, but a processor for which a C compiler has been written.


Like sibling comment points out, the CPU and the rest of the universe can be considered indirect dependencies. Once you have everything you need to boot the Linux kernel (e.g. laws of physics, paid the power bill...), you're good to go ;)


For that matter you’ll need a universe, the physics of which must allow both semiconductive metals and the eventual evolution of multicellular biochemistry.


That's a great question... I wonder if Justine would be interested in that. She has her own sector lisp too.


At that point we really get into semantics... But now that you mention it, I would be interested to see if it could be built with APE to benefit from their bare metal support.


Yes. It has zero runtime dependencies but development currently assumes GNU tools. The test suite for example is entirely written in bash and uses GNU coreutils. I submitted a patch to coreutils to allow env to set argv[0] of programs specifically so I could use it in my test suite.

Currently lone is a single C source file. It could easily be compiled manually if necessary. I've started reorganizing the repository though so that's likely to change.


>Currently lone is a single C source file.

Is there a plan to write a lone compiler so as to eventually have lone bootstrapping/compiling itself so that you no longer have to sully your hands with C?


I would like that. I wanted to create the simplest possible reference C implementation first so the language can always be bootstrapped with a C compiler. After that, I'll probably make a better one. I'm considering a Rust implementation as well.

At least that's what I tell myself. Just this simple interpreter has already generated a lifetime of work. It's making me wish I had infinite time to work on it.


You don't have to; you get to! ;D


Hey some people like debasing themselves with C; I'm not gonna kink-shame.


>zero dependency

>for Linux

There's your dependency.


I understand what you mean. By dependencies I meant user space libraries such as glibc and musl.

The language itself is fully self-contained. It initializes itself with nothing but a static array of bytes.
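One common way to run from nothing but a static array of bytes is a bump allocator. The sketch below, in C, is only a guess at the general idea and not lone's actual memory manager; ARENA_SIZE, arena and arena_allocate are hypothetical names.

```c
#include <stddef.h>

#define ARENA_SIZE 4096

/* All memory comes from this array: no malloc, no mmap. */
static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Hands out the next free chunk of the static array, or NULL when the
 * request is empty or does not fit. Nothing is ever freed in this sketch. */
void *arena_allocate(size_t size)
{
    const size_t alignment = _Alignof(max_align_t);
    size_t padded;
    void *chunk;

    if (size == 0 || size > ARENA_SIZE)
        return NULL;

    /* Round up so that every chunk starts at a suitably aligned offset. */
    padded = (size + alignment - 1) / alignment * alignment;
    if (padded > ARENA_SIZE - arena_used)
        return NULL;

    chunk = &arena[arena_used];
    arena_used += padded;
    return chunk;
}
```

A real interpreter needs to reclaim memory as well, so treat this as the smallest possible starting point.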

Could be possible to modify lone to run on bare metal instead. Perhaps by replacing the Linux system call code with BIOS I/O functions and replacing the Linux process entry point code with boot code that initializes the CPU and hardware.

https://wiki.osdev.org/Printing_to_Screen

That would just shift the dependency from Linux to the firmware though.


>Could be possible to modify lone to run on bare metal instead. Perhaps by replacing the Linux system call code with BIOS I/O functions and replacing the Linux process entry point code with boot code that initializes the CPU and hardware.

Just don't make the mistake golang made. In most systems, the "stable interface to the kernel" isn't the syscall, but the c library.

(re: golang on openbsd)


Absolutely. I've written about this topic and how Linux is different:

https://www.matheusmoreira.com/articles/linux-system-calls

My plan is to target Linux exclusively.


I have always found it absurd that Linux insists on stable syscall ABI, and yet does not have a standard driver API.

Hopefully, the world will migrate to a better system at some point. It will be exactly the opposite: It will not provide a stable syscall ABI, and it will have a standard driver API.

Incidentally, it'll be microkernel, multiserver.

Many articles will then be written about the maintenance burden Linux had, and how we should have done this much earlier.


I think it's pretty great.

Unstable kernel ABI gives the Linux kernel enormous leverage: device makers either upstream their drivers under the GPL or they get left behind.

Stable userspace ABI gives people like me the complete freedom to build anything on top of Linux. I can just discard stuff like glibc and build my lisp user space for the fun of it. Rust programmers can do the same.


> Could be possible to modify lone to run on bare metal instead. Perhaps by replacing the Linux system call code with BIOS I/O functions

Certainly not. Calling BIOS interrupts requires the system to be running in real mode, which is incompatible with running 32-bit code.


>Calling BIOS interrupts requires the system to be running in real mode, which is incompatible with running 32-bit code.

Hold my beer[0].

0. https://en.wikipedia.org/wiki/DOS_Protected_Mode_Interface


DPMI is a software interface exposed by an operating system or a DOS extender, not part of the BIOS.


Sure, but then again, it does prove it is possible.

DPMI provides ability to do both DOS and BIOS calls "in protected mode".

Similar trickery is often used to leverage BIOS drivers on new OSs until adequate native drivers are available.


I see. Could it work in protected mode? The OSDev wiki page I linked uses that approach to write to video memory.

> Assuming that you are in protected mode and not using the BIOS to write text to screen, you will have write directly to "video" memory.

I have pretty superficial knowledge about OS development so I should probably refrain from speculating further.


What the OSDev page is describing here:

>> Assuming that you are in protected mode and not using the BIOS to write text to screen, you will have write directly to "video" memory.

... is bypassing the BIOS and interacting directly with hardware. Which is a thing you can do in some circumstances, but it's very limited -- especially if you want to do anything beyond simple console I/O.
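For reference, the direct video memory approach comes down to storing 16-bit cells: the low byte is the character and the high byte is the color attribute. Here is a small sketch in C. It writes into a buffer supplied by the caller so it can run as an ordinary program; on bare metal in VGA text mode that buffer would conventionally be the memory at physical address 0xB8000. The function name vga_write is made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes text into a VGA-style text buffer, one 16-bit cell per character:
 * low byte = character code, high byte = attribute (colors).
 * Returns the number of cells written. */
size_t vga_write(volatile uint16_t *cells, size_t capacity,
                 const char *text, uint8_t attribute)
{
    size_t i = 0;

    while (i < capacity && text[i] != '\0') {
        cells[i] = (uint16_t) (((uint16_t) attribute << 8) | (uint8_t) text[i]);
        ++i;
    }
    return i;
}
```

Attribute 0x07 is light grey on black. Anything beyond text output, such as keyboard input or disk access, needs separate code for each piece of hardware, which is the limitation described above.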


My project runs in your imagination so there’s no hardware or electrical dependency other than brainwave activity.



