Well. Lorem Ipsum vs Donald Knuth. I think they both suck. I have a different hero. Linus Torvalds. He IS good AND actually delivers.
I enjoyed the video. However, both the author and Donald Knuth strike me as two people who are mainly motivated by aesthetics (which is supremely ironic, as they almost define themselves as being the opposite of Lorem Ipsum). I like people who deliver results.
I think the author mostly likes the appearance of someone who pays attention to detail. He says that in a movie, if it's obvious that the coffee cups are empty, he notices and it bothers him. 3 minutes later in his video he has a guy using his laptop at the beach, where glare would make it impossible to see anything on the screen.
Back in the pre-GUI era, I did this with all my posts and comments online. All fully-justified, both margins, without any extra whitespace: just careful choice of words, and occasional (but always normal standard) abbreviations.
It was fun, once practised it barely slowed me down, and when people noticed it blew their minds.
Then along came GUIs and proportional fonts and made it all invisible. :'(
72 characters is the standard limit for Git though. This convention has a long tradition from the text-mode days of email and Usenet in 80-column terminals/screens, where 72 characters allowed for the addition of a reasonable number of quoting levels (indicated by prefixing quoted lines with “>”), and/or for other line markers, before exceeding the screen width and having to reformat.
It'd be fun/great if someone made Graal Native work with it. Then things could come full circle and I could compile my Clojure/Java binaries to run everywhere (but now without a VM).
I forget which one, but there's a WASM runtime that's working on running on Cosmo libc, combined with the WASM Graal Native port that's going... https://github.com/oracle/graal/issues/3391
I'm quite fuzzy on the technical details - but do you think it'd ever lead to JavaFX running everywhere? It was a while before JFX worked with Graal Native
I'd like to someday be able to easily distribute my little GUI apps (for scientific applications) in a compile-once kind of way. Asking users to install a JVM runtime has turned out to be unworkable (everyone installs the Java 8 runtime b/c that's what comes up first on Google and then the apps silently crash/don't-work and users are confused)
> everyone installs the Java 8 runtime b/c that's what comes up first on Google and then the apps silently crash/don't-work and users are confused
Is it not possible to add a tiny check from main()? If the version is less than required, then show a (dreaded) JOptionPane message dialog or print a message to STDOUT/STDERR telling users where they can safely download the correct JDK.
Stuff like this is why I built Cosmo. I saw the writing on the wall, when the Java installer started adding toolbars to my browser. I made a solemn vow on that day to never build software that can't run unless my users install someone else's software first. Now, after walking away from hundreds of thousands of dollars of stock, and putting forth years of effort hacking in my garage, I'm finally a first-class citizen of all platforms and you can be too. If Java starts supporting building Cosmo binaries, the amount of love and trust that'll earn will be like how Meta pulled themselves out of the Cambridge Analytica mud by releasing LLaMA.
No idea. I'm guessing the process doesn't even launch (I don't remember if I checked this...). Do you have an example with a check? I'd love to add it. You'd probably need a very basic Swing error popup.
You can commit a lot of sins with ABIs if you throw the academic books into the fire.
For example, on my side I'm developing tooling that allows one to delink programs back into object files. This allows me to commit a whole bunch of heresy according to CS101, such as making a native port of a Linux program to Windows without having access to its source code or taking binary code from a PlayStation game and stuffing it into a Linux MIPS program.
When you're doing that kind of dark magic, it's one of those "know the rules so well you can break them" situations. Instead of following the theory that tells you what you should do, you follow the real-life world that constrains what you can do.
I'm somewhat of the same mind, but I'm fairly sure a study of Operating Systems: Three Easy Pieces would get me over the hump. There's no actual reason to suspect this thing would like, sidestep normal process management or memory virtualization or something and run amok... I think.
Years ago I read some of the author's posts (they're active on HN too iirc) about it, and it seemed to me like they were relying heavily on the internals of the loader or dynamic linker on each platform, such that a new release for any given OS could conceivably break your binary.
It's probably a very fun project to hack on but I would advise against distributing the binaries and expecting them to work over the long term.
To my recollection, at least on *nix operating systems, they got some changes made to the POSIX standard to formalize behavior the binaries rely on. So going forward, mere POSIX compliance and ongoing ABI compatibility guarantee the binaries will continue to work on *nix operating systems.
On Windows, backwards ABI and executable compatibility has always been an extremely high priority, so I think the danger of future breakage is low.
Neither of those speak to macOS, but maybe someone who knows more can help clarify.
I disagree with your assessment in all cases. You simply do not know what you were talking about.
Standardization efforts and backward compatibility assumes a well-behaved application. If you explicitly depend on really weird hacky stuff like abusing corner cases in the object file format, you risk breakage and you will break.
In the Unix world, firstly, platforms that aren't Linux typically say that if you don't do syscalls through the system libc, all bets are off. Second, if they standardized a few things here and there, they are likely standardizing stuff that well-formed applications linked with typical libraries will exercise. Standardization does not imply explicitly listing all possible corner cases of your object file format.
On Windows, I happen to be a former Microsoft dev who worked on Windows from 2008 to 2011. If an app were trying to push the limits of the PE format, I don't think it would get fixed on the platform side... I've seen popular applications do much milder things and still get broken.
Cosmo author here. Could you please clarify what specific object file features we're abusing? The only hacky thing I did was remove the shebang line, and POSIX was awesome enough to change their rules to let us do it. https://austingroupbugs.net/view.php?id=1250 Beyond that, we just depend on stable ABIs and APIs. We don't link NTDLL for example. It's just straight up WIN32. On MacOS ARM we use the dynamic linker. I want Cosmo programs to stand the test of time. That's why I stopped building apps with NodeJS and wrote this instead. The whole reason this release is on the HN front page, is probably because it got rid of MAP_FIXED mappings and page size assumptions. So if you can tell me where we're still exposed, then I'd love to fix it. Thanks!
I would need to dig up blog posts I read several years ago to get specific. I recall reading a blog post of yours where you describe hardcoding particular constants to play nice with some loaders and/or dynamic linkers. My impression having worked in this area was "this person is playing with fire, the whole mental framework of operating here is high risk, they will break in a Windows or macOS release and they will deserve to be broken." I know when I played similar tricks with my binaries in the Windows world, they were broken with new OS releases. Emergent behavior from a loader is not an ABI contract, and in that old blog post (and I know I'm being vague here) you were definitely treating it as such.
The HN audience at large doesn't know the C world very well and they might mistake your thing for something useful in prod. That's kind of unfortunate.
> My impression having worked in this area was "this person is playing with fire, the whole mental framework of operating here is high risk, they will break in a Windows or macOS release and they will deserve to be broken."
Surely that becomes a weaker claim with every year and release that NT/Darwin reshuffle their ABIs and it doesn't break.
When asked, by the author, to cite specifics, you're instead claiming something isn't "useful in prod" based on an impression you formed years ago while reading blog posts.
Doesn't make the conclusion incorrect. I don't feel the need to convince anyone else of this.
I also don't have time to dig up that original material, and I don't have time to reassess the library to see if it has improved, though I doubt it, because I am still certain it is philosophically an unsound idea. I suspect most experienced C coders, if they got one look at that, would say "ok that's kinda cool but seriously don't do that".
It makes the conclusion unsupported. I don't know this area very well, but speaking for myself, I'm happy to ignore your argument from authority and give more weight to the author engaging in good faith.
For what it's worth, your viewpoint expressed here comes across as FUD (fear, uncertainty, and doubt)... which developers have learned over the years to be allergic to (especially since that's the attitude Microsoft used toward Linux for years before finally embracing it).
I respect that you know a lot more than I do, and I freely acknowledge I was repeating things I've read from the cosmo author without really understanding the details of how the PE/ELF/Mach-O/etc. formats work.
But my "sense" as an experienced developer is that there really is something here worth pursuing and using -- and that in the worst case, tools built using this will have to reassess their OS compatibility with each new major OS release -- which they kind of have to do already :). I trust the Cosmopolitan maintainers will keep Windows compatibility even if future changes are required. So developers will most likely only have to rebuild with the latest version of cosmocc if the PE loader changes. Maybe Windows developers haven't had to do that thanks to Microsoft's efforts, but it has been a thing on other platforms.
In other words, pragmatically, it would be no additional skin off my nose to have to occasionally rebuild to support future major Windows versions if I get such wide executable portability in return -- for something that would otherwise be supported only on macOS and Linux. Windows developers may feel differently.
But I think anyone who distributes something built in C and whose goal is extreme cross-platform portability/compatibility (and frankly, software longevity due to cosmo libc's future stability) ought to seriously consider APE instead of WebAssembly or creating multiple builds.
> I disagree with your assessment in all cases. You simply do not know what you were talking about.
No need to be rude.
> Standardization efforts and backward compatibility assumes a well-behaved application. If you explicitly depend on really weird hacky stuff like abusing corner cases in the object file format, you risk breakage and you will break.
The comment above just said "like, people really value binary compatibility and stuff" -- as if it's OK to code against a very specific moment in time with internal dynamic linker constants and such. No. "It works with how ld.so is written right now" says nothing about standards conformance. Not everything that happens to work is conformant to an ABI. Not every emergent behavior is a feature of an ABI. You would have to not understand what is going on at process load time to think that. Even the highly regarded Microsoft binary compatibility does not work against the model that someone is creating an ELF and PE executable in the same file, they would rightly call that crazy town and mark any bugs as "won't fix".
> You simply do not know what you were talking about.
Is where it crosses the line into personal attack. If you just left that part out I think the rest of your comment would be quite good. (I mean, I don't agree with it, but it's a fine argument to make.)
Why? Hardware and software architectures often have radically different needs from their loader; it doesn't make sense to have one format to rule them all. And any format flexible enough to support all use cases would just be a container for other formats, and in practice, loaders would ignore the variants that they don't/can't support, so developers would need to care about porting things anyway.
ELF is pretty stable across many architectures and platforms, but it has proven inflexible enough that plenty of projects develop their own formats or alter it in varying ways.
A thin POSIX layer and ELF loader shouldn't be too much of a problem for Microsoft to implement if they wanted to do so (WinNT actually did have a POSIX personality at some point, but I don't think that's still supported). I'd also like to see a builtin WASM runtime in all operating systems.
Indeed, back in '95 or so there was a library called CrossELF that'd let you compile ELF .so files and use a tiny loader, linked to CrossELF for each platform you cared about, to load the main .so file, and you could build platform-independent code with it. I remember writing some simple networking code where the loader just had a tiny set of shims for a few calls and the rest of the networking code was a single binary for both Linux and Win32.
The problem is wrapping the relevant APIs, as you can see with, e.g., Wine. For some functionality, like networking, the surface is pretty small; for others it's a nightmare.
Microsoft did implement a not-so-thin POSIX layer and an ELF loader atop the NT kernel, it's WSL1 (Windows Subsystem for Linux). It was obsoleted by WSL2, which uses a specifically-tuned Linux VM instead for performance and completeness reasons.
I haven't played with it, but I think the classic Windows POSIX subsystem used the COFF/PE file formats instead of ELF.
Although this project is undoubtedly very cool, and maybe simplifies build processes by having a single binary, is there any other reason to use it? How does it compare in terms of performance, static linkability, standards conformance, etc. with musl and glibc? I’m curious because I’m picking a libc at the moment for my project.
For many other utilities I use Busybox for Windows (https://github.com/rmyorston/busybox-w32) which is well maintained, fast, and its maintainer is also very responsive to bug reports.
You're basically just benchmarking the WIN32 filesystem versus a Linux VM in that case. The Windows file system is famously slow. It'd make more sense to compare a Cosmo binary running on WIN32 with a WIN32 native program. For example, Emacs has a WIN32 port where they wrote all the Windows polyfills on their own, and because they aren't experts on Windows, the official GNU Emacs Windows releases go a lot slower on Windows than if you just compile Emacs from source on Linux using Cosmo and scp the binary over to run on Windows. See https://justine.lol/cosmo3/
I would argue it's less about the build process and more about the user experience - no install, no "pick the right binary for your platform", just "download this file and run it". I don't think it's literally a static binary, but on anything but a `FROM scratch` container it might as well be.
Also I think there were some numbers showing that it sometimes had better performance than alternatives, but I can't seem to find the post right now.
The number of people this affects seems like it must be pretty small. There's the subset of people that have more than one OS. There's the subset of those that use command line tools (this explicitly doesn't do GUIs). There's the subset of those that would even think about some convenience of downloading a single binary command-line tool for multiple OSes rather than use (a) something compiled specifically for that OS or (b) something specific to that OS (rmdir on windows, rm on linux/mac).
I use git, and I've never felt especially put out that it's a different executable on mac, windows, and linux (situation "a"). I also just use what's common for the OS I'm on since there are so many other differences (C:\foo\bar vs /foo/bar) etc... (situation b)
That's a good point. I guess I'd be more likely to pick some library because even if the binary is portable, the OS is not. Understanding where to store user preferences, for example (unix usually puts them in ~/.somefolder, but on Windows it's somewhere in LocalAppData, and on mac it's often in ~/Library/Application Support). Understanding that paths have different parts, ... I'm sure there's more.
I don't know to what level C++20 does all this now with filesystem and threads etc... There's also things like libuv, Abseil, etc. But, maybe I'll check out cosmo next.
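To make the preferences example above concrete, here's a minimal compile-time sketch in C. "myapp" is a hypothetical name, and this only picks the conventional path template; a real program would still need to expand $HOME / %LOCALAPPDATA% and honor $XDG_CONFIG_HOME at runtime.

```c
// Sketch: choosing a per-platform preferences directory at compile time.
// "myapp" is a made-up app name; the returned strings are unexpanded
// templates, just to show the three platform conventions side by side.
#include <stdio.h>

const char *config_dir(void) {
#if defined(_WIN32)
  return "%LOCALAPPDATA%\\myapp";
#elif defined(__APPLE__)
  return "~/Library/Application Support/myapp";
#else
  return "~/.config/myapp";   /* the XDG default on most unixes */
#endif
}
```

Note that this resolves the platform at compile time, which is exactly what a single cross-OS binary can't do; a Cosmo-style program would need to make the same choice with runtime OS checks instead.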
There are all sorts of issues in trying to come up with a single abstraction for multiple platforms. It's an impedance mismatch problem, there is no single best solution.
I like where cosmo is putting effort. The self-hosting toolchain and build system for example. That was always a knowledge barrier for me (I'm a webdeveloper, I know nothing).
Personally, it was so different and shocking that I started to understand a lot of things by the contrast it created with other code. Cosmo's Makefiles were different, the linker was being used in a way that made me notice it as a distinct part of the toolchain, the demos actually worked, etc. It made me interested in that stuff, made it more accessible.
Even if you have to do some runtime detection/self-configuration, cosmo is great because it lets you pack that logic into a single binary and pick it at runtime, so the user experience is still "download a single file and run it and it works" regardless of their OS.
I don't know enough to compare to musl, but I suspect being ISC licensed (similar to musl's MIT license) allows for static linking without the upstream obligations a license like the GPL would impose.
Would be curious regarding features and performance comparisons with musl, which seems to come up a little short compared to say gnu libc.
Cosmo author here. We used to build our code with musl-cross-make. Once Cosmopolitan got good enough that we could compile GCC using Cosmo, our build latency dropped in half. https://x.com/JustineTunney/status/1726141024597324189
Cosmopolitan Libc is 2x faster than Musl Libc for many CLI programs like GNU Make and GCC because it has vectorized string routines like strlen(). Musl won't merge support for it on x86. I even mailed Rich patches for it years ago.
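For readers curious what "vectorized" buys you: the sketch below is not Cosmopolitan's actual implementation (which uses SIMD instructions), just the classic word-at-a-time "haszero" trick, which illustrates why scanning 8 bytes per iteration beats a byte loop. Strict-aliasing caveats apply to the pointer cast; real implementations work around that.

```c
// Sketch of word-at-a-time strlen() using the classic "haszero" bit trick.
// Illustrative only; production versions use SIMD and aliasing-safe loads.
#include <stdint.h>
#include <stddef.h>

size_t strlen_swar(const char *s) {
  const char *p = s;
  /* Advance byte-by-byte until p is 8-byte aligned, so the word loads
     below never touch a page the string itself doesn't touch. */
  while ((uintptr_t)p % sizeof(uint64_t)) {
    if (!*p) return (size_t)(p - s);
    p++;
  }
  const uint64_t *w = (const uint64_t *)p;
  /* A word contains a zero byte iff
     (v - 0x01..01) & ~v & 0x80..80 is nonzero (exact, no false positives). */
  for (;;) {
    uint64_t v = *w;
    if ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) break;
    w++;
  }
  /* Locate the exact zero byte within the word that triggered. */
  p = (const char *)w;
  while (*p) p++;
  return (size_t)(p - s);
}
```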
Cosmopolitan's malloc() function is very fast. If you link pthread_create() then it'll create a dlmalloc arena for each core dispatched by sched_getcpu(). If you don't use threads then it'll use a single dlmalloc arena without any locking or rdtscp overhead.
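A rough sketch of the dispatch idea described above — this is not dlmalloc, just hypothetical mutex-protected bump allocators selected by sched_getcpu(), so threads running on different cores usually contend on different locks. Assumes glibc/Linux (_GNU_SOURCE for sched_getcpu()) and linking with -pthread.

```c
// Sketch: per-CPU arena dispatch. NOT dlmalloc; each "arena" here is a
// trivial bump allocator, only to show how sched_getcpu() spreads lock
// contention across cores.
#define _GNU_SOURCE   /* sched_getcpu() on glibc */
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define NARENAS 8
#define ARENA_SIZE (1 << 20)

struct arena {
  pthread_mutex_t lock;
  size_t used;
  _Alignas(16) char heap[ARENA_SIZE];
};

static struct arena arenas[NARENAS];

/* Call once before any allocation. */
void arenas_init(void) {
  for (int i = 0; i < NARENAS; i++)
    pthread_mutex_init(&arenas[i].lock, NULL);
}

/* Pick an arena by the CPU the caller happens to be running on. */
void *arena_alloc(size_t n) {
  int cpu = sched_getcpu();            /* -1 on failure */
  struct arena *a = &arenas[(cpu < 0 ? 0u : (unsigned)cpu) % NARENAS];
  n = (n + 15) & ~(size_t)15;          /* keep 16-byte alignment */
  void *p = NULL;
  pthread_mutex_lock(&a->lock);
  if (a->used + n <= ARENA_SIZE) {
    p = a->heap + a->used;
    a->used += n;
  }
  pthread_mutex_unlock(&a->lock);
  return p;                            /* NULL when the arena is full */
}
```

The single-threaded fast path the comment mentions (no locking at all until pthread_create() is linked) is the part a toy like this can't show; it depends on the libc knowing whether threads exist.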
Cosmo has pretty good thread support in general, plus little known synchronization primitives from Mike Burrows. You can do things like build a number crunching program with OpenMP and it'll actually run on MacOS and Windows.
Cosmopolitan plays an important role in helping to enable the fastest AI software on CPUs. For example, Mozilla started an open source project called llamafile a few months ago, which runs LLMs locally. It's based on the famous llama.cpp codebase. The main thing Mozilla did differently was they adopted Cosmopolitan Libc as its C library, which made it easy for us to obtain a 4x performance advantage. https://justine.lol/matmul/ I'm actually giving a talk about it this week in San Francisco.
I suspect parent commenter isn't just asking about static linking from the perspective of licensing but about technical feasibility; at least part of the reason people use musl is that static linking glibc isn't well supported.
static musl make is at least 4x smaller than cosmo make
cosmo make probably works on at least 4x the number of operating systems
# cd /usr/local/bin
# file make
make: /usr/local/bin/make: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), static-pie linked, stripped
# stat -c %s make
297104
# cd
# tnftp -4o make https://cosmo.zip/pub/cosmos/bin/make
# stat -c %s make
1397059
Looking forward to somebody hooking together Python in APE [0], something like pex [1]/shiv [2]/pyinstaller [3], and the Pants build system [4] to get a toolchain which spits out single-file Python executables with a baked-in venv, portable across mainstream OSes with native (or close enough) performance.
Not OP, but having written some Lua, I can make some guesses:
- Relatively obscure language, so potential contributor base is limited from the start
- Lua shares PHP's "One datatype to rule them all" design, which works but feels ugly. In PHP it's "arrays", in Lua it's "tables", but either way you have all the attendant problems and weird edge cases to learn
- Expanding on the above point, Lua's APIs for working with tables are, uh, idiosyncratic. Slicing a "list", last I looked, was an unintuitive monstrosity like `new_table = { table.unpack(old_table, start_index, end_index) }`
I could keep going, but I have other things to do during my brief time in this universe.
To any Lua aficionados out there, my apologies if I've misrepresented it. Corrections to my misunderstandings will be appreciated.
Lua is among the most used languages in existence.
It's probably the most used language by under-21-year-olds, and almost certainly so for under 16. Roblox is absolutely enormous.
It's the usual choice of embedded scripting language, so a great number of programmers who don't use it as a daily driver, nonetheless learn it to modify this or that program. Being a very simple, minimalist language, this is easy to do.
It's true that it's missing some affordances which you'll find in larger (and frequently less efficient) languages. You'll end up doing more iteration and setting of metatables. That bothers some people more than others. These are the tradeoffs one must accept, to get a 70KiB binary which fits in an L1 cache while being multiples faster than e.g. Python or Ruby.
Lua is too bare bones. It's a carefully considered tradeoff for the project goals of being a very small & easily embeddable language for C projects, but it means you need to implement too much load-bearing functionality yourself. Each lua project ends up being an ad hoc framework composed of C bindings and standard lib extensions, but it doesn't have an ecosystem that makes this straightforward or consistent.
A lot of people like lua from using it on very small projects, or like, spiritual reasons related to its implementation simplicity. But having used it quite a lot professionally, in practice it ends up being a slog to work with.
All that said, lua is closely tied to C, and embedded in a C project is both its intended use and the place where it fits best. So while I don't really like lua and would almost never choose it myself, this is one of the rare exceptions where I think it's a good choice.
A lot of people want redbean to use TS or python instead, and imo either would make it much bigger, more complex, for relatively little benefit. It would definitely benefit from a more full-featured language, but I think a weirder one that is still intended for embedding in C would be best. Something out there like janet or fuck it, tcl. But lua has a lot of allure for a lot of programmers and I think gives the project a feeling of "old school cool" while still being pretty accessible. So is probably the best choice in the end.
I am looking for an Electron replacement for writing my apps. Could cosmopolitan be the basis of something like that, or is that not really its purpose?
Lua is perfect for Redbean because it's tiny and made for embedding, which is exactly this use case. Something like TypeScript would require a transpiler to JS and a JS runtime, and as far as I know, even the smallest one would be much bigger than Lua. And Lua has well defined semantics to call C and be called by C, which may be hard to do with JS/TS.
quickjs [1] has native support for Cosmopolitan, is meant to be easily embeddable, and is included as part of the standard Cosmopolitan distribution. It looks like qjs also has patches for recent-ish versions of the TypeScript compiler as well. Someone has made a nodejs-style project called txiki.js [2] using vanilla qjs. Maybe it would build with Cosmopolitan with some tweaking. But if you're thinking of packaging a whole browser engine like Electron, that might be a Sisyphean effort.
> We've made a lot of progress reinventing the C++ STL.
What is the motivation behind this? Try to reduce (compiled) binary size? Naively, I am surprised that the C++ STL from Clang isn't OK to use, including the license. Or is this a clean room impl thing?
EDIT
To be clear: I mean no disrespect with this question. This is an amazing project.
The rationale is explained in this README [1], but it basically boils down to transitive dependencies and the amount of checks the compiler has to do. From linked README:
"If we `#include <string>` for the LLVM libcxx header containing the `std::string` class, then the compiler needs to consider 4,800 `#include` lines. Most of them are edges it'll discount since they've already been included but the sum total of files that get included is 694!"
The commit message for the first commit there says "Actually Portable Executable now supports Android". I assume this means that the bare executables can run on Android kernels, not that there's any support for installing APEs as Android apps. But it seems possible for that to eventually work! Is that a goal of the project?
I gave up trying to run cgo-compiled Go on an old appliance[1]; the version of libc was simply too old. Has anyone ever successfully built cosmopolitan-flavored cgo binaries? I see Cosmopolitan libc supports Linux 2.6, so I'm hopeful.
>The input file can be of any type, but the initial portion of the file intended to be parsed according to the shell grammar [...] shall not contain the NUL character.
You'll notice there's no NUL characters on the first line, and that the subsequent NULs are escaped by a single-quoted string, which is legal. The rules used to be more restrictive but they relaxed the requirements specifically so I could work on APE. Jilles Tjoelker is one of the heroes who made that possible.
[T]he initial portion doesn't mean the first line, it means the script part of a file consisting of a shell script and a binary payload, separated by `exit', `exec whatever', etc. A good example is the Oracle Developer Studio installation script.
You can write to Austin Group mailing list and ask for clarification if you want.
That Python app is a popular demo for Cosmopolitan. It's what I would have chosen for that demo, too! It's handy because it outputs a little bit of information about the current architecture on the first line when you start the shell.
Is that an embedded device with a small address space? APE used to require a 47-bit address space. The release I published today fixes that. Maybe give it another try?
No, it's a server with 256GB of RAM, but indeed the kernel config has CONFIG_ARM64_VA_BITS_48=y (and 52 for userspace). I have since moved away from this kernel, I'll see if I find the time to boot it again to test the latest memory map manager.
Edit: I tried booting it, the issue is still present, I updated the github issue.
Looks like there is both an ARM and x86 version according to the docs. Probably need two different binaries, but you still get cross-OS for each architecture.
% curl -O https://cosmo.zip/pub/cosmos/bin/basename
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 663k 100 663k 0 0 440k 0 0:00:01 0:00:01 --:--:-- 441k
% chmod +x basename
Kind of. It can be read as a single binary, some supported systems will do that. In others, the executable is first parsed as a shell script. There's definitely more to it than a single binary.
Reminds me of a cool tool I once used, uudecode.com, which was a DOS binary that only used 7-bit characters and could decode uuencoded (base64 predecessor) files. Was useful for getting attachments through e-mail in the face of all kinds of filters.
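The decoding step itself is tiny, which is part of why it could fit in a small 7-bit-clean DOS binary. A hedged sketch of the core of classic uudecode, handling a single well-formed line (length character, then 6-bit groups offset by 0x20):

```c
// Sketch of the core of uudecode: each uuencoded line starts with a length
// character, then packs 3 output bytes into 4 printable characters, each
// carrying 6 bits offset by 0x20. Assumes a well-formed line; real tools
// also handle '`'-as-zero, the "begin"/"end" framing, and trailing garbage.
#include <stddef.h>

static int uuval(char c) { return (c - 0x20) & 0x3F; }

/* Decodes one line into out; returns the number of bytes produced. */
size_t uudecode_line(const char *line, unsigned char *out) {
  size_t n = (size_t)uuval(line[0]);   /* declared output length */
  const char *p = line + 1;
  size_t produced = 0;
  while (produced < n) {
    int a = uuval(p[0]), b = uuval(p[1]), c = uuval(p[2]), d = uuval(p[3]);
    if (produced < n) out[produced++] = (unsigned char)((a << 2) | (b >> 4));
    if (produced < n) out[produced++] = (unsigned char)(((b & 0xF) << 4) | (c >> 2));
    if (produced < n) out[produced++] = (unsigned char)(((c & 0x3) << 6) | d);
    p += 4;
  }
  return produced;
}
```

For example, the line `#0V%T` (length character `#` = 3) decodes to the three bytes "Cat".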
I enjoy doing this too sometimes and don't find it too difficult, but damn...