The Rust docs cover it briefly. FWIW you can do the same with JNI and have direct Java->Rust bindings. If you want to get really clever you can even parse the Rust AST (using syntex) to auto-generate Java bindings and headers (there may be libs to do this now, I haven't looked in a while).
What's nice about the C FFI is you can pretty much embed Rust anywhere, I've used it in C/C++, Java, C# and Python pretty easily.
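For anyone curious what that looks like on the Rust side, here's a minimal sketch (the function name is made up; build the crate as a cdylib):

    // Exposing a Rust function over the C ABI. Built as a cdylib,
    // this exports a plain C symbol callable from C, C++, Java
    // (JNI/JNA), C# (P/Invoke), Python (ctypes), etc.
    #[no_mangle]
    pub extern "C" fn add(a: i32, b: i32) -> i32 {
        a + b
    }

From Python, for instance, ctypes.CDLL can load the resulting library and call add directly; the other languages bind against the same exported symbol.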
The reverse tool, bindgen (https://github.com/rust-lang-nursery/rust-bindgen/), is more mature: it generates Rust bindings from header files.
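The usual way to drive it is from a build script; a rough sketch, assuming a hypothetical wrapper.h that pulls in the C headers you want bindings for:

    // build.rs -- generate Rust bindings with the bindgen crate.
    use std::env;
    use std::path::PathBuf;

    fn main() {
        let bindings = bindgen::Builder::default()
            .header("wrapper.h") // hypothetical header listing the C APIs
            .generate()
            .expect("unable to generate bindings");

        // Cargo provides OUT_DIR; the crate then include!()s the result.
        let out = PathBuf::from(env::var("OUT_DIR").unwrap());
        bindings
            .write_to_file(out.join("bindings.rs"))
            .expect("couldn't write bindings");
    }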
Which covers a bit of this, plus some of the sharp edges of FFI. Most of the trickier parts of it are covered in the FFI omnibus. Also worth checking out is the cbindgen tool.
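cbindgen goes the other way, emitting a C header for a crate's extern "C" items; it can also run from a build script. A sketch (the output path is an assumption):

    // build.rs -- emit a C header for this crate's extern "C" items.
    fn main() {
        let crate_dir = std::env::var("CARGO_MANIFEST_DIR").unwrap();

        cbindgen::Builder::new()
            .with_crate(crate_dir)
            .generate()
            .expect("unable to generate C header")
            .write_to_file("include/example.h"); // illustrative path
    }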
While this sounds amazing and I'm sure is very useful in various situations, I think encouraging a culture of distributing software in such a fashion is an incredibly bad idea. This is a big issue I have with the Go culture. Downloading a statically linked binary from some site and just sticking it somewhere is extremely bad practise to encourage. The user will never get prompted to update it, vulnerabilities and bugs in the vendored libraries will never be addressed... the list of issues is endless.
Distributions are not really about distribution, they're about maintenance.
Please: single exe/msi, one click to install/run.
I think the old Windows model of installers polluting the registry and disk drive in random places was a very bad idea and it's time to move on from .exe and .msi.
> Overall, your directory structure appears to be more complicated
> than what we generally do at Google (which is the environment where
> all this behavior was designed/evolved). At Google we generally have
> a single directory structure for all source files, including protos.
The only proposal I saw involved removing the 1:1 file correspondence between .proto input and _pb2.py output. I can understand why people would want this. But inside Google, we compile all our .proto files in parallel, and our whole build system depends on being able to parallelize things in this way.
If you have a solution to this problem that doesn't break any of our existing use cases, we're all ears.
In particular there's tension between running enterprise distros on servers and getting stuff deployed when the system libs are very old.
Take libcurl and OpenSSL for example. Some distros use OpenSSL 1.0.2, some default to OpenSSL 1.1.0, in some cases libcurl depends on 1.0.2, but the system in general prefers OpenSSL 1.1.0. Which one should you use in your application for crypto?
Obviously, the sane decision is to provide a separate build for each distro which uses the libraries bundled with the distro. Or you can just link everything statically (and then security updates of OpenSSL become your problem).
Then also come QA people and management, who want to support multiple distros but do not want to support multiple packages (as they are different and require separate QA effort); plus the users would find it hard to install an appropriate package for their distro.
So... it's more convenient to just link statically.
Of course nowadays you also use it to quickly patch security holes in applications without having to relink them.
IMO the proper way would be to simply relink the binary with patched libraries by default and only use dynamic linking where it's necessary.
Statically linked binaries work far better in my experience: I can run (some) Go programs on ancient 2.3 Linux kernels (last I tested, at least) without complaint, but any modern Firefox will likely refuse since the libraries in that specific distro are ancient af.
Like you say, it was how everyone deployed software.
But I still think it's how software should be deployed, it should be the norm.
I blame developing in Go.
People find junk food convenient, but encouraging people to form a habit based on it is no favour to them. Both situations prioritize instant gratification; the difference is that with downloading random binaries, you only need one of them to lead to trouble for it not to have been worth it.
> some notable problem with the Linux ecosystem
How so? Care to name a platform which has "solved" this problem? They're all either "go and download a random binary from a random site and plonk it somewhere/run an installer that does god-knows-what" or lead you to installing "apps" that are so isolated from one another that general-purpose productivity on such a platform is almost impossible.
It isn't obvious to me how that is dramatically different from traditional package managers as far as the update process goes. Upgrading dependencies seems like a generally active and involved process with most package managers I've seen.
pip/apt/npm/go-get/glide/yum/nixos etc. all require you to actively discover and upgrade your dependencies, and I've never been prompted to upgrade a package by any of those programs (a few do ask if you actively engage special subcommands on the CLI, e.g. apt list --upgradable). Unattended-upgrades might be close, but you can really only enable that for security releases, and most package managers don't have the resources to set up special distros (and fewer backport security fixes).
So is the pain really just not having a quick upgrade CLI, and is that dramatically different from going to a GitHub page and getting the URL for a new binary? Would something as simple as writing a script to list and download versions of binaries from GitHub releases make this a non-issue?
Also: an update prompt, which you get on Ubuntu, is still better than nothing.
Guix has updater programmes for various kinds of origins to discover updates.
I know that in principle Rust is safer, but I am wondering whether it makes sense to re-implement code as heavily audited as libc.
Wouldn’t it be better to redirect the effort to some other less audited code?
Can't find separate listings for libc in FreeBSD or OpenBSD, but the list is smaller than glibc's if you check e.g. VuXML or https://www.cvedetails.com/vulnerability-list/vendor_id-97/p...
Looks like glibc is the worst sinner here.
For example, the one about sscanf("foo", "%10s", strptr) returning 1. Hint: according to the standard, it should return 0.
Musl does it right, which recently broke some of my test cases. But the bug is known and hasn't been fixed in years, maybe a decade...
Most existing libc implementations are not particularly portable across different kernels, and especially not to microkernels such as Redox OS. The goal is to have a standard library that is easier for the Redox OS devs to develop in (obviously, Rust makes sense here), and not so tied down to one kernel like musl is for example.
To be more compatible with system code?
It’s a mechanism in glibc (the Name Service Switch) for loading dynamic libraries at runtime, based on a config file (/etc/nsswitch.conf), to handle DNS lookups and some other things. Anything statically linked can’t support this, which can cause issues with some setups.
"I’m doing a (free) operating system (just a hobby, won’t be big and
professional like gnu) for 386(486) AT clones. This has been brewing
since april, and is starting to get ready. I’d like any feedback on
things people like/dislike in minix, as my OS resembles it somewhat
(same physical layout of the file-system (due to practical reasons)
among other things).
I’ve currently ported bash(1.08) and gcc(1.40), and things seem to work.
This implies that I’ll get something practical within a few months, and
I’d like to know what features most people would want. Any suggestions
are welcome, but I won’t promise I’ll implement them :)"
Look how much Redox has achieved in so little time and with so few people. It has everything from a kernel to a fully working desktop GUI.
All because they use a good language that allows easy code reuse and proper library packaging.
Also remember that Linux at some point was somebody's pet project.
They're all really just claims that C is bad, and we shouldn't use it. Okay, well how do you replace it? Or was that never the goal to begin with and you just want me to use this language?
We're at the point now where it would be useful to be able to express ABIs and FFI in a language-agnostic manner without resorting to C. C lacks the capability to describe fairly widespread functionality: things like fixed-size integer types, pointer lifetimes, arrays with lengths, functions with bound environments, and variable arguments. Keeping distinctions between binary strings and textual strings would be useful for languages, and tracking how to deallocate pointers would be beneficial as well. An ABI that allows systems with different GCs to pass GC information to each other would be amazing, if challenging.
Unfortunately, we're sort of in a catch-22. The multilingual interoperation has to go through C, because that's the only standard that exists. The people who write the standards look at the situation and see no reason to go beyond C because no one's proposing an alternative.
Sort of. There is <stdint.h>, but C tends to fundamentally think in terms of char/short/int/long/long long, and there have been issues with trying to work out what [u]int64_t maps to.
Not in C's broken sense of "there are more arguments after this point, but you get to guess how many." Rather, I'm thinking of a more Java sense of "here's the list of extra arguments the caller gave me."
Don't get me wrong: I think it is great that more powerful languages automate and check those things. But it is also right for ABIs to be two-steps behind so that they don't tie themselves to a single language.
And here is the real problem with ABIs and C -- C is a single language. Ages ago C was a decent lingua franca, because C programs translated to machine code in pretty easy to guess ways, and the job of an ABI was just to formalise those guesses.
But nowadays we have aggressive optimisers, expansive definitions of undefined behaviour, clever parameter passing conventions, abstract "memory models" defining concurrency rules and all kinds of other stuff that make C something more than just an obvious abstraction of real-world machines.
Adding some of those features would mean that every language has to support them, and that includes C if you want to gain any traction. Part of the reason C is the interop standard is that it lacks those features: the fewer features you have, the easier it is for the rest of the world to interoperate with you.
Other systems would be IBM z and IBM i language environments.
The only place where this hasn't caught on is UNIX clones.
Offhand I'd assume you mean some of the following, but I'm curious if /my/ assumption about your needs is correct and complete (see the sketch after the list). (Also, note that some of these may be implied by values elsewhere in the specification; it isn't strictly a data structure.)
* lengthMemory (not including the implicit size of the type's metadata)
* lengthCodepoints (distinct encoding elements)
* lengthGraphemes (complete display elements, i.e. grapheme clusters, but not the actual width, height, etc.)
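To make the distinction concrete, here's a quick Rust sketch of all three measurements on one string (assuming the unicode-segmentation crate for the grapheme count):

    // One string, three lengths.
    use unicode_segmentation::UnicodeSegmentation;

    fn main() {
        // "héllo", with the accent as a combining mark (U+0301)
        let s = "he\u{0301}llo";

        println!("{}", s.len());                   // 7: bytes of UTF-8 in memory
        println!("{}", s.chars().count());         // 6: codepoints
        println!("{}", s.graphemes(true).count()); // 5: grapheme clusters
    }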
What if, within a given function call*, the underlying storage could be referenced with zero-copying and substrings made? (On exiting that context they'd be copied if retained.)
It might be useful to have library iterators that split on, say, a single given character, or into segments up to a maximum memory size of N (scanning backwards for the last "full character" or at least a complete rune).
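Rust's string slices already behave roughly this way; a small sketch of zero-copy splitting (contents illustrative):

    fn main() {
        let line = String::from("alpha,beta,gamma");

        // Each `field` is a &str borrowing `line`'s storage -- no copies.
        for field in line.split(',') {
            println!("{}", field);
        }

        // Retaining a piece beyond the borrow is an explicit copy.
        let kept: String = line.split(',').next().unwrap().to_owned();
        println!("{}", kept);
    }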
Of course, I believe that by /default/ a program should perform /binary/ interaction with files, including standard in/out/err. If there is a process for converting such binary input/output, it should assume and produce UTF-8 (in NFC normalization) when converting to/from "text" types, as a default that is easy to override globally or set for a given open file (and change mid-use).
Of course, I don't normally need to care about the encoding of data directly; libraries with parsers and writers typically handle that, along with the other encoding/decoding for their format. Everything else tends to be copy/compare without changing, or building filenames (mostly appending bits that "should all be UTF-8").
Actual, hard core, manipulation of encoded streams seems far more like a composition editing issue (A specific editor where default input dialogs are insufficient) or something that underlying libraries need to worry about.
So a string should know its number of code units. Validity information should be part of the type system, not runtime flags; i.e. you have a different type for arbitrary bytes and for valid UTF-8. (See IDNA for why a normalization guarantee in infra over time is problematic.)
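That is essentially the Vec<u8>/String split in Rust; a minimal sketch of validity living in the type rather than a flag:

    fn main() {
        // Arbitrary bytes (these happen to encode U+1F980).
        let bytes: Vec<u8> = vec![0xf0, 0x9f, 0xa6, 0x80];

        // Crossing from bytes to text forces exactly one validation.
        match String::from_utf8(bytes) {
            Ok(text) => println!("text: {}", text), // typed as valid UTF-8 from here
            Err(e) => println!("not UTF-8: {}", e),
        }
    }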
Being a unicode stream correctly reified using a specified (or specific) encoding.
* encoding possibly (the ABI could also specify it)
* in-memory length probably (could be a stream but that would require some sort of stream/iterator support at the ABI level), though you'd probably use the length in code units and match that to the encoding for in-memory length (if the encoding is not hard-coded)
* validated... in the sense that the stream could be garbage which may or may not match the encoding above? No, this would be intrinsic.
* normalised, no
* length in codepoints, no
* length in grapheme clusters, no
> Do you also need actual random access to individual codepoints/"full characters"?
At the ABI level? Absolutely not.
If the validation is /required/ after every operation then a lot of processing must happen on the underlying storage after EVERY operation to ensure it is still valid.
If the validation is deferred (such as until programmer request or possibly when emitted out of program) then those checks are also combined in to one final operation; hence the reason for tracking 'is this validated'.
Similar logic also applies to normalization (which opens options for optimizations in comparisons among other things).
The only thing that is required is that the data be valid, anything else would be UB. I don't know of any properly implemented text processing instruction which would make valid text invalid, and thus there are almost no operations which would require validation.
The only point at which you'd need validation (aside from bytes to text) is when you're trying to play tricky bugger and do text processing via non-text instruction for performance gains.
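For what it's worth, that's the shape of Rust's API: one check at the bytes-to-text boundary, safe string operations that can't break validity, and an explicitly unsafe escape hatch for the tricky-bugger path. A sketch:

    fn main() {
        let raw = b"hello".to_vec();

        // The one validation, at the boundary.
        let text = String::from_utf8(raw).expect("not UTF-8");
        println!("{}", text.to_uppercase()); // safe ops preserve validity

        // Skipping the check is spelled `unsafe`: upholding the UTF-8
        // invariant is now the caller's problem.
        let fast = unsafe { String::from_utf8_unchecked(b"world".to_vec()) };
        println!("{}", fast);
    }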
> a lot of processing must happen on the underlying storage after EVERY operation to ensure it is still valid.
Only if your storage is insane and does not understand the concept of text e.g. mysql or files. And then yes, I would certainly want whatever garbage these output to be checked every damn time before somebody tells me "yeah it's text just trust me". In fact that's more or less how every language with an actual concept of text separate from bytes/arbitrary data treats files.
> If the validation is deferred
If the validation is deferred you need to check at runtime before each text-processing instruction if it's working on valid text or not, and if that is put in the hand of developers (to avoid paying the cost for every instruction)… we know how "just check them" null pointers end up.
POSIX is an ideal. It's true that POSIX is defined in terms of the C API instead of any ABI - Rust libstd used to have a tiny bit of code I wrote to deal with the fact that Android's libc exposes a couple of signal-handling functions as inline functions in <signal.h>, which is perfectly fine as POSIX goes but annoying if you're not writing in C. But in practice you can successfully bind to POSIX implementations from non-C languages and only need workarounds like this a few times, and in the same way, you can write a POSIX-fulfilling set of header files that interface to a non-C implementation.
And once you replace every component... you've replaced C.
And yeah, Redox is how "we replace C". A full operating system where C is not the main language.
Then come back and explain to me who cares about POSIX. It'll be an educational experience.
Started doing desktop software development on MS-DOS 3.3.
POSIX is only relevant for C (the bits that they felt shouldn't be part of ANSI C), not for imposing UNIX semantics on other OSes.
It is completely irrelevant for programming languages with rich runtimes or good package managers available.
You can do that without direct dependencies to POSIX.
I don't care how .NET, Java, Go, Free Pascal, Ruby, Python, Perl, Swift, D, C++... implement their runtimes, unless I actually need to look under the hood.
- think of a PL as an OS (Smalltalk, Erlang, JVM)
- think of an OS as a PL
The latter leads to the idea that there might be a "main language" for an OS in the same way there's a "main language" for Smalltalk, Erlang, and JVM runtimes (Smalltalk, Erlang, and Java, respectively), and "auxiliary languages" that fit into the same runtimes (Java/Self/etc, Elixir/LFE/etc, and Scala/Clojure/Kotlin/etc, respectively).
On UNIX compatible OSes, others not so much.
You are fully right.
For example, AmigaDOS and BeOS took some ideas from Unix, without being Unix compatible.
This also means that the C/C++ compiler, which depends on these standard imports and exports, could theoretically run in a pure Rust POSIX.
Google with their locking down of C in Android and ChromeOS?
Apple with their long term roadmap to replace everything with Swift?
Being bashed over how it is possible that they aren't willing to keep on using C, walled gardens and such.
For learning a new language, it can be good practice. But if you hope for the project to grow successfully, I think there is no hope.
Redox is an operating system. An operating system needs a C library. Redox used to use newlib, but the developers found it unsatisfactory.