For anyone mystified about what a NIF is who doesn't want to go read the docs:
The BEAM VM (which is the thing that runs erlang / elixir / gleam / etc) has 3 flavors of functions.
- BIFs - Built-in functions, these are written in C and ship with the VM
- NIFs - Native Implemented Functions, these are written in any language that can speak the NIF ABI that BEAM exposes, and they let you provide a function that looks like a built-in function but that you build yourself.
- User - User functions are written in the language that's running on BEAM, so if you write a function in erlang or elixir, that's a user function.
NIFs allow you to drop down into a lower level language and extend the VM. Originally most NIFs were written in C, but now a lot more languages have built out nice facilities for writing NIFs. Rust has Rustler and Zig now has Zigler, although people have been writing zig nifs for a while without zigler and I'm sure people wrote rust nifs without rustler.
It’s important to note that while Erlang can recover when user code crashes an Erlang process, a faulty NIF can take down the entire virtual machine.
There's a series of things that a NIF must do to be a good citizen. Not crashing is a big one, but also not starving the VM by never yielding (in case the NIF is long-running) is important, plus a few secondary things like using the BEAM allocator so that tooling that monitors memory consumption can see resources consumed by the NIF.
The creator of Zigler has a talk from ElixirConf 2021 on how he made Zig NIFs behave nicely:
I don't think this is right. The process will crash, and the Supervision strategy you are using will determine what happens from there. This is what the BEAM is all about. The thing with NIFs is that they can crash the entire VM if they error.
Erlang's (and Elixir's) error management approach is actually "Let it crash".
This is based on the acknowledgment that if you have a large number of longer-running processes, at some point something will crash anyway, so you may as well be good at managing crashes ;-)
Yes, but that's not Rust's error management strategy. Most Rust code isn't written with recovery from panics in mind, so it can have unintended consequences if you catch panics and then retry.
How so? The whole point of unwinding is to clean up gracefully on panics. How did it go wrong for you?
It's also not like there is much of a choice here. Unwinding across FFI boundaries (e.g. out of the NIF call) is undefined behaviour, so the only other option is aborting on panics.
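To make the FFI point concrete, here is a minimal sketch of stopping a panic at an `extern "C"` boundary with `std::panic::catch_unwind` and turning it into a status code. The function names and status constants are made up for illustration and are not part of Rustler or the erl_nif API; a real NIF would return an Erlang term or raise an exception instead.

```rust
use std::panic;

// Hypothetical status codes, for illustration only.
pub const STATUS_OK: i32 = 0;
pub const STATUS_PANICKED: i32 = 1;

/// Work that may panic: integer division by zero panics in Rust.
fn risky_divide(a: i32, b: i32) -> i32 {
    a / b
}

/// An `extern "C"` entry point that never lets a panic unwind into its caller.
/// The quotient is written through `out`; the return value is a status code.
pub extern "C" fn safe_divide(a: i32, b: i32, out: *mut i32) -> i32 {
    // catch_unwind stops the unwind here and hands back Err on panic.
    match panic::catch_unwind(|| risky_divide(a, b)) {
        Ok(value) => {
            if !out.is_null() {
                // The caller promises `out` points at a writable i32.
                unsafe { *out = value };
            }
            STATUS_OK
        }
        // `out` is left untouched when the inner code panicked.
        Err(_) => STATUS_PANICKED,
    }
}
```

Calling `safe_divide(1, 0, ptr)` returns `STATUS_PANICKED` instead of unwinding (the default panic hook still prints a message to stderr). Whether the state behind the panic is safe to keep using is a separate question, which is the concern raised above about retrying.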
It’s pretty common in the Elixir ecosystem for these types of libraries to not change very much. Elixir itself doesn’t change too much so these libraries stay solid without needing frequent updates. It doesn’t mean people aren’t using them. Some libraries even put disclaimers that they are actively maintained even if they haven’t seen an update in a long time. It’s something that takes some getting used to for some people (including myself at one point).
I will second this. I've been using multiple libraries in our production Elixir app that haven't been updated in the last five years. Elixir itself was declared as "stable" feature-wise years ago. It may be argued that the type system being introduced is not in keeping with that, but I'm not sure. Jose is a very cautious and diligent "benevolent dictator" and you get a lot of backward compatibility guarantees. Erlang is the same. Compared to what some people might be used to with churn in Node/React etc., it is apples and oranges.
The semantics can certainly be argued, but a type system is sort of on its own tier as far as language features go. Most importantly, there is only going to be one backward-incompatible change, which is the spec syntax; otherwise it is just leveraging how we already write Elixir.
Yes, I'm not worried about it. I've not been following it as closely as I'd like, but from what I've read the core team seems to be taking a very measured incremental approach with the type system.
> It’s pretty common in the Elixir ecosystem for these types of libraries to not change very much.
This is kind of fascinating and seems worthy of more detailed study. I'm sure almost anything looks stable compared to javascript/python ecosystems, but it would be interesting to see how other ecosystems with venerable old web frameworks or solid old compression libraries compare. But on further reflection, language metrics like "popularity" are also in danger of just quantifying the churn that it takes to keep working stuff working. You can't even measure strictly new projects and hope that helps, because new projects may be a reaction to a perceived need to replace other stuff that's annoyingly unstable over periods of 5-10 years, etc.
Some churn is introduced by trying to keep up with a changing language, standard lib, or other dependencies, but some is just adding features forever or endlessly refactoring aesthetics under different management. Makes me wish for a project badge to indicate a commitment like finished-except-for-bugfixes.
Erlang (and friends) are built with a goal of stability. Operational stability is part of that, but it also comes into play with code and architectural stability.
Maybe it's the functionalness, maybe it's the problem domains, but a lot of the modules have clear boundaries and a clear scope, and end up with a small code base that moves towards being obviously correct and good for most uses and then doesn't change much after that. It might not work for everyone, but most modules don't end up with lots of options to support all the possible use cases.
The underlying bits of OTP don't tend to churn too much either, so old code usually continues to work, unless you managed to have a dependency on something that had a big change. I recall dealing with some changes in timekeeping and random sources, but otherwise I don't remember having to change my Erlang code for OTP updates.
It helps that the OTP team is supporting several major versions (annual releases) simultaneously, so if there's a lot of unnecessary change, that makes their job harder as well as everyone else's.
That is not what I meant. I looked at sorted_set_nif, which doesn't seem to compile on OTP 26 (we're at 27 now), and fastglobal, which has a very old PR with 3 approvals that has not been merged. Elixir libraries may not change _much_, but core libraries like telemetry, Ecto, ExDoc, and Jason still get either minor or patch releases all the time.
If libraries get regular updates even if they are minor, it indicates they are in use. If they have inactive repositories and low hex.pm download numbers, they may have been abandoned which can mean you have to maintain it yourself in the future, or the people behind the library found it's not such a good idea after all. This doesn't have to be the case, which is why I asked.
Ah ya, I do see how the optics of this could give off that impression. I don't use this library myself, but the issue is with Elixir 1.15.7 & OTP 26.1.26, which is VERY different from "It doesn't work on OTP 26." Certain patch versions of Elixir and OTP have caused problems before (sorry, I don't have a citation) and this particular issue looks like it's related to dependencies not syncing up on the config change?
I do think more libraries should give that little "We're still maintained" notice, as people not totally ingrained in this might not realize. To some, the fact that there have been no issues reported now that we're on OTP 27 and Elixir 1.17 would be an indicator that all is well.
Rustler wasn't properly forward compatible (only with regard to the build process; a compiled library will work just fine on any newer OTP) until 0.29. They are using 0.22; upgrading Rustler will be enough to get rid of this issue for all future OTP versions.
Thank you for the full story here, as I just gave the issue a cursory glance. As someone quite ingrained in Elixir, I see an issue referencing specific patch versions of Elixir and OTP and immediately understand it's very specifically targeting that specific Elixir/OTP combo. But depr brings up a good point that not everyone is immediately going to understand this, especially newcomers to the language, and it’s generally hard not to just read the headline.
Yeah, I was trying to explain to another developer that packages end up being “finished” eventually and seem to continue to work exceptionally well without updates for a really long time.
Something about immutability and the structure of Elixir leads to surprisingly few bugs.
Is any of this code open source? As an outsider, I'm kind of at a loss for why anyone wants this or what you kids are doing over there and how offended I should be by it.
TL;DR: Erlang/Elixir/etc are high level languages and the virtual machine they run on, the BEAM, is optimized for speedy IO but is not so great when it comes to intensive CPU tasks. You'll want to write the latter in a good systems language which is what libraries like this provide (you get C bindings out of the box, I believe).
It's also important to point out ports, because as you mention, NIFs are a way to integrate external code. But as someone else points out, NIFs can crash the entire BEAM VM. Ports are a safer way to integrate external code because they are just another BEAM process that talks to an external program. If that program crashes, then the port process crashes just like any other BEAM process but it won't crash the entire BEAM VM.
And then there are port drivers, which are the worst of both worlds! They can crash the BEAM and need much more ceremony than a NIF to set up, but they’re pretty nice to do in Zig[1] as well.
There's another option and that's setting up an Erlang node in the other language. The Erlang term format is relatively straightforward. But I'm honestly not sure of the benefit of a node versus just using a port.
The Erlang term format is straightforward, but if you want to set up another node in another language you need to correctly implement/emulate process linking, binaries, and some other stuff too; it's not just a matter of writing a socket to accept and emit Erlang terms.
It's not impossibly large but it's not something one does on a lark either; if there isn't support in your language already it's hard to justify this over any of the many, many message busses supported by both Erlang and other languages that don't have so many requirements.
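For a sense of the "term format is straightforward" half of this, here is a small sketch that encodes a handful of term types into the Erlang external term format. The `Term` enum and function names are invented for the example, and it covers only four tags; it does nothing towards the distribution protocol (handshake, links, monitors) that makes a full node the harder job.

```rust
// Tag bytes from the Erlang external term format.
const VERSION: u8 = 131;
const SMALL_INTEGER_EXT: u8 = 97;
const SMALL_TUPLE_EXT: u8 = 104;
const BINARY_EXT: u8 = 109;
const SMALL_ATOM_UTF8_EXT: u8 = 119;

/// A hypothetical, deliberately tiny term type covering four cases.
pub enum Term {
    SmallInt(u8),
    Atom(String),
    Binary(Vec<u8>),
    Tuple(Vec<Term>),
}

fn encode_into(term: &Term, out: &mut Vec<u8>) {
    match term {
        Term::SmallInt(n) => {
            out.push(SMALL_INTEGER_EXT);
            out.push(*n);
        }
        Term::Atom(name) => {
            // Sketch limitation: the name must fit in a one-byte length.
            assert!(name.len() <= 255);
            out.push(SMALL_ATOM_UTF8_EXT);
            out.push(name.len() as u8);
            out.extend_from_slice(name.as_bytes());
        }
        Term::Binary(bytes) => {
            out.push(BINARY_EXT);
            // Length is a 4-byte big-endian integer.
            out.extend_from_slice(&(bytes.len() as u32).to_be_bytes());
            out.extend_from_slice(bytes);
        }
        Term::Tuple(items) => {
            // Sketch limitation: arity must fit in one byte.
            assert!(items.len() <= 255);
            out.push(SMALL_TUPLE_EXT);
            out.push(items.len() as u8);
            for item in items {
                encode_into(item, out);
            }
        }
    }
}

/// Encode a term, prefixed with the format's version byte.
pub fn encode(term: &Term) -> Vec<u8> {
    let mut out = vec![VERSION];
    encode_into(term, &mut out);
    out
}
```

Encoding the tuple `{ok, 42}` this way yields the bytes `131, 104, 2, 119, 2, 111, 107, 97, 42`, which `binary_to_term/1` on the Erlang side should accept.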
NIFs are great for things that really feel like a relatively quick function call.
If you've got some mathematical/crypto function, chances are you don't want that to go through a command queue to an external port, because that's too much overhead. If it's a many round crypto function like bcrypt or something, you do need to be a bit careful doing it as a NIF because of runtime. But you wouldn't want to put a sha256 through an external program and have to pass all that data to it, etc.
Something that you might actually want queueing for, and that is likely to have potential for memory unsafety, like transcoding with ffmpeg, would be a good fit for an external port rather than a NIF or a linked-in port driver.
Ports are generally great, but you are running multiple apps and communicating between them using STDIN/STDOUT etc. There are certain corner cases where they might not be suitable. I had been using an OPCUA library where the logging had to be turned off because otherwise it was sending the logs back to our Elixir app and we were expecting Elixir terms. Also the shutdown of the remote end of a port can stop the data getting back to Elixir. There are ways around all of this but it's slightly annoying. In general though, ports work 80% of the time and are really convenient.
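The stray-logging problem is easier to see with the framing spelled out. Below is a rough sketch of the external-program side of a port, assuming the port is opened with the `{packet, 4}` option so every message carries a 4-byte big-endian length prefix. The function names are made up, and the "reverse the bytes" reply is just a placeholder for real work. Diagnostics go to a separate log writer (stderr in a real program) so they never end up inside the framed stream on stdout.

```rust
use std::io::{self, Read, Write};

/// Read one length-prefixed packet. Returns Ok(None) when the input is
/// closed (this sketch also treats a truncated header as closed).
pub fn read_packet<R: Read>(input: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut len_buf = [0u8; 4];
    match input.read_exact(&mut len_buf) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
        Err(e) => return Err(e),
    }
    let len = u32::from_be_bytes(len_buf) as usize;
    let mut payload = vec![0u8; len];
    input.read_exact(&mut payload)?;
    Ok(Some(payload))
}

/// Write one packet with a 4-byte big-endian length prefix and flush it.
pub fn write_packet<W: Write>(output: &mut W, payload: &[u8]) -> io::Result<()> {
    let len = payload.len() as u32;
    output.write_all(&len.to_be_bytes())?;
    output.write_all(payload)?;
    output.flush()
}

/// Serve requests until the input closes. Replies go to `output` only;
/// anything human-readable goes to `log`.
pub fn serve<R: Read, W: Write, L: Write>(
    input: &mut R,
    output: &mut W,
    log: &mut L,
) -> io::Result<()> {
    while let Some(request) = read_packet(input)? {
        writeln!(log, "received {} bytes", request.len())?;
        // Placeholder "work": echo the payload back reversed.
        let reply: Vec<u8> = request.iter().rev().copied().collect();
        write_packet(output, &reply)?;
    }
    Ok(())
}
```

In a real port program, `serve` would be called with `io::stdin().lock()`, `io::stdout().lock()`, and `io::stderr()`. A third-party library that prints straight to stdout would still corrupt the stream, which is the situation described above.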
Yeap, this is a big one. In Nx we have some facilities for doing zero-copy stuff that only really work if you have, say, Evision and EXLA running on the same OS process.
We do have IPC handles that could enable this over, say, ports, but then there's a whole other discussion on pointers vs ipc handles
Do nifs have the equal process time stuff that regular elixir processes have? Where the BEAM will move the scheduler into another process if it's taking too long?
Forgive me if I'm mixing up my terminology; it's been a bit since I have poked at Elixir.
BEAM can't preempt native code. That's why NIFs should either be fast/low-latency, so they don't excessively block the scheduler, or be run on what's called a dirty scheduler, which just means running them on a separate thread.
Nope, at least not by default or like one would expect from pure Erlang (when it comes to preempting). Been a while since I dug into this admittedly but I write Elixir daily for work (and have for about ten years now). They don’t do the record keeping necessary for the BEAM to interrupt. You need to make sure the NIF runs on a “dirty scheduler” or you can end up blocking other processes on the same scheduler.
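The other way to keep a long-running NIF from blocking a scheduler is to do the work in bounded slices and hand control back in between. Here is a language-level sketch of that idea only: it does not call the erl_nif API, and `Step`, `sum_chunk`, and `run_to_completion` are names made up for the example. In a real NIF the driver loop would be replaced by the VM rescheduling the function (in C, via `enif_schedule_nif`).

```rust
/// Outcome of one bounded slice of work.
#[derive(Debug, PartialEq)]
pub enum Step {
    /// More work remains: resume at `next_index` carrying `partial`.
    Yield { next_index: usize, partial: u64 },
    /// All elements have been summed.
    Done(u64),
}

/// Sum at most `budget` elements starting at `start`, then stop.
pub fn sum_chunk(data: &[u32], start: usize, partial: u64, budget: usize) -> Step {
    assert!(budget > 0, "a zero budget would never make progress");
    let end = usize::min(start.saturating_add(budget), data.len());
    let mut acc = partial;
    for value in &data[start..end] {
        acc += u64::from(*value);
    }
    if end < data.len() {
        Step::Yield { next_index: end, partial: acc }
    } else {
        Step::Done(acc)
    }
}

/// Stand-in for the scheduler: keeps resuming the work until it is done.
/// Returns the total and the number of slices it took.
pub fn run_to_completion(data: &[u32], budget: usize) -> (u64, usize) {
    let mut start = 0;
    let mut partial = 0;
    let mut slices = 0;
    loop {
        slices += 1;
        match sum_chunk(data, start, partial, budget) {
            Step::Yield { next_index, partial: p } => {
                // Other processes would get a turn here.
                start = next_index;
                partial = p;
            }
            Step::Done(total) => return (total, slices),
        }
    }
}
```

The trade-off against a dirty scheduler is extra bookkeeping: all resumable state (here just an index and a running sum) has to be carried explicitly between slices.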
Does anyone actually enjoy using these systems that encourage you to embed programming-language X code in programming-language Y heredocs?
I always find actually doing that — and then maintaining the results over time — to be quite painful: you don't get syntax highlighting inside the string; you can no longer search your worktree reliably using extension-based filtering; etc.
I personally find the workflow much more sane if/when you just have a separate file (e.g. `foo.zig`) for the guest-language code, and then your host-language code references it.
I've done some assembly in C, and for big functions, yeah, I want it in its own file, but smaller things often make sense to embed. I'm not sure if I'd like my nif code embedded into my erl files (assuming this works for Erlang as well), but it could conceivably make the nasty bit of boilerplate around ERL_NIF_INIT in the NIF (which I have to do in C anyway) and exit(nif_library_not_loaded) in the erl go away, which would be nice.
It's certainly possible to get syntax highlighting on the embedded code, but you'll need to work with your syntax highlighter; it certainly helps if you're not the only person using it.
But then again, I worked without syntax highlighting for years, so I'm happy when it works, but when it doesn't, I'm ok with that too.
I’m not too familiar with Elixir, but I definitely prefer building libraries in Zig and then consuming them in Python, TS, whatever over embedding them inside another language directly.
That being said, you can get IDE language support for embedded code if you use Emacs or vim (and probably other editors as well). As I mentioned, I still vastly prefer separating it personally, especially if you don’t necessarily expect your Python or TypeScript programmers to be knowledgeable about Zig (or C).
But if all you do is write Elixir wrappers around the Zig functions to completely hide the foreign-language functions, keeping both the wrapper and the implementation in the same file, even in two different languages, doesn't seem horrible. Then again, keeping them in two files doesn't seem like a huge difference either.
I think it's really a matter of taste; both options are viable.
Actually literate programming might be a tool to get you syntax highlighting back. You could write one block of code in one language and the other one in another language and make one include the other in some place. Both blocks annotated to be their specific language, inside the prose. Emacs for example syntax highlights each block according to its corresponding programming language. It also allows you to edit blocks in separate buffers.
Another way could be to switch the syntax highlighting of one's editor temporarily, but then the syntax of the surrounding prose and the other block might interfere.
Zig is also used in an excellent way by burrito[0]. I've also used zig for compiling NIFs written in C/C++/Objective-C, since `zig cc` makes cross-compiling much nicer.
I wish zig got more use and attention in the Erlang ecosystem, but rustler seems more popular.
Rustler is more popular because Rust solves one of the scarier bits about NIFs, the fact that irresponsible memory management in a NIF can kill the entire Erlang VM.
I can appreciate Zig for entire projects that would otherwise be written in C, but for the lengths of code that make sense for a NIF as opposed to a port, Zig seems like a strange point of failure to add to my system. If it's simple enough that I can be confident in my flawless manual memory management, I'd just use C, and for anything else, Rust is the far safer choice.
I use Zig a lot for Elixir NIFs, for things like audio and video processing, and it works great. I don't use Zigler, though, as I prefer the native code to live in its own codebase. But Zigler is really nice and provides an easy way to do computationally heavy tasks in Elixir.
The "helpers" library is used to convert types to and from Erlang; I plan on open sourcing it, but it is not ready yet. In the above example, the code is explicit, but "entry" can be created with a helper comptime function. erl is simply the erl_nif.h header converted by zig translate-c.
Understand NIF risks: they can crash your entire Elixir Application, beyond their immediate supervision tree, because they operate in the same memory space as the BEAM itself.
(Yo dawg, we put a niche language into a niche language so that...)
I wonder if the Zig code can be not written inline, as an option. With anything larger than a few lines, I'd want syntax highlighting, LSP support, navigation, etc. It's easier to achieve with one language per file.