
>"It was kind of interesting exploring the OTP internals, especially some of the parts that haven't changed in a long time."

I have heard something similar before at an Erlang meetup. Were there parts of the VM you encountered that were also long unchanged and insufficiently documented? I'm guessing much of this is just tribal knowledge deep inside Ericsson then? It would be great if there were a public repository for these things.




There are really two major pieces of the BEAM: the parts implemented in Erlang (e.g. the compiler, OTP), and the parts implemented in C (the VM, or emulator as it is called in the codebase). Virtually all of the C code is undocumented in any meaningful way, short of some internal documentation on a handful of topics, as well as those parts of the code which have thorough comments that explain some tricky aspect of the implementation.

From my own experience, those parts that are commented or documented tend to clarify some specific design constraints (for example, why processes have multiple locks on different parts, and why they are locked in a specific order, or the rationale of the carrier design); but you never really get a clear picture of why things are architected the way they are overall, what designs were considered and discarded due to some deficiency, what tradeoffs were made, etc. I think whatever content like that may exist is either buried in the minds of the original engineers, or in some internal documentation at Ericsson that has never been released. My suspicion is that you'd need to dig through mountains of emails and such to piece together a more complete picture of how things were put together over time.
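To make the lock-ordering point concrete, here is a minimal sketch of the general technique (always take locks in one fixed global order so two threads can never each wait on a lock the other holds). This is not the BEAM's actual code, which is C; the lock names `main`, `msgq`, `status` and the `Proc` class are made up for illustration.

```python
import threading

# Hypothetical lock names in rank order; not the BEAM's real lock set.
LOCK_ORDER = ("main", "msgq", "status")


class Proc:
    """Toy 'process' with several locks guarding different parts of its state."""

    def __init__(self):
        self.locks = {name: threading.Lock() for name in LOCK_ORDER}

    def acquire(self, *names):
        # Take locks in the global rank order, whatever order the caller asked for.
        ordered = sorted(set(names), key=LOCK_ORDER.index)
        for name in ordered:
            self.locks[name].acquire()
        return ordered

    def release(self, ordered):
        # Release in reverse order of acquisition.
        for name in reversed(ordered):
            self.locks[name].release()


def worker(proc, names, counter, rounds):
    for _ in range(rounds):
        held = proc.acquire(*names)
        counter[0] += 1  # protected: both workers hold "main" here
        proc.release(held)


def run_demo(rounds=1000):
    proc = Proc()
    counter = [0]
    # The two workers request the same locks in opposite orders.
    threads = [
        threading.Thread(target=worker, args=(proc, ("main", "status"), counter, rounds)),
        threading.Thread(target=worker, args=(proc, ("status", "main"), counter, rounds)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

If each worker instead took the locks in the order it asked for them, one could hold `main` while waiting for `status` and the other the reverse, and both would block forever. Sorting by rank rules that out, which is the kind of constraint a good code comment can explain in a sentence or two.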

There's also the fact that the BEAM just has a lot of really complex pieces built into it after all this time. Everything from binary pattern matching and construction, to garbage collection and memory management, ETS, Mnesia, etc. Each one of those things is not only non-trivial, but has evolved significantly over time, through the hands of many engineers. It also doesn't help that large portions of the C implementation are written in an extremely macro-heavy style, which makes it quite hard to read without knowing what all the macros do and how they play together.

Projects like Enigma, or Lumen, have a lot to give back to the community in the form of documenting how these pieces are built. Unfortunately, the lack of a specification for the Erlang language and its runtime means it is very much a grind to work out how things are currently implemented, and why.


Thanks for the comprehensive insight. This does kind of invite the question, though, of how they themselves maintain these code bases if they're steeped in such arcana. Is there not a danger that this becomes similar to the situation with mainframes and Cobol for them?


I suspect there exists some internal documentation at Ericsson that just hasn't been cleaned up and added to the source repository - but it's entirely possible that the core Erlang/OTP team relies solely on passing the knowledge on from engineer to engineer.

I'm also not saying that the C code is unmaintainable. It's definitely a bear to dive into, but if you spend enough time with it, it starts to unfold in front of you. The main issue I have is that none of the specification/design documentation exists as part of the source repository. Maybe it doesn't exist at all, but in that case, I'd really hope that some of those core engineers would take the time to write some of that stuff down. In any case, none of it is readily available AFAIK.



