By the title alone, this means nothing. How much would it cost otherwise? What is the percentage savings?
In TFA, it gets better though: "Steve: That’s pretty easy. When I started on the spam team, we had close to 1,400 servers running. When we converted several parts to Elixir, we reduced that by around 95%. One of the systems that ran on 200 Python servers now runs on four Elixir servers (it can actually run on two servers, but we felt that four provided more fault tolerance). The combined effect of better architecture and Elixir saved Pinterest over $2 million per year in server costs. In addition, the performance and reliability of the systems went up despite running on drastically less hardware. When our notifications system was running on Java, it was on 30 c32.xl instances. When we switched over to Elixir, we could run on 15. Despite running on less hardware, the response times dropped significantly, as did errors."
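To put the quoted figures on one scale, here is a quick back-of-the-envelope calculation. It uses only the numbers in the quote above; the helper name `reduction` is made up for illustration, and the quote itself describes the 95% figure as approximate.

```python
def reduction(before, after):
    """Return (percent saved, shrink factor) when going from `before` to `after` servers."""
    percent_saved = 100 * (before - after) / before
    factor = before / after
    return percent_saved, factor


# "One of the systems that ran on 200 Python servers now runs on four Elixir servers"
print(reduction(200, 4))    # (98.0, 50.0)
# "it was on 30 c32.xl instances ... we could run on 15"
print(reduction(30, 15))    # (50.0, 2.0)
# "close to 1,400 servers ... reduced that by around 95%" leaves roughly this many:
print(1400 * 5 // 100)      # 70
```

So the Python-to-Elixir system shrank by about 50x, while the Java-to-Elixir notifications system shrank by 2x.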
> When our notifications system was running on Java, it was on 30 c32.xl instances. When we switched over to Elixir, we could run on 15.
Would be curious to know how they tried to optimise the Java stack.
Because on every benchmark I've seen the JVM is faster in every which way than Elixir. Except for memory where often people will over-provision the JVM rather than look at where their code might be over-allocating or leaking.
The advantages of Elixir are not performance-related.
There is a lot of focus on raw performance on web-related services, when in reality most of their running time is spent waiting for IO. If there are two things the BEAM excels at, they are IO and turning almost any problem into half a dozen processes that are scheduled and run in parallel, if not geographically distributed, with 1/50th the effort of any other language.
We live in a world with 32+ core CPUs. If your load is not spread uniformly all over those cores, you're losing a ton of performance. Handling requests over separate threads, like 99% of languages do, still isn't enough if all the business logic runs on the same thread.
I'm currently writing a web crawler in Elixir, and it is easier to design it so every request is done and processed in parallel, than to write a naive sequential one you'd do in any other language in half a day.
Though if people sing a language's praises on a single point consistently over decades, like they do the BEAM on this point, it's usually not without merit.
They wrote it from scratch with the benefit of all the knowledge they had gathered after running the old system for years. A 2X improvement would not be surprising to me, even if they had rewritten it in the same language.
According to others in this discussion they also made architecture changes (DB, Kafka etc.). Do we know if that improved the performance?
There is no objective way we can tell if Elixir had any performance impact. It could have been due to the rewrite, the architecture change or a combination of both.
Elixir/BEAM's (Erlang Virtual machine) frugality isn't just theoretical; it's got real-world creds. Originally tailored/optimized for 1980s telecom switches (a fleet of single core extremely low powered machines.) Fast forward, and you've got a setup that's less demanding on your A/C and optimizes multi-core usage like a champ. It utilizes the same concurrency abstractions whether it's 2 cores across two machines or 64 cores on the same machine; it makes no difference to the BEAM.
Take the hot code reloading and actor model-based concurrency as a prime example. It's like getting AWS-level functionality without the steep bill for a lot of companies.
Though, I gotta admit, it used to be a hard sell for CPU-heavy workloads, especially number crunching. But Elixir is stepping it up with their Nx library, so that's changing.
Examples of companies cashing in on BEAM's efficiency:
Bleacher Report: Went from 150 servers down to 5. No joke.
Discord: Handles millions of real-time users without breaking a sweat or the bank.
Financial Times: Their content recommendation engine got both efficient and cost-effective.
Change.org: More petitions, fewer servers.
Podium: A million SMS messages a day and didn't have to massively scale hardware.
> a fleet of single core low, extremely powered machines.
what are "extremely powered machines"?
> It's like getting AWS-level functionality without the steep bill
which part of AWS functionality? load-balancing Beanstalk-style is free. AWS compute is not free, but neither is compute free with Elixir or whatever stack you run.
Totally get your point about AWS having free-tier services and compute never being free, regardless of the stack. My point wasn't that BEAM offers free compute, but rather that its inherent features can sometimes make certain AWS services redundant. For instance, Elixir has built-in fault tolerance with its actor model and supervision trees. This means that even when a process fails, it gets rebooted automatically without messing up other processes—kind of like what you'd use Auto Scaling and backup services for on AWS.
Similarly, distributed Erlang allows Elixir to run across multiple nodes. This could cut down the need for extra AWS instances or orchestration layers like Elastic Beanstalk. And when it comes to deployments, Elixir's hot code swapping can simplify what might otherwise require rolling updates or blue-green deployments with Elastic Load Balancers in the AWS ecosystem.
On the concurrency front, Elixir is designed for handling a high number of users and tasks simultaneously, which might reduce your reliance on EC2 or Lambda. Phoenix, Elixir's web framework, even has real-time capabilities baked in, so you don't need extra services like AWS WebSockets for that.
Finally, Elixir's actor model can serve as an in-memory message queue, which could potentially negate the need for something like AWS's SQS. So, while you're still incurring compute costs, the need for additional AWS services could be lessened, thereby simplifying your architecture and perhaps lowering overall costs.
Not to just toss around anecdotes, but I once rewrote an email service in Elixir for a company from a literal sketch on a piece of printer paper describing what their old system did. The new service ran on 1 server vs half a dozen and was both faster at crunching through their mail queue and used far fewer resources. Some tasks are embarrassingly parallel and the BEAM excels at those tasks. Sure you might want features it doesn't have for certain systems, but for some things it really is the right tool.
Whatsapp took over the world running on Erlang/BEAM, with barely any servers and a few engineers. I honestly don't know what could be a better success story than that, but Discord has also done pretty well. The BEAM + Rust combination is looking scarily effective right now.
I'd be willing to accept the argument that Whatsapp happened to have assembled an uncommonly good team, but it is a signal.
Yeah the BEAM with some Rust NIFs is a great combo. I'd definitely consider it in the future for many types of problems and anything involving a HTTP or GraphQL interface.
If I had said 1/4th the effort would that have invalidated my argument? I pulled that figure out of my arse from experience. YMMV.
You'll note I'm not selling anything here, and no one is paying me a commission.
A junior that got "swindled" by my claims and spends a weekend learning Elixir becomes a better programmer and earns another feather on their cap. How tragic.
Junior devs, if you want to become a senior greybeard like me, learn anything that tickles your fancy, and ignore anyone that says it isn't worth your time. Even learning COBOL will make you a better programmer. I can only promise that Elixir is more fun than COBOL.
Erlang and BEAM were designed to handle Ericsson's telephony services. Piping-lots-of-stuff-in-parallel in fault tolerant fashion is the main use case around which they were designed.
To all junior developers--this developer doesn't know what he is talking about.
Erlang and BEAM, the things underlying Elixir, were specifically engineered to be used in a reliable, distributed fashion including a gigantic amount of idioms and support that no other language comes close to.
Erlang's bit syntax is hands down the best byte stream serialization that still exists--both from a perspective of performance as well as expressiveness. OTP is a documented set of idioms and behaviors for building systems that are meant to be highly distributed and deal with failure gracefully. These include behaviors like in-place upgrades, supervisors which shut down and restart failing processes, error delegation, etc.
Erlang was meant for genuine "five nines" reliability. No other language comes close (maybe Ada does--I'll let their proponents chime in about that).
You can do that easily in modern Java--even for older JVMs, tools like Netty and later Vertx have been around forever. Or in Node, even more easily.
Elixir/BEAM do have some benefits that are worth considering for many projects. But they absolutely are not special in this regard, and that's the junior-developer trap about which the person to whom you replied was referring.
You may be enamored with the "nothing new under the sun" idea that "Turing complete is Turing complete, anything you can do in Python you can do in Brainfuck" but as someone who has written code professionally in about a dozen languages, no, you can't just as easily get the same kind of parallelism out of Java as you can Elixir. To assert otherwise is factually false.
Is it possible to get to the same result? Yes. It is not, however, anywhere in Elixir's ballpark of "easily". Do not discount the power of language-level, not just support, but encouragement. Especially when you're working with junior devs, if "the right thing" and "the thing the language wants you to write" are not in alignment, everything is much, much harder. Erlang and Elixir actively encourage easily parallelized code. Java actively encourages tangles of objects.
Java, kind of, with Akka or similar, although even with that one must always be aware of blocking. Loom should help. Node: not really, unless something dramatic has changed. Using 32 cores is going to require 32 separate Node (OS level) processes, and you're on your own for providing communication between them, plus callbacks aren't nearly as intuitive as the BEAM process model (think green threads).
I mean sure, but given the topic at hand is getting performance out of your 32-core server, I'm not sure that's a super relevant observation. "I've been given way more hardware than I need" is an entirely different problem I think most of us would be happy to tackle.
I'm a seasoned Java and Node developer, but have never touched Elixir/Erlang. Could you spell out for me the benefits Elixir provides over Java concurrent code? Is it actually a performance gain, or simply a nicer syntax? I am a bit confused by the claims of this post and of the earlier comment. Thanks a lot
So, when you're working with Elixir or any language on the BEAM VM, you're in a world where data is immutable and processes are isolated. It's like having Akka's actor model but at the VM level, so it's super integrated.
The BEAM VM itself is a different beast compared to the JVM. It's more like its own mini-OS designed for real-time multitasking. Each process has its own garbage collection, and it's all non-blocking. So if one process goes belly-up, it doesn't take the whole system with it. Imagine a network of telephone switches; if one gets zapped by lightning, the rest keep chugging along. That's the level of fault tolerance Erlang and BEAM were designed for.
Now, speaking of fault tolerance, let's talk about how easy it is to mess up an Akka system if you're not careful. Say a new Java dev joins your team and doesn't get Akka's actor model. They might introduce shared mutable state between actors, which is a big no-no and can lead to all sorts of race conditions. Or they might do something like put a blocking operation inside an actor, which can hog resources and mess up the whole system's performance. Akka's great, but if you don't follow its principles, you can still shoot yourself in the foot.
So, the beauty of Elixir and BEAM is that a lot of these good practices are enforced by the VM itself. You get that fault tolerance and concurrency baked right in, without having to rely on every engineer knowing all the best practices.
I recently saw a talk on Youtube about "structured concurrency" in Java. It looked pretty interesting. But it seemed to me the way to achieve parallelism is by starting on a procedural code flow and as you come to a part that can be parallelised, you split into a bunch of tasks in a scope and that scope will monitor how those are executed. Once the results are accumulated, we go back into the procedural flow. This is similar to what is done in go IMO and is a pretty good technique.
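The "split into tasks inside a scope, then rejoin the procedural flow" shape can be sketched in Python with a thread pool used as the scope. This is only an analogy, not Java's structured concurrency API; `fetch_price` and `total_price` are hypothetical names, and the fake "work" is a stand-in for an IO call.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_price(item):
    # Hypothetical stand-in for a slow, independent lookup (e.g. an HTTP call).
    return len(item) * 10


def total_price(items):
    # Ordinary sequential code up to here.
    with ThreadPoolExecutor(max_workers=4) as scope:
        # Fan out: each item is handled by a task owned by `scope`.
        prices = list(scope.map(fetch_price, items))
    # Leaving the `with` block waits for every task, so no task outlives the scope.
    # Back in the sequential flow with all results accumulated.
    return sum(prices)


if __name__ == "__main__":
    print(total_price(["a", "bb", "ccc"]))  # 60
```

If any task raises, the exception surfaces when its result is consumed inside the scope, which is the part that loosely mirrors the "scope monitors its tasks" idea.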
In Elixir, on the other hand, you could create a module which is like a server process. You can start this server in a procedural flow, or you can "connect" it to a supervisor by giving it the startup information needed and a strategy to be used on how to restart the process in case it crashes for some reason.
A client process (or processes) with its own module can then send messages to the server which will handle the incoming messages in its inbox sequentially. If you squint at it from an angle, modules look like classes in that they provide a way to separate code logically.
This way of doing concurrency takes getting used to and has a higher initial learning investment. But it feels cleaner and is less prone to user errors. In Go for example you have to be careful of closures and shadowing which will result in shared memory and hard to debug errors, even though the initial investment in learning Go is much smaller.
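For readers who have not seen the model, the "server process with an inbox" idea described above can be imitated in Python with a thread and a queue. This is a rough sketch of the shape only: it has none of the BEAM's isolation or supervision, and `CounterServer`, `cast`, `call` and `demo` are names invented for the example (loosely echoing GenServer terminology).

```python
import queue
import threading


class CounterServer:
    """Toy 'server process': owns its state and handles one message at a time."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # only ever touched by the server thread
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while True:
            message, reply_to = self._mailbox.get()
            if message == "increment":
                self._count += 1
            elif message == "get":
                reply_to.put(self._count)
            elif message == "stop":
                return

    def cast(self, message):
        """Fire-and-forget send; use for "increment" and "stop"."""
        self._mailbox.put((message, None))

    def call(self, message):
        """Send and wait for a reply; only "get" replies in this sketch."""
        reply_to = queue.Queue(maxsize=1)
        self._mailbox.put((message, reply_to))
        return reply_to.get()


def demo(clients=4, messages_per_client=100):
    server = CounterServer()

    def client():
        for _ in range(messages_per_client):
            server.cast("increment")

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    total = server.call("get")  # queued after every increment, so it sees them all
    server.cast("stop")
    return total


if __name__ == "__main__":
    print(demo())  # 400
```

Because only the server thread touches `_count`, the clients need no locks; the mailbox serializes everything.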
When a person makes a claim, especially such a ridiculous one, it is perfectly valid to outright reject that claim without any argument. Why? Because no argument was provided in the first place.
I would say that claiming it provides better results with 1/50 the effort needs substantiation and was born from a position of hype, yes.
The go runtime has similar capabilities as the BEAM runtime when it comes to concurrent workloads. Go has the benefit of being a typesafe compiled language which gives it speed benefits. But using either one of them instead of Java is probably going to be a huge win for most teams on concurrent workloads.
> The go runtime has similar capabilities as the BEAM runtime when it comes to concurrent workloads.
Only if you think that the BEAM is similar to being able to easily spawn a function on a separate thread and having channels.
Last I checked, goroutines had no mailboxes, supervisors, process monitoring, registry, and its scheduler has a much smaller scope and featureset than the BEAM's.
I swear it's obvious when people comment about the Erlang ecosystem without having really used it for anything.
Exactly, you've hit the nail on the head. While Go's runtime does offer some nice concurrency features, like goroutines and channels, it's just not on the same level as BEAM when it comes to a comprehensive approach to fault tolerance and system resilience.
BEAM's got this whole ecosystem built around it, right? Mailboxes, supervisors, process monitoring, and a registry—these are all first-class citizens in the Erlang world. And let's not forget the scheduler; it's like comparing a Swiss Army knife to a simple pocket knife when you look at BEAM's scheduler next to Go's.
It's kinda funny when people talk about BEAM and Erlang as if they're just another runtime or language. (it's more OS like than traditional VM like) They're really more like a whole philosophy of how to build robust, fault-tolerant systems. And if you haven't actually built something substantial with it, you're likely to miss out on what makes it so special.
I've written non-trivial code in both Erlang and Go. Beam doesn't have anything to do with supervisors that's an OTP thing. Mailboxes can be simulated with channels and there are conceptual similarities. Process monitoring and the registry are unique to BEAM, it's true, and they have useful properties to leverage. But they aren't core to how the BEAM handles concurrent processing. The schedulers work on a similar conceptual mechanism for the programmer when writing concurrent code, with differing optimizations in each.
In my experience I think my statement still stands.
> Beam doesn't have anything to do with supervisors that's an OTP thing.
The key VM feature you're missing is process isolation. Without it, supervisors are not really possible - you can implement something that vaguely looks like them, but it won't provide the fault tolerance guarantees.
Imagine for example an Erlang process that leaks files. At some point it hits an EMFILE, gets killed, and the supervisor restarts it. The system will then go back to operating normally.
Now imagine a Goroutine doing the same. It hits EMFILE, exits, and the supervisor restarts it. This doesn't help anything: it just hits the same error and the system is unusable. There's no way for the VM to guarantee cleanup when a Goroutine exits, because it doesn't isolate Goroutines and track which one owns which resources.
Links and monitors are tools to extend the same behavior to user-managed resources like DB connections and so on. The responsibility for cleanup if a process crashes while holding a DB connection falls on the DB connection library, not on the error handling inside the crashing process.
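The EMFILE scenario above can be modelled with a toy simulation. Everything here is hypothetical: `FD_LIMIT`, `ResourceTable`, `leaky_worker` and `run_supervised` are made-up names, and the "resource table" is just a counter. It does not reflect how Go or the BEAM actually track resources; it only shows why a restart helps when the dead worker's resources are reclaimed and does not help when they are shared.

```python
FD_LIMIT = 3  # hypothetical stand-in for the OS file-descriptor limit


class ResourceTable:
    """Counts 'open files'. Shared by everyone unless the runtime isolates workers."""

    def __init__(self):
        self.open_files = 0


def leaky_worker(table):
    # Opens a "file" per job and never closes it.
    if table.open_files >= FD_LIMIT:
        raise OSError("EMFILE")
    table.open_files += 1


def run_supervised(jobs, isolated, max_restarts=5):
    """Run `jobs` jobs, restarting the worker on failure. Returns (jobs done, restarts)."""
    table = ResourceTable()
    done = 0
    restarts = 0
    while done < jobs:
        try:
            leaky_worker(table)
            done += 1
        except OSError:
            if restarts == max_restarts:
                return done, restarts  # supervisor gives up
            restarts += 1
            if isolated:
                # The dead worker owned its resources, so they are reclaimed with it.
                table = ResourceTable()
            # Not isolated: the leak lives in shared state and survives the restart.
    return done, restarts


if __name__ == "__main__":
    print(run_supervised(10, isolated=True))   # (10, 3)
    print(run_supervised(10, isolated=False))  # (3, 5)
```

With isolation, each restart buys another three jobs and the run completes; without it, every restart hits the same error until the supervisor gives up.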
> Beam doesn't have anything to do with supervisors that's an OTP thing
Are you aware Erlang was designed to be a VM(the BEAM), a language for that VM(Erlang) and a comprehensive set of patterns and infrastructure for fault tolerance(OTP)?
Those naive sequential services don't really exist in prod though.
It would be a dark day to discover my AWS t1.337metal was blocking 63 cores on I/O when nearly every modern lang has a wealth of async functionality just waiting to be exploited.
IO lists, which are the foundation of anything that builds a string incrementally, are managed with vectored IO syscalls out of the box (readv/writev), which you'd have to handle yourself in most other languages, or resort to allocating and endless memory copying which Erlang is able to avoid.
>Handling requests over separate threads, like 99% of languages do, still isn't enough if all the business logic runs on the same thread.
I mean, if your business logic is inherently serial it makes little difference if you run it in a single thread or if each serial segment in between IO requests is run in a different thread. One way or another it's not going to get parallelized.
All of what you mentioned applies equally well, if not better, to the JVM. Especially now that it has virtual threads. The article does not go into details about the implementation of the Java program. My guess is that they were not using async code, or had not profiled it to see what the bottlenecks were. Rewriting is always fun, especially in the latest flavor-of-the-day stack.
It’s not even remotely similar. Node’s cluster is just bog-standard OS subprocesses running their own event loop.
To spread work over multiple cores with the cluster (or even worker-threads) modules you have to do so explicitly and manually. It’s essentially the same model you get with pthreads, or Java, or Python.
BEAM is a completely different model, the “processes” are internal tasks, which the runtime will schedule appropriately over available cores (the scheduler has been multithreaded for 15 years or so), spawning processes is as common as spawning tasks in JS, except each of these is scheduled on an SMP runtime.
BEAM will literally send processes between machines (distributed Erlang) more easily than Node will balance load over cores.
At the Abstractions conference in Pittsburgh in 2016, Joe Armstrong was hanging out in the hallway with his swag bag, just a regular engineer complaining about Jira and his manager (he had no interest in managing) and asking people's opinion about the schedule and lunch places. We were looking at the program for the next sessions and someone said there's a talk about ideas for adding concurrency features to Node, and we said great, let's go, and a few of us went to go stand in the back of that one.
On a number of points the presenter was proposing, like message passing, immutable structures, and process tree management, the presenter would say, "but Erlang's had this feature for many years..." and the room would laugh and turn around and acknowledge Joe. He was modest but the validation must have been nice.
I'm unfamiliar with BEAM. How does this compare to goroutines? Obviously they won't migrate between machines, but concurrency feels very easy and ergonomic in Go.
Go routines were pretty directly inspired by Erlang processes, so in terms of primitives I'd say they are very similar, aside from go lacking the distributed features you already mentioned.
Where Erlang/Elixir add value beyond goroutines is what OTP (kind of the standard library) provides on top. Pre-built abstractions like GenServer for long running processes, GenStage for producer/consumer pipelines and supervision trees for making sure your processes haven't crashed and restarting them and their dependents if they have.
At the most basic level it's a bit similar: there is a multithreaded runtime which schedules work in userland. Green threads if you will.
The devil, however, is once you go beyond the trivial.
First, the units of work operate completely differently, BEAM follows the actor model rather than CSP, meaning every actor has an address / mailbox and the actors can send one another messages through this, any actor can send any other actor (they're aware of) messages, and actors can process their mailboxes however they want.
But BEAM is also completely strict and steadfast about its actor model: its actors are processes, each has its own stack but also its own heap, when one actor sends a message to another, the message content gets copied (/ moved) from one stack or heap to another, processes do not share memory[0]. Incidentally this is what makes distribution relatively transparent (not entirely so, but impressively still): if everything you do to interact with others is send asynchronous one-way messages, it doesn't really matter whether they're in the same process (OS), in a different process (OS), or on a different machine entirely.
The reason BEAM works like that however is not for any sort of theoretical purity, instead it is in service to reliability, which is the second big difference between BEAM and Go: BEAM error handling is inter-process, not intra-process. BEAM's error handling philosophy is that processes encounter errors, for all sorts of reasons, and when that happens you can't know the entire state of the process, so you just kill it[1] and it should have one of its buddies which is linked to it, and whose job is to handle this and orchestrate what happens.
BEAM has built-in support for linking and monitoring. In the context of erlang, linking means that if one process dies (crashes), the other is sent a special message which also kills it. This message can be received as a normal message instead, in order to handle the crash of your sibling (in which case you receive various metadata on the crash). Monitoring means you just want to receive the crash signal. The reason you might prefer linking to monitoring is that if you're a manager of other processes and you crash, you probably want all the processes you manage to die as well. Which doesn't happen with monitors.
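The difference between linking and monitoring described above can be shown with a small toy model. This is a simulation of the semantics only, not how the BEAM implements them: `Proc`, `link`, `monitor` and `crash` are invented for the example, `trap_exit` stands for "receive the exit as a normal message", and details such as normal exits are ignored.

```python
class Proc:
    """Toy process: just a name, a liveness flag and a mailbox."""

    def __init__(self, name, trap_exit=False):
        self.name = name
        self.alive = True
        self.trap_exit = trap_exit
        self.links = set()      # bidirectional
        self.monitors = set()   # processes watching this one (one-directional)
        self.mailbox = []


def link(a, b):
    a.links.add(b)
    b.links.add(a)


def monitor(watcher, target):
    target.monitors.add(watcher)


def crash(proc, reason):
    if not proc.alive:
        return
    proc.alive = False
    # Monitors are only notified; they never die because of it.
    for watcher in proc.monitors:
        if watcher.alive:
            watcher.mailbox.append(("DOWN", proc.name, reason))
    # Linked processes die too, unless they receive the exit as a message.
    for peer in proc.links:
        if not peer.alive:
            continue
        if peer.trap_exit:
            peer.mailbox.append(("EXIT", proc.name, reason))
        else:
            crash(peer, reason)


if __name__ == "__main__":
    manager, worker = Proc("manager"), Proc("worker")
    link(manager, worker)
    crash(manager, "boom")
    print(worker.alive)  # False: the managed process dies with its manager
```

With `monitor(manager, worker)` instead of `link`, crashing the manager would leave the worker running, which is the case the comment warns about.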
That is because BEAM has its origins in telecommunications, where reliability means redundancy, and oversight. So the way you structure an application in beam (often) is a tree of processes, where many of the processes have oversight of a subtree, handle fuckups (maybe by restarting, maybe by something else), serve as entry point to their workers, etc..., and if one of the leaves dies that's just a signal sent to its parent, which might just die and signal its parent, which will handle it somehow. This is the design principle known as the supervision tree: https://www.erlang.org/doc/design_principles/des_princ#super...
The third big difference is more philosophical and has to do with code reuse: because of (2) above, a lot of erlang / beam / otp is communicating between processes in a subtree, moving messages between them, exit signal strategies, etc... which leads to behaviours (https://www.erlang.org/doc/design_principles/des_princ#behav...), which are pretty alien because they're more or less mini frameworks, which not only are two things which are usually put opposite one another, but many people don't really want to hear about frameworks.
But that's what they are: behaviours are the encoding of entire prototypal lifecycle and communication patterns, where the user / implementer of the behaviour fills in the "business" bits.
Oh yeah and beam comes with an entire suite of introspectability tooling, which is kinda linked to (2): all the oversight thing ends up at people, so you can connect to a runtime and look at and touch all the things, more or less.
BEAM is a bit of an OS running on an OS really, probably closer in philosophy to the image-based languages of the 80s. In part because it is a language from the 80s. Not quite image based though, or in an other way designed to go even further and just run forever, as it includes built-in support for hot code reloading and migrations (though from what I remember that's not super great or fun, it was quite messy and involved to actually do properly).
By comparison to all that, goroutines are just threads which happen to be cheap so you can have lots.
[0] kinda, some objects live on shared heaps as an optimisation but they're immutable and reference counted so it's an implementation detail.
[1] and here if actors share any memory, an actor might be dying in the middle of updating or holding onto shared state, which means its error corrupts other actors touching the same state
Async/await and any module won't save you from global state, data races, and the fact that you're running on an imperative language with mutable state. Additionally, the ergonomics are not the same, so even if you could replicate the BEAM in Node or any other language, you'd have to be a masochist to do it.
Lastly, the concurrency primitives are built into the entire runtime, not a set of external libraries maintained by whoever, which might be incompatible with other libraries you might want to use.
I think Node is still single core by default? Elixir (or rather the Beam) will handle core utilisation so if you start a load of Elixir processes they’ll be spread across multiple cores.
Node itself has never been single-threaded. The execution model for JavaScript is single-threaded, so there’s no working around that, but libuv uses threads to build async IO on top of blocking operations.
Then there’s worker threads, which are pretty similar to web workers AIUI, that give you parallel execution for cpu-intensive work.
Obviously, though, none of these facilities compare with BEAM
It doesn't effectively do that. It does about 10% of that, ineffectively.
Instead of running a single VM with full knowledge of how to run lightweight processes, designed to fully take advantage of modern multi-core CPUs with multiple guarantees enforced by the runtime you have multiple single-threaded VMs awkwardly communicating with each other over a bolted-on API
Yeah but then you have to handle a lot of the synchronization of memory. It is hard to make you realise what is possible on the erlang vm without having tried it.
In particular, it is preemptive. This... Makes a lot of stuff easier.
NodeJS was designed to run single-threaded. Sure, you can use cluster module to run it with multiple but there's memory overhead and the ergonomics of sharing state and message-passing is nowhere near GenServer. Not to mention all the other benefits of BEAM.
But about the specifics of implementing a web-crawler, a NodeJS way to implement it would be to parallelize using lambdas.
They rewrote, which is known to help too. Going from 30 to 15 instances is not bad but it's very likely that a Java-to-Java rewrite would have helped go down too.
The big one however is going from 200 Python servers to 4 Erlang ones: a 50x reduction is quite something and a Python-to-Python rewrite would not have achieved a 50x gain:
> All this is possible because Elixir, and the Erlang platform underneath, are fundamentally designed for always-online software with many users. When you use the right tool for the job, the benefits are clear.
In short, the Elixir system is doing something completely different. And they are also not counting some new pieces like the database cluster and Kafka-based aggregators, etc.
> The big one however is going from 200 Python servers to 4 Erlang ones
If your rewrite or refactor gives you a 10X in performance, that's not an optimization, but a bugfix. Unless you are a researcher who has just found a revolutionary algorithm.
It is a matter of mindset. I work on software with soft real time, storage capacity and power consumption constraints, there is a constant flux of "small" feature requests and I cannot even "throw hardware at the problem" at will as they apparently did, until it became unquestionable that the problem had to be fixed properly.
I'd recommend adopting this mindset even if one doesn't have those constraints - without erring on the side of "faster than necessary" though - because it's usually difficult to assess how much time and money are leaking because of the inefficiencies one accepts in the name of questionable reasons. They were sort of lucky to have a bill to show it to them.
A proactive stance in this regard goes a long way.
> They rewrote, which is known to help too. Going from 30 to 15 instances is not bad but it's very likely that a Java-to-Java rewrite would have helped go down too.
A java-to-java rewrite could have been as much if not more painful than a java-to-elixir rewrite, especially if the service is highly concurrent.
Rewriting synchronous code as asynchronous in Java is a lot of work and not fun at all IMO.
Given the recent arrival of virtual threads in Java 21 this may not be necessary any longer but at the time I think it was a perfectly reasonable choice.
> it's very likely that a Java-to-Java rewrite would have helped go down too.
it depends.
The Elixir programming paradigm might be one where you are able to more easily write efficient, but still highly concurrent code, whereas it would take more work to do the same in Java.
Elixir / Erlang does open up a lot of paradigms. However it'll be interesting to see how Java fares with its new virtual threads. There was already Akka so it should happen fairly quickly.
Still I'd prefer Elixir. The BEAM VM just runs lighter.
Flagged for being silly flamebait. The Python projects you have personal experience with might have been poorly run but that’s not representative of the language, and it’s not going to lead to a conversation where anyone learns something.
The syntax and semantics of the language change in non-reverse-compatible ways between every minor release. This is independent of project management.
As an example, between 2.3 and 2.5, the syntax for package variables was changed (and then the semantics were changed between 2.5 and 2.7). There is nothing you can do as a python user to ameliorate the impact of such changes other than to not use those language features.
Can you explain how "managing my project better" would have allowed me to avoid the impact of this change?
> The syntax and semantics of the language change in non-reverse-compatible ways between every minor release.
Even accepting this and assuming the average project was bit by every single one, the release cadence for minor versions is about once per year (recently, almost exactly that, in October), and minor versions are supported for 5 years, so this would justify updates every year if you were a maximally eager adopter, or every five years with a maximally conservative while only using in-support versions approach, or somewhere in between for less extreme cases, not every three months.
> As an example, between 2.3 and 2.5, the syntax for package variables was changed (and then the semantics were changed between 2.5 and 2.7).
2.7 was released 13 years ago. Why would you reach that far back for a relevant example?
That’s three years, not months, and consider that it’s possible that how we develop software as a field might have matured over multiple decades. The edge case you’re referring to didn’t even affect most packages in the 2000s so it’s quite a stretch to say that something which happened in 2006 embodies how Python is developed now.
Don't take this the wrong way, but I don't really care about how your 2.3 apps were or were not affected. It's not my job to maintain them. The apps I cared about were the apps I had to maintain. And it turns out that if you have tens of thousands of lines of python, you eventually hit a problem that needs to be fixed.
So sure... if you have a 200 line program, maybe you won't hit any code whose semantics have changed. Large apps will (and still do.)
So say “when I worked on a Python project many years ago, we had a lot of problems with one release” – people might find it weird that you’re bringing up old history but nobody is going to doubt that you personally had an unpleasant experience. It’s okay not to like Python!
What’s getting criticism are these huge sweeping claims like “you rewrite your code every three months” or “syntax and semantics of the language change in non-reverse-compatible ways between every minor release” which you have been completely unable to support or the attempts to dismiss anyone else’s different experiences as somehow less valid.
I don’t tend to rewrite Python code any more often than is needed due to feature changes or occasional refactoring to pay off some maintenance friction caused by a design that has in practice turned out to be suboptimal.
When you move from one minor rev of python to the next, some language feature changes (either syntax or semantics or features no longer work.)
For instance... if you use async io in 2.x, the debugger stops working. Between 2.3, 2.5 and 2.7, the syntax of package variable scoping changed and then the semantics changed from package to class variables.
If you used a feature like package variables in your code in 2.3, that code would not work in 2.5. If you fixed it in 2.5, the semantics changed so that if you defined a package variable according to the 2.5 syntax, but it was defined within a class, it became a class variable.
> When you move from one minor rev of python to the next, some language feature changes
Even if there was a breaking change affecting your project in every minor version, to have the cadence of backward-compatibility-induced changes you suggest, you’d have to be switching Python versions forward about four times as fast as they are released. If you started with the oldest in-support version at the beginning of the project, you could only sustain that for about a year and a half before running out of versions to switch forward to.
> If you used a feature like package variables in your code in 2.3,
Then you are probably out yelling at kids to get off your lawn; 2.3 has been out of support for 12 years.
That's great that your 5 line scripts don't use features that change between revs, but people who have to maintain large python apps have to book time to pore over the latest language version's definition, update our linting tools to find where in the codebase we use a deprecated feature, change the code, update the tests, retest and redeploy.
Not to mention getting a version clean dependency closure. Though we have forked and rewritten some of the non-standard modules we're dependent on to be less broken and to give credit where due, it does seem like standard modules supporting python3 are version clean unlike python2 and 1.6.
The heuristic we use is about an hour of dev time per 750 lines of code so our 70,000 line legacy python app takes somewhere around 100 dev hours per minor revision upgrade.
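For reference, the arithmetic behind that estimate can be written out as a small sketch. The two constants are the figures quoted in the comment above; the function name `upgrade_hours` is made up for illustration.

```python
# Figures quoted in the comment above; nothing here is measured.
LINES_PER_DEV_HOUR = 750   # heuristic: lines reviewed per dev hour
CODEBASE_LINES = 70_000    # size of the legacy Python app


def upgrade_hours(lines: int, lines_per_hour: int = LINES_PER_DEV_HOUR) -> float:
    """Estimated dev hours to audit a codebase for one minor-version upgrade."""
    return lines / lines_per_hour


hours = upgrade_hours(CODEBASE_LINES)
print(f"{hours:.1f} dev hours per minor upgrade")
```

The heuristic works out to a little over 93 hours, which matches the "somewhere around 100" in the comment.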
Compare this to a legacy C application written in 1989. How do we port it to the latest version of C? We just copy it and compile. That community went to a lot of trouble to ensure code written in previous versions of the language still worked. The last time I heard of a language feature being deprecated was in 2011 (though I think gcc recently undefeated support for trigraphs.)
In my opinion, your python baby is ugly. It was ugly in 1.6. It was ugly in 2.x. And it remains ugly in the 3.x era. You should come to terms with the fact that some people just don't like python.
> You should come to terms with the fact that some people just don't like python.
Nobody cares about that - it’s a given that any language will have fans and detractors and most of us are mature enough to focus on what works for the projects and teams we’re part of.
What we’re objecting to is portraying your experience as a global truth. If you don’t like it, sure, but unverifiable hyperbole isn’t contributing anything but noise. This could be your opportunity to learn what tools or practices people use or consider whether the way you want to use the language is at odds with the core developers’ view.
You’re the only one doing that. Nobody here has questioned that you had an unpleasant project to work on - we’re only arguing that it’s not representative of the experience now (your initial hyperbole) or even 15 years ago when the 2.3-2.5 transition would have happened. Many of us have worked on larger codebases in that timeframe with very different experiences.
They threw DB[s] and Kafka into the mix. Python would get them the same net gain, if not more, with less dev cost. Python's I/O workloads perform on par with Go/NodeJS (See FastAPIs 3rd party quarterly benchmarks as an example).
If they rebrand Python to Metal or some other name, people would recommend it left and right. It's just suffering from bandwagon criticism. Yet it remains one of the top 3 languages for years, covering several domains.
Python is the second best language for everything. Even with its warts (which any language gets after 30 years), it's a very solid, defensible choice.
However, it does not have a great concurrency story relative to languages that were built concurrency first (Go, Erlang), and it's fair to acknowledge that.
I think it's more important to look at the re-architecting than the different language. Second, I think certain architectures - like the actor model described in the article - work better and more intuitive if you use a different language.
That said, I'm sure a 2x performance improvement could've been achieved in Java as well if they did a re-architecture. They could also have made a lateral move and gone to a different JVM language, like Scala, which also has an actor concurrency model + accompanying syntax.
> certain architectures... work better and more intuitive if you use a different language.
This is the key point that people miss when pretending that languages are interchangeable. The entire point of making a programming language is to make certain types of ways to solve problems easier to express. This constitutes a language's "pretty path". By providing such pretty paths, languages necessarily make less desirable paths, which will be painful to slog through.
If you try writing a functional pipeline in Java, you're going to have a much worse time than doing the same in Elixir. If you try to do Object-Oriented class towers in Scheme, it's going to be painful. Etc, Etc. You can write a Rust program and a C program that compile to the exact same binary, but I can put a whole stack of cash on which one's going to be easier.
This is an important point. Sometimes when you need to do a major rearchitecture of a system it can help to choose a language that is more appropriate to that architecture. The Elixir/Erlang ecosystem has a better story for Actor Model development so it makes sense to choose them for the new architecture. It depends on the team and the specifics of the new architecture because the devil is always in the details. It may be that the new language isn't enough of a win to justify the switch and sometimes the win from the new architecture is big enough that sticking with the current language makes sense.
But a knee jerk response of: This is mostly just good because they rewrote/rearchitected it, ignores the benefits of using a language or technology that fits the new architecture better.
That’s exactly the thing? Why would you bother optimizing the code, looking for overallocations and leakage, and tweaking parameters, when you can just take a friendlier language for the same benefits?
I think the question is whether they significantly changed the architecture at the same time. For example, reading the description of the Python migration sounds like they applied a lot of experience which would have benefited any language, and micro-optimizations like what you described would have been a rounding error on those larger changes:
Yep. We got a 10x improvement on throughput for our backend runtime, which was in Java, by moving to a better architecture for performance hotspots.. using Java again.
In a rewrite with a different design/architecture, that new design typically accounts for most gains, rather than language.
A language may make some parts of that rewrite simpler.
I've gotten 100x improvement with no code change by just adding an index in the database table. An inexperienced developer might have blamed the database and insisted on moving to NoSQL because of "web scale". If they got the chance to rewrite it, they could have pointed to the performance increase as a proof that they were right.
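To make the index anecdote concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns, and the index name are all invented for illustration; the original comment does not say which database or schema was involved, and the sketch shows only the change in query plan, not any particular speedup.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's human-readable plan for a query (the 'detail' column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema, not taken from the thread.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)

LOOKUP = "SELECT id, total FROM orders WHERE customer_id = ?"

# Without an index the lookup has to scan every row of the table.
plan_before = query_plan(conn, LOOKUP, (42,))

# Same application code, same query; only the schema changes.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, LOOKUP, (42,))

rows = conn.execute(LOOKUP, (42,)).fetchall()
print("before:", plan_before)
print("after: ", plan_after)
```

The point of the sketch is that the query text and the calling code are identical before and after; only the plan the database picks is different.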
They really should teach benchmark training more widely in the industry. Even though I'm readily here to sing the praises of Elixir when warranted, nothing beats actually profiling end to end the workload(s) in question that need to be improved. Sometimes, it really is optimizing the database that matters most, like adding an index (or using window functions or stored procedures, as in many cases I've had in the past)
>Would be curious to know how they tried to optimise the Java stack.
Fairly safe to say not at all.
The JVM is extremely fast, very efficient and very scalable (you can write java code that scales linearly with available cores). If performance or scalability is a metric to care about, it is nearly impossible to outscore java. You can, with very skilled C/C++ developers, but it's going to be difficult to find those people and it'll be a lot of work. If you need extreme performance on a reasonable budget, you can't do better than java. I know java isn't hip anymore, so this is not a popular truth, but it is.
then I have to pay Azul for the faster JVM and I may still not sufficiently cut my server costs, or the cost of the JVM paces the reduction I have in server costs.
2 million a year is several developers compensation saved every year. It also opens the door to more savings down the road, potentially, as existing workloads may discover they can use the same approaches to reduce cost / overhead.
I was just adding that to the truth that normal Java could just do it. But if that is really not enough, then paying for a few Azul licenses in addition can save you time and money.
Elixir is slower than plain PHP according to the techempower benchmarks. I'm not even sure how that's possible but it is. By like a factor of 2 iirc. I'm not sure how elixir is that slow since it's compiled.
The techempower benchmarks are... Quite infamous in the elixir community.
Long story short, they are running in debug mode, badly written, not optimised, with bad OS level settings. And every time the community have tried to contribute fixes, the experience has been... Really bad.
So we stopped trying. If things have changed we could try again but ... We just wrote them off
As far as I can tell, you’re basing this off a single thread from a prior techempower round. The results were cleaned up in a subsequent round, but you are ignoring that.
It depends on what is actually being benchmarked though; if it's the simple JSON payload, more time will be spent on HTTP parsing (done in nginx for the PHP benchmark so really really fast) and some JSON parsing (done in a C library for the PHP benchmark so really really fast). Basically, how much PHP is actually being benchmarked?
Are you looking at a benchmark that compares real-world usage, or a microbenchmark like hello world or a small JSON payload?
PHP is interpreted and Elixir is compiled. Comparing them to Ruby and Python makes no sense as they are interpreted as well.
The fact that Elixir, a compiled language known for speed, is slower than PHP is surprising. As for "Most PHP libs are wrappers over C code": that's just not true. Most PHP libs are in PHP.
> Even old pre-7 PHP was much faster than Ruby, Python, and others.
No, PHP 7 was an impressive step forward in speed because it switched to better bytecode and object representations internally, but Python wiped the floor with most versions of PHP 5 for exactly the same reason. (PHP 5.5 with opcaching was roughly comparable.)
But this comparison overall is like trying to find the strongest two-year-old...
It's a ridiculous thing to say that any version of PHP is slow. You could run PHP5 today and it's still not slow by any modern standard. Even on the hardware of 15 years ago almost any script you throw at it finishes in single digit milliseconds.
How is that slow? I'd like to meet the developers that consider this slow.
I guess it depends on what the script is doing, but I'd consider that slow. Code I've worked on in Scala takes more like 50-60 microseconds per http request for a json CRUD type of thing (plus latency to wait for the db, but it can serve other requests during that time).
Because they are different languages with different VMs and different concerns. Made for different purposes. Number crunching is slower in Erlang VM (you can call C code though).
They did savings by re-implementing their services and attribute those savings to the new tool / programming language.
I wonder what the saving would look like if they chose another tool for the second / optimized system. I doubt it would differ much if they went with Go, Java or stayed with Python.
It shows that Python with Django is literally 40 times slower than the fastest framework. Python with uvicorn is 10 times slower.
The use of languages like Python and Ruby literally results in >10x the servers being used; which not only results in higher cost, but also greater electricity use, and pollution and carbon emissions if the grid where the data center is located uses fossil fuels.
Not to mention, dynamically-typed languages are truly horrible from a code readability point of view. Large code bases are IMO difficult to read and make sense of, hard to debug, and more prone to bugs, without static types. I'm aware that Elixir is dynamically-typed, but it (along with JS) is an exception in terms of speed. Most dynamically-typed languages are quite slow. Not only do dynamically-typed languages damage the environment as they're typically an order of magnitude slower, they also lower developer productivity, and damage the robustness and reliability of the software written in it. To be clear, I'm in favor of anything that increases productivity. If Kotlin were 10 times slower, I'd be happy to pay that price, since it is genuinely a great language to work with, is statically typed, and developers are more productive in it. I'm not sure how Elixir mitigates the downsides of dynamic typing (maybe lots of 'type checks' with pattern matching?), but it would definitely be super-nice if a well-designed (Kotlin or Haskell like?) statically-typed language targeting the BEAM existed...
Since you mentioned them I think it's worth telling people to take those benchmarks with a larger grain of salt. They've become such a pissing contest that I don't know if they can be called "real-world". There doesn't seem to be much scrutiny of the implementations.
Take some Rust frameworks: they write out pre-computed header strings to the output buffer, completely bypassing what the framework's documentation recommends. Examples are actix[1] and ntex[2]. No one would ever do this in real life.
Now I like Rust, and it'd likely be some of the fastest even without these shenanigans (Axum and may-http don't do that, I believe). But I don't know if other languages/frameworks have benchmarks implemented in the same non-idiomatic way just to look better.
Hmm, the pre-computed header strings are definitely an interesting optimization, and do seem to be a bit non-idiomatic – but IMO, this is the sort of optimization the framework itself should try to do.
If the first several bytes of the header are going to be identical for many requests, a super-optimized framework would ideally memoize them and write them directly out, as this benchmark is doing.
(And if the framework itself is doing it, then any user of the benchmark would just inherit that optimization, and not have to resort to non-idiomatic optimizations...)
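A rough sketch of what "the framework memoizes the header prefix" could look like, in Python rather than Rust. Everything here is hypothetical: the header set, the `Server: example` value, and both function names are invented, and this is not how actix or ntex actually implement it.

```python
from functools import lru_cache

# Bytes that are identical for every response of this (hypothetical) endpoint,
# built once at import time instead of once per request.
STATIC_PREFIX = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: example\r\n"
    b"Content-Type: application/json\r\n"
)


def headers_naive(body_len: int) -> bytes:
    """Build the full header block from scratch on every call."""
    lines = [
        "HTTP/1.1 200 OK",
        "Server: example",
        "Content-Type: application/json",
        f"Content-Length: {body_len}",
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


@lru_cache(maxsize=1024)
def headers_memoized(body_len: int) -> bytes:
    """Reuse the static prefix and cache the result per body length."""
    return STATIC_PREFIX + b"Content-Length: %d\r\n\r\n" % body_len


body = b'{"message":"Hello, World!"}'
response = headers_memoized(len(body)) + body
```

Both functions produce the same bytes; the memoized one just does the string work once per distinct body length. A benchmark entry that hardcodes this by hand gets the same effect without the framework's help, which is the complaint in the comments above.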
Have you ever looked at the implementations behind the techempower benchmarks? There's NOTHING realistic in them. Those applications literally hardcode static content length header values to be faster. They are a pretty good show of how low you can go to squeeze out performance, but no sane person would write code like that.
> greater electricity use, and pollution and carbon emissions if the grid where the data center is located uses fossil fuels.
to
> dynamically-typed languages damage the environment as they're typically an order of magnitude slower
is quite a stretch.
Do dynamically-typed languages inherently damage the environment? Or is it the fossil fuels?
Not that the appeal to the environment matters, because later on we have this:
> If Kotlin were 10 times slower, I'd be happy to pay that price, since it is genuinely a great language to work with, is statically typed, and developers are more productive in it.
> Do dynamically-typed languages inherently damage the environment? Or is it the fossil fuels?
My opinion is that slow languages that use 10x the electricity, with no ROI for the 10x energy use, are bad.
High energy use, even if it's clean energy, implies a higher environmental toll. If a country were solely using nuclear and solar, higher energy use results in (1) more nuclear reactors constructed, and (2) more solar panels built. The manufacture and construction of both has an environment cost. Of course, with fossil fuels, the damage to the environment is potentially a lot worse.
> Not that the appeal to the environment matters
I don't think higher energy use is inherently bad. If we can improve the quality of life for human beings, then IMO a higher energy use is justified. I don't really believe in degrading our quality of life to lower our energy use.
My problem with many popular slow languages is that they have a negative ROI for the higher electricity cost. In exchange for 10x the energy use, you have a language that results in less-readable code (a serious issue), that causes more bugs / less-reliable software, etc. We literally get a negative ROI in exchange for 10x the energy use. Which is absurd and illogical.
If Hindley–Milner type inference had been more prevalent in the 1990s, I have a feeling dynamically-typed languages would never have taken off. We're moving back to static typing with mypy, TypeScript, etc., but I'm hoping we soon move away entirely from using slow languages for writing servers serving large numbers of users.
If we actually got something in exchange for it, it wouldn't bother me so much.
For the vast majority of projects this makes no difference. If you are at Facebook scale? Sure. But then, you do what they did, write a VM to speed things up.
Dynamic typing doesn't cost much more money on average and thankfully the cost of energy itself is a motivation for companies to do rewrites. If they are paying a lot in server costs and electricity then they do typically rewrite to reduce the amount of servers.
For companies they primarily need to worry about running at a profit and getting to market quickly which dynamic languages do extremely well, and the costs in electricity and carbon aren't very high when your scale is small.
> running at a profit and getting to market quickly which dynamic languages do extremely well
This (or similar variants of this) is an assertion that's commonly made about dynamically-typed languages, but I don't think it holds any water.
Less readable code (due to the lack of types) makes it a lot harder to add new features, and harder to debug code as well.
Several years ago, I briefly worked on a fairly large codebase at a startup that was written in Ruby on the backend, and CoffeeScript on the front-end. There were only around 60,000 active users. Yet, the dynamic typing made adding new features or fixing a bug a truly painful and fragile experience. Needlessly painful, and slow. It literally reduced developer velocity.
I think once you cross a few hundred lines, dynamic typing becomes a handicap rather than an advantage.
All of this doesn't even touch on the energy use. Which I'll admit is irrelevant to most companies. Server costs even for popular web/SaaS/etc tech companies are often a tiny tiny fraction of overall cost, with most of the company's annual operating cost being employee salaries. (As I had stated earlier, I don't mind a language being slower - if it actually provided any advantages–like improved developer productivity, in exchange for that slowness.)
The main argument here was about the energy costs. But it sounds as though you don't like dynamically typed languages. That's fine.
Facebook, Twitter and plenty of billion dollar businesses were built with dynamic languages, and many would argue they may not have even existed if they were written in statically typed languages due to the slower up front time expenditure.
I like statically typed languages for large projects but enjoy the development speed of dynamic ones. If you don't like dynamic that's completely fine.
The only thing I'll concede is that using a statically-typed language without a good type inference system, might slow people down a tiny bit.
The serious downsides of dynamic typing means you might still win out (in terms of developer productivity, code readability, code reliability, and ease of debugging) even if you use a language like Java instead of PHP.
Stack Overflow is minuscule compared to Twitter or Facebook. You can just look at Basecamp if you want to pick a smaller company. They are Ruby on Rails and they have been consistently doing well.
Ultimately billion dollar companies have been built on dynamic languages. There is nothing stopping you from succeeding with dynamic languages. There just isn't. There are tradeoffs and these companies made them.
I agree that there isn't much necessarily stopping a person succeeding due to the choice of language (unless it's some esoteric or otherwise unrealistic language).
Yahoo used C++ for instance (it would not be my choice, even though it's weakly statically-typed).
But, yea, like you said--there are tradeoffs.
I think dynamically-typed languages have the allure of letting you build quickly initially, but the downsides of dynamic typing start hitting pretty soon afterward.
I write a lot of small scripts in Python. It certainly is easier to throw something quickly together, especially when the data model is amorphous, with dynamic typing. But that doesn't mean I'd use Python for a large project.
Reading higher in the thread, some Elixir folks are saying that the techempower benchmarks used the wrong settings (debug mode, etc) for their Elixir benchmark.
The fastest Elixir framework on the list, phoenix, is about as fast as uvicorn. (Both around 10 times slower than dragon.)
I was mostly responding directly to these 2 statements from my parent comment:
> They did savings by re-implementing their services and attribute those savings to the new tool / programming language.
> I wonder what the saving would look like if they chose another tool for the second / optimized system. I doubt it would differ much if they went with Go, Java or stayed with Python.
I think second-system syndrome is probably a significant factor, but I can also believe that ditching Python was a significant factor.
EDIT: I’m being rate limited because I guess my comments are too spicy for the HN mods, but anyway I agree that there’s no reason other non-Python languages would fare much worse than Elixir.
There is a reason: the BEAM is almost not prone to huge GC pauses. Bigger load results in every actor responding very slightly slower. Nothing else.
Many other systems don't have this property. They fall down under pressure.
Gosh, a huge chunk of HN is always so dismissive. At least read up a bit beforehand, man. The criticisms should be informed and benefit the readers, not only express a generic skepticism.
The beam just garbage collects each actor separately, and it so happens that much of the time your actor has finished before a gc happened so you never see the cleanup.
The beam also has a spectacular failure mode: OOM whenever messages come in at a higher rate than they are processed. The lack of backpressure mechanisms means a huge number of beam language developers spend way too much time recreating their own way of dealing with this, or pretend it is not a problem at all. This means too many libraries in the ecosystem behave totally differently under load.
I can see how you would think that, yes. In practice I haven't noticed it except in super rare cases where processes (actors) hold on to huge binaries / strings -- which is one of the weak points of BEAM's GC.
I've been bit by this. In reality you need to know your shit when it comes to tuning the Beam and GC to achieve decent performance under load without triggering OOM.
Frankly I am seeing that as a myth, you seem to have made up your mind some time ago or judged by 1-2 occasions.
I am on ElixirForum every day and worked with Elixir for 7 years and have never seen anyone "perpetuate myths". I've seen some people willing to "increase adoption" which was always met with resistance by the wider community -- we believe growth should be organic.
Pretty sad stance from you though, I have no idea why people get so ticked off when another programmer wants to tell them about a secret weapon.
If you are not willing to try it, that's fair. Say that. Claiming you know stuff about the ecosystem while a guy who is there every day is not seeing that at all comes across as... strange. Biased. And not arguing in good faith. :(
> If you are not willing to try it, that's fair. Say that. Claiming you know stuff about the ecosystem while a guy who is there every day comes across as... strange. Biased. And not arguing in good faith. :(
I am not willing to drag others, such as those that wrote the repos, into a technical discussion with people out to act as you are.
The guy you are responding to was completely calm and reasonable. Didn't say anything attacking or otherwise. I'm not sure why you are seemingly trying to cast him (and the Beam) in such a bad light, with seemingly no reason to back it up.
Both of you are afflicted by that logical fallacy of failing to understand that you not encountering a phenomenon does not mean it does not happen or is rare, it just means you didn't encounter it.
If you try telling people that did encounter that phenomenon that in practice they wouldn't/didn't then you shouldn't be surprised if they question why they started talking to you in the first place.
> Do you genuinely fail to see how your behaviour proves my point?
I genuinely see only one thing: I asked you to elaborate but you are convinced that I am pretending to discuss while I, again genuinely, actually did want to discuss.
You asserting something about me, a person whose mind you cannot read is confusing and quite aggressive, in a very uncalled-for manner too. But as I said already -- have it your way, I disengaged because it became apparent you are not interested in discussing. OK. It's your right.
What's not OK is you claiming that I am not interested in discussing however, and I maintain that I was interested in discussing.
> You cannot just hound people with demands because you don’t like what they are saying.
1. I am not "hounding" you for anything, I asked a question.
2. You are again assuming my motivation and I assert that you have gotten it wrong. Your claim that I "disliked what you said" is a borderline personal attack and off-topic. I was confused why you claimed what you did and wanted you to elaborate, to find out what made you think like that and if I can change your mind with a few anecdotes and some facts (that are hard to look up because they require scanning a forum; yet they are there and are visible to everyone who engages with the platform).
BTW, if you really have known anything at all about the Elixir ecosystem you would know that its creator, to this day, engages with users on ElixirForum and asks for their feedback on what they find lacking. That sort of engagement and genuine discussion spirit that you claim I (as a part of the Elixir community) don't have.
That alone invalidates your point entirely.
I am disengaging a second and final time, let future readers decide for themselves.
Calling someone “biased” and “acting in bad faith” is a personal attack and violates this site’s rules. People get rate limited for far less on this site.
I'd argue taking things out of context and deliberately painting the commenter in a bad light is not a nice forum discourse.
I said, very plainly and visibly, that my parent commenter's unwillingness to back up negative claims COMES ACROSS as biased and ARGUING (NOT "acting") in bad faith.
Come on now, this stuff is not hard, the message is literally up there. Not sure why you had to editorialize it and thus misconstrue it?
As I plainly explained, right at the top, it is not theoretical. There is no point engaging people with evidence if they are so dismissive of basic facts.
But then that is also true here. Your claims about him not saying what he plainly did are just bizarre.
TBH at the load we had we got substantial savings by eventually replacing usage of gen_server as well, though that probably isn’t a good idea much of the time.
The OOMs were largely being caused by calls to and from other services (i.e. kafka) so the answer proved to be in controlling the rate at which things come in and out at the very edge.
From what I saw I got the impression the Beam devs assumed memory and CPU usage go together so a system that is under load memory wise would also be CPU wise, but this isn’t the case if your fan out and gather involves holding large* values on which the response is based, even if for tiny amounts of time.
EDIT: *large meaning "surprisingly small" if you're coming from other universes.
The underlying problem we had* was the rate work was being completed was lower than the rate requests were coming in which causes the mailboxes on the actors to grow indefinitely.
In golang the approximate equivalent is a buffered channel that would start blocking because it has run out, but the beam will just keep those messages piling on to the queue as fast as possible until OOM. This is obviously a philosophical tradeoff.
* I should qualify that each request here had a life in low double digit ms, but there were millions of them per second, and these were very big machines.
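The tradeoff described above can be sketched in a few lines of Python. This is only an analogy under stated assumptions: a `deque` stands in for an unbounded actor mailbox, a `queue.Queue` with `maxsize` stands in for a Go buffered channel, and both function names are made up. It is not BEAM or Go code and says nothing about their real memory behaviour.

```python
import queue
from collections import deque


def flood_unbounded(n_messages: int) -> int:
    """Unbounded mailbox: every message is accepted, so the backlog
    (and its memory) grows for as long as producers outpace the consumer."""
    mailbox = deque()
    for i in range(n_messages):
        mailbox.append(i)
    return len(mailbox)


def flood_bounded(n_messages: int, capacity: int):
    """Bounded queue: once full, the producer is pushed back on.
    put_nowait() is used so this single-threaded sketch cannot hang;
    a blocking put() would be the closer analogue of a Go channel send."""
    channel = queue.Queue(maxsize=capacity)
    rejected = 0
    for i in range(n_messages):
        try:
            channel.put_nowait(i)
        except queue.Full:
            rejected += 1  # the caller must slow down, retry, or shed load
    return channel.qsize(), rejected


print(flood_unbounded(10_000))     # backlog equals everything sent
print(flood_bounded(10_000, 100))  # backlog capped at capacity
```

With no consumer running, the unbounded version simply holds everything it was sent, while the bounded version caps the backlog and forces the producer to deal with the overflow. That is the philosophical difference the comment points at: one design pushes the problem into memory, the other pushes it back onto the sender.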
Why weren't your processes dropping messages? Also I think you can tell the VM to not allow the process to exceed a certain message size and trigger some sort of rate limiting or scaling out
Edit: huh. I could swear the VM had memory limit options. Guess not. Time to rewrite it in zig!
Yeah, I think that's the assumption people had been operating under.
That team would have thoroughly endorsed a zig rewrite! It was a very odd situation where most of us liked erlang the language but found the beam to be an annoying beast, whereas most of the world seems to be the opposite.
That does not in any way explain a drop of 95%, which IMO is ridiculous and points to other issues.
The system they created now is totally different from the one they had. It’s more efficient by an insane margin. Choice of language seems like it would have trouble breaking the top 5 major reasons.
I have migrated Rails apps to Elixir before. We went from 15 servers to 3, and one of those was basically for "if crap hits the fan"; we could easily have gotten away with 2.
It's worrying that a supposedly high-quality forum like HN receives comments with no substance. If you have an actual counter-argument, let's discuss. If not, well, not an interesting exchange.
You keep insisting HN is not providing comments up to your standards. Let’s not go there OK? I just disagree with your analysis. I think it is too simplistic to state GCs cause this and Elixir somehow magically causes 95% efficiency boosts.
All I’m saying is that if you can drop 1330 servers just like that, there might be something more going on than Python’s slowness.
This is from experience. I have seen people create slow and fast systems with just about any tech. I can make Elixir crawl, I can assure you of that.
I have seen Python apps that used 10 servers reduced to one as well. Same tech, just a more efficient mindset. It’s IMO a bit too simplistic to say systems with GCs fall over when under load.
Sure, if you want to expand the discussion to "everyone can make every tech stack act badly" then you might have had an argument. I don't find that argument compelling however -- it's borderline meaningless.
Also nobody used the word "magically" before you did. Note that.
What's your argument exactly? That Elixir is overrated? Or something else?
Furthermore, I am not insisting on my standards of the quality of comments. I am under the impression that's the expected quality of comments on HN at large.
Thank you for recognizing that we were going nowhere. Apologies if my tone was sharp.
I am not evangelizing tech -- I am a polyglot and I use what I find is best suited for a job, and Elixir happens to cover quite a lot of ground. That's all really. I also use Rust and Golang quite a bit.
I simply get ticked off when people start demeaning something without seriously working with it or even reading a bit beforehand. Sorry if I mistakenly put you in that group.
You’re comparing Elixir to Python and Rails. Many of us have seen Python replaced with other languages for an astronomical improvement. Python and Ruby are the slowest category of languages; they’re easily beaten and you need to offer some evidence as to why the improvement was derived from migrating to Elixir specifically rather than moving away from Python/Rails.
I'm not dismissive of Elixir, I'm dismissive that Elixir magically solved this problem in a way other languages couldn't. If you have some supporting evidence or rationale as to why Elixir is uniquely able to solve this problem, I'm happy to hear it, but so far you've offered up "low latency GC" which isn't unique to Elixir and itself doesn't adequately explain the degree of improvement over Python (GC latency alone doesn't reduce from 200 servers to 4). Again, I'm happy to entertain arguments about why Elixir is uniquely able to improve performance, but I'm not going to take it on faith (which you interpret as 'dismissive of Elixir').
Okay, how about "a runtime that has been extremely carefully crafted for the lowest possible latency all the way to the point of the hardware falling over"?
It's very hard to provide evidence unless we make a screen-share call where I show you real time dashboards of services being bombarded with thousands of requests per second and for you to see for yourself how the median latency numbers climb from 25ms to 45ms and then fall back to 20-30ms after the burst load subsides.
I find it difficult to just describe this because as much as I've seen it many times in practice, it's also practically impossible (NDAs and compliance nightmares) to demonstrate it to a programmer outside the companies I've worked with without violating all sorts of laws. :(
But yes, basically: a super latency optimized runtime, a GC that's not very sophisticated but it elegantly dodges most GC problems by simply releasing all memory linked to an actor as soon as it quits (and Erlang/Elixir encourage you to spawn many of those in the right conditions; not for every single thing though), and one of the very fastest dynamic languages in the world, probably second only to JS's V8.
All of that is combined with me working with several other programming languages and their hosting solutions which were tripping over themselves when 1000 req/s started coming in (looking at you, Ruby 3.X and Puma and a few other servers; or PHP, or Python).
TL;DR: reliability is much better, latency is predictable.
Weirdest thing is: people don't believe it. If you only knew the CTOs I worked with: they were extremely data-driven and they would not allow me or anyone to just pull all of that out of their bottom. All had to be proven with numbers, and me and my teams did that, many times.
I understand the skepticism somewhat, but you and a few others seem to look at Elixir through the lens of "too good to be true", and IMO you should try relaxing that skepticism to some extent. And try to be a little more sympathetic because again, I literally cannot give you the hard cold data without violating at least three laws.
Go also doesn’t have huge GC pauses (and moreover idiomatic Go generates very little garbage), and I have a hard time seeing how GC pauses would contribute so significantly. Java also allegedly has a very low latency collector.
Apologies, I was only responding to a single point which was meant to counter another. I am well aware that Golang's GC is world-class and it's my second most loved language after Elixir.
If, and that's a very, very, very big if, the current open-source GCs are really leading to too many pauses, then you can go to Azul and buy a better VM and GC, further improving the performance compared to BEAM.
Yep, agreed, I am just listing possibilities. More often than not a performance loss in Erlang/Elixir is caused by GC pauses but you can do a lot to reduce or outright eliminate those.
You need to advance a theory about how the BEAM could make a 95% improvement in a way the alternatives cannot.
For example, Go has a very good asynchronous story as well, and while it may not be exactly as good as BEAM languages, it makes up for it in straight-line execution performance.
I’ve personally rewritten a few carefully optimized Python applications in pretty naive Go (without significant rearchitecting) and witnessed 50X-1000X performance improvements. And moreover Go allows for more optimization beyond what Python allows (e.g., consolidating allocations and lifting them out of the hot path).
While you are probably right, it's also true that Python is just straight up slow, especially in normal configurations (Django, Flask, uWSGI, lambdas, etc), and Elixir is pretty fast while offering a great/fun/friendly dev experience + BEAM-HA/scaling-benefits.
They could have also blown everything out of the water with C++, or probably even Go, but if Elixir can do it on 2-4 boxes it's fast enough.
Python is 100x slower than a real language (I like python and use it often, not meant as a dig at the language just stating facts)
My favorite thing about Python is writing prototypes. The big risk with writing a prototype is that it survives into production. Using Python ensures that the code will be replaced by real code
Python is great for other non-production code like Jupyter notebooks, numpy experiments, etc
How is calling Python "not a real language" just stating facts?
I'm not sure by which metric we are judging whether a language is real or not, but I'm fairly sure that almost anything anyone comes up with will include Python, considering it's one of the most used languages in the world at this point, and used for a fairly large variety of use cases.
Typescript/JavaScript has many of these properties except for GIL and maybe not typically interpreted. That means not a real language? Ruby also has a GIL.
I don’t even know what “proper package management” means in this context. Certainly there’s an official package management system and modules. C++ doesn’t have the latter (very fragmented user efforts and not commonly used in my experience) and barely the former (no adoption). So that means C++ isn’t a real language?
Python does have type checking by the way. Same as TypeScript - you annotate your types and you can run a program to verify your annotations. This is basically how TypeScript works although the typing in the latter is more mature by way of @types packages to let the community supplement adding typing information to third party packages. There’s no relation between the two so it’s not sound (in both cases), but in practice it’s quite useful.
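To make that concrete: Python annotations are not enforced when the code runs; a separate checker such as mypy reads them. A small sketch (the function name `double` is made up for the example):

```python
from typing import get_type_hints


def double(x: int) -> int:
    """Annotated as int -> int, but the interpreter does not enforce it."""
    return x * 2


# The annotations are stored as metadata that external checkers can read...
print(get_type_hints(double))

# ...but a call that violates them still runs. A static checker such as mypy
# would flag this line; the interpreter just repeats the string.
print(double("ab"))
```

The checking happens in a separate pass before the program runs, which is the same division of labour as `tsc` versus the JavaScript it emits.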
Easy to learn hard to master syndrome. Your comment says nothing.
* GIL - Your use case more than likely reimplements one wheel or another. Whatever compute you're doing should be deferred to the right tool (DB, queue, etc). Otherwise, if it's I/O, you're on par with Go and NodeJS (See FastAPI 3rd party quarterly benchmarks as an example)
* No sound typing - I don't know what this means. If you're concerned with typing, you can use pydantic for highly performant type checking and input/output validation.
* No proper package management - Poetry is excellent, has been around a long time and is the unwritten de facto tool. Pipenv imo is a close second. This argument feels forced, as no one argued Go wasn't a language before Go modules solidified. The community was fragmented and people wrote/chose their own tool.
* Runtime type checking only - In a world of interprocess communication, I don't see how this is relevant. You're not using Python to write firmware. If you're not writing tests and just depend on successful compilation, you're writing bad code. Tests not only cover your point but are excellent self-documentation. An added value, imo, that isn't spoken about enough. Regardless, tests cover this.
* Interpreted - A good last point to nail in the "your comment says nothing". What does this imply? That it's easier to debug with an interpreter? That cold starts are slow and your design is flawed given the tools?
Anyway, with C and Go bindings, most arguments against Python fall short. It has its place, yes, but a much wider one than the bandwagon regurgitates.
It's really strange to me how people continue to build services using python, knowing it is 100 times slower than appropriate languages, and then get surprised by it being 100 times slower, so they eventually rewrite it.
It’s extremely rare for a system to be 10x slower, much less 100x, and developer productivity is huge. When you see huge numbers being tossed around for an entire system, they almost always mean “our first architecture wasn’t right for the problem” and the question to ask is how much time it would have taken the same team to discover the correct shape of the problem with the other candidates.
I think it is a lack of know-how. Most businesses/managements are not capable of making the decision to go for an ecosystem like Elixir, because they either don't even know it exists (that is also true for many devs) or they do not dare to do anything non-conventional or non-mainstream, or they have the wrong impression, that the "programming language does not matter". (Well, it does! Since it connects you to an ecosystem that comes with it and its language design choices influence how easily you can do things ...)
So then Python comes along and you find loads of devs for that. Once Python is entrenched, businesses have a hard time telling their devs to actually learn something new. And few devs will already explore things like Elixir on their own in their free time. And so they continue to hire Python devs.
(One could also replace "Python" with "Java" or "NodeJS" or similar, the principles remain the same.)
I think it's a generally practiced strategy at this point to spew out blog dev posts on company blogs to build SEO or act as an ad, regardless of quality.
The problem with Elixir is that it is such a foreign language to most junior developers, and a radical shift to dig into when coming from object-oriented and other higher-level languages. This is made worse by the fact that there are not many Elixir jobs, and by the state of development tooling and IDE support.
To someone who starts their job on an Elixir codebase, it is just not a smooth onboarding at all. While the performance aspect is unparalleled compared to most of the popular scripting languages in the last decade, the price to pay to settle into Elixir seems huge to me.
Sounds like a feature, not a bug. Poor onboarding is a cultural problem in my experience, not really a technology one.
When you get junior (or even non junior) developers onboarded in a new language, you have a unique opportunity to break them of bad habits and expand horizons.
Yes, there is a cost to it, as it extends in the short term the time it takes to get developers ramped up; however, the long tail payoff is huge.
Anything the business can control for: architectural designs, server costs, approaches to building out features / services for the business etc.
When you can mold someone's experience via a new language to model a domain, they become very efficient at it, because they have no prior notions to fall back on.
How many times have developers gone down the wrong path because of "X did it this way" type thinking? When you can sufficiently remove that so all that is left to think about is the problem space, you do make more gains around that problem space.
My thesis from (albeit anecdotal) experience is that when you have developers working in a new paradigm (often, this corresponds with a new language) you have better chances at establishing these things than having to consistently try and override a developer's prior notions about how something should work / look.
The trade off is higher ramp times and slower on-boarding, of course. In the short term, it can be more costly.
We just did a Prometheus migration that I suspect will take us 5 years to break even on the development effort investment. And I'm not even counting opportunity costs, which were immeasurable.
I like Elixir and I want it to do well, but bad articles make that harder, not easier.
In TFA, it gets better though: "Steve: That’s pretty easy. When I started on the spam team, we had close to 1,400 servers running. When we converted several parts to Elixir, we reduced that by around 95%. One of the systems that ran on 200 Python servers now runs on four Elixir servers (it can actually run on two servers, but we felt that four provided more fault tolerance). The combined effect of better architecture and Elixir saved Pinterest over $2 million per year in server costs. In addition, the performance and reliability of the systems went up despite running on drastically less hardware. When our notifications system was running on Java, it was on 30 c32.xl instances. When we switched over to Elixir, we could run on 15. Despite running on less hardware, the response times dropped significantly, as did errors."
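For reference, the arithmetic behind the figures in that quote, as a quick Python check. The server counts come from the quote, "around 95%" is treated as exactly 95% for the sake of the calculation, and the helper name is made up.

```python
def reduction_pct(before, after):
    """Percentage reduction going from `before` servers to `after`."""
    return (before - after) * 100 / before


total_before = 1400                                  # "close to 1,400 servers"
total_after = total_before * (100 - 95) // 100       # "reduced that by around 95%"
print(total_after, total_before - total_after)       # servers left, servers removed

print(reduction_pct(200, 4))   # Python system: 200 -> 4 servers
print(reduction_pct(30, 15))   # Java notifications system: 30 -> 15 instances
```

So the headline number leaves roughly 70 servers (about 1,330 removed), the Python-to-Elixir system is a 98% cut, and the Java-to-Elixir one is a 50% cut.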