"Currently, there are several Python features that Codon
does not support. They mainly consist of runtime polymorphism, runtime reflection and type manipulation (e.g., dynamic method table modification, dynamic addition of class
members, metaclasses, and class decorators). There are also
gaps in the standard Python library coverage. While Codon
ships with Python interoperability as a workaround to some
of these limitations, future work is planned to expand the
amount of Pythonic code immediately compatible with the
framework by adding features such as runtime polymorphism and by implementing better interoperability with the
existing Python libraries. Finally, we plan to increase the
standard library coverage, as well as extend syntax configurability for custom DSLs."
Yeah, well, if you remove the dynamic features of a dynamic language, it gets fast. It would be really impressive if they can achieve those features at roughly the same speed.
I don't necessarily need all that dynamism though, and would happily use a Python subset that removed some stuff (and forced type hinting) in exchange for better compilation.
Yes, there are already subsets like this, but it's not as helpful if it isn't standard.
Assume your python program is fully static and well behaved from a compiler perspective. JIT compile it, and observe what it does, so that you can invalidate the compiled code and re-JIT it if it does something overly dynamic.
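A toy sketch of that guard-and-deoptimize idea in plain Python (all names here are made up for illustration; real JITs track "shapes" in the runtime, not a hand-maintained version counter):

```python
class Deoptimize(Exception):
    """Raised when a guard detects behavior the compiled code didn't assume."""
    pass

class Point:
    _version = 0  # bumped whenever the class is mutated at runtime
    def __init__(self, x):
        self.x = x

def specialize_add(cls):
    version = cls._version                  # snapshot the class "shape"
    def fast_add(a, b):
        if cls._version != version:         # guard: was the class mutated?
            raise Deoptimize
        return a.x + b.x                    # the "compiled" fast path
    return fast_add

add = specialize_add(Point)
print(add(Point(1), Point(2)))              # fast path holds: prints 3

Point.scale = 10                            # something overly dynamic happens...
Point._version += 1                         # ...and the runtime notices
try:
    add(Point(1), Point(2))
except Deoptimize:
    add = specialize_add(Point)             # re-"JIT" against the new shape
    print(add(Point(1), Point(2)))          # prints 3 again
```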
Incidentally, what JVM did. (I’m sure now it’s been tweaked beyond recognition)
I do: I mean some people with experience making Common Lisp implementations (SBCL, maybe) getting an idea and implementing Python with the same basic concepts they used to implement a Common Lisp compiler.
Seconded, I deploy large packages with gigabytes of deep learning and GIS dependencies in single executables with Nuitka and it works very well. Also handles including data files into the executable if needed.
Having been down a similar path, this whole thing works so much better if you don't 'import arcpy'. Licensing issues aside, you've often got faster tools in shapely, fiona, geopandas, rasterio, xarray.
> high level of support of even the tricky things like the scientific and GUI stacks
Could it compile an app that uses Pillow and AggDraw and ReportLab and OpenPyXL with a TKInter GUI into a standalone app I can give to a coworker? That would be extremely useful!
AggDraw is an "anti-grain geometry" graphics library that works with Pillow for drawing high quality images. ReportLab is a very big PDF generating library, and OpenPyXL reads and writes XLSX format Excel spreadsheets. I use these in many of my work-related apps. Tkinter is the big question for me because it involves a lot of behind-the-scenes files. Thanks for your comment, I will give Nuitka a try!
There are other Python implementations, like PyPy, which includes a JIT (just-in-time compiler). There are also JITs that work with the official Python (CPython), like Numba (not all code can be optimized, but often you only need to optimize your hot code path).
You can use a superset language of Python called Cython that generates C code. It can be used to generate C bindings or fast Python modules (for CPython) implemented in Python-like code.
You can also use a really fast language like C, Rust, C++, ..., create Python wrappers with Cython, SWIG, Boost.Python, cffi, ..., and use Python as glue code.
Python is not as fast as other languages, but there are tricks to make fast programs.
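As a minimal illustration of the glue-code idea, the stdlib's ctypes can call into an already-compiled C library with no wrapper generator at all (this sketch assumes a Unix-like system where `find_library("m")` locates libm):

```python
import ctypes
import ctypes.util

# Load the C math library and declare sqrt's C signature.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

# The heavy lifting happens in compiled C; Python is just the glue.
print(libm.sqrt(2.0))  # 1.4142135623730951
```

For anything beyond a toy, cffi or Cython gives you nicer ergonomics than hand-declaring signatures, but the division of labor is the same.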
numba seems to hit the sweet spot for most numerics for me. it can get a little annoying with type inference issues but overall it seems the most concise and least hassle for moving loops into optimized machine code.
You can certainly use it, but whether you see any benefit is going to strongly depend on your workload. If you're doing significant calculations in the API then it might be considerably more performant, but if your API is primarily retrieving things from the database and transforming it to JSON then you're going to be limited mostly by the database latency and so I wouldn't expect major improvements.
If you are fetching lots (not even 'big data', but a few thousand rows) of data using the Django ORM, you will see a performance difference when using pypy, or at least I did a few years ago. The database can happily return a few thousand rows very quickly, especially if you take care to optimize your queries and have good indexes.
Converting a few thousand rows to python/django objects takes _time_. I can't quantify anything, because it's been too long, but I remember it being fairly significant. When I profiled it, the majority of the time was spent calling __setattr__ a few million times.
Like you said, it depends on your use case. If your queries are slow, then optimize your database queries. But if your queries are fast and your responses are still slow, then investigating pypy is definitely worth it. You can also play around with .values_list or something in Django, so that you get 'raw values' instead of objects (but there's still a cost to building them up).
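The object-construction overhead is easy to reproduce without Django; a rough sketch, where Row is a hypothetical stand-in for a model class and raw tuples play the role of .values_list():

```python
import timeit

class Row:
    def __init__(self, id, name, score):
        # Each assignment goes through attribute-setting machinery,
        # which is where ORM hydration spends much of its time.
        self.id = id
        self.name = name
        self.score = score

data = [(i, f"name{i}", i * 1.5) for i in range(5000)]

as_objects = timeit.timeit(lambda: [Row(*r) for r in data], number=50)
as_tuples = timeit.timeit(lambda: list(data), number=50)  # ~values_list()

print(f"objects: {as_objects:.3f}s, raw tuples: {as_tuples:.3f}s")
```

The absolute numbers vary by machine, but building objects is consistently the slower path, and a real ORM does far more per row than this.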
And yet the same supposedly "io bound" workloads (like "parse request, fetch something from DB, return it as JSON") still have widely different performance characteristics in some languages vs others, with 10x to 100x requests handled per second...
2x, yes. If you’re seeing 100x you’re comparing different things like creating and serializing complex objects versus simple types or using a JSON parser which loads an entire document into objects versus one which only retrieved specific values.
At my old gig we ran Django and FastAPI with pypy. I don't remember there being too many issues. One thing is that pypy versions lag behind official python, so if you're using bleeding edge stuff, it won't be supported in pypy yet.
That was a couple of years ago at this point, and I've not been in the python ecosystem since then, but I can only imagine things are getting better in that regard rather than worse.
Have a look at Cinder - https://github.com/facebookincubator/cinder - it's Meta's performance oriented fork of CPython that they use to run Instagram (which is a big Django app).
With a codebase of any significant size the priority is always to maintain compatibility while improving performance.
If you start with an incompatible, highly performant interpreter, the compatibility "distance" is difficult to measure and could create unknown performance cost. For example, PyPy doesn't support C modules due to the differing memory layout.
I had little trouble switching a Django app years ago but the results were mixed. Some complicated views and reports saw a hefty win but most of the app was database-limited and optimized to the point that there was no meaningful difference, except that PyPy used more RAM.
Pypy is great but I didn't find it very useful with Django.
Quick, transactional HTTP exchanges (GET, POST, etc.) aren't really its thing: there's no time for the compiler to get warmed up; the request is complete before pypy has gotten out of bed.
But if you have to do really complex view rendering (graphs or something) where it would take cpython ~10s or more to process, then pypy will leave cpython in the dust.
They should have said "python compiler" (shorter) or "python compiler in c++" (more accurate and only one character longer, including spaces).
Considering at least 2 people have gone to look at the source and then come here to comment, it would have been a net benefit for all involved. Plus, what does it say about the potential quality of your compiler if you can't even make correct English statements? This seems easier to get right than if( x = *p++ )
These headlines usually work by the dept asking a researcher for a 10 sentence summary of their work. Someone in the dept summarizes that to 3 sentences, and sends that to the university pr dept. The university turns that 3 sentences into 1 and that's what's published. My take is this says more about the weird game of telephone being played than it does about the research product itself.
Yes, but it's a game of telephone that does make it annoying for HN readers to try and actually understand what's being presented, so it should be discussed and corrected if possible.
true, but if that's the case, then the game of telephone would not call (haha) into question the quality of the compiler, since the devs were not responsible for the game.
_The Shaft_, a Georgia Tech periodical (similar to _The Onion_ in spirit), interviewed a local associate professor: "How can I be expected to teach math if my students don't speak basic Mandarin?"
And it is severely underrated. The performance gain averages around 4x-20x. We use it in production, and memory usage is also about 1/6th of CPython's. You can easily get 10x performance in many cases.
Somewhat paradoxically, PyPy always uses more memory for programs with small working sets, but can use less memory for programs with large working sets. 1/6 is a lot more extreme than I would have expected, though.
No, it has drastically lower memory requirements. It's very similar to my Perl compiler, and even in non-optimizing mode it uses much less memory. In optimizing mode it's almost the same as this one. Just ~20 years older.
It's a nit-pick because it's ultimately just a gripe about ambiguous phrasing, not because the implications of what the phrasing means are unimportant.
Am losing count of all these efforts to rescue Python's performance. They all seem to amount to the same thing: it's not very hard to achieve this if you throw out fundamental aspects that make Python what it is.

The premise is always that syntax is the barrier, and that people struggle so much to learn a new syntax that this is what keeps them using Python even though its performance is abysmal. But what if this isn't the right assumption? What if it's not the syntax but the ecosystem, and an ecosystem at that which depends heavily on all the dynamic features to achieve what it does?

Further, what about the assumption that people who lack the confidence to learn another syntax are actually up for the challenge of understanding all the constraints and limitations of something that is like, but not quite, real Python? Those are exactly the people you would expect to struggle with it.
Its performance and poor support for parallelism have prevented me from using it in places I've wanted to for years.
Is it "fast enough"? fast enough 90% of the time? Or just fast enough to leave you uncertain that it's even a good choice?
Sorry, we have productive choices now that don't leave me worrying about this situation. I still like it though, and if they could solve those issues I'd probably use it a lot more!
If you want to crunch a lot of numbers in parallel, GPUs are your friend, and thus C/C++ (or any of the python libraries that already set these up for you).
If you are trying to run a lot of executions in parallel where you need the full instruction set of the CPU over a GPU (namely, the compare/jump operations, otherwise known as if statements, for loops, etc.), then you are most likely either extremely resource constrained (like writing code for a microprocessor or embedded system), at which point you will still use C/C++, or writing something like a video game, which again is C/C++/C# (or Swift for Apple).
For most every other use case, Python is simply applicable with its vast array of libraries. For example, at my work, we use FastUI with orjson for a backend that needs to handle some significant TPS. It's fast enough. Could we write the entire thing in another language and use fewer ECS containers/EC2 instances? Sure, and we would save on cloud costs but lose massively on developer costs.
Python is currently running on 100,000s of CPUs right now. There's an environmental cost to that.
Moreover, people have huge python apps that they can't just rewrite and python just isn't fast enough. This has happened so many times. So many man hours have been spent optimizing python code, that we have over a dozen different implementations in just this thread alone and it doesn't include 3 that I know of.
Python's current Achilles heel is actually its performance. It's slow as fuck. Those 10% of applications matter. And faster performance won't hinder anything for people writing 100-line scripts.
Python is a scripting language. It allows me to develop a right answer in record development time and with a high degree of confidence that the result does what it needs to do. If, then, there are speed problems, I can optimise my existing code or rewrite sections in other languages, but crucially use my initial code to aid in testing the more obscure rewrites.
Most of the time, some thought upfront will tell you if you will need a compiled solution early on. But even then, getting it correct in Python before getting it fast in something like C++ can be faster: the spec is often revised, as implementation can change specs, and Python is more agile.
Speed is about more than execution speed: you need to be correct, and being fast enough is a mark of quality; anything faster may be wasteful.
Python is not "slow" in the sense of being inapplicable to real-world applications.
Yours is the equivalent argument for buying a BMW M3 over a Toyota Corolla because it can do a faster lap time around a track and thus is better for your commute, except in the real world with traffic and traffic lights, on a 30 minute commute home, you will probably arrive at your destination 1 minute quicker in the BMW over a Corolla.
Disclaimer: developer on Pyston, which could be considered a competitor
My concern is: there have been a few projects already that are, from the outside, more or less the same approach and set of tradeoffs as this. And they haven't been that successful. Given that this is treading familiar ground I would expect some words about how this is different, and the lack thereof makes me a bit skeptical to say this will become successful when others did not.
> and the lack thereof makes me a bit skeptical to say this will become successful when others did not.
Success of projects like this is not usually based on merit, but on how many people you can convince to go along with it until it eventually becomes a thing of its own sustainability. So, the recipe here would be to:
1. Be "good enough" and easy enough to get started such that early adopters have a great first experience speeding up something important to them. (hook them)
2. Be open and friendly to potential incoming contributors, letting them land changes, have a say in the discussion, and generally be part of it all. (community build)
3. Encourage people to share their successes and hopes / dreams for how great $X is on their blogs, HN, social media, etc. (propaganda)
4. Goto 1.
In this case, step 3 will work best by highlighting that "You actually don't need most of the dynamic features of Python" as the central narrative.
One big caveat is that Codon chose not to use Python semantics for `%`, so the basic test of `print(-2 % 5)` fails unless you run it with `-numerics=py`... which should just be the default behavior, and a great first community patch / discussion!
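The difference is easy to see in plain Python; Python's % takes the sign of the divisor, while C-style remainders truncate toward zero (c_mod below is a hypothetical helper mimicking C semantics, not part of Codon):

```python
# Python semantics: the result takes the sign of the divisor.
print(-2 % 5)  # 3

def c_mod(a, b):
    # C-style remainder: division truncates toward zero.
    return a - b * int(a / b)

print(c_mod(-2, 5))  # -2, what a C-semantics default would print
```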
import re

# Sample code in the toy DSL
code = '''
#my_var like 10
OMG "Value of my_var: {my_var}"
#var1 like 5
#var2 like 7
#sum like add #var1 #var2
OMG "Sum of {var1} and {var2} is {sum}"
'''

variables = {}
for line in code.splitlines():
    line = line.strip()
    if line.startswith("#"):
        var_name, _, value = line.partition(" like ")
        var_name = var_name.lstrip("#")
        if value.startswith("add "):
            # "add #var1 #var2" -> sum of two previously defined variables
            var1, var2 = (v.lstrip("#") for v in value[4:].split())
            variables[var_name] = variables[var1] + variables[var2]
        else:
            variables[var_name] = int(value)
    elif line.startswith("OMG"):
        message = re.findall(r'"(.*?)"', line)
        if message:
            print(message[0].format(**variables))
Another big difference: "Codon is licensed under the Business Source License (BSL), which means its source code is publicly available and it's free for non-production use. ... each version of Codon converts to an actual open source license (specifically, Apache) after 3 years."
I don't see any indication that Codon makes any claim to generated code. It's just that the license prohibits production/commercial use of the compiler--not much different than other proprietary compilers like, say, Visual Studio.
Preface: I don't just want to crap on Python here and sell Nim. I like Python, and still use it.
But it still shocks me just how much money and manpower is thrown at trying to bikeshed and optimize and compile Python and its libraries, while the Nim compiler is essentially a community hobby project that has made the concept of a "compiled Python" a reality already. The orders of magnitude in scale difference, and the qualities of the output products, are staggering.
I'm kind of starting to see what Guido is talking about when he says Python is a legacy language that's probably on its way out. Even in the interpreted world, languages like Janet and other newcomers are performing fascinating experiments, often doing more with less.
> while the Nim compiler is essentially a community hobby project that has made the concept of a "compiled Python" a reality already.
I checked out Nim a few years ago because I have a large Python project that I'd like to move to a compiled language. In my experience, if you get on the forums and do any kind of comparison between Python and Nim, you will quickly get responses of "Nim isn't Python, so quit trying to make it like Python".
I think Nim would have been much more successful if it was more like Python, and if Python compatibility was one of its goals. Just as an example, a subrange a..b in Nim is closed on the right, unlike Python, and a..<b is the open Python version. They could just as easily have made a..b open and used a..=b for the closed interval, for Python compatibility.
I'm not saying Nim had to be the perfect Python compiler and compile all Python code unmodified. Based on what I've read, Python is too dynamic for this. But in cases where there was the choice to either be compatible with Python or "do something unique", Nim often takes the unique path, and not always for any good reason IMO.
A lot of effort is dedicated to trying to improve the speed because python is so widely used that improving performance could have a massive beneficial impact.
Migrating to a new language is not easy when you have millions of lines of code.
But is this still really Python? This compiler and others are not a drop-in replacement. They typically cover a narrow subset and/or need additional code/hints, etc.
You can adopt it incrementally, but then you could just as well switch to a language with higher default performance, more language features that just work, unified tooling etc. and adopt that incrementally?
I didn't say that anyone should switch (see my preface), and I perfectly well understand this point. Having to retread this point over and over just serves to make our comments long and redundant and full of qualifiers, but I guess here we are again.
You said you don’t want to “crap on python” which I never said you were. I’m simply pointing out that what you’re “shocked by” makes complete sense if you look at what can be gained by people “bikeshed and optimize and compile python”
>I'm kind of starting to see what Guido is talking about when he says Python is a legacy language that's probably on its way out. Even in the interpreted world, languages like Janet and other newcomers are performing fascinating experiments, often doing more with less.
Wow, what a way to mischaracterize what Guido said.
His point was about languages evolving to be more abstract than Python or any of the ones you mentioned. Programming is going to become more and more abstract to the point where you will be able to program in natural language through speech. In the mean time, we still have to write code manually.
And look, there are plenty of valid criticisms of Python, but you are kidding yourself if you don't think it's going to be one of the primary languages of the future. There is a reason why it has the second-most GitHub repos (behind JS, because of the web's hard dependency on it).
And the simple reason is this: the vast, vast majority of applications don't need the fastest possible speed; it's much more important to be able to develop fast and have it be right. It's easier and cheaper to throw another EC2 instance in your stack than to pay a developer to write stuff from scratch, whereas in Python you can just import the relevant library for your needs and be up and running much faster, thanks not only to the short, concise syntax but also to the introspection into the running language that comes with its interpreted nature. And this allowed the snowball effect to happen, where developers could quickly write relevant libraries, which in turn allowed other developers to quickly import those libraries and write their own, slingshotting Python into a language that is used not only for bleeding-edge ML stuff but also to run backend web stacks with no issues.
And in the cases where you do need speed, this is where these compilers come in, and it's a 100% valid use of manpower and money. Think of it as another library.
Every other language that focuses on things like static typing, whatever type of inheritance the designers think is best, memory safety, and all the other theoretical CS stuff completely misses the above point, and for that reason alone, it will never become mainstream. Rust is not going to happen, Nim is not going to happen, Julia is not going to happen, Scala is not going to happen, Elixir is not going to happen. Sure, there will be a significant amount of code written in those, but the popularity will never come close to Python's. You may not like it, but you know this is true.
We have already seen this cycle happen with Haskell, where functional programming was the next big thing. You would constantly see posts about it at the same frequency you now see posts about Rust, and look where Haskell is now.
Rust is different because it is trying to answer real needs. It is not going to replace Python, and if you're seriously thinking about writing your code in Python you probably shouldn't be thinking of writing it in Rust. There will likely also be less Rust code than Python code. But it can replace C/C++, not completely and not in the near future, but it is possible.
Both Python and Rust guarantee memory safety. Python does it through automatic reference counting at runtime; Rust does it through compile-time checks. If you are writing backend code, you could really do it in either, and you would hypothetically choose Rust because it's compiled and going to be fast.
The problem with Rust is that it has the unsafe operator. When using a third-party library, I have no idea if someone put a bunch of unsafe code in there, so all memory safety guarantees go out the window. Sure, you can grab the raw source and compile it yourself, but then that introduces a whole bunch of friction into the dev process.
And the reason unsafe is in Rust is because you can't write standard library stuff, especially with performance in mind, using traditional Rust constructs.
In the end, Rust doesn't give you anything over a compiled C extension to Python, that can be written as memory safe in the sense that it just receives a buffer of data to process with preallocated memory, runs said processing, and returns the data. This is pretty much the standard way that ML works except the compiled extensions just get put on the GPU rather than CPU, and the overhead of the translation layer is extremely small in comparison.
Surprised there is no comparison to MyPyC. That said the availability of a "JIT" compiler in the style of Numba but with much broader Python feature support sounds great to me.
mypyc keeps Python's "BigIntegers", unicode string implementation, reference counting, and has little to no floating point-related optimizations yet. It prioritizes compatibility over overall performance, so I'm not surprised. I was also disappointed at how poor mypyc is at compiling across multiple files, but that they can fix at some point.
The BigInteger "issue" pretty much makes something like Fibonacci a worst case scenario for it.
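The point is easy to demonstrate: Python ints silently grow past 64 bits, so a runtime that keeps that behavior pays arbitrary-precision costs even in a tight loop:

```python
def fib(n):
    # Iterative Fibonacci: a, b walk the sequence one step at a time.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# fib(100) already exceeds the 64-bit range (~1.8e19), so CPython
# transparently switches to arbitrary-precision arithmetic mid-loop.
print(fib(100))  # 354224848179261915075
```

A compiler that maps ints to fixed-width machine words would either overflow here or have to carry the bignum machinery along, which is exactly the tradeoff being described.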
> Python — which is typically orders of magnitude slower than languages like C
Not to nit-pick, but this has been characterized by a team who tested and compared a large set of languages against a wide range of application code. The number is, if I remember correctly, about 78x slower. I don't think "orders" of magnitude is entirely fair. Yes, Python is slow. I have made the mistake of trying to use it for time-critical embedded applications. Never again.
Aside from this admittedly pedantic observation, the first thing that crossed my mind with regards to this tool --which sounds fantastic-- is that you would have to trust the correctness and reliability of your code to this translation layer. Not sure how to think about this other than to keep a mental note of it if using this tool.
78x is roughly two orders of magnitude in typical physics parlance. If you take a more CSy stance and count powers of two, it would be six to seven orders of magnitude. Sounds entirely fair to me.
In the article they say most speed-ups are in the 5x to 10x range. The paper shows this to be true, particularly when compared to PyPy.
In other words, the acceleration isn't measured against raw C implementations (where the 78x factor I quoted would be relevant). It is measured against Python or PyPy.
How much faster does Codon make your Python code? The answer seems to be somewhere around the 5x to 10x range.
In that context, and in the context of actual applications rather than hand-picked tests (how much can we optimize a loop), "orders of magnitude" seems to be an exaggeration.
BTW, MIT does this kind of thing all the time with their press releases. They have a brand to support with outlandish claims about everything that comes out of there. Those with frequent exposure to this kind of press release are wise to this. I've seen it for decades. It's marketing.
For me, when someone says "orders of magnitude" it means "massive". I tend to say "10 times faster", "50 times faster" even "100 times faster". I probably start using "orders of magnitude" faster at 1000x or when I am trying to explicitly make an impression on a mathematically-challenged audience. "Orders of magnitude" sounds great to that crowd.
I have never, in 40 years in CS/Engineering, heard anyone use powers-of-two when they say "orders of magnitude". Doing so would open you to serious misinterpretation. Engineers might say something like "a factor of 2 to the n" or something like that.
Scala seems to be taking this route with Scala 3. I’ll definitely be keeping an eye on it. The JVM is very underrated on HN, and I’m far from a Java fanboy (I’m basically a Zig evangelist).
I don't like it for my use cases, but whenever I read about it on HN it's supposedly the best-tooled, finest artifact of performance engineering ever built.
I'm guessing you know, but as a reminder to readers, Python does have compilation built in. It works similarly to Java: source code is compiled to a bytecode that is then run by a virtual machine. A difference is that, for executing that bytecode, the CPython virtual machine has no JIT native code generation.
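You can inspect that bytecode yourself with the stdlib dis module:

```python
import dis

def add(a, b):
    return a + b

# Print the bytecode CPython compiled the function body into
# (instruction names vary by version, e.g. BINARY_ADD vs BINARY_OP).
dis.dis(add)
```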
"So we thought, let’s take Python syntax, semantics, and libraries and [...] Codon currently covers a sizable subset of Python, it still needs to incorporate several dynamic features and expand its Python library coverage. The Codon team is working hard to close the gap with Python even further".
I have no idea about compilers, so bear with me with this question: Can't we have a faster compiler for a subset of Python?
I mean AFAIK the hard part of Python is that the language allows dynamic overwriting of attributes (or something like that). Is that feature actually needed for projects like Django, FastAPI, numpy, etc?
Maybe I'm wrong, but the main idea I'd like to ask is, can we make a compiler for a subset of that language with C-API compatibility?
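For a sense of what "dynamic overwriting" means in practice, here is the kind of runtime mutation an ahead-of-time compiler would have to either forbid or guard against:

```python
class Greeter:
    def greet(self):
        return "hello"

g = Greeter()
print(g.greet())  # hello

# Methods can be rebound and attributes added at any time, so a
# compiler can't assume a fixed class layout or call target.
Greeter.greet = lambda self: "hi there"
g.brand_new_attr = 42

print(g.greet())  # hi there
```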
The problem turns out to be that it's maintaining the C-API compatibility which is the main thing which makes it hard to make Python fast, not the other stuff -- Javascript has most of the nasty things Python does, and it's plenty fast on browsers.
However, maintaining C-API compatibility means you need to set up lots of data structures exactly how the C API requires, and maintaining and updating those ends up losing you lots of your benefits of JITing.
You could, hypothetically, introduce an entirely new API, which allowed for faster dynamic recompiling, but then you'd need to get every package anyone cares about to switch to that.
That’s why I found GraalVM’s approach intriguing. They provide a high level language API where you can simply write a language interpreter, and it will be able to JIT/AOT compile it down to fast machine code. But the most interesting aspect is that the IR they convert languages to is basically universal, so something like LLVM bitcode can also use the exact same representations.
So you can interpret (and later AOT compile as well) LLVM bitcode and python, and this approach will allow cross-language optimizations as well, which were not available at all before. But feel free to add a bit of JS/Java, etc to your code as well!
I really try hard to understand this argument and I must be missing something and must be super stupid. Don’t languages like JavaScript have this and yet they can still do JIT and the base runtime is still in C++? Java itself has an official way to invoke C programs from Java applications and still has a JIT. And Java also has AOT compilers.
Sure. Crossing that FFI boundary is going to be expensive. But there’s lots of techniques to mitigate it or even in the limit eliminate it. if I recall correctly you can JIT a fast call that knows how to invoke the FFI directly without the extra indirection layer. Basically a fancy runtime LTO.
I think a huge part of it is CPython's interest in keeping the core codebase as simple as possible, which seems to be the overriding reason the global lock still hasn't been removed (which, iirc, even Ruby pulled off at some point). It's also the reason there's no JIT, afaict, and why PyPy got started: to prove it is possible to JIT Python (and it frequently sees substantial gains vs CPython). The problem they've had is that CPython is a moving target, and it's hard to keep a parallel runtime up to date on a shoestring amount of funding. That's why you see alternate approaches like Numba (JIT'ed Python), which are less of a departure, and Cinder (better budget). To me this seems more like the CPython project being actively hostile to a JIT than a case of C data structures meaning you lose some of the benefit to FFI overhead. Performance is a virtuous cycle, too: when there's enthusiasm about a language, you get more and more people paid to make your language fast. For a while companies tried. Google gave up. Facebook only has it as a fork, with a public plea for the maintainers of CPython to mainline literally anything.
The CPython maintainers feel like the biggest obstacle. No?
The difference is that the other languages FFI don't expose internals like CPython does.
For example, JNI only exposes handles and you need to convert an handle to a pointer, so the runtime knows for the time being that handle is special and being used by native code.
When it is only an opaque handle, lots of optimizations can happen and the native code won't see them.
Doesn’t PyPy accomplish it via CPyExt? It sounds like Cinder, Instagram’s version is CPython+JIT (among other things). I haven’t looked at the details so maybe it’s not a sufficient speed up and that’s why all these parallel efforts haven’t been merged? The part I’m missing is how what you said makes it intractable when we have counter examples within and without the language. Sure. Maybe some optimizations aren’t possible. But that’s a world of difference from little to no benefit and impossible.
Don’t get me wrong. I’m not passing a value judgement on the maintainers. But the reasons don’t feel technical to me.
That native code still needs to be able to interact with your Python objects somehow. You can't change the API around PyObject without forcing all C libraries to make changes on their side, and that API forces you to expose things a certain way.
The interface to the C language. It is what makes Python fast - you write the code which needs to be fast in optimised C and call into it with Python. The Python code is then basically just the glue.
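For anyone wanting a minimal picture of that glue pattern, ctypes can call an already-compiled C function directly. This is a toy stand-in for what NumPy and friends do with hand-optimised C (it assumes a system libm that `ctypes.util.find_library` can locate):

```python
import ctypes
import ctypes.util

# Load the C math library and call its sqrt() -- the heavy lifting
# happens in compiled C; the Python here is just the glue around it.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(2.0))
```

Real extension modules use the C API instead of ctypes for speed, but the division of labour is the same: Python orchestrates, C computes.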
> Is that feature actually needed for projects like Django, FastAPI, numpy, etc?
Yes.
FastAPI depends on really slow pydantic (disclaimer: I'm the author of the faster typedload).
All those dynamic typechecking modules rely on the dynamic nature of the language. The alternative would be having to generate code at compile time instead.
pydantic is also in the process of being rewritten in Rust so it's not so slow any longer, and in the process it will become incompatible with anything other than CPython (the normal Python runtime). Which in turn means FastAPI won't be able to run on anything else (unless it decouples from pydantic… which probably won't be easy).
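A minimal sketch of why that dynamism matters here: the validator below discovers the expected types at runtime via `typing.get_type_hints`, with no generated code. (This is my illustration, not how pydantic or typedload are actually implemented.)

```python
from typing import get_type_hints

class User:
    name: str
    age: int

def load(cls, data: dict):
    """Validate a dict against cls's annotations, discovered at runtime."""
    hints = get_type_hints(cls)        # runtime reflection on the class
    obj = cls()
    for field, expected in hints.items():
        value = data[field]
        if not isinstance(value, expected):
            raise TypeError(
                f"{field}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
        setattr(obj, field, value)      # dynamic attribute assignment
    return obj

u = load(User, {"name": "ada", "age": 36})
print(u.name, u.age)
```

A compiler that forbids runtime reflection has to replace both `get_type_hints` and `setattr` here with code generated ahead of time, which is exactly the trade-off the comment describes.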
Django makes use of a whole lot of Python’s fancypants stuff. For this reason, for instance, mypy doesn’t do well on Django projects without a purpose-built plugin. But I still take your point.
pydantic also requires a mypy plugin… But it's just how it was designed. I designed typedload with mypy in mind, so it kinda works (except for some limitations in the type system that don't allow to express some things, as of now).
This is super cool and useful. I know that Instabase, which also came out of MIT, got really popular and useful within finance communities because they allowed for really fast and efficient compute through their own Python DSL. Good to see this as an open-source project which everyone can now use.
Took me a while to decode the fact that Codon is (AFAIK) just a Python-to-native compiler, and an incomplete/non-conformant one to boot. The title and original article were written in a breathless but murky style.
"Oh... a native compiler. That cheats by not really honoring Python. Got it."
Second post on this in two days? There was a Show HN yesterday. We appreciate the info, but don't plug an incomplete product so hard, especially one that doesn't really give other options a fair assessment on its website. E.g. no mention/discussion of Nuitka, JAX, etc.
I'd say Nuitka or Cython are maybe the more common ones when talking about this. IronPython is/was interesting in that instead of compiling to Python bytecode or machine code it targeted the .NET CLR, and iirc I saw some kind of JIT going on when I was digging around (so some things ended up as machine code), but it's not really one of the usual "compiles Python to machine code" implementations.
Quite sad to know that the dynamic nature of Python is preventing the speedups in the first place. I really hope there'll be a built-in optimizing JIT compiler without the limitations of PyPy, Codon, Nuitka, Numba, etc.
I'd argue that JavaScript and Lua are much simpler under the hood: there are only a handful of types to be aware of, hence it's easier to make a JIT for them.
To share objects between threads, some synchronisation is needed, for example to update reference counts. There are a few ways to do this:
- make the user add locks; the problem with this is that if it goes wrong it can crash the interpreter and make it impossible to debug the problem from within Python, which is not user-friendly, and lots of existing code will break. Competent users are already doing this though, so it's nearly free.
- add fine-grained locking/synchronisation for object internals within the interpreter. This slows everything down, even if you're not using threads.
- Lock the whole interpreter state whenever a thread is running. This makes threads less useful (no speed-up from threading pure-python code that isn't doing IO; you have to use multiprocessing for that), but it's cheap as you only need to lock/unlock when you're doing something slow anyway (IO, thread switching, native code).
I think this explains why GIL removal has not been successful yet despite much work: the alternatives slow down single-threaded code, which is not worth it when nearly all sensible uses of threading don't benefit either.
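A quick way to see the trade-off in option three: pure-Python CPU-bound work gets the same answers from a thread pool but no parallel speedup, because the GIL serialises bytecode execution. The usual escape hatches are multiprocessing or native code that releases the GIL.

```python
from concurrent.futures import ThreadPoolExecutor

def count_odd(n):
    # Pure-Python CPU-bound work: under the GIL only one thread runs
    # bytecode at a time, so four workers give roughly no speedup over
    # doing the chunks one after another.
    return sum(i % 2 for i in range(n))

chunks = [100_000] * 4
sequential = [count_odd(n) for n in chunks]
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(count_odd, chunks))

print(sequential == threaded)  # identical results, just not in parallel
```

Swap `ThreadPoolExecutor` for `concurrent.futures.ProcessPoolExecutor` and the chunks can actually run on separate cores, at the cost of pickling the work across process boundaries.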
One must note that this is impossible unless you have chosen to handicap the C implementations while benchmarking. Borderline unethical IMO to put forth such a claim.
I must have misunderstood what you were objecting to then, my bad. What claim are they making that is so impossible that it borders on being unethical?
I mentioned JIT because it seems to be based on a similar principle at least, that of optimizing things on the programmer's behalf by looking at the program's usage and not just by looking at how to speed up the code generally.
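For what it's worth, that observe-then-specialize principle can be sketched in a few lines of Python (the names and bookkeeping here are mine, not any real JIT's): watch the types flowing through a call site, keep a fast path while a type guard holds, and invalidate when something new shows up.

```python
def specializing(fn):
    # Tiny inline-cache sketch: remember the argument type last seen,
    # count a "hit" while the type guard holds, and count a "deopt"
    # (re-specialization) when the guard fails -- the observe/invalidate
    # loop that real JITs perform with generated machine code.
    cache = {"type": None, "hits": 0, "deopts": 0}

    def wrapper(x):
        if cache["type"] is not type(x):
            cache["deopts"] += 1      # guard failed: "re-JIT" for new type
            cache["type"] = type(x)
        else:
            cache["hits"] += 1        # guard held: stay on the fast path
        return fn(x)

    wrapper.cache = cache
    return wrapper

@specializing
def double(x):
    return x + x

double(1); double(2); double(3)   # specialized for int after first call
double("ab")                      # new type -> deoptimize, respecialize
print(double.cache)
```

A real JIT would attach actual specialized machine code to the guard instead of a counter, but the control flow (guard, fast path, fallback) is the same shape.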
My claim is that there is no additional information that Python provides, as opposed to C, that would make it faster. Hence the only conclusion I can draw is that either they have supercharged their compiler for that particular benchmark, or they have chosen to handicap C, since one can express the computation in C that emits the same assembly they lowered to, hence my point about handicapping the C benchmark.
> My claim is that there is no additional information that Python provides as opposed to C that would make it faster
Ok, but that's not what they are claiming - their claim (at least based on what the article is saying) is more about one toolchain vs another, i.e. "if you use our compiler (that takes python code as input) then the resulting executable will run as fast (or possibly faster than) programs created by all the popular compilers (that take C/C++ code as input)." The sales pitch is that they've got magic sauce in their compiler, and you get to use Python as well.
Beating gcc/clang/icc is not a trivial task. One can engineer the pair `{benchmark, compiler pass}` in such a manner that it shows a speedup, but over a benchmark suite like SPEC, the general community consensus is that it is very difficult, and their paper doesn't reveal that they've found a secret sauce (sauce := compiler pass).
Humanity wasted close to 50 years optimizing compilers for one garbage language. Wasted unimaginable efforts, money and developer hours... and all could've been avoided if the same people dedicated a fraction of those resources to language design.
Same thing happened with Java. And now the existence of a well-developed compiler became an argument in its own right in favor of choosing a bad language.
There's no need or reason to try to make Python run faster. It's a trash language. At best, it deserves a credit for being funny 30 years ago... but that had worn out pretty fast. Now it's just dumb. Improving its compiler will be again a resource sink for the programming community that, in the best case, may hope to produce something of value by accident, independent of its main goal...
Such an overwhelming amount of perfectly good solutions are created with Python that I have trouble conceiving it as a "trash" language. Clearly it's at least good enough in many scenarios. For example, I use it for machine learning and data science professionally and I find it much more pleasant to use than alternatives in that space (e.g. Julia and R -- both have advantages over Python but have disadvantages too).
It works, it doesn't confuse me, it's easy to find libraries and examples, and when it's too slow -- which is surprisingly rare -- I have other options to turn to. If that's trash then so be it; call me a raccoon because I'm there for the trash.
> if the same people dedicated a fraction of those resources to language design.
What do you mean by language design here? Is it the user-facing bits and the ergonomics? Because it seems to me (as a non python dev) that that's the bit that python devs really like.
Yet it's the language of choice for many people who are domain experts but not programming experts. It's doing a job for non-programmers who have to program. Until a solution exists that does better, Python is going nowhere.
There are plenty of things which are bad but universally used. What's your point?
Fast food is universally more popular than any healthy food that requires time to cook.
People choose to buy low-quality goods in general, trading quality for immediate effect all the time. Take any industry, any kind of product, and you will see that consumption is skewed towards paying for immediate gain rather than paying for quality to minimize waste over time.
The fact that you chose to rely on the opinion of non-experts in the field to assess the value / quality of a particular technology only means that you don't understand what quality is about. You are confusing wants with needs.
In general, or specifically in the same niche as Python?
Python has several disjoint domains where it's used. So, for example, when it comes to statistics, then J, R or Julia would all be better than Python. Not ideal, but still a lot better.
When it comes to infrastructure and ops, then Erlang would've been a lot better. Still not ideal because of how existing implementation deals with deployment (too complicated), but that's not a feature of the language, and could be worked on in the same way how OP wants to work on Python compiler.
When it comes to Web... well, I'm not a specialist... and I find everything about Web revolting, so it's hard for me to think about alternatives. On the other hand, there's rarely a language that doesn't come with a Web framework / some tools that allow it to be used to make Web applications. So, just, basically, throw a dart, and wherever it lands it's going to be better than Python with a very high probability.
Python is also used to teach intro to computer science. And there's a lot of problems with this idea. Firstly, I don't believe that intro to CS should be taught by way of learning to program. It should give an overview of what CS is about, give some foundation, basic concepts from important fields... just like intro to math does, for example. But if we still have to have intro to CS the way we do today, then Scheme would be a lot better for this. Assembly is also a good pick, but for a different reason.
Now, to address the "in general" part: I don't believe that languages like Python should be used universally in different domains. What I believe we need is a language like OMeta, which we specialize for the domains we want to program in, so that we can keep language and compiler mechanics separately from syntax of specific domains. Ironically, this was even obvious at the time of ALGOL design, but nobody waited for it to be implemented and went with the quick-and-dirty solution instead.
"Currently, there are several Python features that Codon does not support. They mainly consist of runtime polymorphism, runtime reflection and type manipulation (e.g., dynamic method table modification, dynamic addition of class members, metaclasses, and class decorators). There are also gaps in the standard Python library coverage. While Codon ships with Python interoperability as a workaround to some of these limitations, future work is planned to expand the amount of Pythonic code immediately compatible with the framework by adding features such as runtime polymorphism and by implementing better interoperability with the existing Python libraries. Finally, we plan to increase the standard library coverage, as well as extend syntax configurability for custom DSLs."