Breaking compatibility with the current spec/functionality of Python should be a definite no-no for any implementation. That being said, I can still appreciate that they didn't try to write their own busted Unicode implementation, since many other ones have contributed to security issues.
I don't understand why you say that. Like, it's gonna cost them in adoption to diverge, they don't need a lecture to understand that, they are doing what meets their needs and sharing it.
The type conversion assumptions here are really problematic. "64 bits ought to be enough for anybody"-style statements ignore integers as bitfields, large constants (e.g. Avogadro's number), any kind of math with large intermediate terms, all kinds of stuff.
Makes me very suspicious of the rest of this project when they try to glide past all of these issues with nary a mention.
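To put rough numbers on the 64-bit concern, here is a small sketch in plain CPython. The `wrap_i64` helper is hypothetical and only simulates two's-complement wraparound; whether Codon wraps, traps, or does something else on overflow is an assumption I have not checked.

```python
import math

I64_MIN = -(2**63)
I64_MAX = 2**63 - 1

def fits_i64(n: int) -> bool:
    """True if n is representable as a signed 64-bit integer."""
    return I64_MIN <= n <= I64_MAX

def wrap_i64(n: int) -> int:
    """Simulate two's-complement wraparound to 64 bits (illustrative only)."""
    return (n + 2**63) % 2**64 - 2**63

# Avogadro's number as an exact integer (SI definition: 6.02214076e23).
AVOGADRO = 602214076 * 10**15

# Factorials are a typical "large intermediate term": 20! fits, 21! does not.
fact_20 = math.factorial(20)
fact_21 = math.factorial(21)

print(fits_i64(AVOGADRO))   # False
print(fits_i64(fact_20))    # True
print(fits_i64(fact_21))    # False
print(wrap_i64(I64_MAX + 1) == I64_MIN)  # True
```

CPython ints are arbitrary precision, so none of these overflow there; the point is only how quickly ordinary values leave the i64 range.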
> There are many things we took for granted here, like how we determine the data types to begin with, or how we put the source code in a format that’s suitable for code generation. These, among other things, will be topics of future posts in this series. Stay tuned!
I don't feel they are trying to glide past anything. It's the first post in the series about a product in 0.x state, it's gotta start somewhere other than perfection and they seem to know that.
If I recall it really wasn't that much faster than CPython given the overhead, but it's been a long time; if it was faster I assume it wouldn't have been abandoned.
Quite. Unladen Swallow was unfortunately a failure, in part because LLVM at the time was quite buggy, and in part because LLVM wasn't (isn't?) magic enough to speed up a dynamic language.
The blog post here mentions they do their own optimization passes, before handing over to LLVM. I imagine that's pretty important.
LLVM really wasn't that buggy at the time (circa 2009); the project I was using it for at the time, a .NET compiler that targeted video game consoles, was quite stable from a code generation point of view, and we were shipping games with it.
Ah, that's cool. Thanks for the correction. I was misremembering the Unladen Swallow retrospective[1]. It's fair to say they used a lot of their available time contributing to LLVM, but it sounds like that was feature work, not bug focused.
Codon is very impressive, it feels a lot like Python without being slow like Python.
Don’t think of it as a Python compiler, it is its own language. (Esp. re choice of int == i64, this saves SO MUCH computation for the CPU.)
I will say though that I’m not sure where to use it yet, since it’s too immature for important projects and also aims at the “we need a nuclear bomb” level performance.
> How can I use Codon for production or commercial use?
> Please reach out to... to inquire about a production-use license.
Having "contact us" pricing with several incompatibilities makes this pretty hard to consider in a commercial environment. I wish they had a public pricing structure.
I don't know how Codon does this. But I always supposed that existing optimized Pythons like PyPy map integer operations to native types and promote them to arbitrary precision when they encounter overflow. It's IMO a similar problem to "but what if someone decided to overwrite int.__add__ with some other function?" - arguably these are weird/bad things to do but AFAIK permitted by the language semantics. So to fix problems like these you just make it work for the paranoid case and implement optimizations that rely on that not being common. When the weird behavior is detected you fall back to the slower path.
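A toy sketch of that fast-path/slow-path idea, written in plain Python. This is not how PyPy or Codon actually implement it; `add_fast_or_promote` is a made-up name, and the "machine add" is simulated with modular arithmetic so that the overflow check has something to detect.

```python
from typing import Tuple

I64_MIN = -(2**63)
I64_MAX = 2**63 - 1

def add_fast_or_promote(a: int, b: int) -> Tuple[int, str]:
    """Add two ints, reporting which path handled the operation."""
    both_fit = I64_MIN <= a <= I64_MAX and I64_MIN <= b <= I64_MAX
    if both_fit:
        # Simulated 64-bit hardware add (two's-complement wraparound).
        wrapped = (a + b + 2**63) % 2**64 - 2**63
        # Signed overflow: operands share a sign, result has the other sign.
        same_sign = (a >= 0) == (b >= 0)
        overflowed = same_sign and ((wrapped >= 0) != (a >= 0))
        if not overflowed:
            return wrapped, "fast"
    # Slow path: fall back to arbitrary-precision arithmetic.
    return a + b, "slow"

print(add_fast_or_promote(1, 2))
print(add_fast_or_promote(I64_MAX, 1))
```

The common case never touches the big-integer path; only the rare overflow (or an operand that is already too large) pays for it.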
While RPython is a restricted version of Python used to build the PyPy Python interpreters, the interpreters themselves are not restricted. Any deviation from CPython behavior, intended or not, is considered a bug. So Codon should be compared to the PyPy Python interpreter, not to RPython. The advantage of writing the interpreter in RPython, rather than writing it in C (CPython) or pre-compiling Python code to LLVM IR and from there creating an executable (Codon), is that RPython comes with a metaJIT (which can generate a JIT) and a mark-and-sweep garbage collector for any interpreter built on top of it.
"While Codon does offer a JIT decorator similar to Numba's, Codon is in general an ahead-of-time compiler that compiles end-to-end programs to native code. It also supports compilation of a much broader set of Python constructs and libraries."
Very well written article. Delves into some details of things like exception handling semantics without going too far in the weeds. Thanks for sharing.
You can write your code generator to produce the optimized output right away, of course. But the whole point of LLVM is to not have every compiler worry about doing stuff like this well.
* Strings: Codon currently uses ASCII strings unlike Python's unicode strings.
Judas priest, after all the effin' grief we went through to learn how to handle Unicode strings in Python 3, and to finally begin to realize their value, you take this step backward? Forget the i64 limits, the lack of native Unicode strings is a flat deal-breaker. (For example, will Codon warn if it sees an "open(encoding='UTF8')" call? Or a normal open(mode='rt') if the default local encoding is UTF8?)
It doesn't help that the same doc also mentions that
* Dictionaries: Codon's dictionary type is not sorted internally, unlike Python's
Current Python dicts are not "sorted"; rather they "preserve insertion order, meaning that keys will be produced in the same order they were added sequentially over the dictionary."[2]
This is new functionality added only recently (3.7) so its lack would not inconvenience a lot of existing code. OTOH, why did they not plan to reproduce this useful feature from the start?
Possibly they were thinking of the pypi package SortedContainers[3]?
[1] https://docs.exaloop.io/codon/general/differences
[2] https://docs.python.org/3/reference/datamodel.html#index-30
[3] https://pypi.org/project/sortedcontainers/
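For anyone who wants to see both distinctions concretely, here is a small sketch of what CPython 3.7+ does. It says nothing about Codon's actual behavior, which I have not tested; the sample string and keys are arbitrary.

```python
# Strings: Python 3 str counts code points, not bytes.
text = "naïve café"
utf8 = text.encode("utf-8")
code_points = len(text)   # 10
byte_count = len(utf8)    # 12, since "ï" and "é" take two bytes each in UTF-8

# An ASCII-only representation cannot hold this text at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Dicts: insertion order is preserved, which is not the same as sorted.
d = {}
for key in ["pear", "apple", "fig"]:
    d[key] = len(key)
insertion_order = list(d)   # ['pear', 'apple', 'fig']
sorted_order = sorted(d)    # ['apple', 'fig', 'pear']

d["pear"] = 0               # updating a value keeps the key's position
after_update = list(d)

del d["apple"]              # deleting and re-adding moves the key to the end
d["apple"] = 5
after_reinsert = list(d)    # ['pear', 'fig', 'apple']

print(code_points, byte_count, ascii_ok)
print(insertion_order, sorted_order, after_reinsert)
```

Code that iterates a dict and relies on the first ordering, or that slices strings by character, is the kind that would notice the difference.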