Instead of performance, I’d like to see more effort go into portability, package management, and stability for Python. Since it is often enterprise-managed, my biggest complaint is juggling fifteen versions of Python, where 3.8.x supports native collection typing annotations but we're on 3.7.x, and so on. Also up there are pip and the general mess of dependencies and the lack of a lock file. Performance doesn’t even make the list.
This is not to discredit anyone’s work. There is a lot of excellent technical work and research done, as discussed in the article. I just honestly think a lot of this effort is wasted on things that are low on Python's priority list.
At Google there is apparently some essay arguing that Python should be avoided for large projects.
But then there’s the reality that YouTube was written in Python. Instagram is a Django app. Pinterest serves 450M monthly users as a Python app. As far as I know Python was a key language for the backend of some other huge web scale products like Lyft, Uber, and Robinhood.
There’s this interesting dissonance where all the second year CS students and their professors agree it’s the wrong tool for the job yet the most successful products in the world did it anyway.
I guess you could interpret that to mean all these people building these products made a bad choice that succeeded despite using Python but I’d interpret it as another instance of Worse is Better. Just like Linus was told monolithic kernels were the wrong tool for the job but we’re all running Linux anyway.
Sometimes all these “best practices” are just not how things work in reality. In reality Python is a mission-critical language in many massively important projects, and its performance characteristics matter a ton, and efforts to improve them should be lauded rather than scrutinized.
A few successful projects in the world did it. There are likely far more successful products that didn't.
The key metric along these lines is how often each language enables some level of success and how often projects fail, especially when the failure is due to the choice of language.
>should be lauded rather than scrutinized
One can do both at the same time.
Doesn't Instagram serve mostly static content that's put together in an appealing way by mobile apps? I'd figure Instagram's CDN has far more impact than whatever Python code it's running somewhere in its entrails.
Cargo cult approaches to tech stacks don't define quality.
How does python score on these key metrics?
All those namedrops mean and matter nothing. Hacking together proofs of concept is a time-honoured tradition, as is pushing to production hacky code that's badly stitched up. Who knows if there was any technical analysis to pick Python over any alternative? Who knows how much additional engineering work and additional resources were required to keep that Python code from breaking apart in production? I mean, Python always figured very low in webapp framework benchmarks. Did that change just because <trendy company> claims it used Python?
Also serving a lot of monthly users says nothing about a tech stack. It says a lot about the engineering that went into developing the platform. If a webapp is architected so that it can scale well to meet its real-world demand, even after paying a premium for the poor choice of tech stack some guy who is no longer around made in the past for god knows what reason, what would that say about the tech stack?
If your goal is to actually ship product then this matters a lot. Many of us have dealt with folks spinning endlessly on "technical analysis" when just moving forward with something like python would be fine. Facebook is PHP.
I'm actually cautious now when folks are focused too much on the tech and tech analysis instead of product / users / client need.
Why look at results when you can look at analysis!
The problem with this mindless cargo culting around frameworks and tech stacks is that these cultists look at business results and somehow believe that they have anything to do with arbitrary choices regarding tech stacks.
It's like these guys look at professional athletes winning, and proceed to claim the wins are due to the brand of shoes they are wearing, even though the athlete had no say in the choice and was forced to wear what was handed to them.
It's rarely the best choice, or even the fifth best. But if it's OK at a dozen things, then it makes it all but impossible to ignore. The fact that it sucks to write a GUI in is fine as long as I can put some basic "go" buttons and text boxes in front of a web scraper.
On the other hand, for many years a lot of Amazon's back office processes were written in Elisp.
With enough thrust you can get a pig to fly but effort can only compensate for bad technical decisions up to a point.
Elisp? Elisp? Are you sure?
> Shel wrote Mailman [not the Python mailing list manager, an Amazon-internal application] in C, and Customer Service wrapped it in Lisp. Emacs-Lisp. You don't know what Mailman is. Not unless you're a longtime Amazon employee, probably non-technical, and you've had to make our customers happy. Not indirectly, because some bullshit feature you wrote broke (because it was in C++) and pissed off our customers, so you had to go and fix it to restore happiness. No, I mean directly; i.e., you had to talk to them. Our lovely, illiterate, eloquent, well-meaning, hopeful, confused, helpful, angry, happy customers, the real ones, the ones buying stuff from us, our customers. Then you know Mailman.
> Mailman was the Customer Service customer-email processing application for ... four, five years? A long time, anyway. It was written in Emacs. Everyone loved it.
> People still love it. To this very day, I still have to listen to long stories from our non-technical folks about how much they miss Mailman. I'm not shitting you. Last Christmas I was at an Amazon party, some party I have no idea how I got invited to, filled with business people, all of them much prettier and more charming than me and the folks I work with here in the Furnace, the Boiler Room of Amazon. Four young women found out I was in Customer Service, cornered me, and talked for fifteen minutes about how much they missed Mailman and Emacs, and how Arizona (the JSP replacement we'd spent years developing) still just wasn't doing it for them.
> It was truly surreal. I think they may have spiked the eggnog.
I wrote all the rest of the back-office utilities (at least, the initial versions of them), and I have never come across any indication that they got "wrapped" in anything else (they did, no doubt, evolve and mutate into something utterly different over time).
Yegge's quote is also slightly inaccurate in that mailman, like all early amzn software, was written in C++. Shel and I just chose not to use very much of the (rather limited) palette of C++ syntax & semantics.
Most days I regret posting to HN. Today is not one of those days.
A good example is Gnus.
> I guess you could interpret that to mean all these people building these products made a bad choice that succeeded despite using Python but I’d interpret it as another instance of Worse is Better. Just like Linus was told monolithic kernels were the wrong tool for the job but we’re all running Linux anyway.
This isn't the correct perspective or takeaway. The 'tool' for the job when you're talking about building/scaling a website changes over time as the business requirements shift. When you're trying to find market fit, iterating quickly using 'RAD'-style tools is what you need to be doing. Once you've found that fit and you need to scale, those tools will need to be replaced by things that are capable of scaling accordingly.
Evaluating this as a binary right choice / wrong choice only makes sense when qualified with a point in time and/or scale.
YouTube video processing uses C++. It also uses Go and Java along with Python.
Pinterest makes heavy use of Erlang for scaling. The rate-limiting system for Pinterest’s API and Ads API is written in Elixir and responds faster than its predecessor.
Takeaway: Basically you need to either build your own PythonVM/CPython fork for better Python performance or use another language for the parts that need to scale or run fast.
Those companies that succeed with Python usually have a long history with it; Python was never successfully removed, though attempts were most likely made. Programming-language economics is often about stickiness, and it's not easy to propose an absolute measure.
Same thing with Python: those businesses succeeded despite Python, and when they grew it was time to port the code to something else, or to spend herculean effort on the next Python JIT.
Portability has many solutions that are good enough - often only bad because they result in second order issues which are themselves solvable with limited pain. Being able to scale software further without having to solve difficult distributed systems problems is of value.
A critical thing is Python does numerics very, very well. With machine learning, data science, and analytics being what they are, there aren't many alternatives. R, Matlab, and Stata won't do web servers. That's not to mention wonderful integrations with OpenCV, torch, etc.
Python is also competent at dev-ops, with tools like ansible, fabric, and similar.
It does lots of niches well. For example, it talks to hardware. If you've got a quadcopter or some embedded thing, Python is often a go-to.
All of these things need to integrate. A system with Ruby+R+Java will be much worse than one which just uses Python. From there, it's network effects. Python isn't the ideal server language, but it beats a language which _just_ does servers.
As a footnote, Python does package management much better than alternatives.
pip+virtualenv >> npm + (some subset of require.js / rollup.js / ES2015 modules / AMD / CommonJS / etc.)
No offense meant, but that sounds like the assessment of someone that has only experienced really shitty package management systems. PyPI has had their XMLRPC search interface disabled for months (a year?) now, so you can't even easily figure out what to install from the shell and have to use other tools/a browser to figure it out.
Ultimately, I'm moving towards thinking that most scripting languages actually make for fairly poor systems and admin languages. It used to be the ease of development made all the other problems moot, but there's been large advances in compiled language usability.
For scripting languages you're either going to follow the path of Perl or the path of Python, and they both have their problems. For Perl, you get amazing stability at the expense of eventually the language dying out because there aren't enough new features to keep people interested.
For Python, the new features mean that module writers want to use them, and then they do, and you'll find that the system Python you have can't handle what modules need for things you want to install, and so you're forced to not just have a separate module environment, but fully separate Pythons installed on servers so you can make use of the module ecosystem. For a specific app you're shipping around this is fine, but when maintaining a fleet of servers and trying to provide a consistent environment, this is a big PITA that you don't want to deal with when you've already chosen a major LTS distro to avoid problems like this.
Compiling a scripting language usually doesn't help much either, as that usually results in extremely bloated binaries which have their own packaging and consistency problems.
This is a cyclical problem we've had so far. A language is used for admin and system work, the requirements of administrators grate up against the usage needs of people that use the language for other things, and it either fails for non-admin work, loses popularity, and gets replaced by something more popular (Perl -> Python), or it fails for admin work because it caters to other uses and eventually gets replaced by something more stable (what I think will happen to Python, and what I think somewhat happened to bash earlier for slightly different reasons).
I'm not a huge fan of Go, but I can definitely see why people switch to it for systems work. It alleviates a decent chunk of the consistency problems, so it's at least better in that respect.
Yes, this is, frankly, an absurd situation for python.
And then there is the fact that I end up depending on third-party solutions to manage dependencies. Python is big-time now; stop the amateur hour crap.
If you use it as a scripting language, that might very well be the case (it's at least simpler). When you're building libraries or applications, no, definitely not. It's a huge mess, and every 3 years or so we get another new tool that promises to solve it, but just ends up creating a bigger mess.
Even more so, given that outside of Julia no language competes with Python at high-level numerics, and numerics in general only adds C++ to the list.
It's used in a bunch of small niches, but it has users beyond just astronomy.
** A million disclaimers apply.
There is a reason why most BLAS implementations have been rewritten into C.
Not unless they're pushed to, like Python was.
>A critical thing is Python does numerics very, very well.
That's not Python doing numerical stuff. That's C code, called from Python.
As for me, I write:
A * B
It multiplies two matrices. C can't do that. In C, I'd have some unreadable matrix64_multiply(a, b). Readability is a big deal. Math should look more-or-less like math. I can handle 2^4, or 2**4, but if you have mpow(2, 4) in the middle of a complex equation, the number of bugs goes way up.
I'd also need to allocate and free memory. Data wrangling is also a disaster in C. Format strings were a really good idea in the seventies, and were a huge step up from BASIC or Python. For 2022?
And for that A * B? If I change data types, things just work. This means I can make large algorithmic changes painlessly.
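To make the point concrete, here's a minimal numpy sketch (numpy being the usual route for Python numerics; note that with numpy arrays * is elementwise and @ is the matrix product):

import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

C = A @ B              # matrix product; no manual allocation or freeing
D = 2 ** A             # elementwise powers read like the math
A = A.astype(complex)  # change the data type and the same lines still work
print(C)
print(D)
print(A @ B)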
Oh, and I can develop interactively. ipython and jupyter are great calculators. Once the math is right, I can copy it into my program.
I won't even get started on things like help strings and documentation.
Or closures. Closures and modern functional programming are huge. Even in the days of C and C++, I'd rather do math in a Lisp (usually, Scheme).
I used to do numerics in C++, and in C before that. It's at least a 10x difference in programmer productivity stepping up to Python.
Your comment sounds like someone who has never done numerical stuff before, or at least not serious numerical stuff.
In C you get BLAS which provides functions like ?gemm, a BLAS level-3 function which literally stands for general matrix-matrix product.
Also, anyone doing any remotely serious number crunching/linear algebra work is well aware that you need to have control over which algorithms you use to run these primitive operations, and which data type you're using.
> I used to do numerics in C++, and in C before that. It's at least a 10x difference in programmer productivity stepping up to Python.
I'm rather skeptical of your claim. Eigen is the de-facto standard C++ linear algebra toolkit and it overloads operators for basic arithmetics.
I'm not sure your appeal to authority is backed up with any relevant experience or authority. It's ok if you like Python and numpy, but don't try to pass off your personal taste for anything with technical merit.
> In C you get BLAS which provides functions like ?gemm, a BLAS level-3 function which literally stands for general matrix-matrix product.
> Hardly cryptic.
The signature for dgemm is
void cblas_dgemm(const CBLAS_LAYOUT layout, const CBLAS_TRANSPOSE TransA,
                 const CBLAS_TRANSPOSE TransB, const CBLAS_INT M, const CBLAS_INT N,
                 const CBLAS_INT K, const double alpha, const double *A,
                 const CBLAS_INT lda, const double *B, const CBLAS_INT ldb,
                 const double beta, double *C, const CBLAS_INT ldc);
The signature of gemm is trivial if you're aware of the basics of handling dense row-major/column-major matrices.
I haven't met a single person who did any number crunching work whatsoever who ever experienced any problem doing basic matrix-matrix products with BLAS. Complaining about flags to handle row-major/column-major matrices while boasting about being an authority on number crunching is something that's not credible at all.
However, most of working with code is reading and modifying it, not writing it. Equations are sometimes hard enough when they look like equations. If you've got a call like that in your code, it's going to be radically harder to understand than A*B.
That's not to mention all the clutter around it, of allocating and deallocating memory.
One of the things C++ programmers typically don't understand is the kind of boost one gets from closures, garbage collection, and functional programming in general, especially for numerics. I recommend this book:
In it, you'll see code which:
- The user writes a Lagrangian
- The system symbolically computes the Lagrange equations (which are derivatives of the above)
- This is compiled into native code
- A numerical algorithm integrates that into a trajectory of motion
- Which is then plotted
All of this code is readable (equations are written in Lisp, but rendered in LaTeX). None of this is overly hard in a Lisp; this was 1-3 people hacking together, and not even central to what they were doing. It'd be nigh-impossible in C or C++.
Those are the sorts of code structures which C/C++ programmers won't even think of, because they're impossible to express.
(Footnote: Recent versions of C++ introduced closures; I have not used them. People who have used them say they're not "real closures.")
The hard part is the compiler. The somewhat hard part is the efficient numerical integrator (if you want good convergence and rapid integration). The symbolic manipulation is easy, once you know what you're doing. If you want to know what you're doing, it's sufficient to see how other people did it:
And if you haven't seen it:
If you're dumb like me, and can't write a compiler or an efficient numerical integrator in your spare time, having this interpreted and using a naive integrator is still good enough most of the time. Computers are fast. The authors of the above proved the motion of the solar system is chaotic with a very, very long, very, very precise numeric integration on hardware from decades ago, so they have super-fancy code. For the types of things I do, a dumb integration is fine.
And using the tool is even easier. Look at SICM (linked book). The things that look like code snippets are literally all the code you need.
The system, if you want it:
That said, I'll maintain that my initial skepticism was somewhat justified given that the derivation relies on a pre-written symbolic manipulation/solver library (scmutils) so it's not quite 'from scratch' in a Lisp ;) although I believe that you could write such a library yourself, as SICP demonstrates.
As a side note, it looks like I can't go through this book because I have an M1 Mac, which GNU Scheme doesn't support :(
In case you are forced to use the unreadable long-named unintuitively-syntaxed methods, add unit tests, and check that input-output pairs match with whatever formula you started with.
if 0.1 + 0.2 == 0.3:
print('Data is handled as expected.')
Raw floats don't get you there unfortunately.
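For completeness, a small sketch of the usual workarounds: decimal.Decimal for exact decimal arithmetic, math.isclose for tolerant comparison of raw floats.

from decimal import Decimal
import math

print(0.1 + 0.2 == 0.3)                                    # False: binary floats can't represent 0.1 exactly
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))   # True: exact decimal arithmetic
print(math.isclose(0.1 + 0.2, 0.3))                        # True: compare floats with a tolerance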
Other people have posted other examples but it’s not possible to represent real numbers losslessly in finite space. Mathematicians use symbolic computation but that probably is not what you would want for numerics. I could see a language interpreting decimal input as a decimal value and forcing you to convert it to floating point explicitly just to be true to the textual representation of the number, but it would just be annoying to anyone who wants to use the language for real computation and people who don’t understand floating point would probably still complain.
Edit: I’ll admit I have a pet peeve that people aren’t taught in school that decimal notation is a syntactic convenience and not an inherent property of numbers.
That's sort of a distinction without a difference, isn't it? Python can be good for numeric code in many instances because someone has gone through the effort of implementing wrappers atop C and Fortran code. But I'd rather be using the Python wrappers than C or especially Fortran directly, so it makes at least a little sense to say that Python "does numerics [...] well".
> Not unless they're pushed to, like Python was.
R and Matlab, maybe. A web server in Stata would be a horrible beast to behold. I can't imagine what that would look like. Stata is a terrible general purpose language, excelling only at canned econometrics routines and plotting. I had to write nontrivial Stata code in grad school and it was a painful experience I'd just as soon forget.
Readability of code and ease of use is a big thing. It's just not about pushing hard till we make it.
And then it pays dividends later, as well, because it's really easy for a python developer to pick up code and maintain it, but for JS it's more dependent on how well the original programmer designed it.
Django handles a single request at a time with no async. The standard fix is gunicorn worker processes, but then for N concurrent requests you need N copies of the whole worker process's memory instead of N lightweight per-request thread/struct allocations.
I shudder to think that whenever Django server is doing an HTTP request to a different service or running a DB query, it's just doing nothing while other requests are waiting in the gunicorn queue.
The difference is that if one customer hits an endpoint whose queries take 2s+, with Django it might cause the entire service to stall for everybody, whereas with a decent async server framework the other, fast endpoints can make progress while the slow 2s ones run.
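A toy sketch of the difference (plain asyncio, not Django or any real framework): one 2-second "query" sharing a worker with ten cheap requests.

import asyncio, time

async def slow_endpoint():
    await asyncio.sleep(2)        # stands in for a 2 s database query
    return "slow"

async def fast_endpoint(i):
    await asyncio.sleep(0.01)     # a cheap request
    return f"fast {i}"

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(slow_endpoint(),
                                   *(fast_endpoint(i) for i in range(10)))
    # the fast requests all finish long before the slow one;
    # behind a single blocking sync worker they would each queue for ~2 s
    print(len(results), f"{time.perf_counter() - start:.2f}s")

asyncio.run(main())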
Either way, high concurrency websites shouldn't have queries that take multiple seconds and it's still possible to block async processes in most languages if you mix in a blocking sync operation.
I'm looking at a page now which recommends 9 concurrent requests for a Django server running on a 4-core computer.
Meanwhile node servers can easily handle hundreds of concurrent requests.
I don't think in 'handling x concurrent requests' terms because I don't even know what that means. Usually I think in terms of throughput, latency distributions, and the number of connections that can be kept open (for servers that deal with WebSockets).
For example, if you have the 4-core computer and 4 workers, and your requests take around 50ms each, you can get to a throughput of 80 requests per second. If the fraction of request time spent on IO is 50%, you can bump your thread count to try to reach 160 requests per second. Note that in this case each request consumes 25ms of CPU, so you would never be able to get more than 40 requests per second per CPU whether you are using node or python.
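Spelling out that arithmetic (all numbers are the assumptions above):

workers = 4                # one blocking worker per core
request_time = 0.050       # 50 ms wall-clock per request
io_fraction = 0.5          # half of each request is spent waiting on IO

sync_throughput = workers / request_time             # 80 req/s with blocking workers
cpu_per_request = request_time * (1 - io_fraction)   # 25 ms of CPU per request
cpu_ceiling = workers * (1 / cpu_per_request)        # 160 req/s, whatever the language
print(sync_throughput, cpu_ceiling)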
What about Python makes it unsuitable for those purposes other than its performance?
But I disagree on "not the right jobs for the tool".
Python is extremely versatile and can be used as a valid tool for a lot of different jobs, as long as it fits the job requirements, performance included.
It doesn't require a CS degree to know that fitting job requirements and other factors like the team expertise, speed, budget, etc, are more important than fitting a theoretical sense of "right jobs for the tool".
It requires experience.
A lot of those lessons only come after you've seen how much more expensive it is to maintain a system than to develop one, and how much harder people issues are than technical issues.
A CS degree, or even a junior developer, won't have that.
Whether Python will be easier or harder to maintain depends on numerous factors that vary so much for each job that you cannot generalize upon.
That's something experience shows.
Reaching such a conclusion that "Python is not a right tool for web backend" is just naive.
No matter how experienced a developer is, reality of the world is at least 100x more diverse than what they alone could possibly have learned and experienced.
If one believes to possess the experience of everything to generalize on complex topics like this, it just shows this person could benefit from cultivating a bit more humbleness.
Another example, with a different take: MicroPython, on embedded. The only good reason I can think of for this is to appeal to people who've learned Python, and don't want to learn another language.
I have been leading a Python project lately and, yes, the tooling is very poor, although it is getting better. I have found poetry to be very good for venv management and lock files, plus having one file for all your config.
Python is much more than just a scripting language. I remember attending a talk a few years ago about JPMorgan's 35 million LOC Python codebase. Python is being used to build seriously large software nowadays and I don't think performance is ever a minor issue. It should always be in the top 3 for any general purpose language because it directly translates into development speed, time and money.
You can use pyproject.toml or requirements.txt as lock files; Poetry can use the former, plus poetry.lock files, as well.
Is it possible to solve your problem using pip freeze?
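For reference, the minimal version of that workflow (pip freeze pins every installed package, transitive dependencies included, but records no hashes and doesn't distinguish direct from indirect dependencies):

pip freeze > requirements.txt        # snapshot the current environment
pip install -r requirements.txt      # reproduce it elsewhere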
Anything beyond that, there are compiled languages with REPL available.
Eclipse has had Java scratchpads for ages, Groovy also works for trying out ideas, and nowadays we have jshell.
F# has a REPL in the ML lineage, and nowadays C# also shares a REPL with it in Visual Studio.
Lisp variants, going at it for 60 years.
C++, there are hot reload environments, scripting variants, and even C and C++ debuggers can be quite interactive.
I used GDB in 1996, alongside XEmacs, as poor man's REPL while creating a B+Tree library in C.
Yes, there are Go interpreters available,
Might be tripping you up. Very few languages require that implementations be compiled or interpreted. For most languages, having a compiler or interpreter is an implementation decision.
I can implement Python as an interpreter (CPython) or as a compiler (mypyc). I can implement Scheme as an interpreter (Chicken Scheme's csi) or as a compiler (Chicken Scheme's csc). The list goes on: Standard ML's Poly/ML implementation ships a compiler and an interpreter; OCaml ships a compiler and an interpreter.
There are interpreted versions of Go like https://github.com/traefik/yaegi. And there are native-, AOT-compiled versions of Java like GraalVM's native-image.
For most languages there need be no relationship at all between compiler vs interpreter, static vs dynamic, strict or no typing.
“Performance is a virtue; if Perl ceases to be good enough, or you need to write ‘serious’ software rewrite in C.”
And during Python’s ascension, the common narrative shifted very slightly:
“Performance is a virtue, but developer productivity is a virtue too. Plus, you can drop to C to write performance critical portions.”
Then for our brief all-consuming affair with Ruby, the wisdom shifted more radically:
“Developer productivity is paramount. Any language that delivers computational performance is suspect from a developer productivity standpoint.”
But looking at “high-level” languages (i.e. languages that provide developer productivity enhancing abstraction), we can rewind the clock to look at language families that evolved during more resource-constrained times.
Those languages, the lisps, schemes, smalltalks, etc. are now really, really fast compared to Python, and rarely require developers to shift to alternative paradigms (e.g. dropping to C) just to deliver acceptable performance.
Perl and Python exploded right at the time that Lisp/Scheme hadn’t quite shaken the myth that they were slow, with Python/Perl achieving acceptable performance by having dropped to C most of the time.
Now the adoption moat is the wealth of libraries that exist for Python—and it’s a hell of a big moat. If I were a billionaire, I’d hire a team of software developers to systematically review libraries that were exemplars in various languages, and write / improve idiomatic, performant, stylistically consistent versions in something modern like Racket. I’d like to imagine that someone would use those things :-)
In addition, "real" performance is often tricky to measure and may be irrelevant compared to other parts of the system. Yes, Ruby is 10-100x slower than C. But if a user of my web service already has a latency of (say) 200ms to the server then it barely matters if the web service returns a response in 5 ms or in 0.5 ms. Similarly for rendering an email: no user will notice their email arriving half a second earlier. Similarly for a python notebook: if it takes 1 or 2 seconds to prepare some data for a GPU processing job that will take several hours, it doesn't really matter that the data preparation could have been done in 0.1 seconds instead if it had been done in Rust.
Especially for startups where often you're not sure if you're building the right thing in the first place, a big ecosystem of prebuilt libraries is super important. If it turns out people actually want to buy what you've made in sufficient numbers that the inefficiency of Ruby/Python/JS/etc becomes a problem then you can always rewrite the most CPU intensive parts in another language. Most startup code will never have the problem of "too many users" though, so it makes no sense to optimize for that from the start.
The productivity gain is so great it outweighs everything else. It's as simple as that.
Developer salaries are way higher than CPU and server costs. Productivity wins out here vs. performance.
There are other effects at play, too, of course.
If you write an object system on top of your performant hobby Scheme implementation, you'll likely find that the performance of its method dispatch is about as slow as it is in Python. Probably even slower.
Purely procedural Python code isn't as slow as object-oriented Python code.
The Python ecosystem has certainly received a lot of developer resources and attention the past couple of decades. Shall we compare the performance of CLOS on SBCL, which again has seen comparatively little developer resources, to Python's performance in dealing with objects? I'd take that performance wager.
I'm thinking CLOS has more dynamism than Python - they're both dynamically typed, both doing a lookup and then a dispatch, but CLOS adds dynamism on top of that: it's also looking in the metadata thingy (I'm not a Lisp developer - do they call it the hash? I mean the key-value store on every "atom"; I'm so out of my depth here, is atom even the right word?). Plus, if I remember right, the way CLOS works you use multiple dispatch, not just single dispatch like Python.
I like python (especially things like comprehensions), but to say it's more dynamic than common lisp is a little insane.
If I recall correctly, method calls in CLOS are also not syntactically distinguished from regular calls either (like they are in x.f() languages), so there is no motivation to write a method unless you actually want the dynamic dispatch methods provide. Methods are fairly rare compared to regular functions.
Unless you declare them "notinline".
> so there is no motivation to write a method unless you actually want the dynamic dispatch methods provide.
Methods have some different functionality in CL than in most languages. For example, you can have the methods from the entire inheritance hierarchy (well, the classes that have had that method specialized for them) all be called as, essentially a pipeline of methods.
But this is a discussion about dynamism of the object system, and most of that is defined before runtime. How about changing an object's class at runtime? https://www.cliki.net/site/HyperSpec/Body/stagenfun_change-c...
a become: b
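For comparison, Python's closest analogue is rebinding an instance's __class__ at run time (a rough parallel to CLOS change-class; Smalltalk's become:, which swaps object identities, has no Python equivalent). A minimal sketch:

class Draft:
    def state(self):
        return "draft"

class Published:
    def state(self):
        return "published"

post = Draft()
post.__class__ = Published    # the same object now dispatches as Published
print(post.state())           # published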
Two parts to your argument:
- Writing a Scheme implementation quickly: Google "Write a Scheme in 48 hours" and "Scheme from scratch." 48 hours to a functioning Scheme implementation seems to be a feat replicated in multiple programming languages.
- Performance: I haven't benchmarked every hobby scheme, but given the proliferation of Scheme implementations that, despite limited developer resources, beat (pure) Python with its massive pool of developers (CPython, PyPy), I still don't buy the idea that optimizing Scheme is a harder task than optimizing Python. Again, I'd strongly suggest that optimizing Scheme is a much easier task than optimizing Python simply by virtue of how often the feat has been accomplished.
Fair enough, you're right. But if we're only talking about incomplete Scheme implementations it's not a very interesting claim. As I pointed out in another comment, even I could write a fast Scheme implementation in 48 hours if I kept my scope very limited. That doesn't say much about Scheme performance overall or how it relates to Python.
You simply can't say these things about Python (and I generally like Python!). It's truer for PyPy, but PyPy is pretty big and complex itself. Take a look at the source for the scheme or scheme-derived language of your choice sometime. I can't claim to be an expert in any of what's going on in there, but I think you'll be surprised how far down those parens go.
Given how many performant implementations of Scheme there are, I just don't think you can claim it's because of complex implementations by well-resourced groups. To me, I think the logical conclusion is that Scheme (and other lisps for the most part) are intrinsically pretty optimizable compared to Python. If we look at Common Lisp, there are also multiple performant implementations, some approximately competitive with Java which has had enormous resources poured into making it performant.
"An Incremental Approach to Compiler Construction"
I think people who haven't implemented a language underestimate how slow CPython is. And overestimate how hard it is to build a compiler for a dynamic language.
I think every professional developer or CS student can and should build a compiler for a dynamic language!
Even I, very much a non-computer scientist, could write a fast Scheme quickly if I could keep myself to a very small subset, so that’s not interesting to me.
Compared with current state of affairs in CPython, even with basic code generation algorithms, the machine code would be fast enough.
An introduction to compilers also means writing one in the process, unless it is a lousy degree.
But compared to say Smalltalk, Scheme, and Lisp, I think Ruby and Python have been able to evolve incrementally in part because of their simpler more hackable implementations.
So traits for Squeak but not the commercial Smalltalks.
The larger Python ecosystem has tried that a number of times too (Unladen Swallow, PyPy, etc.)
It's quite difficult since both of those languages already lean heavily on C FFI and having frequent hops in and out of FFI code tends to make it harder to get the JIT fast. JITs work best when all of the code is in the host language and can be optimized and inlined together.
Python could have done this easily too but evolving as a language just isn't as big a priority (not that I'm saying it should be) and that's completely (or mostly) disconnected from their backend implementation decisions.
To increase Python performance, you only need more Python!
Another possibility is that the requirement to "drop to C" is a virtue by de-democratizing access to serious performance. In other words, let the commoners eat Python, while the anointed manage their own memory.
I personally find this argument a bit distasteful / disagree with it, but there was a thread the other day that talked about the, uh, variable quality of code in the Julia ecosystem (Julia being another language where dropping to C isn't important for performance). In Julia, the academics can just write their code and get on with their work—the horror!
The dynamic nature of the language is actually something I studied a few years back, particularly variable and object-attribute lookups! My work was just a master's thesis, so we didn't go too deep into the trickier dynamic aspects of the language (e.g. eval, which we restricted entirely). But we did see performance improvements by restricting the language in certain ways that aid static analysis, which allowed for more performant runtime code. For those interested, the abstract of my thesis gives more insight into what we were evaluating.
Our results showed that restricting dynamic code (code that is constructed at run time from other source code) and dynamic objects (mutation of the structure of classes and objects at run time) significantly improved the performance of our benchmarks.
There was also some great discussion on HN when I posted our findings.
Well, yes. In Python, one thread can monkey-patch the code in another thread while running. That feature is seldom used. In CPython, the data structures are optimized for that. Underneath, everything is a dict. This kills most potential optimizations, or even hard-code generation.
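A small illustration of the dynamism being paid for: module and class namespaces are ordinary dicts, so any code (including another thread) can rewrite them while the program runs.

import math

class Greeter:
    def hello(self):
        return "hello"

print(vars(math)["pi"])             # module attributes live in a plain dict
Greeter.hello = lambda self: "hi"   # methods can be swapped out at any time
print(Greeter().hello())            # hi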
It's possible to deal with that efficiently. PyPy has a compiler, an interpreter, and something called the "backup interpreter", which apparently kicks in when the program being run starts doing weird stuff that requires doing everything in dynamic mode.
I proposed adding "freezing", immutable creation, to Python in 2010, as a way to make threads work without a global lock. Guido didn't like it. Threads in Python still don't do much for performance.
It doesn’t - this has been a basically solved problem since Self and deoptimisation were invented.
I'm thinking of reimplementing some python modules in rust, as that seems like the kind of weird thing I'm in to. I've done it with some success (using the excellent work of the pyo3 project) professionally, but I'd be interested in doing more.
You can make large Numpy arrays fine — eg, 20k x 20k or 500k x 500k, but trying to render that to anything but SVG or manual tilings pukes badly.
That’s my main blocker on rendering high dimensional shapes: you can do the math, but visualizations immediately fall over (unless you do tiling yourself).
There’s probably someone with a more useful idea than “gigapixel rendering” though.
But I'm sure there's always room for improvement
Same API means I can import polars as pd and be done with it.
Also, if you happen to build microservices, don't forget to try PyPy, that's another easy performance booster (if it's compatible to your app).
Every time I experiment with PyPy (on a set of non-trivial web services) I encounter at least one incompatibility with PyPy in the dependency tree and leave disappointed.
I really enjoy Nim which is "slick as Python, fast as C".
A dedicated 40core / 6Tb server is around $2k but will be amortized over the years of its life.
It needs power, cooling, someone to install it in a rack, someone to recycle it afterwards, ..., around $175/yr
A FAANG dev varies wildly but $400k seems fair-ish (given how many have TC > 750k).
So that's about 12 hours of time optimising the code vs throwing another 40c / 6Tb machine at the problem for 365 days.
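Roughly, using the figures above (all of them the stated assumptions, not real numbers):

server_capex = 2000        # 40-core / 6 TB box
server_opex = 175          # power, cooling, racking, recycling per year
dev_comp = 400_000         # total comp per year
dev_hourly = dev_comp / 2000          # ~2000 working hours/year, so ~$200/hour

break_even_hours = (server_capex + server_opex) / dev_hourly
print(break_even_hours)    # ~11 hours of optimisation vs buying and running the box for a year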
The big cost i'm missing out of both the server and the developer is the building they work in. What's the recharge for a desk at a FAANG, $150k/yr ? I have no idea how much a rack slot works out at.
Unless I've screwed up the figures anywhere, we should probably all be looking at replacing Python with Ruby if we can squeeze out more developer productivity!
What’s the ping latency from US East to Europe? 80ms-ish? What’s a roundtrip to postgres with a regular business app type query, 20ms-ish? What’s the latency on a beefy rails app’s request handling, 40ms?
We’re talking 140ms best case for a slow stack. What can you get that down to with tuning work?
When your user comes along on their 4g connection with 800ms latency, will they be able to tell the difference?
Don’t get me wrong, I’d far rather invest time in making the stack efficient but from a business point of view, it might not make sense vs. just throwing hardware at the problem and spending your expensive engineering resources on making it possible to deliver more utility to customers.
I don't understand how this is supposed to be "quickly" verifiable?
Nothing prevents you from doing eval('gl' + 'obals')()['len'] = ...; how is the interpreter supposed to quickly check that this isn't the case when you're calling a function that might not even be in the current module?
Doing this correctly would seem to require a ton of static analysis on the source or bytecode that I imagine will at best be slow, and at worst impossible due to the halting problem.
In the real world, when faced with a generally undecidable problem, we don't run away and lose all hope. We decide the cases that can be decided, and do something safe when they can't be decided.
In your example, Python can just re-optimize everything after an eval. That doesn't stop it from running optimized code if the eval does not happen. It can do even more and only re-optimize things that the eval touched, which has some extra benefits and costs, so it may or may not be better.
Besides, when there isn't an eval on the code, the interpreter can just ignore anything about it.
I'm not lost on that at all; I'm well aware of that. That's precisely why I wrote
>> [...] require a ton of static analysis on the source or bytecode that I imagine will at best be slow, and at worst impossible due to the halting problem
>> static analysis is impossible in the general case so we run away and lose all hope.
I'm not sure how you read that sentiment from my comment.
Also, you can combine the two to get something better than any single analysis alone.
x86 processors solve this by speculating about what's going on. If you suddenly run into a 1976-era operation, everything slows down dramatically for a bit (but still goes faster than an 8086). If you have a branch or cache miss, things slow down a little bit.
One has a few possibilities:
- A static analysis /proves/ something. print is print. You optimize a lot.
- A static analysis /suggests/ something. print is print, unless redefined in an eval. You just need to go into a slow path in operations like `eval`, so if print is modified, you invalidate the static analysis.
- A static or dynamic analysis suggests something probabilistically. You can make the fast path fast, and the slow path eventually work. If print isn't print, you raise an internal exception, do some recovery, and get back to it.
I'm also okay with this analysis being run in prod and not in dev.
As a footnote, JITs, especially in Java, show that this kind of analysis can be pretty fast. You don't need it to work 100% of the time. The case of a variable being redefined in a dozen places, you just ignore. The case where I call a function from three places which increments an integer each time, I can find with hardly any overhead at all. The latter tends to be where most of the bottlenecks are.
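A toy Python rendering of that "suggest, then guard" idea: assume the builtin hasn't been rebound, take a specialized fast path, and keep the generic dynamic lookup as the fallback when the guard fails.

import builtins

def length(seq):
    # fast path: if len hasn't been shadowed or monkey-patched, a compiler
    # could inline/specialize the call (here: go straight to __len__)
    if len is builtins.len:
        return seq.__len__()
    # slow path / "deoptimized": do the normal dynamic lookup, which will
    # pick up whatever len now means
    return len(seq)

print(length([1, 2, 3]))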
You don’t verify, and instead you run assuming no verification is needed. Then if someone wants to violate that assumption, it’s their problem to stop everyone who may have made that assumption, and to ask them to not make it going forward.
You shift the cost to the person who’s doing the metaprogramming and keep it free for everyone who isn’t.
An "easier" change would be to add a class attribute "no__dict__", which says that the __dict__ attribute can't be used, which lets the implementation do whatever it wants. That can be incrementally added to classes.
Another option is a "no__getattr__" attribute, which disables getattr and friends.
The Python wiki also has some good info about it: https://wiki.python.org/moin/UsingSlots
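__slots__ is the closest existing mechanism to that "no__dict__" idea: instances get a fixed attribute layout and no per-instance __dict__.

class Point:
    __slots__ = ("x", "y")        # fixed layout, no per-instance __dict__
    def __init__(self, x, y):
        self.x, self.y = x, y

p = Point(1, 2)
try:
    p.z = 3                       # rejected: the layout is closed
except AttributeError as e:
    print(e)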
If anyone has a burning desire to try to write the next big dynamically-typed scripting language, I've often noodled in my head with the idea of a language that has a dynamically-typed startup phase, but at some point you call "DoneBeingDynamic()" on something (program, module, whatever, some playing would have to be done here) and the dynamic system basically freezes everything into place and becomes a static system. (Or you have an explicit startup phase to your module, or something like that.)
The core observation I'm driving this on is much the same as the quote I give from the article. You generally set up the vast majority of your "dynamicness" once at runtime, e.g., you set up your monkeypatches, you read the tables out of the DB to set up your active classes, you read the config files and munge together the configurations, etc. But then forever after, your dynamic language is constantly reading this stuff, over and over and over and over again, millions, billions, trillions of times, with it never changing. But it has to be read for the language to work.
Combine that with perhaps some work on a system that backs to a struct-like representation of things rather than a hash-like representation, and you might be able to build something that gets, say, 80% of the dynamicness of a 1990s-era dynamic scripting language, while performing at something more like compiled language speeds, albeit with a startup cost. If you could skip over the dozens of operations resolving
x.y.z.q = 25
You can also view this as a Lisp-like thing that has an integrated phase where it has macros, but then at some point puts this capability down.
I tend to think it's just fundamentally flawed to start from a language that is intrinsically defined such that "x.y.z.q" requires dozens of runtime operations, versus defining a new one where it's a first-class, day-one priority that the system be able to resolve "x.y.z.q" down to some static understanding. E.g., it's OK if y is a property and z is some fancy override, as long as the runtime can simply hardcode the relevant details instead of having to resolve them every time. You can outrun even JIT-like optimizations if you can get this down to the point where you don't even have to check incoming types, you just know.
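A toy sketch of the "done being dynamic" phase, under the big assumption that freezing to a __slots__-style layout is roughly what the hypothetical runtime would do (no such done_being_dynamic API exists in Python today):

def done_being_dynamic(obj):
    # hypothetical: rebuild the instance with a closed, struct-like layout
    fields = tuple(vars(obj))
    Frozen = type("Frozen" + type(obj).__name__, (), {"__slots__": fields})
    frozen = Frozen()
    for name in fields:
        object.__setattr__(frozen, name, getattr(obj, name))
    return frozen

class Config: pass

cfg = Config()
cfg.host, cfg.port = "db.internal", 5432    # dynamic setup phase
cfg = done_being_dynamic(cfg)               # from here on, the attribute layout is fixed
print(cfg.host, cfg.port)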
V8 tries to guess that for classes and objects based on runtime information - that's how it gets some of its speed (it still needs checks about whether this is violated at any point, so that it can get rid of the proxy/stub "static" object it guessed).
For a more static guarantee, there are also things like Object.freeze, which does about what you describe for dynamic objects in JS.
I dunno. It's possible the real world would stomp all over this idea in practice, or the resulting language would just be too complex to be usable. It does imply a rather weird bifurcation between being in "the init phase" and "the normal runtime phase", and who knows what other "phases" could emerge. Although technically, Perl actually already has this split, although generally it can be ignored because it's of much less consequence in Perl precisely because there mostly isn't much utility to having something done in the earlier phase, unlike this hypothetical language.
Partially I throw this idea out as a bone to those who like dynamic languages. Personally I don't have this problem anymore because I've basically given them up, except in cases where the problem is too small to matter. And if you already know and like Lisp, you don't really have this problem either.
But if you are a devotee of the 1990s dynamic scripting languages, you're getting really squeezed right now by performance issues. You can run 40-50x slower than C, or you can run circa 10x slower than C with an amazing JIT that requires a ton of effort and will forever be very quirky with performance, and in both cases you'll be doing quite a lot of work to use more than one core at a time. Python is just hanging in there with the amazing amount of work being poured into NumPy, and from what I gather from my limited interactions with data scientists, as data sets get larger and the pipelines more complex, the odds you'll fall out of what NumPy can do and fall back to pure Python goes up and the price of that goes up too.
I think a new dynamic scripting language built from the ground up to be multithreadable and high performance via some techniques like this would have some room to run, and while hordes of people will come out of the woodwork promising that one of the existing ones will get there Real Soon Now, just wait, they've almost got it, the reality is I think that the current languages have pretty much been pushed as far as they can be. Unless someone writes this language, dynamic scripting languages are going to continue slowly, quite slowly, but also quite surely, just getting squeezed out of computing entirely. I mean, I'm looking at the road ahead and I'm not sure how Go or C# is going to navigate a world where even low-end CPUs casually have 128 cores on consumer hardware.... Python qua Python is going to face a real uphill battle when the decision to use it entails basically committing to not only using less than 1% of the available cores (without offloading on to the programmer a significant amount of work to get past that), but also using that core ~1.5 orders of magnitude less efficiently than a compiled language. You've always had to pay some to use Python, sure, but that's an awful lot of orders of magnitude for "a nice language". Surely we can have "a nice language" for less price than that.
Interestingly, GraalPython never seems to come up on these speeding-up-Python articles & benchmarks while TruffleRuby is a heavyweight in the speeding-up-Ruby space.
If you just set your structure up once and then run statically, you are better off with a static language, which can extract all kinds of value from that fixed structure.
Turned around, Python has arbitrary access into the C programming space (really, the UNIX or Windows process it's running inside); so long as it has access to headers or other type info, it can see C as more than a black box.
Most Python numerics is implemented in numpy; the low levels of numpy are actually (or were) effectively a C API implementing multidimensional arrays, with a Python wrapper.
from lib1 import makedata
from lib2 import processdata
data = makedata()
Yes they are - the CPython interpreter knows nothing about what your C extension does. It can’t optimise it because all it has is your machine code - no higher-level logic.
What dynamic, interpreted, single threaded language is fast?
To answer OP, if you replace "dynamic" by "untyped", Forth qualifies. And it actually can go where there's no JIT to save your A from the "just throw more hardware (and software) at the problem" mindset.
I did not do it for performance considerations, but because I felt the type system of Rust could give me better guarantees here (and it did, I found multiple bugs this way).
On top of that, the performance gain was noticeable, although I did not bother to benchmark it.
There are still some gotchas in the conversion from Python types to Rust (e.g. a function which takes a Vec<String> will happily accept a str and produce a vector of the str's characters). This is common behaviour in Python, where I tend to do "if isinstance(value, str): value = [value]" to deal with it.
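The gotcha in question, and the usual guard, in plain Python:

def normalize(values):
    if isinstance(values, str):   # a bare string would otherwise explode into characters
        values = [values]
    return list(values)

print(list("abc"))                # ['a', 'b', 'c']  -- the surprising behaviour
print(normalize("abc"))           # ['abc']
print(normalize(["a", "b"]))      # ['a', 'b']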
Statically reasoning about all this (as would be required for optimization) is difficult. Not totally impossible, but not something that the canonical CPython implementation tries to do.
It seems to me some guys learned a language suited to a thing and instead of learning other languages better fitted for other purposes, they push for their one and only language to be used everywhere, resulting in delays and financial losses.
It's not very hard to learn another language. Or, if you are that lazy, you can stay with the language you know and use it for what was intended.