Modern Python Performance Considerations (lwn.net)
320 points by chmaynard 16 days ago | 244 comments



With JavaScript, these kinds of optimizations in an engine make sense due to the web being limited by it and thus speed is a huge factor. With Python, however, if a Python web framework is “too” slow, I would honestly say the problem is using Python at all for a web server. Python shines beautifully as a (somewhat) cross platform scripting language: file reading and writing, environment variables, simple implementations of basic utilities: sort, length, max, etc that would be cumbersome in C. The move of Python out of this and into practically everything is the issue and then we get led into rabbit holes such as this where since we are using Python, a dynamic scripting language, for things a second year computer science student should know are not “the right jobs for the tool.”

Instead of performance, I’d like to see more effort in portability, package management, and stability for Python because, essentially since it is often enterprise managed, juggling fifteen versions of Python where 3.8.x supports native collection typing annotations but we use 3.7.x, etc. is my biggest complaint. Also up there is pip and just the general mess of dependencies and lack of a lock file. Performance doesn’t even make the list.
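To make the version juggling concrete, here is a minimal sketch (the function names are made up for illustration). The `typing.List` spelling runs on older interpreters. As far as I know, subscripting the built-in `list` in an annotation is PEP 585 and only works at runtime from 3.9 on, so the sketch guards on the interpreter version instead of assuming it.

```python
import sys
from typing import List


def longest(words: List[str]) -> str:
    """Return the longest word; typing.List keeps this runnable on 3.7/3.8."""
    return max(words, key=len)


# Built-in generics such as list[str] need a newer interpreter (PEP 585),
# so check the version rather than assume it.
SUPPORTS_BUILTIN_GENERICS = sys.version_info >= (3, 9)

if SUPPORTS_BUILTIN_GENERICS:
    def longest_builtin(words: list[str]) -> str:
        """Same function, annotated with the built-in generic syntax."""
        return max(words, key=len)
```

On a fleet with mixed interpreters, the first spelling is the one that runs everywhere.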

This is not to discredit anyone’s work. There is a lot of excellent technical work and research done as discussed in the article. I just think honestly a lot of this effort is wasted on things low on the priority tree of Python.


On paper, Python is not the right tool for the job. Both because of its bad performance characteristics and because it’s so forgiving/flexible/dynamic, it’s tough to maintain large Python codebases with many engineers.

At Google there is some essay that Python should be avoided for large projects.

But then there’s the reality that YouTube was written in Python. Instagram is a Django app. Pinterest serves 450M monthly users as a Python app. As far as I know Python was a key language for the backend of some other huge web scale products like Lyft, Uber, and Robinhood.

There’s this interesting dissonance where all the second year CS students and their professors agree it’s the wrong tool for the job yet the most successful products in the world did it anyway.

I guess you could interpret that to mean all these people building these products made a bad choice that succeeded despite using Python but I’d interpret it as another instance of Worse is Better. Just like Linus was told monolithic kernels were the wrong tool for the job but we’re all running Linux anyway.

Sometimes all these “best practices” are just not how things work in reality. In reality Python is a mission critical language in many massively important projects, its performance characteristics matter a ton, and efforts to improve them should be lauded rather than scrutinized.


>the most successful products in the world did it anyway

A few successful projects in the world did it. There are likely far more successful products that didn't use it.

The key metric along this line is how often each language allows success to some level and how often they fail (especially when due to the choice of language).

>should be lauded rather than scrutinized

One can do both at the same time.


Instagram has one billion monthly users generating $7 billion a year. There are almost zero products on earth as successful.


Just compare Instagram, written in Python, to Google Wave, Google+, or any of Google's other social media products, written in C++/Java :))))


> Instagram has one billion monthly users generating $7 billion a year.

Doesn't Instagram serve mostly static content that's put together in an appealing way by mobile apps? I'd figure Instagram's CDN has far more impact than whatever Python code it's running somewhere in its entrails.

Cargo cult approaches to tech stacks don't define quality.


The point is that it's still one project. You need to count the failures as well to rule out survivorship bias.


And you can put 7 billion of effort into tweaking your python application performance?


> The key metric along this line is how often each language allows success to some level and how often they fail

How does python score on these key metrics?


> But then there’s the reality that YouTube was written in Python. Instagram is a Django app. Pinterest serves 450M monthly users as a Python app. As far as I know Python was a key language for the backend of some other huge web scale products like Lyft, Uber, and Robinhood.

All those namedrops mean and matter nothing. Hacking together proof of concepts is a time honoured tradition, as is pushing to production hacky code that's badly stitched up. Who knows if there was any technical analysis to pick Python over any alternative? Who knows how much additional engineering work and additional resources were required to keep that Python code from breaking apart in production? I mean, Python always figured very low in webapp framework benchmarks. Did that change just because <trendy company> claims it used Python?

Also, serving a lot of monthly users says nothing about a tech stack. It says a lot about the engineering that went into developing the platform. If a webapp is architected so that it can scale well to meet its real world demand, even after paying a premium for the poor choice of tech stack some guy who is no longer around made in the past for god knows what reason, what would that say about the tech stack?


"All those namedrops mean and matter nothing"

If your goal is to actually ship product then this matters a lot. Many of us have dealt with folks spinning endlessly on "technical analysis" when just moving forward with something like python would be fine. Facebook is PHP.

I'm actually cautious now when folks are focused too much on the tech and tech analysis instead of product / users / client need.


> All those namedrops mean and matter nothing. ... Who knows if there was any technical analysis to pick Python over any alternative?

Why look at results when you can look at analysis!


> Why look at results when you can look at analysis!

The problem with this mindless cargo culting around frameworks and tech stacks is that these cultists look at business results and somehow believe that they have anything to do with arbitrary choices regarding tech stacks.

It's like these guys look at professional athletes winning, and proceed to claim the wins are due to the brand of shoes they are using, even though the athlete had no say about the choice and was forced to wear what was handed over to him.


I don't think "I could use tool X for job Y" implies "X was the right tool for job Y". You could commute with a truck to your workplace 300 feet away for 50 years straight and I would still argue you probably used the wrong tool for the job. "Wrong tool" doesn't imply "it is impossible to do this", it just means "there are better options".


The main thing is that Python is often in the top 10 choices for almost every problem on top of being insanely easy to learn and write; it also doesn't fragment its community - its standard library is so ridiculously large there are very few fault lines to break upon.

It's rarely the best choice, or even the fifth best. But if it's OK at a dozen things, then it makes it all but impossible to ignore. The fact that it sucks to write a GUI in is fine as long as I can put some basic "go" buttons and text boxes in front of a web scraper.


IME writing a GUI in Python is pretty easy in a whole bunch of different ways, shipping is the annoying part.


Python is the new BASIC.


Or maybe tech stack really doesn't have that much influence on the success or failure of the business :)


Depends on the margins you have to work with. For high-margin businesses, I agree, the tech stack isn’t crucial to the health of the business. But for low-margin businesses for which compute is a significant cost center, tuning a tech stack to be more cost-efficient can make you a hero.


You should read about MySpace and "Samy is my hero". Or Google vs. AltaVista. Or GeoWorks vs. Microsoft Windows. Or Yahoo Mail and the medireview problem. Or when Danger lost everybody's data. Or THERAC-25. Or Knight Capital's bug.

On the other hand, for many years a lot of Amazon's back office processes were written in Elisp.

With enough thrust you can get a pig to fly but effort can only compensate for bad technical decisions up to a point.


>Elisp

Elisp? Elisp? Are you sure?


I didn't witness it, but Steve Yegge says he did; from https://sites.google.com/site/steveyegge2/tour-de-babel:

> Shel wrote Mailman [not the Python mailing list manager, an Amazon-internal application] in C, and Customer Service wrapped it in Lisp. Emacs-Lisp. You don't know what Mailman is. Not unless you're a longtime Amazon employee, probably non-technical, and you've had to make our customers happy. Not indirectly, because some bullshit feature you wrote broke (because it was in C++) and pissed off our customers, so you had to go and fix it to restore happiness. No, I mean directly; i.e., you had to talk to them. Our lovely, illiterate, eloquent, well-meaning, hopeful, confused, helpful, angry, happy customers, the real ones, the ones buying stuff from us, our customers. Then you know Mailman.

> Mailman was the Customer Service customer-email processing application for ... four, five years? A long time, anyway. It was written in Emacs. Everyone loved it.

> People still love it. To this very day, I still have to listen to long stories from our non-technical folks about how much they miss Mailman. I'm not shitting you. Last Christmas I was at an Amazon party, some party I have no idea how I got invited to, filled with business people, all of them much prettier and more charming than me and the folks I work with here in the Furnace, the Boiler Room of Amazon. Four young women found out I was in Customer Service, cornered me, and talked for fifteen minutes about how much they missed Mailman and Emacs, and how Arizona (the JSP replacement we'd spent years developing) still just wasn't doing it for them.

> It was truly surreal. I think they may have spiked the eggnog.


AFAIK, mailman was the only thing ever wrapped in this way.

I wrote all the rest of the back-office utilities (at least, the initial versions of them), and I have never come across any indication that they got "wrapped" in anything else (they did, no doubt, evolve and mutate into something utterly different over time).

Yegge's quote is also slightly inaccurate in that mailman, like all early amzn software, was written in C++. Shel and I just chose not to use very much of the (rather limited) palette of C++ syntax & semantics.


Aha, the correction is greatly appreciated.

Most days I regret posting to HN. Today is not one of those days.


Well, a proper Emacs module can be set up with menus and a relatively easy interface for everyone.

A good example is Gnus.


> There’s this interesting dissonance where all the second year CS students and their professors agree it’s the wrong tool for the job yet the most successful products in the world did it anyway.

> I guess you could interpret that to mean all these people building these products made a bad choice that succeeded despite using Python but I’d interpret it as another instance of Worse is Better. Just like Linus was told monolithic kernels were the wrong tool for the job but we’re all running Linux anyway.

This isn't the correct perspective or takeaway. The 'tool' for the job when you're talking about building/scaling a website changes over time as the business requirements shift. When you're trying to find market fit, iterating quickly using 'RAD' style tools is what you need to be doing. Once you've found that fit and you need to scale, those tools will need to be replaced by things that are capable of scaling accordingly.

Evaluating this binary right choice / wrong choice only makes sense when qualified with a point in time and/or scale.


Instagram created Cinder https://www.infoworld.com/article/3617913/instagram-open-sou... to address Python Performance.

YouTube video processing uses C++. It also uses Go and Java along with Python.

Pinterest makes heavy use of Erlang for scaling. The rate-limiting system for Pinterest’s API and Ads API is written in Elixir and responds faster than its predecessor.

Takeaway: Basically you need to either build your own PythonVM/CPython fork for better Python performance or use another language for the parts that need to scale or run fast.


I think people look ahead. They see how the app evolves (from A to B) and then claim "X not good", whereas they do not judge the flexibility of X as a tool to move from A to B. Typically they look at B and make claims from that perspective.

Those companies that succeed in Python usually have a long path, and Python was never successfully removed even though attempts were most likely made. The economics of programming languages is often stickiness, and it's not easy to propose an absolute measure.


We might be running Linux, yet it has so many virtualization layers on cloud infrastructure, user space stacks to work around switching to kernel, with microservices for everything, that it is effectively a monolithic kernel being bent into a microkernel.

Same thing with Python: those businesses succeeded despite Python, and when they grew it was time to port the code into something else, or spend herculean efforts on the next Python JIT.


+1. Languages that are general purpose get used for everything. Perl for the web. Python for builds. Scala transpiled for web.

Portability has many solutions that are good enough - often only bad because they result in second order issues which are themselves solvable with limited pain. Being able to scale software further without having to solve difficult distributed systems problems is of value.


I want a common language I can work with. Right now, Python is the only tool which fits the bill.

A critical thing is Python does numerics very, very well. With machine learning, data science, and analytics being what they are, there aren't many alternatives. R, Matlab, and Stata won't do web servers. That's not to mention wonderful integrations with OpenCV, torch, etc.

Python is also competent at dev-ops, with tools like ansible, fabric, and similar.

It does lots of niches well. For example, it talks to hardware. If you've got a quadcopter or some embedded thing, Python is often a go-to.

All of these things need to integrate. A system with Ruby+R+Java will be much worse than one which just uses Python. From there, it's network effects. Python isn't the ideal server language, but it beats a language which _just_ does servers.

As a footnote, Python does package management much better than alternatives.

pip+virtualenv >> npm + (some subset of require.js / rollup.js / ES2015 modules / AMD / CommonJS / etc.)

JavaScript has finally gone from a horrible, no-good, bad language to a somewhat competent one with ES2015, but it has at least another 5-10 years before it can start to compete with Python for numerics or hardware. It's a sane choice if you're front-end heavy, or mobile-heavy. If you're back-end heavy (e.g. an ML system) or hardware-heavy (e.g. something which talks to a dozen cameras), Python often is the only sane choice.


> As a footnote, Python does package management much better than alternatives

No offense meant, but that sounds like the assessment of someone that has only experienced really shitty package management systems. PyPI has had their XMLRPC search interface disabled for months (a year?) now, so you can't even easily figure out what to install from the shell and have to use other tools/a browser to figure it out.

Ultimately, I'm moving towards thinking that most scripting languages actually make for fairly poor systems and admin languages. It used to be that the ease of development made all the other problems moot, but there have been large advances in compiled language usability.

For scripting languages you're either going to follow the path of Perl or the path of Python, and they both have their problems. For Perl, you get amazing stability at the expense of eventually the language dying out because there's not enough new features to keep people interested.

For Python, the new features mean that module writers want to use them, and then they do, and you'll find that the system Python you have can't handle what modules need for things you want to install, and so you're forced to not just have a separate module environment, but fully separate pythons installed on servers so you can make use of the module ecosystem. For a specific app you're shipping around this is fine, but when maintaining a fleet of servers and trying to provide a consistent environment, this is a big PITA that you don't want to deal with when you've already chosen a major LTS distro to avoid problems like this.

Compiling a scripting language usually doesn't help much either, as that usually results in extremely bloated binaries which have their own packaging and consistency problems.

This is a cyclical problem we've had so far. A language is used for admin and system work, the requirements of administrators grate up against the usage needs of people that use the language for other things, and it fails for non-admin work and loses popularity and gets replaced by something more popular (Perl -> Python) or it fails for admin work because it caters to other uses and eventually gets replaced by something more stable (what I think will happen to Python, what I think somewhat happened to bash earlier for slightly different reasons).

I'm not a huge fan of Go, but I can definitely see why people switch to it for systems work. It alleviates a decent chunk of the consistency problems, so it's at least better in that respect.


>No offense meant, but that sounds like the assessment of someone that has only experienced really shitty package management systems. PyPI has had their XMLRPC search interface disabled for months (a year?) now, so you can't even easily figure out what to install from the shell and have to use other tools/a browser to figure it out.

Yes, this is, frankly, an absurd situation for python.

And then there is the fact that I end up depending on third-party solutions to manage dependencies. Python is big-time now; stop the amateur hour crap.


Most languages have numerous third-party solutions for managing dependencies, or only recently added native support. Go only recently added modules and was an absolute mess prior to that. JavaScript has npm, yarn, and about a million others. PHP has Composer, but it doesn't cover everything. C/C++ are a mess. Java has Gradle, Maven, sbt, etc.


> As a footnote, Python does package management much better than alternatives.

If you use it as a scripting language, that might very well be the case (it's at least simpler). When you're building libraries or applications, no, definitely not. It's a huge mess, and every 3 years or so we get another new tool that promises to solve it, but just ends up creating a bigger mess.


This is an overly-dire assessment in my opinion. Setuptools + Pip has been more or less stable for years, and no changes in dev environment have been needed since stability was reached. There is a lot of new stuff coming out, for people who want new stuff, but there's nothing wrong with the old stuff if the old stuff works for you, which will remain supported for several more years if not forever.


I think poetry actually does solve it


Oh, there are a half dozen different tools that solve python package management. Unfortunately, they are mutually incompatible and none solve it for all use cases.



It’s somewhat the opposite situation, poetry is a tool that lots of people are adopting rather than one being pushed by a standards committee. I don’t think it set out to unify a dozen standards, only build a good UX and to be reproducible.


Poetry isn't an additional standard, more like an implementation. It is PEP 518 compliant.


> it has at least another 5-10 years before it can start to compete with Python for numerics or hardware

More, given that no language competes at high-level numerics with Python outside of Julia and numerics in general only adds C++.


Fortran >:D


For low-level, fair. I only know of people in astronomy academia who actually use it nowadays though.


There are good technical reasons to use Fortran even in 2022. For example, it avoids** aliasing (pointers / references / etc.). This allows for some kinds of optimizations impossible in most other languages.

It's used in a bunch of small niches, but it has users beyond just astronomy.

** A million disclaimers apply.


I am aware that hypothetically fortran code is the fastest possible. That said, I am not sure how great this aliasing difference is in practice.

There is a reason why most BLAS implementations have been rewritten into C.


Signal processing stuff is still sometimes written in it.


> R, Matlab, and Stata won't do web servers.

Not unless they're pushed to, like Python was.

>A critical thing is Python does numerics very, very well.

That's not Python doing numerical stuff. That's C code, called from Python.


It's not C code. It calls into a mixture of C, CUDA, Fortran, and a slew of other things. Someone did the work of finding the best library for me, and integrating them.

As for me, I write:

A * B

It multiplies two matrices. C can't do that. In C, I'd have some unreadable matrix64_multiply(a, b). Readability is a big deal. Math should look more-or-less like math. I can handle 2^4, or 2**4, but if you have mpow(2, 4) in the middle of a complex equation, the number of bugs goes way up.

I'd also need to allocate and free memory. Data wrangling is also a disaster in C. Format strings were a really good idea in the seventies, and were a huge step up from BASIC or Python. For 2022?

And for that A * B? If I change data types, things just work. This means I can make large algorithmic changes painlessly.
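A rough illustration of the "math should look like math" point, using only the standard library. This is a toy `Matrix` class written for this sketch, not NumPy and not how NumPy is implemented; it only shows that Python's `@` operator (`__matmul__`) lets the call site read like the equation, and that swapping the element type needs no change to the multiplication code.

```python
from fractions import Fraction


class Matrix:
    """Tiny dense matrix, only to illustrate operator overloading (not NumPy)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):
        # Row-by-column product; works for any element type with + and *.
        if len(self.rows[0]) != len(other.rows):
            raise ValueError("inner dimensions do not match")
        cols = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in cols]
             for row in self.rows]
        )

    def __eq__(self, other):
        return isinstance(other, Matrix) and self.rows == other.rows

    def __repr__(self):
        return f"Matrix({self.rows!r})"


A = Matrix([[1, 2], [3, 4]])
B = Matrix([[0, 1], [1, 0]])
C = A @ B  # reads like the math

# Same code, different element type: exact rationals instead of ints.
half = Fraction(1, 2)
D = Matrix([[half, half]]) @ Matrix([[2], [4]])
```

A real numerics library would dispatch to optimized native routines; the point here is only the readability of the call site.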

Oh, and I can develop interactively. ipython and jupyter are great calculators. Once the math is right, I can copy it into my program.

I won't even get started on things like help strings and documentation.

Or closures. Closures and modern functional programming are huge. Even in the days of C and C++, I'd rather do math in a Lisp (usually, Scheme).

I used to do numerics in C++, and in C before that. It's at least a 10x difference in programmer productivity stepping up to Python.

Your comment sounds like someone who has never done numerical stuff before, or at least not serious numerical stuff.


> It multiplies two matrices. C can't do that. In C, I'd have some unreadable matrix64_multiply(a, b).

In C you get BLAS which provides functions like ?gemm, a BLAS level-3 function which literally stands for general matrix-matrix product.

Hardly cryptic.

https://www.intel.com/content/www/us/en/develop/documentatio...

Also, anyone doing any remotely serious number crunching/linear algebra work is well aware that you need to have control over which algorithms you use to run these primitive operations, and which data type you're using.

> I used to do numerics in C++, and in C before that. It's at least a 10x difference in programmer productivity stepping up to Python.

I'm rather skeptical of your claim. Eigen is the de-facto standard C++ linear algebra toolkit and it overloads operators for basic arithmetics.

https://eigen.tuxfamily.org/dox-devel/group__TutorialMatrixA...

I'm not sure your appeal to authority is backed up with any relevant experience or authority. It's ok if you like Python and numpy, but don't try to pass off your personal taste for anything with technical merit.


> > It multiplies two matrices. C can't do that. In C, I'd have some unreadable matrix64_multiply(a, b).

> > In C you get BLAS which provides functions like ?gemm, a BLAS level-3 function which literally stands for general matrix-matrix product.

> Hardly cryptic.

Seriously? The signature for dgemm is

    void cblas_dgemm(const CBLAS_LAYOUT layout, const CBLAS_TRANSPOSE TransA,
                     const CBLAS_TRANSPOSE TransB, const CBLAS_INT M, const CBLAS_INT N,
                     const CBLAS_INT K, const double alpha, const double  *A,
                     const CBLAS_INT lda, const double  *B, const CBLAS_INT ldb,
                     const double beta, double  *C, const CBLAS_INT ldc)
Maybe you have loads of free time, but I don't want to memorize that function signature when I could just type A @ B.


> Seriously? The signature for dgemm is (...)

The signature of gemm is trivial if you're aware of the basics of handling dense row-major/column-major matrices.

https://www.intel.com/content/www/us/en/develop/documentatio...

https://www.netlib.org/lapack/explore-html/db/dc9/group__sin...

I haven't met a single person who did any number crunching work whatsoever who ever experienced any problem doing basic matrix-matrix products with BLAS. Complaining about flags to handle row-major/column-major matrices while boasting about being an authority on number crunching is something that's not credible at all.


You sound a lot like me, when I was a teenager.


I don't actually mind memorizing these things if I'm doing this 60 hours per week. I used to be quite good at some of the C++ STL numerics libraries. I took the same perverse pride some folks here apparently do in knowing these things in and out.

However, most of coding is about being able to read code and modify it, not write it. Equations are sometimes hard enough when they look like equations. If you've got a call like that in your code, that's going to be radically harder to understand than A*B.

That's not to mention all the clutter around it, of allocating and deallocating memory.

One of the things C++ programmers don't typically understand are the types of boosts one gets from closures, garbage collection, and functional programming in general, especially for numerics. I recommend this book:

https://en.wikipedia.org/wiki/Structure_and_Interpretation_o...

In it, you'll see code which:

- The user writes a Lagrangian

- The system symbolically computes the Lagrange equations (which are derivatives of the above)

- This is compiled into native code

- A numerical algorithm integrates that into a trajectory of motion

- Which is then plotted

All of this code is readable (equations are written in Lisp, but rendered in LaTeX). None of this is overly hard in a Lisp; this was 1-3 people hacking together, and not even central to what they were doing. It'd be nigh-impossible in C or C++.

Those are the sorts of code structures which C/C++ programmers won't even think of, because they're impossible to express.

(Footnote: Recent versions of C++ introduced closures; I have not used them. People who have used them say they're not "real closures.")


Interesting... I would think symbolically deriving the Euler-Lagrange equations from the Lagrangian would be quite hard in practice.


That's the shocking thing. It's totally not hard in a Lisp. I could write the code for that symbolic derivation in an evening, tops.

The hard part is the compiler. The somewhat hard part is the efficient numerical integrator (if you want good convergence and rapid integration). The symbolic manipulation is easy, once you know what you're doing. If you want to know what you're doing, it's sufficient to see how other people did it:

https://mitpress.mit.edu/sites/default/files/titles/content/...

And if you haven't seen it:

https://mitpress.mit.edu/sites/default/files/sicp/full-text/...

If you're dumb like me, and can't write a compiler or an efficient numerical integrator in your spare time, having this interpreted and using a naive integrator is still good enough most of the time. Computers are fast. The authors of the above proved the motion of the solar system is chaotic with a very, very long, very, very precise numeric integration on hardware from decades ago, so they have super-fancy code. For the types of things I do, a dumb integration is fine.

Once you've seen how it's done, and don't mind lower speed, that's trivial in any language with closures (e.g. Python or even JavaScript). If you want to swing a double-pendulum, or play around with the motion of a solar system over shorter durations, it's easy.
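For a flavor of what "trivial in any language with closures" can mean, here is a toy sketch in the spirit of the symbolic differentiation example from SICP. It is not scmutils and it does no simplification; the nested-tuple expression format and the names `deriv` and `compile_expr` are assumptions made up for this illustration.

```python
def deriv(expr, var):
    """Differentiate a nested-tuple expression with respect to var."""
    if isinstance(expr, (int, float)):
        return 0
    if isinstance(expr, str):
        return 1 if expr == var else 0
    op, a, b = expr
    if op == "+":
        return ("+", deriv(a, var), deriv(b, var))
    if op == "*":
        # Product rule: (a*b)' = a'*b + a*b'
        return ("+", ("*", deriv(a, var), b), ("*", a, deriv(b, var)))
    raise ValueError(f"unknown operator: {op!r}")


def compile_expr(expr):
    """Turn an expression into a closure mapping an environment to a number."""
    if isinstance(expr, (int, float)):
        return lambda env: expr
    if isinstance(expr, str):
        return lambda env: env[expr]
    op, a, b = expr
    fa, fb = compile_expr(a), compile_expr(b)
    if op == "+":
        return lambda env: fa(env) + fb(env)
    if op == "*":
        return lambda env: fa(env) * fb(env)
    raise ValueError(f"unknown operator: {op!r}")


# f(x) = x*x + 3*x, so f'(x) = 2x + 3
f = ("+", ("*", "x", "x"), ("*", 3, "x"))
df = compile_expr(deriv(f, "x"))
slope_at_2 = df({"x": 2})
```

Feeding a function like `df` into even a naive integrator is the "dumb integration" route described above; a serious system adds simplification, more operators, and a real compiler.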

And using the tool is even easier. Look at SICM (linked book). The things that look like code snippets are literally all the code you need.

The system, if you want it:

https://groups.csail.mit.edu/mac/users/gjs/6946/installation...


Wow, I can't believe I've never heard of this book (SICM, not the wizard book). Seems incredibly cool.

That said, I'll maintain that my initial skepticism was somewhat justified given that the derivation relies on a pre-written symbolic manipulation/solver library (scmutils) so it's not quite 'from scratch' in a Lisp ;) although I believe that you could write such a library yourself, as SICP demonstrates.

As a side note, it looks like I can't go through this book because I have an M1 Mac, which GNU Scheme doesn't support :(


> the number of bugs goes way up

In case you are forced to use the unreadable long-named unintuitively-syntaxed methods, add unit tests, and check that input-output pairs match with whatever formula you started with.
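A minimal sketch of that idea with the standard `unittest` module. The `mpow` helper here is hypothetical, a stand-in for whatever long-named call you are stuck with; the test checks it against input/output pairs produced by the readable `**` form.

```python
import math
import unittest


def mpow(base, exponent):
    """Hypothetical long-named power helper standing in for an unreadable API."""
    return math.pow(base, exponent)


class MpowMatchesFormula(unittest.TestCase):
    def test_against_operator(self):
        # Input/output pairs computed with the readable ** form.
        for base, exponent in [(2, 4), (3, 0), (1.5, 2), (10, -1)]:
            with self.subTest(base=base, exponent=exponent):
                self.assertTrue(
                    math.isclose(mpow(base, exponent), base ** exponent)
                )


suite = unittest.defaultTestLoader.loadTestsFromTestCase(MpowMatchesFormula)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```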


Yet, Python (and most of her programmers including data scientists, of which I am one) stumble with typing.

    if 0.1 + 0.2 == 0.3:
        print('Data is handled as expected.')
    else:
        print('Ruh roh.')
This prints 'Ruh roh.' on Python 3.10 because floats are binary, not decimal, even if we really want them to be decimals. So most folks ignore the complexity (due to naivety or convenience) or architect appropriately after seeing weird bugs. But the "Python is easiest and gets it right" notion that I'm often guilty of has some clear edge cases.


Why would you want decimals for numeric computations though? Rationals might be useful for algebraic computations, but that’d be pretty niche. I’d think decimals would only be useful for presentation and maybe accountancy.


Well, for starters folks tend to code expecting 0.1+0.2=0.3, rather than abs(0.3-0.2-0.1) < tolerance_value

Raw floats don't get you there unfortunately.
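For reference, the tolerance comparison does not have to be written by hand; the standard library has `math.isclose`. A small sketch (the `1e-12` absolute tolerance is an arbitrary value picked for the example, not a recommendation):

```python
import math

total = 0.1 + 0.2

exact_match = (total == 0.3)            # exact comparison of binary floats
close_match = math.isclose(total, 0.3)  # default relative tolerance is 1e-09

# An explicit absolute tolerance matters when comparing against zero,
# because a purely relative tolerance cannot be met by a nonzero residual.
residual = 0.3 - 0.2 - 0.1
near_zero_default = math.isclose(residual, 0.0)
near_zero_abs = math.isclose(residual, 0.0, abs_tol=1e-12)
```

Picking the tolerance is still a judgment call that depends on the magnitudes involved.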


If you want that you should use integers. This seems to be a misalignment of expectations rather than a fault in the language.

Other people have posted other examples but it’s not possible to represent real numbers losslessly in finite space. Mathematicians use symbolic computation but that probably is not what you would want for numerics. I could see a language interpreting decimal input as a decimal value and forcing you to convert it to floating point explicitly just to be true to the textual representation of the number, but it would just be annoying to anyone who wants to use the language for real computation and people who don’t understand floating point would probably still complain.

Edit: I’ll admit I have a pet peeve that people aren’t taught in school that decimal notation is a syntactic convenience and not an inherent property of numbers.


They also expect 1/3 + 1/3 + 1/3 == 1. Decimals won't help with that.
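A quick sketch of that point with the standard library, assuming the default `decimal` context: `Decimal` is still finite precision, so thirds get rounded, while `Fraction` keeps them exact.

```python
from decimal import Decimal
from fractions import Fraction

# Decimal is still finite precision (28 significant digits by default),
# so one third gets rounded and the three pieces no longer sum to 1.
third_dec = Decimal(1) / Decimal(3)
decimal_sum_is_one = (third_dec + third_dec + third_dec == Decimal(1))

# Fraction stores numerator and denominator exactly.
third_frac = Fraction(1, 3)
fraction_sum_is_one = (third_frac + third_frac + third_frac == 1)
```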


That's slightly different in that most programmers won't read 1/3 as "one third" but instead "one divided by three", and interpret that as three divisions added together, and the expectations are different. Seeing a constant written as a decimal invites people to think of them as decimals, rather than the actual internal representation, which is often "the float that most closely represents or approximates that decimal".
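You can see "the float that most closely approximates that decimal" directly. In this sketch, `Decimal(0.1)` converts the already-rounded binary float exactly, while `Decimal("0.1")` parses the text and really is one tenth.

```python
from decimal import Decimal

from_float = Decimal(0.1)    # exact value of the nearest binary double
from_text = Decimal("0.1")   # exactly one tenth

differs = from_float != from_text

# A float is an integer ratio whose denominator is a power of two,
# which is why one tenth cannot be represented exactly.
numerator, denominator = (0.1).as_integer_ratio()
denominator_is_power_of_two = (denominator & (denominator - 1)) == 0
```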



Correct! Many python users don't know about this and similar libraries that assist with data types. Numpy has several as well.


It is not a Python thing, it is a floating-point thing. You need it if you want hardware support (CPU/GPU) for non-integer arithmetic in any language. Otherwise, you have decimal, fractions, sympy, etc modules depending on your needs.

https://docs.python.org/3/tutorial/floatingpoint.html
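
As a rough illustration of how those modules differ, here is a sketch of the 1/3 case from upthread. It assumes the default `decimal` context (28 significant digits):

```python
from decimal import Decimal
from fractions import Fraction

# Decimal still has finite precision: 1/3 is rounded to the context's
# precision, so three of them do not add back up to exactly 1.
third_dec = Decimal(1) / Decimal(3)
dec_is_one = (third_dec + third_dec + third_dec == Decimal(1))

# Fraction stores an exact numerator/denominator pair, so thirds are exact.
third_frac = Fraction(1, 3)
frac_is_one = (third_frac + third_frac + third_frac == 1)

print(dec_is_one, frac_is_one)
```

Exact rationals cost speed and memory, which is one reason hardware floats remain the default for numerics.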


This is an issue for accountancy. Many numerical fields have data coming from noisy instruments, so being lossy doesn't matter. It's the same reason GPUs offer f16 values.


> That's not Python doing numerical stuff. That's C code, called from Python.

That's sort of a distinction without a difference, isn't it? Python can be good for numeric code in many instances because someone has gone through the effort of implementing wrappers atop C and Fortran code. But I'd rather be using the Python wrappers than C or especially Fortran directly, so it makes at least a little sense to say that Python "does numerics [...] well".

> Not unless they're pushed to, like Python was.

R and Matlab, maybe. A web server in Stata would be a horrible beast to behold. I can't imagine what that would look like. Stata is a terrible general purpose language, excelling only at canned econometrics routines and plotting. I had to write nontrivial Stata code in grad school and it was a painful experience I'd just as soon forget.


You can do web stuff in R, but it's a lot harder than it needs to be. R sucks for string interpolation, and a lot of web related stuff is string interpolation.


Yeah, I'm not surprised by that. The extent of my web experience in R is calling rcurl occasionally, so I've never tried and failed to do anything complicated.


> Not unless they're pushed to, like Python was.

Readability of code and ease of use is a big thing. It's just not about pushing hard till we make it.

edit: formatting


I wouldn't want to do a web-server in MATLAB. I like MATLAB, but no, not that.


Or in some cases, FORTRAN code called from Python iirc.


> the problem is using Python at all for a web server

I don't agree with this. Maybe for a web server where performance is really going to matter down to the microsecond, and I've got no other way to scale it. I write server code in both Javascript and Python, and despite all of my efforts I still find that I can spin up a simple site in something like django and then add features to it much more easily than I can with node. It just has less overhead, is simpler, lets me get directly to what I need without having to work too hard. It's not like express is hard per se, but python is such an easy language to work with and it stays out of my way as long as I'm not trying to do exotic things.

And then it pays dividends later, as well, because it's really easy for a python developer to pick up code and maintain it, but for JS it's more dependent on how well the original programmer designed it.


The problem with Django services is the insanely low concurrency level compared to other server frameworks (including node).

Django handles a single request at a time per worker, with no async. The standard fix is gunicorn worker processes, but then N concurrent requests cost N copies of the entire server's memory instead of N lightweight thread/request structs.

I shudder to think that whenever a Django server is doing an HTTP request to a different service or running a DB query, it's just doing nothing while other requests are waiting in the gunicorn queue.

The difference is if you have an endpoint with 2s+ queries taking 2s for one customer, with Django, it might cause the entire service to stall for everybody, whereas with a decent async server framework other fast endpoints can make progress while the 2s ones are slow.
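
A toy sketch of the async side of that comparison, using plain `asyncio` rather than any particular web framework. The handler names and sleep durations are made up, and `asyncio.sleep` stands in for a slow query:

```python
import asyncio


async def handle(name: str, io_seconds: float, finished: list[str]) -> None:
    # asyncio.sleep stands in for a DB query or outbound HTTP call;
    # awaiting it yields control so other handlers can make progress.
    await asyncio.sleep(io_seconds)
    finished.append(name)


async def serve() -> list[str]:
    finished: list[str] = []
    # The slow request is started first, yet it does not hold up the fast one.
    await asyncio.gather(
        handle("slow", 0.2, finished),
        handle("fast", 0.01, finished),
    )
    return finished


completion_order = asyncio.run(serve())
print(completion_order)
```

With a single synchronous worker, the fast request would instead wait behind the slow one.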


Django has async support for everything except the ORM. async db is possible without the ORM or by doing some thread pool/sync to async wrapping. A PR for that was under review last I checked.

Either way, high concurrency websites shouldn't have queries that take multiple seconds and it's still possible to block async processes in most languages if you mix in a blocking sync operation.


You can configure gunicorn to use multiple threads to recover quite a bit of concurrency in those scenarios and that is enough for many applications.


What threading/workers configuration do you use?

I'm looking at a page now which recommends 9 concurrent requests for a Django server running on a 4 core computer.

Meanwhile node servers can easily handle hundreds of concurrent requests.


We use the ncpu * 2 + 1 formula for the number of workers that serve API requests.

I don't think in 'handling x concurrent requests' terms because I don't even know what that means. Usually I think in terms of throughput, latency distributions, and the number of connections that can be kept open (for servers that deal with web sockets).

For example if you have the 4 core computer and you have 4 workers and your requests take around 50ms each you can get to a throughput of 80 requests per second. If the fraction of request time for IO is 50% you can bump your thread count to try to reach 160 requests per second. Note that in this case each request consumes 25ms of CPU so you would never be able to get more than 40 requests per second per CPU whether you are using node or python.
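
The arithmetic above can be sketched as follows. The function names are hypothetical, and the model is idealized: no queueing, no contention, every request identical.

```python
def worker_count(ncpu: int) -> int:
    """The ncpu * 2 + 1 rule of thumb for sizing worker processes."""
    return ncpu * 2 + 1


def throughput_rps(concurrency: int, request_ms: float) -> float:
    """Requests per second when `concurrency` requests are in flight,
    each taking `request_ms` of wall-clock time."""
    return concurrency * 1000.0 / request_ms


def cpu_ceiling_rps(ncpu: int, request_ms: float, io_fraction: float) -> float:
    """Upper bound set by CPU time alone, ignoring time spent waiting on IO."""
    cpu_ms = request_ms * (1.0 - io_fraction)
    return ncpu * 1000.0 / cpu_ms


workers = worker_count(4)               # rule-of-thumb worker count for 4 cores
base = throughput_rps(4, 50)            # 4 workers, 50 ms per request
doubled = throughput_rps(8, 50)         # twice the threads, same latency
ceiling = cpu_ceiling_rps(4, 50, 0.5)   # CPU-bound limit across 4 cores
per_cpu = cpu_ceiling_rps(1, 50, 0.5)   # CPU-bound limit for one core

print(workers, base, doubled, ceiling, per_cpu)
```

Doubling the thread count only helps up to the CPU ceiling. Past that point extra threads just queue for the cores.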


This sounds like sour grapes. Python is a general-purpose language. Languages like Awk and Perl and Bash are clearly domain-specific, but Python is a pretty normal procedural language (with OO bolted on). The fact that it is dynamic and high-level does not mean it is unsuited for applications or the back-end. People use high-level dynamic languages for servers all the time, like Groovy or Ruby or, hell, even Node.js.

What about Python makes it unsuitable for those purposes other than its performance?


Totally agree that performance is not on my top 10 wish list for Python.

But I disagree on "not the right jobs for the tool".

Python is extremely versatile and can be used as a valid tool for a lot of different jobs, as long as it fits the job requirements, performance included.

It doesn't require a CS degree to know that fitting job requirements and other factors like the team expertise, speed, budget, etc, are more important than fitting a theoretical sense of "right jobs for the tool".


> It doesn't require a CS degree to know that fitting job requirements and other factors like the team expertise, speed, budget, etc, are more important than fitting a theoretical sense of "right jobs for the tool".

It requires experience.

A lot of those lessons only come after you've seen how much more expensive it is to maintain a system than to develop one, and how much harder people issues are than technical issues.

A CS degree, or even a junior developer, won't have that.


Experience does not lead to that conclusion.

Whether Python will be easier or harder to maintain depends on numerous factors that vary so much for each job that you cannot generalize upon.

That's something experience shows.

Reaching such a conclusion that "Python is not a right tool for web backend" is just naive.

No matter how experienced a developer is, reality of the world is at least 100x more diverse than what they alone could possibly have learned and experienced.

If one believes to possess the experience of everything to generalize on complex topics like this, it just shows this person could benefit from cultivating a bit more humbleness.


Python can do just about anything... but it will take its time doing it.


And many times this is negligible, which puts this out of the equation.


The world doesn't revolve around web development. It's not the only use case. Scientific Python is huge and benefits tremendously from the language being faster. If Python can be 1% faster, that's a significant force multiplier for scientific research and engineering analysis/design (in both academia and industry).


Because most of the really huge scientific Python libraries are written as wrappers over lower-level language code, I'd be curious to what extent speeding up Python by, say, 10% would speed up "normal" scientific Python code on average. 1%? 5%?


If you are talking about large sets of numbers, then the speed up will be far below 1%.
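
One way to put rough numbers on this is Amdahl's law. The sketch below assumes "10% faster" means the interpreter-bound portion runs 1.10x as fast. The interpreter fractions are hypothetical, not measurements of any real workload:

```python
def overall_speedup(interpreter_fraction: float, interpreter_speedup: float) -> float:
    """Amdahl's law: only the fraction of runtime spent in the interpreter
    benefits; time spent inside C/Fortran libraries is unchanged."""
    native_fraction = 1.0 - interpreter_fraction
    return 1.0 / (native_fraction + interpreter_fraction / interpreter_speedup)


# Hypothetical shares of runtime spent in pure-Python code.
for fraction in (0.05, 0.10, 0.50):
    gain = overall_speedup(fraction, 1.10) - 1.0
    print(f"{fraction:.0%} of time in the interpreter -> {gain:.2%} faster overall")
```

Under these assumptions, code that spends 90% of its time inside native libraries gains a bit under 1% overall.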


I'm not sure it's very relevant, in a discussion of "how do we improve Python", to say that the answer is "don't use Python". People have all kinds of valid reasons to use Python. Let's keep this on topic please.


The folks that work on performance are not the folks working on packaging. Shall we stop their work until the packaging team gets in gear?


I agree! Here's a related point: Rust seems ideal for web servers, since it's fast, and is almost as ergonomic as Python for things you listed as cumbersome in C. So, why do I use Python for web servers instead of Rust? Because of the robust set of tools Django provides. When evaluating a language, fundamentals like syntax and performance are only one part. Given that web server bottlenecks are I/O limited (mitigating Python's slowness for many web server uses), and that I'd have to reinvent several wheels in Rust, I use Python for current and future web projects.

Another example, with a different take: MicroPython, on embedded. The only good reason I can think for this is to appeal to people who've learned Python, and don't want to learn another language.


So really, you're not so much writing Python as writing Django, which just so happens to be Python.


>Instead of performance, I’d like to see more effort in portability, package management, and stability for Python because, essentially since it is often enterprise managed, juggling fifteen versions of Python where 3.8.x supports native collection typing annotations but we use 3.7.x, etc. is my biggest complaint. Also up there is pip and just the general mess of dependencies and lack of a lock file. Performance doesn’t even make the list.

I have been leading a Python project lately and, yes, the tooling is very poor, although it is getting better. I have found poetry to be very good for venv management and lock files, plus having one file for all your config.


Pip has a decent solution for lock files:

https://pip.pypa.io/en/stable/user_guide/#constraints-files


>Python shines beautifully as a (somewhat) cross platform scripting language

Python is much more than just a scripting language. I remember attending this talk[1] a few years ago about JPMorgan's 35 million LOC Python codebase. Python is being used to build seriously large software nowadays, and I don't think performance is ever a minor issue. It should always be in the top 3 for any general purpose language because it directly translates into development speed, time and money.

[1]https://youtu.be/ZYD9yyMh9Hk


> Also up there is pip and just the general mess of dependencies and lack of a lock file.

You can use pyproject.toml or requirements.txt as lock files; Poetry can use the former as well as poetry.lock files.


> and lack of a lock file

Is it possible to solve your problem using pip freeze?


Agreed, my only use for Python since version 1.6, is portable shell scripting or when sh scripts get too complicated.

Anything beyond that, there are compiled languages with REPL available.


What compiled languages do you have in mind? I suppose technically there are repls for C or Rust or Java, but I wouldn't consider them ideal for interactive programming. Functional programming might do a bit better -- Scala and GHCi work fine interactively. Does Go have a repl?


Java, C#, F#, Lisp variants, and C++.

Eclipse has had Java scratchpads for ages, Groovy also works for trying out ideas, and nowadays we have jshell.

F# has a REPL in the ML lineage, and nowadays C# also shares a REPL with it in Visual Studio.

Lisp variants, going at it for 60 years.

C++, there are hot reload environments, scripting variants, and even C and C++ debuggers can be quite interactive.

I used GDB in 1996, alongside XEmacs, as poor man's REPL while creating a B+Tree library in C.

Yes, there are Go interpreters available,

https://github.com/traefik/yaegi


Particle physicists have been using interpreted c++ for "macros" forever. First using the terrible hack of cint, now using cling which is quite good.


Indeed, although I remember there used to be some commercial ones as well, from ads on The C/C++ Users' Journal and Dr. Dobbs.


> compiled languages

Might be tripping you up. Very few languages require that implementations be compiled or interpreted. For most languages, having a compiler or interpreter is an implementation decision.

I can implement Python as an interpreter (CPython) or as a compiler (mypyc). I can implement Scheme as an interpreter (Chicken Scheme's csi) or as a compiler (Chicken Scheme's csc). The list goes on: Standard ML's Poly/ML implementation ships a compiler and an interpreter; OCaml ships a compiler and an interpreter.

There are interpreted versions of Go like https://github.com/traefik/yaegi. And there are native-, AOT-compiled versions of Java like GraalVM's native-image.

For most languages there need be no relationship at all between compiler vs interpreter, static vs dynamic, strict or no typing.


During Perl’s hegemony as The Glue Language, I feel like the folk wisdom was:

“Performance is a virtue; if Perl ceases to be good enough, or you need to write ‘serious’ software, rewrite in C.”

And during Python’s ascension, the common narrative shifted very slightly:

“Performance is a virtue, but developer productivity is a virtue too. Plus, you can drop to C to write performance critical portions.”

Then for our brief all-consuming affair with Ruby, the wisdom shifted more radically:

“Developer productivity is paramount. Any language that delivers computational performance is suspect from a developer productivity standpoint.”

But looking at “high-level” languages (i.e. languages that provide developer productivity enhancing abstraction), we can rewind the clock to look at language families that evolved during more resource-constrained times.

Those languages, the lisps, schemes, smalltalks, etc. are now really, really fast compared to Python, and rarely require developers to shift to alternative paradigms (e.g. dropping to C) just to deliver acceptable performance.

Perl and Python exploded right at the time that Lisp/Scheme hadn’t quite shaken the myth that they were slow, with Python/Perl achieving acceptable performance by having dropped to C most of the time.

Now the adoption moat is the wealth of libraries that exist for Python—and it’s a hell of a big moat. If I were a billionaire, I’d hire a team of software developers to systematically review libraries that were exemplars in various languages, and write / improve idiomatic, performant, stylistically consistent versions in something modern like Racket. I’d like to imagine that someone would use those things :-)


Perl/Python/Ruby grew up in the 90s, the "Bubble economy" of the single core performance world, the likes of which had never and probably will never be seen again on the face of the Earth. In the post-Bubble world, throwing out 90% of your performance before you even start writing code, especially when the same dynamic features could be delivered via JIT without the cost, seems crazy.


So true, excellent point! I just do not understand startups choosing Python/Ruby in 2022 when you can get most of the features, type safety, concurrency, async and 5 times more speed in other languages.


I don't think it is such a surprise. The ecosystems around Rails (for Ruby) and numpy/pandas/etc (for python) are orders of magnitude larger than you get in the modern languages. In Rails for example, adding an entire user management system (including niceties like password reset mails and must-haves like proper security for obscure vulnerabilities most people will have never heard of) is literally a single extra line in the gemfile and two console commands. In python the ML and numerics ecosystem are completely beyond anything another language has to offer at the moment, even more so when you compare the time to get started.

In addition, "real" performance is often tricky to measure and may be irrelevant compared to other parts of the system. Yes, Ruby is 10-100x slower than C. But if a user of my web service already has a latency of (say) 200ms to the server then it barely matters if the web service returns a response in 5 ms or in 0.5 ms. Similarly for rendering an email: no user will notice their email arriving half a second earlier. Similarly for a python notebook: if it takes 1 or 2 seconds to prepare some data for a GPU processing job that will take several hours, it doesn't really matter that the data preparation could have been done in 0.1 seconds instead if it had been done in Rust.

Especially for startups where often you're not sure if you're building the right thing in the first place, a big ecosystem of prebuilt libraries is super important. If it turns out people actually want to buy what you've made in sufficient numbers that the inefficiency of Ruby/Python/JS/etc becomes a problem then you can always rewrite the most CPU intensive parts in another language. Most startup code will never have the problem of "too many users" though, so it makes no sense to optimize for that from the start.


> are orders of magnitude larger than you get in the modern languages.

This is plain wrong: Kotlin has access to the whole JVM/Java ecosystem. In fact, in theory you can even access Python libraries via graalPython. Machine learning is one of the only niches that is language specific: while it's true that there are major Java deep learning libraries (such as the Stanford parser), they are increasingly obsolete compared to Python's.


Well, if you choose Python/Ruby you only need 1/3 of the developers you would need with another language.

The productivity gain is so great it outweighs everything else. It's as simple as that.


For startups it's a matter of rapidly developing an MVP to validate your business and iterating quickly.

Developer salaries are way higher than CPU and server costs. Productivity wins out here vs. performance.


> Those languages, the lisps, schemes, smalltalks, etc.

The main reason those languages got fast despite being highly dynamic is because of very complex JIT VM implementations. (See also: JavaScript.)

The cost of that is that a complex VM is much less hackable and makes it harder to evolve the language. (See also: JavaScript.)

Python and Ruby have, I think, reasonably chosen to have slower simpler implementations so that they are able to nimbly respond to user needs and evolve the language without needing massive funding from giant corporations in order to support an implementation. (See also: JavaScript.)

There are other effects at play, too, of course.

Once your implementation's strategy for speed is "drop to C and use FFI", then it gets much harder to optimize the core language with stuff like a JIT and inlining because the FFI system itself gets in the way. Not having an FFI for JS on the web essentially forced JavaScript users to push to make the core language itself faster.


Spending a weekend or two writing a Scheme that beats Python in performance has been a pastime for computer science students for at least a couple decades now. I'm not sure that I believe that a performant Scheme implementation has more complexity than e.g. PyPy. In fact, I'd wager the converse.


Sure, but that's because Python has objects.

If you write an object system on top of your performant hobby Scheme implementation, you'll likely find that the performance of its method dispatch is about as slow as it is in Python. Probably even slower.

Purely procedural Python code isn't as slow as object-oriented Python code.


That's fair, but the fact that we're comparing hobby Scheme implementations to two mainstream, extremely popular implementations of Python, and setting up conditions that force (hobby) Scheme to play to Python's relative strengths, is telling. :-)

The Python ecosystem has certainly received a lot of developer resources and attention the past couple of decades. Shall we compare the performance of CLOS on SBCL, which again has seen comparatively little developer resources, to Python's performance in dealing with objects? I'd take that performance wager.


This isn’t as much of a gotcha as you think. Python is slow because the language is so dynamic and simply has to do more behind the scenes work on each line. It’s not impressive that a language that does less is faster. What’s impressive is that a language that does more, like JS on V8, is faster.


Is CLOS doing less than Python?

I'm thinking CLOS has more dynamism than Python. They're both dynamically typed and they're both doing a lookup then dispatch, but CLOS adds dynamism on top of that: it's also looking in the metadata thingy (I'm not a Lisp developer, do they call it the hash? I mean the key-value store on every "atom". I'm so out of my depth here, is atom the right word?). Plus, if I remember right, CLOS uses multiple dispatch, not just single dispatch like Python.


The CLOS is more dynamic than python. You can do things like specialize a method on multiple types based on the runtime types (that is to say, the method conceptually belongs to the intersection of the classes, not any single class).
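
For readers who haven't seen multiple dispatch, here is a toy Python sketch of the idea. It is not how CLOS is implemented: `defmethod`, `collide`, and the classes are hypothetical names, and the lookup is far simpler than CLOS's method ordering.

```python
from typing import Callable

# Implementations keyed by a PAIR of classes, not owned by either class.
_registry: dict[tuple[type, type], Callable] = {}


def defmethod(type_a: type, type_b: type):
    """Register an implementation specialized on a pair of classes."""
    def register(func: Callable) -> Callable:
        _registry[(type_a, type_b)] = func
        return func
    return register


def collide(a, b):
    """Generic function: choose an implementation from the runtime
    types of BOTH arguments."""
    # Walk both MROs so subclasses fall back to implementations for bases.
    for cls_a in type(a).__mro__:
        for cls_b in type(b).__mro__:
            impl = _registry.get((cls_a, cls_b))
            if impl is not None:
                return impl(a, b)
    raise TypeError(f"no method for ({type(a).__name__}, {type(b).__name__})")


class Ship: ...
class Asteroid: ...


@defmethod(Ship, Asteroid)
def _(a, b):
    return "ship hits asteroid"


@defmethod(Asteroid, Asteroid)
def _(a, b):
    return "asteroids bounce"


@defmethod(object, object)
def _(a, b):
    return "generic collision"


print(collide(Ship(), Asteroid()))
print(collide(Asteroid(), Ship()))
```

Python's built-in `x.f()` call, by contrast, selects the method from the type of `x` alone.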

I like python (especially things like comprehensions), but to say it's more dynamic than common lisp is a little insane.


CL has many constraints that make it less dynamic to "minimize the observable differences between compiled and interpreted programs" (CLHS, 3.2.2.3 Semantic Constraints). For example, if f is defuned in a file, throughout that file you can assume (f x) refers to that f and does not have to be looked up dynamically at runtime.

If I recall correctly, method calls in CLOS are also not syntactically distinguished from regular calls either (like they are in x.f() languages), so there is no motivation to write a method unless you actually want the dynamic dispatch methods provide. Methods are fairly rare compared to regular functions.


> For example, if f is defuned in a file, throughout that file you can assume (f x) refers to that f and does not have to be looked up dynamically at runtime.

Unless you declare them "notinline".

http://www.lispworks.com/documentation/HyperSpec/Body/d_inli...

> so there is no motivation to write a method unless you actually want the dynamic dispatch methods provide.

Methods have some different functionality in CL than in most languages. For example, you can have the methods from the entire inheritance hierarchy (well, the classes that have had that method specialized for them) all be called as, essentially a pipeline of methods.

https://lispcookbook.github.io/cl-cookbook/clos.html#method-...

But this is a discussion about dynamism of the object system, and most of that is defined before runtime. How about changing an object's class at runtime? https://www.cliki.net/site/HyperSpec/Body/stagenfun_change-c...


Smalltalk,

    a become: b
Now all references of a and b across the complete image are changed, invalidating all assumptions done at every call site sending messages to each of them.


You're either exaggerating or the computer science students you're familiar with are wizards. I've never known the student who could write a Scheme implementation, from scratch, in one weekend that is both complete and which beats Python from a performance perspective.


If it's an exaggeration, it's not much of one.

Two parts to your argument:

- Writing a Scheme implementation quickly: Google "Write a Scheme in 48 hours" and "Scheme from scratch." 48 hours to a functioning Scheme implementation seems to be a feat replicated in multiple programming languages.

- Performance: I haven't benchmarked every hobby Scheme, but given the proliferation of Scheme implementations that, despite limited developer resources, beat (pure) Python with its massive pool of developers (CPython, PyPy), I still don't buy the idea that optimizing Scheme is a harder task than optimizing Python. Again, I'd strongly suggest that optimizing Scheme is a much easier task than optimizing Python simply by virtue of how often the feat has been accomplished.


If you can give me an implementation that implements almost all of R5RS, in 48 hours, beating Python in performance, and all by a single developer, I’ll tip my hat to that guy or gal. But I can’t imagine it’s too commonly done.


Nobody said you can implement a full Scheme implementation in 48 hours or two weeks. That's very much beside the point about how poor CPython performance is.


> Nobody said you can implement a full Scheme implementation in 48 hours or two weeks.

Fair enough, you're right. But if we're only talking about incomplete Scheme implementations it's not a very interesting claim. As I pointed out in another comment, even I could write a fast Scheme implementation in 48 hours if I kept my scope very limited. That doesn't say much about Scheme performance overall or how it relates to Python.


Well let's flip this around: do you think you could write a performant minimal Python in a weekend? Scheme is a very simple and elegant idea. Its power derives from the fact that smart people went to considerable pains to distill computation to a limited set of things. "Complete" (i.e. rXrs) schemes build quite a lot of themselves... in scheme, from a pretty tiny core. I suspect Jeff Bezanson spent more than a weekend writing femtolisp, but that isn't really important. He's one guy who wrote a pretty darned performant lisp that does useful computation as a passion project. Check out his readme; it's fascinating: https://github.com/JeffBezanson/femtolisp

You simply can't say these things about Python (and I generally like Python!). It's truer for PyPy, but PyPy is pretty big and complex itself. Take a look at the source for the scheme or scheme-derived language of your choice sometime. I can't claim to be an expert in any of what's going on in there, but I think you'll be surprised how far down those parens go.

The claim I was responding to asserted that lisps and smalltalks can only be fast because of complex JIT compiling. That is trueish in practice for Smalltalk and certainly modern Javascript... but it simply isn't true for every lisp. Certainly JIT-ed lisps can be extremely fast, but it's not the only path to a performant lisp. In these benchmarks you'll see a diversity of approaches even among the top performers: https://ecraven.github.io/r7rs-benchmarks/

Given how many performant implementations of Scheme there are, I just don't think you can claim it's because of complex implementations by well-resourced groups. To me, I think the logical conclusion is that Scheme (and other lisps for the most part) are intrinsically pretty optimizable compared to Python. If we look at Common Lisp, there are also multiple performant implementations, some approximately competitive with Java which has had enormous resources poured into making it performant.


What is it that you think makes a full Scheme implementation as slow as CPython?


I don’t think a full Scheme implementation is as slow as Python in general. What I’m hung up on is the claim that it’s so absolutely trivial to write a language implementation faster than Python that basically anybody at any skill level could do it in a weekend, and still have time for Sunday afternoon bocce.


Start with,

"An Incremental Approach to Compiler Construction"

http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf


I would not include PyPy in a list of easy to beat implementations.


Nor ones with massive pools of developers.


Compared to most Scheme implementations?


Substitute computer science student with "developer" and it holds for me. Definitely some CS students can do it too. Actually at my school we did have to implement a Scheme compiler. So yeah it's not too big of a stretch to say.

I think people who haven't implemented a language underestimate how slow CPython is. And overestimate how hard it is to build a compiler for a dynamic language.

I think every professional developer or CS student can and should build a compiler for a dynamic language!


But the claim was that a student could write a conformant Scheme implementation in 48 hours that beats Python. Clearly it’s possible for a student to write a Scheme that’s faster than Python, but is it a reasonably complete Scheme done in a single weekend?

Even I, very much a non-computer scientist, could write a fast Scheme quickly if I could keep myself to a very small subset, so that’s not interesting to me.


Conformant is a word you introduced, they didn't say that.


Only if they aren't attending degrees where compiler design is part of the curriculum, which if that is the case kind of speaks for the quality of the institution.


A reasonably complete, fast language implementation, from scratch, in one weekend, though? By someone who was only introduced to compilers within the last few months? I don’t believe most students would be capable of that, but I’m not a computer scientist so what do I know.


Yes, Scheme is quite simple, no one is speaking about doing R7RS.

Compared with the current state of affairs in CPython, even with basic code generation algorithms, the machine code would be fast enough.

An introduction to compilers also means writing one in the process, unless it is a lousy degree.


> Python and Ruby have, I think, reasonably chosen to have slower simpler implementations so that they are able to nimbly respond to user needs and evolve the language without needing massive funding from giant corporations in order to support an implementation. (See also: JavaScript)

I'm not really sure I buy that argument for why JavaScript evolved slowly for a while. It seems more likely to me that it was due to the combination of basically being required to be 100% backwards compatible (no one making a browser wants to be the only one who doesn't work for a certain website), having no single canonical implementation (which means that all the common browsers would have to agree on a spec for something to be added or changed), and not really being used in the server context widely before Node. Python and Ruby both sacrificed backwards compatibility at times in their history, and both had one central implementation which others tended to draw from, so it was easier to get changes made.


Agreed, that isn't the reason JavaScript in particular was essentially motionless for about a decade. As you note, that was largely due to political factors and disagreement about language direction among the various competing implementers.

But compared to say Smalltalk, Scheme, and Lisp, I think Ruby and Python have been able to evolve incrementally in part because of their simpler more hackable implementations.


Similarly why shouldn't we think that business reasons left Smalltalk motionless after release from ParcPlace?

So traits for Squeak but not the commercial Smalltalks.


> Python and Ruby have, I think, reasonably chosen to have slower simpler implementations…

?

https://shopify.engineering/yjit-just-in-time-compiler-cruby


Yes, CRuby is slowly moving towards a JIT now because performance is a major blocker for user adoption.

The larger Python ecosystem has tried that a number of times too (Unladen Swallow, PyPy, etc.)

It's quite difficult since both of those languages already lean heavily on C FFI and having frequent hops in and out of FFI code tends to make it harder to get the JIT fast. JITs work best when all of the code is in the host language and can be optimized and inlined together.


Javascript the language seems to have evolved much more than Python despite CPython's very simple implementation.


Hence my point about "massive funding from giant corporations in order to support an implementation". :)


Well almost all the JavaScript language innovation was syntax sugar and was implemented as transforms before the browsers implemented it. I think JavaScript devs mostly were fine to keep using transforms indefinitely and it's just been more convenient that the browsers have moved to implement it.

Python could have done this easily too but evolving as a language just isn't as big a priority (not that I'm saying it should be) and that's completely (or mostly) disconnected from their backend implementation decisions.


This sounds a lot like what some Python package developers are trying with Rust (example being the cryptography package), which also has the unfortunate side effect of limiting support for some less popular platforms.


My remembering of Python is, “developer experience is paramount; if you need more performance use PyPy.”

To increase Python performance, you only need more Python!


Is it gauche to offer my own counterpoint?

Another possibility is that the requirement to "drop to C" is a virtue by de-democratizing access to serious performance. In other words, let the commoners eat Python, while the anointed manage their own memory.

I personally find this argument a bit distasteful / disagree with it, but there was a thread the other day that talked about the, uh, variable quality of code in the Julia ecosystem (Julia being another language where dropping to C isn't important for performance). In Julia, the academics can just write their code and get on with their work—the horror!


This is a great read, and it's fantastic to see all the work being done to evaluate and improve the language!

The dynamic-nature of the language is actually something that I had studied a few years back [1]. Particularly the variable and object attribute look ups! My work was just a master's thesis, so we didn't go too deep into more tricky dynamic aspects of the language (e.g. eval, which we restricted entirely). But we did see performance improvements by restricting the language in certain ways that aid in static analysis, which allowed for more performant runtime code. But for those interested, the abstract of my thesis [2] gives more insight into what we were evaluating.

Our results showed that restricting dynamic code (code that is constructed at run time from other source code) and dynamic objects (mutation of the structure of classes and objects at run time) significantly improved the performance of our benchmarks.

There was also some great discussion on HN when I had posted our findings as well [3].

[1]: https://github.com/joncatanio/cannoli

[2]: https://digitalcommons.calpoly.edu/theses/1886/

[3]: https://news.ycombinator.com/item?id=17093051


But we did see performance improvements by restricting the language in certain ways that aid in static analysis, which allowed for more performant runtime code.

Well, yes. In Python, one thread can monkey-patch the code in another thread while running. That feature is seldom used. In CPython, the data structures are optimized for that. Underneath, everything is a dict. This kills most potential optimizations, or even hard-code generation.
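To make the dynamism concrete, here is a minimal sketch in plain CPython. The class and attribute names are made up for illustration. Instance attributes are exposed as an ordinary dict, and a method can be swapped on the class while instances already exist:

```python
class Greeter:
    def greet(self):
        return "hello"


g = Greeter()
g.nickname = "gg"                    # attributes can be added at any time
instance_attrs = dict(g.__dict__)    # instance state is exposed as a plain dict

before = g.greet()
Greeter.greet = lambda self: "patched"   # monkey-patch the class at run time
after = g.greet()                        # the existing instance sees the new method
```

Any optimizer that wants to cache what `g.greet` resolves to has to cope with that last assignment happening at an arbitrary point.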

It's possible to deal with that efficiently. PyPy has a compiler, an interpreter, and something called the "backup interpreter", which apparently kicks in when the program being run starts doing weird stuff that requires doing everything in dynamic mode.

I proposed adding "freezing", immutable creation, to Python in 2010, as a way to make threads work without a global lock.[1] Guido didn't like it. Threads in Python still don't do much for performance.

[1] http://www.animats.com/papers/languages/pythonconcurrency.ht...


> This kills most potential optimizations, or even hard-code generation.

It doesn’t - this has been a basically solved problem since Self and deoptimisation were invented.


In theory, yes. In CPython, apparently not. In PyPy, yes.[1] PyPy has to do a lot of extra work to permit some unlikely events.

[1] https://carolchen.me/blog/jits-impls/


You’re trying to correct me by posting my own mentee’s blog post at me.


As a tangential, almost unrelated question: is there any Python module that you use day-to-day that you'd like to see significantly sped up?

I'm thinking of reimplementing some Python modules in Rust, as that seems like the kind of weird thing I'm into. I've done it with some success (using the excellent work of the pyo3 project) professionally, but I'd be interested in doing more.


Definitely matplotlib. Navigating image plots in interactive mode with even just 10000x10000 pixels is painfully slow. While I've picked up some alternatives, they don't feel as clean as matplotlib.


10000% -- matplotlib for visualization of a lot of different data I've looked at, but esp things like high res images in machine learning contexts is incredibly slow, even on good computers. It does fine for small vector stuff and render once and save graphs, but it's bad for what a lot of people use it for.


Pydantic is quite a popular library. Its author is doing exactly this - rewriting its core [0] in Rust. It's still WIP, but the readme mentions that "Pydantic-core is currently around 17x faster than Pydantic Standard."

[0] https://github.com/samuelcolvin/pydantic-core


Pydantic is not just popular (and awesome) on its own but serves as the underpinnings of a lot of the FastAPI functionality - faster Pydantic would make a LOT of apps faster


You’d be awesome if you wrote a library for large image processing.

You can make large Numpy arrays fine — eg, 20k x 20k or 500k x 500k, but trying to render that to anything but SVG or manual tilings pukes badly.

That’s my main blocker on rendering high dimensional shapes: you can do the math, but visualizations immediately fall over (unless you do tiling yourself).

There’s probably someone with a more useful idea than “gigapixel rendering” though.


Not working in Python right now, but I have 15 years of Python + Django on the web and while there are any number of attempts at this (I keep a list at https://pinboard.in/u:tclancy/t:json/t:python/), any improvement in JSON serialization and unserialization speeds is a huge boon to projects. I am trying to think of similar bottlenecks where a drop-in replacement can be a huge performance improvement.


The missing thing last time I looked was a fast python json library that's byte-compatible with stdlib -- same inputs, same outputs. There are good fast options but they tend to add some (perfectly reasonable) limitation like fixed indentation size, for the sake of speed, that blocks them from being dropped into an existing public API.


pandas


It's been done: https://github.com/pola-rs/polars

But I'm sure there's always room for improvement


It’s not like polars is a drop-in replacement, it has a totally different API.


You wrote "it has a totally different API", did you mean "it has an actually sane API?" Because that's what I think of when I compare pandas to polars.


Not if you've legacy apps using pandas everywhere.

Same API means I can import polars as pd and be done with it.


This is a curious reply for me. I would think that there are very few parts in pandas that could be sped-up by reimplementing them with a compiled language. Pandas is plenty fast for the built-in methods, it only gets slow when you start interfacing with Python, e.g. by doing an `.apply` with your custom Python method. Obviously this interfacing part is impossible to speed up by reimplementing parts of pandas (you'd need a different API instead).


I remember, when trying to squeeze some performance out of it, that a lot of the overhead came from it trying to infer types.



The answer would then be to have a look at polars.


If you are building server-side applications using Python 3 and async API and if you didn't use https://github.com/MagicStack/uvloop, you are missing out on performance big time.

Also, if you happen to build microservices, don't forget to try PyPy, that's another easy performance booster (if it's compatible to your app).


> if it's compatible to your app

Every time I experiment with PyPy (on a set of non-trivial web services) I encounter at least one incompatibility with PyPy in the dependency tree and leave disappointed.


As I see it, Python is good for glue code and small scripts where performance usually doesn't matter. Even if it would be more performant, it would be a nightmare for large code bases since it's dynamically typed.

I really enjoy Nim which is "slick as Python, fast as C".


You wouldn’t believe how many near-FAANGS have hundreds of large backend services on Python without any issues and from times where typing was in docstrings.


Because they have insane amounts of money that they can throw at the machines.


I once had a database-backed website serving 50k unique visitors/day written in Django and hosted on a low-budget VPS. Worked like a charm with very few hiccups.


Right? I do not understand why the comparison of Python is always, "But when you hit 10 million daily users, you are really going to be feeling the scaling pain." You can hit a very serious audience size before ever having to worry about performance characteristics at all. Computers are fast.


I was curious so i had a bash at comparing the cost of just buying another server to throw at the problem vs telling a FAANG dev to optimise the code.

A dedicated 40core / 6Tb server is around $2k but will be amortized over the years of its life. It needs power, cooling, someone to install it in a rack, someone to recycle it afterwards, ..., around $175/yr

A FAANG dev varies wildly but $400k seems fair-ish (given how many have TC > 750k).

So that's about 12 hours of time optimising the code vs throwing another 40c / 6Tb machine at the problem for 365 days.
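Spelling the arithmetic out, as a rough sketch. It reads the figures above as purchase price plus one year of running costs, and the 2,080 working hours per year (40 h x 52 weeks) is my own assumption, not something stated above:

```python
HOURS_PER_YEAR = 2080             # assumption: 40 h/week * 52 weeks

dev_cost_per_year = 400_000       # rough FAANG dev cost from the comment
server_purchase = 2_000           # 40-core server, one-off
server_running_per_year = 175     # power, cooling, racking, recycling

dev_hourly = dev_cost_per_year / HOURS_PER_YEAR
first_year_server_cost = server_purchase + server_running_per_year

# Developer hours that cost the same as one server for its first year
break_even_hours = first_year_server_cost / dev_hourly
print(f"dev: ${dev_hourly:.0f}/h, break-even: {break_even_hours:.1f} h")
```

Under those assumptions the break-even lands between 11 and 12 hours, consistent with the "about 12 hours" estimate.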

The big cost i'm missing out of both the server and the developer is the building they work in. What's the recharge for a desk at a FAANG, $150k/yr ? I have no idea how much a rack slot works out at.

Unless i've screwed up the figures anywhere, we should probably all be looking at replacing Python with Ruby if we can squeeze more developer productivity!


Adding hardware doesn't improve single-request performance, so slow stacks can require a bunch of optimizing or caching work that wouldn't be needed on a faster one. At some point it also impacts productivity when the test suite is slow, the app takes a long time to restart, etc.


Sure it does, my home lab servers have a single thread performance approximately half that of a server today.

What’s the ping latency from US East to Europe? 80ms-ish? What’s a roundtrip to postgres with a regular business app type query, 20ms-ish? What’s the latency on a beefy rails app’s request handling, 40ms?

We’re talking 140ms best case for a slow stack. What can you get that down to with tuning work?

When your user comes along on their 4g connection with 800ms latency, will they be able to tell the difference?

Don’t get me wrong, I’d far rather invest time in making the stack efficient but from a business point of view, it might not make sense vs. just throwing hardware at the problem and spending your expensive engineering resources on making it possible to deliver more utility to customers.


> Python can quickly check to see if they are using the dynamic features

I don't understand how this is supposed to be "quickly" verifiable?

Nothing prevents you from doing eval('gl' + 'obals')()['len'] = ...; how is the interpreter supposed to quickly check that this isn't the case when you're calling a function that might not even be in the current module?

Doing this correctly would seem to require a ton of static analysis on the source or bytecode that I imagine will at best be slow, and at worst impossible due to the halting problem.
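For anyone who wants to see it happen, a small runnable sketch of that rebinding. The snippet runs inside its own namespace via exec so it doesn't clobber the real `len`; the function name `count` is made up:

```python
source = """
def count(items):
    return len(items)            # 'len' is looked up when the call runs

before = count([1, 2, 3])
# The name 'globals' never appears literally, so a textual scan would miss it.
eval('gl' + 'obals')()['len'] = lambda items: 42
after = count([1, 2, 3])
"""

namespace = {}
exec(source, namespace)          # isolated module-like namespace
before, after = namespace["before"], namespace["after"]
print(before, after)
```

The same function returns different results before and after the eval line, without any assignment to `len` being visible in the source.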


Hum... You are getting lost on theoretical undecidability.

In the real world, when faced with a generally undecidable problem, we don't run away and lose all hope. We decide the cases that can be decided, and do something safe when they can't be decided.

In your example, Python can just re-optimize everything after an eval. That doesn't stop it from running optimized code if the eval does not happen. It can do even more and only re-optimize things that the eval touched, which has some extra benefits and costs, so may or may not be better.

Besides, when there isn't an eval on the code, the interpreter can just ignore anything about it.


> You are getting lost on theoretical undecidability. [...] We decide the cases that can be decided, and do something safe when they can't be decided.

I'm not lost on that at all; I'm well aware of that. That's precisely why I wrote

>> [...] require a ton of static analysis on the source or bytecode that I imagine will at best be slow, and at worst impossible due to the halting problem

and not

>> static analysis is impossible in the general case so we run away and lose all hope.

I'm not sure how you read that sentiment from my comment.


Hum... Ok. Then the answer is that most cases do not demand as much analysis time as you expect, and the ones that demand more still can gain something from dynamic behavior analysis in a JIT.

Also, you can combine the two to get something better than any single analysis alone.


I don't think the world is quite so bad.

x86 processors solve this by speculating about what's going on. If you suddenly run into a 1976-era operation, everything slows down dramatically for a bit (but still goes faster than an 8086). If you have a branch or cache miss, things slow down a little bit.

One has a few possibilities:

- A static analysis /proves/ something. print is print. You optimize a lot.

- A static analysis /suggests/ something. print is print, unless redefined in an eval. You just need to go into a slow path in operations like `eval`, so if print is modified, you invalidate the static analysis.

- A static or dynamic analysis suggests something probabilistically. You can make the fast path fast, and the slow path eventually work. If print isn't print, you raise an internal exception, do some recovery, and get back to it.

I'm also okay with this analysis being run in prod and not in dev.

As a footnote, JITs, especially in Java, show that this kind of analysis can be pretty fast. You don't need it to work 100% of the time. The case of a variable being redefined in a dozen places, you just ignore. The case where I call a function from three places which increments an integer each time, I can find with hardly any overhead at all. The latter tends to be where most of the bottlenecks are.


I was reading this as an undetailed description of state available WITHIN the interpreter. Probably there is a table of globals that you can simply check last modification on or something like this. Whether you hit it with eval or some other tricky code, you can't modify a global without the interpreter knowing about it.


If that's what they mean, how would that be any faster than what's going on right now? I thought normally when you hit a callable, the interpreter would just look up its name, check to see if it's a built-in, and then call the built-in if so... whereas in this case you'd still have to look up the name of the callable (is the idea to bypass this somehow? what do they do currently?), check to see if it's different than the built-in you'd expect from the name (i.e. if it's ever been reassigned to), then call that expected built-in if it's not... which seems like the same thing? At best it would seem to convert 1 indirect call to a direct call, which would be negligible for something like Python. Is the current implementation somehow much slower than I'm imagining? What am I missing?


You could do something like a primitive inline cache. Store a "version" of the globals in another variable. Each time globals are modified, bump the version. For each call site, keep what the global name resolved to, plus the version of the "globals object" at that time, in a static variable. Now you can avoid name resolution if the version hasn't changed between two executions of the line. In the fast path you just pay the price of a single compare and jump (easily predicted, because globals almost never change) vs a full hash-table lookup.
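A toy model of that scheme in Python, purely to show the control flow. `VersionedGlobals` and `CallSite` are hypothetical names, not CPython internals, and a real interpreter would do this in C at the bytecode level:

```python
class VersionedGlobals:
    """Hypothetical globals table that bumps a counter on every write."""

    def __init__(self):
        self._values = {}
        self.version = 0

    def set(self, name, value):
        self._values[name] = value
        self.version += 1

    def lookup(self, name):
        return self._values[name]    # stands in for the full hash-table lookup


class CallSite:
    """Caches what one global name resolved to, plus the version seen."""

    def __init__(self, name):
        self.name = name
        self.cached_value = None
        self.cached_version = -1
        self.slow_lookups = 0

    def resolve(self, table):
        if self.cached_version == table.version:   # fast path: one compare
            return self.cached_value
        self.slow_lookups += 1                      # slow path: real lookup
        self.cached_value = table.lookup(self.name)
        self.cached_version = table.version
        return self.cached_value


table = VersionedGlobals()
table.set("len", len)
site = CallSite("len")

for _ in range(1000):
    site.resolve(table)
lookups_before_rebind = site.slow_lookups

table.set("len", lambda items: 42)          # rebinding bumps the version
rebound_result = site.resolve(table)([1, 2, 3])
```

A thousand resolutions cost one real lookup, and the rebinding is still picked up on the next call because the version no longer matches.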


I think the core of the optimization you're mentioning hinges on a normal lookup being a slow hashtable lookup (of a string?)... whereas I imagined the first thing the interpreter would do would be to intern each name and assign it a unique ID (as soon as during parsing, say) and use that thereafter whenever they're not forced to use a string (like with globals()). That integer could literally be a global integer index into a table of interned strings, so you could either avoid hashing entirely (if the table isn't too big) or reduce it to hashing an int, both of which are much faster than hashing a string. Do they not do that already? Any idea why? I feel like that's the real optimization you'd need if checking a key in a hashtable is the slow part (and it's independent of whether the value is being modified).


> I don't understand how this is supposed to be "quickly" verifiable?

You don’t verify, and instead you run assuming no verification is needed. Then if someone wants to violate that assumption, it’s their problem to stop everyone who may have made that assumption, and to ask them to not make it going forward.

You shift the cost to the person who’s doing the metaprogramming and keep it free for everyone who isn’t.

https://chrisseaton.com/truffleruby/deoptimizing/


Python dictionaries now have version counters that track how many times they were modified, so the quick check is to ask "was len not overridden last time and is the number of modifications to the globals the same as it was last time".


One possibility is to move the cost to the assignment, so the code that assigns a new value to the global 'len' function is going to track and invalidate all cached lookups. Hopefully you are changing the binding of 'len' less often than you are calling it :)
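A rough sketch of that write-side invalidation, again as a toy model with hypothetical class names. A real runtime would discard compiled code rather than flip a flag, so the `valid` check here only stands in for that:

```python
class InvalidatingGlobals:
    """Hypothetical globals table that notifies caches when a name is rebound."""

    def __init__(self):
        self._values = {}
        self._watchers = {}          # name -> caches depending on that name

    def watch(self, name, cache):
        self._watchers.setdefault(name, []).append(cache)

    def get(self, name):
        return self._values[name]

    def set(self, name, value):
        self._values[name] = value
        for cache in self._watchers.get(name, []):
            cache.valid = False      # the writer pays for the invalidation


class CachedLookup:
    """Remembers a resolved global until the table invalidates it."""

    def __init__(self, table, name):
        self.table = table
        self.name = name
        self.valid = False
        self.value = None
        table.watch(name, self)

    def get(self):
        if not self.valid:
            self.value = self.table.get(self.name)
            self.valid = True
        return self.value


table = InvalidatingGlobals()
table.set("len", len)
cached_len = CachedLookup(table, "len")

first = cached_len.get()([1, 2, 3])
table.set("len", lambda items: 42)   # rebinding invalidates the cache
second = cached_len.get()([1, 2, 3])
```

The cost of the bookkeeping scales with how often `len` is rebound, not with how often it is called.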


Cinder does this (invalidation), and both Faster CPython and Pyston use guarding.


Right, of course, guarded devirtualization is a common technique.


Why not switch to making __slots__ in classes the default and then making attribute changes to an object during runtime an opt-in? It will require a long grace period but wouldn't it help optimisation efforts immensely?


That's going to require quite a lot of changes; it's a giant breaking change. Someone would need to go through every class, find all the attributes that are created, and add a __slots__ declaration, to avoid regular attribute initialization in __init__ failing. It's a massive task, and it would completely break backwards compatibility for performance gains that not everybody will need.
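A small illustration of what would break, using a made-up `Point` class. With __slots__ declared, only the listed attributes can exist and the instance has no __dict__:

```python
class Point:
    __slots__ = ("x", "y")       # only these attributes may exist

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

try:
    p.z = 3                      # fine on a normal class, an error here
    assigned = True
except AttributeError:
    assigned = False

has_dict = hasattr(p, "__dict__")
print(assigned, has_dict)
```

Any existing code that sets an undeclared attribute, in __init__ or from outside, would start raising AttributeError if this became the default.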


That would mean all installed dependencies need to comply with this change as well, which is unlikely to happen in any realistic timeframe.


default __slots__ breaks a lot of monkey patching.

An "easier" change would be to add a class attribute "no__dict__", which says that the __dict__ attribute can't be used, which lets the implementation do whatever it wants. That can be incrementally added to classes.

Another option is a "no__getattr__" attribute, which disables __getattr__ and friends.


Where can I read about what kind of performance improvements `__slots__` brings?


The Python docs themselves are a good place to start: https://docs.python.org/3/reference/datamodel.html#slots

The Python wiki also has some good info about it: https://wiki.python.org/moin/UsingSlots


Apart from the official docs, this video also explains the low level data layout (in CPython) that slots introduce: https://www.youtube.com/watch?v=Iwf17zsDAnY


"Those techniques are based on the idea that most code "does not use the full dynamic power that it could at any given time" and that Python can quickly check to see if they are using the dynamic features."

If anyone has a burning desire to try to write the next big dynamically-typed scripting language, I've often noodled in my head with the idea of a language that has a dynamically-typed startup phase, but at some point you call "DoneBeingDynamic()" on something (program, module, whatever, some playing would have to be done here) and the dynamic system basically freezes everything into place and becomes a static system. (Or you have an explicit startup phase to your module, or something like that.)

The core observation I'm driving this on is much the same as the quote I give from the article. You generally set up the vast majority of your "dynamicness" once at runtime, e.g., you set up your monkeypatches, you read the tables out of the DB to set up your active classes, you read the config files and munge together the configurations, etc. But then forever after, your dynamic language is constantly reading this stuff, over and over and over and over again, millions, billions, trillions of times, with it never changing. But it has to be read for the language to work.

Combine that with perhaps some work on a system that backs to a struct-like representation of things rather than a hash-like representation, and you might be able to build something that gets, say, 80% of the dynamicness of a 1990s-era dynamic scripting language, while performing at something more like compiled language speeds, albeit with a startup cost. If you could skip over the dozens of operations resolving

     x.y.z.q = 25
a dynamically-typed language like Python needs to properly implement that and get down to a runtime that can do the same thing compiled languages do by pre-computing the offset into a struct and just setting the value, you might get near static-language performance with dynamic typing affordances.

You can also view this as a Lisp-like thing that has an integrated phase where it has macros, but then at some point puts this capability down.

I tend to think it's just fundamentally flawed to take a language that is intrinsically defined as "x.y.z.q" requiring dozens of runtime operations versus trying to define a new one where it is a first-class priority from day one that the system be able to resolve that down to some static understanding of what "x.y.z.q" is. e.g., it's OK if y is a property and z is some fancy override if the runtime can simply hardcode the relevant details instead of having to resolve them every time. You can outrun even JIT-like optimizations if you can get this down to the point where you don't even have to check incoming types, you just know.
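To show the semantics I have in mind (not the performance, which would need the runtime's cooperation), here is a hedged sketch in today's Python. `Freezable` and `freeze()` are hypothetical names standing in for "DoneBeingDynamic()":

```python
class Freezable:
    """Object that accepts new attributes until freeze() is called."""

    def __init__(self):
        # Bypass our own __setattr__ while setting up the flag.
        object.__setattr__(self, "_frozen", False)

    def __setattr__(self, name, value):
        if self._frozen:
            raise AttributeError(f"cannot set {name!r}: object is frozen")
        object.__setattr__(self, name, value)

    def freeze(self):
        object.__setattr__(self, "_frozen", True)


config = Freezable()
config.workers = 4               # dynamic start-up phase
config.freeze()                  # "done being dynamic"

try:
    config.workers = 8           # rejected after the freeze
    changed = True
except AttributeError:
    changed = False
```

In the hypothetical language, the freeze point is where the runtime could replace the hash-like representation with fixed offsets. In Python this version only adds a check, so it is slower, not faster.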


This may be a naive question (I have very little knowledge about building languages and compilers): Would this be possible in Python by introducing a keyword like `final`? Any object, variable, method that is marked final just has to be looked up once by the interpreter, the re-fetching the article describes doesn't have to happen again. Trying to change a final thing results in an exception.


>I've often noodled in my head with the idea of a language that has a dynamically-typed startup phase, but at some point you call "DoneBeingDynamic()" on something (program, module, whatever, some playing would have to be done here) and the dynamic system basically freezes everything into place and becomes a static system. (Or you have an explicit startup phase to your module, or something like that.)

V8 tries to guess that for classes and objects based on runtime information - that's how it gets some of its speed (it still needs checks about whether this is violated at any point, so that it can get rid of the proxy/stub "static" object it guessed).

For a more static guarantee, there are also things like Object.freeze which does about what you describe for dynamic objects in JS (#).

# https://gist.github.com/briancavalier/3772938


I'd be curious to see if a language developed with the idea that this is what it's going to do from scratch could do better than trying to bodge it on afterwards. Rather than pecking around what could be done literally decades after the language is specified, what if you started out with this idea?

I dunno. It's possible the real world would stomp all over this idea in practice, or the resulting language would just be too complex to be usable. It does imply a rather weird bifurcation between being in "the init phase" and "the normal runtime phase", and who knows what other "phases" could emerge. Although technically, Perl actually already has this split, although generally it can be ignored because it's of much less consequence in Perl precisely because there mostly isn't much utility to having something done in the earlier phase, unlike this hypothetical language.


It seems that lisp-like macros or more generally multistage compilation is close to what you have in mind.


Yes, it's not a brand-new dimension of programming languages, merely a refinement of existing ideas. However I'm not aware of anything quite like it out there. Lisp could be used to implement it, but, I mean, that's not a very strong statement now is it? Lisp can be used to implement anything. The question is about whether it exists.

Partially I throw this idea out as a bone to those who like dynamic languages. Personally I don't have this problem anymore because I've basically given them up, except in cases where the problem is too small to matter. And if you already know and like Lisp, you don't really have this problem either.

But if you are a devotee of the 1990s dynamic scripting languages, you're getting really squeezed right now by performance issues. You can run 40-50x slower than C, or you can run circa 10x slower than C with an amazing JIT that requires a ton of effort and will forever be very quirky with performance, and in both cases you'll be doing quite a lot of work to use more than one core at a time. Python is just hanging in there with the amazing amount of work being poured into NumPy, and from what I gather from my limited interactions with data scientists, as data sets get larger and the pipelines more complex, the odds you'll fall out of what NumPy can do and fall back to pure Python goes up and the price of that goes up too.

I think a new dynamic scripting language built from the ground up to be multithreadable and high performance via some techniques like this would have some room to run, and while hordes of people will come out of the woodwork promising that one of the existing ones will get there Real Soon Now, just wait, they've almost got it, the reality is I think that the current languages have pretty much been pushed as far as they can be. Unless someone writes this language, dynamic scripting languages are going to continue slowly, quite slowly, but also quite surely, just getting squeezed out of computing entirely. I mean, I'm looking at the road ahead and I'm not sure how Go or C# is going to navigate a world where even low-end CPUs casually have 128 cores on consumer hardware.... Python qua Python is going to face a real uphill battle when the decision to use it entails basically committing to not only using less than 1% of the available cores (without offloading on to the programmer a significant amount of work to get past that), but also using that core ~1.5 orders of magnitude less efficiently than a compiled language. You've always had to pay some to use Python, sure, but that's an awful lot of orders of magnitude for "a nice language". Surely we can have "a nice language" for less price than that.


Good take. Back to Lisp, then? What are we waiting for?


Well, that is why Julia matters.


This approach kind of describes Graal.

Interestingly, GraalPython never seems to come up on these speeding-up-Python articles & benchmarks while TruffleRuby is a heavyweight in the speeding-up-Ruby space.


I tried to benchmark GraalPython for the talk but the compatibility situation was so poor that I wasn't even close to being able to run any benchmarks.


This kind of sounds similar to what a JIT compiler does, except that a JIT will silently fall back to slower code if you do those forbidden dynamic things. I think the most appealing thing about what you're suggesting here is less about the peak performance and more about having better guarantees about startup cost and that performance won't be degraded (prefer failing loudly to chugging along unoptimized). These two areas often aren't the strongest point in JIT-ed systems...


This feels like you are describing julia, startup cost included :).


I disagree. You are just doing those same optimizations by hand, instead of on a JIT. The computer is there to help us, and a lot of the value in a dynamic language comes from being able to override things at any time.

If you just set your structure up and run it statically, you are better with a static language, that can take all kinds of value from that fixed structure.


Great read, vaguely reminds me someone or other was trying to get cpython going with cosmopolitan libc. Wonder what that would do for speed.


why do you think that this would help performance? a quick read says cosmo is slower than regular libc. maybe it would be more portable, but not faster


15 years ago I remember reading Guido van Rossum saying that Python is a connector language and if you need performance, just drop into C and write/use a C module. I thought it was crazy at the time, but now I see that he was absolutely right. It took a while, but now Python has a high-performing C module for pretty much every task.


But these don’t compose right? Each is a black-box to each other? A black-box add and a black-box multiply don’t fuse.


C and Python are not black boxes to each other. The entire Python interpreter is literally a C API. You can create PyObjects, add heterogeneous PyObjects to PyLists, etc. So everything in Python can be introspected from C.

Turned around, Python has arbitrary access into the C programming space (really, the UNIX or Windows process it's running inside), so long as it has access to headers or other type info it can see C with more than black box info.

Most Python numerics is implemented in numpy; the low levels of numpy are actually (or were) effectively a C API implementing multi-dimensional arrays, with a Python wrapper.


You're talking past chrisseaton's point here. If you want two C extensions to interoperate with bare-metal performance, you can't just do

  from lib1 import makedata
  from lib2 import processdata

  data = makedata()
  print(processdata(data))
Because makedata needs to provide a c->py bridge and processdata needs a py->c bridge, so your process inherently has python in the middle unless lib2 has intimate knowledge of lib1. It can absolutely be done (I've written plenty of c extensions that handle numpy arrays, for example) but if somebody hasn't done the work, you don't get it for free. If your c extension expects a list of lists of floats, the numpy array totally supports that interface... but (last I checked) the overhead there is way slower than calling list(map(list, data)) and throwing that into your numpy-naive c extension.


> C and Python are not black boxes to each other

Yes they are - the Python interpreter knows nothing about what your C extension does. It can’t optimise it because all it has is your machine code - no higher level logic.


They can! Numpy exposes a C API to other Python programs [0]. It's not hard to write a Cython library that uses the Numpy C API directly and does not cross into Python [1].

[0]: https://numpy.org/doc/stable/reference/c-api/index.html

[1]: https://github.com/kylebarron/pymartini/blob/4774549ffa2051c...


So they can if you use their specific API? It doesn’t naturally compose in conventional Python code?


>The first topic he raised, "why Python is slow", is somewhat divisive

What dynamic, interpreted, single threaded language is fast?


Lua (LuaJIT implementation). Some Smalltalk VMs are also quite fast. For example, see Eliot Miranda work on CogVM.


You're pushing it a bit too far if you say that JIT is interpreted.

To answer OP, if you replace "dynamic" by "untyped", Forth qualifies. And it actually can go where there's no JIT to save your A from the "just throw more hardware (and software) at the problem" mindset.


I think someone once said dynamic langs must cheat to be performant. Jitted runtimes are just interpreters cheating.


> What dynamic, interpreted, single threaded language is fast?

Javascript. End of list.

The problem is that a Javascript implementation is now so complicated that you can't develop a new one without massive investment of resources.


Interpreted Javascript is slow. Most modern Javascript implementations use a JIT.


As far as interpreted languages go, Wren is pretty quick, but still not fast compared to compiled languages.

But for dynamic, single threaded languages, JavaScript is famously fast with a modern JIT compiler like V8.


Practically every other language that ticks those boxes is faster than Python.


Just recently I replaced an internal library (classes for User and a list of users, that need serialization from and to json) with a Python library written in Rust (using pyo3 and maturin).

I did not do it for performance considerations, but because I felt the type system of Rust could give me better guarantees here (and it did, I found multiple bugs this way).

On top of that, the performance gain was noticeable, although I did not bother to benchmark it.

There are still some gotchas in the conversion from Python types to Rust (e.g. a function which takes a Vec<String> will happily accept a str and produce a vector of the str's characters). This is common behaviour in Python, where I tend to do an isinstance check (if isinstance(value, str): value = [value]) to deal with it.
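For reference, a minimal sketch of that guard on the Python side. The helper name normalize_names is made up for illustration; the point is only that a str is itself an iterable of one-character strings, so it has to be wrapped before being treated as a list of strings.

```python
def normalize_names(value):
    """Return a list of strings, wrapping a bare str in a one-element list."""
    if isinstance(value, str):
        value = [value]
    return list(value)


# Without the guard, a str is silently split into its characters.
split_by_accident = list("abc")

# With the guard, a bare str becomes a single-element list.
wrapped = normalize_names("abc")

# Real sequences of strings pass through unchanged.
passthrough = normalize_names(["alice", "bob"])
```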


Faster-cpython is not the main topic here but certainly welcome, since CPython is the most used Python implementation. They've done great things so far. Though I remember hearing the promise of a 50% improvement in each of five separate steps :)


Print doesn't have to be re-resolved on every access... Not sure about Python, but many interpreters do a resolution pass that matches declarations and usages (and decides where data lives: stack, heap, virtual register, whatever).


In Python semantics, indeed, 'print' does need to be looked up each time!


Surely in the common case this can be optimized away??? Could you add some detail?


'print' is a global function that can be overridden by user code. The most obvious way to do this would be to define a new function named 'print', but it could also be overridden through less direct means, like from an 'eval' call, manually manipulating the global variables mapping ('globals()'), or by manipulating the '__builtins__' mapping. Maybe this happens in another thread!

Statically reasoning about all this (as would be required for optimization) is difficult. Not totally impossible, but not something that the canonical CPython implementation tries to do.
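A small sketch of two of the override routes mentioned above, to show why the lookup cannot simply be resolved once. The Recorder class and greet function are hypothetical names for illustration, and the sketch uses the documented builtins module rather than touching '__builtins__' directly.

```python
import builtins


class Recorder:
    """Stand-in for print that records its arguments instead of writing them."""

    def __init__(self):
        self.calls = []

    def __call__(self, *args, **kwargs):
        self.calls.append(args)


def greet():
    # 'print' is looked up by name each time this line runs:
    # the function's module globals first, then the builtins module.
    print("hello")


recorder = Recorder()

# 1. Shadow the builtin through the globals mapping, then undo it.
globals()["print"] = recorder
greet()
del globals()["print"]

# 2. Replace the builtin itself, then restore it.
original_print = builtins.print
builtins.print = recorder
try:
    greet()
finally:
    builtins.print = original_print
```

The body of greet never changes, yet both calls end up in the recorder instead of on stdout, which is the behaviour an optimizer would have to preserve.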


What does "Modern" python even mean?


Focuses on 3.8+, but 3.7 has another year of life in it.


What's wrong with using the right tool for the right job? Python for utility scripts, Javascript for Web frontend, C and C++ for system programming, C# for Web backend, R for statistical stuff and data analysis?

It seems to me some guys learned a language suited to a thing and instead of learning other languages better fitted for other purposes, they push for their one and only language to be used everywhere, resulting in delays and financial losses.

It's not very hard to learn another language. Or, if you are that lazy, you can stay with the language you know and use it for what was intended.


Python is dominant in web backend, statistical stuff and data analysis nowadays.


Especially cuz it allows you to easily write web backends using its awesome scientific and statistical/data science libraries. There is no other language that lets you build web services on dynamic data calculations like that.



