I've never been able to find a good use case for PyPy. Why would you ever want to run CPU bound tasks in python? I'm sure there are arguments about the huge ecosystem, not having to rewrite code in another language, etc. But most of the widely used stuff in python already has underlying compiled code for the heavy lifting. Also, given the large amount of parallelism in modern hardware architecture and python's lackluster concurrency support, I just don't see a reason to use it.
I used Luigi [1] to automate data processing at a previous job. It's a simple job queue with a UI. You request jobs from it, and then run them for minutes or hours, so it shouldn't normally be a bottleneck and it makes sense to use a language that's quick and easy to write.
It's written in Python and works fine to process thousands of jobs per day. Once you start having tens of thousands of jobs in the queue, it gets slow enough that it can back things up. This compounds the problem, eventually resulting in the whole thing crashing.
By switching the interpreter to PyPy, I was able to keep the data pipeline running at that scale without having to rewrite anything.
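For anyone who hasn't used it, a Luigi job is just a Python class, which is part of why it's so quick to write. A minimal sketch of the shape of those tasks (the task name and paths here are made up):

    import datetime
    import luigi

    class AggregateDay(luigi.Task):
        date = luigi.DateParameter()

        def output(self):
            # Luigi decides whether a job already ran by checking
            # whether this target exists on disk.
            return luigi.LocalTarget(f"out/{self.date}.csv")

        def run(self):
            # The minutes-or-hours of real processing would go here.
            with self.output().open("w") as f:
                f.write("result\n")

    if __name__ == "__main__":
        luigi.build([AggregateDay(date=datetime.date(2020, 1, 1))],
                    local_scheduler=True)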
> Why would you ever want to run CPU bound tasks in python? ... But most of the widely used stuff in python already has underlying compiled code for the heavy lifting.
Haven't you just answered your own question? We know that people want to run CPU bound tasks in Python so much that they went to the effort of writing native modules because they couldn't do it in Python.
> python's lackluster concurrency support
This is a common misconception - Python actually has fully concurrent threads already.
Not fully. The GIL is always held while executing Python bytecode, because the interpreter internals aren’t threadsafe. Actual parallelism only happens in native code that explicitly releases the GIL.
When a Python thread is holding on to the GIL (running Python bytecode), how many other Python threads can concurrently run in the same process?
The answer is zero.
Sure, the interpreter releases the GIL every n bytecode ops, and C extensions can release the GIL before doing anything IO-bound and reacquire it (i.e. wait for it) afterwards, but that isn’t full concurrency in my book.
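This is easy to check for yourself. A quick sketch; on CPython the two-thread version takes roughly as long as the single-threaded one, because only one thread can be executing bytecode at a time:

    import threading
    import time

    def spin(n):
        # Pure-Python busy loop; holds the GIL for the whole computation.
        while n:
            n -= 1

    N = 50_000_000

    start = time.perf_counter()
    spin(N)
    print("one thread:  %.2fs" % (time.perf_counter() - start))

    start = time.perf_counter()
    threads = [threading.Thread(target=spin, args=(N // 2,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Roughly the same wall time: the two threads just take turns on the GIL.
    print("two threads: %.2fs" % (time.perf_counter() - start))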
I think you're describing parallelism. The thread is about concurrency. I think you'll find this definition of concurrency matches industry-standard definitions like Padua's.
This is a commonly held misconception. Concurrency actually implies that computations can be reordered without changing the final outcome, and does not imply parallel execution. This is related to parallelism in that concurrent computations can be run in parallel for a speedup.
> When a Python thread is holding on to the GIL (running Python bytecode), how many other Python threads can concurrently
You've been tricked by jargon. It's a common misconception. In this context, "concurrency" has a specific meaning that is different from the everyday one you're using in this sentence.
For a good introduction to what the two words mean in a software engineering context, check out this written version of Rob Pike's talk, "Concurrency is not Parallelism."
> Concurrency is composition of independently executing things. . . Parallelism is simultaneous execution of multiple things.
When Python's threading model was implemented, parallelism just wasn't much of a concern. CPUs had a single core and could therefore only be working on one thing at a time. (In a macro sense; pipelining and superscalar architectures were still a thing, but not super relevant here.) Multithreading was not a way to do multiple things at once; it was a way to ensure that some long-running calculation would not cause the program to lock up by, e.g., preventing it from responding to event queues in a timely manner. This was done not by running things in parallel, but by switching back and forth among them very quickly.
Python's GIL was designed for this kind of situation. It's there to ensure that nothing bad happens if one of those context switches happens in the middle of a sensitive operation. Which is, strictly speaking, a concurrency concern and not a parallelism concern.
(It's also possible to have parallel work that is not concurrent, in which case locks are not necessary. But just because it's common for parallelism and concurrency to co-occur does not mean that they are the same thing.)
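To make the distinction concrete in Python terms, here's a sketch of concurrency with zero parallelism: two independently executing tasks composed on a single OS thread, taking turns at the await points:

    import asyncio

    async def worker(name, delay):
        # An independently executing thing; each await is a point
        # where it yields control to whatever else is ready to run.
        for step in range(3):
            await asyncio.sleep(delay)
            print(name, "step", step)

    async def main():
        # Composition of two such things, interleaved on one thread:
        await asyncio.gather(worker("a", 0.10), worker("b", 0.15))

    asyncio.run(main())

The output interleaves "a" and "b" even though nothing ever runs simultaneously.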
I guess I incorrectly understood “fully concurrent threads” to mean threads that actually run in parallel, like Java threads. The redundant word “fully” threw me off; apologies.
A well-tuned web application layer is CPU bound at scale: if your database is well designed, it isn't the source of latency. And a language's lack of concurrency support doesn't matter when the interpreter's overhead dwarfs the cost of a context switch, which is absolutely true of CPython.
There are many sites and services where a rewrite in a new language is just not viable, and I still recommend Python-everywhere to startups doing anything remotely associated with data. So PyPy is a tide that could lift many boats.
I switched a high traffic Flask web app to PyPy a couple of years ago and we saw substantially faster response times across the board, and much higher task throughput from our background worker machines, many of which were pegged 24/7.
We had so much less baseline load afterwards that we were able to scale down a bunch of instances. The transition only took a few hours of effort fixing one or two incompatible dependencies, so it paid for itself in savings quickly, especially vs an approach of trying to rewrite the slowest bits in a faster language.
> Why would you ever want to run CPU bound tasks in python?
0. Because you don't have time to deal with the mess that C++ has become, and all the please-repeat-yourself-a-million-times busywork that comes with it (cmake files, header files; they give us a goddamn spaceship operator but not basic necessities like string split/join methods)
1. There are many use cases where faster execution is merely nice to have, e.g. when results can be cached for a long time, or when it's a one-off data analysis script, but human time is far more expensive. If it costs $1000 more in engineer hours to write C++ instead of Python for a script that's only going to be run 10 times, that isn't a worthwhile tradeoff. Hell, you could buy a new GPU for that money.
2. Because the same exact file can be deployed on arm32, arm64, and x86
3. Because CPU-intensive stuff is largely already optimized by numpy, numba, tensorflow, pytorch, etc. (see the sketch below)
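To put item 3 in concrete terms, a rough sketch; the pure-Python loop pays interpreter overhead per element, while the numpy call spends nearly all its time in compiled code, so the interpreter's speed barely matters:

    import time
    import numpy as np

    data = np.random.rand(10_000_000)

    start = time.perf_counter()
    total = sum(data)       # pure-Python loop: one bytecode dispatch per element
    print("python sum: %.2fs" % (time.perf_counter() - start))

    start = time.perf_counter()
    total = data.sum()      # a single call into compiled code
    print("numpy sum:  %.4fs" % (time.perf_counter() - start))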
I can either spend a lot of time rewriting my already tested and working code if my application scales to the point of hitting a CPU bottleneck OR I can just try using PyPy.
Python makes it quick to write code, test it and get it out the door.
Other languages don't have that same cycle, and Python has a LARGE number of freely available packages that help you launch even quicker, since you don't have to write everything yourself.
So at that point it comes down to: what are you fastest in?
Python, given its ecosystem of packages, is quick to write functioning code in. Its dynamic nature allows for quick prototyping and refactoring without requiring massive pre-planning and cognitive overhead.
I've done it for one-off data munging scripts - processing archives of one format of data to another, etc. Python is easy to write and for these tasks, you can get PyPy to execute at twice the speed for basically no cost.
Strictly speaking, "CPU bound" is not an adjective that can be used to describe tasks. It's one that describes a particular program solving a particular task under a particular configuration. I've done no analysis on this myself, but I would be more than willing to believe that a CPU-bound job might be only a CPython-to-PyPy's worth of speedup away from instead being memory- or disk-bound.
It can be hard to tell the difference, too, since being stalled out while waiting on the memory controller shows up as CPU activity in htop.
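You can see that effect even from Python, with an illustrative (and very machine-dependent) numpy sketch; both halves peg a core in htop, but the second one spends much of its time stalled on memory:

    import time
    import numpy as np

    data = np.random.rand(20_000_000)
    idx = np.random.permutation(len(data))   # a random visiting order

    start = time.perf_counter()
    data.sum()                               # sequential scan, prefetch-friendly
    print("sequential: %.3fs" % (time.perf_counter() - start))

    start = time.perf_counter()
    data[idx].sum()                          # random gather: waits on memory,
                                             # yet still shows up as CPU time
    print("shuffled:   %.3fs" % (time.perf_counter() - start))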
The Intel VTune profiler [0] bolts onto Linux's perf subsystem and offers a very nice way of assessing whether the CPU is stalled on memory (or cache) or is spending its time computing. I guess it's a nice GUI on top of (nowadays) standard Linux tracing interfaces, but I really didn't dig deep enough.
If you are after deep profiling, you should definitely give it a try. My recollection is totally positive.
As others mentioned, it's generally an easy choice to write one-off data processing & analysis stuff in. I can get all the parallelism I need with the multiprocessing library.
For me this always just consists of reading a bunch of files in and then doing some basic aggregation on them. Years ago I benchmarked Python vs. PyPy and found no real benefit to it.
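For the curious, those scripts are roughly this shape (directory layout and CSV format are made up):

    from collections import Counter
    from multiprocessing import Pool
    from pathlib import Path

    def aggregate_one(path):
        # Hypothetical per-file work: tally records by their first CSV field.
        counts = Counter()
        with open(path) as f:
            for line in f:
                counts[line.split(",", 1)[0]] += 1
        return counts

    if __name__ == "__main__":
        files = list(Path("data").glob("*.csv"))
        with Pool() as pool:   # one worker process per core by default
            totals = sum(pool.map(aggregate_one, files), Counter())
        print(totals.most_common(10))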
Here's a link to that benchmark if you'd like to read it:
Code can end up CPU bound that was never meant to be. For example, having to use a database driver or some other low-level library written in Python can make your web app pretty CPU-bound. For me, the main issue with PyPy is compatibility.