Python 3 Q&A by a prolific core developer

zanny · on June 29, 2012

I think all the outrage is that Python 3 behaves like a new language in how everything needs to be ported (albeit at a magnitude less complexity than going up the language chain or across the playing field, like C to Ruby or C# to Java) but it still incurs the new language cost, while being the new version of the old language.

If people just thought of Python 3 as the replacement to broken text model Python, like how XHTML tried to supplant HTML and failed, and then HTML5 supplanted XHTML with some quirks, the same thing happens here.

Also, I adore the comments about the GIL. I understand the benefits of using a single language in every circumstance, but (even) a JIT scripting language is not sufficient for all problems. I guess it depends on if a polyglot code base is more complex than handling multithreading and the corresponding memory complexity and gimmicks, and that is developer centric.

Still, I would much rather just stick processor bound code in openMP C functions and call those from Python when need be. It seems like the "right" answer to the performance problem, without losing much productivity.

dmbaggett · on June 29, 2012

From the article:

Back in reality, though, complaining about the GIL as though its a serious barrier to adoption amongst developers that know what they’re doing often says more about the person doing the complaining than it does about CPython.

It was a solid, well-argued piece up to this point. You do yourself and the Python community a disservice by writing off your critics as ignorant. It sounds petulant and childish, and is wrong.

There are valid arguments on both sides of the GIL argument, but neither side's advocates are ignorant or bad programmers.

bwood · on June 29, 2012

In my experience with Python, the issue of the GIL comes up far too frequently. I use Python first as a prototyping environment, and once I have a proof of concept I like to optimize for performance (without re-writing the entire program in a more appropriate language). Most of the stuff I do is CPU-bound, but also involves large amounts of data. So, without writing C code as an extension, I can either suck it up and deal with single-core performance, or I can use the multiprocessing module and hope that inter-process communication isn't too expensive or difficult to code (and no, it's not always as simple as just using multiprocessing.Queue). I am under no delusion that the GIL will ever be "fixed", but I am disappointed that I spent years working with Python before I got into more CPU-intensive tasks and eventually hit the brick wall that is the GIL. And of course, comments like you mentioned, which attack my competence as a coder, are not constructive.

pwang · on June 29, 2012

I'm interested to hear about your CPU-bound Python programs that also involve large amounts of data. Normally such problems are I/O bound and are better solved in a data parallel way. (Graph problems can sometimes be an exception, but those are still generally I/O bound.)

For writing extensions, have you considered Cython?

bwood · on June 29, 2012

Sure thing, here are two examples:

1) My current project at work is a GPU-accelerated keyword-matching engine. The project was started before I joined the company, so I had no say in the choice of Python. Keywords change infrequently, while we analyze a continuous stream of incoming text. There are several million keywords, ranging from small to enormous in size. Aho-Corasick (http://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_string_mat...) is a pretty ideal algorithm for this scenario, which we use for the GPU matching kernel.

AC requires some preprocessing of keywords into a deterministic finite automaton (basically a suffix trie). This is very expensive for a large number of keywords with a large number of characters. The DFA grows to something like 10GB while being built.

Meanwhile, the main engine loop has to be running continuously, while updating keywords in the background. The engine is a service available to other systems on our network, so it uses multiple threads for concurrent I/O. The problem is that the GPU performance is so ridiculously high that the CPU can't keep it fed with data. I've profiled it and this is not a memory-bound problem...the CPU simply cannot keep up with the document streams that we send to it.

The concurrent I/O threads cannot reasonably be split across processes because they need a shared memory space for the data structures driving the engine. So clearly, the background keyword updating is a problem if it runs in the same process as the rest of the engine. I spent a lot of time trying to figure out how to get the keyword updating working in its own multiprocessing process. It's a complete hack to work around the failings of Python (I can go into more depth about the implementation issues if you'd like). And this is why I loathe the GIL.

We use Cython for some aspects of the code, but the keyword updating has yielded very little gain. It's difficult to rewrite parts of the keyword updater as more optimized Cython because it uses some language features that do not seem to be supported in Cython.

2) For a personal project, I need to do a lot of timeseries processing. I'm using Python to prototype, with the intention of either optimizing it eventually or possibly rewriting it in a more suitable language. I've found parsing timestamps to be particularly CPU-intensive, while working on gigabytes of data. Most data I send to a multiprocessing process will have to be returned in some form eventually, so communication costs are huge. So huge, in fact, that I only see a 10% speedup from splitting the workloads evenly across six cores. Profiling reveals that the majority of the "processing" time is actually just waiting on data getting sent back to the main process. This would not be a problem with a shared memory space.

baq · on June 29, 2012

re 2) - http://docs.python.org/library/mmap.html ?

bwood · on June 30, 2012

Interesting, thanks for the tip! I'll look into it and think about whether it makes sense for my timeseries analysis.

slurgfest · on June 29, 2012

Let's be honest: you are upset because you have sometime committed to saying the GIL is a huge deal and you feel insulted by anyone saying it isn't, which is why you are using emotionally loaded words like 'petulant' and 'childish' and 'ignorant' and 'bad'.

The GIL is the single biggest target for language-advocacy FUD against Python by advocates of other languages. I would be a rich person if I got a dollar for every time I saw someone trashing Python as a toy language because of the GIL, without significant knowledge of how to use Python. It's just a much huger issue to someone without real Python experience than it is to people with Python experience.

Setting that aside, there are a vast number of use cases where someone might try to use threads when actually that's not a good solution. It actually isn't incredibly easy to come up with cases where the GIL is this huge fatal flaw. It's really sad how many times I have seen people ranting about the GIL and totally unaware of multiprocessing, unable to give a technical reason why they can't try processes, unaware of greenlets, wholly unaware that the GIL does not exist in Jython/IronPython, etc.

I'm sure that doesn't apply to you, which would put you in the minority. Thus, "often says more about the person doing the complaining".

Threads should be used judiciously, because they dramatically increase the complexity of a program and the difficulty of debugging and reasoning about it. Shared-everything is a great way to blow your foot off if you don't specifically need it. So if you throw threads at every problem indiscriminately, then that really is a weak point in your programming. (Note: I am not saying that using threads is always a bad idea)

If you are a good programmer then you should already know, and not be offended to hear, that threads are a tool of specific applicability, not a panacea to scale up everything.

dmbaggett · on June 29, 2012

Let's be honest: you are upset because you have sometime committed to saying the GIL is a huge deal and you feel insulted by anyone saying it isn't, which is why you are using emotionally loaded words like 'petulant' and 'childish' and 'ignorant' and 'bad'.

Thank you for telling me what was inside my head. I was simply unaware of my emotional state and motivations until you helpfully pointed them out to me.

But seriously, "good programmers don't use threads (much)" is your counter-argument?

sho_hn · on June 29, 2012

I have to agree here, despite the two of us rating the importance of the GIL problem quite differently in yesterday's discussion: A better strategy against the GIL FUD is plain old education, not trying to make it a taboo topic. Especially programmers who do know what they're doing won't take too kindly to a community where that becomes prevalent.

hynek · on June 29, 2012

I find Nick arguing pretty much to the point and the quote above was ripped from it’s context which is a huge section balancing the arguments against each other.

tldr He argues there are better ways for scaling out than threads and removing the GIL would have enormous consequence _throughout_ the whole code base so it’s removal cannot be warranted.

Reducing 10 paragraphs and 11 bullet points to “only stupid people use threads” is just poor style.

seunosewa · on June 29, 2012

From the article: "The only downsides of this approach are that it means that CPU bound Python code can’t scale to multiple cores within a single machine using threads, and that IO operations can incur unexpected additional latency in the presence of a CPU bound thread."

Response: The only downside? As a Python user suffering from JVM envy, I have to say that that's a SERIOUS downside!

Here's why: (1) Python is slow. Almost any real life program in pure Python will have CPU-bound components. (example: BBCode parsing on my forum).

(2) Most programs that need to scale won't need to scale beyond a single machine (the most active web forum on my continent runs on a single quad core server)

Therefore the need to be able to scale CPU-bound python programs to multiple cores on is a very real need. Even though we accept that removing the GIL is hard, let's not insult real-life Python users by suggesting that their needs are not real.

hynek · on June 29, 2012

The article doesn’t say multi-CPU scaling isn’t necessary. It says that threads are usually the wrong answer anyway.

There are great process based ways to scale out – look no further than Erlang to see it’s true.

seunosewa · on June 29, 2012

I use Python for my day job. I've experimented extensively with Java and Scala. I must say this: Threads are AWESOME. Threading is the most flexible model of concurrency because you can can build programs that EFFICIENTLY implement "alternative concurrency models" like message parsing and STM with threads. Threading is supported natively by every OS. And fast.

Erlang is hyped as the ideal model for concurrency, but in practice is a niche product that's primarily useful for programs that are almost pure IO - chat servers, routing components like proxy servers and packet switches.

The Erlang model does NOT apply to python, anyway, since Python processes are nothing like Erlang processes. Unlike Erlang processes, Python processes are very heavyweight and message passing between them is costly.

hynek · on June 29, 2012

If you’re using python, the performance gap between processes with message passing and threads with locking is the last of your problems, believe me.

The big difference is that processes are much more robust and testable. The cases where threads are really needed are fringe cases and – while it’s a pity – Python doesn’t seem the right language if you don’t want to go the Jython/IronPython way.

The bigger problem is that people are used to go for threads by default although only few are able to write bug-free threaded code. It’s obsolete but prevalent performance wisdom and the fact that threads were really popular in the Java world.

ak217 · on June 29, 2012

This is a really short-sighted perspective.

Proper support of threading inherently allows more performance and flexibility than multiprocessing. On top of them, you can build powerful, Pythonic abstractions like concurrent.futures.ThreadPoolExecutor.map and STM, and on top of those, even more powerful abstractions that help the developer avoid concurrency bugs.

I'm really excited for PyPy. That is a project full of people who are not afraid to quickly iterate on powerful ideas that can make Python a high performance language that it deserves to be, instead of resorting to calling MT programming a "fringe case" and ad hominem attacks.

hynek · on June 29, 2012

I wish you’d read the article before making your accusations.

ak217 · on June 29, 2012

I have. There's an ad hominem attack in the middle of it. Otherwise it's a great article.

I understand that everyone here is acting in good faith and wants Python to be better, and the article otherwise contains lots of great information presented in a reasonable manner. You bring up lots of good points too. But other statements like the ones I mentioned are overly broad or brash.

hynek · on June 29, 2012

I think it’s just attrition by explaining the GIL problem over and over again. Nick is one of the major core developers and probably just fed up by the topic. So this bit of snark is all the pay he’ll ever get for his work on CPython.

scott_s · on June 29, 2012

I really liked the article. But look at how much of this thread is spent talking about that bit of snark. I think that mixing the snark in with all of the rational reasons for Python 3's existence and not working on the GIL made some people less receptive to rational arguments. In other words, I don't think it was worth it.

slurgfest · on June 29, 2012

"Pythonic abstractions like concurrent.futures.ThreadPoolExecutor.map"

Although it might be great stuff, the word 'Pythonic' is pretty funny next to a long Java-style name like that

bwood · on June 29, 2012

If you’re using python, the performance gap between processes with message passing and threads with locking is the last of your problems, believe me.

What would be the first of my problems?

hynek · on June 29, 2012

Not sure if you’re trying to troll me by taking it out of the obvious performance context but anyway: The performance penalty due to the usage of a un-JIT-ed scripting language?

bwood · on June 29, 2012

Not trolling you, just wondering what you thought was more important from a performance perspective. Re-writing working software in another language is not always feasible due to real-world time constraints, and in that context the performance difference between message-passing and threads would be the first of my problems. Basically, I just don't see the argument of "use a more appropriate language than Python" as a useful counter to criticism of the GIL. The whole point of criticizing Python (in my case anyway), is to hopefully nudge the language to suiting my needs more closely.

hynek · on June 29, 2012

The counter is not (even while some people try to turn it that was): THREADS SUCK, WE WON'T ADD THEM BECAUSE WE DON'T LIKE THEM. it is: given the circumstances (which Nick outlines verbosely) a removal of the GIL is not pragmatic.

And this is the last time I’ve wrote this, I feel like a street-organ. >:(

And all I was saying in this thread is that the performance gap between threads and processes isn’t that big of a deal, if you run non-native code anyway. The multiprocessing module is pretty cool.

bwood · on June 29, 2012

I'm not intentionally trying to make you repeat yourself. And I know that removal of the GIL is not pragmatic at the moment (or maybe ever), but that doesn't mean it wouldn't be valuable. The GIL wasn't much of a problem ten years ago because not many personal computers had multiple cores. Today it's become a bit of a pain for me personally, and it will only become more painful as core count increases while single-core performance remains largely stagnant.

And all I was saying in this thread is that the performance gap between threads and processes isn’t that big of a deal, if you run non-native code anyway. The multiprocessing module is pretty cool.

This is a line that I hear over and over again, but I strongly disagree with. It's not always easy to predict where your performance bottlenecks will be until you actually start implementing it in some language. If I've chosen Python for a project and find I need more cores, I'm stuck with either re-implementing critical sections of code in C extensions or other languages, or using multiprocessing. And multiprocessing is not that great because it splits the memory space across processes and communication between them is extremely expensive. And there are many caveats to this which cause enormous headaches (eg., you can't fork your process while having an active CUDA context, not all Python objects are serializable, pickling is slow, marshaling doesn't work well for all data types, you must finish dequeuing large objects from a multiprocessing.Queue before joining the source process, etc.).

Yes, I could get a 10-100x speedup by re-writing everything in C. But most of the time, I would be very happy with a 6-12x performance gain from just using threads in a shared memory space.

slurgfest · on June 29, 2012

I assume that for whatever reason, it is absolutely impossible for you to use any other concurrency model.

Did you try Jython?

bwood · on June 29, 2012

The specific project I'm vaguely referring to here is described in a little more detail as 1) here: http://news.ycombinator.com/item?id=4178070

No, I didn't try Jython. The choice of CPython was made before I took over the project, and there are also a dozen or so dependencies which I don't think are compatible with Jython.

slurgfest · on June 29, 2012

If you are trying to exploit parallelism, any kind of data sharing (other than pure read-only sharing) costs you.

seunosewa · on June 29, 2012

Yes, but with threading, the costs are minimized. You don't have to convert your Python objects to bytes, send them over a network, wait, retrieve bytes, and convert back to Python objects every time you need to read or write some shared data.

MetaCosm · on June 29, 2012

SMP Erlang DOES use threads, it then uses green-processes and a custom scheduler to schedule processes fairly and preemptively across those threads. It doesn't use like 1 thread per process, but threads are a key component of how it works.

It is absolutely nothing like spawning OS level processes. They are micro-processes, green-processes than live inside the Erlang VM.

slurgfest · on June 29, 2012

You are writing a forum which includes BBcode parsing, and Python's CPU usage is your bottleneck? Really?

You are devoting multiple cores to parsing an individual user's BBcode?

This is your example of a real need to remove the GIL?

seunosewa · on June 29, 2012

No, that was just an example of a unexpectedly CPU-bound operation. My point was that you can't assume your Python program is not CPU-bound. Every time you have to do something non-trivial in Python, performance could become an issue.

pwang · on June 29, 2012

How do you know it's not memory I/O bandwidth bound? Just curious.

scott_s · on June 29, 2012

From the perspective of a Python program, being memory bandwidth bound is the same as being CPU bound: you have the GIL, your process is in the running state in the OS, and is currently executing on a core.

(I assume you mean truly mean bandwidth between main memory and the processor, and not to disk.)

lrem · on June 29, 2012

Yeah, but do most programs that need to scale also need shared context? Otherwise, what are the big downsides of processes vs threads?

seunosewa · on June 29, 2012

Many programs can benefit from shared context. You can push all the shared context to your database, but it's often helpful to keep around some shared data structures for performance reasons. For example, you can cache pure Python functions using functools.lrucache, and share such caches between threads, but such caches can't be shared between processes. In-process data structures like dicts and lists are much, much faster than alternatives like memcached and redis because they avoid the overhead of IPC and deserialization, and they are also easier to use since they are built-in.

slurgfest · on June 29, 2012

dicts and lists are fast, but that doesn't mean a threading approach which must protect your dicts and lists with various kinds of locking will be that fast, because your program will now be waiting on locks. What are you trying to do with memcached that it is not fast enough?

seunosewa · on June 29, 2012

> What are you trying to do with memcached that it is not fast enough?

Fine-grained caching of objects that correspond to DB rows. Most pages touch hundreds of DB rows, due to the various relationships between objects. With memcached, you have to cache at a higher granularity and contort your code quite a bit to reduce the number of gets per request.

> dicts and lists are fast, but ... your program will now be waiting on locks

In my experience, the overhead of locking is often negligible. In Java-land, you can have millions of lock operations per second. IPC involves serialization, deserialization, and context switching, in addition to actual work. Most IPC routines are built on locks, anyway.

slurgfest · on June 29, 2012

The overhead of locking itself (which nobody has even mentioned) is completely distinct from the impact of waiting on locks. It's totally irrelevant that you can have millions of lock operations per second if you actually have any shared state to protect. If you are actually USING locks then you have threads waiting on them.

thu · on June 29, 2012

The following situation happens with other data than just unicode. This seems an argument for static typing in general.

  The reason this approach is problematic is that it means the traceback for an unexpected UnicodeDecodeError or UnicodeEncodeError in a large Python 2.x code base almost never points you to the code that is broken. Instead, you have to trace the origins of the data in the failing operation, and try to figure out where the unexpected 8-bit or Unicode code string was introduced. By contrast, Python 3 is designed to fail fast in most situations[...]

qznc · on June 29, 2012

I agree that static typing would helps for such bugs.

However, not "in general". The general problem is that bogus data leads to an irreparable situation. To fix that bug, you have to find out who broke the data. Consider for example, a circle in a linked list. You would need quite an expressive type system to prevent that statically (dependent types might suffice? Theorem provers are the ultimate hammer).

DannoHung · on June 29, 2012

Uhh, how would you make a linked list with a circle in it with the Haskell List type?

qznc · on June 29, 2012

  makeCircle rst = rst ++ makeCircle rst

The thing is that a circular list is actually an infinite list in Haskell. ;)

DannoHung · on June 29, 2012

If you want to prevent that then you want to take general recursion out of your programming language and if you're gonna do that you might as well make it Turing incomplete.

These aren't "circles" per se, they are partially evaluated recursive structures. If you expand them you end up with evaluated list structures that are non-self referential.

ufo · on June 29, 2012

This is not a real circular list though, since it does not tie the knot as in dbaupp's comment.

dbaupp · on June 29, 2012

  a = 1:a

ufo · on June 29, 2012

This trick is often called "Tying the Knot", if anyone is interested.

zanny · on June 29, 2012

Different implementations for different use cases in different situations :P

gardentheory · on June 29, 2012

Well the lighter solution and more Pythonic solution is strong typing. If a Unicode string and a byte string could not be combined then the problem would not exist AFAIK. For example "a" + 1 is an error, why is u"a" + "s" not an error?

xerula · on June 29, 2012

But b"s" + "a" is an error in Py3.

masklinn · on June 29, 2012

> For example "a" + 1 is an error, why is u"a" + "s" not an error?

"a" + b"s" is an error in Python 3.

gardentheory · on June 29, 2012

Sorry, that was my point. Strengthening the types in python3 solves the problem without static typing being needed at all.

masklinn · on June 29, 2012

ok.

Static types solve it earlier though, especially for little-exercised code paths which could be forgotten or missed in testing.

qznc · on June 29, 2012

This just gives a different error message. It does not help to find the error source.

The programmer would probably just try to fix it via str(u"a")+"s" or u"a" + unicode("s").

slurgfest · on June 29, 2012

It is an argument for failing fast in general. Static typing isn't guaranteed to pick up every mistake at its origin, because far from every mistake is of the "int rather than string" variety.

Since you have brought up the general case, the origins of many bugs involve something sophisticated going wrong which would require you to encode much of your program (hopefully not too redundantly) into a Turing-complete type system. Rather than stuffing square pegs into round holes, you could just write the appropriate error handling code (or even just asserts).

earl · on June 29, 2012

Please:

   alias hncomment='fold -w 77 -s | sed "s/^/   /" | pbcopy'

   use:
   echo "blah blah blah" | hncomment
   command-v

Thus:

     The reason this approach is problematic is that it means the traceback for 
   an unexpected UnicodeDecodeError or UnicodeEncodeError in a large Python 2.x 
   code base almost never points you to the code that is broken. Instead, you 
   have to trace the origins of the data in the failing operation, and try to 
   figure out where the unexpected 8-bit or Unicode code string was introduced. 
   By contrast, Python 3 is designed to fail fast in most situations[...]

gvalkov · on June 29, 2012

It's just one of the little things, but I'm still very conflicted about print as a function. The reasoning is sound, but I just can't seem to get used to it. It feels especially cluttered and clumsy with new-style string templates:

    print('ver: {}'.format(', '.join(str(i) for i in sys.version_info)).upper())

versus:

    print 'ver: {}'.format(', '.join(str(i) for i in sys.version_info)).upper()

Despite the more consistent naming (ConfigParser -> configparser), simplified api (iteritems() -> items()) and all other syntactic improvements, I somehow still find 2.x code more enjoyable. Writing small scripts in Python has kind of lost its charm for me.

This, of course, is all very subjective and I'll probably grow over it in a few thousand lines of code. I hope you're all less sensitive to the little things that annoy you.

dmbaggett · on June 29, 2012

In the last few hours, this has been silently added to the post, and I cannot post a reply on that site:

Jumping on the internet to say that “they” (specifically, the people you’re not paying a cent to and who aren’t bothered by the GIL because it only penalises a programming style many of us consider ill advised in the first place) should “just fix it” (despite the serious risk of breaking user code that currently only works due to the coarse grained locking around each bytecode) is also always an option. Generally speaking, such pieces come across as “I have no idea how much engineering effort, now and in the future, is required to make this happen”.

Fine, then don't bother. But don't insult us or call foul for our pointing out the downside impact on us of this decision. Your assumption that we are naive rubes who don't know how to code is really, really wrong.

I have a very good idea how much engineering effort would be involved in fixing the GIL, and I am well aware that Python has involved many person-millennia of gratis work, and am appreciative of both. However, I still disagree with the Python devs' obviously entrenched position that fixing the GIL isn't worth the effort, and I will continue -- even when shouted down by the likes of you -- to advocate for the GIL's removal or some equivalently good solution. (As I said earlier, I am not opposed to STM solutions, but the current one performs unacceptably without special-purpose hardware.)

Why? Because I have a single, selfish interest in this. I depend heavily on Python now, and want the language to be better. I have written many lines of Python 2 code that rely on the threading primitives in the standard library. Perhaps it was foolish of me to expect that the threading model offered by the standard library, modeled on Java's threading primitives, would some day work in the same way as Java threads do in practice. Nonetheless, I am left with a real world problem: my CPU-bound threaded Python code does not scale well to multiple cores. I need the GIL fixed, or to rewrite my Python code, or to migrate to another language that supports the standard model of threading programming that real-world programmers have been using for several decades, and which has built-in support from all major operating systems. Or, sure, wait for STM to be ready for prime time and migrate my thread-based semantics to the new STM-based semantics.

The best path right for me right now is migration to Jython or IronPython. But then we are still unsupported orphans, living in the third world of Python 2.X.

I guess it comes down to: do you want people to actually use this language to write programs they want to write, or do you want Python to be an advocacy platform for "correct" programming? Python's pragmatism has always appealed to me, so the ivory tower reaction to the practical concerns around the GIL really seem dissonant. (And this is coming from an MIT AI Lab Lisp guy who would rather write everything in Lisp. But Lisp lacks Python's wonderful, high-quality third party libraries and developer-centric pragmatism regarding unit testing, documentation, etc.)

I know you are tired of hearing people bitch about the GIL, but, really: people write multithreaded programs. They should work as advertised, using native OS threading primitives and taking advantage of the native OS thread scheduler. Why does Python offer threading primitives if the language is not meant to support, from a practical standpoint, multithreaded programs?

old-gregg · on June 29, 2012

...the standard model of threading programming that real-world programmers have been using for several decades...

What standard? And whose "real world"? The need for threads has always been controversial even among OS kernel devs. UNIX/Linux/BSDs have twisted and non-trivial threading histories peppered with religious wars similar to this one. And which "several decades" are you talking about?

There is no such thing as "standard threading model". To some, a thread is just a flavor of fork() with a wrong parameter and plenty of "real world" programmers continue to believe that kernel-level threads is a hack. And please, do not make it sound like Python threads are useless. Far from it.

Python threads are not what you are used to. That's pretty much TL;DR of your comment.

...Python's pragmatism has always appealed to me, so the ivory tower reaction to the practical concerns around the GIL really seem dissonant...

I feel like they are being dragged into it though. The original motivation behind GIL support has always been a pragmatic one: removing GIL will make the entire codebase more complex, harder to hack on and will complicate and slow down the development/maintenance of the libraries. That's pretty pragmatic.

But a fairly vocal groups of users started to claim, similarly to you, that programming with threads is supposed to work like they expect it to work according to make-believe "threading standard", to which GIL supporters (correctly, IMO) replied that shared memory + locks is not the only/best approach to concurrency. It is easy to be offended by this answer but it doesn't invalidate their point.

dmbaggett · on June 29, 2012

Python threads are not what you are used to. That's pretty much TL;DR of your comment.

Correct. I am used to programming language threads that work the way computer scientists and programmers have typically described them -- for example, as in this (I hope uncontroversial) Wikipedia article:

http://en.wikipedia.org/wiki/Thread_(computer_science)

When I say "standard model of threading", I am not talking about nuances of call conventions to the underlying OS thread primitives. I am talking simply about running multiple streams of instructions, bytecodes, or other units of computation in parallel, within a single OS process.

haberman · on June 29, 2012

That Wikipedia article defines threads in terms of operating systems. Only one small part of that article concerns how threads are exposed to programming languages.

You can't talk about running multiple streams of instructions or bytecodes in parallel without talking about the nuances of how they share memory. Semantics of a multithreaded memory model are a highly "opinionated" thing -- there are lots of possible ways to define it, and the definition can have widespread effects on efficiency, ease of programming, and the guarantees that the runtime can provide. For example, an important aspect of a Python memory model would be that no Python program can SEGV the interpreter due to a race condition.

I recommend the following reading to get an appreciation for how much really goes into a memory model and how far from "simple" or "standard" it is:

  http://en.wikipedia.org/wiki/Memory_model_(computing)
  http://en.wikipedia.org/wiki/Java_Memory_Model
  http://www.kernel.org/doc/Documentation/memory-barriers.txt

Python is a lot harder to define a good memory model for than say Java, because in Python lists and dictionaries are primitive objects. If you say:

  x['A'] = 1

...that is a single operation that must not corrupt the dictionary, even if multiple concurrent threads are mutating it. In practice, this means that you need to either make every such mutation wrapped in a lock (which adds a lot of locking overhead) or you need to use lock-free data structures (which are still relatively experimental and architecture-specific).

dmbaggett · on June 29, 2012

I agree that a so-called dynamic language like Python is at something of a disadvantage because it must make atomicity guarantees that lower-level languages like C need not.

I still don't think it's reasonable to conclude that typical programmers are fine with their threads not really running in parallel, or that the GIL isn't worth bothering to fix, even though fixing it would be hard. In my original post yesterday, I pointed out that as the language footprint has grown, Python's disadvantage in this respect has increased: it is much harder to remove the GIL now than it was in, say, the 1.5 era when there actually was a (problematic) GIL removal patch.

We've gotten way off track, but the original point I was trying to make was that 1) the GIL really is a problem for not-purely-theoretical programs written by competent developers, and 2) that the 2->3 transition, by complicating the language and increasing the workload for the alternative implementations, has made it less likely than ever that the GIL problem would be resolved.

And, indeed, Nick explicitly confirmed this by saying the GIL is basically a dead issue for the CPython devs. His post made many good points about the merits of the 2->3 transition, and in particular pointed out some ways that 3 has reduced work for the alternative implementations, but I remain unconvinced overall. And not out of ignorance or incompetence, as he implied.

haberman · on June 29, 2012

I still think your position is unreasonable, because your inherent assumption is that the GIL is a "problem" that needs a "fix." This terminology is appropriate for a situation where the status quo could be improved without giving up any of the benefits of the current implementation. But this is not the case; removing the GIL in the way you advocate would add CPU and memory overhead that everyone would pay, even in the single-threaded case. And this is to say nothing of the practical problems of maintaining compatibility with existing C extensions.

The GIL is not a bug, it's a threading model. You wish the threading model was something else. You insist on your particular vision of an alternative threading model without acknowledging its downsides. You make no indication that you have actually considered or tried the alternative concurrency models that CPython does support, like multiprocessing, greenlets, or independent processes. You make no objective arguments for why your desired threading model is better than the ones that are currently available, except that you could avoid changing your code. You accuse Python of failing to live up to some accepted standard for what a "thread" should be, when in fact no such standard exists, especially for high-level, dynamically-typed languages like Python. If anything, newer languages are moving away from shared-state concurrency; see Erlang, Go, and Rust.

I don't think you have malicious intentions, but I urge you to reflect on what you are demanding and whether it is reasonable. What may look to you like "obvious" brokenness that demands an "obvious" fix is really a lot less clear-cut than you seem to think it is. I feel for the Python developers who have to deal with this complaining all the time.

comex · on June 30, 2012

To Python-level code, Python's threading model is pretty much exactly the same as that supported in all "fast" languages such as C and Java (even Go, ever pragmatic, has locks). Given that Jython already allows true multithreading and PyPy is trying to emulate it with STM, it's reasonable to see the GIL more as an implementation bug that won't be fixed for practical reasons than as a threading model... even if Python also supports alternate threading models that are perhaps better for most applications anyway (if strictly less powerful).

haberman · on June 30, 2012

> To Python-level code, Python's threading model is pretty much exactly the same as that supported in all "fast" languages such as C and Java

Yes, but Python also exposes higher-level operations like table manipulation as language primitives.

> Given that Jython already allows true multithreading

That may be, but as I mentioned this has an inherent cost, both in CPU and in memory. Therefore it is not a strict improvement over CPython, just a different direction.

Flimm · on June 29, 2012

"They should work as advertised, using native OS threading primitives and taking advantage of the native OS thread scheduler."

Just to be clear, Python does use native OS threading primitives, and it does make use of the native OS thread scheduler. Also, Python does support, from a practical standpoint, multithreaded programs, but only if the program is not CPU-bound. I think you could rewrite that paragraph to make the role of the GIL clearer.

dmbaggett · on June 29, 2012

Fair enough;the distinction is subtle. Python uses native system threads and of course such threads are scheduled by the OS. But in practice, the GIL allows only one thread to run at a time unless the GIL is subverted via C extensions or similar. I get why this is, given Python's architecture and the legacy of the GIL (which dates to the late 90s, when multicore machines were relatively rare and expensive).

But it's still not the case that multiple threads "normally" run in parallel, with the OS ensuring fairness, which is what (I think) most programmers would expect threads to do in a general-purpose threaded language.

slurgfest · on June 29, 2012

It simply isn't true that "the GIL allows only one thread to run at a time."

"Note that potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck."

The sky isn't falling, in the worst possible case you can still use Jython or whatever

dmbaggett · on June 29, 2012

Right, that's why I qualified my statement with ...unless the GIL is subverted via C extensions or similar.

The sky isn't falling, in the worst possible case you can still use Jython or whatever.

Sigh. I could port to C++ or whatever, too.

slurgfest · on June 29, 2012

The entire interpreter is written in C. Using facilities written in C, as documented, is not "subverting" anything. That's ridiculous hyperbole and it really doesn't help your credibility.

Flimm · on June 29, 2012

"unless the GIL is subverted via C extensions or similar."

I would hardly call I/O with Python built-in functions to be a subversion of the GIL.

cpeterso · on June 29, 2012

Python could avoid GIL problems with atomicity by implementing actor-based concurrency. Like Erlang or Rust (and unlike Go), actors could send deep copies of objects to each other using channels. Each actor would have its own GIL. To maintain compatibility, C extensions would only run on the primordial actor thread. After some testing, C extensions could opt-in to be multi-actor safe.

If Guido doesn't want to implement actor concurrency within the Python interpreter, someone could write their own Python host that allows multiple, independent instances of the Python embedded interpreter. The host would implement a C extension that allows interpreter instances to send deep copies of objects to each other.

toyg · on June 29, 2012

Unreadable on iPhone, what's up with its stylesheet? Safari won't even zoom...?

hynek · on June 29, 2012

Yeah, it seems to be the default RTFD style…try: http://www.instapaper.com/text?u=http%3A%2F%2Fncoghlan_devs-...

wololo · on June 30, 2012

I wish python 3 would make default parameters immutable. It seems to go against the "simple", "explicit" python philosophy, not to mention that (it seems) most people tend to learn about the feature through debugging.

description: http://effbot.org/zone/default-values.htm

hynek · on June 30, 2012

Yeah sometimes I wish there were a warning whenever someone defines a function with [] as def arg.

wololo · on June 30, 2012

I know pylint and pychecker do this, but not pep8. Opinions on those?

ash · on June 29, 2012

Why can't I use Python 2 modules from Python 3 program? Is there any effort to make it possible?

hynek · on June 29, 2012

There is no such thing as "Python 2" or "Python 3" modules. There are just modules that will go belly up because they are incompatible with one or another.

I presume you mean that you can’t access your old Python 2 modules from a new Python 3 installation. That’s just because Python installations usually don’t share their modules (i.e. site-packages). You can try to install them using Python 3 (usually just python3 setup.py install) and see if they work or not.

lrem · on June 29, 2012

There's a deeper question here: why doesn't the interpreter contain a compatibility layer for the old code? There's no really good reason for the old modules to go belly up in the new interpreter. At least no other than the man months needed to implement that layer. But then, that probably would still be orders of magnitude less than migrating all the libraries that keep people from migrating...

ash · on June 29, 2012

Thanks for a better wording of my question!

Even PyPy project doesn't have a goal of running py2-compatible and py3-compatible modules inside one interpreter:

> At the end of the project, it will be possible to decide at translation time whether to build an interpreter which supports Python 2.7 or Python 3.2 and both versions will be nightly tested and available from nightly builds.

http://pypy.org/py3donate.html

hynek · on June 29, 2012

For one, they didn’t start out with the baggage of nearly 20 years but started building a JITed VM which was polyglot by design.

hynek · on June 29, 2012

That’s essentially what __future__ imports are for – things that go further would be simply too much work and Python 3 was about shedding weight, not shipping essentially two interpreters. (while they are very similar from outside, the innards couldn’t be more different)

lrem · on June 29, 2012

Cool. So this way porting things to Py3k got easier. That perfectly explains why all the critical libs have been ported fast and no one remembers about Python 2 anymore, doesn't it?

Shedding weight is sometimes a step in a good direction. But we need to draw a line at some point. As the history of Python 3 adoption shows, that point maybe was not optimal.

Edit: forgot an angle:

Also, __future__ is fixing the incompatibility of what people are using with something that they can't use. I'm not sure if it's the right problem to solve.

hynek · on June 29, 2012

As it has been pointed out for several times: the adoption of Python 3 is doing just fine. It has never been expected that everybody will be using Python 3 today.

But look at all those current efforts at Canonical, Django, Twisted…it’s not like nothing is happening and we expect a knot to burst. Far from it! Porting started slowly and has gained a momentum by now that has surprised myself. It’s not like we changed the language completely like perl6 did.

In the result, we’ll have a better Python for it.

ash · on June 29, 2012

Well, you can't do this:

  from __future__ import python3

(via http://www.aaronsw.com/weblog/python3)

smeg · on June 29, 2012

Python3 and Perl6 can both fuck off.

I look forward to the day PyPy is considered the real Python. Look at PyPy's homepage (http://pypy.org/), doesn't even mention Unicode as a significant feature. Instead it talks about speed, security, concurrency, and compatibility with the current real Python (2.7.2) - all the things real Python programmers care about and expect the Python developers to focus on.

PyPy may be some way off but I want to find the developers and hug them for setting the right vision and trying.

I just donated $50 and if you hate Python3 you should too.

sho_hn · on June 29, 2012

Or if you really like Python 3 (as I do!) you can donate those $50 to Python 3 support in PyPy: http://pypy.org/py3donate.html

lclarkmichalek · on June 29, 2012

Any reason they can "fuck off"? Anyway, you're incorrect about pypy. "Real" python developers will likely not run their code in a sandboxed environment, so PyPy's security advantages are minimal, added to the fact that your sandbox is only as good as its configuration. The concurrency is interesting, but there are numerous other solutions available, and that alone makes it unlikely that the de facto implementation of the language would include one. Claiming that PyPy is better on compatibility than CPython is ridiculous; last time I looked, anything using a C extension had to be rewritten to use ctypes. On speed, yes PyPy is impressive. But in python, anything that is particularly CPU intensive should probably be written as a C extension and PyPy takes a hit when using ctypes iirc. In general, I don't see the point of people trying to optimize generic python code. Most code never gets profiled, and most code isn't a problem. At least, not a problem enough to switch interpreter. Definitely not a problem enough to ignore the massive problem that is python 2.x's unicode implementation.

Deestan · on June 29, 2012

> all the things real Python programmers care about

Guess I'm not a real programmer, then, since I find proper Unicode handling to be a welcome feature, and find the implementation in Python2 is painful.

msz-g-w · on June 29, 2012

I think you should take a walk, and then come back and read this ridiculous stuff you wrote.

dbaupp · on June 30, 2012

It's easy to dismiss nice builtin Unicode support if you are an English speaker in an English speaking country where all applications you write can conveniently ignore other alphabets without meeting any complaints or problems.

sutro · on June 29, 2012

Unconvinced.