
A green threading library with true concurrency for Python - lukes386
https://github.com/mmirman/conpig
======
jacob019
I don't really get what advantage this gives me beyond using gevent. There's
still no parallelism. The readme describes it as "A solution to concurrency,"
but I already get that using gevent. Is it just to improve communication
between green threads?

~~~
jerf
It appears to add some amount of Python-bytecode-level preemption to gevent,
which should let you avoid some of the pathological cases of cooperative
scheduling. Said pathological cases are only a matter of scale... if your
program becomes large enough, you _will_ hit them, eventually.

That said, with no offense intended to mirman, I'd really hesitate before
using this for anything serious enough to reach that scale in the first place.
Gevent, frankly, already visibly pushes Python to its limits (and occasionally
a bit beyond); trying to tack preemption onto an environment that isn't
fundamentally expecting it would scare me another notch.
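
I don't know what mechanism conpig actually uses, but a stdlib-only sketch of the "bytecode-level preemption" idea is a trace hook that interrupts a function after a budget of line events. Everything here (`run_with_budget`, `Preempted`) is a hypothetical illustration, not conpig's API:

```python
import sys

class Preempted(Exception):
    """Raised when a function exceeds its execution budget."""

def run_with_budget(fn, max_lines):
    # Count line-level trace events in fn (and anything it calls);
    # raise Preempted once the budget is exhausted.
    count = 0

    def tracer(frame, event, arg):
        nonlocal count
        if event == "line":
            count += 1
            if count > max_lines:
                raise Preempted("budget of %d lines exhausted" % max_lines)
        return tracer  # keep tracing nested frames

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        return fn()
    finally:
        sys.settrace(old)
```

A real implementation would presumably switch to another green thread at the budget rather than raise, and `sys.settrace` makes everything drastically slower, which is part of why layering preemption onto CPython is hard.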

~~~
hogu
What are the pathological cases one runs into with gevent?

~~~
jerf
With a cooperative scheduler, you _will_, sooner or later, experience some
form of starvation. The most obvious case is a task that loops forever, but
less obvious situations pop up too: calls you thought were handled by the
event loop turn out to be blocking, and add up once you start making them at
scale; a set of tasks turns out to yield far less often than you thought, and
the system's latency balloons once you have too many of them; and all kinds of
similar manifestations. There's also the inability to build watchdogs that
supervise other tasks; if something does go spinning off into infinity,
nothing else gets to run to kill it.

You can hack around many of them, but you eventually hit a wall, and the
effort of the hack increases rapidly.
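
A minimal sketch of the problem, using plain generators as cooperative tasks (no gevent required, and the scheduler here is illustrative, not any library's): a task that holds the scheduler too long delays everyone else, and one that never yields would starve them forever.

```python
from collections import deque

log = []

def polite(name):
    # Yields after every unit of work: a good citizen.
    for i in range(3):
        log.append((name, i))
        yield

def greedy(name):
    # Does all its work before yielding once: everyone else waits.
    for i in range(3):
        log.append((name, i))
    yield

def run(tasks):
    # Round-robin over ready tasks; each task runs until it yields.
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)
        except StopIteration:
            pass  # task finished

run([polite("p"), greedy("g")])
# log == [("p", 0), ("g", 0), ("g", 1), ("g", 2), ("p", 1), ("p", 2)]
# "g" monopolizes the scheduler for its whole slice; a task that
# *never* yields (while True: pass) would hang the loop entirely.
```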

Note that I didn't really invoke gevent specifically in this reply; this is
just about cooperative scheduling. There's a reason we've all but completely
abandoned it at the OS level for things we'd call "computers" (as opposed to
"embedded systems" or "microcontrollers", etc.). I tend to consider
cooperative scheduling an enormous red flag in any system that uses it... and
yes, that completely and fully includes the currently-popular runtimes that
use it.

~~~
anonymoushn
It sounds like this sort of problem comes from using a cooperative scheduler
to implement concurrency of arbitrary routines rather than control flow. I
haven't been in a situation in which it would even be possible for something
to yield less often than I expect, because I expect it to run until it yields.
Similarly I don't often find that subroutines return too infrequently because
I expect them to run until they return.

This library is probably nice for the places I would otherwise use threads.

~~~
jerf
You _will_ eventually, at scale, be wrong about that. Having full and correct
knowledge, in advance of running it, of exactly how long your code takes to
run (enough to do this sort of scheduling correctly by hand) is basically
equivalent to claiming you never need to profile code because you already know
exactly how long it takes. And it is well established, to my satisfaction,
that even absolute, total experts in a field are still often surprised by what
actually comes out of a profiler, even in code strictly within their domains.
You may well be right _most_ of the time... but that is all you can hope for.

~~~
anonymoushn
If it takes more than 16.67ms to run a frame's worth of update-and-draw, then
it does, and replacing "wake up every in-game entity that asked to wake up
this frame" with "let a preemptive scheduler manage ~10,000 threads that want
to wake up, do almost nothing, and then sleep for k frames, while some master
thread waits on a latch until they're all done" seems unlikely to make it any
faster. If the logic my server must perform to handle a request is expensive,
then it is, and replacing an event loop with a single-threaded preemptive
scheduler will not increase throughput.

I'm not sure why it is difficult to do this sort of thing correctly. The
scheduler does next to nothing in the "server with connections managed in
coroutines" case and probably makes matters worse in the "storing game state
in execution states" case. It could have a positive impact in the server
application if one routine is secretly going to crash or run forever, in the
sense that the other routines will continue running while the problematic
feature is fenced off or fixed.
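
The "wake up every in-game entity that asked to wake up this frame" pattern can be sketched with a heap keyed by frame number. The names here (`FrameScheduler`, `wake_in`, `tick`) are illustrative, not from any particular engine:

```python
import heapq

class FrameScheduler:
    # Entities ask to be woken some number of frames from now; each
    # tick we pop everything due at the current frame and run it.
    def __init__(self):
        self.frame = 0
        self._queue = []   # heap of (wake_frame, seq, callback)
        self._seq = 0      # tie-breaker so callbacks are never compared

    def wake_in(self, frames, callback):
        heapq.heappush(self._queue, (self.frame + frames, self._seq, callback))
        self._seq += 1

    def tick(self):
        self.frame += 1
        while self._queue and self._queue[0][0] <= self.frame:
            _, _, cb = heapq.heappop(self._queue)
            cb()

sched = FrameScheduler()
fired = []
sched.wake_in(2, lambda: fired.append("a"))
sched.wake_in(1, lambda: fired.append("b"))
sched.tick()   # frame 1: "b" is due
sched.tick()   # frame 2: "a" is due
# fired == ["b", "a"]
```

The scheduler does O(log n) work per wakeup and nothing at all for sleeping entities, which is the point being made: 10,000 preemptible threads parked on a latch buy you nothing here.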

------
pekk
Since this is a very new project, it's a good time to suggest that you follow
PEP 8 (e.g., no 'waitAll'): this would be widely appreciated, and it isn't
easy to fix later on.

~~~
mirman
waitAll gone. Before I put it in any package managers it will get style
guided. This is currently on version -1.0.0.

~~~
derleth
> version -1.0.0.

Version negative one? I don't think I've ever seen that before. Usually, the
very earliest versions of software are numbered like 0.0.1 or something like
that.

~~~
mirman
\- here actually negates the ordering of the numbers in the list. -1.0.0 ===
0.0.1.

~~~
derleth
> \- here actually negates the ordering of the numbers in the list. -1.0.0 ===
> 0.0.1.

Well, I've certainly never seen that before.

~~~
ceol
It should be written 0.0.1[::-1]

------
fusiongyro
> Conpig threads still can only run on one core of a processor.

The disillusionment caused by having so many options for non-parallel
"concurrency" in Python is, I believe, feeding the high defection rate from
Python to Go.

~~~
mirman
The lack of options for parallel (and non-parallel) concurrency is, along with
other things, feeding the high defection rate to many other languages.

~~~
pekk
We've had processes, threads and greenlets for a while now... if anything the
problem isn't that there are no options, but too many options that require
understanding to choose and apply.

Many of the people complaining about this issue don't have a demonstrated
problem and could try any simple approach first (if the point is not just to
slam Python in favor of something else, from the beginning).

~~~
MetaCosm
This type of response is why I gave up on Python entirely. Not to pile on you,
pekk, it isn't your fault, but it's a tone... defensive apologist... "first of
all there is no problem, and if there were a problem... it is that Python is
too awesome."

As someone who has had to ship stuff using multiprocessing and gevent to
actually meet real-world scaling needs, integrating them with C code and
communicating with a C++ application via ZMQ (in-process, by sharing the
object)... the sad fact is that once we started tackling really hard problems
in Python that weren't pre-solved by a nice library, all those early
advantages fell away, and we craved the blessed simplicity of C++ (note:
sarcasm, C++ isn't simple, but it was far simpler than the Frankenstein's
monster we built).

------
rektide
The resounding question in my mind is why this, when there is Stackless
Python? What's better about this greenthreading impl?
[http://www.stackless.com/](http://www.stackless.com/)

~~~
mirman
This is a library that can be used to supplement any implementation.

If I'd had the option to switch us to stackless easily, and I could guarantee
it was as fast, worked with all the libraries, and was as stable, I probably
wouldn't have written this. I imagine that there are a lot of people in the
same boat, where switching interpreters isn't really an option.

------
mietek
> Conpig threads still can only run on one core of a processor.

This isn't true concurrency. Scaling to 20 million requests per second over 40
cores on a single machine is true concurrency.

~~~
mercurial
Single-core concurrency is still concurrency (you have different computations
in flight at the same time, even if they are not executing at the same
instant). What it's not is parallelism.

~~~
stonemetal
Concurrent means at the same time. You can't have things happening
concurrently but not at the same time.

~~~
silverlake
This may help explain the difference:
[http://blog.golang.org/concurrency-is-not-parallelism](http://blog.golang.org/concurrency-is-not-parallelism)

~~~
stonemetal
Should I have thrown a "literally" in there? Concurrent literally means "at
the same time." Parallel literally means "non-intersecting." I understand some
dumbass has been, or is, trying to co-opt the language; that doesn't mean I
have to like it.

If one of them should mean one thing and the other the other, why not use the
word that literally means "at the same time" for the concept of things
happening at the same time, and the one that means "non-intersecting" for
threads of execution that run beside each other without touching, without lots
of pain and suffering? Sorry, rant bit off.

~~~
mercurial
> I understand some dumbass has been, or is, trying to co-opt the language;
> that doesn't mean I have to like it.

That's called lingo, or jargon (in this case, programmer lingo): making up
words, or giving an existing word a different definition within a particular
context.

------
erikb
Where's the library? I see about 50 lines of code that don't go far beyond a
hello world of Gevent and some strangely written nose tests.

