
A New Chapter for PyPy - zbb
https://morepypy.blogspot.com/2020/08/a-new-chapter-for-pypy.html
======
pansa2
From pmatti’s comment at
[https://old.reddit.com/r/Python/comments/i8ksfc/a_new_chapte...](https://old.reddit.com/r/Python/comments/i8ksfc/a_new_chapter_for_pypy/):

> _PyPy core dev here. PyPy will always remain free and open source. No
> worries there. The question we and many other open source projects are
> trying to deal with is how to fund our developers under that constraint. We
> felt that it was time to explore other alternatives._

------
dwheeler
Weird. There is no information on what they are changing to, only what they
are changing from. I would like to see what they plan to do instead.

~~~
fijal
That will likely be the contents of the next blog post

------
apendleton
I've had great success with using pypy in production, but found that it was
sometimes tricky to figure out how to fully leverage the performance wins on
the table, especially if the regular way you might do a thing in cpython
depended on a C extension that was either unsupported on pypy, or supported
but slow. I could imagine a real demand for people with the specialized
expertise necessary to do this for commercial customers, so I wonder if the
plan here is for pypy to be funded by this kind of consulting, and maybe exist
under an entity that can make that happen (i.e., that's allowed to do
for-profit consulting work)?

~~~
acdha
That’s what I’d guess at, something like Red Hat: you pay us for support &
consulting for OSS work you don’t want to have in-house developers for.

------
humanistbot
So I have no direct knowledge of this case, but this seems to be an issue over
fiscal sponsorship. Like a lot of smaller and younger open source projects,
PyPy had delegated financial and accounting responsibility to the Software
Freedom Conservancy, who acts as their fiscal sponsor. SFC is a registered
non-profit with professional accountants, taking donations on behalf of lots
of projects including PyPy and distributing them to the project on request from
the project's leaders. This is a common model in open source; there are
similar organizations that do this, like Software in the Public Interest,
NumFOCUS, or the Open Source Collective.

One issue that happens is when fiscal sponsors take a cut on all
revenue/donations raised by the project in exchange for their services, which
is 10% for SFC [1]. Fiscal sponsors also have policies around what they will
reimburse, which can become as stringent as corporate travel policies [2].
I can totally understand a project getting frustrated at their fiscal sponsor
and wanting to either start their own and do it all themselves or find another
sponsor.

[1]
[https://sfconservancy.org/projects/apply/](https://sfconservancy.org/projects/apply/)

[2]
[https://sfconservancy.org/projects/policies/conservancy-trav...](https://sfconservancy.org/projects/policies/conservancy-travel-policy.html)

~~~
jamesdutc
The admin fees you refer to are typically assessed in the context of large,
institutional grants. These are grants from private organisations like the
Gordon and Betty Moore Foundation, the Alfred P. Sloan Foundation, the Chan
Zuckerberg Initiative, or government agencies like the US National Science
Foundation. These grants have significant financial accounting requirements.
There are also many other legal or operational costs associated with these
grants.

When projects run these grants through parent institutions like universities,
the typical admin fee is >40%. In some cases, it can be as high as 60%. Many
projects are eager to enter fiscal sponsorship agreements with organisations
like NumFOCUS, because so much more of their grant money goes to funding the
work.
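A rough worked example of the difference, treating each fee as a flat percentage of the grant total (a simplification; real overhead rates are often computed differently). The $100,000 grant and the function name `net_funding` are hypothetical; the 10% figure is the SFC rate cited upthread, and 40% and 60% are the university figures above.

```python
def net_funding(grant_dollars: int, fee_percent: int) -> int:
    """Dollars left for the project after a flat percentage fee.

    Assumption: the fee is taken off the top of the whole grant.
    Integer arithmetic avoids floating-point rounding noise.
    """
    return grant_dollars * (100 - fee_percent) // 100


# Hypothetical grant size, chosen only to make the percentages easy to read.
GRANT = 100_000

for label, fee in [("10% fee", 10), ("40% fee", 40), ("60% fee", 60)]:
    print(f"{label}: ${net_funding(GRANT, fee):,} reaches the project")
```

Under those assumptions, the gap between a 10% and a 60% fee on the same grant is half the grant.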

In the case of NumFOCUS, admin fees do not adequately cover the staff
requirements to manage these grants. NumFOCUS "loses money" when servicing the
administrative needs of these grants & it takes this responsibility onto
itself solely for the betterment of the projects.

Rather than assess administrative fees similar to universities, NumFOCUS uses
its other fundraising—corporate donations, event (i.e., PyData) sponsorship,
individual giving—to finance its operations.

source:

- I serve on the NumFOCUS board of directors as its co-chair.

- I presented on this topic last year at the NumFOCUS Annual Summit to an
audience of core developers from projects like Julia, Jupyter, Pandas, NumPy,
AstroPy, &c.

- NumFOCUS budgets are public, and all of the above information can be
corroborated from materials published on
[https://numfocus.org/](https://numfocus.org/)

------
riffraff
Is anyone here using pypy for their daily job? What do you use it for?

~~~
rciorba
I did. Inherited a legacy web app that did stupid things in Python in memory
(basically search and aggregation).

I realized a rewrite was the best course of action, but in the meantime the
old thing had to stay up and running, and as the volume of data increased, it
started to run into HTTP timeouts, as requests often took longer than 2
minutes.

I moved the thing to PyPy, and got about a 30% speedup from that. Only one lib
had to be replaced with a pure python alternative, as it was using a C
extension.

It bought me enough time to finish the new implementation (duplicate the data
in Elasticsearch, hey presto from over a minute to about a second to get
results).

For some workloads PyPy's JIT can do wonders.
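For anyone who wants to check this on their own workload, here is a minimal sketch of the kind of pure-Python, in-memory aggregation loop where a tracing JIT tends to help. It is not the app described above; the function names, record shape, and sizes are all made up for illustration. Run the same file under CPython and PyPy and compare the timings yourself.

```python
import platform
import time
from collections import defaultdict


def aggregate(records):
    """Group (key, value) pairs and sum the values per key in pure Python."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return dict(totals)


def timed_run(n=200_000, groups=100):
    """Build n synthetic records, aggregate them, and time only the aggregation."""
    records = [(i % groups, i) for i in range(n)]
    start = time.perf_counter()
    totals = aggregate(records)
    elapsed = time.perf_counter() - start
    # python_implementation() reports "CPython" or "PyPy", among others.
    return platform.python_implementation(), elapsed, totals


if __name__ == "__main__":
    impl, elapsed, totals = timed_run()
    print(f"{impl}: aggregated into {len(totals)} groups in {elapsed:.4f}s")
```

A single short run like this mostly measures warm-up on PyPy, so repeat the call a few times before drawing conclusions.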

~~~
PaulHoule
I parse big XML and similarly structured files, convert them into RDF, puff
them up into a (still RDF but with a lot of blank nodes) hypergraph so I can
load the content into a single database and be able to trace that these two
facts are related and come from this part of document A and that part of
document B.

I have document parsing and SPARQL queries that can take a few minutes that
I'd like to run frequently so I can keep all parts of the system up to date.

I've only benchmarked it a bit, but I found I got approximately the five times
speed-up that PyPy promised. This is with PyPy based on Python 3.6. I think
PyPy is switching to cffi as the way to connect to C code so most native code
"just works" now.

I had to backport my code from Python 3.8; Python 3.6 lacks contextvars, but
there is a polyfill for that, otherwise there was no problem.

I stayed away from PyPy for a long time because it was tied to Python 3.5
which was busted in various ways. One of those was that the filesystem path
objects were half-implemented: you should have been able to pass them into
anything from the stdlib that expected a string path, and at that time you
couldn't. Little accidents like that can slow down the adoption of a
technology like PyPy.
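To illustrate the path-object point, a small sketch of what works on Python 3.6 and later, where `os.fspath` and path-like support in the stdlib arrived (PEP 519). The helper name `roundtrip` and the file name are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def roundtrip(text: str) -> str:
    """Write text via a Path object and read it back through a plain str path."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "note.txt"
        # open() accepts the Path directly; no str(path) conversion needed.
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(text)
        # os.path functions accept path-like objects as well.
        assert os.path.exists(path)
        # os.fspath() gives the underlying string for APIs that insist on one.
        as_string = os.fspath(path)
        with open(as_string, encoding="utf-8") as fh:
            return fh.read()


print(roundtrip("pypy"))
```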

~~~
rciorba
> I think PyPy is switching to cffi as the way to connect to C code so most
> native code "just works" now.

As far as I know extensions need to be written for cffi specifically.

cffi is a newer way of writing C extensions, developed by the PyPy project. It
was designed to have a smaller and cleaner interface to let you call C code from
Python. Here's Armin Rigo talking about it at EuroPython:
[https://www.youtube.com/watch?v=ejUzVcvTLgI](https://www.youtube.com/watch?v=ejUzVcvTLgI)

The CPython way of writing extensions is documented here:
[https://docs.python.org/3/extending/extending.html](https://docs.python.org/3/extending/extending.html)
It seems to require you to deal with the internals of the CPython interpreter
(PyObject structs, reference counting, etc.).

I know PyPy has some support for CPython extensions, but it has to emulate
some internals and it's slower as a result.
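For illustration, a minimal sketch of the cffi style in ABI mode: you declare the C signature as a string and call the function, with no PyObject handling on the Python side. It assumes cffi is installed and a POSIX system where `ffi.dlopen(None)` exposes the C library. The helper name `c_strlen` and the pure-Python fallback are made up for this example, so the snippet still runs where cffi is missing.

```python
try:
    from cffi import FFI

    _ffi = FFI()
    # Declare the C function exactly as it appears in the header.
    _ffi.cdef("size_t strlen(const char *s);")
    # dlopen(None) loads the standard C library on POSIX systems.
    _libc = _ffi.dlopen(None)
except (ImportError, OSError):
    # Assumption for this sketch: degrade to pure Python if cffi or libc
    # is unavailable (e.g. cffi not installed, or running on Windows).
    _libc = None


def c_strlen(text: str) -> int:
    """Return the length in bytes of text up to the first NUL, like C strlen."""
    data = text.encode("utf-8")
    if _libc is not None:
        # cffi converts the bytes object to a const char * for the call.
        return _libc.strlen(data)
    # Pure-Python fallback with the same stop-at-NUL behaviour.
    return len(data.split(b"\x00", 1)[0])


print(c_strlen("pypy"))
```

A real extension would more likely use cffi's API mode with a build step; ABI mode is shown here only because it fits in a few lines.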

------
pyuser583
Reminds me of the tweets by Jeff and MacKenzie Bezos announcing their divorce:
super happy and optimistic, but containing no information at all.

~~~
unexpected
I mean, why should the general public have information about their divorce at
all?

~~~
smitty1e
"We love dirty laundry." -- Don Henley

------
sgillen
I’m not familiar with the situation, but I guess this means the pypy team can
try to monetize the project somehow now?

------
satisfaction
They likely mean that there will always be a project called "PyPy" that will
be free software, but they will move to an open core model and introduce
"PyPyPro", where all the cutting edge development will take place. PyPy may or
may not be crippled.

------
mwcampbell
I think funding of ambitious open-source projects such as PyPy will continue
to be a problem as long as these projects use pushover (a.k.a. permissive)
licenses such as the MIT license. It's time for these developers to take a
stand for what is fair, by relicensing to a copyleft license such as Parity
[1] and selling proprietary licenses to companies that can and should pay.

[1]: [https://paritylicense.com/](https://paritylicense.com/) (not affiliated,
I just think it's a good and fair license)

~~~
chriswarbo
I've not heard of this license before, and a search for "parity" comes up
empty on
[https://www.gnu.org/licenses/license-list.en.html](https://www.gnu.org/licenses/license-list.en.html)
and
[https://opensource.org/licenses/alphabetical](https://opensource.org/licenses/alphabetical)
so it's not Free Software or Open Source.

It might be free software or open source (i.e. in spirit), but I don't have
the patience to read potentially-dubious licenses (that's what the FSF and OSI
are for!)

~~~
infogulch
I'll add in
[https://choosealicense.com/appendix/](https://choosealicense.com/appendix/)
as a place where it would be nice for Parity License to be documented for
comparison reasons.

------
Animats
_PyPy will remain a free and open source project, but the community's
structure and organizational underpinnings will be changing and the PyPy
community will be exploring options outside of the charitable realm for its
next phase of growth ("charitable" in the legal sense -- PyPy will remain a
community project)._

In other words, volunteers can contribute but others get to monetize?

_People like you helping people like us help ourselves_ - Processed World.

~~~
no_wizard
In essence these projects live and die with funding. Donations just aren’t
enough to pay the bills for full time developers, and there isn’t any real
alternative.

I wish there was more corporate giving to foundations that could handle this
sort of thing but we never built that culture in software unfortunately.

I don’t think it’s fair to frame this negatively at all; it really misses the
nuance of these situations.

~~~
R0b0t1
I do think he captured it quite well -- that they are leaving it as a
community project but only directly monetizing it for some people feels, well,
wrong? It might be more neutral if they allocate funds for bounties and let
anyone claim them, with the core developers obviously being able to address
most bounties the fastest.

~~~
__s
No need for it to feel wrong

There are a few people who basically run PyPy development. They can do as they
please. It's open source, so if you're so against it, you can make a "nobody
profits" fork. Most outside contributions to open source projects are made by
people who wanted to scratch an itch & then let the existing maintainers
maintain that improvement. Their reward is the great software. This is still
there so long as PyPy commits to remaining freely available

~~~
R0b0t1
Commercializing the project proper seems incorrect. If they want to take on
consulting due to their experience that seems far better.

I never said nobody can make money.

------
fxtentacle
I'm surprised that they didn't mention numba.jit, which solves the same basic
problem as pypy (faster numpy calculations) but in a different way that is
easier to mix with existing python frameworks.

For example, TensorFlow with numba preprocessing is easy, just install both
packages and it'll work. TensorFlow with pypy requires a 5 hour compile and 40
GB of temporary storage. Plus some source code fiddling inside TF, if I
remember correctly.

Even as an open source project, pypy should honestly consider who they're
competing with for users and funding.

~~~
oefrha
PyPy is targeted at general purpose workloads, NumPy support is totally an
afterthought, so the basic problem it set out to solve was/is definitely not
faster numpy calculations. Numba on the other hand is way more targeted at
numeric workloads.

Besides, I don't see why they should mention any other project in a post
announcing their departure from the Conservancy. The only surprising thing is
no mention of the funding model they're moving to, other than a rather vague
hint, "exploring options outside of the charitable realm".

~~~
fxtentacle
Based on your comment, I would guess that you never tried out numba. Of
course, it can also do general python and loop optimizations. And in my
experience, numba worked for every case where I couldn't get pypy to work.

And I stand by my opinion that that is something that the pypy developers
should consider: is this actually usable as a solution to practical problems?
Or is there something else that people use instead? If so, why? Analyzing your
competition is usually a good way to learn about your own strengths and
weaknesses.

~~~
oefrha
> Based on your comment, I would guess that you never tried out numba.

Well, you guessed wrong.

> it can also do general python and loop optimizations.

Yes, it can be used in general purpose workloads, with varying degrees of
success. But its main purpose is made abundantly clear:

 _Accelerate Python Functions_

 _Numba translates Python functions to optimized machine code at runtime using
the industry-standard LLVM compiler library. Numba-compiled __numerical
algorithms__ in Python can approach the speeds of C or FORTRAN._

 _Built for Scientific Computing_

 _..._

[https://numba.pydata.org/](https://numba.pydata.org/)

> ... Analyzing your competition is usually a good way to learn about your own
> strengths and weaknesses.

Except this is an announcement on their funding situation, so strengths and
weaknesses are completely irrelevant, unless Numba has a particularly
interesting funding model. (The funding model is government grants and
corporate sponsorship, so, not particularly interesting.)

~~~
wrmsr
> Built for Scientific Computing

I mean the thing's called 'numba' lol.

I always liken Pypy to HotSpot in that to this day the numerical performance
of the latter isn't spectacular and nobody really cares - it's built to handle
the harder job of making vast tangled codebases of non-numerical application
code run fast, not just tight math loops which are already handled perfectly
well by other more specialized tools.
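As a concrete picture of the "tight math loop" case, here is a minimal sketch using Numba's `njit` decorator. It assumes Numba is installed; the no-op fallback decorator and the function name `sum_of_squares` are made up for this example, so the snippet still runs (uncompiled) where Numba is missing.

```python
try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, run the plain Python version.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    """Sum i*i for i in range(n) with an explicit loop.

    With Numba available, this is compiled to machine code on first call.
    """
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(10))
```

The decorator opts a single function into compilation, whereas PyPy's approach is to run the whole program on a different interpreter.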

------
rurban
Weird that Simon Cross completely lost us here. He normally can write
meaningful stuff. See e.g.
[http://hodgestar.za.net/blog/](http://hodgestar.za.net/blog/)

"All good things must come to an end". No, they don't.

~~~
rurban
On the other hand, this is a good explanation:
[https://sfconservancy.org/news/2020/aug/12/pypy-transition/](https://sfconservancy.org/news/2020/aug/12/pypy-transition/)

~~~
asah
$220,000 over nine years for multiple developers. Ouch.

