Top Python libraries of 2016 (tryolabs.com)
321 points by jonbaer on Dec 22, 2016 | 40 comments



"According to the Sanic author’s benchmark, uvloop could power this beast to handle more than 33k requests/s which is just insane (and faster than node.js)."

I keep looking for excuses to focus on Python instead of Go and claims like this one make me happy. However, when I check the benchmarks I get confused:

http://www.techempower.com/benchmarks/#section=data-r13&hw=p...

Of course this looks amazing:

https://magic.io/blog/uvloop-blazing-fast-python-networking/

But, even if in the request/second area Python might be doing great now, won't there be problems down the line with the GIL? That is, isn't async/await an island of good performance?


So rarely is speed actually needed. If you need the speed though, just write a library in C/Fortran/Go and call it from Python. That way you can do all your business logic where it's easy and very fast (Python) and still have room for crazy fast (Go, C) if you need it. Most real applications bottleneck in the database though, not the language.
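A minimal sketch of what "write it in C and call it from Python" can look like, using stdlib ctypes against the system C math library (assuming a platform where `find_library("m")` resolves; the library name/path varies):

```python
import ctypes
import ctypes.util

# Load the system C math library (name resolution is platform-dependent)
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

# The call crosses into compiled C; Python is just the glue
print(libm.sqrt(2.0))
```

The same pattern scales up: keep the business logic in Python and push only the hot loop behind a C (or cffi/Cython) boundary.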


> won't there be problems down the line with the GIL?

If you want parallelism in pure Python code, you don't use multiple threads.
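i.e., you reach for multiprocessing, which sidesteps the GIL by giving each worker its own interpreter. A quick sketch:

```python
from multiprocessing import Pool

def cpu_bound(n):
    # pure-Python CPU work; threads would serialize on the GIL here
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # four separate processes, each with its own GIL
    with Pool(4) as pool:
        print(pool.map(cpu_bound, [10000, 20000, 30000, 40000]))
```

The trade-off is that data crossing process boundaries gets pickled, so this pays off for chunky CPU-bound tasks, not fine-grained shared state.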


Unless you're on Jython or the parallelism is for I/O.


I/O isn't pure code.


That's certainly one way to interpret "pure Python" but many people use that phrase to describe code that is using only builtins and standard library as opposed to community-supplied C extensions.


I honestly disagree, but I do recognize how one could see it that way.


Yeah, it's a bit of a leaky abstraction since many modules in the standard library are implemented in C. And the interpreter itself.


The TechEmpower benchmarks should be taken with a grain of salt because, in the specific case of Python, and last year at least, they didn't seem to take into account real-life deployments (i.e., they're benchmarking raw scripts instead of a WSGI app running inside gunicorn or uwsgi, either of which has its own threading model and performance improvements).

However, Go and Java will handily beat Python any day, even with uvloop -- all you need to do is try to do a little more processing inside your HTTP handler, aaaaannnnddd thiiiingggsss willl sloooowww dooooowwwwnnnn....


>> they're benchmarking raw scripts

TechEmpower uses gunicorn/nginx [1] for Python frameworks, and has since 2014.

>> all you need to do is try to do a little more processing inside your HTTP handler

Unless you're working with an external service like a relational database, in which case performance between Go/Java and Python will be comparable [2]. If you don't need to leave your process, Go/Java will flatten Python.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/tree/mast...

[2] https://www.techempower.com/benchmarks/#section=data-r13&hw=...


> But, even if in the request/second area Python might be doing great now, won't there be problems down the line with the GIL?

Yes, you'll probably have to move the real work out to a cluster-based job queue.


For simple API things it definitely is significantly faster than anything else Python has to offer. I'm not sure where the author got the node comparison from, but all the Python benchmarks are provided at https://github.com/channelcat/sanic along with the code used to run them. Note that these are single-process benchmarks on an AWS medium instance (not sure what type), while as best I can tell the TechEmpower benchmarks are run on a 40-core bare-metal machine.


It seems like Sanic isn't on that list, so you can't really check its author's claim against it.


It will be a problem with ~every library, since Python is slow by default (compared to Go, Java, etc.)


Big surprise that a hashtable-interpreter is slower than a compiled or heavily JITed runtime.

I also heard that CPython is slower than assembler. I'm not sure why that is?


spaCy going from AGPL to MIT was exciting to me, but I see that actually happened in late 2015: https://github.com/explosion/spaCy


I love spaCy, and recently discovered textacy: "higher-level NLP built on Spacy"[1].

It's pretty good.

[1] https://github.com/chartbeat-labs/textacy


I didn't know that either. I would have definitely used spaCy for a project I was doing last year, but didn't because of the licensing.


I've been using sanic+uvloop+aiozmq (current public draft: https://github.com/rcarmo/newsfeed-corpus) and am quite pleased with this setup for building fast (for Python) asynchronous apps.

The async/await thing is still a bit half-baked (I can't wait for 3.6 to arrive with more bits, including async generators), but I can see myself using this more and more -- just don't expect Go-like performance from this setup for anything but simple processing, since Python is still slow (and 3.x all the more so until PyPy 3 becomes a reality).
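For the curious, the async generators bit (PEP 525) looks roughly like this once 3.6 lands; a runnable stdlib-only sketch (function names are illustrative):

```python
import asyncio

# PEP 525 async generator, new in Python 3.6
async def ticker(n):
    for i in range(n):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield i

async def main():
    # async comprehensions are also new in 3.6 (PEP 530)
    return [i async for i in ticker(3)]

loop = asyncio.new_event_loop()
print(loop.run_until_complete(main()))  # [0, 1, 2]
loop.close()
```

Before 3.6 you'd have to fake this with a class implementing `__aiter__`/`__anext__`, which is much noisier.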

As to the rest of the libraries mentioned, I can't really fault the selection. I'm still using Python largely because of the ecosystem and sane deployment options, so I'm always happy to find new, decent libraries.


I am wondering why the Python community's focus seems to be shifting away from gevent and towards the asyncio framework.

Am I the only one who finds the programming style of using async/await inferior to that of gevent?


You're not the only one. I too am fond of the implicit context switching that we can have with gevent.


"Explicit is better than implicit", you know.
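To make the contrast concrete, a small stdlib-only asyncio sketch: every point where the loop may switch is marked with an explicit await, whereas gevent would monkey-patch the blocking call and switch invisibly (the `fetch` function here is just a stand-in for real I/O):

```python
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)  # explicit suspension point
    return name

async def main():
    # under gevent, any innocent-looking call could context-switch;
    # here, switches can only happen at the awaits
    return await asyncio.gather(fetch("a", 0.01), fetch("b", 0.01))

loop = asyncio.new_event_loop()
print(loop.run_until_complete(main()))  # ['a', 'b']
loop.close()
```

Which style is "better" is exactly the taste question in this subthread: explicit awaits document the switch points but infect every call signature with `async`.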


Couple of reasons: it's new, it's shiny, it holds the promise of doing super awesome stuff, it's really hard to get right and if you've explored the language enough, it's one more facet to explore.


Nice list! I did not know about hug. Here's hoping I can remember it next time I need a basic API.

Is that an issue for anyone else -- that you can't remember all these libraries exist when you need them?

I guess that's what awesome-{python, PHP, golang, etc} are for, right?


I don't get a lot out of those awesome lists. Not enough is done to filter them...what would greatly help is to arrange the list as a table, with Github stars, number of issues, committers, and weeks since last update.

I'm teaching a Python class in winter and put down all the libraries I could think of that I might need (on top of Anaconda 4.x)...probably will serialize this later: https://gist.github.com/dannguyen/9e5082ac6a80590bfe1541952f...

Planning to use AWS significantly this quarter, so awscli and boto3 are at the top of my list.


There are a bunch of apps that can manage your GitHub stars (I can't remember which one I use, something like "night sky" maybe? Am on mobile). I periodically review and tag them so they're easy to find when I start a new project. Then I check my Pinboard.

I wish someone would make an app that merged Github, Bitbucket, Gitlab, Pinboard and similar dev-oriented collections.


Late edit: the app I use is Astral: https://astralapp.com/

Pretty good for a non-native app ;)


https://djangopackages.org/ does that for Django packages. Most big Django packages have an underlying big Python package.


> Is that an issue for anyone else -- that you can't remember all these libraries exist when you need them?

Yep. I just star them on github and hope that when I need one, I can remember that I starred something like that some time ago. It has worked a few times. But surely there must be better ways?

> I guess that what awesome-{python, PHP, golang, etc} are for, right?

Not sure. There are some cool ones, but plenty of them are pretty crappy dumps of links of very varying quality, made for the internet points (i.e. GitHub stars) :/ (no offense to people who publish them!)


I use my GitHub stars as my search engine for these sorts of projects.


I created a small tool to organize the stars: https://github.com/maguowei/starred


It's telling that at least three of these are python 3 only.


Thanks, quite a lot of libraries I wasn't aware of (probably coz I haven't needed them yet! I still prefer perl for quick-n-dirty scripts, but one of these days I should try to do the same with python instead).

Anybody want to volunteer to start a resource page for python, kind of like this one for elasticsearch? https://github.com/dzharii/awesome-elasticsearch



hug is great fun to work with, and with just a little abuse works great for a normal website frontend too - but provides you with a canonical endpoint list, and decent typing.


hug (like Falcon) has a whole lot of "magic". It seems to be a trend (at least in the Python webdev world). I have mixed feelings about this.

Hug appears to be one of those things that is great when all you're doing is very simple, but quickly turns into a morass of fighting the opinions and magic of the framework when doing anything "real".

Also, a (good) CLI is so different from HTTP that I see only sadness in mashing them into the same interface, like having to repeat large amounts of boilerplate decorators to get it to do what you need. The chaining seems like it would help with the boilerplate, at least.

Still, I hope I'm wrong. I want to have a hug. Falcon's lack of validation and documentation means it solves only about 10% of the problem.


I don't use hug's CLI. It exposes a WSGI component that you can send any which way that you feel like, and plug into whatever else you use.

Hug also exposes directives, formatters and middleware for you to dive into the nitty gritty - not much magic. Just standard simple decorators.


Is that really a trend? I remember web2py getting a lot of flak because of the amount of magic it used.


Feels more like a pendulum. I remember one of the big pre-1.0 branches of Django was called "removing-the-magic".


Bokeh and Sanic+uvloop are pretty interesting.



