
'New' Python modules of 2015 - ColinWright
http://blog.rtwilson.com/my-top-5-new-python-modules-of-2015/
======
StavrosK
Hands down, my favorite new library is schema:

[https://pypi.python.org/pypi/schema](https://pypi.python.org/pypi/schema)

Here's a schema I use in production, see how readable it makes the parameters
of the API and how quick all the validation _and_ normalization is:

[https://www.pastery.net/mhwwnv/](https://www.pastery.net/mhwwnv/)

At the end, you get an object called data, and you can do data.title,
data.language, etc, and be sure that everything is as you expect.

~~~
fidget
For those of us more dictionary oriented, there is
[https://pypi.python.org/pypi/voluptuous](https://pypi.python.org/pypi/voluptuous)
(which is OK for the most part, as long as you are only trying to do
validation, and nothing too crazy)

~~~
odonnellryan
Similar, I've used this library with a lot of success:
[https://marshmallow.readthedocs.org/en/latest/](https://marshmallow.readthedocs.org/en/latest/)

~~~
aidos
I've tried truckloads of Python serialisation libs over the last few years and
marshmallow is the one that finally makes me feel like I don't need to look
for another one.

~~~
odonnellryan
It's very easy and does everything I've needed it to!

------
rffn
The Python module I learned to love this year is Click. Gets me better command
line interfaces fast. URL is [http://click.pocoo.org](http://click.pocoo.org).

It does progress bars too...

~~~
agumonkey
docopt was inspiring in its own way, but click was 'simpler' to toy with.

~~~
gh02t
I find click to be much more natural too. Docopt was always touchy about more
complicated CLIs when I used it (probably because a lot of "magic" happens
behind the scenes), whereas click lets you drill down and arrange things just
so. It also feels very well designed, hats off to Ronacher as usual for being
good at designing Python libraries.

I also find myself using click even when I don't want a CLI. The pretty-
printer (`secho`) and progress bars are extremely handy, plus some of the
other stuff in utilities. It's quite nice that they handle detecting when
output is an interactive terminal versus piping to a file.
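
The terminal-versus-pipe detection can be sketched with the standard library alone. This is not click's implementation, just the underlying idea; `styled` and `emit` are made-up names for illustration:

```python
import io
import sys

RED = "\033[31m"
RESET = "\033[0m"


def styled(text, stream):
    """Add ANSI colour only when `stream` is an interactive terminal."""
    isatty = getattr(stream, "isatty", None)
    if isatty is not None and isatty():
        return f"{RED}{text}{RESET}"
    return text


def emit(text, stream=None):
    """Write one line to `stream` (default: stdout), styled if appropriate."""
    stream = sys.stdout if stream is None else stream
    stream.write(styled(text, stream) + "\n")


# On a terminal this line is coloured; piped to a file it is plain text.
emit("warning: disk almost full")

# A file-like object that is not a terminal gets no escape codes.
buffer = io.StringIO()
emit("warning: disk almost full", buffer)
```

The check is made per stream, so the same call behaves sensibly whether the output goes to a terminal, a pipe, or an in-memory buffer.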

------
teekert
I'm feeling a lot of love for Pandas. Any (biology related) project I work on
starts with multi-headered dataframes and ends in beautiful Seaborn graphs. In
combination with Jupyter notebook I breeze through large data sets while
leaving a perfect trail of what goes on in the data pipeline. Python is great.

~~~
chestervonwinch
For those interested, you can also get nice styles with just matplotlib
(including seaborn style) using stylesheets:

[http://matplotlib.org/users/style_sheets.html](http://matplotlib.org/users/style_sheets.html)

~~~
mrswag
I had no idea it existed, very useful, thanks!

------
scrollaway
Not really 2015 but Q!
[https://pypi.python.org/pypi/q](https://pypi.python.org/pypi/q)

Print-debugging on steroids. This really does make things so much easier,
especially when dealing with huge apps you don't have time to learn. Not just
useful as a dev but also as a sysadmin.

~~~
wesleyy
What's the advantage of Q over pdb?

~~~
StavrosK
First of all, pudb is fantastic, just use it over pdb all the time.

q is for when you want to log data, pudb is for when you want to step through
and evaluate lines in-context. It's very possible that you'll want to use both
together.

------
rotten
Wow, there are some great new tools to explore. Thanks!

Some of the new libraries I'm using this year that I've found really handy
include:

Odo -
[http://odo.readthedocs.org/en/latest/](http://odo.readthedocs.org/en/latest/)
It is ridiculously handy for converting data from one format to another -
especially for transforming a table from a database or csv into a DataFrame
and back.

Arrow - Makes for quick datetime processing. -
[http://crsmithdev.com/arrow/](http://crsmithdev.com/arrow/)

Xlsxwriter -
[http://xlsxwriter.readthedocs.org](http://xlsxwriter.readthedocs.org) \- I'm
building beautiful reports, with charts, using this tool. As someone who moves
data around a lot, but has to work with less technical business and analyst
folks, this is becoming my goto for handing them some data to play with.

Blessings -
[https://pypi.python.org/pypi/blessings](https://pypi.python.org/pypi/blessings)
\- as I get older staring at simple black and white text on the screen seems
to be getting harder. Putting a little color and flair in my command line
interfaces cheers me up even if it doesn't do much in the way of actually
getting the job done.

Lastly, switching from curl to httpie was a huge help in working with APIs of
all sorts. It solved a problem I didn't even know I had.
[https://pypi.python.org/pypi/httpie](https://pypi.python.org/pypi/httpie)

------
bbayer
I am using Scrapy a lot. [http://scrapy.org/](http://scrapy.org/) It is a very
well designed web crawling library.

~~~
scrollaway
I found out, through reddit a couple of days ago, about Pomp:

[https://bitbucket.org/estin/pomp](https://bitbucket.org/estin/pomp)

It looks like a much cleaner Scrapy-inspired spider framework, without the
twisted dependency. And it's python 2+3 compatible. I'm very excited to try it
out.

~~~
bbayer
It looks very promising. I will give it a try.

------
lunchladydoris
Interesting list. I love Anaconda!

A few years ago I tried to set up a Mac with a scientific computing stack and
it took me days to hack my way through all the various dependencies and
incompatible versions. Anaconda now lets me do that in minutes.

~~~
danso
Me too...I taught a python class by making everyone download Anaconda's
distribution of 3.x...and everyone could do the assignments no matter what
kind of computer they used. Anaconda does a little too much for me to have it
be my own default install but it does quite well in onboarding beginners. I
use pyenv to install and maintain Anaconda on my own machine when I need to
replicate student work.

~~~
bmer
What does it do that ends up being a little too much for your use?

~~~
danso
It takes precedence in the path over everything...and in the last version I
used (before I upgraded to OS X El Capitan and wiped out everything), things
like `curl` were provided [1] ...which I completely understand for Anaconda's
use case, but it caused a lot of confusing grief to me when I hadn't expected
that and OpenSSL was having its rough times.

I don't know if that's the case (curl being part of the package) now, with
Anaconda 3 2.4.0+? It certainly isn't so when installed via pyenv, so I'm
happy with that. But there were other issues in the past build...BeautifulSoup
was inexplicably broken. I mean that it simply did not correctly parse non-
trivial HTML pages and yet threw no errors. The results could be replicated
for all of my students but I never could isolate the issue... I installed
Python 3 and the same version of BS4 from scratch and had no problems, but I
can't imagine where the Anaconda build would have gone wrong. It ended up
being OK since I just switched to lxml which I now happily use over BS4 on any
day, but it was frustrating to not be able to diagnose the problem (I didn't
get a response in the support forums either). I'm assuming this problem has
gone away in subsequent versions of Anaconda though I haven't tried since lxml
is perfectly fine to me.

And finally...well, I have to admit it, but I use Python like a goddamned
moron in that I still don't know how to use virtualenv/venv to do proper dev
isolation. And from the brief research I did, I see that Anaconda has its own
conventions, or work flow...something with the conda utility. Again, I can see
why it's necessary for Anaconda's use case (people who want to do data science
and not hand-tweak their environment every time they upgrade a package over
pip), but it added too many layers for me at the time.

[1]
[https://groups.google.com/a/continuum.io/forum/#!topic/anaco...](https://groups.google.com/a/continuum.io/forum/#!topic/anaconda/XCSUmZbNqn4)

~~~
jackmaney
> And finally...well, I have to admit it, but I use Python like a goddamned
> moron in that I still don't know how to use virtualenv/venv to do proper dev
> isolation.

I was the same way for quite a while, until I bumped into pyenv-virtualenv[1].
Just install that plugin, and you can do, eg,

    
    
        pyenv virtualenv 3.5.1 my-project
    

to get a virtual environment called `my-project` based off of Python 3.5.1
(assuming that you've installed 3.5.1 via pyenv, of course). Or, you can just
do

    
    
        pyenv virtualenv my-project
    

to make a virtualenv called `my-project` based off of the current version of
Python that you're using.

Once you do that, pyenv treats `my-project` just as another installation of
Python. In fact, `my-project` will show up in the list of installed versions
(`pyenv versions`), and you can switch to it:

    
    
        pyenv global my-project
    

(Or you can switch at the local or shell levels. Whichever.)

And voila! You have your own virtual environment that can contain its own list
of libraries.

And no, I'm not a shill for the creator of pyenv, I just really like the
software.

[1]: [https://github.com/yyuu/pyenv-virtualenv](https://github.com/yyuu/pyenv-virtualenv)

~~~
danso
Thanks for this...wrapping it up in pyenv is a lot more familiar to me. And
why would you apologize for shilling for pyenv?...it's amazing :) (as is
rbenv, its inspiration)

~~~
jackmaney
You're very welcome.

And ehhh, I've been downvoted and bitched at about evangelizing pyenv before.
Just thought I'd preempt that. But yes, it's an amazing piece of software. :)

FYI, I've put together a bash function for my .bash_profile that adds an
indicator to my prompt showing the current Python version/virtualenv in
use[1]. That's saved me a bit of frustration when going into a directory where
a local pyenv version overrides the global version.

[1]: [https://github.com/jackmaney/bash-profile/blob/ddb57091aab44...](https://github.com/jackmaney/bash-profile/blob/ddb57091aab4425995c8d52466183ec421d6cc5b/.bash_profile#L21-L32)
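
For illustration only, here is what such an indicator might compute, sketched in Python rather than bash. This is not the linked bash function; `describe_python` and `current_python` are made-up names, and the sketch assumes a virtual environment can be recognised by `sys.prefix` differing from the base interpreter's prefix:

```python
import os
import sys


def describe_python(version_info, prefix, base_prefix):
    """Return '3.5.1' or, inside a virtual environment, '3.5.1 (env-name)'."""
    version = ".".join(str(part) for part in version_info[:3])
    if prefix != base_prefix:
        # In a virtual environment the prefix is the environment's directory.
        return f"{version} ({os.path.basename(prefix)})"
    return version


def current_python():
    """Describe the interpreter running this code."""
    # Older virtualenv releases set sys.real_prefix; venv sets sys.base_prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return describe_python(sys.version_info, sys.prefix, base)


print(current_python())
```

The pure function takes its inputs as arguments so it can be checked with made-up paths, without needing a real virtual environment.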

------
craig552uk
I've got a lot of love for py.test right now.
[http://pytest.org/latest/](http://pytest.org/latest/)

I feel I'm able to write much more concise test scripts than I could with
unittest.
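
A small side-by-side sketch of the difference, assuming a made-up `slugify` function as the code under test. Nothing here imports py.test; the second test is simply written the way py.test would collect it:

```python
import unittest


def slugify(title):
    """Hypothetical function under test: lower-case and hyphenate a title."""
    return "-".join(title.lower().split())


# unittest style: a TestCase subclass and assertion methods.
class SlugifyTest(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")


# py.test style: a plain function named test_* and a bare assert.
def test_spaces_become_hyphens():
    assert slugify("Hello World") == "hello-world"
```

Both tests check the same thing; the py.test version drops the class and the `assertEqual` call, which is where most of the conciseness comes from.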

------
bru
Looks down (503), here is a cached version:
[http://webcache.googleusercontent.com/search?q=cache:Bf9Iv63...](http://webcache.googleusercontent.com/search?q=cache:Bf9Iv63AAD0J:blog.rtwilson.com/my-top-5-new-python-modules-of-2015/)

------
Drdrdrq
Pyrasite. When you have a running python app that is behaving oddly and you
can't replicate the bug elsewhere, you can run python code inside the running
process - without any preparation beforehand - to display stack trace, output
vars,...
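
As a rough idea of the kind of snippet one might run inside the process, here is a standard-library sketch that reports the stack of every thread. How the code gets injected is Pyrasite's job and is not shown; `dump_thread_stacks` is a made-up name:

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report of the current stack of every thread."""
    names = {thread.ident: thread.name for thread in threading.enumerate()}
    chunks = []
    # sys._current_frames() maps each thread id to its topmost stack frame.
    for thread_id, frame in sys._current_frames().items():
        chunks.append(f"Thread {names.get(thread_id, '?')} ({thread_id}):\n")
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


print(dump_thread_stacks())
```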

------
bubalus
Anaconda has been a lifesaver, because it can be installed and managed quite
easily without root privileges (it even installs pip). Some of the sysadmins
where I work are slower than molasses when it comes to installing python
packages (as in, it takes months of repeated emails from multiple people to
get anything done), and what is installed is often years out of date.

------
bpicolo
While we're on the subject of python modules: sqlalchemy is, and will probably
continue to be, my favorite library for any language.

------
baldfat
> I went looking for a pure-Python NoSQL database and came across TinyDB…which
> had a simple interface, and has handled everything I’ve thrown at it so far!

Why would anyone need a simple NoSQL? Why would you go the NoSQL route if it
isn't a HUGE complex database?

~~~
GFK_of_xmaspast
You've already got a perfectly fine KVS built into the language.

~~~
nashequilibrium
Could you name it?

~~~
vram22
This is probably what he meant:

    $ py
    Python 2.7.8 |Anaconda 2.1.0 (64-bit) ... on win32
    >>> import bsddb
    >>> print bsddb.__doc__
    Support for Berkeley DB 4.3 through 5.3 with a simple interface.

    For the full featured object oriented interface use the bsddb.db module
    instead. It mirrors the Oracle Berkeley DB C API.

~~~
alelos
bsddb is deprecated as of python 2.6 and removed in python 3

~~~
teraflop
That particular back-end is deprecated, but the same API is provided by the
dbm/gdbm/dumbdbm modules. Those still exist in Python 3, although they've been
consolidated under one top-level module.
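
A minimal Python 3 sketch of that consolidated module. It uses `dbm.dumb`, the pure-Python backend, on the assumption that it is the one available everywhere; `dbm.open` would pick whichever backend is installed. The file name and keys are made up:

```python
import dbm.dumb
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_store")

    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.dumb.open(path, "c") as db:
        db["language"] = "python"   # str keys and values are stored as bytes
        db[b"year"] = b"2015"

    # Reopen to show that the data was persisted to disk.
    with dbm.dumb.open(path, "r") as db:
        language = db["language"]
        keys = sorted(db.keys())

print(language, keys)
```

Keys and values come back as `bytes`, so anything richer than strings needs its own serialisation step on top.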

------
PLenz
Folium + Geopandas is my new goto GIS toolkit this year

------
wclax04
We've transitioned our local/dev/prod instances to use conda on Heroku, and
couldn't be happier. It was a tiny bit of work to get it set up, but now
everything is consistent, and we can set up new local environments in seconds.

~~~
sandGorgon
So I have been considering this. does conda track pypi or does it lag it? I
have been concerned about moving over my requirements.txt for a webapp with
lots of dependencies

~~~
wclax04
It slightly lags, but you can include pip requirements in an environment.yml
file, and they install normally.

I really only use conda for the non-python bits of our stack:
numpy/scipy/pandas etc - packages that are a pain to install on Heroku.

~~~
gh02t
It's also pretty straightforward to set up your own Conda package tree. Nice
for packaging your app for deployment or making sure you have very precise
dependencies.

[http://conda.pydata.org/docs/custom-channels.html](http://conda.pydata.org/docs/custom-channels.html)

~~~
sandGorgon
I think deployment is a solved problem with docker. It's libraries like
BLAS, etc. that are a huge pain. I'm not sure why a statically linked numpy is
not possible - even Anaconda could not achieve it.

~~~
gh02t
If you've ever tried to dive into the NumPy build process you'd see why. It's
_unbelievably_ complicated... not that they really could do it better given
that they are compiling about a billion scientific libraries and support
alternatives and optimizations (like MKL).

~~~
sandGorgon
Yes - unfortunately I have and I failed miserably. These days I'm trying to
see if there's a docker build that can build a great numpy (with all
optimizations). Interestingly there are even docker images to call cuda APIs
from python.

------
willvarfar
Obiwan pypi.python.org/pypi/obiwan/

validating JSON

also: type checking on function signatures

~~~
vram22
jsonschema:

Using JSON Schema with Python to validate JSON data:
[http://jugad2.blogspot.in/2015/12/using-json-schema-with-pyt...](http://jugad2.blogspot.in/2015/12/using-json-schema-with-python-to.html)

[https://pypi.python.org/pypi/jsonschema](https://pypi.python.org/pypi/jsonschema)

[http://json-schema.org/](http://json-schema.org/)

------
emehrkay
I really like lists like this. I get updates daily on which of my github
friends (is that what they're called?) have starred and there is no real
reason why they're following a project. I can look at the README and guess. I
did see someone start following this project the other day
[https://github.com/elastic/elasticsearch-dsl-py](https://github.com/elastic/elasticsearch-dsl-py),
which seems pretty interesting. Has anyone used it?

~~~
jefurii
It's something like Django models but with Elasticsearch. You can create
object classes and then save them to Elasticsearch, query them, etc. It's
built on the lower-level
[https://github.com/elastic/elasticsearch-py](https://github.com/elastic/elasticsearch-py).
Very handy.

------
robohamburger
tqdm looks super promising. progressbar and progressbar2 end up being
complicated and weird enough to use that my company ended up making wrappers.
Why maintain that when you can just use a library that works out of the box.

It would be great if it had ipython notebook support. I often end up doing
long operations that scrape services for data but have no idea what their
progress is.

For me 2015 has been the year of tox. It is a great tool and worth using for
just about any python project.

~~~
toyg
tqdm is failing for me on Windows at the moment. To be fair, it might not be
its fault (I'm mixing it with Blinker signals), but still I'm slightly
disappointed.

A lot of these "magic" tools fall apart when you're trying to do something
slightly more structured than "throwaway bunch o' functions".

------
pougetj
progressbar2 can also take iterables as input, for easy display of progress
bars.

~~~
maxerickson
There's something to be said for the nearly non existent api provided by tqdm.

------
tenfingers
"python-bond" also came out in 2015:
[https://pypi.python.org/pypi/python-bond](https://pypi.python.org/pypi/python-bond).
It allows a simple interface between Python runtimes and php/perl/nodejs.

~~~
collyw
There is a lot of overlap between these languages, and I wonder what any of
them can do that can't be done in pure Python.

~~~
tenfingers
It's not much about language, it's mostly about code/ecosystem re-use (that
is: if you have a library available in system X and you're writing for Y you
can still take advantage of it).

------
antman
Easy platform-independent GUI as a webpage. Used it on my PC, a robot, and a
Raspberry Pi.

[https://github.com/dddomodossola/remi/blob/master/README.md](https://github.com/dddomodossola/remi/blob/master/README.md)

------
krylon
I haven't used Python in a while, but I shall once more look into it. Python
was always fun. Guess I'll do that over christmas, see where it takes me.

I might even get around to learning Python 3 after only ... what? seven-or-so
years?

~~~
StavrosK
Just write the next thing in Python 3, there's not really that much to learn
right away as much as there are minor surprises that you can very quickly get
up to speed on as you encounter them.

------
atmosx
Are there any ruby alternatives to tqdm?

~~~
Hortinstein
or node?

~~~
bpicolo
It's only around 100 lines of code. Should be easy to replicate in language of
choice:
[https://github.com/noamraph/tqdm/blob/master/tqdm.py](https://github.com/noamraph/tqdm/blob/master/tqdm.py)
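
To give a feel for how little is needed, here is a rough standard-library sketch of a progress-reporting iterator wrapper. It is not tqdm's code and leaves out things like timing and rate estimates; `progress` and its parameters are made-up names:

```python
import io
import sys


def progress(iterable, total=None, stream=None, width=20):
    """Yield items from `iterable`, redrawing a text progress bar on `stream`."""
    stream = sys.stderr if stream is None else stream
    if total is None:
        try:
            total = len(iterable)
        except TypeError:
            total = None  # unknown length: show a running count only
    for count, item in enumerate(iterable, start=1):
        yield item
        if total:
            filled = min(width, width * count // total)
            bar = "#" * filled + "-" * (width - filled)
            stream.write(f"\r[{bar}] {count}/{total}")
        else:
            stream.write(f"\r{count} items")
        stream.flush()
    stream.write("\n")


# Usage: wrap any iterable; the bar is drawn on stderr by default.
squares = [n * n for n in progress(range(5))]
```

The carriage return (`\r`) moves the cursor back to the start of the line so each update overwrites the previous one, and the same trick carries over directly to Ruby or Node.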

------
xyzzy4
This is a tangent, but the most annoying change in the latest Python versions
is you can no longer write print "foo". Now it has to be print("foo"). Damn
kids ruining my language.

~~~
dragonwriter
> This is a tangent, but the most annoying change in the latest Python
> versions is you can no longer write print "foo". Now it has to be
> print("foo").

The statement-to-function migration for print is, IMO, generally an
improvement, but in any case it's not a change in the _latest_ versions of
python, except with an unusually broad interpretation of latest; it's a change
in Python 3.0, which was released a little over 7 years ago.
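
A short sketch of what the function form offers over the old statement, for Python 3 (on Python 2.6+ the same function is available via `from __future__ import print_function`). The names `buffer` and `log` are just for illustration:

```python
import io
import sys

# Keyword arguments replace the old statement's special-case syntax.
print("foo", "bar", sep=", ")        # choose the separator between arguments
print("no newline yet", end="")      # was: print "no newline yet",
print()                              # just a newline
print("to stderr", file=sys.stderr)  # was: print >>sys.stderr, "to stderr"

# Because print is an ordinary function, it can be assigned and passed around.
buffer = io.StringIO()
log = print
log("a", "b", sep="-", file=buffer)
```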

