Hacker News
Ask HN: How do you handle/maintain local Python environments?
103 points by PascLeRasc on Sept 23, 2019 | 95 comments
I'm having some trouble figuring out how to handle my local Python. I'm not asking about 2 vs 3 - that ship has sailed - I'm confused about which binary to be using. The way I see it, there are at least four different Pythons I could be using:

1 - Python shipped with OS X/Ubuntu

2 - brew/apt install python

3 - Anaconda

4 - Getting Python from https://www.python.org/downloads/

And that's before getting into how you get numpy et al installed. What's the general consensus on which to use? It seems like the OS X default is compiled with Clang while brew's version is with GCC. I've been working through this book [1] and found this thread [2]. I really want to make sure I'm using fast/optimized linear algebra libraries, is there an easy way to make sure? I use Python for learning data science/bioinformatics, learning MicroPython for embedded, and general automation stuff - is it possible to have one environment that performs well for all of these?

[1] https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1449319793

[2] https://www.reddit.com/r/Python/comments/46r8u0/numpylinalgsolve_is_6x_faster_on_my_mac_than_on/

Not a fan of Conda: the CLI is terrible, their unofficial builds are opaque, and in my experience the very few packages that are not available on PyPI are rarely worth installing (usually the worst kind of "research code"[1]).

Simply using pipenv and pyenv is enough for me (brew install pipenv pyenv). You don't ever have to think about this, or worry you're doing it wrong.

Every project has an isolated environment in any python version you want, installed on demand. You get a lockfile for reproducibility (but this can be skipped) and the scripts section of the pipfile is very useful for repetitive commands. It's super simple to configure the environment to become active when you move into a project directory.
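For reference, a Pipfile with the [scripts] section mentioned above might look like this — the package, the Python version, and the script commands are all illustrative, not from any particular project:

```toml
# Hypothetical Pipfile; packages and scripts are examples only.
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
requests = "*"

[requires]
python_version = "3.7"

[scripts]
# Run with `pipenv run test` / `pipenv run serve`
test = "pytest -x"
serve = "python -m http.server"
```

`pipenv install` against a file like this creates the isolated environment on demand and writes the Pipfile.lock used for reproducible installs.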

It's not all roses, pipenv has some downsides which I hope the new release will fix.

When I need a temporary environment to mess around in, then I use virtualfish by running `vf tmp`: https://virtualfish.readthedocs.io/en/latest/

1. https://github.com/menpo/menpo/blob/master/setup.py#L17

I have had the opposite experience. Pipenv chose to ignore the use case of developing a package. In the docs, they state they only care about projects that are directly deployed, not installed and used by some other project. Conda helps with both types of projects and handles non-Python dependencies.

It takes way too long to make a Pipfile.lock file, and I need to be in the project folder to activate the env?! Terrible user experience.

I have fully moved over to pipenv and it is what I onboard new people with. Not having a virtualenv that everyone names differently and that invariably gets checked in by accident was a big plus on its own.

For a while I was encouraging people to use conda especially on windows, but with WSL I found it to be mostly unnecessary.

I do still leverage pip install --user for tools that are stable enough and I use frequently.

+1 for pipenv

I develop using pipenv and then deploy installing the dependencies directly in the folder. It works for AWS lambda and containers, although lambda layers are a little more tricky to release:

`pipenv lock -r > requirements.txt`

`pipenv run pip install -r requirements.txt -t .`

You might find AWS SAM CLI worth looking at if you're finding that awkward : https://docs.aws.amazon.com/serverless-application-model/lat...

I used to have a Makefile to do the install dependencies in directory, test locally, zip the directory up, deploy to AWS jobs - but SAM is a really good replacement for that workflow. It uses docker containers too, and the other big thing it makes trivial if you need it is running an API gateway locally so you can test the Lambda as an HTTP API instead of just hacking events in as standard input/function inputs.
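A pre-SAM Makefile for that workflow might have looked roughly like this — the function name, handler file, and directory layout are placeholders, not the commenter's actual setup:

```make
# Sketch of the install-deps/zip/deploy flow (all names are placeholders).
.PHONY: build package deploy

build:
	pipenv lock -r > requirements.txt
	pipenv run pip install -r requirements.txt -t build/
	cp handler.py build/

package: build
	cd build && zip -r ../lambda.zip .

deploy: package
	aws lambda update-function-code \
		--function-name my-function --zip-file fileb://lambda.zip
```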

> Not a fan of Conda: the CLI is terrible, their unofficial builds are opaque, and in my experience the very few packages that are not available on PyPI are rarely worth installing (usually the worst kind of "research code"[1]).

The CLI used to be slow, but since 4.6 the performance is reasonable -- still slower than some non-Python package managers, but in my experience no longer so slow as to be aggravating (installs take <10 s now, versus 30-60 s previously just to solve the environment).

> their unofficial builds are opaque

I'm not entirely sure what builds you are referring to, but conda-forge [1] is completely open source and community-driven, and Anaconda recipes are in a github organization called AnacondaRecipes [2]. If you are talking specifically about the Python interpreter builds for the Anaconda distribution, I believe the build recipes are here [3] (although I don't know that for a fact).

> my experience the very few packages that are not available on PyPI are rarely worth installing

Depending on your field, platform, and dependencies, that is certainly believable. However, several projects such as Apache Arrow and Rapids have dropped support for PyPI wheels [4] because the engineering cost of wheel-building on linux is prohibitive, and conda is preferable on windows for a variety of reasons.

> usually the worst kind of "research code"[1]

That's an oddly-specific complaint. Anyway, given the variety of environments where people run `setup.py`, that kind of thing is usually explained by "someone put in a PR to make it build somewhere." It might be surprising if you've never built or packaged python extensions with native dependencies before, but I'm not sure it's a particular indicator of quality.

[1] https://conda-forge.org/ [2] https://github.com/AnacondaRecipes [3] https://github.com/AnacondaRecipes/python-feedstock [4] https://medium.com/rapids-ai/rapids-0-7-release-drops-pip-pa...

> their unofficial builds are opaque

By opaque I mean: https://github.com/AnacondaRecipes/tensorflow_recipes/tree/m...

What do these patches do? Why are they adding an alternative gettime implementation for macOS? Why are they patching a bunch of test files to make them pass?

> That's an oddly-specific complaint.

It's unfortunately one we ran into recently while migrating a legacy system off conda, but it's not the first and I doubt it will be the last.

> What do these patches do? Why are they adding an alternative gettime implementation for macOS? Why are they patching a bunch of test files to make them pass?

The gettime patch is to support macOS SDK < 10.12, as it says in the patch itself. From a quick look, some of the test patches appear to be for Python 2 compatibility. Every linux distribution and system integrator has a patch list just like that for complicated software. If a particular patch concerns you, I would suggest opening an issue on their bug tracker to ask for an explanation.

> It's unfortunately one we ran into recently while migrating a legacy system off conda, but it's not the first and I doubt it will be the last.

Unclear what that has to do with conda. The same package is available on PyPI (source only), and the line you highlighted almost certainly exists for some other build target than conda, where the compilers will automatically find numpy includes in CONDA_PREFIX without any extra effort.

This has worked for me:

- brew python for my global environment.

- Create a virtual environment for each project (python -m venv).

- Use pip install after activating it (source venv/bin/activate)

If you need to work with different versions of Python, replace brew Python with pyenv.
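That whole flow, sketched as shell commands — the temp directory here just stands in for a real project folder:

```shell
cd "$(mktemp -d)"             # stand-in for a project directory
python3 -m venv .venv         # create an isolated environment in-project
. .venv/bin/activate          # activate it
python -c 'import sys; print(sys.prefix)'   # now points inside .venv
# pip install numpy           # installs would land in .venv only
deactivate                    # back to the global interpreter
```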

This is my setup too and if you're using VS Code, their virtual env support makes it pretty seamless[1].


I was surprised at how well it worked when I tried it again recently after a hiatus. Now I just wish they would get Docker-based environments working.

VSCode already has it as a feature but it's a little rough around the edges. It's called "Remote Containers".

Docs: https://code.visualstudio.com/docs/remote/containers

I also use VSCode for some projects. I am not completely sold, but it is better than the alternatives IMO.

> replace brew python with pyenv.

replace pyenv with asdf [0]. asdf is like pyenv, but it works for all major programming languages.

[0] https://github.com/asdf-vm/asdf

Score on asdf!

The fact that it can do multiple languages and environments is lovely.

My current technique for that is a pseudo-wrapper script that activates each 'env' and runs associated commands with it to set it up, e.g. bash exports, $PATH, etc.

Now I just need to fit asdf into my workflow.

This but virtualenv / virtualenvwrapper.

workon [project name] and deactivate.

Pycharm sees these and they work.

The biggest advantage of venv for me is that it comes with Python >= 3.3. The advantage of virtualenv is its extra features, but I haven't needed them so far.

I just use Anaconda. It's basically the final word in monolithic Python distributions.

For data science/numerical computation, all batteries included. It also has fast optimized linear algebra (MKL) plus extras like dask (parallelization) and numba, out of the box. No fuss no muss. No need to fiddle with anything.

Everything else is a "pip install" or "conda install" away. Virtual envs? Has it. Run different Python versions on the same machine via conda environments? Has it. Web dev with Django etc.? All there. Need to containerize? miniconda.

The only downside? It's quite big and takes a while to install. But it's a one time cost.

I chose Anaconda for this use case:

1. Set up a production Python environment on Windows Server 2016

2. Write a README on how to set up a prod Python env on Windows Server 2016 (pretty short)

3. Tell my Win 10 coworkers to refer to the README to set up their own Python env

4. Minimize the number of "conda install" commands needed to set up a production Python env, which in our case are mostly the Oracle/cx_Oracle and MS SQL Server dependencies.

I'm in the same boat, but I use miniconda and install the individual packages I want as I need them, which helps minimize the size. For doing scientific analysis, it is really nice to have a (basic) environment with the basic required packages (matplotlib, numpy, scipy, ipython, etc.) from which I can duplicate and build off of. It works well with VSCode, and unlike other commenters, I don't really see any negatives to the CLI. For someone who started out not even programming a lot, and is doing this in a scientific context (so potentially less rigorous about how things should be perfectly set up, but where it is necessary that they are functional), this was by far the easiest and least complex way to set up all the stuff you need to get started.

I also prefer conda for the same reasons.

Precompiled MKL is really nice. Conda and conda-forge now build for aarch64. There are very few wheels for aarch64 on PyPI. Conda can install things like Qt (IPython-qt, spyder,) and NodeJS (JupyterLab extensions).

If I want to switch python versions for a given condaenv (instead of just creating a new condaenv for a different CPython/PyPy version), I can just run e.g. `conda install -y python=3.7` and it'll reinstall everything in the depgraph that depended on the previous python version.

I always just install miniconda instead of the whole anaconda distribution. I always create condaenvs (and avoid installing anything in the root condaenv) so that I can `conda-env export -f environment.yml` and clean that up.

BinderHub ( https://mybinder.org/ ) creates docker containers from {git repos, Zenodo, FigShare,} and launches them in free cloud instances also running JupyterLab by building containers with repo2docker (with REES (Reproducible Execution Environment Specification)). This means that all I have to do is add an environment.yml to my git repo in order to get Binder support so that people can just click on the badge in the README to launch JupyterLab with all of the dependencies installed.

REES supports a number of dependency specifications: requirements.txt, Pipfile.lock, environment.yml, aptSources, postBuild. With an environment.yml, I can install the necessary CPython/PyPy version and everything else.
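As an illustration, a minimal environment.yml that Binder/repo2docker would pick up might look like this — the package list is just an example:

```yaml
name: example
channels:
  - conda-forge
dependencies:
  - python=3.7
  - numpy
  - pip
  - pip:
      - a-pypi-only-package   # hypothetical PyPI-only dependency
```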


In my dotfiles, I have a setup_miniconda.sh script that installs miniconda into per-CPython-version CONDA_ROOT and then creates a CONDA_ENVS_PATH for the condaenvs. It may be overkill because I could just specify a different python version for all of the conda envs in one CONDA_ENVS_PATH, but it keeps things relatively organized and easily diffable: CONDA_ROOT="~/-wrk/-conda37" CONDA_ENVS_PATH="~/-wrk/-ce37"

I run `_setup_conda 37; workon_conda|wec dotfiles` to work on the ~/-wrk/-ce37/dotfiles condaenv and set _WRD=~/-wrk/-ce37/dotfiles/src/dotfiles.

Similarly, for virtualenvwrapper virtualenvs, I run `WORKON_HOME=~/-wrk/-ve37 workon|we dotfiles` to set all of the venv cdaliases; i.e. then _WRD="~/-wrk/-ve37/dotfiles/src/dotfiles" and I can just type `cdwrd|cdw` to cd to the working directory. (Some of the other cdaliases are: {cdwrk, cdve|cdce, cdvirtualenv|cdv, cdsrc|cds}. So far, I have implemented cdalias support for bash, IPython, and vim)

One nice thing about defining _WRD is I can run `makew <tab>` and `gitw` to `cd $_WRD; make <tab>` and `git -C $_WRD` without having to change directory and then `cd -` to return to where I was.

So, for development, I use a combination of virtualenvwrapper, pipsi, conda, and some shell scripts in my dotfiles that I should get around to releasing and maintaining someday. https://westurner.github.io/dotfiles/venv

For publishing projects, I like environment.yml because of the REES support.

After using many different Python tools (Poetry, Pipenv, virtualenv, etc) I've found that the most reliable method is to just use the standard `venv` command to create virtual environments. I have the following `cvenv` ('create venv') alias set up in my .bashrc to create new virtual environments:

    alias cvenv='python -m venv .venv && source .venv/bin/activate && pip install --upgrade pip setuptools > /dev/null'
Also, as others have mentioned, if you're not in a virtual environment, only ever install pip packages to your user location, i.e. `pip install --user --upgrade package_name`

But the first thing to do before any of that is to make sure you know which python binaries you're running:

    which python
    which python3
    which python3.7
You can use the following command to set python3.7 as the default when you just type `python`:

    sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.7 9

I've been using virtualenv + virtualenvwrapper for years. I hear complaints but I'm used to it now and it's pretty nice. Usually create a new environment for each project and pip install my packages into it.

Yep. Python 3 installed via brew. Then every project has its own virtual environment. You can specify the path to the python binary when creating a new env so you can specify whether you want it to use Python 3 or Python 2.

It’s simple. It’s easy to debug. I genuinely don’t understand the point of all this other nonsense like anaconda.

Unfortunately when it comes to data-science Python it seems like the inmates are running the asylum. The APIs and developer ergonomics of tools like tensorflow, pandas etc. are so bizarre to people who are running production, vanilla Python.

pandas keeps turning me off, even if I feel like I ought to like it; I feel like I never really know what is going on, and the syntax feels like a domain-specific-language tacked on top of python, and it just doesn't sit well (my opinion!). Besides... some of the functions have an insane number of parameters, e.g.

    pandas.read_table(filepath_or_buffer: Union[str, pathlib.Path, IO[~AnyStr]], sep='\t', delimiter=None, header='infer', names=None, index_col=None, usecols=None, squeeze=False, prefix=None, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=False, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal=b'.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, dialect=None, error_bad_lines=True, warn_bad_lines=True, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None)

I use R happily enough, and MATLAB too for that matter. Maybe, some-day, I'll see the pandas-light.

Learning pandas really does feel like learning a new language - new syntax, idioms and implementation details to be aware of. Much more so than other libraries imo.

Given how utterly powerful it is, I think that's OK.

Not sure a large list of optional params (with good documentation) is a bad thing though.

Why do you not like so many parameters? They are optional so typically pandas only has one or two required parameters.

Callables requiring many parameters is often indicative that the code is doing too much, perhaps as coder after coder tweaked it to handle their particular use-case.

I am not a big fan of (the over-use of) optional arguments either: I prefer code I'm reading to be more explicit, which is more pythonic imo.

I agree on over-use of optionals.

I disagree with the code is thus doing too much, however.

At least in this specific and some other cases, the method shown there is a wrapper to do a major task of converting data formats from one form to another.

These optional, well-documented arguments show the defaults used, and are modifiable in, as you said, specific use-cases.

I should have been more explicit but this is exactly what I'm doing. It's nice, having control over which version of Python I'm using by project. And if I name my virtualenv the same as my project directory, I can cd to the directory and "workon ." to load the environment.

I'm not a data scientist so I have little experience with these tools, except for Pandas, which I won't use unless I have to.

If on Linux, NEVER touch the system Python! If you can, use the package manager of your distro to install the desired Python version. If it's not available, or you don't trust a third-party PPA, compile from source and install using make altinstall:


Next, install pip for your user!

`curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py`

`python get-pip.py --user`

Now you should have `python` which is the system one and `python3.x` for alternate versions. Installing modules from pip can also be harmful, so always use `pip install --user` for tools written in python (like aws cli). Those tools are installed in ~/.local/bin, so make sure to add that to your PATH.

Next, you can use virtual environments to prevent your .local/bin from becoming bogged down, if you are dealing with many dependencies from different projects. A nice virtual environment management tool helps a lot. Take your pick: virtualenv + wrapper, pyenv, pipenv, conda... whichever you choose, stick to virtualenv for development!

In a nutshell

1. never touch your system python!

2. install tools from PyPI using pip's --user flag

3. install different python versions using make altinstall, if building from source

4. virtual environments for development

This should be enough to keep your Linux installation healthy while messing around with python.
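Step 2 made concrete — `awscli` is only an example tool here; the key points are the --user flag and having ~/.local/bin on your PATH:

```shell
# python3 -m pip install --user awscli   # example: install a tool, never via sudo
export PATH="$HOME/.local/bin:$PATH"     # make --user entry points reachable
python3 -m site --user-base              # shows where --user installs go
```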

Also, docker containers could simplify this whole process, although this is another tech you are adding to your stack

I wish Linux distros would take the system python out of the default path, put it in /usr/libexec, and patch python-dependent packages to use that path manually, for exactly that reason.

There desperately needs to be a separation between the Python that's intended to support the OS and Python that's intended to be used by users. These should have never been mixed and there's so much headache caused by it.

Never ever type pip as root. Then there is no risk of messing up the system python by forgetting --user. This strategy works fine on Linux.

I haven't quite gotten the hang of doing this on macOS, where I forget --user from time to time and bludgeon my Homebrew python.

Or just use conda.

Either use Anaconda (if you’re using a lot of science libraries, or you also want to install non-python stuff like git, pandoc, or nodejs), or pyenv, which will install and manage multiple versions of Python for you in parallel. In fact pyenv can also install Anaconda, so that’s another possibility. If you’re just running Jupyter notebooks for doing some science or data analysis, I’d go with Anaconda. If you do a lot of library development and testing, pyenv is probably better.

Don’t use the system Python, brew-installed Python, or the official Python installer, if you can avoid it (you’ll want your Python to be up to date and self-contained in your home folder as much as possible).

I use direnv for most of my development that doesn't need full blown containers. Once you hook it up, it automatically enables/disables virtual environments based on directory.


yep, this is the most convenient and seamless way to use python virtual environments. I also use it to specify node or java version per directory.

These days I use `asdf` for everything--think of it as a pyenv/rbenv/nvm for Python, Ruby, Java, NodeJS, and a ton of other stuff. Through it I create virtualenvs, each targeting a particular version of Python.


I second this. asdf is awesome! It works for all major languages.

In case of Python, I use asdf to install the needed interpreter version and I use pipenv to create/handle the virtual environments. It's a perfect combo.

This is my biggest issue with Python. I absolutely love the language, but Python 3 ought to be Python 3 no matter where/how you got it. If you are using 3.5 and I'm using 3.7 we ought to be able to share code and things just work. Unfortunately, that is not the case.

Why would that not work?

There are some breaking features added in 3.7 which were not in the language in 3.5. Meaning code written for the newer Python 3 version might not work with the older Python 3 version. Most of those additions are relatively easy to fix though.

Code from an older Python 3 version should always work in the newer Python 3 version though (at least for 3.5 -> 3.7).
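A concrete example of that forward-only compatibility: dataclasses were added in 3.7, so this one-liner works on 3.7+ but raises ImportError on 3.6 and earlier:

```shell
# Succeeds on Python >= 3.7, fails with ImportError on older 3.x.
python3 -c "from dataclasses import dataclass; print('ok')"
```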

You'd think, but it's not always the case. I've recently encountered an issue with eventlet that is only present in 3.7, for example.

I agree. Try pyenv. It lets you install a particular version for use in a shell account, then `python -m venv venv`. I like pip-tools for requirements management, but some reasonable people disagree.

I use Conda for all environment management.

I recently had to reinstall a windows dev environment. I didn’t even install standalone python, just conda.

For each new project I create a new conda env, activate it, and then install stuff with pip. I use pipreqs to create requirements files for the environment before checking in to git, to help others who may want to check out and install.

I just prefer the CLI and syntax of conda over the other environment managers. I use pip to install packages over conda as it has more packages and I don’t want to remember the channel names that may have the package I want.

As someone who writes the occasional Python script or service I was expecting to find a definitive answer in this thread, but God, it's such a mess... venv, virtualenvwrapper, pew, pyenv, poetry, pipenv, anaconda, conda... How are you expected to choose?

Easy - virtualenvwrapper!

All the talk about anaconda etc is for mad data scientists, not us normal folk ;)

Depends on the environment a great deal.

Linux: Typically using the package manager, or the AUR if I'm not using the latest version. At the moment, I'm using 3.6 as my "universal" version since that's what ships with current Ubuntu.

Windows: Good ol' fashioned executable installers, though there's really no reason I don't use chocolatey other than sheer force of habit.

macOS: I don't do enough work on macOS to have a strong opinion, though generally I just use the python.org installer. I don't think there's that much of a difference from the Homebrew version, but I could be wrong on that front. As an aside: IIRC, the "default" python shipped with macOS is still 2.X, and they're going to remove it outright in 10.15. So I wouldn't rely on that too heavily.

As for other tooling, IME pipenv and poetry make dealing with multiple versions of Python installed side-by-side much easier. I have a slight preference for poetry for a variety of reasons, but both projects are worth checking out.

Finally, at the end of the day, the suggestions in here to "just use Docker" aren't unreasonable. Performance between numpy on e.g. 3.6 vs 3.7 or clang vs GCC likely aren't that significant, but if you create a Docker environment that you can also use for deployment you can be sure you're using packages compiled/linked against the same things.

If all of this sounds like an unreasonable amount of effort for your purposes... probably just use Anaconda. It's got a good reputation for data science applications for a good reason, namely removing much of this insanity from the list of things you have to worry about.

* I haven't looked into it myself, so I can't really address your specific question about using fast/optimized libraries such as BLAS.

This doesn't (seem to) distinguish between your system python (i.e., the ones that'll run in a default execution environment if you run "python" or "python3") and project-specific pythons.

For the latter, I guess it's common to use something like virtualenv or a container/VM to isolate this python instance from your system Python.

Personally, I use Nix (the package manager and corresponding packaging language) for declaring both my python system and project environments.

Nix still takes some work to ease into, but I think your question suggests enough understanding/need/drive to push through this. Costrouc (https://discourse.nixos.org/u/costrouc) may be a good person to ask; I can recall him making a number of posts about data science on Nix. I also noticed this one, which seems related to what you're looking into: https://discourse.nixos.org/t/benchmarking-lapack-blas-libra...

I am using NixOS, and on NixOS you can use Nix expressions to create an environment with whatever packages at whatever version you need. It isn’t perfect, but with some effort and tooling it is a reliable, flexible way of managing an environment, and is especially nice in that it manages the exact versions of C libraries as well. Nix can be used on any Linux or macOS system.

(FWIW: there are tools to create Nix expressions from requirements files, so it isn’t necessarily all manual work.)
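For a flavor of what such an expression looks like, here is a minimal shell.nix — `python37` and the package names are as they appeared in nixpkgs around this time, and are purely illustrative:

```nix
{ pkgs ? import <nixpkgs> {} }:
pkgs.mkShell {
  buildInputs = [
    # Python plus the exact libraries (and their C dependencies) you declare
    (pkgs.python37.withPackages (ps: [ ps.numpy ps.pandas ]))
  ];
}
```

Running `nix-shell` in a directory containing this drops you into an environment with that interpreter and those packages, pinned to the nixpkgs revision you import.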

Have you seen any way to use Nix to test with tox?

I use conda because it’s the default way to install PyTorch. That’s pretty much it. Pretty sure any NumPy install should use reasonable BLAS nowadays but I believe conda includes Intel MKL which maybe makes a difference.

Every couple months I nuke all my environments and my install and start fresh out of general frustration. It’s still the least bad for a scientific workflow IMO (having different environments but all globally accessible rather than stuck in specific directories is nice).

If you `pip install numpy` on most environments it will download an optimized wheel. Anaconda also has an optimized numpy automatically.

I wouldn't use the python shipped with osx because it tends to be out of date rather quickly, and it doesn't ship with libreadline.

Usually the python shipped with Ubuntu is good if it has the version you want.

If you compile it yourself from source you are in for a bad time unless you know how to install all the dependencies. Make sure you turn on optimizations when running ./configure or it will be significantly slower.

Once you have python installed, install pip. Try the following until one works (unless you use anaconda, then you won't use pip most of the time, and I think you can just `conda install pip` if it's not there by default):

$ sudo pip install -U pip (for python3: sudo pip3 install -U pip)

$ sudo easy_install pip (easy_install-3 or something like that for python3)

(If those don't work, you need to get pip from somewhere else, most likely by installing python-pip via apt, or whatever the equivalent package is in brew or whichever package manager you are using.)

When you want to install something globally, do NOT do `sudo pip install ...`, this can break things in a way that is hard to repair, do `pip install --user ...`, and make sure the path that --user installs are installed in is added to your $PATH. I think it's always ~/.local/bin/

If you are working on a project, always use a virtualenv, google for more details on that.

Pip doesn't always give the most optimal build. But if you care about that, you wouldn't be asking this question :-)

Pip gives you whatever the default build is, but for numpy they build optimized builds into wheels for almost every platform. https://pypi.org/project/numpy/#files For anything else, you have to get into the details of how it is built and what the options are which is going to depend on the specific package. Building numpy yourself is not beginner friendly, you have to know a ton of details about linear algebra packages.

Conda is helpful for the odd cases where one C dependency conflicts with another and you need to be careful specifying which build you want.

For me it depends on what I'm doing.

- If building a one off system it really doesn't matter, so it's whatever happens to be easiest on the system I'm deploying it on.

- If I'm adding software to a production system, especially one that will be exposed to the outside world, then I use the distro libraries - no ifs or buts. The reason is straightforward: I _must_ get automatic security updates for all 3rd-party software I install. I do not have the time to go through the tens of thousands of CVEs released each year (that's 100 a day), then figure out which library each applies to, then track when they release an update.

- The most constrained is when I am developing software I expect others I have never met will be using. E.g. stuff I release as open source. Then I target a range of libraries and interpreters, from those found on Debian stable (or even old-stable occasionally) to stuff installed by PIP. The reason is I expect people who are deploying to production systems (like me) will have little control over their environment: they must use whatever came with their distro.

pyenv is a tool to manage python environments. it lets you install and setup whatever python version you want with a simple command. you can then switch the active python version globally or just in the current shell with a simple command. it also has some virtualenv plugins.


No windows, but on Linux and MacOS, I use the package manager or brew to install a system Python3, from which I derive venvs for basically all installations of new tools. I use one of the install flags to include the Python version in use in the prompt.

In addition, I use pyenv for development. I always install the newest Python version with e.g. `CONFIGURE_OPTS="--enable-optimizations" pyenv install 3.7.4`, which IME consistently produces a ~10% faster Python compared to the various package managers. This makes it really simple to include in each project a `.python-version` file that tox will happily use to facilitate testing on multiple versions. As above, each project gets its own venv from the pyenv-based Python (I usually do the development using the most recent stable, but if you wanted to use a different version for the venv and development and just test against the most recent stable with tox, that should also work).

I've used pew [0] a couple of times, which feels like it works quite well. It is a nice alternative to virtualenvwrapper.

However, my personal struggle with Python projects is when I work on projects that I also use myself on a daily basis. I have a tool that I wrote for automating some administrative stuff at work, and every so often I make some minor fixes and updates. So, I both use it and develop it on the same machine. I've never figured out how to properly do this with a virtualenv. I.e., I want the tool, say "mypytool", to be available and run the most recent version in any shell. At the same time, I would prefer not to install the requirements globally (or for that matter, at the user level). I would love to hear some suggestions on how to solve this use case.

[0]: https://github.com/berdario/pew
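One way to get the "use it everywhere, develop it in place" behaviour without global installs: give the tool a dedicated venv, install the checkout into it with `pip install -e`, and symlink just the entry point onto your `$PATH`. A runnable sketch (the package name, `main()` function, and all paths here are made up for illustration; in real use you'd point at your actual checkout and `~/bin`):

```shell
set -e
proj="$(mktemp -d)/mypytool"            # stand-in for the real checkout
mkdir -p "$proj"
cat > "$proj/setup.py" <<'EOF'
from setuptools import setup
setup(name="mypytool", py_modules=["mypytool"],
      entry_points={"console_scripts": ["mypytool=mypytool:main"]})
EOF
printf 'def main():\n    print("hello from mypytool")\n' > "$proj/mypytool.py"

venv="$(mktemp -d)/venv"
python3 -m venv "$venv"
"$venv/bin/pip" install -q -e "$proj"   # editable: edits take effect on next run

bindir="$(mktemp -d)"                    # stand-in for ~/bin (already on PATH)
ln -s "$venv/bin/mypytool" "$bindir/mypytool"
"$bindir/mypytool"
```

The symlink is enough because the generated script's shebang points at the venv's own python; since the install is editable, pulling new commits is all it takes for the symlinked command to run the latest code.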

This works for me. But I am crazy.

- Install Python through Brew.

- Upgrade pip using pip. Install all further libraries using pip. Never install Python packages through brew.

- Use a plain virtualenv per project/repo.

A long time ago I wrote some bash that I keep in my `.bashrc` that overrides the `cd` command. When entering a directory, my little bash script iterates upwards through the parent directories and activates any virtualenv it finds. I also override the `virtualenv` command to automatically activate the env after creating it.
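For the curious, a minimal version of that idea (my own sketch, not the parent's exact script) could go in `.bashrc` like this:

```shell
# Override cd: after changing directory, walk upwards and source the
# first virtualenv activate script found.
cd() {
  command cd "$@" || return
  local dir="$PWD"
  while [ "$dir" != "/" ]; do
    if [ -f "$dir/bin/activate" ]; then
      . "$dir/bin/activate"
      return
    fi
    dir="$(dirname "$dir")"
  done
}
```

Note this only activates; a fuller version would also deactivate when leaving a project tree.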

I am aware there are existing tools out there that do this. Pyenv does this right? But I never got used to them and this keeps things simple enough for me. I cannot forget to enter a virtualenv.

As I said, I am probably nuts. I also don't use any form of IDE. Just plain VIM.

Doing things differently doesn't make you nuts.

Do you mind sharing your dotfiles? This is really interesting to me.

Sure! I have a Gist here with this little snippet:


Hope it proves useful :)

To add to all these comments:

For the specific use-case of installing executable Python scripts, pipx [0] is the way to go. It creates a virtual environment for each package, and lets you drop a symlink to the executable in your `$PATH`. I've installed 41 packages [1] this way.

[0]: https://pipxproject.github.io/pipx/

[1]: https://gitlab.com/Seirdy/dotfiles/blob/master/Executables/s...

Note that my comment is in reference to Python3, since Python2 will not be maintained past 2020.

If you’re on a Mac, just use the brew installation (brew install). If you’re on some type of prod/containerized setup, use apt’s python (apt-get install).

I would not recommend building Python from source unless you _really_ know what you’re doing on that level, as you can unintentionally shoot yourself in the foot quite a bit. From there just using a virtualenv should be pretty straightforward.

In this way, you’re letting the package managers (written by much smarter people than you and I) do the heavy lifting.
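The virtualenv step really is straightforward; with Python 3 the stdlib `venv` module is all you need (sketch, using a throwaway directory in place of a real project):

```shell
cd "$(mktemp -d)"                           # stand-in for a project checkout
python3 -m venv .venv                       # create the environment
. .venv/bin/activate                        # activate it
python -c 'import sys; print(sys.prefix)'   # prefix now points inside .venv
```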

I work on OSX and Ubuntu (different workstations, same setup scripts) and python setup has been frustrating for me. If brew updates the version of python, suddenly all of my virtualenvs seem to break.

What I've settled on is to use a python version manager (pyenv is the least intrusive balanced with most usable) and using direnv to create project-specific python environments. Adding `use python 3.7.3` to an `.envrc` in a directory will make it so cd-ing into it will create a virtualenv if it doesn't yet exist and use the pyenv-installed python at 3.7.3.
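Worth noting: `use python` isn't in direnv's built-in stdlib; it's typically a small helper defined in `~/.direnvrc`. A sketch of such a helper (assuming pyenv-installed interpreters live under `~/.pyenv/versions`; `load_prefix` and `layout python` are real direnv stdlib functions):

```shell
# ~/.direnvrc
use_python() {
  local version="$1"
  local root="$HOME/.pyenv/versions/$version"
  [ -d "$root" ] || { echo "python $version not installed" >&2; return 1; }
  load_prefix "$root"               # direnv stdlib: prepend $root/bin to PATH
  layout python "$root/bin/python"  # direnv stdlib: create/activate a venv
}
```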

Can someone explain the benefit of using a separate env for each project? Why not just use the default env for all your projects? Does it have something to do with the ops part of devops?

Nothing to do with DevOps, more to do with keeping your different projects separate with their own dependencies. That way, if you need to move the package to another machine, you can just do a `pip freeze` to document which packages you need, then `pip install` them on the new machine.

If you just want a sandbox area where you'll be developing lots of little scripts and bits and pieces, there's nothing stopping you from having a single environment that you keep adding to as you go. You just miss out on the portability aspects mentioned above and potentially risk getting conflicts over time as the number of packages you install grows and grows.
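Spelled out, that round trip is just the following (demoed here in a throwaway venv so it's self-contained; in practice you'd run it in your project's environment):

```shell
cd "$(mktemp -d)"
python3 -m venv .venv && . .venv/bin/activate
pip freeze > requirements.txt     # record every installed package, pinned
pip install -r requirements.txt   # replay the same set on another machine
```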

You might need to make sure you’re using the same version of Python and project dependencies as you have deployed somewhere. Those versions might be different than your system version.

Keeping your project dependencies isolated also encourages you to document your dependencies (e.g. in requirements.txt), which is essential for reproducibility. If you have every package from every project installed globally then you tend to end up having issues when you or someone else tries to install a project in a different environment (the classic “I don’t know what’s wrong on your machine but it works fine on mine”).

You might need to use different versions of python. That will require different versions of some packages. And you need to prevent conflicts.

The bigger the venv gets, the more likely you are to get version clashes - unless all your projects can use the same version of each dependency. Even then, it will tend to grow without bound.

- I use the brew python. Mostly for web dev, other times for pandas/numpy

- pip + requirements.txt I find is more than acceptable

- wheels > source when it comes to installing / distributing packages

I install the latest Python 3, PyEnv & PyEnv-Virtualenv through Homebrew (the latter 2 require a Python to run, so the Homebrew-installed Python 3 is used only for those), then use PyEnv to manage additional Python versions & virtualenvs.

I was a huge fan of pipenv in the beginning but it seems to have stagnated as 2 years later it's still slow as hell. I now deal with pip & pip-compile to pin versions.

I just use pyenv everywhere (Mac, Linux, Windows/WSL) and install Anaconda through that. Simplest and most uniform setup possible.

I have created the following function in ~/.bash_functions

I use it as pvenv project_dir_name

function pvenv() {
    mkdir "$1"
    cd "$1" || return
    python -m venv .
    source ./bin/activate
    pip install --upgrade pip setuptools
}

I work on Mac and on Linux. I used to use macports for python (sorry, I'm opinionated, I won't touch brew), along with either venv or virtualenv. But I've found that working within the Anaconda system for python has been the best experience overall. I've found conda environments to be a pretty clean solution.

Docker: your Dockerfile has all you need, pinned to very specific versions... why would you waste time on other stuff?

I install the macOS distribution from python.org/downloads/. I use that for general Python 3 projects and it's been working great.

Separately I also use anaconda if I need all the scientific packages + Jupyter etc.

    brew install python
    pip install pyenv
    pyenv install <version>
    pip install virtualenv-wrapper
    mkvirtualenv --python=/path/to/python/bin foo

Repeat the last step for each project.

I previously used conda, now switching away from it as I noticed pretty bad dependency decay. Super frustrating when it can’t rebuild an environment from just two years ago.

1. pyenv - for managing multiple python versions

2. pipenv (with PIPENV_VENV_IN_PROJECT env var set) for managing project dependencies / venv

That's basically about it.

Not an exact answer to your question, but another data point: I use PyCharm. It handles everything pretty seamlessly. I really like it.

Have you tried Poetry? https://poetry.eustace.io/

Here's my simple Python list:

1. Pick Python binary / Official Docker Image

2. Use virtualenv for Developer Boxes

3. Well-defined Pipfile / requirements.txt

4. Avoid most binary package building. Prefer packages with ready-made wheels. But if something has to be built, I prefer to make static libraries, use them to make static ELFs / modules (as opposed to /path/to/some.so), and bundle / ship them as wheels for both development and production.

Again, you can set up simple Dockerfiles / scripts to do this for you.

Thanks, so what you're saying is that you can cook up the "proper" environment from any binary source, right? Do you have or know of an example for doing this? Do you do it uniquely for every project?

I do this. I do it uniquely for every project.

These reasons:

- I can't use one version of a binary distro for production, because fixes keep coming.

- Every company / team I work with uses a different *nix distro, so the initial setup can look different, esp. when using bundled Pythons.

- Irrespective of that, the final outcome should be normalized and sane. This is where virtualenv, tox - these kinds of things help.

- Source standardization is quite easy with Pipfile, so I pretty much rely on that mechanism.

- I have enough experience building libraries (static and dynamic) and am quite aware of ELF and PE issues; sometimes custom solutions are needed. I avoid them as much as possible, but sometimes they are needed. Why?

Maybe you don't have an Intel CPU (so CPUID-based checks break), or maybe you want to run TF under ROCm and not CUDA, or you're trying to compile a new shared library for an RPi because nobody builds armv7l wheels unless they use a Pi, or you need to link against OpenSSL/libxml for that WSGI-based project - and keep runtime/image sizes small on a container.

In these specific cases, more often than not, stay away from anyone who promises you that distro X would / has always worked for them.

pipenv works well enough for me; it creates a lock file so you get deterministic builds. Plus it's supported by Heroku if you ever decide to host with them. https://github.com/pypa/pipenv

pyenv for Python versions, and then a venv per project to isolate its packages from other projects. But to this day I struggle with polluted python/3 and pip/3 on my bash PATH thanks to the native Mac install, brew, and pyenv versions.

Obligatory xkcd: https://xkcd.com/1987/

At that rate, why not just install the toolchain and compile from source?

I install Python via PyEnv; then for every project, I use PipEnv.

Just use pyenv. Works everywhere.

pyenv/virtual environments


This, to explicitly choose which packages to install per environment.

And, beyond numpy/scipy (where the conda versions might be better optimized), I often pip-install packages from PyPI into those conda environments.
