

Guido van Rossum: People want CPAN (discussion on Python packages) - fserb
http://thread.gmane.org/gmane.comp.python.distutils.devel/11359

======
amix
A few weeks ago I argued that language distribution platforms should be based
on a distributed version control system (like Mercurial or Git) and have a
user-friendly web interface (like BitBucket or GitHub). Reputation (like the
one found in Stack Overflow) should also be part of the platform. Anyhow, read
more here if you are interested: <http://amix.dk/blog/viewEntry/19475>

~~~
steveklabnik
Nice post. I think this is a really cool idea in general. A few of us on one
of my projects have been thinking recently about a package manager based on
git, where you could literally merge in feature branches or patch branches as
you wanted...

~~~
n8agrin
While a nice idea, it would be worth finding out why github abandoned this
very concept in favor of gemcutter.org before diving in head first. I trust
the github guys as being far more competent than most in all matters of
distributed source control especially when it comes to package management.

~~~
pjhyett
The idea is sound, but it was a distraction for us, so we were more than happy
to offload rubygems to a dedicated service that could give it the attention it
deserves.

------
staunch
Both the CLI CPAN module/tool itself and <http://search.cpan.org/> are awesome
to work with. I think every language should shamelessly copy them first, and
try to improve upon them second.

It almost seems like Ruby/Python have been consciously reluctant to do so.

~~~
jrockway
CPAN's success, as with Debian's, is not due to the technical infrastructure
but rather the social infrastructure. There are plenty of ways to make
packages not work at all, it's just that the Perl people don't do that.

(If you have ever read the CPAN source code, you would be surprised it works
at all. Don't get me started on the various incompatible build systems, and
what happens when your module's build system depends on a newer version of the
build system.)

Haskell's Cabal is the technical model to steal. Don't let modules execute
their own code unless they actually need to. 99.9999% of modules do fine with
some sort of declarative interface, rather than actual code to do those
things.
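
To illustrate the declarative point, here is a minimal Python sketch, not Cabal's actual format. The field names, `PACKAGE`, `validate`, and `index_entry` are all hypothetical. The idea is that a tool can check and index a package by reading plain data, without running anything the package author wrote.

```python
# Hypothetical declarative package description: plain data, no executable code.
# The field names are assumptions for illustration, not any real tool's schema.
PACKAGE = {
    "name": "example-pkg",
    "version": "0.2",
    "depends": ["base-lib >= 1.0"],
    "modules": ["example.core", "example.util"],
}

REQUIRED_FIELDS = {"name": str, "version": str, "depends": list, "modules": list}


def validate(description):
    """Return a list of problems found in a declarative description."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in description:
            problems.append(f"missing field: {field}")
        elif not isinstance(description[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


def index_entry(description):
    """Build an index record by reading data only; no package code is run."""
    return (
        description["name"],
        description["version"],
        tuple(description["modules"]),
    )
```

Because the description is only data, an indexer can answer "which modules does version 0.2 provide?" without evaluating anything.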

~~~
chromatic
The CPAN works despite clunky code and competing implementations because the
functional decomposition is sufficiently effective. (EUMM is some of the worst
code in intent, design, and implementation I've ever read and used anyway.)

~~~
jrockway
The CPAN software infrastructure only works for things that it was intended to
do, it is very hard to make it do arbitrary things. If you want to find out
what file to download from the BackPAN to get Foo::Bar 0.2, that's too bad,
you have to download the BackPAN and index it yourself. Of course, there is no
way to index things without evalling code regexed out of every file. (And
there is no way to predict if "make install" will actually install Foo::Bar.)

Something CPAN couldn't easily do a few years ago was install to arbitrary
directories. If you set the right environment variables, EUMM would sort of do
the right thing. If you set different environment variables, sometimes MB
would do the right thing. Eventually EUMM and MB were patched so that it
almost always worked, and then local::lib was written to paper over the
differences.

And of course, nothing requires the use of EUMM or MB, so if a package doesn't
use it, you can't install it to your home directory.

Anyway, the EUMM is what you get when you write code to fix problems that
people complain about. MB is what you get when you write specs to "fix"
problems people complain about. Maybe someday we will have a build system that
has a sane design and actually works.

------
marcusbooster
I find package systems fantastic for things I don't want to care about, but
get frustrated when they handle the things I do care about.

Or maybe that's just my own experience using Common Lisp on Ubuntu.

------
adw
The scientists he's talking about are smart people, but aren't really into
computers and (what's more) have no patience at all for the amount of pain it
takes to compile the dependencies Python packages need.

It's not the Python side of things that's the problem. Numpy and Scipy (which
nearly all Python scientific software depends on) are based on bindings to
both C and Fortran, and getting those library ducks in a row - especially on
Windows or MacOS X; on Linux your package manager does it for you - is a pain
for someone who knows what they're doing and almost impossible for someone who
doesn't.

Plus, well, most scientists outside physics use Windows. If you need a
command-line tool you've already lost in that respect. What you're competing
against is often Excel.

The battle over Numpy being in core Python has been fought and lost, but short
of that level of integration, I don't see much that can effectively be done
about this.

~~~
cdavid
The problem is more complicated than just C code being more difficult to
build.

The whole distutils infrastructure is messy and badly designed. It takes care
of everything from build up to installation and packaging, and all those parts
are tightly coupled. It is also incredibly inflexible, and the way to extend it
through subclassing leads to incompatible code (if package A subclasses
distutils, and package B subclasses the same thing, how can you use A and B?).
Almost every design decision of distutils is wrong, and badly implemented.
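
The subclassing conflict can be sketched in a few lines of Python. This does not use the real distutils API: `BuildCommand`, `FortranBuild`, `CodegenBuild`, and `registry` are hypothetical stand-ins, and the sketch assumes a tool that looks up one command class per name.

```python
class BuildCommand:
    """Hypothetical stand-in for a build command that a tool looks up by name."""

    def run(self):
        return ["compile C sources"]


class FortranBuild(BuildCommand):
    """Package A's extension: adds a Fortran step on top of the base build."""

    def run(self):
        return super().run() + ["compile Fortran sources"]


class CodegenBuild(BuildCommand):
    """Package B's extension: rewrites run() for its own needs."""

    def run(self):
        return ["generate sources", "compile C sources"]


# Only one class can be registered per command name, so the last writer wins.
registry = {"build": BuildCommand}
registry["build"] = FortranBuild  # package A installs its subclass
registry["build"] = CodegenBuild  # package B overwrites it

steps = registry["build"]().run()
```

After both extensions are registered, `steps` contains package B's steps only; package A's Fortran step is silently gone, because neither subclass knows about the other.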

Numpy and scipy binaries are built for every release: actually, that's the
platform we support the best in some sense since we can reliably build
binaries, and that saddens me quite a bit.

~~~
adw
You guys do an amazing and thankless job, and I'm sorry if I oversimplified
that aspect of it. It's just difficult to see how to solve the problems you
describe, let alone to explain those problems to a researcher you're trying to
wean off Fortran. :)

------
ableal
Debian's 'synaptic'. Help and leverage it to other platforms, if needed. Less
pain all around.

~~~
blasdel
Please god no.

Apt itself is barely good enough to handle libraries written in C, much less a
dynamic language with multiple potentially-incompatible runtimes, and Debian's
policies are dead set against making anything remotely wholesome:

    
    
      * As an author, affected middlemen have a stranglehold on easy distribution
      * License wankery (fuck debian-legal)
      * Teenagers randomly patching upstream software without review
      * Shipping non-standard configurations, often with features randomly disabled
      * Rearranging everything to fit their naive 'filesystem hierarchy'
        (this completely fucks up a decent packager like Ruby Gems)
      * Breaking off features into separate packages whenever possible
      * Shipping ancient versions of software with a selection of patches picked
        specifically to introduce no features, just cherry-pick 'bug-fixes'
      * Shipping multiple versions of a runtime with mutually-exclusive depgraphs
      * FUCKING RELEASE FREEZES
        There's no goddamn reason for any non-system software to be frozen ever
    

Ubuntu is making a decent stab at unfucking all this (at least on their turf)
with PPAs: <https://help.launchpad.net/Packaging/PPA>

~~~
DarkShikari
These are all problems with Debian's _use_ of synaptic; the program itself is
a very good package manager.

~~~
nailer
Actually I believe a few are problems with the parent poster: "* Rearranging
everything to fit their naive 'filesystem hierarchy'(this completely fucks up
a decent packager like Ruby Gems)"

The FHS is a known standard which works with every language. What specifically
about Ruby makes it unique amongst all other software?

~~~
blasdel
Gems (like NeXT / OS X .app bundles) are self-contained, with the
documentation and data resources alongside the code in a standard way. This
makes it very easy to support having multiple versions of the same software
installed simultaneously, with an optionally qualified import statement to
disambiguate.
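
A rough Python sketch of that idea, not RubyGems' actual implementation: the `INSTALLED` table, the paths, the version numbers, and the `resolve` helper are all made up for illustration. Each version lives in its own directory, and a lookup either takes the newest or an explicitly requested one.

```python
# Hypothetical layout: every installed version sits in its own self-contained
# directory, so versions never overwrite each other.
INSTALLED = {
    "rake": {
        "0.8.7": "/gems/rake-0.8.7",
        "0.9.2": "/gems/rake-0.9.2",
        "0.10.0": "/gems/rake-0.10.0",
    },
}


def parse(version):
    """Turn '0.10.0' into (0, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(name, exact=None):
    """Return the directory for an exact version, or for the newest one."""
    versions = INSTALLED[name]
    if exact is not None:
        return versions[exact]
    newest = max(versions, key=parse)
    return versions[newest]
```

An unqualified `resolve("rake")` picks the newest installed version, while `resolve("rake", "0.8.7")` plays the role of the qualified import.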

The FHS inspires maintainers to large amounts of useless and regressive tedium
in re-separating the peas and the carrots into global piles. It's not so bad
with traditional C libraries, but the brokenness is immediately obvious when
dealing with the libraries of a language that has anything resembling a module
system.

What's specific to Ruby is that their community somehow managed to not fuck up
their packaging medium.

~~~
nailer
Yes, but native package managers already allow multiple versions to be
installed simultaneously.

'What's specific to Ruby is that their community somehow managed to not fuck
up their packaging medium.'

Overwriting global binaries in /usr/bin is pretty fucked to me, and I don't
think I'm alone in that. Say I'm using puppet or OVirt or other Ruby based
system apps - I wouldn't want Gems breaking them. If Python did this (being
the basis for most Linux distros) or Perl did this on older Unix there would
be hell to pay.

------
wavesplash
Perhaps I'm late to the game but why not take a good look at the Gem/Gemcutter
Ruby packaging and distribution system?

The Ruby folks have done a few iterations of packaging systems and have pretty
much nailed it.

Might be worth studying the whole 'gem' system (discovery, distributed
publishing, versioning, dependencies, uninstall, etc).

<http://www.gemcutter.com>

~~~
aceofspades19
s/com/org

------
blue1
Common Lisp needs something well structured like CPAN too. The situation with
clbuild, mudballs (deceased?), asdf-install etc. is rather confusing IMHO.

