
It's funny to me that they're making a point that PyPI is better than core, because I actually think PyPI has created a rather crap ecosystem. The non-hierarchical organization of packages, the lack of curation, the lack of packages inheriting past functionality and extending it into more standard functionality, etc. have resulted in a confusing sprawl of packages with duplicate, incompatible, buggy functionality. It's a bit like Linux internals: it's grown haggard over time, isn't organized well, is badly documented, and so it's difficult to pick up and use without stumbling over a decade or more of stale documentation and obsolete software.

Perl has a much better set of modules that extend standard functionality, which, considering how much flak Perl gets for being hard to read, is rather funny. Rather than every new feature being its own independent project, most of the useful modules inherit from a parent and follow the same convention, leading to very simple and easy-to-use extensions. And Perl Core isn't all that great, but it does have some batteries included, and everything else is extended easily and in a more standard manner by CPAN.

Wow, I've had a really opposite experience with CPAN modules. I've overwhelmingly found them to not respect encapsulation (messing with all sorts of global state, not mentioning that they're doing it, and failing to clean up after themselves or even provide the tools to clean up well), be massively inconsistent in their APIs, have messy and hard-to-parse documentation (still better than Python's conventions here, though), and have some really silly hierarchy-related decisions, most of which I suspect stem from inter-maintainer politics and infighting, of which I've observed a large amount.

Sure, I've found some gems on CPAN, but, having worked on Perl, Python, and Java at reasonable scale for a while, I cannot understand all the praise CPAN gets. It's the worst-quality scripting language package ecosystem out there. Even NPM does a better job, and some things about NPM are awful. CPAN might have been the first/only/best package manager for a get-shit-done scripting language at some point, but not anymore.

Separately, I agree about modules which extend language functionality (e.g. class systems, async programming, runtime typing) specifically. Perl does pretty well in that area. While many of those language-extension modules really don't play well with any other metaprogramming tools being installed in the project, I don't imagine that any alternatives in other languages do, either. My main beef above is with "simple" (read: not pervasive semantics changes) modules like IPC utilities, HTTP clients, or loggers that don't know how to stay in their lanes.

What functionality would you like to have on PyPI, in addition to curation?

The big "function" I would like is just organizing the packages differently to get people to think about and use them differently.

Search engines are a "cool" technology that has become the de facto way to find what you're looking for. But if there's a lot of content related to what you're looking for, they can suck.

Go to PyPI and search for "semantic version". 10,000+ projects for "semantic version" found. As you go through page after page of different modules related to versioning, the one module you won't find immediately is Versio (https://pypi.org/project/Versio/), a well-documented and useful module which I ended up using. I have no idea how I found this module, but it certainly wasn't from PyPI's search engine.

Now go to CPAN (really metacpan) and search for "semantic version". Yes, you're still looking at thousands of results - but wait! There are only two modules here that look useful: Version::Dotted::Semantic, and SemVer. And the description comes straight from the docs' README, rather than being a short uninformative blurb. The first module, Version::Dotted::Semantic, inherits from a separate module, Version::Dotted, and adds some extra functionality. Not only does the search page give more information about the module, but the hierarchy makes it easier to find (and later extend) useful modules in an intuitive way. Since the base module's functionality is boring, generic, and simple, it's less likely that people will make 20 different versions of it, so it'll be reused more often and thus remain stable for a long time.

A lot of CPAN's module names have sprawled over time and gotten less useful, but there's still a general convention that you name your module as a hierarchy of what it does (even if it's kind of verbose) and make small, reusable modules, rather than giant modules that are hard to extend. Not all modules measure up to this standard, and there's definitely room to improve, but I think Python modules could benefit greatly from a system like this.
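To make the hierarchy idea a bit more concrete, here's a rough Python sketch of the "small generic base, thin extension on top" pattern. The class names DottedVersion and SemanticVersion are made up for illustration; this is not the API of Versio, Version::Dotted, or any real package, just one way the same layering could look.

```python
from functools import total_ordering


@total_ordering
class DottedVersion:
    """Hypothetical base class: a generic dotted version such as "1.2.3.4"."""

    def __init__(self, text):
        # Store the numeric parts as a tuple so comparison is element-wise.
        self.parts = tuple(int(p) for p in text.split("."))

    def __eq__(self, other):
        if not isinstance(other, DottedVersion):
            return NotImplemented
        return self.parts == other.parts

    def __lt__(self, other):
        if not isinstance(other, DottedVersion):
            return NotImplemented
        return self.parts < other.parts

    def __hash__(self):
        return hash(self.parts)

    def __str__(self):
        return ".".join(str(p) for p in self.parts)


class SemanticVersion(DottedVersion):
    """Hypothetical extension: exactly MAJOR.MINOR.PATCH, plus a bump helper."""

    def __init__(self, text):
        super().__init__(text)
        if len(self.parts) != 3:
            raise ValueError("expected MAJOR.MINOR.PATCH, got %r" % text)

    @property
    def major(self):
        return self.parts[0]

    @property
    def minor(self):
        return self.parts[1]

    def bump_minor(self):
        # A minor bump resets the patch number to zero.
        return SemanticVersion("%d.%d.0" % (self.major, self.minor + 1))
```

The subclass only adds the three-part rule and the bump helper; parsing, comparison, and printing all come from the boring base class, which is the part you'd expect to stay stable and get reused.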

As far as curation goes, PyPI is often filled with cruft. While searching for Jenkins packages, you will come across lots of entries like this: https://pypi.org/project/jenkins2api/. The homepage leads to a GitHub 404, it's only ever had one release, and it has no documentation. This project should probably not have been listed on the main search page, or it should at least have been sorted well down the list by default with intelligent filters and marked accordingly. (The "date last updated" and "trending" sorting options just result in virtually no Jenkins-related modules showing up in the results at all.)
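As a sketch of what "sorted well down the list by default" could mean, here's a tiny Python example that pushes likely-abandoned packages below the rest. The PackageInfo fields and the scoring are my own assumptions for illustration, not anything PyPI's search actually exposes or does.

```python
from dataclasses import dataclass


@dataclass
class PackageInfo:
    # Hypothetical metadata; field names are assumptions, not PyPI's schema.
    name: str
    release_count: int
    homepage_ok: bool
    has_docs: bool


def cruft_penalty(pkg):
    """Count the warning signs described above (higher means more suspect)."""
    penalty = 0
    if not pkg.homepage_ok:
        penalty += 1
    if pkg.release_count <= 1:
        penalty += 1
    if not pkg.has_docs:
        penalty += 1
    return penalty


def demote_cruft(results):
    # sorted() is stable, so the original relevance order is preserved
    # among packages that have the same penalty.
    return sorted(results, key=cruft_penalty)
```

With this, a package with a dead homepage, a single release, and no docs gets the maximum penalty of 3 and lands at the bottom, while everything else keeps whatever order the search engine already gave it.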
