

Trouble in Node.js paradise: The mess that is npm - mikl
http://mikkel.hoegh.org/blog/2011/12/20/trouble-in-node-dot-js-paradise-the-mess-that-is-npm/

======
autarch
This is pretty funny for us Perl folks. People have been making this complaint
about CPAN for many, many years.

It's a valid complaint, but all of the alternatives seem to be much worse.

I think the idea of using various metrics to pick the best options is a good
one, and something we've been pursuing in Perl-land for many years (kwalitee,
CPAN Ra(n)tings, metacpan.org, cpantesters, and more).

~~~
mcantelon
Evolution works; central planning doesn't. CPAN was a great stride in open
source culture.

~~~
gizzlon
And it's still being improved upon: metacpan.org, for example, is a nice
improvement.

------
amix
I hope that the node.js community will embrace even more democracy and even
more choice when handling the "standard library" and the "external library".
Here is what I wrote 2 years ago about the ideas for a future language that
embraces democracy instead of tyranny.

What I can see is that npm relies heavily on GitHub and the tools that GitHub
provides (such as wikis and documentcloud). Personally, I think this kind of
integration could lead to a revolution, since it makes collaboration much
easier (it also makes it easier to evaluate reputation of developers,
dependencies etc.)

I am a Python developer and I think npm is already a MUCH better platform than
PyPI given the tight integration with GitHub, and I can only imagine what the
future brings.

===

The idea of the social platform is to create a platform where developers can
collaborate and a platform that promotes quality software. CPAN, Ruby Gems and
Python Package Index are early versions of this vision and they need to become
more social to become more useful. The social platform should be applied to
the core libraries as well and not just be used for the external libraries. A
language's standard library should be a democracy where the best and most used
libraries win - and not like now, where libraries win by being selected by a
few dictators!

Essentials of this platform are:

* Easy distribution: It should be easy to push out modules to the platform.

* Easy forking: It should be easy to fork modules, to apply patches and to send patches back. Using something like git or mercurial is a must.

* Reviews and a reputation system: The platform should have reviews, but also reputation, something like Stack Overflow's excellent reputation system. I.e. a system where helpful developers are rewarded for their hard work.

* Search and discovery: It should be easy to find modules you are looking for and to compare modules. If I am looking for a template library, then I might want to sort template libraries by how many others are using them or by a developer's reputation.

* Fully integrated into the language: This platform should be fully included in the language and ship with the language. The signup process to get on the platform should be very easy.

* Trac-like features included: This platform should include tickets, a timeline of changes and a basic wiki.

Most of these pieces are partially implemented, especially in products like
GitHub and BitBucket. What is needed is a much better integration with the
languages. The bottom line is that languages should embrace "social
programming" and implement a platform similar to GitHub/BitBucket that makes
collaboration easy for developers.

from <http://amix.dk/blog/post/19475>

~~~
masklinn
> Essentials of this platform are:

I think most of your ideas are wrong, and are in fact what _hamper_ PyPI: the
desire to have everything built in.

You mention CPAN. The CPAN ecosystem has most of the features you describe,
but virtually none of these come from CPAN itself. Instead, they come from
what people have built on top of and alongside CPAN. CPAN itself mostly hosts
package data and metadata and provides the core on which everybody else can
build useful services (search engines, bug trackers, comment systems,
testing/CI systems, CPAN.pm/CPANPLUS, etc.).

PyPI has made huge progress in the "search & discovery" category not from
improvements to PyPI but from `pip` being separately created. PyPI tried to
add reviews and reputations; it stank and was removed.

> A language's standard library should be a democracy where the best and most
> used libraries win

No, a standard library is a core of broadly useful batteries. Just because
"Rails" or "LXML" are very popular does not mean they belong anywhere near a
standard library.

------
substack
Perhaps you could just be more narrow in your searches with `npm search`?
Perhaps instead of searching for "asset management" you could just search for
a tool that does browser javascript bundling and then search for another tool
that does css bundling. There are fewer of each of those and you can refine
your search further by looking for specific things like bundlers that do
node-style require()s versus AMD-style requires.

Plus, on <http://search.npmjs.org> you can see how many people have starred a
project. You can star a project yourself with `npm star`. For a lot of
projects you can click the home page or github links from the project page
too. These are really useful to get a quick glance of what the API is like. If
a project doesn't have either of these then it probably isn't worth using.

~~~
mikl
Thanks for the tips – I do look forward to the stuff that Isaac and friends
are cooking up with more metrics, but I'd still like to see more collaboration
and less duplication :)

~~~
malandrew
You may also want to check out the project pages for the most recent Node.js
Knockout. Those pages often list all the modules used by the teams.

To find modules worth knowing about, I started by looking at the modules used
by the winning teams.

In fact, the best way to get into any community and know what projects matter
is to find out who matters and then follow them to the projects that do
matter.

------
murz
<https://www.ruby-toolbox.com/> is one way the Ruby community solves this. I'm
not a huge fan of their font choices, but the interface provides an easy way
to see which gems in a given category are the most popular or most active.

For example, if we were looking for an asset management package like the OP,
we would quickly see that Jammit is the most widely used:
<https://www.ruby-toolbox.com/categories/Asset_Management>

~~~
davej
By the way, it's not as fully featured, but there's also <http://toolbox.no.de/>

------
malandrew
I agree with modeless that duplication of effort is a non-problem. AFAIK,
the Node community is taking a different, more scalable approach to community
management that is painful short-term (especially for those looking for a
Rails, Drupal, Django experience) but more valuable and scalable long-term.

Curation creates bottlenecks, single points of failure and is very subjective.

The fact that members of the community are thinking about objective ways to
measure modules is better for the community long-term. Quality of the code in
packages is important because it suggests that the module will be more
maintainable and extensible long-term. But that is just one factor.

I hope the community leverages weighted social proof as a way to suggest which
modules are fittest and should survive and prosper. Those developers who are
most active in the community and contribute the most are also people that ship
code and rely on the code of others in the community. One of the best ways to
determine which modules to use would be by weighted popularity, where the usage
of a module by someone of importance in the community carries more weight than
usage by a non-contributor.

Basically, the idea of number of "watches" and "forks" in github needs to be
taken to its logical conclusion because not all watches and forks are created
equal.
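
As a rough illustration of the weighted popularity idea above, here is a minimal sketch in Python. The user names, module names and weights are all made up for the example, and how a contributor's weight would actually be derived is left open; this only shows that summing weights can rank modules differently than counting users.

```python
def weighted_popularity(users_by_module, contribution_weight, default_weight=1.0):
    """Sum the weights of each module's users instead of just counting them."""
    return {
        module: sum(contribution_weight.get(user, default_weight) for user in users)
        for module, users in users_by_module.items()
    }


# Hypothetical data: which users rely on which module.
users_by_module = {
    "module-a": ["alice", "bob", "carol"],  # three users with no recorded contributions
    "module-b": ["dave", "erin"],           # two heavy contributors
}

# Hypothetical weights; anyone not listed gets default_weight.
contribution_weight = {"dave": 5.0, "erin": 4.0}

scores = weighted_popularity(users_by_module, contribution_weight)
ranking = sorted(scores, key=scores.get, reverse=True)
```

With these made-up numbers, `module-b` ranks first even though `module-a` has more users, which is the "not all watches and forks are created equal" point.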

I would imagine that these are the kinds of issues the node community is
considering as they try to come up with a scalable, objective way to manage
what succeeds.

Personally I think this is a much better solution than the approach in other
communities where certain library/technologies are foisted upon you under the
auspices of convention over configuration. Some communities are no longer just
defining conventions that are widely adopted, but are picking winners and
losers among newer technologies instead of allowing time for the community
to decide.

The community and open source projects around Node.js are growing too fast to
subject it to curation and expect the truly great projects to emerge
naturally.

As an endnote, the Drupal community and node.js community are fundamentally
different in their approach. Drupal is an entire platform and framework, where
many of the decisions of which module to use are made for you and you accept
them or have to hack away to change that decision. This works in the Drupal
community because by and large the problem space Drupal addresses, content
management, is much smaller than that of Node.js: any problem that is better
solved with asynchronous non-blocking I/O.

~~~
substack
I discussed this with isaacs in one of the nodeup episodes and we concluded
that a pagerank implementation for authors and projects that reads in the npm
stars and dependency graph might work. It would also be useful to consider the
github watchers, elapsed time since the last update, and the presence and
status of tests, perhaps by ingesting data from travis-ci.

Edit: turns out what I have in mind nearly already exists at
<http://eirikb.github.com/nipster>
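
For what it's worth, the dependency-graph half of that idea can be sketched in a few lines of Python. This is plain power-iteration PageRank under the assumption that a package "votes" for each of its dependencies; the package names are invented, and stars, watchers, update recency and test status are not modelled here.

```python
def pagerank(depends_on, damping=0.85, iterations=50):
    """Power-iteration PageRank where each package 'votes' for its dependencies."""
    nodes = set(depends_on)
    for deps in depends_on.values():
        nodes.update(deps)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Packages with no dependencies spread their rank evenly over all packages.
        dangling = sum(rank[node] for node in nodes if not depends_on.get(node))
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for pkg, deps in depends_on.items():
            for dep in deps:
                new_rank[dep] += damping * rank[pkg] / len(deps)
        rank = new_rank
    return rank


# Hypothetical dependency graph: package -> list of packages it depends on.
depends_on = {
    "app-one": ["util", "parser"],
    "app-two": ["util"],
    "parser": ["util"],
    "util": [],
}

ranks = pagerank(depends_on)
top = max(ranks, key=ranks.get)
```

The same loop would work for authors instead of packages; the extra signals would probably enter as per-node weights, but that part is a guess.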

~~~
malandrew
That's basically what I had in mind too -> edgerank for libraries. I reckon
you'd need to have some sort of decay function that is relative to the amount
of activity in that problem domain.

In inactive problem domains (such as linting, which these days sees few
commits and has almost no competition), you wouldn't need as strong a decay
coefficient. Lack of activity suggests a solved problem or something that is
no longer a problem.

In highly active areas such as asset management (ender.js, browserify,
require.js, etc.), you'd probably need some sort of coefficient for that
problem domain. Tagging could be used to strictly or loosely assign packages
to a problem domain.
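
One way to picture a decay that is relative to domain activity, as a sketch only: exponential decay whose half-life shrinks as the domain gets busier. The inverse-proportional formula, the one-year base half-life and the activity numbers are all assumptions picked for illustration.

```python
import math


def freshness(days_since_update, domain_commits_per_month, base_half_life_days=365.0):
    """Exponential decay whose half-life shrinks as the problem domain gets busier."""
    # Assumption: half-life is inversely proportional to (1 + domain activity).
    half_life = base_half_life_days / (1.0 + domain_commits_per_month)
    return math.exp(-math.log(2) * days_since_update / half_life)


# Same 180 days of silence, judged against two hypothetical domains.
quiet = freshness(days_since_update=180, domain_commits_per_month=0.5)   # e.g. linting
busy = freshness(days_since_update=180, domain_commits_per_month=20.0)   # e.g. asset management
```

Six months without a commit barely hurts a package in the quiet domain, while in the busy one its score drops to nearly zero.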

Acceleration is another issue worth considering: how quickly is a project
being adopted among those that matter?

Anyways, it's not a trivial problem to solve by any means once you get around
to measuring social aspects surrounding a module, but it certainly is a step
in the right direction. It's also a problem whose solution can be continually
refined.

TBH, I reckon that any refined system is going to look increasingly like a
financial market where certain behaviors are analogous to actions like put and
call options.

It'd be awesome to have a smidgen of transparency into private repositories in
the form of aggregated grepping of the require() statements in active private
projects. You could go even farther and look at the number of method
invocations of a particular library.

It would be cool to be able to view such data in the same way Google Trends
works. For example, it'd be interesting to compare optimist, nopt, commander
and nomnom.

(going to stop now because I'm just rambling now. hehe)

~~~
dkubb
Another idea I haven't seen mentioned: a project that builds properly via
travis-ci.org or some other CI system shows that it is actively maintained and
passing an automated build. Perhaps other metrics like code coverage could be
used as well.

------
minimax
With respect to the OP's specific problem, namely serving up compressed static
assets, why use Node.js at all? They're static files. Host 'em on a CDN or let
lighttpd take care of it. There's no need to add the complexity of Node to a
relatively simple problem.

------
jashkenas
Check out:

<http://toolbox.no.de/>

<http://search.npmjs.org/>

~~~
secoif
The problem is that neither of these tools gives you useful metrics up-front: I
want to compare the number of watchers/forks/updates/dependencies, but instead
I have to wade through each result item to make a decision.

It is open source though, <https://github.com/activesphere/nodetoolbox> so I
guess I could go fix it.

------
AdrianRossouw
I don't believe it's an issue, as node modules tend to be far smaller in scope
and have far fewer side-effects than Drupal modules. I personally find
projects mostly via github. I try to avoid any project that tries to do too
much. The moment it registers a route or renders a page, I'm gone.

There is no 'node.js way' to do things, so you gain the freedom to find tools
that work more closely to how you would like them to, instead of just being
forced to integrate with the existing stuff "because it's there".

There are no holy cows, or 'core that shall not be modified', because
everything is small enough that they are easily forkable/hackable and you can
easily maintain your own forks if you really had to. Npm can even use packages
directly from github.

NPM is a dream compared to drupal.org infra and the (albeit useful) hack that
is drush-make.

------
itay
I understand the author's frustration, but I don't think this is an issue with
npm - rather, it is an issue with the culture.

Most node packages are small - rarely more than a couple hundred lines, and
rarely do more than a single function (and do it well), in terms of
functionality. Given that, it's easy to understand why people tend to reinvent
the wheel - it's fun, exciting, and many times you think you can do it better.

~~~
mikl
Indeed, the cultural problem is one of the main points of the last section of
the post :)

------
modeless
Duplication of effort is a non-problem, and any attempt to "fix" it would
reduce contributions and discourage innovation. With good metrics the best
solutions will rise to the top over time.

~~~
astrodust
The problem is metrics. How is this problem addressed?

Ruby and Python seem to suffer the same problem with a lack of feedback on the
quality of offerings.

~~~
wycats
In Ruby, the Ruby Toolbox does a good job of providing reasonable, automatable
metrics. <https://www.ruby-toolbox.com/>

------
JoeAltmaier
Hey, the beauty of great support libraries is, it's a joy to build on top of
them. So lots of people do.

Collaborate? Makes sense if you're doing it for a job. But for fun? A hobby?
To learn? No, then you have to do it yourself, that's the whole point.

Maybe npm could organize some kind of 'social score' for how often projects
are used or how many '+'s they get or something.

But let's not discourage innovation, or hacking, or re-inventing something
already solved just for the joy of it!

------
shapeshed
Isn't this just that node is a newish framework? Discoverability and metrics
of quality are issues but I think many devs are excited by node because it is
green field. More libraries are good in the long term as the best ones will
rise to the top. npm itself solves dependency management really well, but I
agree currently a feature like npm star doesn't really work. I'm not sure npm
should be solving this problem though. The most valuable data comes from
GitHub on modules.

------
krmmalik
While I think the author has some valid points, I wonder if the concerns are a
little premature. I'm still learning Node, so not qualified to comment on many
things, but I do feel that the approach the core contributors have taken with
Node has been a pretty mature one so far.

I'm sure they'll figure out a decent solution soon enough.

------
tmcw
Nah. Making a nicer, more informative npmjs.org is totally doable and likely
an eventuality.

Making a package manager that only occasionally triggers the rage of picky
programmers is one heck of an accomplishment.

------
maxogden
In practice (for developers that write javascript well) it takes about 10
seconds of glancing through node package code on github to determine quality.

------
zgohr
Talk about an excellent problem to be running into.

~~~
mikl
Yes, this is truly a good kind of problem to have – but a problem
nonetheless :)

~~~
mcantelon
#nodejs irc.freenode.net

If the Github watch count for a Node module can't help you pick, just ask the
hundreds of people who are always in the IRC channel.

In the Drupal world someone who proposes a module that duplicates an existing
module that does things badly can be blocked. That's a bad policy because one
size fits all leads to wearing burlap sacks.

The fact that npm is _not_ a planned economy is brilliant. If you want to add
ratings, etc., npm is backed by CouchDB and therefore ridiculously easy to
build upon. In addition to being able to freely read npm repo info, you can
also extend package.json info with your own fields. I remember when
drupalmodules.com came out, providing a solution for rating Drupal modules
(something drupal.org still doesn't have, AFAIK). The guy who made it got
grief for not making something blessed by the mothership.

As an example of how CouchDB helps make npm awesome, wanna find every field
used in npm package.json files? Boom (thanks to Isaac S for this tidbit)!

<http://registry.npmjs.org/-/fields?group=true>
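
To give a feel for what a grouped field count like that computes, here is a small Python sketch that does the same kind of tally in memory. It does not call the registry, and the response format of the URL above is not assumed; the package documents, including the custom "rating" field, are invented.

```python
from collections import Counter


def count_fields(package_docs):
    """Count how many package documents use each top-level field."""
    counts = Counter()
    for doc in package_docs:
        counts.update(doc.keys())
    return counts


# Hypothetical package.json contents, two of them extended with a custom "rating" field.
package_docs = [
    {"name": "pkg-a", "version": "0.1.0", "dependencies": {}},
    {"name": "pkg-b", "version": "1.2.3", "rating": 4},
    {"name": "pkg-c", "version": "0.0.1", "dependencies": {}, "rating": 5},
]

field_counts = count_fields(package_docs)
```

A field that someone adds on their own, like "rating" here, shows up in the tally next to the standard ones.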

------
krisroadruck
This is the same type of thing that soured me on python. Choice overload.

