
In this case the npm ecosystem is providing more of a surrogate standard library. Imagine if there were no libc, for example, and so people had to reimplement all those functions; would you really want one package per function because of how "Unix philosophy" it would be?

This is where the JavaScript ecosystem is right now -- JS doesn't have the kind of robust standard library other languages take for granted, so you end up with a lot of these things that look like they should be stdlib functions, but are instead third-party packages the entire world has to depend on for a usable programming environment.

I don't understand the standard library argument at all. A standard library is always limited; there is always something that's not included. Then what?

This has nothing to do with the JavaScript standard library, but with the insanity of NPM. There are modules that have even more dependents than `pad-left`, despite doing things the language itself already covers (e.g. https://www.npmjs.com/package/date-now, https://www.npmjs.com/package/is-positive, https://www.npmjs.com/package/is-lower-case).

The argument is that things like formatting a string -- including padding it -- should not be add-ons, they should be built into the core distribution you get as part of the language.
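The function at the center of all this is tiny. A minimal sketch of a left-pad (my own illustration, not the actual npm module's source) looks something like:

```javascript
// Sketch of what a left-pad does: prepend a fill character
// until the string reaches the target length.
function leftPad(str, len, ch) {
  str = String(str);
  ch = ch === undefined || ch === '' ? ' ' : String(ch);
  while (str.length < len) {
    str = ch + str;
  }
  return str;
}

leftPad('5', 3, '0');  // '005'
leftPad('abc', 2);     // 'abc' (already long enough)
```

For what it's worth, ES2017 later added String.prototype.padStart, which does essentially this natively.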

I didn't say it shouldn't be part of the standard library; I am saying that there is always something that isn't part of the standard library. My point is: if this module were something that shouldn't be part of the standard library, what argument would you use then?

I think you're assuming libc is more robust and useful than it actually is. Libc is extremely minimal (and is full of pitfalls as well).

JS has an extremely large and robust standard library in comparison.

That's useful for binaries. For libraries, there tends to be a lot more aggregation. If that weren't the case, you'd see libc spread over dozens and dozens of libraries: libmalloc, librandom, libstring, libarithmetic, libsocket, libprocess, libfilesystem, libafilesystem (hey, async ops have a different API), libtime, etc.

Why would this be a bad thing? I don't need a random number generator if I just want to allocate memory.

Because a lot of little libraries make namespaces more complicated, make security auditing more difficult, and make troubleshooting more difficult, since you have to start digging through the compatibility of a ton more libraries. They also make loading slower, because you have to fopen() a ton more files and parse their contents. Add on top of that those little libraries needing other, probably redundant little libraries, and you can start to see how this can turn the structure of your code into mush really quickly.

At this point, optimizing runtimes to include only necessary functions, and optimizing files to include only necessary functions, are both pretty much solved problems. For example, azer has a random-color library and an rng library. Having both of those in something like azer-random means that someone automatically gets all the dependencies without having to make many requests to the server. This makes builds faster and a lot simpler.

Sometimes, in order to best optimize for small, you have to have some parts that are big. Libraries tend to be one of those things where a few good, big libraries lead to a smaller footprint than many tiny libraries.

Not to mention maintenance. Low-hanging fruit here: left-pad, depended on by fricking everyone, *is owned by one guy*. That is not how you build a serious utility library that people should actually be using in real products. (You also probably shouldn't license it under the WTFPL, but that's another thing.) When you aggregate under a common banner, you get maintainers for free and your project might have a bus factor better than 1.

Some of the things you mention are true, but:

> makes security auditing more difficult

What? If you go all the way, you just review all dependencies too. And if they have a good API, it's actually much easier. For example if your only source of filesystem access is libfilesystem, you can quickly list all modules which have any permanent local state.
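To illustrate, here's a toy version of that kind of audit in JavaScript. The single-filesystem-entry-point premise and the regex check are both simplifications I'm assuming for the sketch; a real audit would need proper parsing, not pattern matching:

```javascript
// Toy audit: given a map of module names to their source code, flag the
// ones that pull in the 'fs' module (i.e. may have permanent local state).
function modulesWithFsAccess(sources) {
  const touchesFs = (src) => /require\(\s*['"]fs['"]\s*\)/.test(src);
  return Object.keys(sources).filter((name) => touchesFs(sources[name]));
}

modulesWithFsAccess({
  'config-loader': "const fs = require('fs'); fs.readFileSync('app.json');",
  'math-utils': "module.exports = (a, b) => a + b;",
}); // ['config-loader']
```

The point is that when filesystem access funnels through one well-known API, the question "which of my dependencies touch disk?" becomes mechanically answerable.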

Splitting huge libraries into well designed categories would make a lot of reviews easier.

> Having both of those as azer-random or something means that someone automatically gets all the dependencies, without having to make many requests to the server.

Also disagree. One-off builds shouldn't make a real difference, and continuous builds should have both a local mirror and local caches.

Yeah, but that's not the world of NPM. It's a clusterfuck of a maze of near-duplicate dependencies with no hierarchy or anything. There's no organization or thought. It's just a bunch of crap tossed together that encourages cargo-cult programming.

Think about the test matrix.

Taking an idea to the logical extreme is an effective means of invalidating said idea. How many UNIX utilities are 17 silly lines long?

A bit of code duplication would go a long way towards bringing sanity to JS land.

Quite a few are pretty small.

Erm, that's not a very good example. You're pointing at an ancient source file from back when Unix had no package management. These days echo.c is part of coreutils, a large package for which it's economical to manage dependencies at scale.

It's interesting to think about the distinction between promiscuous dependencies (as pioneered by Gemfile) and the Unix way. I like the latter and loathe the former, but maybe I'm wrong. Can someone think of a better example? Is there a modern dpkg/rpm/yum package with <50 LoC of payload?

Edit: Incidentally, I just checked and echo.c in coreutils 8.25 weighs in at 194 LoC (excluding comments, empty lines and just curly braces). And that doesn't even include other files that are needed to build echo. Back in 2013 I did a similar analysis for cat, and found that it required 36 thousand lines of code (just .c and .h files). It's my favorite example of the rot at Linux's core. There's many complaints you can make about Unix package management, but 'too many tiny dependencies' seems unlikely to be one of them.
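For concreteness, the counting rule described (excluding comments, empty lines, and lone curly braces) can be sketched like this. This is my reconstruction of the idea, not the script actually used:

```javascript
// Rough LoC filter: drop blank lines, comment-only lines,
// and lines that are just a curly brace.
function effectiveLoc(source) {
  return source.split('\n').filter((line) => {
    const l = line.trim();
    if (l === '' || l === '{' || l === '}') return false;
    if (l.startsWith('//') || l.startsWith('/*') || l.startsWith('*')) return false;
    return true;
  }).length;
}

effectiveLoc('int main(void)\n{\n  /* say hi */\n  puts("hi");\n  return 0;\n}\n'); // 3
```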

What on earth is cat doing that it needs 36KLoC (with dependencies)?

I'm starting to see where http://suckless.org/philosophy and http://landley.net/aboriginal/ are coming from (watch Rob's talks, they're very opinionated but very enjoyable).

Yup! In fact the only other data point I have on cat is http://landley.net/aboriginal/history.html which complains about cat being over 800 lines long back in 2002 :)

I didn't dig too deeply in 2013, but I did notice that about 2/3rds of the LoC were in headers.

>"When I looked at the gnu implementation of the "cat" command and found out its source file was 833 lines of C code (just to implement _cat_), I decided the FSF sucked at this whole "software" thing. (Ok, I discovered that reading the gcc source at Rutgers back in 1993, but at the time I thought only GCC was a horrible bloated mass of conflicting #ifdefs, not everything the FSF had ever touched. Back then I didn't know that the "Cathedral" in the original Cathedral and the Bazaar paper was specifically referring to the GNU project.)"

The version of cat.c here: http://www.scs.stanford.edu/histar/src/pkg/cat/cat.c has 255 lines in the source, and brings in these headers:

    #include <sys/param.h>
    #include <sys/stat.h>
    #include <ctype.h>
    #include <err.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <locale.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

which are needed for things like memory allocation, filesystem access, io, and so on.

One can imagine alternative implementations of cat that are full of #ifdefs to handle all the glorious varieties of unix that have ever been out there.

The main point in support of your argument is the fact that Unix utils are bundled in larger packages instead of shipping in single-command packages. Think of Fileutils, Shellutils, and Textutils... which got combined to form Coreutils! Ha! That leaves only Diffutils, for the time being anyway.

But none of Fileutils, Shellutils and Textutils was ever as tiny as many npm or gem modules.

I thought how commands are bundled into packages was the entirety of what we were discussing. That was my interpretation of larkinrichards's comment (way) up above. Packages are the unit of installation, not use. Packages are all we're arguing about.

My position: don't put words in God's mouth :) The unix way is commands that do one thing and do it well. But the unix way is silent on the right way to disseminate said commands. There you're on your own.

Underscore/lodash is a great bundle of functions that do simple things well. And it's a tiny enough library that there is really no need to split it into 270 modules.

I support packages of utility functions. Distributing them individually is a waste of resources when you have tree shaking.

I trust a dependency on lodash. I don't trust a dependency on a single 17-line function.

While I agree with you, tree shaking is relatively new in the JavaScript world, thanks to Webpack 2 (and Rollup.js before that). Before then, if you had a dependency you brought its whole lib into your project whether you used one method or all of them. So just including lodash wasn't an option for people who cared about loading times for their users. A 17-line module was.

Closure Compiler's advanced mode (tree shaking) has been around since 2010.

Lodash has been modular since 2013.

yes, let's blow up the entire concept that's worked fine for the ~5 years of node's existence because one dude did something extreme.

Whoa, whoa. Hold on there. Let's not throw around strong words like "worked", "concept", "entire", "fine", "did" when discussing NPM.

This. I'm on Windows. Npm never worked for me, like not at all. Npm has cost me lots of wasted time, I know I should be thankful for this free product, but, but, Grrrrr...

Windows has always been a problem for Nodeland, thankfully Microsoft is working on making that better.

is unpublishing a module extreme?

I wonder how long /usr/bin/true source is.

On Linux, it's 30ish lines, with half of those there to make false.c able to reuse some code. (I know, it's stupid.)

In OpenBSD, it's 3 LoC IIRC.

On at least one OS I worked with, it was 0 bytes long because /bin/sh on an empty file returns true. (I think that was IRIX 4.x.) OTOH, that's slower than the executable from a 3 LoC do-nothing program.

Suffice it to say that Linux distributions (and some closed OSes) try very hard to prevent package fragmentation. Experience has shown many times that excessive fragmentation leads to a proliferation of unmaintained libraries.
