
Show HN: Node.js node_modules file pruning - tjholowaychuk
https://github.com/tj/node-prune
======
orf
Rather than two giant 440px high images consisting of tiny text in the readme
I'd quite love to read exactly what it does and why. Does it prune files in
modules that are not needed? If so what heuristics does it use to prune files?
Or does it prune modules that are not needed?

Reading the code[1] it appears to delete directories and extensions based on a
blacklist.

1\. [https://github.com/tj/node-prune/blob/master/prune.go#L15-L5...](https://github.com/tj/node-prune/blob/master/prune.go#L15-L51)
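
As a rough illustration of the blacklist approach (not the actual contents of prune.go), a minimal Go sketch might classify entries like this. The names `blockedDirs`, `blockedExts`, and `shouldPrune`, and the entries in both lists, are assumptions made up for the example.

```go
package main

import (
	"path/filepath"
	"strings"
)

// Hypothetical blacklists for illustration only; the real lists live in
// prune.go and are not reproduced here.
var blockedDirs = map[string]bool{"test": true, "tests": true, "docs": true, "examples": true}
var blockedExts = map[string]bool{".md": true, ".markdown": true, ".ts": true}

// shouldPrune reports whether an entry would be deleted: directories are
// matched by name, files by their lowercased final extension.
func shouldPrune(name string, isDir bool) bool {
	if isDir {
		return blockedDirs[name]
	}
	return blockedExts[strings.ToLower(filepath.Ext(name))]
}
```

Under these assumed lists, `shouldPrune("docs", true)` and `shouldPrune("README.md", false)` are true, while `shouldPrune("index.js", false)` is false.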

~~~
goldenkey
It looks like it removes the files necessary for "npm install", assuming that
the module will run npm install immediately when you install it. This is not
true in all cases, though, which is why the files should be kept around. Disk
space is cheap. This isn't bad for deployments, though.

~~~
finstell
It's especially good for deployments to AWS Lambda.

------
gcommer
Love it. Saved 600MB... and test cases still passed :)

I ported it to bash really quick, as that's probably easier for people to drop
into their dotfiles if they don't have a Golang env set up.

[https://gist.github.com/gpittarelli/64d1e9b7c1a4af762ec467b1...](https://gist.github.com/gpittarelli/64d1e9b7c1a4af762ec467b1c7571dc2)

If we all just do our part and send a PR setting up .npmignore for one or two
of these projects, maybe we won't need this anymore

~~~
tuananh
Great. I wrapped your script into an npm package:
[https://www.npmjs.com/package/node-prune](https://www.npmjs.com/package/node-prune)

~~~
arve0
And here is a project that helps you decide upfront if you should rely on that
bloated module:

[https://www.npmjs.com/package/download-size](https://www.npmjs.com/package/download-size)

The tool is available online too:

[https://arve0.github.io/npm-download-size/](https://arve0.github.io/npm-download-size/)

------
TheCoreh
[https://github.com/tj/node-prune/blob/master/prune.go#L57](https://github.com/tj/node-prune/blob/master/prune.go#L57)

Doesn't this also result in .d.ts files being removed? These are type
declaration files (Kinda like C's .h files) that provide types for you without
the file size overhead of the full TypeScript source.
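
To illustrate the concern: if the check looks only at the final extension (an assumption about how the linked line behaves), Go's `filepath.Ext` returns `.ts` for `index.d.ts`, so declaration files would match as well. A hedged sketch, with hypothetical helper names:

```go
package main

import (
	"path/filepath"
	"strings"
)

// matchesTS mimics a check keyed on the final extension only.
// filepath.Ext("index.d.ts") is ".ts", so declaration files match too.
func matchesTS(name string) bool {
	return filepath.Ext(name) == ".ts"
}

// matchesTSSource is a hypothetical variant that spares .d.ts files.
func matchesTSSource(name string) bool {
	return filepath.Ext(name) == ".ts" && !strings.HasSuffix(name, ".d.ts")
}
```

With the second helper, `index.ts` is still matched but `index.d.ts` is left alone.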

~~~
zachrip
They aren't needed during runtime though, that's what this project is
attempting to clean up.

~~~
latchkey
ts-node no longer works.

~~~
smt88
What's the benefit of running ts-node in production?

~~~
latchkey
I don't have to distribute .js files.

------
mambodog
The Yarn autoclean command also does something like this:
[https://yarnpkg.com/lang/en/docs/cli/autoclean/](https://yarnpkg.com/lang/en/docs/cli/autoclean/)

~~~
chrismorgan
[https://github.com/tj/node-prune/blob/20703f18e9a7996683f1b8...](https://github.com/tj/node-prune/blob/20703f18e9a7996683f1b89769c7fda4654aed52/prune.go#L14):

    
    
      // Copied from yarn (mostly).
    

I infer that TJ used `yarn autoclean` as a starting point for this.

------
lxe
Awesome idea. Especially if you’re packaging your projects via something like
Docker this can shave hundreds of megabytes. (Even if you’re using the Docker
layer cache, any change to your dependencies will bust the entire node_modules
cache.)

The caveat is that this won’t work 100% of the time. There will be a
dependency deep down in the tree that needs to read a markdown file from the
file system or something of that sort. These issues could be hard to debug if
this tool doesn’t provide good visibility into what’s being deleted.

~~~
tuananh
you're looking for `yarn autoclean`. they have `.yarncleanrc` IIRC

------
namuol
Is this serious?

I mean, kudos for helping push node module developers to make better use of
`.npmignore` et al, but this project strikes me as overly snarky or even
trolling...

Seems like what we really need is for some volunteers to tackle the most-
installed packages on npm that publish with many extraneous files.

Hell, you could even automate it by writing a bot that (politely) points out
likely-extraneous files and opens a pull request with changes to `.npmignore`
to clean things up... Hmm... Weekend project brewing...

~~~
tjholowaychuk
It's not a troll; this is for AWS Lambda, where size impacts deployments and
cold starts. See my comment below about the automation; I think it
could/should be done too!

~~~
purrcat259
Wouldn't it make more sense to bundle, minify and tree-shake your package than
to delete unused files in node_modules? Tree shaking will remove unused code
even from files you are using.

~~~
tjholowaychuk
You can't always do that (native modules, other non-js assets..)

------
sergiotapia
Shouldn't this be handled entirely by npm? If these files are unnecessary why
do they get downloaded in the first place?

~~~
tjholowaychuk
Ideally people use the "files" array. It might be nice if someone writes a bot
to go around and fix large packages by adding that.
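
For reference, `files` is the package.json field that lists what gets published. A bot like the one suggested would first need to detect packages that lack it. A minimal sketch of that check follows; `manifest` and `hasFilesWhitelist` are hypothetical names, and fetching the package.json is left out.

```go
package main

import "encoding/json"

// manifest holds only the package.json field this check cares about.
type manifest struct {
	Files []string `json:"files"`
}

// hasFilesWhitelist reports whether a package.json declares a
// non-empty "files" array.
func hasFilesWhitelist(pkgJSON []byte) (bool, error) {
	var m manifest
	if err := json.Unmarshal(pkgJSON, &m); err != nil {
		return false, err
	}
	return len(m.Files) > 0, nil
}
```

Packages for which this returns false would be the candidates for a pull request.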

~~~
exogen
Many authors are strangely against it for some reason. One time I sorted my
node_modules by size and opened issues for the top offenders, you can see
their resistance here:

[https://github.com/crypto-browserify/sha.js/issues/5](https://github.com/crypto-browserify/sha.js/issues/5)

[https://github.com/medikoo/es5-ext/issues/11](https://github.com/medikoo/es5-ext/issues/11)

~~~
yorwba
Speaking as very-much-not-a-JS-developer: Isn't this essentially the same
problem Linux package managers solve with -dev, -doc, and -dbg packages? I.e.
the default install only contains the minimum necessary to use a program, and
if you need the header files/documentation/debug symbols, you can just install
them separately.

Is it too hard to meaningfully separate these parts of a package, or is it
more of a philosophical issue?

~~~
krisdol
It really is not too hard to separate these things in npm, but very few
packages do.

------
finchisko
No disrespect to TJ (he did many awesome projects I use almost daily), but
wouldn't a simple 'find node_modules -type f ! -name "*.js" ! -name "*.json" -delete'

do the same job? Why a project in Go?

~~~
tjholowaychuk
Maybe it's meta satire :D

~~~
finchisko
ok then my comment can be forgotten. :D

------
billmalarky
Oh wow TJ is working in nodeland again. I thought he had moved entirely
towards Go.

Glad to see more of his work!

Edit: It's Go lol.

------
tomchentw
Even better: submit a PR that auto-fixes these issues on the corresponding
GitHub repo when pruning

------
billmalarky
TJ, [https://apex.sh/](https://apex.sh/) is so beautiful. Did you design it
entirely yourself or did you get help and if so who helped? Also mind if I ask
what your inspirations were?

~~~
tjholowaychuk
Thanks! I do all the design stuff myself. I wouldn't call myself a designer,
but I do enjoy it either way!

~~~
chrismorgan
Just one thing—please increase the font size. `--font-size: 14px` is just too
small for comfort. The standard default of 16px is a good balance.
(`--font-size-small` also naturally needs to increase.) Other than that, it’s a
pleasant minimal design.

------
franciscop
I've been thinking about this in a different way: what about bundling things
with rollup.js? Then each package in require() would be just a package.json
and an index.js. I think this might even help with performance.

Edit: lightened up (;

------
twhb
Symptomatic of the npm community’s overloading of “npm install” to serve both
users and contributors. It’s easy to configure “npm publish” to do essentially
this for all users, but it’s considered bad practice. Which I don’t
understand, since the dev use case is still fulfilled - and better - by “git
clone”. And merging the two doesn’t scale - Chrome’s dev download is several
GB and takes hours to build.

------
vkjv
We've started publishing modules with a whitelist in the npmignore to help
combat this.

    
    
        *
        !dist/**/*

------
dstroot
Nice job TJ. This definitely scratches an itch. Thought you were a Go dev
these days. ;)

~~~
igotsideas
Sarcasm?

~~~
billmalarky
[https://medium.com/@tjholowaychuk/farewell-node-js-4ba9e7f3e...](https://medium.com/@tjholowaychuk/farewell-node-js-4ba9e7f3e52b)

Update: I see now it's written in go... :-P

------
andreineculau
Usually I don't care about what language an executable is written in. After
all, as a user I'm just interested in whether it executes or not.

But these small-software situations amaze me. Someone with a node_modules
problem will have sh, node and maybe python readily available. So why golang?
What could those not do, or what can golang do so much better that it trumps
availability? Similarly, there's a price to pay in terms of people contributing
a fix: who is interested in pruning node_modules and will send in a
golang PR?

In other words, if a dev would prefer Java (specifically chosen because it
exacerbates the startup time), would it still pass as ok? Luckily golang can
compile to binaries but that implies you give up availability on the other
end, now being confined to someone compiling and publishing on a regular
basis, as opposed to just pushing a fix commit to a git url.

None of the above would be of importance if this would be a personal-quality
repo with a note: hey i did this at 2am out of frustration, i chose the tools
that i knew best, use it at your own risk, opensourced to share knowledge and
to access it from my own projects, not as a "productified" software.

EDIT: I would much prefer a commenter's solution in sh for the reasons above
but also readability:
[https://gist.github.com/gpittarelli/64d1e9b7c1a4af762ec467b1...](https://gist.github.com/gpittarelli/64d1e9b7c1a4af762ec467b1c7571dc2)
:clap:

~~~
tjholowaychuk
Man it's free code, you're reading into this far too much.

~~~
andreineculau
I'm trying really hard to see what price has to do with anything I've said.

~~~
always_good
Well, that's your problem right there.

This line particularly reveals your very sad flavor of entitlement:

> hey i did this at 2am out of frustration, i chose the tools that i knew
> best, use it at your own risk, opensourced to share knowledge and to access
> it from my own projects, not as a "productified" software.

------
hiphipjorge
This is completely useless because npm still downloads those files and the hdd
space is pretty negligible. It should def go under the category of troll
driven development. This might be useful for docker images that will be
distributed and will be downloaded a lot, but even there it's a stretch.

That being said, it would be great if npm had some functionality around
packaging only the necessary files for actually running the module and
removing all unnecessary files (tests, source code, documentation) and have an
opt-in option to install those.

~~~
tjholowaychuk
It's for Lambda

~~~
hiphipjorge
Maybe documentation should include use cases? Newer users to npm + node might
think this does something different. Just a suggestion.

~~~
BigJono
Maybe documentation for everything should contain use cases, first, before
anything.

Every Javascript tool has documentation the wrong way round. Quick start
guides and installation instructions are useless if you don't even have a good
reason to use the tool in the first place.

~~~
always_good
Dunno, if you need to be convinced to use a tool, then maybe you aren't yet in
the market for said tool. I think it's a cherry on top to describe use-cases
but not so critical that there's some sort of ecosystem problem that you
describe.

~~~
BigJono
The purpose is to _un_-convince people from using it. Way too often people
have already made up their mind by the time they reach the docs, based purely
on how popular the tool is, how flashy the front page is, or whatever bad
advice they've received from someone who feels the need to justify their
library of choice.

~~~
tjholowaychuk
It's good to have critical thinking skills, if people can't figure out if they
want to use a tool or not, they should probably work on that a bit!

~~~
BigJono
For sure! Doesn't mean we shouldn't strive to give them a helping hand though.
I know I fell victim to the exact same problem when I was learning front-end
dev. I'm sure everyone in that space has had a "why on earth have I been using
this for the past 6 months?" moment for one library or another. I just feel
like we should be taking more preventative measures against that.

