Huge no. of files for Angular 2 (stackoverflow.com)
110 points by ishener on Aug 2, 2016 | 103 comments

Interesting fact that I recently came across:

bower, like many other npm packages, has dependencies that eventually depend on a package called "wordwrap".

And this "wordwrap" package somehow has its test folder exposed in npm.

The result:

Every single person using bower will have one or more copies of In Praise of Idleness by Bertrand Russell on their local machine, depending on how many of their projects have an npm dependency on this package:


Don't believe me? Try searching for "In Praise of Idleness" in Spotlight.

Edit: Someone had already sent a PR about this on GitHub: https://github.com/substack/node-wordwrap/pull/14

It's a good job they didn't pick the Anarchists' Cookbook.

Very nice! The file can also be found by searching for "Saran finds some mischief", which feels like an apt phrase.

Satan, I think you'll find.

I used to `touch node_modules/.metadata_never_index` to prevent Spotlight from wasting disk cycles by indexing that stupid folder. After searching "In Praise of Idleness", it doesn't seem to prevent it. :/

Does anyone know how to prevent all node_modules folders from getting indexed by Spotlight?

I ended up excluding the whole projects folder to prevent spotlight from indexing any source code. I never use spotlight to search any of my code anyway.

Just ignore your whole dev/code folder from Spotlight. It's not a good way to get to it anyway.

Is there an indexer in any operating system that's worth the aggravation? Windows Explorer's is hot garbage, and I've never been really impressed with any of the ones in my various Linux distributions.

Old-school grepping or using Notepad++'s find in files feature is far and away the best method I've come across, which is... kinda sad.

The indexing itself is great - I use Everything [1] heavily.

The client on the other hand? It's baaad.

[1] http://www.voidtools.com/

I like Agent Ransack on Windows; much faster than the regular search. I've never looked into how it works.

I have 16 copies on my work computer.

I have 94 copies.

Think of it as 94 NPM upvotes.

I have 8 copies.

I wouldn't be hopeful about that PR. Substack is awesome but he's really bad at replying to notifications because of the sheer number of modules he maintains.

Wasteful yet subversive

I don't know about you but I think the world can spare an extra 29kb.

0 copies. too lazy to work.

This person is counting the node_modules directory. While JS is a bit insane and this directory will have a ridiculous number of files, they are concerned:

"because my deployment (google app engine) allows only 10K files"

meaning, they don't realize that node_modules is for development and not related to the application they would actually deploy.

Hope this comment stays at the top before all the "wow JS sucks!!!" people arrive :-) Though to be fair a "modern" JS dev environment does use a ton of stuff!

IIRC Angular 2 production builds are actually pretty efficient.

IIRC the Angular2 production builds are the largest among all the frameworks. If you add the router it sits just shy of a meg - and that's just the dependencies. Not sure if they've started working on bringing the size down yet, but there are 1KB alternatives[1] for those who care about their users.

[1] - http://monkberry.js.org/

This website froze my Firefox running on an i7 CPU due to the animated background.

I'm not sure the right conclusion here is "oh, well then that is reasonable."

> Though to be fair a "modern" JS dev environment does use a ton of stuff!

Though to be fair a "modern" JS dev environment does use a ton of unnecessary stuff!

I fixed your comment.

Eh, transpiling ES2015+ to ES5 is not unnecessary if it makes you more productive—Babel is just really big.

Well to be fair, does anyone expect a compiler to be small?


Buble says "Files (940 KB)" but it doesn't do everything Babel does (just a strict subset).

I know that it's possible, but what's the purpose?

Is anyone's quality of life really impacted by the size of babel vs buble?

Does the size of a compiler really change anything for the average developer (obviously within reason)?

I personally don't care but I've had super long install times when using slow internet—which in many parts of the world is the standard speed. I'll keep using Babel, but I like that there are alternatives that install faster if need be.

I agree. Also there are multiple ways that you can install these things.

The "recommended" way is to install babel for each project independently. Space is cheap, internet speeds are fast for many, and avoiding version issues outweighs the savings.

But you can install globally, so you'd install babel once and can use it in all projects that way.

Then throw in the possibility of different package managers and you get a crazy amount of freedom and choice. People get overwhelmed with "javascript fatigue" and I just don't get it. You don't need to do everything, but having the option to is amazing.

I believe NPM also caches your file locally. So the second install of babel would 304 from GitHub and get pulled from your local cache, saving time and bandwidth.

Buble's purpose is that it is supposed to be significantly faster to compile from ES6 to ES5 with minimal configuration, since it actually does things out of the box.

But that has nothing to do with its size. If Buble was 10% larger than Babel it would still serve its purpose.

That's right, Buble's secret sauce to being fast is that it skips the code generation step, not the number of dependencies.

Although there definitely is a performance cost to Babel's large dep tree, as each of these modules has to be found by Node (which is inefficient). If you use Babel with npm2 it is super slow, because npm2's folder structure causes more lookups.

We ship Babel built in with the AVA[1] test runner and I can confirm `babel-require` is our bottleneck.

[1] https://github.com/avajs/ava

Yeah, I think that was a bit of a mistake by you guys. AVA's entire point is to be fast but it has the perception of being slow because you've made transpiling a core feature. I would drop that and let people do their own transpiling.

Aside from the concurrent testing, Babel with async functions built-ins is our second biggest "bullet point", so even if you were right it's too late. AVA is opinionated and I think the benefits (from what people have told us) gained from using the latest syntax with no Babel config are worth the Babel bottleneck—which isn't that bad.

> and not related to the application they would actually deploy.


Granted, it's the same with everything: if you expanded every compressed file and counted every class, most languages and frameworks would look incredibly nuts, because not all of it is needed.

But, something could be done about that; you could better differentiate what is there for convenience, and what needs to be there and make the developer more aware of what they are using. It can make development more difficult if done poorly, but good examples of minimalist development are out there, e.g. Sinatra and similar frameworks that said X is too much- just use this.

Evaluating a framework by the number of files its dependencies are broken into is a pretty poor measure of quality.

Why? If framework X does roughly the same as framework Y, which is 50% of its size, then you can estimate that the code of Y is more effective.

Instead of "50% in size" I'll assume "50% fewer dependencies", because I think that's the point here.

I believe that creating a module without relying on other modules will likely lead to reinventing the wheel. Well, lots of wheels.

However, that might still be fine. But what about that one corner case you missed? It might already be solved in a third-party module that focuses on one thing only.

It's really not that bad to try and use specialized modules as much as you can. You can benefit from other people's cleverness and focus on more relevant work.

Yes, there will probably be a lot of on-disk overhead. But is that really relevant today?

This is the major part of the whole "left-pad" fiasco I don't get.

If there is a well written, well tested, and widely used micro-library out there that does one thing and does it very well, why not use it?

Even if you think you can re-implement it in 5 minutes, will yours be as fast? Will yours be as well tested? Will yours have an interface that many other developers already know and use?

Sometimes reinventing the wheel is needed, but most of the time using a well working wheel that someone else made is the best choice.

> If there is a well written, well tested, and widely used micro-library out there that does one thing and does it very well, why not use it?

Because every dependency comes with a cost. First of all, it needs to be available, and the author might decide to pull it - maybe not from npm, but from github. Second is a matter of trust: someone just needs to take over the left-pad author's npm account and all of a sudden he can inject arbitrary code into all projects using the dependency. I'd bet that 90% of folks don't even bother to check the left-pad code. So basically you need to trust each and every author of your dependencies to be benevolent and competent, that is: they don't drop the ball, get hacked, lose access, ... And that task gets harder and harder the more dependencies you have to vet. In a lot of instances just inlining the code would be better. A larger stdlib that can be selectively included would be better. It's a tough problem and npm just sits on an extreme end of the scale.

I still maintain that those problems can be solved with better tooling and package management rather than "bundling" dependencies.

Bundling to me is such a sledgehammer solution. Yeah, it can somewhat prevent many of those issues, but it also comes at a pretty large cost.

* it leads to code duplication

* it can ruin the performance of tree-shaking and minification systems

* it prevents you from swapping out a small module with another globally

* it makes it harder to inspect and debug the code that you have installed in the node_modules directory

* it makes it harder to verify that the code on your machine is the same as the source code

* the bundler can introduce bugs into the source code

* the package now needs to maintain a build step and a separate source and "binary"

And more. Plus, in the end you might not even be helping anything. A big repo like lodash can have just as many contributors as tons of little dependencies, and big repos aren't immune to the org running it going belly up.

I guess I see those problems as more of a "large amount of code" problem instead of a "large amount of dependencies" problem.

I wasn't talking about bundling but rather about something like C glibc or rusts stdlib. Having a solid stdlib that covers for example string padding can at the same time minimize code duplication and number of dependencies.

Neither did I deny that inlining everything comes at a cost as well, so the goal is to find a good point on the scale. I was just pointing out that having tons of small dependencies is not free of cost.

Fair enough, but with a language like Javascript a large standard library will never be a reality.

There are too many implementations (by design) and the language is such a "mutt" of designs that it will never happen.

I personally don't think that's a bad thing, but it is different to how many languages work.

One reason is diversification.

If there are N ways to write a program, M of which are security hazards, it's better to have M/N of all programs exposed to risk than to have an M/N chance that all programs are borked.

"Reinventing the wheel" is a leaky cliché: the problem at hand isn't that people would independently try to come up with the solution to a simple problem (rolling something down a hill), but that the instantiations of a solution would be irrevocably linked, such that one flat tire stops all cars.

What's more - jesus, go outside, look at how many kinds of skateboard wheels, car wheels, and bike wheels you see in five minutes' time.

This is even more true, if like me you only delve into JavaScript on an occasional basis. The language is full of quirks and gotchas, so for me it would be the sensible choice.

Number of files is not the same as size, especially with the way node modules tend to be structured.

I did a bit of investigation into Angular 2 dependencies. It has a lot more than Angular 1 and React.


It seems to me that Angular 2 has a second-system problem.

I think it's generally a problem with npm, tbh. So many hidden deps.

I started working with Laravel not long ago and found my project folder had 24,000+ files in it. And those don't compile down before you deploy... it makes me feel like I'm working on the tip of an unstable iceberg. Who the hell knows what's going on down there. No one person could possibly hope to know what it all actually does.

32,000 files for a hello world... jeez, i'm going back to java.....

Not a lot better:

  $ unzip -l /usr/java/jdk1.8.0_77/jre/lib/rt.jar | wc -l
It's just less of a burden on the host filesystem because those files are usually loaded straight from the jar (i.e. zip file).

Also note that this is only the standard library. If you add an enterprise framework you're probably pushing 50k.

The JRE is a platform, so if you're counting platform files, why not also count Node.js or browser sources too?

I'd rather look at comparable thing like JEE server + Spring + some server-side renderer like Thymeleaf.

I'd argue that Angular2 is a platform as well.

Yes, let's compare the industrial strength tested jdk with the average npm lib developer.

No, 32k source files for the whole Angular 2 SDK. The deployable hello world app would be one JS file.

Try exploding every Jar in the JDK and counting how many class files and resources there are.

babel 6 with jsx transformer used to install a comparable number of files due to module duplication. At one point it was a 100M install with some modules being duplicated 45 times. Much of this was the fault of npm 2. But with latest babel and npm 3 it's now a 35M install with 5700 files over 814 directories. I guess that's considered lean by modern standards.

I've switched from babel 6 to buble recently. buble runs three times faster and its install size is 4.6M with 212 files over 39 directories. The install of buble is literally just "npm install buble" and it runs out of the box without configuration. Competition is a good thing.

Just because of the way npm2 ordered the dependencies, the runtime of babel got incredibly slow [1]. Npm3 fixed that drastically, but I wonder how much is still wasted just because of navigating the file tree to the dependencies.

[1]: https://github.com/babel/babelify/issues/206

The opposite is true - directory traversals themselves are effectively free, this is not going to be something that slows down your app - but loading the JS & creating the IR will be much slower. The 4 second startup time with npm2 & babel6 is almost certainly due to the duplicated dependencies, which means literally hundreds of megs of JS have to be parsed & warmed up. With npm3, the same files are (correctly) reused which significantly speeds up start time.

Is there a "distribution" bundle convention for npm? Analogous to static linking, it would be one .js file that would bind all dependencies into a bundle (e.g. gulp.dist.js). In that case you would end up with a much smaller number of dependency files to manage.

There is for your final output (meaning the stuff you would upload to the server and serve to the user), but not for development.

IMO it's a pretty big anti-pattern to do that. It just hides the problem of managing dependencies (see, it's not 10,000 files, it's just one!), but doesn't fix any of the issues associated with it.

Keeping each dependency small, and having tons of them means that deduplication can work better, tree shaking works better, and it lets you do things like swapping out one package for another with the same API.

In this case, most of the files pulled down are the development dependencies which are not going to be exposed to the production code.

I would be perfectly fine if npm pulled down "distribution" versions for tools like gulp, typescript et al

It might just be me living in a bubble, but I'd much rather have the full version downloaded to my machine in its "raw" form than a "compiled" version.

Even just for the ability to dive into the source I'm using if I'm debugging something, or to be able to look at the actual code I'm running if I want to understand how a tool works.

This is one of the reasons why I like how lodash handles their library. You can install the "regular" version of lodash and require it like "normal", or you can install a single big compiled lodash file, or you can install one that exports as ES6 modules, or you can install a single function at a time...

Obviously every package can't afford to spend that much time on packaging, but a framework similar to that along with some changes to NPM to allow tagging a package as an "alias" of another (so lodash-as-one-big-file will be treated as lodash for other packages) would go a long way into making everyone happy.

Why isn't npm managing packages like ruby gems?


Because package-lookup and the package manager are 2 completely separate systems in javascript.

When you `require` or `import` a file in node.js, it looks for a node_modules directory and looks for that name in there. If it can't find it there, it starts walking up the directory tree until it finds something it can use (to a point).

This is hardcoded and will be extremely difficult to change without a crazy amount of breaking.

The package manager is free to install however, but it needs to put things where the package-lookup can find them.

But it is still possible to have the best of both worlds.

Essentially, all they need to do is:

1. leave the current behavior for backwards compatibility; then

2. provide a flag like npm -G that exposes the correct behavior as suggested in the grandparent, using the same path like SHARED_DIR/node_modules/NAME/VERSION for package imports and package management.

With time, newer npm versions will default to the correct behavior. For folks that need backwards compatibility, this would require explicitly setting a npm --compat flag or similar.

The problem isn't in the "package manager", it's in node.js.

node loads modules in a given pattern. Changing that pattern would be global to your project, and would cause issues with tons of 3rd party tools.

The best possible scenario would be to introduce a "new_node_modules"-type directory and change to the new system, then look in "new_node_modules" first and the legacy "node_modules" next. But that's a ton of work, a ton of 3rd party tool breakage, and a lot of possibility for new bugs and breakage for not all that much benefit.

That's not to say it shouldn't be done at some point, just that there are much bigger areas that need to be addressed sooner in the node ecosystem.

node's module resolution will not likely ever be fixed. Too many modules depend on its undocumented implementation details and there isn't the will to improve it. A major source of problems is node's symlink resolution scheme that depends on fully resolved paths, counter to how other UNIX programs use files. Because many module developers know how the resolution scheme works they often hard code behaviors and paths into their code that would basically prevent any alternative module resolution scheme from working.

Everyone knows it shouldn't be. Even a naive attempt by the average developer to write a package management system would take versions into account.

Again, it's not the package management system that is the problem, it's the module loader in node.js.

And it's easy to say "even a naive attempt by the average developer" would do it better, but I really don't think they would have.

Still, the fact is that this is what we have, and complaining about what could have been isn't going to do anyone any good, improving the system will.

Reminds me of this talk that I just watched: https://www.youtube.com/watch?v=k56wra39lwA

At first glance, I thought the title was "Huge NO: on files for Angular 2" - a vulnerability report on filesystem capabilities of Angular 2 and why it should be abandoned

Has npm finally figured out how to de-dupe dependencies? In one project, I have something like 47 copies of the same version of the same library, distributed at all levels of the node_modules hierarchy.

I try not to think about that JS tooling too hard, lest I start pulling my hair out and devolve into a screaming crazy person.

This sucks so hard. It's also quite easy to hit path length limits for certain operations (on windows) when you have this mess there.

Totally. We just ran into this issue. Running npm install inside a linux vm resulted in an endless loop, because the nfs mapping created path names that exceeded the length limit.

it's like an entire eco-system is fundamentally broken

Maybe, but it's not like there is no way of solving that issue.

It's just that this problem can't be solved entirely by npm. Node has to make changes to its module loader as well.

NPM 3 tries, not entirely without success, to flatten the tree under node_modules/; I've had good results using it to resolve the kind of path length issues (in this case, with NTFS directories mounted in a VM) that you describe. Might be worth a look in your case as well.

This used to be the case. In npm3 and beyond it installs dependencies in a flat way, which seems to have solved the Windows file length limit problem.

It's actually not possible due to the way Node looks up dependencies. You could have a situation like:

    node_modules/
      a/
        node_modules/
          c/ (1.0.0)
      b/
        node_modules/
          c/ (1.0.0)
      c/ (2.0.0)
Both a and b depend on c version 1.0.0, but since there's a version 2.0.0 in the root node_modules folder c can't be placed there, and has to be duplicated in a and b's own node_modules folder, otherwise Node couldn't find it for each of them.

It's entirely possible with symbolic links. https://github.com/rstacruz/pnpm

You could use links, versions in names, ... You'd just have a problem taking out one submodule, I guess, but I doubt that's a use case anywhere.

I've built a fairly hacky solution to this before (for a different package manager) - it can be pretty simple:

    node_modules/
      versions/
        a@1.0.0/
          node_modules/
            c -> ../../c@1.0.0
        b@1.0.0/
          node_modules/
            c -> ../../c@1.0.0
        c@1.0.0/
        c@2.0.0/
      a -> versions/a@1.0.0
      b -> versions/b@1.0.0
      c -> versions/c@2.0.0

Some other people have linked to npm alternatives ied and pnpm which do essentially this.

npm v3 tries to dedupe packages during installation to reduce this problem. However, it still does not install packages in a deterministic way [0][1].

[0]: https://docs.npmjs.com/how-npm-works/npm3-dupe

[1]: https://docs.npmjs.com/how-npm-works/npm3-nondet

I was just about to write a comment to ask why it can't be done like this (using CAS / symlinks). So I guess it can. Are there any disadvantages to using ied over npm?

It's not fully stable yet. I tried a while ago, and after running into 3 showstopping bugs (and filing bug reports) I had to give up.

Great idea though, and it probably has improved a bunch since then.

This is one of the reasons why I love the JS ecosystem. You can even choose among a few tools you use to install packages from a package manager.

It does not support external packages from a git remote URL (GitHub).

Private modules seem to be an issue at the moment, but a PR for this is pending.

I used it successfully in a project and it even worked with native modules.

To me it looks like a silly architectural mistake made by the NPM/Node developers, considering that there's already a pretty good dependency management solution on the market that does things right.

Maven repositories have the following structure, which avoids duplication and takes versions into account: /<vendor namespace>/<library>/<version>/<library artifact.ext>

Vendor namespace itself is hierarchical and usually related to domain name, e.g. "com/mycompany/myapp".
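The appeal of that layout is that it reduces to a deterministic path function over the coordinates. A toy sketch (illustrative helper only; real Maven also handles classifiers, snapshots, etc.):

```javascript
// Build a Maven-style repository path from coordinates:
// groupId dots become path segments, then artifact, version, filename.
function artifactPath(groupId, artifactId, version, ext = 'jar') {
  return [
    ...groupId.split('.'), // com.mycompany → com/mycompany
    artifactId,
    version,
    `${artifactId}-${version}.${ext}`,
  ].join('/');
}

console.log(artifactPath('com.mycompany', 'myapp', '1.2.0'));
// → com/mycompany/myapp/1.2.0/myapp-1.2.0.jar
```

Because the version is part of the path, two projects needing different versions never collide, and two projects needing the same version share one copy.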

No idea why this approach is not yet used in the JS world (except the webjars), but it's high time to fix it this way.

Yes, that's already solved with npm 3.

Okay, but when you run the build process, how big is the resulting distributable?

My result for the hello world tutorial in Angular 2 was 53 requests, 4933.49KB, loaded in 1.74s on local dev, according to browser dev tools. All for one html file that had one h1 element.

Plus, it started out broken. I had to search elsewhere to find the solution to the error the tutorial produced.

That's not a production build.

Minify it, run dead code elimination. Exactly as you would with any other language with a compiler.

Also, worth noting that the tutorial being broken isn't symptomatic of JS, that's a problem with Angular (which has a history of sucking, and IMO Angular 2 just takes all of the problems with Angular 1 and adds more baggage to it).

Be aware as well that Angular 2 is a full-fledged Web Framework. Even after all of this compression and such, it is not going to be as lightweight as you'd expect simply due to the nature of what you've installed.

If you want something really lightweight, go with Rivets or React.

I'm confused by your response. I haven't claimed anything along the lines of your response nor am I implying anything about Angular nor JS. I simply just communicated my results from completing the tutorial.

I thought I'd never use Rails as an example of a lightweight web framework, but Rails has to be the standard – and anything heavier than Rails has to be completely discarded.
