
Yarn Plug'n'Play: Getting rid of node_modules - Couto
https://github.com/yarnpkg/rfcs/pull/101
======
wildpeaks
Thing is, node_modules isn't just thirdparty modules.

It's just what the Node path resolution standard uses, which is also how Local
Modules (the "/src/node_modules" structure) allows you to import other files
with clean paths, without having to add a config in every tool of the
toolchain, post-install symlinks, or any other non-crossplatform tricks. It
just works because it's literally what Node uses to resolve paths, and all
build tools are based on Node, so when you stick to the standard resolution,
you can add new tools to the toolchain without needing a bunch of configs for
them to find your files. For example, it's now also the default resolution in
Typescript as well.

The only time /src/node_modules doesn't work is when a tool goes out of its way
to break it, and wrongly assumes that node_modules can only ever be used for
external thirdparty code (e.g. Jest).

So best of luck getting Node + NPM + Yarn to agree on a new path resolution
syntax, but I hope we won't end up with another tool-specific resolution that
only works in Yarn.

~~~
cprecioso
This doesn't break that; it specifically says it will fall back to Node's
module resolution algorithm when the module you're looking for isn't in the
static resolutions table. That means you can keep using that technique as you
have been.
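
A minimal sketch of that "table first, Node second" idea, purely as an illustration: the table contents, the package name and the `resolveRequest` helper are all hypothetical, and the real .pnp.js file has its own format and API.

```javascript
// Hypothetical static table: package name -> location on disk.
// This is NOT the real .pnp.js format, only an illustration of the idea.
const staticTable = new Map([
  ['some-package', '/cache/some-package-1.0.0/index.js'],
]);

// Look in the static table first; on a miss, defer to Node's regular
// resolution algorithm (passed in as `fallback` so it can be swapped out).
function resolveRequest(request, fallback = (name) => require.resolve(name)) {
  if (staticTable.has(request)) {
    return staticTable.get(request);
  }
  return fallback(request);
}

console.log(resolveRequest('some-package')); // served from the table
console.log(resolveRequest('path'));         // falls back to Node
```

Anything missing from the table, including files under a local /src/node_modules folder, would still go through the usual lookup in a setup like this.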

As an aside, you can also use lerna[1], yarn workspaces[2] or pnpm
workspaces[3] to achieve the same effect, depending on your package manager of
choice. You might get additional boosts to code organization/productivity;
it's explained in the links.

[1]: [https://lernajs.io](https://lernajs.io)
[2]: [https://yarnpkg.com/lang/en/docs/workspaces/](https://yarnpkg.com/lang/en/docs/workspaces/)
[3]: [https://pnpm.js.org/docs/en/workspace.html](https://pnpm.js.org/docs/en/workspace.html)

~~~
Touche
Since we're doing an aside, what advantages do those "monorepo" things have
over git submodules?

~~~
Benjamin_Dobell
I don't maintain any monorepos, I've always used Git sub-modules; not just for
Node.js, but for all sorts of stuff.

 _However_ , I'm increasingly finding that sub-modules are a bit of a pain. If
I patch a sub-module, I have to move into every single project that depends on
the sub-module and pull the latest version. Additionally, if the sub-module is
large it's a real waste of storage on my computer.

That said, monorepos have annoyances of their own e.g. many modern package
managers will happily check-out dependencies directly from Git. However,
that's simply not going to work unless there's some standardisation in
monorepo structure that the package manager is able to interpret.

------
arcatek
I'm the author behind the proposal, feel free to ask me any question you might
have!

As a personal note, I'm super excited and so grateful to have had the chance
to work on this project - node_modules have been a longstanding thorn in the
side of the Javascript ecosystem, and to finally have a chance to try
something completely new is amazing.

~~~
shados
This is awesome. I didn't read the full proposal yet, but at first glance it's
very close to how we do things where I work (because we needed something long
before Node even existed, and never replaced it).

The biggest hurdle we constantly face is tools that make assumptions about the
environment (eg: Flow, TypeScript, Webstorm, etc), so changing anything to
module resolution breaks everything. We have to spend a lot of time hacking it
in every single time. Sometimes the hooks just aren't there (TypeScript has a
hook for overriding module resolution, but editors often don't give you an
easy way to hook into TypeScript's own hooks).

Any thoughts on how things would work for tools if this was to go forward?
Would they all need to be updated to honor this new way to resolve modules?

~~~
arcatek
Flow is already 100% compatible with PnP, through the use of the
`module.resolver` configuration settings. We even saw some slight perf
increases after switching.

I guess the same could be done for other tools: the .pnp.js file can be easily
interfaced with anything you throw at it without them having to care about the
way the dependencies are resolved under the hood. Even non-js tools can simply
communicate with it through the cli interface, which returns JSON data ready
to use.

------
jorangreef
Removing `node_modules` would be fantastic, but not at the expense of native
modules:

From the PDF:

"On the long term we believe that post-install scripts are a problem by
themselves (they still need an install step, and can lead to security issues),
and we would suggest relying less on this feature in the future, possibly even
deprecating it in the long term. While native modules have their usefulness,
WebAssembly is becoming a more and more serious candidate for a portable
bytecode as the months pass."

WebAssembly is not "more and more a serious candidate" to replace native
modules.

The issue with post-install scripts needs a better long-term solution, but
simply deprecating native modules is not it.

~~~
arcatek
I mentioned it in another answer, but we'll eject the packages that require
postinstall scripts - so they will work as before, but will take extra time to
be installed.

As for wasm, I'm curious to hear what you think isn't good enough. I think the
two main issues are garbage collection and dynamic linking, and there's
ongoing work on them to fix them.

~~~
electroly
Native code has two benefits over JavaScript: it's faster, and it can call the
OS. WASM is faster, but it can't call the OS. It cannot replace all uses of
native modules.

~~~
zeroname
Calling the OS doesn't require native modules, it should be supported via a
native cffi in node, like in Python.

~~~
electroly
Are you describing something you wish that node had, or something that
actually exists? As far as I'm aware, native modules _are_ the FFI for node,
and every package that presents an FFI to the developer is using a native
module to do it.

~~~
zeroname
Node doesn't have it natively like Python does, but it exists:

[https://github.com/node-ffi/node-ffi](https://github.com/node-ffi/node-ffi)

~~~
electroly
Yeah, that's a native module. If you look at the C source code in "src", the
"NAN_METHOD" and stuff are Node native module macros. node-ffi will not work
if they remove native modules.

------
ohitsdom
npm announced crux yesterday, seems focused on the same problems. Approach
seems slightly different, although there aren't as many technical details
available for crux (unless someone has a better link):

[https://blog.npmjs.org/post/178027064160/next-generation-pac...](https://blog.npmjs.org/post/178027064160/next-generation-package-management)

edit: found the repo, seems a bit behind yarn's effort and not yet beta
status. [https://github.com/npm/crux](https://github.com/npm/crux)

------
munificent
This is effectively how pub, the package manager for Dart works. At version
resolution time, we generate a manifest file that contains the local paths to
every depended-upon package. Then, at compile/load time, tools just use that
file to locate packages.

------
gitgud
Very cool, does this mean you can check in the dependency file to version
control? Just like yarn.lock?

As much as I hate node_modules, there are times when I want to see how a
library is implemented. Is there a way to have some libraries in node_modules?
Say, only the ones listed in the package.json file?

~~~
arcatek
The .pnp.js file can be checked in, but needs the cache to work. Right now I'd
advise not to check it in.

`yarn unplug` will eject a dependency from the cache and put it into the
project folder. That's a quick and easy way to inspect and debug libraries you
use.

~~~
w4tson
Just curious, why not?

~~~
arcatek
It doesn't have a lot of advantages right now. That said, it won't stay true
forever. PnP is but the first step of a long-term goal I have. Check the
Section 6.B from the whitepaper[1] for more details.

[1]
[https://github.com/yarnpkg/rfcs/blob/65b36475c04b1149eb51a81...](https://github.com/yarnpkg/rfcs/blob/65b36475c04b1149eb51a81875bad9b853e113fb/accepted/0000-plug-an-play.md#b-tarball-unpacking)

------
jannes
I wish there was something I could do about reducing the contents of the
node_modules folder instead of hiding the files somewhere else on disk.

It angers me how many dependencies very simple projects amass.

~~~
zkochan
This change will reduce the disk space node_modules uses (by a factor of 10 or
more).

pnpm saves each version of a package only once on disk, so node_modules
consumes drastically less space.
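
One way to get "stored once, visible in many projects" is to hard-link files out of a shared store. The sketch below only illustrates that general technique, on the assumption of a POSIX-style filesystem; the directory layout and names are made up and are not pnpm's actual store format.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical layout: one shared store plus two projects, all in a temp dir.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'store-demo-'));
const storeFile = path.join(root, 'store', 'some-package@1.0.0', 'index.js');
fs.mkdirSync(path.dirname(storeFile), { recursive: true });
fs.writeFileSync(storeFile, 'module.exports = 42;\n');

// Hard-link the stored file into a project's node_modules instead of copying it.
function linkIntoProject(projectName) {
  const dest = path.join(root, projectName, 'node_modules', 'some-package', 'index.js');
  fs.mkdirSync(path.dirname(dest), { recursive: true });
  fs.linkSync(storeFile, dest);
  return dest;
}

const inProjectA = linkIntoProject('project-a');
const inProjectB = linkIntoProject('project-b');

// Both projects see the file, but the data exists once on disk:
// the two paths share an inode, and the link count includes the store entry.
console.log(fs.statSync(inProjectA).ino === fs.statSync(inProjectB).ino);
console.log(fs.statSync(storeFile).nlink);
```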

~~~
zkochan
here's an article about how much less disk space is used by pnpm. This concept
will probably come with the same disk space savings
[http://www.andrewconnell.com/blog/npm-yarn-pnpm-which-packag...](http://www.andrewconnell.com/blog/npm-yarn-pnpm-which-package-manager-should-you-use-for-sharepoint-framework-projects)

but this concept is fresh, pnpm already works;)

------
lucisferre
All I really want is a consistent solution to the `../../../..` problem with
local module resolutions.

~~~
danShumway
Put your content inside of a nested node_modules folder and you won't have
this problem anymore.

I don't think node_modules is perfect, and I get why it gets hate, but IMHO
the algorithm is actually kinda nice for nesting packages.

If you set up your folders like so:

\----

src--->node_modules--->utils--->helper.js

src--->main.js

\----

You can require your helper in main.js with a simple
``require('utils/helper.js');``

What's nice is that you _can't_ do the reverse. So your helper doesn't get
the same access to your main.js. I use this a lot for testing - it means that
my tests can require my components using the same syntax that I use in normal
code, but those components don't have special access to the tests.
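
To see why it only works in one direction, here is a rough sketch of the list of node_modules folders Node walks through for a bare `require('x')`. It is a simplified reading of the lookup described in the Node docs (global folders are left out), and `lookupPaths` is a made-up helper, not a Node API.

```javascript
const path = require('path');

// Lists the node_modules directories that would be checked, in order, for a
// bare specifier required from a file inside `fromDir`. Simplified: global
// folders are ignored. Directories that are themselves named node_modules
// are skipped so that "node_modules/node_modules" is never produced.
function lookupPaths(fromDir) {
  const dirs = [];
  let current = path.resolve(fromDir);
  while (true) {
    if (path.basename(current) !== 'node_modules') {
      dirs.push(path.join(current, 'node_modules'));
    }
    const parent = path.dirname(current);
    if (parent === current) break; // reached the filesystem root
    current = parent;
  }
  return dirs;
}

// main.js lives in /project/src, so /project/src/node_modules is searched first.
console.log(lookupPaths('/project/src'));
// helper.js only ever searches node_modules folders, never /project/src itself.
console.log(lookupPaths('/project/src/node_modules/utils'));
```

Every candidate is a node_modules folder, so a file sitting directly in src, like main.js, is never reachable by a bare name from inside utils.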

A big "aha" moment for me with node_modules was figuring out that it's an
entirely separate system from npm. It's not trying to build a flat package
structure; it's trying to make it easy to on-the-fly set up packages from
anywhere.

_Edit: example -
[https://gitlab.com/dormouse/dormouse/tree/master/src](https://gitlab.com/dormouse/dormouse/tree/master/src)

I've also gotten into the habit of checking my installed packages into source
for projects that I don't expect users to rebuild (ie. websites, games,
etc...). That's a much longer conversation, but it mitigates a large number of
issues with node_modules on long-lived projects._

~~~
udp
_> A big "aha" moment for me with node_modules was figuring out that it's an
entirely separate system from npm._

I _wish_ this was true. My workflow from back in the early days of node has
always been to `npm install` external dependencies, then `npm link` (symlink)
the dependencies I'm working on. But npm >= v5 _removes_ such symlinks if I
`npm install` anything afterwards. I spend a significant amount of my time re-
linking what npm install unlinked.

Usually when I hit a problem like this it's because things have moved on
and I'm doing something wrong. But when an npm developer says "consider npm as
it is to be broken" and closes the issue [1], I'm not so sure.

[1]
[https://github.com/npm/npm/issues/17287#issuecomment-4008339...](https://github.com/npm/npm/issues/17287#issuecomment-400833982)

~~~
danShumway
To repeat: a big "aha" moment for me with node_modules was figuring out that
it's an entirely separate system[0] from npm.

Node's module resolution has nothing to do with npm. Npm is a package
repository and installer built _on top_ of node_modules. The reason your
system broke is because npm cleared out your node_modules folder as part of
its install. And from the sound of things on your linked issue, the devs are
entirely aware of the problems this behavior causes and are planning to fix
it.

An interesting exercise that I highly encourage people to do if they're
finding this weird is to take a weekend and build their own version of npm
just to demystify what's going on with it. It's not that hard to do, Node
gives you all the tools you need - in its simplest form you need to curl
whatever packages you want to install, and stick them in a node_modules
folder. Then you need some way to track which packages you've downloaded. Node
handles all the rest.
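
As a rough sketch of that exercise, with the download step left out: the `files` argument stands in for the unpacked contents of whatever you fetched, and `install`, the `greeter` package and the `installed.json` tracking file are all invented for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// "Installs" a package by writing its files into <project>/node_modules/<name>
// and recording it in a tracking file. `files` stands in for the contents of a
// downloaded tarball; a real tool would fetch and unpack one instead.
function install(projectDir, name, version, files) {
  const target = path.join(projectDir, 'node_modules', name);
  for (const [relativePath, contents] of Object.entries(files)) {
    const filePath = path.join(target, relativePath);
    fs.mkdirSync(path.dirname(filePath), { recursive: true });
    fs.writeFileSync(filePath, contents);
  }

  // Track what has been installed (hypothetical file name and format).
  const trackingFile = path.join(projectDir, 'installed.json');
  const installed = fs.existsSync(trackingFile)
    ? JSON.parse(fs.readFileSync(trackingFile, 'utf8'))
    : {};
  installed[name] = version;
  fs.writeFileSync(trackingFile, JSON.stringify(installed, null, 2));
}

const projectDir = fs.mkdtempSync(path.join(os.tmpdir(), 'mini-npm-'));
install(projectDir, 'greeter', '1.0.0', {
  'package.json': JSON.stringify({ name: 'greeter', version: '1.0.0', main: 'index.js' }),
  'index.js': "module.exports = (who) => 'hello ' + who;\n",
});

// Node's own resolution finds the package; the installer is not involved.
const greeterPath = require.resolve('greeter', { paths: [projectDir] });
const greet = require(greeterPath);
console.log(greet('world'));
```

The `paths` option just tells `require.resolve` where to start the node_modules lookup, since the demo project lives in a temp folder rather than next to the script.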

That npm occasionally breaks Node behavior is bad - I've been on the wrong end
of those regressions as well[1] and it was _super_ frustrating. But that
doesn't really have anything to do with Node, it just means the npm team needs
to test their releases more.

[0]:
[https://nodejs.org/api/modules.html#modules_loading_from_nod...](https://nodejs.org/api/modules.html#modules_loading_from_node_modules_folders)

[1]:
[https://github.com/npm/npm/issues/18942](https://github.com/npm/npm/issues/18942)

------
i386
Congratulations to the JavaScript community in reinventing the Maven local
repository.

Hopefully yarn does a better job at validating dependencies than early Maven 1
and 2 did.

~~~
acemarke
I just saw two comments comparing Yarn's RFC to Maven - yours and one over on
Lobsters.

Having never used Maven, any chance you could point to a document explaining
how Maven's cache approach works, and maybe expand on the similarities between
that and Yarn's RFC?

~~~
i386
Sure, here you go.

[https://maven.apache.org/guides/introduction/introduction-to...](https://maven.apache.org/guides/introduction/introduction-to-repositories.html)

------
olingern
I just came here to say this is great work, and congrats to the Yarn team for
continually making package management more sane.

------
mstade
Nice work – what's being done to make this the default behavior in node and
deprecate node_modules proper, or is this forever destined to live in the
awkward place of userland-but-not-really kind of territory?

People have already mentioned native modules. Install scripts are a nuisance,
but exist for reasons. If you remove support for them – provided this project
takes off, which I suppose it will because bandwagons – you risk people
performing install-type tasks on first module load, potentially creating an
even worse situation. Has this been considered as a risk?

~~~
arcatek
> what's being done to make this the default behavior in node and deprecate
> node_modules proper

My hope is that PnP proves that this approach can work and raises the stakes
for Node to implement APIs that would allow for a better integration. It's a
long term plan and we have yet to prove ourselves on the long run, but I'm
quite confident :)

> If you remove support for them

I don't think we'll ever remove support for them. I'll personally advocate for
them not to be used because of their subpar developer experience, but apart
from that it won't cost us much to support them.

------
lxe
From what I understand: instead of cp -rf from offline cache, or ln -sf (which
doesn't work for a large percentage of packages due to node being node), they
propose to use a custom resolver to tap into the cache directly.

This will also break for various packages due to fs.readFile* dependencies,
gyp, and other things. If your dependency tree is 4k+ node modules, the
"vanilla" yarn or npm resolution and reconciliation techniques are already so
brittle, that changing them will undoubtedly break things.

------
cryptozeus
Currently I use npm for angular projects. Before even writing hello world, I
have to run npm install on the project and install so many modules to serve
up a small web app. It seems like we are going in the right direction; can't
wait to try yarn as a package manager.

------
ericintheloft2
Novice question from someone from ruby land: would this work similar to
bundler and Gemfile?

------
aij
How does yarn pnp compare to nix? (node2nix)

I've been thinking about unifying our current ivy+yarn+bower setup, but
haven't yet gotten much past thinking about it...

------
specialist
This is a compelling reason to switch:

From [https://pnpm.js.org](https://pnpm.js.org)

"Efficient: One version of a package is saved only ever once on a disk. So you
save dozens of gigabytes of disk space!"

Without digging, I imagine this will be more like Maven's cache.

NPM's design decision to flatten the version hierarchy baffled me. And has
occasionally tripped me up.

~~~
Rockslide
For everyone running a Docker setup, this is not an option... Docker doesn't
support symlinking files outside the build context.

~~~
striking
Docker also has a cache, making dependency installs not usually an issue.

~~~
Rockslide
Which works on a totally different level and doesn't help at all in terms of
sharing dependency snapshots between different projects.

~~~
striking
That is not something I'd want to do. Docker's solution is good enough for
almost every build that happens. I can eat the time cost not covered by that
solution and not have to worry about bugs stemming from improper copying of
dep snapshots.

------
sergiotapia
The problem in javascript seems more an ideological one where programmers
instantly reach for a package that has multiple dependencies of which those
dependencies have even more dependencies.

There's this culture of not caring about bloat it seems in the vast majority
of javascript projects. left-pad comes to mind as the poster boy for this
stuff.

~~~
MrEfficiency
Isn't this the point of OOP?

Does any other language have a solution to this?

~~~
munchbunny
Yes and no. It's more a question of degrees and developer culture. I think JS
just has a stronger "glue stuff together" mentality combined with the lack of
a thorough standard API.

In my experience, C# libraries tend to be more averse by default to taking on
extra dependencies, but that's in part because .NET already does so much work
for you. Python is a bit less averse, but certainly not to the level of JS
where you can easily end up with hundreds of nodes in the dependency graph.
But then Python isn't used much for client UI code.

~~~
nojvek
Part of the problem comes from using a popular package. That package could be
importing 100’s of other things.

Typescript is one of the very few node modules that is very self contained.
You install babel, webpack and eslint and that’s easily over 1k packages.

So yes, the js ecosystem is a nightmare for security folks, since any one of
those thousands of packages could access the filesystem or network and create
backdoors.

Our express site got hacked because one of the sub dependencies was
compromised.

Seriously stay away from using nodejs to serve production traffic for serious
projects using glued packages. If you want to do it, use extremely thin, well
vetted packages and be very mindful of upgrades.

~~~
z3t4
I recommend giving each node process an Apparmor profile.

------
jensvdh
Can we get rid of Yarn instead?

~~~
acemarke
Why in the world would you want to do that? As this example shows, the
competition is good for the ecosystem.

