Habits of a Happy Node Hacker (heroku.com)
91 points by snodgrass23 on Nov 11, 2015 | 54 comments



> As long as each dependency is listed in package.json, anyone can create a working local copy of your app - including node_modules - by running npm install.

But it is going to download the internet and take forever. If the network is patchy, any dependencies that weren't downloaded remain unresolved even if you reinstall the package, and you have to hunt them down with `npm list` and install them manually, or nuke node_modules, start over, and pray that this time the network stays perfect for a few minutes at a stretch.

(someone please tell me I've got it all wrong)


Either way you are potentially going to download a lot of stuff; I'd rather not have that bloat baked into the repo. Also consider: if you come onto a project where a library has been upgraded multiple times, cloning the repo means pulling down every revision they've ever checked in, which is much bigger and slower than the equivalent npm install for that package.

If you are grabbing from an external repo, the spotty network is going to mess you up either way. If it's internal and your internet is that spotty, consider hosting a local npm caching server like sinopia -- https://www.npmjs.com/package/sinopia.
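
Setup is pretty painless, roughly this (going from memory of sinopia's defaults, so treat it as a sketch):

  npm install -g sinopia
  sinopia
  npm set registry http://localhost:4873/

After that, installs go through the local server, which proxies and caches packages from the public registry.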

My only issue with this technique is that 'npm start' and the rest aren't smart enough (please correct me if I'm wrong and there is a way to make it care!) to grab dependencies if you upgrade a dependency version and check that in.


(bundling node_modules into git is definitely not an option)


If you're checking your `node_modules` into the repo you're doing it wrong :\ should just check in package.json.


You're not necessarily wrong (that a ton of modules will be downloaded) but there are two points worth mentioning.

1) you can install from cache with something like:

  npm install --cache-min 9999999
2) npm's infrastructure has scaled extremely well. Gone are the days of spotty uptime and slow resolution. I maintain an npm proxy for one of the largest (if not the single largest) consumers of npm modules, and other than the very occasional 502 blip, npm has been rock-solid.


Thanks for clarifying.

I'll try the cache-min flag. But I suspect that if the cache contains packages with unresolved dependencies, they'll be of no use either and will have to be removed too.

Also, npm's servers have never given me any trouble; the issue was always with my network.


I found the versions a bigger problem.

I install something with --save and get "^1.0.0" in my package.json, which resolves to 1.0.0

Someone installs my stuff, and on her machine it resolves to 1.0.3 and nothing works anymore, even though she cloned an exact copy of my repository.

Next step is to lock down all versions in my package.json, so all wildcards are gone.
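
i.e. going from something like this (made-up example deps and versions):

  "dependencies": {
    "express": "^4.13.3",
    "lodash": "^3.10.1"
  }

to this:

  "dependencies": {
    "express": "4.13.3",
    "lodash": "3.10.1"
  }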

But the packages I depend on still have wildcards and can break happily.

This happens far too often for my taste and can (as far as I know) only be solved with third party software.


This is what you want to happen.

The alternative is, you shrinkwrap everything and lock it down, and your dependencies never change. Except that now it's been a year, and you want to update your dependencies, and it's this huge jump that requires so many changes that it's going to take a month of concentrated pain to do, so you never do it, and welcome to the world of legacy code.

Take your dependency-update hits in small bursts on the regular, and only lock down (via shrinkwrap or Docker images) for QA/release builds.


> This is what you want to happen.

Uh, no, I don't want my dependencies to break out from under me, thanks.

This is a major problem with the JS culture that I cannot stand. Too many JS devs can't be bothered to write software that doesn't suck, and so they try to instead tell their users their crappy problems are actually what they want. No, if you push changes to a library that break your library users' software without warning and a reasonable upgrade plan, you're doing a bad job.

> The alternative is, you shrinkwrap everything and lock it down, and your dependencies never change.

Or I use dependencies that upgrade their dependencies intentionally instead of whenever the `^` does it for them. The problem is that most of the JS packages out there are garbage software that doesn't do that. So it's necessary to be careful and only use dependencies which are well-maintained and reliable. ...which isn't unique to npm or a new problem, it's been well-known since all the way back when CPAN was new. Just because installing libraries is easy doesn't mean it's a good idea.

> Except that now it's been a year, and you want to update your dependencies, and it's this huge jump that requires so many changes that it's going to take a month of concentrated pain to do, so you never do it, and welcome to the world of legacy code.

Or you can do your job like a competent professional and not wait a year to upgrade your dependencies. Again, fewer, more reliable dependencies helps with this; if you are using reasonable dependencies, they'll push few breaking changes and give warnings and reasonable upgrade plans.


So true...

But I have to admit, the problems I mentioned only occur about once a month and only cost a few hours to fix, and that's with over 100 dependencies.

If we didn't use so much third-party stuff, we would probably only end up fixing this every few months, which is okay-ish.


Pick your poison: locked-down deps whose patch versions you constantly have to bump by hand, or the potential for the situation you describe.

That actually isn't supposed to happen, because semver means don't do anything breaking in a patch release (or a minor release). Not that the system works perfectly, of course.


You can solve it with the npm shrinkwrap workflow mentioned in the article. It takes somewhat more effort, but locks down the full dependency tree.
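
The workflow is roughly:

  npm install          # get to a dependency tree you're happy with
  npm test             # make sure it actually works
  npm shrinkwrap       # writes npm-shrinkwrap.json with exact versions for the whole tree
  git add package.json npm-shrinkwrap.json

From then on, `npm install` resolves against npm-shrinkwrap.json rather than the ranges in package.json, until you regenerate it.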

People have historically experienced a lot of issues with shrinkwrap in older versions of npm; I'm optimistic that npm 3 solves a lot of them.


Yes, I hope for npm 3 too, but at the moment everyone on my team tells me it is far too slow to use for anything :\


Pick your poison, big dependencies or big repo + duplication. But the author is correct, the node way is to put deps in package.json and `npm i`.


The article suggests using node-foreman (#6); can someone explain the advantage of Procfile-based environment management? I read the docs and didn't see anything that couldn't be handled by a simple config.json or some environment variables.


Node-foreman does use environment variables.

The usefulness of the tool (vs just straight env vars) is mostly in its ability to:

1) start multiple processes at once during local dev, much like a process manager can on a platform like Heroku or a compose file can in Docker. For instance, I'm working on a system now that uses `web`, `game`, and `sms` processes - which can all be started together via `nf start`, and then when one crashes during development the whole system spins down.

2) inject env vars into a local app via `.env` files (similar to your config.json example); a minimal sketch of both files is below.
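
With made-up commands and vars:

  # Procfile
  web: node web.js
  game: node game.js
  sms: node sms.js

  # .env
  PORT=5000
  REDIS_URL=redis://localhost:6379

`nf start` then brings all three processes up with the vars from .env injected.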


The article is on heroku.com, so of course it's going to encourage Heroku-centric solutions.


> Cluster your app

How is this possible in my multiplayer game where the whole world lives in the program? E.g. var map = [...] and then map[x][y] when I want to use it, var players = [], players.push(newPlayer) etc. How can I share this state in a cluster?


In your current state, clustering is probably not the best advice for your use case.

That being said, if you are planning on scaling, you'll likely be moving your map object into its own process, or a more traditional database, etc. Then whatever Node application facilitates the communication between that db/process and your users (via websockets or http) will be the place where clustering comes in handy.


Your app should be clustered, and your db (where your state should live) can be clustered too. Ultimately you want as little state in your app servers as possible, since this makes it harder to scale them out, reboot them etc.

That's the ideal case anyway; it's hard to avoid session state with websockets etc., but you can still push a lot down to the database layer.


Well, your main app obviously shouldn't be clustered, but think about the API you use to manage data. I don't know your architecture, but I'm sure you can separate your API from your user state and cluster those APIs, as well as use various reverse proxies and load-balancing services.


You need some form of IPC. Easiest thing that comes to my mind is just to use a redis hash.
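
Something like this with the node_redis client (just a sketch; key/field names made up):

  var redis = require('redis');
  var client = redis.createClient();

  var newPlayer = { id: 'p42', x: 3, y: 7 };

  // any worker process can write shared state into a hash...
  client.hset('players', newPlayer.id, JSON.stringify(newPlayer));

  // ...and any other worker can read it back
  client.hgetall('players', function (err, players) {
    if (err) throw err;
    console.log(Object.keys(players).length + ' players online');
  });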


Clustering is useful mainly when your app gets large or when you are handling redundancy. If you aren't doing that, your app is small enough not to need it. :)

If you want a fun exercise, try monkeypatching your global objects (map, players, etc.) to have getters/setters/etc. that transparently handle the syncing stuff.
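
A rough sketch of the idea, with the actual sync call stubbed out:

  var _players = [];
  var world = {};

  Object.defineProperty(world, 'players', {
    get: function () { return _players; },
    set: function (value) {
      _players = value;
      sync('players', value); // stand-in for redis pub/sub, process.send, etc.
    }
  });

  function sync(key, value) {
    console.log('would sync', key, 'to the other workers:', value.length, 'entries');
  }

  // reassignment goes through the setter; note that a plain .push() would not
  world.players = world.players.concat({ id: 'p1' });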


Would Redis solve your problem?


Kind of a random question. What's the state of generator based control flow? Is this the future "habit" of a node hacker?


async/await is built on generators and promises.
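
For the generator half of that, the common pattern right now is the co library; a sketch, with a fake promise-returning call standing in for real I/O:

  var co = require('co');

  function fetchUser(id) {
    // stand-in for any promise-returning operation (db query, http call, ...)
    return Promise.resolve({ id: id, name: 'ada' });
  }

  co(function* () {
    var user = yield fetchUser(1);    // reads like sync code, runs async
    var friend = yield fetchUser(2);
    return user.name + ' & ' + friend.name;
  }).then(function (result) {
    console.log(result);
  }, function (err) {
    console.error(err);
  });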


Sorry I haven't followed the node community for a while. Are you referring to this: https://github.com/yortus/asyncawait?

Is this becoming popular / the new standard?


Thanks Hunter. That was a really helpful article.

Though I don't agree with every suggestion, memory tuning and clustering are both really important and not discussed widely enough. save-exact is a good tip too - I've just been mass-deleting the carets before committing.
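
(You can also make that the default so the caret never gets written in the first place:

  npm config set save-exact true
  # or per install:
  npm install --save --save-exact express

after which --save records exact versions.)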


Ironically, right after the author mentions "The leading edge of developers are simplifying their builds" w/ vanilla JS, he suggests using Babel to get bleeding edge JS features...


The need for Babel will (hopefully) go away in the future, though. So using ES6 features with Babel will eventually leave you with a simpler project than writing ES5 with extra libraries to do the same things.


Node.js is the rare example of a Linux-centric tool with great cross-platform support.

Apart from the problems with Windows, npm, and path lengths.


There are hiccups, but as a general statement I stand by it.

"Node is pretty impressive on Windows. You get the feeling that JavaScript (as well as HTML and CSS) are a first class citizen in the Operating System. I already have a list of dozens of Native Windows Apps I want to build using Node."

http://daverupert.com/2015/08/developing-on-windows/

(good contrast with other experiences from a polyglot)

"Node is quite awesome on Windows to be honest."

https://medium.com/@vernonk/surface-pro-4-the-ipad-i-always-...



I may be somewhat biased because my primary platform is Windows, but IMHO 260 characters is plenty for a full path. I've personally never run into that limit with what I do, but I'm aware that there are some programming languages and environments which even encourage very deep nesting (often with directories that contain nothing more than another one), making it much harder to navigate the resulting structure. It's good to see the Node.js people recognise this situation and go back to flatter hierarchies, even if it was Windows' path length limitations that motivated it.

http://blog.codinghorror.com/filesystem-paths-how-long-is-to...


The annoying thing is that the limitation only exists in some APIs. So you can create paths that are too long, but then most software can't deal with them, which is worse than not allowing long paths at all.


The new Babel 6 stuff was the first time I had hit the limitation, but its package names are extraordinarily long.


There are a lot of problems, actually. I don't think Node.js has bad cross-platform support but calling it "great" is definitely a stretch. Node's APIs are very low-level and unix-centric and the result is it is very easy to make a library that doesn't work well on Windows because it assumes unix-style paths, for example.

Or try using child_process.spawn without resorting to a 3rd party module or adding platform specific code yourself. Yeesh.
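
The canonical example is spawning npm (or anything else installed as a .cmd shim) from a script; a sketch of the platform-specific workaround:

  var spawn = require('child_process').spawn;

  // On Windows the npm executable is npm.cmd, and spawn doesn't go through
  // a shell, so it won't resolve the same way it does on Linux/OSX.
  var npmCmd = process.platform === 'win32' ? 'npm.cmd' : 'npm';

  var child = spawn(npmCmd, ['--version'], { stdio: 'inherit' });
  child.on('exit', function (code) {
    console.log('npm exited with code', code);
  });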


Path length is a Windows issue, not a Node one. The same issue exists for C# or any Windows language. BTW this is fixed with the latest npm!


Btw, that was fixed in Node 5.0 with npm 3.


#4 seems spurious. Just another example of "If everyone did it my way, then everyone would be doing it my way." What if people don't want to do it your way? What if people can actually remember to spell correctly? EDIT: *nix has case-sensitivity so that people can actually use it.


It isn't about spelling. It's about behavior that will be presumed correct in one environment but not in another. It's about portability.

For another node example - in Linux, you can successfully use paths like `let filepath='foobar/' + filename`. That's correct in the given environment. However, such code will break when run in Windows. This is why `let filepath = path.join('foobar', filename)` is a superior solution.

A Windows or OSX user can save Myfile.js all day long and require('MyFile') with success. That same code will break as soon as you try to run it in Linux.

These recommendations are the product of supporting the many thousands of Node apps built each week on Heroku.


Thanks for the reply. While I understand the argument about preventing pitfalls, it's reasonable to expect programmers to pay attention to capitalization, since source code requires you to anyway. I think the path of least resistance here is to tell programmers to spell things the same way the last person to work on the code spelled them.


Windows supports regular / symbols in pathnames. What system doesn't?


Regarding point 9, if you don't check your `node_modules` into version control don't you run the risk of them disappearing from NPM?


"If you are paranoid about depending on the npm ecosystem, you should run a private npm mirror or a private cache.

If you want 100% confidence in being able to reproduce the specific bytes included in a deployment, you should use an additional mechanism that can verify contents rather than versions. For example, Amazon machine images, DigitalOcean snapshots, Heroku slugs, or simple tarballs."

https://docs.npmjs.com/misc/faq#should-i-check-my-node-modul...


I think if you are worried about this you should be using Docker and Docker Hub.


Even when assuming availability, what about modules that change?

Either because authors "bugfix" their existing versions without bothering to increase the version number, or because a malicious network actor delivered modified code.

That's why I prefer to add a SHA-256 hash/checksum to every downloaded dependency file. In some settings that hash might be more important than the actual version number.
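
In Node that's only a few lines (sketch; tarball name made up):

  var crypto = require('crypto');
  var fs = require('fs');

  var hash = crypto.createHash('sha256');
  fs.createReadStream('some-module-1.2.3.tgz')   // the downloaded dependency file
    .on('data', function (chunk) { hash.update(chunk); })
    .on('end', function () {
      // compare this against the checksum recorded when the dependency was first vetted
      console.log(hash.digest('hex'));
    });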


npm only lets you publish a version once; if the version number already exists for that module, you have to change it.


Okay, that's good to know.

But I'm still uneasy with this, as the crypto hash approach provides some more safety features. For example, it protects against attacks on the npm platform itself (assuming they happen after you incorporated the library into your project). Also, it enables us to download the package from any other source (e.g. some archive/mirror), via plain HTTP, while still being safe from downloading a modified package.

That's why I still wish the crypto hashes were included in package.json (automatically on "install --save-exact", of course).


Do you guys not use pip, and stuff? I'm surprised by all the confusion on this point. Package managers are part of modern programming platforms. Repos only need to have your code, not everyone else's.


That's a good point; I've never considered the possibility of a package I'm using disappearing from npm. I never check in my node_modules folder and let CI build processes handle all of that.


You could fork the libs and point to your own repo. That's what we usually did with Cocoapods.


Until I looked at the actual title of the post, I was a little concerned that somebody had invented a time machine.

It's been changed now, but it was originally titled "10 Habits of a Happy Node Hacker (2016)".


The "10" violated the guidelines explicitly (magic numbers in titles are supposed to be removed) and the "(2016)", while fine in principle, is ambiguous with HN's convention for annotating years. So we took those out.



