
Napa.js: A multi-threaded JavaScript runtime - rimher
https://github.com/Microsoft/napajs
======
drderidder
This looks great, but it's important to bear in mind the architecture you
intend to run on. I recently made an application blazingly fast by - among
other things - parallelization using the node cluster module. On my 4-core
laptop it flies. Imagine my surprise when I deployed to the cloud environment
and found the typical virtual server in our cluster has only a single CPU
core. The worker threads just sit around waiting for their turn to run, one at
a time. On the other hand, the platform for serverless functions has 8 cores.
At a minimum, before you jump into multi-threading, know what
`require('os').cpus()` tells you on your target platform.

~~~
santoriv
Which platform for serverless functions has 8 cores? I have a CPU intensive
data deployment script (Node.js) that takes 12 hours on a single thread but
can be chopped up to take advantage of more cores. Our build server on ec2 has
2 cores so it's about 6 hours. It would be great to know if we could push the
job into serverless and get it done a lot faster.
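Assuming the records are independent, the chopping-up can be sketched as a simple round-robin partition (the names and sizes here are illustrative, not from the actual script):

```javascript
// Split a list of independent work items into per-core chunks so each
// worker (process, thread, or serverless invocation) gets an even share.
function chunk(items, parts) {
  const out = Array.from({ length: parts }, () => []);
  items.forEach((item, i) => out[i % parts].push(item));
  return out;
}

// Illustrative data set: 10 records split across 4 cores.
const records = Array.from({ length: 10 }, (_, i) => i);
const chunks = chunk(records, 4);
console.log(chunks.map((c) => c.length)); // lengths: [3, 3, 2, 2]
```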

~~~
mirko22
How can a "serverless function" have cores if it doesn't have servers and "it
is just code" :/

~~~
XaspR8d
Because "server" started taking on the meaning of "configurable box" to people
who were frustrated with configuration, so "serverless" means
"unprovisioned/unconfigurable" machines.

Now if we started talking about "computeless" architecture I'd be confused.
(Though maybe that'll be the trendy name for serverless data sources/sinks in
a few years...)

~~~
mirko22
> so "serverless" means "unprovisioned/unconfigurable" machines

I am pretty sure that in English "serverless" means no server, and
"unprovisioned/unconfigurable" machines means you didn't provision them and
you cannot configure them. Even in an analogical sense this makes no sense.
Something I could relate to would be "pay as you use" or "configurationless
servers".

But that is just me, and if you think it is OK to randomly change the meaning
of words, then I, personally and just as randomly, don't need to accept your
new meaning (not giving out, just trying to explain my rationale).

Downvote all you want, but please do point out where I am wrong.

~~~
ralmeida
In a very literal sense, you're not wrong. But the concept of a _serverless_
architecture (in common parlance today, in 2017) has a lot of nuances which
are hard to convey in _any single_ word.

So eventually people picked one. Today, the most common are "serverless
architecture", "FaaS", or simply "Lambda" (borrowing from AWS).

 _You_ don't _have_ to do anything. But it's simply a _fact_ that many people
know what you're talking about if you say the word "serverless". And that's
what language is, a (kinda) agreed upon set of words which let you communicate
with other people. If everyone but you understands a word, and you are
crusading that they change it to something else, what is the point?

If you're interested, the concept of "prescriptivism" may be enlightening.

~~~
mirko22
My point is that I was not aware what "serverless" meant when I first came
upon it, and the word itself did not convey any meaning without a lot of
context, as it does not lend itself to analogies the way "server", "container"
or even "cloud" do.

~~~
dkersten
True. I regularly encounter words whose meaning I don't understand in a given
context. Usually, if a word sounds confusing or seems used in the wrong
context, a quick Google search will clarify what I was missing. Sure, it's a
bit irritating when words are recycled to mean something different, but it's
rarely an issue in practice, in my experience and opinion.

------
nathan_long
An alternative approach to concurrency is the one that Erlang (and your
operating system) take: memory-isolated processes, preemptively multi-tasked,
where neither IO nor computation in one process can adversely affect the
others because the scheduler prevents it.

I wrote this up:
[https://news.ycombinator.com/item?id=15499629](https://news.ycombinator.com/item?id=15499629)

~~~
underwater
Ryan Dahl wanted Node apps to use a multi-process model. Hence the name. The
vision wasn’t as evolved as Erlang, but conceptually closer than what Node
looks like today.

“I believe this to be a basis for designing very large distributed programs.
The “nodes” need to be organized: given a communication protocol, told how to
connect to each other.”
[https://www.americaninno.com/boston/node-js-interview-4-questions-with-creator-ryan-dahl/](https://www.americaninno.com/boston/node-js-interview-4-questions-with-creator-ryan-dahl/)

------
bertolo1988
You can get rid of the need for multi-threading by deploying more containers
on the same machine, or via orchestration.

I don't understand why people keep insisting that the lack of multi-threading
support is a JavaScript problem when there are better and more scalable ways
of using your machine's resources.

~~~
chrisseaton
> You can get rid of the need for multi-threading by deploying more
> containers on the same machine, or via orchestration.

What if you have a large shared in-memory data structure that you want to
update with lots of irregular mutations in parallel, like many graph
problems? How are you going to do that with multiple containers? As an
industry, we just don't understand how to distribute that kind of problem
effectively.

~~~
maxpert
A shared data structure among multiple threads... this sounds utterly
familiar, and evil! Redis is single-threaded, probably one of the fastest
stores around, has a variety of data structures, can handle high loads, and
its code is easy to reason about: something that just works.

One of the reasons Node is successful is the simplicity of single-threaded
code, which is way easier to reason about. I would question the use of Node
if you are doing something CPU-bound with it; you can use Go or C# with tasks
for that.

~~~
Gonzih
So just because shared memory is hard, you are OK with sacrificing
performance and replacing memory access with IO hops? That sounds like
overkill, and it's not suitable for every task.
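For a rough sense of the cost being traded away (figures from the widely circulated "latency numbers every programmer should know" tables; they vary by hardware and network):

```javascript
// Order-of-magnitude comparison: reading shared memory vs. making a
// network round trip to another container in the same datacenter.
const mainMemoryReadNs = 100;              // ~100 ns per read
const sameDatacenterRoundTripNs = 500000;  // ~0.5 ms per network hop

const slowdown = sameDatacenterRoundTripNs / mainMemoryReadNs;
console.log(`A network hop costs roughly ${slowdown}x a memory read`);
```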

~~~
egeozcan
I like node.js and use it very often (had my first package reach over 100
stars, woohoo :) ) but I don't understand why it needs to be suitable for
every task.

If you really want to do something creative with shared memory, I guess you
could do that in a "native module" written in C++ or even Rust[1].

I'm not saying that it's not doable with JS, it's just that it's already been
done (as in, has a solution that works).

[1]: [https://github.com/neon-bindings/neon](https://github.com/neon-bindings/neon)

~~~
PunchTornado
Why should I learn a new language for that? It's good to have as many options
as possible in JS, and you take the one that fits you best.

~~~
epicide
Because JS isn't good at everything just like C++ and Rust aren't good at
everything.

Right tool for the job.

~~~
RussianCow
But if you take that to an extreme, you end up with a hundred tools. I think
it's good to have the option to do parallel computing in JS, for those times
when it's worth the tradeoff versus having to adopt a completely new
language/platform.

~~~
epicide
Sure, anything taken to an extreme is bad. I wasn't suggesting to do that.

I think if there is a sensible use-case for parallel computing in JS, it would
be good to have. However, trying to make a solution before we have a (clear)
problem is foolish.

I'm not saying there isn't already a use-case, but I haven't seen one that
isn't already covered by languages better suited to solving those problems
(e.g. Rust).

Edit to give a different example: parallel computing in JS is like trying to
write a web framework in Rust. Sure, you _can_ do it, but Node is already
better suited to doing that. At best, you're making a worse version of
something that already exists.

~~~
RussianCow
> parallel computing in JS is like trying to write a web framework in Rust.
> Sure, you can do it, but Node is already better suited to doing that. At
> best, you're making a worse version of something that already exists.

I agree with you, but my point is it's not black and white. For a sufficiently
small or simple project, it might make sense to write a web backend in Rust,
or do parallel computing in JS, if the cost of learning a new platform
outweighs the cost of using the "wrong tool".

In most circumstances, yes, you probably shouldn't use Node.js for parallel
computing tasks, just like you shouldn't use C++ for web development, but for
some use cases it might be useful. And maybe those use cases don't exist (I
don't have much experience in this area, so I don't know), but I just don't
like when blanket statements like "use the right tool for the job" dismiss the
work other people have done. Surely if Microsoft created this, they have a use
case in mind for it?

------
j_s
I will circle back in a year or so to see if this 'sticks'; very few
Microsoft open source projects provide enough value / demonstrate enough
momentum this early in their lifecycle to justify any investment on my part.

[https://github.com/MicrosoftArchive/redis/issues/556](https://github.com/MicrosoftArchive/redis/issues/556)
(Jun-Sep 2017)

> _Why do MS always do this??? Start something, announce it aloud "we are now
> open source, we are now this and that blah blah" then quietly do a 360 and
> moonwalk away_

~~~
panta
Perhaps a 180? :-)

~~~
j_s
To be clear for non-Michael-Jackson-fans: 360 + moonwalk works out the same
(by sliding away backwards).

[https://en.wikipedia.org/wiki/Moonwalk_%28dance%29](https://en.wikipedia.org/wiki/Moonwalk_%28dance%29)

> _moves backwards while seemingly walking forwards_

It's actually a near-perfect analogy here, where actions speak louder than
words re: Microsoft's commitment to maintaining adaptations of existing open
source projects. I think the example provided is enough to trigger a careful
analysis before jumping in, and I would love to see counter-examples to help
balance the evaluation.

(Please note: very specifically requesting counter-examples of Microsoft-
official, intended for production, open source repositories demonstrating
long-term maintenance of tweaks of/dependencies on established open source
projects that for whatever reason were never up-streamed.)

~~~
chucky_z
In the case of Microsoft specifically it's a silly meme that started due to
the Xbox 360.

'Why do they call it the Xbox 360? Because when you see it you do a 360 and
walk away.'

------
z3t4
Looks almost the same as something using fork and message passing, which also
scales over several machines. The examples should show use cases that need
shared memory, or where it wouldn't be possible to use worker processes.

~~~
nevir
V8 doesn't support (real) forking, unfortunately

------
daiyip
Here is why Napa.js was started:
[https://github.com/Microsoft/napajs/wiki/Why-Napa.js](https://github.com/Microsoft/napajs/wiki/Why-Napa.js)

------
polskibus
This is far from the first attempt to bring multithreading/parallelism into
JS. A (failed) example:

[https://en.wikipedia.org/wiki/River_Trail_(JavaScript_engine...](https://en.wikipedia.org/wiki/River_Trail_\(JavaScript_engine\))

Another attempt (although a different approach) is here :
[https://github.com/tc39/ecmascript_sharedmem](https://github.com/tc39/ecmascript_sharedmem)

Hopefully, Microsoft will learn from the errors of others. Interestingly,
Napa is built on V8, not on MS's own JavaScript engine.

~~~
13years
Webkit is also working on this:

[https://webkit.org/blog/7846/concurrent-javascript-it-can-work/](https://webkit.org/blog/7846/concurrent-javascript-it-can-work/)

------
amelius
Nice, but:

1\. It is exposed as a Node.js module, but some other modules may conflict
with it, because they wrongly assume they are the only thread running. E.g.,
some modules may be using global variables in C++.

2\. It still doesn't support fast immutable communication (structural
sharing) with other threads through shared memory.

~~~
MuffinFlavored
What would it take to get number two to become a reality? Some sort of special
module? This is the one main drawback I've found doing "production stuff" with
Node.

Does Java offer cross-thread shared memory? Go, Haskell, Python, etc.?

~~~
cjalmeida
Java does shared memory by default, and even has concurrency-related concepts
as fundamentals of the language (Object.wait(), etc.).

Python supports threading, but it's not very useful due to the infamous GIL.
However, multiprocessing is usually a decent alternative, and it supports
shared memory on forking (copy-on-write, though). You can also use mmap
easily for IPC.

------
drinchev
One of the great practical uses of this repository is to serve as a
counter-argument against people who complain that Node.js is not
multi-threaded.

For anything else, IMHO, the use case will be so narrow that I wonder how MS
will justify the development resources for maintaining it.

------
bsimpson
I know that the history of projects at a giant company isn't always
straightforward, but it's interesting that they chose V8, even though Chakra
is maintained by the same company (and supports Node APIs).

------
jondubois
I think this could be useful, but I really hope they get it right (especially
broadcast and the other cross-worker communication features). Right now it
looks like the broadcast feature cannot target specific workers; that's
already a red flag. If they push too hard on thread locking and data
consistency across workers, then it's not going to scale, and it may lose all
the advantages of being able to run on multiple CPU cores.

~~~
daiyip
A key design goal is to make all workers within a zone symmetric;
broadcasting/executing to specific workers is an anti-pattern. Please see:
[https://github.com/Microsoft/napajs/wiki/introduction#zone](https://github.com/Microsoft/napajs/wiki/introduction#zone)

------
chuckdries
I'm excited. My friend is working on extending RPG Maker, a game engine built
on Node. Say what you will about building a game engine on top of Node, but
this could be a very useful way to offload CPU-heavy tasks from the main
render thread.

------
thom_nic
Any informed commentary on the pros/cons of this vs. cluster?

As far as I can see, cluster workers are all uniform (no task-specific pools)
and cluster has no broadcast (just `worker.send(message)`).

~~~
maxpert
Cluster runs multiple processes, while this is multi-threaded. With cluster,
everything is isolated, communication can happen only through IPC mechanisms,
and there is high memory overhead. With threads, it's typical
synchronization, with the side effects of shared memory, of course. It's the
classic thread-vs-process debate.

------
jonkirkman
This looks very similar to the concurrency and isolation patterns of the Dart
VM.

"Use Isolates for secure, concurrent apps. Spawn an isolate to run Dart
functions and libraries in an isolated heap, and take advantage of multiple
CPU cores." --
[https://dart-lang.github.io/server/server.html](https://dart-lang.github.io/server/server.html)

------
christophilus
This is interesting. I see some negativity in the comments, but I think this
was really a missing piece of the Node ecosystem, and I hope this solution
ends up getting good adoption. On a slightly related note, I wonder if
ClojureScript + Node would be the best way to take advantage of this. Has
anyone here used ClojureScript with Node?

~~~
cormacrelf
If you were wanting multithreaded Clojure, you're probably already using the
JVM. It's much more mature and performant and tunable for that use case.

~~~
christophilus
True! One thing I have noticed, though, is that a lot of Clojure code is
blocking (database drivers, and such), so some subset of problems might be
easier to write on Node. In addition, we are currently running Node in
production, but not the JVM, so targeting Node with ClojureScript is an easier
sell than creating a whole new deployment target.

------
partycoder
One of the reasons npm modules are largely compatible with each other is the
fact that they share the same threading model, which is what the event loop
offers.

This, then, has a lot of compatibility issues, and release-wise it will be
hell. I don't think this is a good idea.

------
Touche
This looks a lot like web workers, which Node.js does not have. Hopefully they
will add it.
[https://github.com/nodejs/worker/issues/2](https://github.com/nodejs/worker/issues/2)

~~~
je42
fork works fine in node.js

~~~
je42
why down vote ? fork is fine as long as the communication between the two
processes is low-bandwidth.

~~~
nevir
Shared memory & low overhead sharing of expensive objects is probably _the_
primary reason to use Napa

~~~
MuffinFlavored
I don't think Napa features shared memory though, does it?

~~~
daiyip
Since everything is in a single process, objects can be shared via native
structures. Each JS thread has its own heap, so there is a cost to transfer
the shared native structures into JS objects.

For complex objects, JSON is usually used, so marshalling/unmarshalling is
needed. But for objects like UTF-8 strings or ArrayBuffers, the same layout
is used across JS and C++, so they come at almost no cost.

Another thing is that addon modules can pass pointers to native structures
(like Buffer) through JS as 2 uint32s. In this case, JS works as a binding
language.

------
pantalaimon
It makes you wonder why they chose V8 instead of their own JavaScript engine:

[https://github.com/Microsoft/ChakraCore](https://github.com/Microsoft/ChakraCore)

~~~
johnhattan
My personal theory is that they're trying to improve the JS engine in Electron
so they can improve VSCode. The VSCode people have posted some performance
complaints in the past.

I would assume this would benefit desktop JS most, so it'd undoubtedly help
Electron apps.

EDIT: There's a better explanation at
[https://github.com/Microsoft/napajs/wiki/Why-Napa.js](https://github.com/Microsoft/napajs/wiki/Why-Napa.js)

------
Magical
What does this give you over using fork in Node? The zone concept seems nice,
but I'm not understanding why I would use Napa when Node is already designed
to easily spawn processes for parallel execution.

~~~
christophilus
If I understand correctly, it allows some basic sharing of data without
duplicating it in a bunch of heaps. It's not 100% shared memory, in that if
several threads access an object, that object will be duplicated in each
thread, but it has a store mechanism which can essentially serve as an
in-process cache shared across multiple threads.

~~~
daiyip
That's correct. Think about a scenario where each worker needs to access a
1GB hash map. With node cluster, we have to load this 1GB map in each
process, but within the same process, all JS threads can access the map via
an addon.
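The arithmetic behind that point, with an assumed 8 workers (the worker count is illustrative):

```javascript
// Memory cost of a read-only 1GB map under a process-per-worker model
// vs. a thread-per-worker model in a single process.
const mapSizeGB = 1; // figure from the scenario above
const workers = 8;   // illustrative worker count

const clusterCostGB = mapSizeGB * workers; // each process loads its own copy
const threadCostGB = mapSizeGB;            // one copy shared by all threads
console.log({ clusterCostGB, threadCostGB }); // 8 GB vs 1 GB
```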

------
raresp
I was expecting something like this for a long time. Thanks!

------
0xADADA
I don't think JavaScript needs threads.

~~~
speedplane
> I don't think JavaScript needs threads.

Seriously... when publishing a project like this, the authors ought to give
at least a bit of insight into their motivation. Their examples include
calculating Fibonacci in parallel, which is not exactly a breakthrough.

~~~
always_good
I know it's addictive to be dismissive, but:

[https://github.com/Microsoft/napajs/wiki/Why-Napa.js](https://github.com/Microsoft/napajs/wiki/Why-Napa.js)

------
diegoperini
Is this multi-threaded Node workers that can only share read access to their
parent process's pid?

~~~
maxpert
Nope, pure threads inside a process.

~~~
diegoperini
Is hiding the mutable shared memory and introducing mandatory serialization
for message passing a design decision, or a technical limitation of V8?

------
dheera
Can we please stop calling things ".js" if they are not actually JavaScript
files or libraries written in JavaScript meant for direct inclusion in
JavaScript source? Call it NapaJS, not Napa.js.

------
k__
How does it compare to the (discontinued) JXCore?

~~~
daiyip
JXCore is based on the Node codebase, while Napa.js is written from scratch
with multi-threading considerations at its core. We evaluated JXCore before
starting Napa.js, but found it easier to re-architect to get performance and
memory sharing right.

------
appimonster
Sounds great! Now we can make a memory-efficient Electron with this kind of
runtime. It will make using JavaScript to create desktop applications more
reasonable.

~~~
Cthulhu_
Not really, multithreaded applications are not more memory-efficient.

~~~
daiyip
Yes, generally speaking.

But in node cluster's case, if we need to load a data set for workers to
serve requests, it has to be loaded in each process, while the multi-threaded
model keeps only one copy.

------
romanovcode
This is good; however, IMO, if you need parallelism you should not use Node,
period.

------
dtzur
Shared memory!

------
Dinux
Took long enough for threads to reach JS as well

------
maxpert
Wait, what? Why are they getting rid of the simplicity of a model that just
works? Once again Microsoft managed to put some C# programmers on JS. The
first thing they thought was: why doesn't it support threads? Let's build
thread support and get a promotion!

Edit: I would question the usage of Node if everything you are doing is
CPU-bound. I think people started down-voting me without going through the
docs; I mean, just read this documentation page and tell me this is right:
[https://github.com/Microsoft/napajs/blob/master/docs/api/memory.md#memory-allocation-in-cpp-addon](https://github.com/Microsoft/napajs/blob/master/docs/api/memory.md#memory-allocation-in-cpp-addon)

~~~
chenzhekl
The README explained it clearly:

> As it evolves, we find it useful to complement Node.js in CPU-bound tasks,
> with the capability of executing JavaScript in multiple V8 isolates and
> communicating between them.

Not everything MS does is evil.

~~~
couchand
It's not evil, it's just dumb. If your workload is CPU-bound, you probably
shouldn't be running it in Node.js.

~~~
fenwick67
There are some cases on the threshold that this can be useful for.

For example, let's say your Node application's requirements changed, and now
you have to transform a set of five JSON objects into one JSON object, with
some big arrays, indexing, and sorting required. The operation takes 10-100ms
of CPU time, which isn't nuts, but with thousands of requests per second it
would normally be a catastrophe. If you can spawn a thread in this case, you
can save the day.
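A sketch of that scenario, with illustrative document shapes and sizes, timing how long the merge blocks the event loop:

```javascript
// Merge several documents' arrays and sort the result: a synchronous
// transform that blocks the event loop for its full duration.
function transform(docs) {
  return docs
    .flatMap((d) => d.items)
    .sort((a, b) => a.key - b.key);
}

// Five illustrative documents of 1000 items each.
const docs = Array.from({ length: 5 }, (_, i) => ({
  items: Array.from({ length: 1000 }, (_, j) => ({
    key: (i * 7919 + j) % 5000,
  })),
}));

const start = process.hrtime.bigint();
const merged = transform(docs);
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
console.log(`${merged.length} items, event loop blocked ~${elapsedMs.toFixed(2)}ms`);
```

Offloading exactly this call to another thread (or process) is what keeps that blocked time off the request path.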

~~~
couchand
You say that 10-100ms of CPU time on the hot path of your requests isn't nuts,
but we'll just have to agree to disagree about that.

~~~
fenwick67
All I meant was "transforming and merging complex documents can really take
this much time", not "stopping your whole server application for this long is
okay" (that would be nuts).

