
Async Iterators: These Promises Are Killing My Performance - akubera
https://medium.com/netscape/4767df03d85b
======
Matthias247
I agree that the decision that Promise continuations always run in a new
event loop iteration will have a serious impact on performance. Nevertheless I
think it is a great design, because it provides you with lots of guarantees. I
have worked a lot with C# Promises (Tasks), where the implementation allows
synchronous completions, and that gives way more room for errors. E.g.
whenever you call TaskCompletionSource.SetResult(xyz) you might directly jump
into the completion handler, which executes arbitrary code (and may even call
recursively back into your code). If you don't know that, don't know when it
happens (understanding all the SynchronizationContext stuff) and don't take
precautions against it (like checking for state modification after promise
resolution), you might run into hard-to-find errors. Similar things can happen
on the consuming side. In JS-land a continuation will always run in the next
event loop iteration, which guarantees that code after Promise.then() will
still get executed before it. In other designs the continuation might
sometimes run directly in .then, which is yet another thing to account for. In
my opinion the JS promise spec provides the most sensible and least error-
prone behavior. That's great, because for the average programmer correctness
is more important than performance. The smaller amount of code that really
needs very high performance can still be hand-optimized, e.g. by falling back
to synchronous callbacks where it matters, or, for the use case in the link,
by reading bigger parts of the file at once and then performing more of the
parsing synchronously.

~~~
shangxiao
I appreciate the fact that you've taken the time to offer a comment which I
personally think is interesting & valuable, but with all due respect, could
you type in paragraphs please… it makes it a lot easier to read :)

------
jkrems
Not sure which spec the article is referring to but it's not the one
implemented by V8 natively.

> The problem is that Promises have to be resolved in the next turn of the
> event loop. This is part of the spec.

That statement is false unless I'm missing something major here. Promises
aren't resolved in the next turn of the event loop. They _should_ be resolved
as part of the microtask queue which can happen at the end of the tick
(recursively).

Example (tested on node 8):

    Promise.resolve()
      .then(() => console.log('1'))
      .then(() => console.log('2'))
      .then(() => console.log('still the same tick!'));
    setTimeout(() => console.log('timeout happening in the next tick'), 0);

Result:

    1
    2
    still the same tick!
    timeout happening in the next tick

The same will happen with async functions:

    (async () => {
      console.log(await Promise.resolve('1'));
      console.log(await Promise.resolve('2'));
      console.log(await Promise.resolve('still the same tick!'));
    })();
    setTimeout(() => console.log('timeout happening in the next tick'), 0);

~~~
noonat
You're correct. The spec only specifies that "onFulfilled or onRejected must
not be called until the execution context stack contains only platform code."
This doesn't require that the implementation use the same macro-task
scheduling mechanism that setTimeout might use.

------
jerf
"Unfortunately, generators and iterators are synchronous. Unless we’re willing
to read our entire file into memory, they can’t help us."

This doesn't seem to follow to me. Synchronous reading means you have to read
a given chunk in before you can process it, which means that if you're not
reading out of cache that you're going to stall on the spinning disk, but
synchronous code doesn't require you to read the whole file at once. And you
can easily write something that will process it into nice lines for you. The
"standard library" may not do that because there isn't much standard library
for Javascript, but it's not that hard.

The proof is, that's what the Python snippet is doing. It's built into Python
in this case, and I would expect it does some magic internal stuff to
accelerate this common case (in particular doing some scanning at the C level
to see how long of a Python string to allocate once, instead of using Python-
level string manipulation), but the API ought to be one you can do in Node,
even synchronously, with generators.

~~~
Arnavion
>Synchronous reading means you have to read a given chunk in before you can
process it, which means that if you're not reading out of cache that you're
going to stall on the spinning disk, but synchronous code doesn't require you
to read the whole file at once.

It's implicit that stalling the node.js event loop is not allowed, so a
synchronous "line" generator or iterator would block the event loop when it
has exhausted the data read so far. The node.js event loop cannot suspend the
current task and process other tasks while waiting for more data to be read.

Edit: The point of the author's post is that async iterators generate a
Promise (and thus a yield to the event loop) for every element, even when the
element is already available synchronously. Even if the generator has read
three lines in the latest call to read(), it must still yield three promises
of one line each (each involving one yield back to the event loop) before all
three lines are processed.

Edit 2: That said, it's not hard to fix this, and the author actually comes
close to the solution but doesn't realize it. They say the result of the async
iterator in the spec should contain an array of values, but it's just as easy
to write the "line" iterator to yield an array of lines as its values without
changing the spec.
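Arnavion's user-code fix can be sketched as an async generator that yields one
_array_ of lines per chunk, so the Promise (and the trip through the event
loop) is paid once per read rather than once per line. This is a sketch with
illustrative names; `source` is assumed to be any async iterable of string
chunks (on modern Node, a Readable stream with a utf8 encoding qualifies):

```javascript
// Yield arrays of lines: one Promise per chunk read, not one per line.
async function* readLineBatches(source) {
  let leftover = '';
  for await (const chunk of source) {
    const pieces = (leftover + chunk).split('\n');
    leftover = pieces.pop(); // last piece may be an incomplete line
    if (pieces.length > 0) yield pieces;
  }
  if (leftover.length > 0) yield [leftover];
}

// The consumer pays one await per batch, then loops synchronously:
//   for await (const batch of readLineBatches(source))
//     for (const line of batch) handle(line);
```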

~~~
hetman
I think the problem with yielding an array of lines is that all the code that
uses the result of the function would have to be aware of this grouping of
results all the way up the stack.

What he was proposing was a solution that _looked_ like you were only
returning one value per iteration (so none of the code doing the processing
had to be aware there are arrays of values) but underneath the covers was
actually being processed in batches on each run through the event loop.

It's actually a very elegant solution. The only code that would have to be
aware that it's actually dealing with batches of values is the code producing
those batches right at the top of the stack of calls.

~~~
Arnavion
That's correct, and I understand the author's solution. My point was that
changing the spec is not a hard requirement, since it can be accomplished in
user code.

~~~
hetman
Then I suppose we don't need async iterators at all since this can all already
be done with callbacks.

------
sp527
If you're using Node.js as a stateless web server (the principal use case
motivating its design), when are you reading files off the local disk other
than in the initialization phase? What's the problem with applying the 'right
tool for the job' mindset and defaulting to other platforms for more
generalized computing tasks?

~~~
camus2
> If you're using Node.js as a stateless web server (the principal use case
> motivating its design), when are you reading files off the local disk other
> than in the initialization phase? What's the problem with applying the
> 'right tool for the job' mindset and defaulting to other platforms for more
> generalized computing tasks?

What's the point of using Node.js if you can't do a simple task such as
reading a file? Use something else to begin with. Node.js wasn't designed to
be "stateless", any more than JavaScript was. If you want "stateless", use
Clojure or Haskell.

------
iamleppert
Not a scientific test:

    var fs = require('fs'),
        csv = require('csv-parser');
    var start = new Date().getTime();
    fs.createReadStream('./Batting.csv')
      .pipe(csv())
      .on('data', function() {})
      .on('end', function() {
        console.log(new Date().getTime() - start);
      });

This takes about 400 ms on my machine to parse a 6.1 MB CSV (15.25 MB/s). I
didn’t do anything useful with the data, but I did verify it exercises the
parsing function.

Clearly all the work/time is spent in the actual parsing and in handling CSV
corner cases, favoring correctness over high performance/throughput.

------
xpaulbettsx
> (note: I’m using TypeScript to convert the async iterators and generators
> into something that node.js can run. See my code here.)

This is your problem, TypeScript's Promises polyfill is really really slow.
Switch TypeScript to emit ESNext and run it on Node 8 and you'll see a much
much different situation.

~~~
Arnavion
TS does not provide any Promises polyfill.

------
sebcat
I've written event-driven, non-blocking style C for a couple of different
projects on POSIX platforms. When a file descriptor becomes readable, you read
from it until read(2) returns EAGAIN[1] or until a condition is fulfilled,
such as the read buffer containing a delimiter. People new to event-driven
programming may just invoke read(2) once for each readable event, check for a
condition and then yield to the event loop. This sounds like the same mistake,
only with more turtles stacked on top of it.

[1] This applies to sockets and pipes (incl. stdin), not regular file I/O
which usually ends up being off-loaded to a thread/process pool (if needed)

~~~
0xcde4c3db
I don't think it's quite the same thing, but that description reminds me of
Ted Unangst's "accidentally nonblocking" post.

[https://www.tedunangst.com/flak/post/accidentally-
nonblockin...](https://www.tedunangst.com/flak/post/accidentally-nonblocking)

------
skybrian
re: "Promises have to be resolved in the next turn of the event loop. This is
part of the spec."

This seems imprecise? There are multiple queues and the details seem to be
implementation-specific [1]. The Promises/A+ specification doesn't seem to
specify a particular implementation [2].

The reason this might matter is that if you use promises to do something CPU-
intensive without doing I/O, it might be possible to starve the event queue,
similar to what happens if you're just writing a loop.

On the other hand, the example in the article is waiting on reads, so they'll
be delivered through the event queue. Can reading larger chunks at a time and
buffering be used to speed that up?

[1]
[https://github.com/nodejs/node/issues/2736#issuecomment-1386...](https://github.com/nodejs/node/issues/2736#issuecomment-138607657)

[2] "This can be implemented with either a 'macro-task' mechanism such as
setTimeout or setImmediate, or with a 'micro-task' mechanism such as
MutationObserver or process.nextTick. Since the promise implementation is
considered platform code, it may itself contain a task-scheduling queue or
'trampoline' in which the handlers are called."

~~~
kevincox
I think the main concern is that the following code happens in the expected
order.

    console.log("First");
    promise.then(_ => console.log("Third"));
    console.log("Second");

This means that even if `promise` is fulfilled already the callback must be
delayed _at least_ until the next turn of the event loop.

~~~
skybrian
Yes, it does need to be delayed until the stack is unwound, but it doesn't
need to be delayed until after any events already on the main event queue.
(I'm not sure what "the next turn of the event loop" means in the presence of
multiple queues.)
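The distinction can be seen directly (a sketch): a continuation on an
already-resolved Promise goes on the microtask queue and runs before a timer
callback that was queued _earlier_ on the main event queue.

```javascript
const order = [];

// Queued first, on the main (macrotask) event queue:
setTimeout(() => order.push('timer'), 0);

// Queued second, on the microtask queue -- but it runs first, because
// microtasks drain as soon as the current stack unwinds:
Promise.resolve().then(() => order.push('promise'));

setTimeout(() => console.log(order), 10); // order is ['promise', 'timer']
```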

------
mighty_bander
The number of excuses for node's poor performance here is shocking. Talk about
high-level architectural concepts and quibble about the likelihood of this
implementation being necessary if you like, but if you can't provide a decent
implementation for this very basic use case, then node has a hole in it.

~~~
ricardobeat
Node is not optimized for this use case at all. Try parsing 64 files at once
then compare them again.

------
bluepnume
The biggest problem I had with forced promise asynchronicity was with browser
promise polyfills, which use setTimeout to force success and error handlers to
be called asynchronously.

This is fine, unless you're:

1. Modifying the DOM and hoping to avoid jankiness

2. Trying to work with popup windows, in a browser which deprioritizes
setTimeout in all but the focused window.

I ended up implementing [https://github.com/krakenjs/zalgo-
promise](https://github.com/krakenjs/zalgo-promise) to get around this. It
intentionally "releases Zalgo" by allowing promises to be resolved/then'd
synchronously, but my belief is this doesn't have to cause bugs if it's used
consistently.

~~~
Arnavion
You're using poor polyfills if they use setTimeout. Good polyfills [1] use
MutationObserver which uses the (a?) microtask queue.

Invoking a Promise's then() callback synchronously is a violation of the spec.
IIRC there are some tests to prevent that in the test suite.

[1]:
[https://github.com/petkaantonov/bluebird/blob/2c9f7a44/src/s...](https://github.com/petkaantonov/bluebird/blob/2c9f7a44/src/schedule.js#L31)

------
shinypotatoe
Interesting and surprising. Node 8 supports async/await as well as iterator
functions. So the target for the typescript transpiler can be set to 'es7'.
Would be interesting if this has an inpact on the performance.

~~~
rictic
Poking around with it, it looks like that gives a 20% or so improvement in
node v8.4.0, from 1239ms to 1022ms (vs 56ms for the sync version).

------
ilaksh
Would be nice to say what version of Node.js.

~~~
wereHamster
It doesn't matter which version of node. The fact that promises must be
resolved in the next tick is baked into the spec, no upgrade of node will ever
improve that situation.

~~~
ilaksh
Right but there were a lot of specific numbers so the context for specific
benchmarks is versions.

Anyway it sounds like we should try to fix the spec.

------
amelius
My problem with promises is that you can't "uninvoke" them. I.e., if all
listeners unsubscribe, the original job they were waiting for could be killed,
but the design of Promises doesn't allow that. Instead, the job will continue
to execute (without any listeners), wasting resources.
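One common pattern for this (a sketch, not part of the Promise spec;
AbortController ships in browsers and recent Node) is to pass a cancellation
signal alongside the Promise, so the producer can stop the underlying work
when consumers lose interest. `slowJob` here is a hypothetical producer:

```javascript
// Hypothetical producer that supports cancellation via an AbortSignal.
function slowJob(signal) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve('done'), 1000);
    signal.addEventListener('abort', () => {
      clearTimeout(timer);            // actually stop the underlying work
      reject(new Error('aborted'));
    });
  });
}

const controller = new AbortController();
const job = slowJob(controller.signal);
controller.abort(); // the "all listeners unsubscribed" moment
job.catch(err => console.log(err.message)); // prints "aborted"
```

The Promise itself still settles (here, by rejecting); what cancellation buys
you is that the producer's work is torn down instead of running to completion
for nobody.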

~~~
joshstrange
I won't argue that sometimes you might want to cancel a promise that you no
longer need (ex: user cancels ajax operation) but just because there are no
listeners does not always mean you don't want it to complete (ex: Fire and
forget, not always a great idea sure but a thing nonetheless).

~~~
amelius
True, but you can implement fire and forget by adding a dummy listener. The
functionality I'm suggesting is more general.

------
fibo
For efficient text processing the only language that I'd use is Perl5.

~~~
deckiedan
Perl is great, but not much (if any) faster to run than the equivalent
Python, these days.

If you use Perl regularly, the $_ syntax is super fast to type and lets you do
all kinds of cool stuff, but as an irregular scripter I find I forget it after
6 months or so. Python's monotonous syntax means stuff is generally still
understandable even by non-programmers.

------
egeozcan
In the documentation, there is a specific example for reading a file line by
line, and it is very concise[1]. I'm too lazy to test how fast it is though.

[1]:
[https://nodejs.org/api/readline.html#readline_example_read_f...](https://nodejs.org/api/readline.html#readline_example_read_file_stream_line_by_line)

