Hacker News new | past | comments | ask | show | jobs | submit login
Node-fib: Fast non-blocking fibonacci server (github.com/glenjamin)
194 points by dchest on Oct 3, 2011 | hide | past | favorite | 118 comments

Note to self: Starting immediately, all raganwald projects will have a “Is it any good?” section in the readme, and the answer shall be “yes."

Author here, didn't really expect this to get picked up anywhere but since it has I'd like to point out the idea was to demonstrate that computationally expensive algorithms can be split across multiple iterations of the event loop to avoid blocking it.

In this case concurrent requests take advantage of each others' memoisation, which would be somewhat trickier to do with threads as you'd probably need to worry about locking.

Yes, you too can by required by the compiler to implement cooperative multitasking by hand. In 2011.

Yes. It is an answer to the criticism made, and I acknowledge that. But it is not a very good answer to the objection. The better answer is "don't do that in Node.js", which is still not all that great (it's really easy to accidentally write something that blocks badly), but is better.

Well the other option for computationally expensive code is to use some sort of worker that runs a sufficiently fast language.

JavaScript on v8 is actually one of the fastest interpreted languages available, so unless you really need to drop down into C or similar, splitting across the event loop or using another node child process is not an unreasonable way to approach CPU heavy calculations.

> Well the other option for computationally expensive code is to use some sort of worker that runs a sufficiently fast language.

Which only helps if you know your code is going to be slow. If you somehow implemented an algorithm with a quadratic complexity and did not test for sufficiently large input, you might not realize what's going to happen before it hits production.

> JavaScript on v8 is actually one of the fastest interpreted languages available

1. Nobody is denying that.

2. The issue is with the behavior of evented systems in general and node in particular in case of in-request CPU-bound code paths, namely that the whole server blocks killing concurrency and basically DOSing the instance.

I can't disagree with any of these points.

NodeJS is no magic bullet, those who treat it as such should be extremely wary. It is, however, rather nice to work with.

Cooperative multitasking will always have the blocking problem, whether implemented manually or by the language. The manual callback chaining is tedious, though.

Preemptive multitasking solves that problem, but if mixed with lots of shared mutable state, it reintroduces much worse problems of non-determinism and unreproducible bugs.

The golden path involves preemptive multitasking with little-to-no shared state. That way you get to have your cake (determinism) and eat it too (no blocking problems/starvation).

> The golden path involves preemptive multitasking with little-to-no shared state. Erlang (and probably some other languages) seems to have taken this approach.

(go too)

I am not quite sure what it is you are trying to say. Node does (in this case) require a bit more hands on approach than most other languages but that is because of a subtile but important difference. Node uses corporative concurrency whereas threads are not corporative. This is both a disadvantage (you are required to do more work and cannot take proper advantage of multiple CPUs) and a huge advantage (no locking is needed and you can share the results between different execution points).

Of course the real issue is that you should choose a better implementation of the algorithm.

Lets not forget that process-per-connection has a really bad performance rep.

These newfangled non-blocking designs are seemingly much better for everyday tasks.

And saying that an async IO server is a bad choice for computational things is, well, a well-known tradeoff that 99% of apps don't have to worry about.

When was the last time someone complained that their webservers were CPU bound? Its always the DB that is the bottleneck...

Yes. It is an answer to the criticism made, and I acknowledge that.

It's an answer that says: "You're using the wrong tool for the job."

Which is particularly weird given that Fibonoacci itself is probably the most overused example of algorithm-to-promote-paradigm in computer science. Except it's for a different paradigm: recursion, not asynchronous IO.

Still, it's interesting in a recursive sort of way.

Fib is designed to be a piece of code that runs really slowly but doesn't require typing in many lines of code. This makes it a reasonable benchmark for things like "how fast can a function be called", and also a good example of "something that takes a long time".

"it's really easy to accidentally write something that blocks badly"

What is this nonsense? Isn't it time we knock this one on the head? Let me let you into a secret: If you write CPU intensive enough code in any framework you can eventually block all further requests. This whole debate was based on FUD and a complete misunderstanding of what node is all about. Read the previous posts and don't post generic, bullshit comments like "which is still not all that great".

"If you write CPU intensive enough code in any framework you can eventually block all further requests."

"Intensive enough" is a vague and fuzzy term which you can hide too much behind. So let me put it this way: Go grab Yaws, the Erlang web framework. Write a web page that goes into an infinite loop. Write some other web pages that work normally. Observe that visiting the infinite-loop web page once does not cause the rest of the web server to stop serving. Yaws may time it out eventually, too, not clear from a quick look at the docs. In fact it will only marginally decrease performance, even on single core machines.

Yes, if you bash on that page often enough you will eventually degrade service to an unacceptable level. But you will not bring down the whole server, or even that OS process, and it will take substantially more than one hit per process or one hit per core.

Now, go grab Node, and write a web page that goes into an infinite loop. You just brought that OS process down, from the user's point of view.

Node is qualitatively much easier to lock up an OS process with than Erlang. Or Haskell, or Go, or anything else with a modern task scheduler, which is an ever-increasing number of language platforms.

At some point the fair time given to all the various workers stuck in an infinite loop will starve legitimate low-work tasks

CPU is finite. RAM is finite. The kernel has limits on things like sockets.

(Erlang is super-neat, and Go goes a long way in the same direction. But I rather imagine the original poster meant that people should use CGI)

But who deploys node in a single instance? There's documentation all over the place which tells you how to load balance over several instances - and it's easy to do so. This is my point, the node is cancer article was pure troll.

OK. Set up a load balanced infinite loop. Result: A load balanced infinite loop. This is not a win for Node. Load balancing across a number of hung processes buys you very little. (Not quite zero; you get a chance to detect the fact that it's hung and restart it, as long as these pathological requests aren't coming in fast enough. Hope the user who poked the bug doesn't hit refresh too many times!)

I still think you may not understand what modern schedulers end up doing here.

How does this differ from MacClients in apache, or process limits on CGI? I agree with you, but don't see how this problem is specific to node...

It isn't specific to Node. What's specific to Node is that there's a whole bunch of hype convincing people that Node is the epitome of multitasking, when in fact it's just yet another event-loop based system, subject to the same foibles. The same very well known foibles.

Node isn't a bad technology and I don't hate it. Well, I personally hate working in the event-loop paradigm (due to abundant experience) but that's no discredit to Node, which simply is what it is. The hype is toxic. The hype is basically full of flat-out lies. It teaches people that the state-of-the-art as of 1990 or so is the state of the art today. The hype claims Node is blazing a new path in the field of concurrency, when in fact it's traveling a 4-lane highway with fast food and hotels, while putting blindfolds on its partisans to hide them from the fact they're actually smack dab in the middle of civilization.

you realize that you're just making Ted's point for him from the opposite direction, right?

his point, when you look behind the trolling, is that node.js is not magical special sauce and that shitty coders who write poorly-scaling code will be shitty coders who write poorly-scaling code no matter what technology they use -- the "cancerous" properties of node arise simply because of the amount of groupthink that pitches the Next Big Thing as intrinsically better than everything that came before it.

to expand upon this, here are some other representative Ted Dziuba posts:





if I had to sum up his philosophy in three sentences, here they are:

your job is software engineering. every bit of fancy shiny stuff that you add onto your software is another thing that can break. the most important decision you need to make in your job is determining when your tools are good enough and don't need further elaboration -- at a certain point you need to stop jerking off about your toolchain, and just ship your project.

Yeah, some of those blog posts don't make much sense. The queues article, for example. RabitMQ and ActiveMQ are big Java things to keep your sysadmins up at night. ZeroMQ is just a networking library. Yes, they all end in "MQ", but at the end of the day, everything in computing as a message queue. If you don't like it, go back to typewriters :)

Fine. So you pick your tool (Java, Python, Node.js, presumably with an Nginx or Apache front end, though I'm not sure you want to put Node behind Apache), and use it.

I don't see how Node.js isn't a valid tool. Async can be a bit of an over-optimization, but you don't have to use it (even in Node), as Ted's naive Fib server shows. And Javascript is ugly as sin. But so's PHP, the language behind Wikipedia, and you have to use JS (or something like Coffee-script) anyway.

I don't know enough about Node to really judge it, but there's nothing I've heard in this whole flame-war that really rules it out.

agreed -- for what it's worth, I've looked into node.js for some of my projects, then thrown it out because I already know other technologies which serve the same purpose. the problem isn't with the technology, the problem is with the marketing (and that includes grass-roots marketing through engineers who swear by the technology as one-size-fits-all.)

Any decent troll needs a nugget of truth at the bottom to hook people in.

It is an old question 'How do we tell truths that might hurt?'.

If you try the well reasoned analysis, you get passed over. It turns out that no-one pays attention unless there is a fight happening (c.f. tech crunch's reporting style)

'If the truths are sufficiently impalatable, our audience is psychically incapable of accepting them and we will be written off as totally unrealistic, hopelessly idealistic, dangerously revolutionary, foolishly gullible or what have you.'

The morale is - everyone admonishes a flame, but nothing else gathers posts quite like it. If you think something is terrible, holding back will get you nowhere.

I like you.

Personally I've never seen node.js pitched as a solution for newbies or sub-par coders toiling away in the enterprise trenches. I've always seen it marketed as a useful tool for people who know WTF they're doing.

buzz is omnidirectional -- when people start talking about a technology, platform, or stack, people of all shapes and colors will show up and use it. and when your product pitches itself as having super amazing performance due to this programming paradigm omg!, you are responsible for making its limitations known to all and sundry who use it, rather than assuming prior knowledge.

also, the joyent node.js homepage itself claims, as a business advantage:

  "• Huge JavaScript developer pool at the ready for faster development"
implying that any Javascript developer can just dig their hands right into server code and get working.

Right, let's just stop talking about cool new technologies since some people might misuse them. Does anyone want to help me debug my webapp? It's written in C.

nice strawman.

my point isn't that you shouldn't build up buzz. it's that, when buzz exists around something, your job as a platform implementor includes making people aware of the things your product can't do well, and pitfalls the end-user might run into.

saying "well, it's their own fault for not being clueful enough to know what they were doing wrong, this technology is for pro hackers only!" is developer-hostile.

Being a wee bit pessimistically inclined, I somewhat agree with the assessment of the node.js programmer pool (seems similar to the first migration of PHP programmers to Ruby after the infamous RoR screencast), I find it hard to attribute to this to the developer(s).

Or in other words,

a) what did they do wrong?

b) What do you think would be the best way to "spin" a new platform like that without either being cluelessly elitist or giving the unwashed masses false hope?

disclaimer: I'm not a node.js expert by any means, and most of my day's work is on client code written in system programming languages (C and C++.) that which isn't, is in C#. I don't know of anything in particular that they're doing wrong because I don't really keep in touch with that particular PL community.

that said, my ideal pitch page would contain something to the effect of: "extensive standard library, including robust facilities for concurrent/non-blocking I/O". nothing marketing-y about non-blocking I/O or a given programming model making a language intrinsically "fast" or scalable, like nodejs.org's:

"This is in contrast to today's more common concurrency model where OS threads are employed. Thread-based networking is relatively inefficient and very difficult to use. [...] Almost no function in Node directly performs I/O, so the process never blocks. Because nothing blocks, less-than-expert programmers are able to develop fast systems."

in addition, node.js is in a little bit of a weird position of introducing a whole new language environment for an existing programming language, since node.js is effectively the vanguard of server-side/concurrent Javascript: I'd hope for an official series of tutorials designed to introduce the nonprogrammer, or "Web app-only programmer" of Javascript, to concurrent programming in the large.

that said, the node community might already be doing things like this which aren't obvious to the layman, and if so, good on them. good concurrent programmers are hard to find, and anything that encourages people to learn how to do concurrency well is a good thing.

Node is marketed towards people with Javascript experience to a large degree, it seems to me (I could be wrong). People with Javascript experience more than likely have it via writing for a browser. A browser functions functions identically to Node: the event loop. So, those coming from writing Javascript in a browser will merely carry through their knowledge of development practices to Node, meaning they intrinsically will write event based code.

Or at least that line of thought makes sense to me.

Really? I've almost never seen node.js pitched to anyone other than newbies (although not in the enterprise trenches for sure).

Rarely the pitch involves being able to share libraries between the server side and browser side (wonderful benefit of node.js)

Most of the time it's being sold as "you already know javascript" or "it's super fast, because non-blocking is magic sauce!"

I haven't used Node.js for anything serious but the hype and especially the discussion around the now infamous node is cancer post sure makes it seem like Biilmann was correct when he said "For better or worse, Node is like the PHP of concurrency and ever so often worse is better"

Even if that's not true, if that demographic is the dominant users of and contributors to Node then that's what Node will become.


I find the notion of sharing libraries between the client and server kind of odd, in that people who talk about the "open web" think it's a desirable quality.

it allows you to rev a protocol faster, but at the cost of not being forced to have a point of reference that isn't also intrinsically tied to one implementation of the protocol, which seems a very un-open thing to do to me.


The libraries I was thinking of sharing are all ones that are tightly coupled to the specific pages they would be on and all related to UI.

Things like form validation, user input normalization, navigation options. There are ton of small data manipulation functions that you either have to maintain in both javascript and your server side language and cause some weird bugs when one version doesn't behave exactly the same as the other or else you need to make a bunch of unnecessary ajax calls just to avoid reimplementing things in js.

There is much deeper integration you could do but I think we're on the same page as to why that's a bad idea.

If I try to ask for a large number (say 1 million) I run out of RAM:

  $ node app.js 
  FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory
Somehow you're using O(n) RAM to do the calculation. Seems bad bro.

This code is the epitome of roflscale

That's a consequence of memoisation. It scales better than the original code though.

Still, the original point (node is cooperatively multitasked) was clear. Anything beyond that and I just want to reach for better Fibonacci algorithms.

And this is why, any cache without an eviction policy and/or size limitation is bad (and memoization _is_ a cache). I shudder every time I see a "magical" memoization function whose interface is just

   memo function-to-memoize

I think it'a a valuable tutorial to show node.js idioms. It's pretty easy to read, but I for one would struggle to write it as concisely.

Oh god, I hope you're joking.

I have to agree with you, I guess it only shows how little I know node.js

I think this proves Ted's point. Look at that code. It is ugly as sin. Since I'm not an expert at node.js, I can't follow what is going on at all. Plus, this is memoizing. That is just cheating, it doesn't matter how you implement the algorithm if it memoizes.

If the creator of this github repo is to be believed, implementing a 3 line algorithm in node, you must use 20 lines of intricate, ugly code. If that is the case, node.js IS cancer.

Node.js can be done in different ways to this, the fact is, that this code works asynchronously, and this is how Node does async.

Instead of memoizing in the server you could memoize in the browser and use socket.io to ask connected browsers if they have a memoized value for a given fib number. That would get you around the limitations that bascule mentioned.

Now there's an idea! That could be rather fun...

As much as this whole Ted-gate thing has turned into a amped up argument of sorts, there has been a significant amount of knowledge for everyone shaken out of the trees about node.js as a result.

Take for example this implementation, how many people getting started with node knew about async or memoization?

Could be just me, but I learned a lot about what not to do and how to do certain things more efficiently with node as a result of the whole fiasco.

Thanks Ted :)

Aahz's law

   The best way to get information on Usenet is not to ask a
   question, but to post the wrong information.

There's an xkcd for everything... http://xkcd.com/386/

Unfortunately, it's usually that one.

async is not part of the node.js core - it's an external module.

I'd like to point out that this uses important improvements over the naive Ziuba's strawman version: memoization and callbacks using the event loop. The callbacks improve concurrency, allowing a single process/thread to multitask requests, and memoization reduces total time spent per request.

Actually, if the async lib memoization facility shares values between requests, which seems quite sure to me, all requests but the first one are served from its cache in the test the author uses as example. This still doesn't highlight any strength of node.js IMO, asides from the easy shared memoization. It could be translated to Tornado-web, an async Python framework, almost literally.

> I'd like to point out that this uses important improvements over the naive Ziuba's strawman version

The Ziuba version was not "a strawman", it was an example used to trivially generate high in-request CPU load and demonstrate the misbehavior of evented system under this condition.

If you want to compute fibonacci fast, just use Binet's Fibonacci formula, it runs in O(1), then your only issue is that you're going to overflow very fast on doubles (not sure you'll even reach fib 1500). Using decimal, you can probably go higher before reaching your limits (though you'll run out of display space long before that if you print all digits): Python's decimal.Decimal reaches ~fib 4000000000 before an overflow error.

Binet's formula doesn't run in O(1) if you want an exact answer. Using a + b * sqrt(5) representations or floating point with sufficient precision, you can get O(n log n 2^O(log* n)) time (quoting Wikipedia's Furer's algorithm page here). Which beats O(n log n log log n) time, if you were wondering. If something beats that for computing Fibonacci numbers, that'd be pretty cool.

It does in fact share memoization. Easy shared state is a key strength of the single-threaded model. :)

Re: Node.js, you're right. The core of the debate is whether the single-threaded, evented model is a legitimate theoretical approach to concurrency. That programming style is the same across Node.js, Tornado-web, EventMachine, etc. It translates literally to Tornado-web because the idioms are the same.

Is it just me, or have all these implementations of fast Fibonacci servers missed the original intent of the 'Cancer' article? As I read it, it was more of a commentary on giving novice concurrent programmers enough rope to hang themselves than on the capabilities of Node itself.

Ted-gate was advocating a return to CGI instead...

Perhaps if enough alternative implementations are submitted to HN it will drown out the silence on the actual issue that was raised.

The only difference in this implementation is that it uses memory-based caching (memoization) to compute given value once and then serve the cached copy.

Of course this is fast for 1000 iterations, since the effective cost is zero from the second request.

People really are misunderstanding the critique of fibbonacci as representing any CPU intensive task.


All the author did here was remove the CPU intensity by caching the calculation

It also uses process.nextTick aggressively (for each recursive call) so that the server can take other requests in the middle of computing a big sum.

I'm actually very confused by his criticism. You can write a CPU intensive task in any language, and you'll have the same problem.

Or is that the point? Some people believe Node.js will magically make all processing computations = 0? I'm all for discouraging the rumor that Node.js will solve every problem, but don't call it Cancer.

The issue is what happens when you write that CPU intensive task. If you do it in idiomatic Go, that same server will keep on responding to other requests in the meantime. With Node.js, that is not the case. You have to do things like, well, what this article does, to get it to work.

I'm still not getting it. Go might be able to handle an extra request or two for high CPU intensive tasks, but it will inevitably suffer the same consequence of locking all it's processes given enough requests.

Still doing anything that is CPU intense on that layer is stupid in the first place.

A so called 'goroutine' locking up will not lock up the rest of that server. Other pages will still be served by that server. This has nothing to do with the efficiency of the underlying language, it has to do with Go doing cooperative multitasking transparently. It might as well be a separate process (in fact, for all you really know it could be).

The code seen here does cooperative multitasking explicitly. It is essentially explicitly specifying places where rescheduling can take place. The result is the same but with this Node code you are getting your hands far more dirty.

The point of the Cancer article is that since it (allegedly) is not made explicitly clear what is going on, programmers can deal far more damage to themselves than they would with other schemes.

I really don't know how to explain it any better than that.

PS: "cpu intensive" is relative. The fib example is a deliberate exaggeration of what you'd see in reality, to make the point trivial to observe.

Right. If Ted-gate was advocating using go or erlang I'd be so cool with everything except his choice of language.

But he wanted us to go back to CGI folks...

To be clear, CGI doesn't exhibit the issues seen with Node either, it's issues are elsewhere (and separately debatable).

Well, that just makes me more confused, because most languages I know handle multitasking explicitly, and they can all be dangerous when programmers exaggerate "cpu intensive" processes. I appreciate you trying to explain this guy's expectations, but I can't help but see this all as trolling FUD.

Many languages offer threads, which either the operating system or the runtime will preëmpt to ensure they all make progress in some vaguely fair way no matter how they behave. I haven't been required to write "I have more work to do, but this would be a good place to interrupt me" for fifteen years (when Win95 displaced Win3).

I wonder why I don't often see folks directly computing Fib(n) using the equation given in SICP exercise 1.13:

Fib(n) = round(φ^n / sqrt(5)), where φ = (1 + sqrt(5)) / 2.




To answer your (possible rhetorical) question literally, the requirement here isn’t that it generate Fibonacci numbers efficiently, it’s that it implement the recursive algorithm efficiently.

It’s true that another algorithm will be much faster, but that deprives us of the opportunity to discuss how this implementation of the recursive algorithm delivers much higher performance than a naïve implementation.

The back story is that an abrasive blog post asserted that Node.js was a terrible platform for certain types of problems, and this algorithm was given as an example. The author here is simply showing that with care, Node.js will not be the problem.

For shits and giggles, here’s another Fib algorithm, chosen strictly for the pleasure of implementing it:


The specific point is that the evented model is not designed as a "one size fits all" solution. It's great for certain scenarios, not very good for certain others.

Basically, the original blog post tries to get to this, remarking on node.js is sold as if it was the silver bullet of web servers, that will scale to anything for anything you .

Of course, you can try to adapt most problems into something that would work in an evented model, but you have to know VERY WELL what you're doing.

IMO the solution is simple: polyglotism. Have parts of your app in node. Other parts in python. Others in ruby. Whatever works best for each specific small part :)

Node is not useless, but it's definitely not my choice to program everything and anything. Same thing with, for example, Rails :)

It’s tangential to this thread, but since it’s a topic close to my heart:

If you just implement this formula using doubles then it only gives the correct answer for input values up to about 70. Yes it’s pretty fast, but a static table of the first 70 Fibonacci numbers would be even faster. If you’re only interested in the first few dozen Fibonacci numbers then it doesn’t really matter what algorithm you use, as long as it isn’t naive recursion.

If you want to get correct answers for larger Fibonacci numbers, you need to work harder to use this equation. You need some sort of multiple-precision floating point arithmetic library, for example.

And once you’ve done that, the resulting algorithm is quite a lot slower than the best integer methods.

I wrote a big blog post about this back in July: http://bosker.wordpress.com/2011/07/27/computing-fibonacci-n...

That formula fails for fib(71) with double precision floating point arithmetic:

    In [1]: def fib(n):
       ...:     a,b = 0,1
       ...:     for i in range(0,n): a,b = b,a+b
       ...:     return a

    In [2]: def fib2(n):
       ...:     phi = (1+sqrt(5))/2
       ...:     return round(phi**n / sqrt(5))

    In [3]: fib2(71)
    Out[3]: 308061521170130.0

    In [4]: fib(71)
    Out[4]: 308061521170129L
And it's not that the latter is not exactly representable as a floating point number:

    In [5]: float(_)
    Out[5]: 308061521170129.0
So you'd have to use bigger precision numbers. How many digits are you going to use? You have to use enough not to produce a wrong result and not too many to slow down the algorithm. I bet that the resulting algorithm is slower than the standard [1 1; 1 0]^n matrix exponentiation. Note that your formula comes from the eigenvalue expansion of this matrix exponentiation, which -- despite what you might hear from undergraduate linear algebra courses -- is not a good method for matrix exponentiation. On the contrary, finding eigenvalues is done by a process analogous to matrix exponentiation: http://en.wikipedia.org/wiki/Power_iteration

Would be cheating, man. Micro-benchmarking is all about using fictitious tests to show off fake performance in blog posts.

Or, for that matter, using fictitious tests to show of fake lack-of-performance in blog posts.

Why would you want to use that formula? The doubling identities are faster and avoid floating-point arithmetic.

I just tried this:

    Math.round(Math.pow((1 + Math.sqrt(5))/2, n) / Math.sqrt(5))
But it starts returning incorrect results at fib(76), presumably due to floating point calculations lacking enough precision.

That's actually pretty good. Double precision floating point (IEEE format, anyway), can represent all positive integers up to 9007199254740990, which takes you up to Fib(78), so the fact that you can get up to Fib(76) means you are getting close to the top. Good enough for most work.

After 9007199254740990, double precision only represents even integers. Fib(79) is odd, so can't be represented. Fib(80) would be representable though, as will be every other Fibonacci number for a while. Then comes a range where only multiples of 4 can be represented, and then a range with multiple of 8, and so on, so while there will be representable Fibonacci numbers in those ranges, more and more will be omitted.

By using a field extension over √5, you can evaluate this exactly without rounding or resorting to floating point.

I think that's a very sharp observation. I tried it, but in order to exponentiate the vectors (instead of floating point numbers) I end up with the original matrix algorithm (pre-eigenvalues)...

This is the wrong approach. The right approach is to write your computationally-intensive code so that it runs in another process, and then to talk to that process over the network. Network servers must only talk to the network.

If I have unused capacity, I can afford simpler threaded code. If I'm overloaded, I need some other platform for the real work. It sounds interesting but unless I need to write my own software load balancer, I'm not sure what node (as implemented today) is appropriate for doing.

Note to Ted: next time, just use sleep().

To which the response is the same: simply don't block the event loop and use setTimeout.

The whole cancer thing is based on the premise that blocking the event loop is unavoidable. I'm attempting to say that it isn't. Jerf's points about how it can happen accidentally are perfectly valid, it's up to the developer not to do something stupid and to actually have decent tests and benchmarks in place - as it would be with any approach.

Its certainly true that significantly poor performance in code which doesn't release the loop will degrade a single-threaded event loop far more than a threaded approach.

CPU speed and bus speed is finite on the machine you're running on. Using setTimeout to poll the results of some thread you've spawned to run a calculation is _worse_ performance than just blocking in the first place, because now when you try to profile your application (how does one do that in node.js, anyways?) you're going to be looking at a million setTimeout closed loops.

Here's Ted's entire point: "The point is that the HTTP server isn't the same entity doing the application work."

..which has basically been true ever since mod_jk was created.

The point is moot. HTTP is just the transport protocol in this case. Doesn't have to say that it has to be the front end service that talks to the end user web browser. The OP of the "cancer" article made up that shit. I can bind the listener of a node service to an UNIX Domain Socket and talk HTTP over that socket with a front end web server, instead of plain TCP (reduces the IPC latency). It is IPC between distinct components. It delegates the dynamic page generation to another service. It is a text interface, although sometimes, well, most of the times, the binary protocols outperform the text based protocols therefore the theory is crap. Isn't this "UNIX way" enough for some people? Forget the part that says: any IPC adds extra latency to the request - response paradigm aka the opposite of the stuff a web stack should do.

Can you tell me at least one example from the myriad xGI protocols for interfacing a web server with an application server that actually brings any benefit over having plain HTTP/1.1 and a reverse proxy? Does it help to have yet another protocol for doing basically the same task: IPC between a couple of servers? Any xGI client (aka web server) is basically a reverse proxy for an IPC protocol. Does it say somewhere, carved in stone, that this protocol must be other than HTTP? I think that most people that spend the time engineering yet another xGI solution simply don't see the forest for the trees.

PS: nobody stops anybody to drop a scgi.js or fastcgi.js module for node.js if that keeps people warm at night. But maybe I'm one of those tired of this shit: engineering yet another xGI protocol for every programming language that floats around the web. NIH syndrome, I'd say. At least HTTP is ubiquitous, while most servers already HAVE a reverse proxy handler.

Doesn't that kind of take the wind out of Node's sails as far as its claims of being easier for dummies to write non-blocking code, if you have to do all the usual due diligence anyway?

I'm genuinely curious as someone who's only familiarity with Node is hype and through hearing the adjective "evented" thrown around with wild abandon lately.

  function fib(n)
	var phi = (Math.sqrt(5) + 1)/2
	return Math.floor( Math.pow(phi, n)/Math.sqrt(5)+0.5)

Anyone know offhand what the minimum value of n is for which this fails due to floating point inaccuracies?

Javascript doesn't have more than one number data type. This algorithm may fail earlier due to loss of precision from the way JS does exponentiation or sqrt or some such but even a purely additive method will fail due to floating point inaccuracies as well.

Edit: On an in-browser test with Chrome I get accurate values up to 75 with the closed form function and accurate values up to n=78 for the recursive / additive function.


    $ time curl localhost:3000/1000000
    curl: (52) Empty reply from server

    real	1m0.392s
    user	0m0.006s
    sys 	0m0.004s
This actually crashes the server because it runs out of memory:

    $ node app.js 
    FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory

You can probably get it work by asking it for increasingly large numbers to break up the calculation into stages.

v8's current GC implementation has a hard memory limit of 1GB which is uppable to 1.7GB on 64bit systems. There's a new GC implementation in the works which is supposed to be able to beat this.

The memory usage is due to the overhead of nextTick creating a new stack each time, and the interpreter keeping track of where the callbacks need to lead. Splitting the calculation up into batches could be incorporated into the algorithm quite simply by adding an additional task to async.series which called fib(i) for i to n with a step of 10 when n is > 100.

Obviously there are much more efficient and effective ways of calculating the fibonacci sequence, but I quite like the recursive algorithm for its clearness and its amenability to memoisation.

32bit or 62bit node? If 64bit did you specify it can use more ram than the 1.7GB that's the default setting?

You can up the limit by starting node with --max-old-space-size=<size in MB>

Fibonacci isn't a kind of problem that should be bounded on RAM though. Something is fundamentally broken with the way this is implemented.

Sure it is. Stack space isn't free

It is with TCO.

How exactly is the particular implementation in question tail-call optimized? Honest question, not trying to be snarky. I'm well aware that fib can be done to use TCO, but this doesn't appear to do so

Any tool in the wrong hands, is dangerous. So anyone trying to prove that fib makes node slow, is stupid. Node.js was made to make memory requests unblocking; not to generate the fib series for you.

Trying to nail a hammer with a drilling machine, is not to take yo anywhere.


I'm actually quite impression with the 5000 requests/second ab result. I can't get more than 1000 requests/second on my pony laptop. What hardware are you using?

Wow, this really made me laugh.

Me too. I was rolling my eyes this morning at all the he said she said stuff. It's been a pretty standard example of "hacker news eating its own tail" today.

- Node sucks;

- No it doesn't * 2;

- Use Haskell;

- (others) ?

This one however has made me laugh as well :)

too much boilerplate :)


BREAKING: glenjamin cures cancer.

Too bad the code is so nasty

How so? IMHO, it's an elegant demonstration of the async module.

But is the async module itself elegant? I guess it's relative, it's even less elegant not to use the module. The async module is still a hack, the cleanest one Node has.

I think the whole discussion about "cancer" wouldn't have happened if the V8 team added continuations by now. They mentioned that they will so we might be cleaning up big chunks of code.

It's funny how a tech can get so popular with these big warts. Probably because people use it in a very limited way, no one writes CPU intensive code because you rarely need it for serving sites and data. Then the clumsy callbacks aren't so bad.

Check out the Cilk version, which is not only concurrent but parallel as well: http://myxman.org/dp/node/182

I agree with jerf that either you shouldn't have to worry about splitting your computation at all or at least you should have syntactic sugar for it. The Node solution has much more noise than code.

http://hpaste.org/52109, in case one prefers to be more explicit.

It's beautiful how node takes a 2 line algorithm and turns it into 20 lines of twisted magic, lest you block all your connections.

I lolled (at work).

lol ted is really mad :D https://github.com/teddziuba/node-fib

Now I just need an async ajax library, so I can do:

  fibonacci(40,function(f) {

I've just submitted a pull request that I think really enhances this app. https://github.com/glenjamin/node-fib/pull/5/files

wow.. this is incredibly juvenile. you should need to appreciate other people's opinions too apart from yours


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact