
>You don't need to reason about multiple instruction pointers or mutexes or re-entrant, interruptible execution because all the javascript you write just lives in a single thread.

I see.... On the other hand, when I write request-handling code using (any web framework ever, on any platform), you're suggesting that we do worry about mutexes and reentrance (not to mention that handling a single request uses multiple threads?), and that these details aren't already handled by (any web framework ever)?

Interesting.




Node is not a web framework. It's something one might use to build a web framework. So it's a bit silly to say that 'any web framework ever' already deals with this for you – well, duh, but the framework author certainly had to deal with thread safety (possibly incorrectly). And don't say no framework user has ever screwed up shared state because they didn't realize how threadlocal works or that requests might be handled in another thread...


> but the framework author certainly had to deal with thread safety

Usually not. If your framework builds on the common web integration system for the target environment, i.e. Java Servlets, WSGI in Python, etc., you're already handed a request scope within a thread. It's only web servers and containers that really have heavy lifting to do.


Okay. Clone the Rails repo and do this: git log --grep thread


To be fair, though, Django is not thread safe. Now, while I respect Rails for making the effort to become thread safe (IIRC, it was a Google Summer of Code project), it doesn't seem to be a must-have for a framework to be thread safe.


My understanding is:

When you are using a Ruby or Python based web framework, each outbound request is blocking, i.e. after issuing a request (e.g. urllib.urlopen()) you wait for the response before proceeding further, which as a result drives down the number of requests handled per second. Using EventMachine or gevent (for Ruby or Python respectively) is one way to overcome this.

In node.js there is no such concept of a blocking request and everything is processed in a single thread, but you can always fork multiple processes using its cluster module: http://nodejs.org/docs/latest/api/cluster.html
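
To make that concrete, here is a minimal sketch based on my own reading of the cluster docs (the port and the setTimeout stand-in for slow I/O are made up): the master forks one worker per CPU, and within each worker the handler returns immediately so the single thread stays free while the "I/O" is pending.

    var cluster = require('cluster');
    var http = require('http');
    var os = require('os');

    if (cluster.isMaster) {
      // Fork one worker per CPU; each worker runs its own event loop.
      for (var i = 0; i < os.cpus().length; i++) {
        cluster.fork();
      }
    } else {
      http.createServer(function (req, res) {
        // The handler returns right away; the response is written when
        // the timer fires, so other requests can be accepted meanwhile.
        setTimeout(function () {
          res.end('handled by pid ' + process.pid + '\n');
        }, 100); // stand-in for a non-blocking I/O call
      }).listen(8000);
    }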

edit: Oh, I was not trying to advocate anything. node.js is indeed just a choice; apologies if I didn't make myself very clear when I said "My understanding is:"


Ruby and Python (not to mention basically every other language) have a plethora of threaded, forking, and evented models for concurrency. Thin/Rack/Sinatra, for example, behaves capably in a thread-per-request mode. Reactor systems are not the only choice (nor are they even preferable in many cases).


Blocking vs. non-blocking has nothing to do with the language; e.g., see Tornado for async request handling in Python.


How did you find gevent and totally miss Twisted, which is linked on the front page of nodejs.org?

There are a handful of frameworks that assume every request is fully concurrent and non-blocking. Twisted-based stuff is the most prevalent for Python: Nevow, Athena, and the stuff baked into twisted.web. However, WSGI itself doesn't say anything about the concurrency of individual requests, and it's totally possible for WSGI requests to be multithreaded, multiprocess, or otherwise concurrent.


I think what he is getting at is that if you want to handle multiple requests within the same process (in order to have in-process shared state, so you can make your chat server without external dependencies or whatever), you either need an event loop, threads, or both.

If you have threads, you do have to worry about mutexes and reentrance.
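
For illustration, a minimal sketch of the event-loop case in Node (port and variable names are my own, not from the comment): every handler touches the same in-process array, and because callbacks run to completion one at a time, no mutex is needed.

    var http = require('http');

    var messages = []; // shared, in-process state: no lock required

    http.createServer(function (req, res) {
      if (req.method === 'POST') {
        var body = '';
        req.on('data', function (chunk) { body += chunk; });
        req.on('end', function () {
          // This callback runs on the single thread; nothing else
          // can touch `messages` while it executes.
          messages.push(body);
          res.end('ok\n');
        });
      } else {
        res.end(messages.join('\n') + '\n');
      }
    }).listen(8000);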


None of that has anything to do with writing web-request-handling code; it has to do with writing custom servers.

There are plenty of asynchronous solutions for writing servers, including the very widely used Twisted platform as well as the Erlang programming language. It's a tad annoying that node.js is touted largely by front end developers as solving a supposedly previously unaddressed problem.

to wit:

> emerging set of programming conventions, philosophies, and values that I see evolving in the node.js community

no, sorry, they've already emerged, they've already evolved. Go download Twisted. Use node.js if you happen to like it better. But it's not the fricking messiah.


As someone who has battled with Twisted, I don't think this is a fair rebuttal. Twisted does _allow_ you to create servers, but it makes doing so pretty difficult and messy.

Node allows you to do the same thing very elegantly.

I don't think the problem is unaddressed, but node has the most elegant solution I've seen.


The asynchronous aspects were a very small piece of the article for a reason. Node is doing a lot of very interesting, important, and novel things that aren't related to asynchronous events at all. Twisted in particular suffers from too much exposed surface area and having to manage the reactor yourself, which hurts reusability a lot.


I'd prefer to see a detailed article about that.

Also I'd like to see more detail on why exactly writing custom servers is so profoundly important all of a sudden. I hardly see the advantage of even nginx over apache, though that's a different issue.


WebSockets probably. None of the traditional server-side web frameworks (Ruby on Rails, Django, Java Servlets, ...) can deal with them because you need to maintain open connections with each client. That also makes asynchronous I/O important (a feature of nginx over apache).
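
As a rough sketch of why long-lived connections favour an evented server (this uses the third-party ws package purely as an assumed example; socket.io was the common choice at the time): thousands of idle connections can be held open in one process, each costing a callback registration rather than a thread.

    var WebSocketServer = require('ws').Server; // assumed dependency
    var wss = new WebSocketServer({ port: 8080 });

    wss.on('connection', function (socket) {
      socket.on('message', function (msg) {
        // Broadcast the message to every connected client.
        wss.clients.forEach(function (client) {
          client.send(msg);
        });
      });
    });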

Most WebSockets solutions are clumsy at best, because you need to run something independent from the rest of your web stack. Unless you use solutions like Wt (http://www.webtoolkit.eu/wt).


What do you mean when you say "independent of your stack"?

memcached, postgres, rabbit, etc aren't written in the language I use, but they're very much a part of my stack.

Using an external application to hold the WebSocket connections and be the middleman between your app code and the user is not automatically clumsy. Clumsy would be re-writing your app in JavaScript because you believe it's the only language suited to WebSockets.


Did you even read the line you quoted? He didn't say "emerging set of software capabilities." The programming conventions, philosophies, and values of the Node.js community are evolving. Your response doesn't make any sense.


You hear this when people praise node.js, and it might be true for all I know, but I have yet to see a concrete example comparing node.js code to the same code in another framework/language. Would be very interesting and help non-node.js users understand the benefits.


As soon as you're doing anything beyond CRUD on a database, you will probably have to worry about them.

Unless Erlang blablabla...


When working with requests that share resources via web frameworks that deal with threads, the broad consensus is to put the shared state into an externally managed resource. This external resource, such as a database, will manage the locking for you to prevent write conflicts. However, when caching values for complex interactions beyond CRUD, you may have update conflicts where a new request holds an in-memory copy of an outdated resource. This is where the problem lies.
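
A contrived sketch of that update conflict, translated into JavaScript only to keep one language in this thread (the original point is about threaded frameworks, and the store object is a made-up stand-in for a database or cache): two overlapping requests each read the same value, and the later write is based on a stale in-memory copy.

    var store = { views: 10 }; // stand-in for an external resource

    function handleRequest(done) {
      var copy = store.views;    // read: in-memory copy of the resource
      setTimeout(function () {   // slow work; another request runs meanwhile
        store.views = copy + 1;  // write back from the (possibly stale) copy
        done();
      }, 50);
    }

    // Both requests read 10 and both write 11, so one increment is
    // lost -- the update conflict described above.
    handleRequest(function () {});
    handleRequest(function () { console.log(store.views); }); // 11, not 12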


Well, that's a cache-invalidation issue. Any data that's cached may, by definition, not be the freshest version unless you've implemented a very nice write-through scheme. If you'd like multiple requests to share an in-memory-only cache, potentially using write-through, then yes, that involves synchronization issues. I wouldn't characterize them as super-tough synchronization issues, and you certainly won't have a "blocking I/O" problem with an in-memory system.

If you want to write your own caching server, fine, use node.js. But now you're writing your own cache server because you've found some problem that memcached, redis, etc. all do not solve. Is this an everyday use case?


For me it is. However, I tend to just use in-memory caches instead of a cache server, which may suffer from the same issue. For standard CRUD websites this is a non-issue; for task/activity-based sites that can have long-lived tasks persisting after a request, it is very important.


Wait, how does Node solve this where others don't?


By giving you a shared in-memory resource for the current state of an object that is not mutated by other threads during a single stack build/teardown. Concurrent access from threads requires locking the object until it is in a determinate state, while in Node you are guaranteed the state until the stack unwinds.
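
A small sketch of that guarantee (my own illustration; the handler and config object are hypothetical): within one synchronous run nothing else can mutate the object, but once you yield to the event loop the state may have changed.

    var config = { backend: '10.0.0.1' }; // shared, mutable, in-process state

    function handle(req, res) {
      var a = config.backend;
      var b = config.backend;
      // a === b is guaranteed: no other callback can run until this
      // function returns, so nothing can mutate `config` in between.

      setTimeout(function () {
        // After yielding to the event loop, the guarantee ends:
        // another callback may have changed config.backend by now.
        res.end(a + ' ' + config.backend + '\n');
      }, 0);
    }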


But this isn't unique to Node, is it? It seems like you're saying you've added a feature when you've really removed one. You can create a single threaded event loop in any language and get the same properties, no?

I'd argue something like Clojure is actually providing a feature here, instead of taking something away from your toolkit. You can have a single view of an object for the life of, say, a request all while its root binding is actually being mutated by other threads. You won't "see" the new value until you deref the root binding on the next request. Except nobody stole threads from you and sent you a bill.


Perhaps it is taking away a "feature" in some sense, but in my view it is taking the logical step of not allowing concurrent access to preempt execution. I often want the current value, which has been changed by an asynchronous task since the original context began (IP addresses of internal servers changing while a script was running came up today).

There is no way that preempted access / memory contention is a feature, but Clojure avoids this with mostly immutable state, which can make keeping up-to-date values painful, although I may not be experienced enough to say much about Clojure.

For web services such as ours, where we have values changing underneath us, it is elegant that we keep a value the same through a single flow of control (until the stack unwinds). Even if it is incorrect for one part of the task as a whole, it is predictable where the values can change, and dealing with errors from pointing to the wrong object/value is trivial compared to most race conditions (yes, Node has those too, before anyone jumps in).

The environment here is key, though. Node was built as a single-threaded event loop, and all the bindings and libraries for Node expect this. Libgmp's love of aborting threads after a process gives it a wrong value is a good example of where the single-threaded environment fights the threaded model, and the same problem of expecting threads shows up in many programming environments (.NET HTTP stack, I'm looking at you).

So in many ways Node does not give you something that cannot be done in other environments, but in other environments there is a lot of existing code that encourages thread usage. Doing something in Twisted or the like proved difficult once I needed libraries that had been written expecting threads. The same is true in Node, but I can be confident that good libraries/bindings for Node are written to work in a single-threaded event loop. And I like the command-queue / event-loop / actor-based / reactive model, whatever you want to call it. I like it more than anything for the lack of concurrent edits while still allowing a lot of mutability.


Ah, but node is really just the appearance of a single thread, which is actually a v8-managed evented thread pool (or some such magic).


Only one thread is used for computation (AFAIK); the rest are used to toss I/O onto. gevent/eventlet in Python do the same, as do libraries in other languages. Nothing unique there.



