The problem of the main thread waiting on a response from the database is easily solved without rewriting all your code, which incidentally will have the same issues, since there's no new magic there.
If your software is a giant, ugly hairball because you wrote it in a hurry (I know I did this many times!) don't blame it on the language or framework. What I'm wondering is: Python/Twisted/Django rank equally or similarly on all the points from the list, except for Team Experience, which has to be much higher with the old stack than with a completely new technology and set of tools. This makes me think that there was a "play with shiny things" point after all.
Python has no such conventions, making many libraries into dangerous minefields for the Twisted developer.
I'm personally not a huge fan of node, and I prefer to write non-blocking code in Aleph/Netty (which is closer to twisted in terms of potential to block due to blocking java libs) myself.
But, I can understand on a large team not wanting to deal with the potential for blocking libs.
That said, if performance is that important, it should be part of your test suite, and should catch that.
I believe our single biggest mistake from a technical side was not reining in our use of the Django ORM earlier in our application's life. We had Twisted services running huge Django ORM operations inside of the Twisted thread pool. It was very easy to get going, but as our services grew, not only was this not very performant, it was also extremely hard to debug.
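For anyone who hasn't seen the pattern being described, here is a minimal sketch of pushing a blocking database call onto a thread pool so the event loop stays free. It uses the standard library's asyncio rather than Twisted, and `blocking_query` is a hypothetical stand-in for an ORM call, so treat it as an illustration of the idea and not as the code from the post.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_query(user_id):
    """Hypothetical stand-in for a blocking ORM/database call."""
    time.sleep(0.05)  # simulate waiting on the database
    return {"id": user_id, "name": f"user-{user_id}"}


async def fetch_users(user_ids, max_workers=4):
    """Run each blocking query in a worker thread and await the results."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            loop.run_in_executor(pool, blocking_query, uid) for uid in user_ids
        ]
        # gather() returns results in the same order as the inputs.
        return await asyncio.gather(*futures)


def run_demo():
    return asyncio.run(fetch_users([1, 2, 3]))
```

Note the catch that matches the complaint above: concurrency is capped by `max_workers`, and a few huge queries can occupy every worker while the event loop itself looks perfectly healthy, which is part of why this setup is hard to debug.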
Django has in many ways become the industry standard in regards to fast development - but it, as so many other ORMs, does not scale and produces horrible SQL-statements.
Also for many applications I find Django to be a huge overkill - and in more recent times I've focused a lot more energy using microframeworks such as Flask and Bottle.
I disagree - many ORMs in the Python world? Try SQLAlchemy: it will put Django's ORM to shame. It builds fast, good-looking queries most of the time, and in the rare case where something isn't good enough you can easily tweak it. I can't say anything about Storm, but saying that the "industry standard" Django ORM is OK because "others have the same flaws" is just distorting reality to fit the thesis ;-) The ugly truth is that Django's ORM is not OK, and it bites many people at the point where it's a real pain to fix. The more developers are aware of this, the better.
> The django documentation for many things such as this is almost non existent.
He just linked you to documentation with exactly what you needed, so it surely exists.
> Some of it is from older versions as well.
He linked to the current 1.3 docs.
> And im sorry but how is a newbie supposed to find this stuff.
The second section of the docs is about models, the second heading is all about QuerySets. The link he pasted is literally right in front of you.
QuerySets: Executing queries | QuerySet method reference
If I were to diagram those two and Node.js, I'd say it looks like this:
Easy <---------------------------> Simple
Rails <-------> Django <------> Node.js
(a thought provoking talk for those who haven't seen it)
It is a brilliant talk. The only part that he left out of the discussion, but is worthy of a talk in its own right is the notion of the community (formation and characteristics) as one of the artifacts that result from the constructs of a language.
Also, not to parrot, but it's true - node is cancer. It's for fad chasers who have no idea how real servers manage to serve volume.
Fortunately, most people inexperienced enough to choose node are inexperienced enough to never need to scale, so it'll work out okay for them - for implementation-specific values of 'okay'. :)
Woah. I do both Python and JS, and guess what - there are a fair number of people who hate Python's significant whitespace. And to them, it's also a "hacky" language (with all kinds of __foo__ stuff). So, shitty-ness is in the eyes of the beholder.
> It's for fad chasers who have no idea how real servers manage to serve volume.
There are real world apps like Trello, which is on Node, and are scaling well. Get your facts right, or stop trolling.
But whitespace as debate topic over corner cases in JS language? really?
And also Trello is just launched and suddenly it scales off miles away? really?
Pagerank 6 == webscale?
Those facts are weird...
Arguing syntax is more often than not pretty absurd, unless you are arguing how the syntax actually changes semantics (like with S-expressions and overloaded parens).
> You even suggest Coffeescript which is big on making
How does the age of the argument impact its validity?
> you're only admitting that you're too lazy to learn how to program with a prototype-based language.
If you don't like Erlang you are just admitting you are too lazy to learn the actor model. See, I'll just call you "lazy" and it will validate my argument.
Not disagreeing with you here, but it would help if you'd actually mention why. You said something about it being a prototype based language. Is that why you think it is great at asynchronous design?
Python and other non-browser languages are far more powerful and expressive, and — since you refer to Python 3 — willing to further improve.
if __name__ == "__main__":
For example, I think if you'd have to explain how to change the string representation of some kind of object, saying "just define the 'str' method and then call it, like 'something.str()'" is simpler than "define a '__str__' method -- don't worry about all the underscores -- but then don't call it directly; instead of that, use the global 'str' function".
I don't know if that extra indirection (i.e. calling 'str', which, in turn, calls '__str__'; or 'next', that calls '__next__' in Python 3) adds more value than the simplicity of not having that extra indirection in the first place.
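To make the indirection concrete, here is a small sketch. The `Point` class is made up for illustration; the behavior of `str()` and f-strings falling back to `__str__` is standard Python.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # Defined with underscores, but normally never called by name.
        return f"({self.x}, {self.y})"


p = Point(1, 2)
text = str(p)              # the built-in dispatches to Point.__str__
formatted = f"point={p}"   # string formatting goes through the same hook
builtin_text = str(3)      # the same spelling works for built-in types
```

One argument for the indirection is visible in the last two lines: callers use a single spelling, `str(x)`, for every type, and formatting machinery can rely on the same hook.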
Just about the only underscore that you can't really avoid is the constructor syntax, "def __init__()". Which I guess is tolerable.
Can you elaborate on that point? I thought that was the canonical way to do exactly that... http://docs.python.org/library/__main__.html. If there's an easier/less verbose method (pun not intended) I'd love to know about it.
It is this environment in which the idiomatic “conditional script” stanza causes a script to run
The conditional script stanza is optional, and intended to make a .py file function as both a script and a module (i.e. be importable). To me, it's unclear why you would want to do that. The only thing you have to do to avoid using __main__ is not require any file to be both a script and a module/library. Then just put any entry-point code at the top level of whatever script invokes your program. Unfortunately the __main__ stanza is pretty widely used though.
On the topic of modules, there is a wrinkle in requiring __init__.py to be present in any library subdirectory that you want to be part of the namespace. It would be nice to have a more intuitive way for that.
I especially love it when I'm first coding, cause I like to test as I go. That allows me to run it to see what works and what doesn't.
When I'm finished, I know that I can just plop that file into a directory and use it as a library afterward. The majority of the code that I write implements that, and I find it a godsend, personally.
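For reference, the stanza under discussion looks like this in a complete file. The module contents (`greet`, `main`) are hypothetical; only the final `if` is the idiom itself.

```python
def greet(name):
    """Library code: safe to import from anywhere."""
    return f"Hello, {name}!"


def main():
    # Entry-point code: quick manual checks while developing.
    print(greet("world"))


if __name__ == "__main__":
    # True only when the file is run directly (e.g. `python thisfile.py`),
    # not when another module imports it.
    main()
```

Run directly, the file prints its greeting; imported, it only defines `greet` and `main`, which is what makes the "test as I go, then drop it in as a library" workflow possible.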
If that is their only reason to claim JS is a better language, I wouldn't bother listening to their opinion too much.
I think that pquerna has some pretty good ideas about how "real servers" work, but leaving that be, you could actually explain what's wrong with the system, rather than attack the people.
Having some Erlang experience myself, I can find plenty not to like about node.js. However, the Erlang guys have fired round after round into their own feet to the point where even if it's inferior, node.js has probably already surpassed Erlang for many server tasks.
Here's some of what I think is wrong with Erlang:
Several years ago, it was getting some "early hype noises", but sort of sputtered and bounced and never got off the runway.
If you look at who's building serious systems with Node.js, that statement doesn't hold up. CloudKick / the Rackspace monitoring team are hardly amateurs - neither are Joyent, LinkedIn, eBay or Yahoo.
Also see facebook and PHP. Just because Facebook uses PHP, it doesn't mean that PHP is a good language.
These people actively measured (using a spreadsheet at that) the various parameters of a programming language and settled the one that fits their requirements.
That spreadsheet is not "science" even though it has numbers in it. I can put higher numbers for Python and arrive at a different answer.
I completely agree with this comment: http://news.ycombinator.com/item?id=3366883 elsewhere in the thread.
And I'd like to see the results of your study where you have proven that "most people inexperienced enough to choose node are inexperienced enough to never need to scale". You should publish such a bombshell study.
But in truth, your fear is causing you to make nasty attacks and, well, just make shit up. Come on, just admit it, Sneak.
It's not fear of new, it's fear of broken. I encourage you to use whatever shiny stack-of-the-month you enjoy, as it basically amounts to my competitive advantage over people like that. I'll be over here using Apache and nginx and MySQL and python and perl and other languages that don't make me write abominations like '==!', same way I have forever. Best of luck to you.
>>> a = 1000; b = 1000; print (a is b)
>>> a = 1000
>>> b = 1000
>>> print (a is b)
Python has its quirks too. Most people learn them and get over it.
tl;dr: use "is" only with singletons (None etc.)
if foo is None:
if bar is not None:
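A short sketch of that rule of thumb, with a made-up `describe` function: compare by identity only against singletons such as None, and by value for everything else. Whether two equal integers happen to be the same object is a detail of the interpreter, so the sketch deliberately does not rely on it.

```python
def describe(value=None):
    # Identity check: correct for the None singleton.
    if value is None:
        return "nothing"
    # Value check: correct for numbers, strings and other data.
    if value == 1000:
        return "one thousand"
    return "something else"


a = int("1000")  # built at runtime
b = int("1000")
values_match = (a == b)  # True by value; the result of `a is b` is not guaranteed
```

Using `is None` instead of a truthiness test also keeps falsy values such as 0 from being mistaken for "missing".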
No it's not. You're just making one hell of an assumption.
And this person still has not learned from their mistake, they think that using SQLAlchemy would have solved their problem. If they wanted to make Twisted work they should have picked a database that can be used in a non-blocking way.
The reason why I like node.js is that everything is built in a non-blocking way. Twisted has the same philosophy until people think they can get away using a blocking library inside the IO loop.
And it's exactly the same situation as anything they could have used in Python, as soon as they add actual code to get work done (as opposed to hand over work and wait for a callback).
"""The reason why I like node.js is that everything is built in a non-blocking way. Twisted has the same philosophy until people think they can get away using a blocking library inside the IO loop."""
That's not what they were doing, though. At the end of the non-blocking fiesta you want to actually return results to a waiting connection -- actual results, rendered templates etc. That will block and take time -- especially in node.js which is single threaded.
So they just traded Python's threads/processes/lightweight threads for node.js processes.
Build a process pool and queue jobs to background processes. It's the same architecture you'd use if you had threads, so what's your point?
Also, the tone on a keyboard is often easily abrasive. The pen is mightier than the sword. The keyboard is harsher than the mouth. Don't fall for the bait; focus on the arguments.
There has been discussion about how to improve the site, sometimes from PG himself, sometimes from others. It isn't perfect, but in comparing the signal-to-noise ratio of HN to some of those other sites, I find that HN shines. Do you have any specific ideas on how to improve things here?
102257 perl http://www.cpan.org/
33270 java http://search.maven.org/#stats
31921 ruby https://rubygems.org/
18068 python http://pypi.python.org/pypi
5732 node http://search.npmjs.org/
Also keep in mind that scripting languages have higher numbers largely because people use them for writing scripts and command-line tools as well. And if they insist on using Java for scripting, well then... they deserve to be punished by walking up and down the stairs 20x per library per decision.
Last time I looked there wasn't much along the line of SciPy, NumPy, NLTK, or Matplotlib.
The rails community (and by extension ruby) is generally more product focused, whereas the python community (and by extension django) are more science focused.
I intentionally ordered the language and the framework differently for each because most people who use ruby/rails, got into that community for the sake of the framework, whereas most of the people who use python/django, got into that community for the sake of the language.
Finally, Python has much greater adoption in the academic world, so it's natural to expect that its community will have built more academic libraries like those you mentioned.
One thing I like about the node.js community is that, because they develop in the same language on both the client and the server, they consider the two one and the same. The only thing separating the two is latency and connection reliability. The latency issue is psychologically not much different than having a bias for doing things in memory and avoiding disk IO server-side. The lack of connection robustness is likely to evolve into solutions that mirror the problems that the Erlang/OTP community has spent a lot of time solving.
a bit of an understatement.
We had to write many things. There is a list of our dependencies at the bottom of this post:
Having said that, the only one I consider moderately painful is node-thrift; the equivalent on Python or the JVM is well maintained by existing groups.
LISP for example is a great technology for a lot of things, but library-wise it suffers because within that community there is a tendency for people to "roll their own" solutions.
Java has tons of libraries, but I'm not sure most people would say it's the right technology for a lot of things. It's generally the right technology for projects that involve lots of developers with mixed levels of ability (before some Java person down votes me for this, consider that Gosling himself made this point back in the mid-90s when he invented the language)
and how does "we picked Node.js" logically follow from "It is obvious that the JVM platform is one of the best ways to build large distributed systems right now"?
i can only begin to imagine the conversations a year down the line:
- how did you decide this?
- we used a spreadsheet. science!
- so you weighted against the asynchronous approach that confused your developers earlier?
- actually, we put most weight on it being an exciting new technology.
We played with the weights.
What I posted was just where it was left after a meeting 9 months ago.
Yes, we definitely weighted against our feelings of how we failed to employ Twisted Python. This was our experience; are you suggesting we shouldn't consider our experiences when evaluating something?
It really came down to a choice between pursuing a JVM based system, and Node.js.
Yup, we probably did want to do something new. But guess what, it's worked. I'm sure we would have haters the other direction if we were posting about how Node.js has failed us, but it hasn't, yet.
how will node fix the issues you had with twisted?
A lot of people use them for solving problems they DON'T have.
Lots of people use both Mongo and Node.js in ill-fated attempts at premature scaling (similar to premature optimization).
When you first start out, you don't realize that flow control is going to be an issue, you don't realize that you're likely going to need to figure out how to do stream processing. Those aren't especially difficult to learn, but it's a delayed learning curve. You don't know that you'll need them until you run into a dead end. There are libraries/modules that help with that now, but it's not trivial to figure out which is needed (at least not for someone just getting started).
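To show what "flow control" means here without tying it to any particular Node module, this is a minimal sketch in Python's asyncio: a bounded queue makes a fast producer wait for a slow consumer instead of buffering without limit. The function names and the `max_pending` parameter are made up for the example.

```python
import asyncio


async def producer(queue, items):
    for item in items:
        # put() suspends while the queue is full: this is the backpressure.
        await queue.put(item)
    await queue.put(None)  # sentinel: no more items


async def consumer(queue, results):
    while True:
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(0.001)  # simulate slow processing
        results.append(item * 2)


async def pipeline(items, max_pending=2):
    # At most `max_pending` items are ever waiting in memory.
    queue = asyncio.Queue(maxsize=max_pending)
    results = []
    await asyncio.gather(producer(queue, items), consumer(queue, results))
    return results


def run_pipeline(items):
    return asyncio.run(pipeline(items))
```

Without the `maxsize` bound the producer would happily enqueue everything at once, which is the dead end you tend to hit only after the input grows.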
Node is an interesting tool, and I think it solves some issues elegantly. I think some people use it for projects it's not suited for, but that happens with every language/framework/tool.
You don't get scale from a tool. You get it from experience in knowing how to run scalable systems and how to architect scalable algorithms and systems.
How is scale "built in" to Node any more than any other modern language/runtime?
"I want to scale my app on modern, multi-core computers. I know, I'll write everything so that it all runs in a single thread!"
This is the same fallacy that gets repeated every time node is mentioned. If you're scaling up past a single machine you have to figure out how to share state between multiple machines anyway, so you may as well run a process per core. Then you have a single way to share state, rather than one way between cores and another between machines.
So unless you're writing a desktop app that needs to scale up but only to all of the cores of a single machine, you're sometimes better off simplifying the state-sharing logic.
And your 'cancer' language is unhelpful. What are you afraid of? That somebody somewhere is using a bad language when a better one is available? That the NodeConf organizers reserved all of the good hotel rooms?
Why are you so angry at a programming community?
It's not like "somebody somewhere is using a bad language when a better one is available" is not hurting CS in general.
I think the real danger is people thinking they've learned everything there is to learn and closing their minds. Which includes telling people that "node.js is cancer". Node is letting people do all kinds of interesting things, and it isn't killing off anybody's favorite language.
I'm thinking along these lines:
And that, the second part, is where they and you are wrong, as tons of teams have found out. Scale is never built in; you have to work for it.
"Scale built in" is another way of saying "Mongo is webscale".
If you're a good front-end JS developer but have never written even a minimal reusable JS library, you've got a lot to learn before diving into JS.
As long as the code is waaay more modular than this:
They should be all right in terms of adding new features. And if Node.JS hits a scalability issue? Well, tough luck, I suppose.
I see that the JVM is a very serious contender but, as already noted: license issue. And let me add this: I'm betting my ass that there are the "I don't want to write Java" whisperers as well. Now now, don't BS me. I know how the so-called "engineers" think of Java these days. Sure, you could use Clojure, Scala, or JRuby. But seeing that the competitor is Node.JS, pretty easy choice, don't you think?
My experience writing JS code for Node.JS-based platforms (ExpressJS, etc.) is that there's a bigger chance that you'll write more code. Code to make the code more modular (sounds weird, doesn't it?). Code to work around the warts of JS. Each line of code has to be highly scrutinized because of those warts.
This thread is going to be a typical fun nerd fight, I guarantee you. I'm going to grab some popcorn and watch nerds slamming keyboards.
OT: "We should of used SQLAlchemy. We should of built ..." I thought this expression was Bristol-specific. Apparently it's taking over the US too.
It seems they looked at Gevent, Go, C++ and the JVM. As usual, no love for Haskell :(, not that they're entirely to blame there.
Now, let’s say you don’t want to do callbacks, but still take advantage of the Reactor pattern in EM. I wrote a patch for Sinatra that uses Fibers to wrap callbacks, hence allowing you to continue to program synchronously as you used to, and it’s called Sinatra Synchrony (http://kyledrake.net/sinatra-synchrony).
But in order to do non-blocking IO, you don’t need to use a Reactor pattern, because Ruby internally does not block on IO (the GIL does not affect this). And of course you get solutions like JRuby (built on the JVM), which provide for threads (and Rubinius 2 coming soon).
My APIs written with any and all of these concurrency options can get thousands of hits per second. It’s quite scalable, but still provides all the rich libraries, reusable code, and testing support that makes my APIs high quality, which to me (and my users) is more important than making them fast. With this approach, my APIs are both high quality and fast. And I have never experienced a single crash of any worker processes in production. I would know, because my process monitor is smart enough to observe the workers in my deploy code and informs me, while adding another worker to replace the fallen one. Did I mention I can do zero-downtime hot deploys? Again, all implemented in Ruby.
Go's not-in-core packages are a mixed bag. Lots of "one off" experiment projects. Which is fine, that's where it is in its life. Node.js was in the same place package-wise 18 months ago.
Additionally, at the time we were looking at this, Go was still doing releases every 2 weeks -- this was months before the Go 1.0 plan was even announced:
We're using it for handling jobs from Resque, which read from and write to our database and ping a 3rd-party server.
But the number of bugs in the em-enabled libraries is just amazing. For example, em-activerecord just didn't work well under huge load (it dropped SQL connections), so I had to write my own mini ORM for the workers. Also, Resque blocks by default, so of course doing a non-blocking and non-forking version was a priority.
The worst thing here are the error messages and backtraces. There are none (DeadFiberException for failing assert) and with hacks you can get out a bit more information if the job crashes.
Thinking about this again, it would be a good idea to rewrite it with Node.js, just because it's still more mature than EventMachine, and all of its libraries are reactor core friendly by default.
That's interesting - could you elaborate (here or in a blog post) on your findings? What was different between your ORM and em-activerecord that made it more performant?
The thing is - my first thought would have been to leave the ORM alone and focus on the DB side (like more sophisticated connection pooling/pgbouncer, etc. ). Which is why I'm interested in what went wrong in the ORM that made it screw up when used in a non-blocking kind of a setup.
I will do a blog post about this when I finally have some time (Christmas holidays, maybe). Em-activerecord worked very nicely at first, but when I hit it with thousands of concurrent jobs, some workers just dropped dead saying MySQL couldn't answer. This was annoying, because the workers didn't really fail in Resque, they just failed to do their job, and if this had been in production it would've cost us thousands.
So, my own ORM is just a database superclass with em-connection-pool, configuration, openstruct and a couple of class and instance methods (insert, update, find, query). And now this thing is really, really fast, uses a lot fewer SQL connections (delayed job had 300 connections, now we'll need only ~40) and scales really well.
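The design described above is Ruby on EventMachine; purely as an illustration of its shape, here is a hedged Python analogue using sqlite3 and a blocking queue as the pool. Every name in it (`ConnectionPool`, `Model`, `Job`, the `jobs` table) is hypothetical and none of it is the poster's code. The point is only that a small fixed pool plus a handful of class methods bounds the number of open connections.

```python
import os
import queue
import sqlite3
import tempfile
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: callers wait for a free connection instead of opening more."""

    def __init__(self, path, size=2):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(sqlite3.connect(path))

    @contextmanager
    def connection(self):
        conn = self._pool.get()  # blocks while every connection is in use
        try:
            yield conn
            conn.commit()
        except Exception:
            conn.rollback()
            raise
        finally:
            self._pool.put(conn)


class Model:
    """Tiny base class; subclasses set `pool` and `table` (trusted identifiers)."""

    pool = None
    table = None

    @classmethod
    def query(cls, sql, params=()):
        with cls.pool.connection() as conn:
            return conn.execute(sql, params).fetchall()

    @classmethod
    def insert(cls, **fields):
        cols = ", ".join(fields)
        marks = ", ".join("?" for _ in fields)
        sql = f"INSERT INTO {cls.table} ({cols}) VALUES ({marks})"
        with cls.pool.connection() as conn:
            return conn.execute(sql, tuple(fields.values())).lastrowid

    @classmethod
    def find(cls, row_id):
        rows = cls.query(f"SELECT * FROM {cls.table} WHERE id = ?", (row_id,))
        return rows[0] if rows else None


class Job(Model):
    table = "jobs"


def demo():
    path = os.path.join(tempfile.mkdtemp(), "jobs.db")
    Job.pool = ConnectionPool(path, size=2)
    Job.query("CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT)")
    job_id = Job.insert(name="ping")
    return Job.find(job_id)
```

With a pool like this, the connection count is a configuration value (`size`) instead of something that grows with the number of concurrent jobs, which is one plausible reading of how the drop from 300 to roughly 40 connections came about.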
The thing here is, that when I'm not using Rails at all, I don't know why I should use evented Ruby instead of Node.
You can probably use both Node and Ruby. From what I've seen, sites that use Node use it more for the backend serving AJAX than for serving the web pages, although I have seen a few sites with very AJAX-based interfaces that apparently use just Node. It really depends on your UI, I guess. Good luck with whatever you choose though!
This seems like exactly what would hit anybody who is trying to build a worker model for processing data... hell even for sending emails. It would be interesting to see where you are pushing the limits - it could even be something deep like
This problem was noted by the developers of em-activerecord and em-synchrony several months ago already. There seems to be no progress, which might mean it's a harder problem, or that people are not using these libraries in bigger services.
But my point was that Node.js seems to be a much more finished product compared to EventMachine, for example. With EM you have to live without most of Rails, and testing EM code with unit tests, for example, is not such a nice job...
What I liked the most with EM are the Ruby fibers. The best solution for hiding the callbacks and writing nice and readable code by far. Too bad the fibers are a bit of a ghetto still, like somebody said earlier.