Is it really that much of a problem to spend a few extra hours prototyping different solutions and factoring in performance from the start? Why wouldn't I pick Vert.x over Play if it's 2-3x faster with similar productivity? That's money not spent on VPSs that can be used for marketing.
Likewise, why wouldn't I look at a binary serialization format if it means my iPhone app starts up 5 seconds quicker? Users decide whether or not your app is worth keeping in those few seconds of starting the app. Why not spend time trying to keep them?
Or, as great as PostgreSQL is for single-server deployment, why wouldn't I look at something easier to manage, trivial to scale, and with a more agile schema, like DynamoDB, MongoHQ, or Cassandra? Especially if I am a single-person startup.
I understand the argument about premature optimisation, but it seems to make a lot more sense to properly consider your architecture from the start rather than deferring it. Or did we all forget Digg v4?
It helps if you've seen, firsthand, a startup wasting time worrying about trying to optimize stuff they don't even know if they're going to need to optimize.
> my iPhone app starts up 5 seconds quicker
That's something you know you need right away, apparently. It's probably a sensible investment.
> Or as great as PostgreSQL is for single server deployment why wouldn't I look at something easier to manage, trivial to scale and a more agile schema like a DynamoDB, MongoHQ, Cassandra.
Because there are some big tradeoffs involved in using those. And because Postgres is not at all hard to manage or scale to a point where you can probably afford to look into other solutions or hire someone to scale it. This article is, after all, written by an Instagram guy, and they seemed to do ok with Postgres in terms of scaling.
That's not to say you shouldn't think about it at all, but there are so many unknowns that it often makes much more sense to spend time doing market experiments rather than technology experiments.
It's dangerous to prematurely over-optimize, as the OP says, but it's never a bad thing to think about optimization. Otherwise you'll end up with crap.
I'd personally say: go as fast as you can without having to work hard for it. But, as others have mentioned, pick something that leaves you a lot of options, and don't code inefficiently (e.g. don't use association lists for everything when you know the data should be in a dictionary).
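The association-list vs. dictionary point can be sketched in a few lines (a toy Python sketch; the sizes and keys are made up):

```python
import timeit

# Association list: a list of (key, value) pairs, O(n) lookup per key.
pairs = [(f"key{i}", i) for i in range(10_000)]

def assoc_lookup(key):
    for k, v in pairs:
        if k == key:
            return v

# Dictionary: O(1) average-case lookup over the same data.
table = dict(pairs)

slow = timeit.timeit(lambda: assoc_lookup("key9999"), number=1_000)
fast = timeit.timeit(lambda: table["key9999"], number=1_000)
print(f"assoc list: {slow:.3f}s  dict: {fast:.3f}s")
```

Same data, same answers; the point is just that the "cheap" choice up front doesn't have to cost you the easy structural wins.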
Picking the right infrastructure DOES matter if you're hosting it on one machine in your basement. It matters less when you've got millions in funding, a team of developers, and machines with 32 GB of RAM at your disposal.
Not enforcing the schema for your data is one of the riskiest optimizations you can make; it lets any bug start irreversibly poisoning your data for however long it takes you to realize it.
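To illustrate the point (a toy sketch using SQLite; the `users` table and its `CHECK` constraint are invented for the example): an enforced schema turns a data-poisoning bug into an immediate, visible error at write time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL CHECK (email LIKE '%@%')
    )
""")

# The happy path writes fine.
conn.execute("INSERT INTO users (email) VALUES ('alice@example.com')")

# A buggy code path that mangles the email is rejected at write time,
# instead of silently poisoning the store until someone notices.
try:
    conn.execute("INSERT INTO users (email) VALUES ('alice')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

In a schemaless store, that second write would have succeeded, and every record written between the bug and its discovery would need cleaning up.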
More information here: http://c2.com/cgi/wiki?PrematureOptimization
I hope you appreciate that this does not represent all or even most real problems.
> Or, as great as PostgreSQL is for single-server deployment, why wouldn't I look at something easier to manage, trivial to scale, and with a more agile schema, like DynamoDB, MongoHQ, or Cassandra
That has nothing to do with premature optimization, that is just about using unproven data stores because you didn't bother to read the label.
I would change "Don’t obsess about potential performance" to "Don’t obsess about potential performance except for binary roadblocks".
What's a "binary roadblock"? Something that, if it doesn't run fast, no amount of "tuning" can fix. You'd better be obsessing over things like these, or you'll be dead in the water as soon as that nasty roadblock manifests itself. Some examples:
- 2 proven technologies whose combination is not proven
- bad database structure that doesn't hurt until a query you never planned for
- bad program structure whose memory leaks don't show without an outlying case
- lazy code that blows up as soon as programmer #2 "enhances" it
- shit that works by accident until it doesn't work anymore
- sleepers that appear to run well until you try to scale
- structures aimed at incorrect user assumptions
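One of the "sleeper" failure modes above can be sketched quickly: a linear scan (the moral equivalent of an unindexed query) looks fine at demo scale and becomes a roadblock at production scale. A toy Python sketch, with made-up sizes:

```python
import bisect
import timeit

# At demo scale a linear scan feels instant...
rows = list(range(1_000))
assert 999 in rows

# ...but the same code at production scale is the roadblock,
# while a sorted structure (standing in for a database index) stays cheap.
big = list(range(2_000_000))
target = 1_999_999

linear = timeit.timeit(lambda: target in big, number=20)
indexed = timeit.timeit(
    lambda: big[bisect.bisect_left(big, target)] == target, number=20
)
print(f"linear scan: {linear:.3f}s  binary search: {indexed:.6f}s")
```

No amount of tuning around the scan fixes it; the structure itself has to change, which is why these are worth thinking about up front.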
Instead of talking about "binary roadblocks", couldn't you just say "don't obsess about performance, but that doesn't mean don't think about it at all"?
Some readers of the results have asked why the tests are so trivial (although the multiple query test is anything but; and we are adding more computationally intensive tests for later rounds).
We chose to start with trivial request types to establish the high-water mark that the platform and framework set. Any given web application built on the framework can only expect to perform with higher latency and fewer requests per second than the high-water mark.
In turn, the higher-complexity request types provide some insight into the converge-to-zero point along the complexity axis. Prior to seeing the data, I think some of us intuitively assume the converge-to-zero point is lower than it actually is, using the all too popular refrains such as "well, the app is always waiting on the database anyway." Yes and no. Certainly database queries are generally expensive. But some frameworks can run twenty trivial queries on a gigabit-connected database and serialize the resulting list to JSON faster than others can serialize a single trivial object to JSON.
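A micro-benchmark in that spirit (a toy Python sketch, not one of the actual test implementations; the payloads are invented): time the serialization step alone to get a feel for the per-request ceiling it imposes.

```python
import json
import timeit

# High-water mark idea: measure the cheapest possible response body alone.
one = {"id": 1, "name": "widget"}
twenty = [dict(one, id=i) for i in range(20)]

t_one = timeit.timeit(lambda: json.dumps(one), number=100_000)
t_twenty = timeit.timeit(lambda: json.dumps(twenty), number=100_000)
print(f"1 object: {t_one:.2f}s  20 objects: {t_twenty:.2f}s")
```

Run the same measurement against your framework's serializer of choice and the spread between platforms becomes obvious well before the database enters the picture.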
Choosing a framework that provides headroom for custom application logic--and having a rough idea of how the platform will perform with your custom logic as demonstrated by representative tests--is a good idea. And it doesn't take a whole lot of effort to do that kind of research now. It's easy to avoid premature scale pain.
Benchmarks which artificially exclude common optimization techniques are avoiding the interesting question of room for optimization in favor of meaningless pissing matches about 'language speed' or 'framework speed'.
If I build on an efficient/high-performance framework, I can make sloppy code to start and get the job done fast. When the time comes to optimize, I have a great deal of headroom available. I see this as "ability to optimize."
If on the other hand, the framework (and, as importantly, the platform) is already consuming a great deal of the CPU's clock cycles for things outside of my control, my ability to optimize is greatly diminished. I will run into a brick wall unwittingly erected by the framework and its platform.
If my ability to optimize becomes a matter of replacing core components of the framework with marginally faster alternatives, the experience devolves into a frustrating guessing game fraught with arcane insider's knowledge ("Everyone knows you don't use the standard JSON serializer!") and meandering, futile experimentation ("I wonder if using a while loop rather than a for loop would squeeze this request down to 200ms?").
I'd rather know that the framework is designed to give as many as possible of the CPU's cycles to me and my application. I can then be reckless at first and careful when time permits.
Which benchmarks artificially exclude common optimization techniques? If you're referring to ours, please tell us what is on your mind. You brought this complaint up in the comments on our most recent round, but didn't follow up to let us know what we did wrong. We are absolutely not interested in artificially excluding common optimization techniques. In fact, we want the benchmarks to be as representative of common production deployments as possible.
And hopefully, long before that, you've answered "Who cares about this?", "How will I make money?", "Is there actually a big enough market?", and so on...
If I want it for myself, I'll make it. If I think other people will want it, I'll find a way to get it there - either by selling or giving it away. Maybe that makes me a bad entrepreneur, but it means I have more fun in the things I build.
A business for the sake of just making a business feels wrong. A business for the sake of making an awesome product feels right.
You're free to start a business by building the product of your dreams, but no matter how much you personally like it, there's no guarantee people will pay for it. In other words, if you want money, but start with building a product, you're likely to have a bad time.
But if you just want to build an awesome product, and don't care about whether you make any money with it, then there's no problem.
I'd suggest instead that identifying an audience for a potential product is actually part of making an awesome product.
There's a pretty wide stretch of territory between not caring whether something is business viable and only caring about building a successful business.
While I agree with this advice in general, there are cases in which it is advantageous to allow your stack to define you. For example, Jane Street Capital appears to have established themselves as the place to work if you are interested in both functional programming and finance, greatly aiding their recruiting efforts.
Granted, they didn't start out using OCaml; their founders made a switch after the firm had already established itself. However, I wouldn't be surprised if a startup (where the founders have a specific skillset/background) could exploit a similar first-mover advantage.
Maybe your product needs to be very high performance, so you hire a team of C programmers. What you might find is that their idea of iterating quickly is on a different timescale than yours; due to the low-level and unforgiving nature of C, perhaps your team tends to be pedantic (by your standards) or correctness/detail-oriented (by theirs).
In this situation your team, influenced by your technology choice, is bringing a different value set than your own to the table. BUT this is not (necessarily) a bad thing: if C really is the best tech for your product, your team can strive to merge both your values and theirs. However, my point is that without you actively trying to do that, your tech just defined the value set, not you. (In this case it's also possible that Rust or Go would have been better tech, but that is a different conversation.)
Not just from dynamically typed language people, but from each other. I don't know any C people or Go people who like C++ or Java, for example.
About time to market: from my experience (and Reinertsen's), time to market is not lost during programming but during the decision phase. Optimize your decision speed (reduce meetings, don't postpone decisions, have a clear decider, know what to decide and what not to decide, ...), not your technology stack, if you want to beat others to market.
I've had a standing offer for some years now: anyone who beats a Java (the COBOL of the internet world) stack 5x to market, I'll pay them $1000. No one has taken it yet.
One of my slide decks on the topic:
 "Developing Products in Half the Time: New Rules, New Tools" Preston G. Smith, Donald G. Reinertsen
There was some VC around here who talked about the advantage that startups have because they can choose some of these tools to quickly develop in and such...
Just realized that it's also a context switch when I go from Python to Mongo, especially when I'm working in the Mongo shell.
When making decisions for a library, ask yourself:
1) "What kind of problem is this library solving?" "Can I solve the same problem in a simpler way with less code, and later adopt the library if necessary?" -- Dependencies don't always make your life easier.
2) "Does this library make certain problems simple, and other problems no less hard than before? Or maybe SLIGHTLY harder?" -- The last thing you want is to limit yourself to a subset of solutions. Until you know EXACTLY what the long-term goal is, marrying yourself to something that ties your hands is a bad idea.
3) "Is it magic?" -- This is important. In a startup you don't have the time or resources to just sit down and grok a library. If it's too magical then it's better to avoid it, because when something goes wrong you will have to either sit down and grok it, or rip it out.
Other bonus points for good decisions:
Break everything, but have the most solid data model possible. Test the shit out of it too.
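One way to read "solid data model, tested hard" in code (a toy Python sketch; the `Order` type and its invariants are made up):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Order:
    sku: str
    quantity: int

    def __post_init__(self):
        # Reject impossible states at construction time.
        if not self.sku:
            raise ValueError("sku must be non-empty")
        if self.quantity < 1:
            raise ValueError("quantity must be at least 1")

# "Test the shit out of it": every invariant gets an explicit check.
Order("WIDGET-1", 3)  # valid orders construct fine
for bad in [("", 3), ("WIDGET-1", 0), ("WIDGET-1", -5)]:
    try:
        Order(*bad)
        raise AssertionError(f"accepted invalid order {bad!r}")
    except ValueError:
        pass
```

Everything around the model can be thrown away and rewritten cheaply; the model itself, with its invariants enforced and tested, is the part that has to hold.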