No offense to the Art.sy team, but this is just Heroku marketing copy. It lacks the detail we typically see on these sorts of high scalability "how we did it" posts. Non-trivial applications scale non-trivially, and when someone comes along claiming they have solved the scalability problem with the push of a button I am instantly skeptical.
I would really like to see more details about their architecture, especially about the MongoHQ integration.
I realize this is off-topic, but just as a bug report: I tried to search for "cezanne watercolor" and it didn't understand, even though you have watercolors by Cézanne. It also wasn't clear whether watercolors were best found under "paintings" or "works on paper". I think you need to make it easier for non-computer people to find specific types of images.
It's an interesting observation. We built a full-text semantic search in 2011 by reverse-indexing art search results from popular search engines, and it could handle queries like "Cezanne watercolors" with ease.
We showed it to people and found that you could easily trick the system with queries like "Worst American Art Ever". That generates results, but it shows the limits of a general semantic search in a narrow context.
Happy to hear any suggestions about how to make something like this both useful and not too easy to make look really silly.
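For the curious, here's a rough sketch of the reverse-indexing idea in Ruby (names and inputs are made up, not Art.sy's actual code): build an inverted index from the text snippets a general search engine returns for each artwork, then answer a query by intersecting the posting sets.

    require 'set'

    STOPWORDS = Set.new(%w[the a of and by in])

    def tokenize(text)
      text.downcase.scan(/[[:alpha:]]+/).reject { |t| STOPWORDS.include?(t) }
    end

    # snippets_by_artwork: { artwork_id => ["Cezanne watercolor landscape ...", ...] }
    def build_index(snippets_by_artwork)
      index = Hash.new { |h, k| h[k] = Set.new }
      snippets_by_artwork.each do |artwork_id, snippets|
        snippets.each do |snippet|
          tokenize(snippet).each { |term| index[term] << artwork_id }
        end
      end
      index
    end

    # "cezanne watercolors" => artworks whose snippets mention every query term
    def search(index, query)
      tokenize(query).map { |term| index[term] }.reduce(:&).to_a
    end

Note there's no stemming or relevance ranking here, which is exactly why a query like "Worst American Art Ever" still "works": every term matches something.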
It only works in that case. What if there's another work called "Cezanne's Favorite Watercolor"? Then we're just at the beginning of search hell :)
Same thing for us at Art.sy. We have humans doing targeted emails, with MailChimp.
Sendgrid is an SMTP relay with high deliverability; that's all it does. You can also see who received what on the website. With MailChimp you have to set up lists and all that unnecessary stuff.
There's one big drawback to using both: we currently have to sync our users to MailChimp and sync MailChimp unsubscribes back. Hence we're going to get rid of MC eventually, once we can build a good enough UI to manage mass emails.
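A rough sketch of what that two-way sync can look like (the mailchimp_* helpers below are placeholders for whatever MailChimp client you wrap, not a real API):

    # Runs nightly: push subscribed users into the MailChimp list, then pull
    # unsubscribes back so we stop emailing those people ourselves.
    def sync_mailchimp!
      new_emails = User.where(subscribed: true).pluck(:email) - mailchimp_list_emails
      new_emails.each { |email| mailchimp_subscribe(email) }

      mailchimp_unsubscribed_emails.each do |email|
        User.where(email: email).update_all(subscribed: false)
      end
    end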
I'd read this with a grain of salt. I remember looking at Art.sy via the NYTimes link and thinking the site was terribly slow, and I still find it moderately slow. I'm not sure they really need Mixpanel, Google Analytics, and KISSmetrics on every click, and the API calls should take less than the 1-4 seconds I am seeing.
dblock from Art.sy here. The analytics comment is totally fair - we've spent a lot of time looking at all kinds of stats as we keep experimenting, and it needs to be trimmed down to 1 (or none :)).
Our average API response is 380ms. That's about 20x too long as far as I'm concerned. To be fair, it's not Heroku's fault: it's a mix of Ruby code, database queries, and some not-so-easy-to-optimize math in some cases. It's definitely a work in progress.
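One generic lever for endpoints like that (a sketch, not Art.sy's actual code): push the slow query/math behind Rails.cache, which on Heroku is typically backed by a memcached add-on. The model and method names below are assumptions.

    class ArtworksController < ApplicationController
      def related
        artwork = Artwork.find(params[:id])   # assumed model
        related = Rails.cache.fetch(["related", artwork.id], expires_in: 1.hour) do
          artwork.compute_related_works       # the slow, hard-to-optimize part
        end
        render json: related
      end
    end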
I suspect I'm well outside your target demographic, but I'm seeing ~22 whole seconds to onload (according to Firebug) here in Sydney, Australia (52 requests comprising 1.6MB of data).
That certainly seems excessive for a non-logged-in user on your main landing page.
Thanks for the kind words. We load 1GB of CSS and JS. OK, maybe I am exaggerating :) It's a constant struggle between "it looks and feels amazing" and "it's super fast".
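Not their setup, just the standard Rails 3 asset-pipeline knobs for that tradeoff - ship everything minified, fingerprinted, and precompiled so the weight at least caches well:

    # config/environments/production.rb
    config.assets.compress = true    # minify JS/CSS
    config.assets.digest   = true    # fingerprint filenames so browsers can cache them forever
    config.assets.compile  = false   # precompile at deploy time, not per-request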
I'm just curious how much traffic the NYTimes brought you guys. I'm also on Heroku, and my app (http://www.tubalr.com) "survived" several articles around the web (Mashable, TechCrunch, The Verge, The Next Web, Japan LifeHacker, and several others) within a very short timespan. My bill has never been over $25, which includes no extra workers - just extra DB storage and an upgraded DNS plugin.
Art.sy was lucky to be the top article on NYTimes (the one with the large image at the top of their homepage) from about 5pm to 11pm; we were the #3 most-emailed article, and the NYT piece was #3 on Hacker News. We went to 1,500 concurrents almost instantly and maintained that for several hours. That's nothing for many large sites, but it was the most we'd had. We had added some extra dynos in preparation, and our API response time actually went down as more obscure routes were hit and cached.
We prepared for the worst, though, and had built a failsafe way to progressively shut off more demanding features on the site (for instance: our related search results on artwork pages) in case we were getting overwhelmed. Much better to have a reduced feature set during launch than a broken site. Fortunately we didn't have to flip that switch!
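A minimal sketch of that kind of kill switch (names are illustrative, not the actual implementation): a flag the team can flip at runtime, so the demanding feature degrades to nothing instead of taking the page down.

    module Features
      # flipped with `heroku config:set RELATED_RESULTS=off`
      def self.related_results_enabled?
        ENV.fetch("RELATED_RESULTS", "on") == "on"
      end
    end

    # in the artwork page / API:
    related = Features.related_results_enabled? ? artwork.related_works : []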
Ugh! This is just a puff piece from Heroku. Yes, I'm sure it's great tech, but we have no idea whether it is accurate or not. This piece has no analysis, no comparisons, and certainly no independence or objectivity.
I think it's totally worth it. Right now it still costs us less than a developer to run on Heroku and not that long ago we were 3 of those devs. I think having our front-ends on EC2 along with setting up memcached/mongo/nginx/etc ourselves is a much higher upfront cost.
Then there are traffic spikes. It's pretty hard to deploy to bare metal or EC2 instantly without building your own Heroku-like system.
I use Heroku for prototyping, which it is great for. I would not use it for larger projects in production (disclaimer: while also being a developer, I'm the company sysadmin). If we measure downtime, Heroku has had much more downtime than the servers we run internally, and I find Heroku extremely slow compared to what I can deliver with basic hardware. Their database offerings are awesome, though.
The convenience is totally worth it for prototypes and very early-stage startups. The less work I do at the system level, the more time I can spend on my startup.
Once you move past that, because your app is well structured you can easily deploy on your own hardware.
Really? Do you work on your own car? Is it because you think you aren't smart enough to do it, or because you think that spending your time on other efforts is more valuable?
AWS is the electric company. Heroku tends to be the circuitry, outlets and light switches for your house. You may take it for granted because it looks so simple, but it's the simplicity that makes it great.
As PaaS providers begin to mature and turn attention to more worthy challenges (like geo-agnostic deployments, easier scaling, for example), then their value is only going to increase.
They will shut down your instance if you don't use it for an hour or two, and then it takes around 8-10 seconds to load it again. If you can live with that, I guess it would work, yes.
From what I see, when you have a single dyno, Heroku deploys you to a different environment sandbox called "development". Typically that has less uptime and takes lots of upgrades all the time, with interesting consequences. I believe they roll changes out to "production" applications much less frequently.
I think this article can be a little misleading. Sure, with cloud tech you can easily spool up more instances. And yes, it is great not to have to worry about configuring a load balancer (I guess). But just because adding more instances sort of fixes a problem doesn't mean it is a good thing to do. Not "having to do calculations" is a very bad attitude. You should know where the bottlenecks are, and whether adding instances is actually necessary or just some scotch tape. Bad architecture and coding can bite you in ways that adding hardware cannot fix.
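For example, even a crude measurement beats guessing - a minimal Ruby sketch of timing the suspected hot spots before reaching for more dynos (model and helper names are made up):

    require 'benchmark'

    Benchmark.bm(12) do |x|
      x.report("db query")  { Artwork.where(artist: "Cezanne").to_a }  # assumed model
      x.report("rendering") { render_related_panel }                   # hypothetical helper
    end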