The Costs of Bookmarking (blog.pinboard.in)
221 points by jambo on Sept 11, 2011 | 72 comments

This seems too expensive and I'd like to try and help, so I'm going to give you an idea of my hosting bill and why it's low and then suggest something for you:

I pay around $3k per month. I own my own servers and lease a full rack, and I serve roughly 1 billion page impressions per month. My bandwidth consumption is measured in Mbps rather than amount of data transferred because I get billed using 95th percentile billing. I average around 130 megabits per second of transfer constantly, peaking at 150 Mbps, which works out to roughly 40 terabytes of data per month. 95th percentile billing and owning your own servers are the key here.
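The bitrate-to-volume conversion above can be sanity-checked in a few lines (a Python sketch; the 30-day month is an assumption):

```python
# Converting a sustained bitrate to monthly transfer volume.
# 130 Mbps average is the figure stated above; a 30-day month is assumed.
avg_mbps = 130
seconds_per_month = 30 * 24 * 3600                        # 2,592,000 s
bytes_per_month = avg_mbps * 1e6 / 8 * seconds_per_month  # bits -> bytes
terabytes = bytes_per_month / 1e12
print(f"{terabytes:.1f} TB/month")                        # ~42 TB, consistent with "roughly 40 TB"
```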

To give you an idea, for one month of your hosting bill you can buy 1, possibly 2 servers from Dell and put them in a half rack that will cost you around $800/month including power, secure access, bandwidth, etc. Those servers will last around 5 years with a possible drive replacement or two during that time for a few bucks.
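As a rough sketch of that amortization argument (Python; the ~$1k/server, ~$200/drive, and ~$2k/mo managed-hosting figures are assumptions read off this comment and the rest of the thread, not quoted prices):

```python
# Five-year cost comparison: managed hosting vs. colo, per the comment above.
# All dollar figures are assumptions for illustration.
months = 5 * 12
managed_hosting = 2000 * months              # ~$2k/mo bill mentioned in the thread
servers = 2 * 1000                           # two Dell boxes, roughly one month's bill
colo = servers + 800 * months + 2 * 200      # half rack + a couple of drive swaps
print(managed_hosting, colo)                 # 120000 vs 50400
```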

But I think you have another problem that's making things worse. With 15,000 active users you should be able to support them on one or two small Linode servers using round-robin DNS. That's a relatively small user base, and the number of requests per second can't be much more than 10. So I'm guessing something about your basic app architecture is off. It could be that you're not using Nginx to reverse proxy to Apache, so you think you need more Apache children/processes, and therefore more memory, than you actually do. Or you could have a DB that doesn't have indexes in the right places, so you're I/O bound.

I would suggest first looking at your app and seeing where the bottlenecks are in performance. Fix that first, then look at hosting.


-How many servers do you currently have? Give us a rough idea of their config.

-How many app requests per second do you get at peak?

-What's your peak bandwidth throughput in Mbps?

-On your servers, is lack of CPU or lack of disk throughput the bottleneck?

-Have you had problems running out of memory that caused you to buy more servers? Which app ran out of memory?

-Give us an idea of your server config. e.g. nginx -> Apache -> MySQL & Redis. Do the servers talk to each other and if so what do they do?

This is a generous offering of expertise, but supposing you were to help him out and he were to take on implementation costs, wouldn't "Here's what I'd do to get you to a billion impressions a month of customer demand for your product" be a higher-priority task than saving a few hundred bucks?

The most common failure mode in scaling for startups is to have no scaling problem at all.

Costs that are that out of whack with expectations would usually indicate some very low hanging fruit. If you can save $800/month by adding a couple of database indexes and tweaking your config file, then yeah, it's worth it.

That said, from what Maciej has written in the past, he sounds competent in these things (i.e. has his databases set up for remote replication and failover), so it would seem that the culprit is more likely over-engineering than under.

Well, if we scaled this expense level straight up to a billion impressions, pinboard would be bankrupt instantly.

I think his questions aren't meant to signify an offer of expertise but rather to say that this base level of knowledge and data should be available whether you're on a shared hosting account or run your own Tier IV data center. He's not suggesting a rearchitecture; he's just suggesting knowing the environment. Decisions are wishy-washy without evidence.

Just for another data point, a while back one of my customers' sites got really popular overnight, and peaked at around 25Mbit/s sustained traffic, with Apache handling around 800+ requests/second, on a Linode 768. No RAM issues, the server wasn't stumbling at all, and the site was still loading nice and quick.

No nginx magic in that configuration, either. Just straight Apache, with a few tweaks.

Could you share your Apache config, please? I find this incredible. I'm just very curious how you managed this and hoping to learn something. The reason I ask is because:

Apache has two stable modes of operation, threaded or one process per child. Both of them require one process or thread per connection busy being served, which requires memory. Apache consumes around 20M per thread or process if PHP or mod_perl are loaded. Roughly 5M if neither are loaded.

If you have keepalive enabled and you're doing 800 req/s and the keepalive timeout is, say, 30 seconds (the default in older Apache versions is 15), then you're going to need roughly 800 * 30 Apache children to keep up. So that's 24,000 Apache children, or 120 gigs of memory in your server.

A common hack is to disable keepalive which causes clients to disconnect quicker, but they still tend to stay connected for 1 to 2 seconds per request - especially folks geographically far away thanks to latency. So at 800 r/s you're going to need 1600 apache children to keep up, or 8 gigabytes of memory in your server.
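The capacity math in the last two paragraphs is just Little's law (concurrency = arrival rate × hold time); a quick Python sketch using the figures above:

```python
# Each busy connection pins one Apache child/thread, so:
#   children needed = requests/sec * average seconds a connection is held
#   memory needed   = children * per-child footprint
def apache_memory(req_per_sec, hold_seconds, mb_per_child):
    children = req_per_sec * hold_seconds
    return children, children * mb_per_child / 1024   # (child count, GB)

print(apache_memory(800, 30, 5))   # keepalive on, 30 s timeout: (24000, ~117 GB)
print(apache_memory(800, 2, 5))    # keepalive off, ~2 s holds:  (1600, ~7.8 GB)
```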

Now you could have used an experimental MPM and gotten these results, but I haven't heard of that being used much. So I'm really curious how you managed to get 800 r/s from Apache on a server with 768 Megs of memory.

For the uninitiated: Most people put Nginx or Lighttpd in front of apache which can serve 10,000+ concurrent connections with a single thread. That then talks to apache and connects/disconnects very fast which keeps the threads or processes free and often 10 processes is enough for thousands of concurrent connections with keepalive enabled on the front end.
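For reference, the front-end setup described here usually amounts to only a few lines of nginx configuration. This is a minimal, hypothetical sketch (the backend port and timeout values are made up for illustration, not taken from anyone's actual config in this thread):

```nginx
events { worker_connections 10240; }     # one worker holds many idle clients cheaply

http {
    upstream apache_backend {
        server 127.0.0.1:8080;           # Apache listening on a local port (assumed)
    }
    server {
        listen 80;
        keepalive_timeout 30;            # slow/idle clients are parked here, not in Apache
        location / {
            proxy_pass http://apache_backend;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $remote_addr;
        }
    }
}
```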

> Apache has two stable modes of operation, threaded or one process per child.

This is obsolete information: if you are using remotely recent versions of Apache 2.2 (as in, anything from the last year and a half at least: older might even still be fine), you should really be using mpm_event (unless your architecture is reliant on some horribly broken Apache module).

(Also, after having spent years running nginx for this specific purpose, I have to say that it is actually a horrible choice: it isn't smart enough to juggle HTTP/1.1 connections to backend servers, so it burns through ephemeral ports and is fundamentally incapable of warming up its TCP windows... to make that configuration scale in a real "tens of millions of users environment" you end up having to get pretty nit-picky with your kernel-level TCP configuration.)

(Really, though, you should just get a real CDN and drop nginx: a CDN will also reduce the latency of your application from far away locations by holding open connections half-way around the world to your backend, allowing you to drop a whole round-trip from a request that often only requires two round-trips total. I now use CDN->ELB->Apache, and I couldn't be happier with the result: the things nginx was attempting to do are much better handled by these other services.)

The event mpm is experimental: http://httpd.apache.org/docs/2.2/mod/event.html

Recent benchmarks show apache is a dog with high concurrency, even with the event mpm.


Not sure about your nginx criticism. It's the darling of many high-traffic production sites. Here's my weekly netstat off a low-end Dell front-end box on a gigabit link:


(The bottom right number is the interesting one)

My nginx config is pretty stock - standard reverse proxy to apache.

CDNs are God's gift to latency and I use several, but now and then you need an actual web server to do actual work.

Another (unrelated) point: I just looked at the nbonvin benchmark, and it has some serious issues.

First, he misconfigured Apache for this environment: by having StartServers lower than the expected minimal process concurrency, they are goading Apache into constantly spawning and shutting down backends (I had similarly spiky performance until I realized this a while back).

However, the setting that simply damns this benchmark is that he has MaxClients set to 150... nginx's equivalent setting (worker_connections) is set to 1024. In essence, he needlessly hobbled Apache, and if Apache does well at all in comparison to nginx, it is probably because the benchmark is flawed. Seriously: this setup is so bad that Apache was returning 503 errors (this is mentioned in the conclusion area) because its configuration told it to stop responding to these incoming requests (and yet, he didn't bother considering that worth examining: he just tossed the detail out there as if it was Apache's fault).

And, given that Apache was less than half as bad (and depending on whether he counted 503 responses in his numbers, possibly "almost as good"), we then go ahead and question the benchmark itself, only to find that the guy is using Apache Bench... Apache Bench doesn't actually claim to be very good at highly concurrent testing; specifically, it isn't good at swamping remote servers, as it isn't itself very efficient. As one backed-up example, from the BUGS section of the ab man page:

"""The rather heavy use of strstr(3) shows up top in profile, which might indicate a performance problem; i.e., you would measure the ab performance rather than the server's."""

Really, this website's results are based on a kind of "toy" benchmark, and should not really be trusted: the guy (as mentioned in his comments on his post), was trying to go for "the default setup" on these systems, but the default setup is not tuned for performance (this is especially the case with Apache, where distributions expect people to install it on almost everything, including your calculator). I mean, even the nginx settings should be tweaked: in a high-concurrency environment, 64/10096 (where I will admit I haven't spent much time tuning the ratio) would be a much better choice than 1/1024 (the Ubuntu default).

(Reading more of the comments on the benchmark, you can see that other people commented on some of these problems and more, and even wrote entire blog posts responses ;P.)

I am not disputing that the documentation calls the module "experimental", but if you follow the developers' discussion about this module you will find that it is still called that due only to a combination of: 1) the major version of Apache not having been bumped in a very long time, 2) a number of Apache modules in the wild that are poorly coded (not that I've ever managed to actually find one), and most importantly 3) it does not work on all platforms (Linux is great). Things that have previously been considered "the reason" mpm_event is marked experimental are all now obsolete; for a specific example, SSL now works 100% correctly with mpm_event.

Also, nginx being a "darling of many high traffic production sites" does not mean it actually works well for this purpose: if you do a Google search for "nginx" and "ephemeral" you get lots of evidence to the contrary, and you can also prove my statements from first principles of TCP if you really don't believe me: this need not be based on silly anecdotes. You would simply expect nginx to have issues with ephemeral ports due to the way it is designed and implemented (a reverse proxy making new outgoing connections for each incoming one), and if it didn't you would be surprised and probably want to publish a paper on it. ;P

""Compared to putting tornado processes behind nginx, this approach is simpler (since fdserver is much simpler than configuring nginx/haproxy) and avoids issues with ephemeral port limits that can be a problem in high-traffic proxied services.""" -- http://tornadogists.org/1073945/

"""This makes load testing complicated since the nginx machine quickly runs out of ephemeral ports.""" -- http://mailman.nginx.org/pipermail/nginx/2008-February/00352...

My service has tens of millions of users distributed worldwide, making many billions of requests per month to my hostnames. My setup is mostly coded for mod_python (generally considered to be an "older module", especially considering it is no longer even maintained by the upstream developers). I make complex usage of requests making recurrent subrequests through different languages. A good amount of my traffic is SSL.

Of course, most requests are cached at the CDN, so they don't have to go through to my backends, but I still handle way more than a billion requests all the way through to my dynamic webapp every month. These are all handled, eventually, by two boxes running Apache, and I only need two boxes because I want to handle one of them randomly failing (I can easily handle the load on one box: each box can handle, and actually has under previous concepts for my architecture, 3200 concurrent clients).

As for mpm_event in this environment? It works, is stable, is why I could handle 3200 concurrent clients, and you should not be avoiding it because you feel it is "experimental" (yes, even with mod_python). I did run across one or two Linux kernel builds that had regressions that affected Apache+mpm_event (horrible concurrent performance), but you are better off noticing that and steering away from them than avoiding mpm_event.

That said, I want to make it clear that I am not arguing against reverse proxies: I am only making the point that your CDN /is/ a reverse proxy, so there's little point in additionally adding nginx to the setup unless you can't handle enough concurrent connections from the master CDN nodes around the world, in which case what you really want is "just" a load balancer, and you really still want one that is smart enough to use HTTP/1.1 to connect to its backends, and that simply isn't nginx. (Humorously, DNS round-robin, if you think of it as a load balancer, actually works great for this HTTP/1.1 problem, but there are other reasons to avoid it, of course. ;P)

(Now, this said, I heard a few days ago that the just released nginx 1.1 branch now supports persistent backend connections, but I haven't been able to find it in the release notes.)

(Also, your comment about "now and then you need an actual web server to do actual work" implies to me, though this might be totally incorrect, that you haven't yet noticed that a CDN actually provides insanely high latency benefits even if all of your content is dynamic and all of it has to go through to the backend. If you did not know this, you should read my commentary here: http://news.ycombinator.com/item?id=2823268 .)

Well, you were right in your math -- I made my previous comment before going back and checking the actual numbers, which was really stupid on my part. From the "Congratulations!" email I sent the client:

> ...sustained 100+ requests per second for the last couple of hours; peaks of over 40Mbits of traffic; thousands of simultaneous connections.

So, 1/8th of what you were calculating. Sorry about that.

I use mpm-worker and have keepalives turned off altogether. I also use suexec and fcgid (not fastcgi, unfortunately). I was using mem_cache at the time, but that's off now because it breaks the newest version of Wordpress.

Also, I should have mentioned that this was a static site, so PHP wasn't a factor.

If anybody's still interested in the server config, I'll be happy to share it.

Thanks this is helpful.

Maciej has said repeatedly (including on HN) that disk is the bottleneck keeping him off Linode and VPS in general.

Not all web apps use resources in the same way; you're quite likely making an apples and oranges comparison here (though you don't say what your own web app actually does). It makes sense that Pinboard hits the disk a lot because there's not a ton of shared data -- each user tends to have his own trove of data. Yes, certain URLs will be bookmarked a lot but each bookmark can have different access restrictions, summary data, tags and so on. And no one bookmark is going to make up a big percentage of hits -- you're talking about a ton of bookmarks, each accessed rarely. It sounds a lot more like webmail than, say, a blog in terms of access patterns.

You have questions, and it's nice that you (eventually) asked them and admitted all that you DON'T know about Pinboard. But you should probably ask them before you dole out unsolicited advice, not after.

I would be thrilled to get unsolicited advice like this about anything I built, even if the advice was wrong. You made a good point, but if you had just worded it a little differently, it wouldn't come across like you were trying to take someone down a peg for writing something thoughtful on HN.

I don't know anything about your setup and architecture, but do remember, Pinboard was one of the sites that was swept up in the raid that got Instapaper and a few others.

Instapaper stayed up because the server taken was only a slave. Pinboard had some of their main hardware taken, and while things did slow down, they too stayed up. Half, maybe more, of those hosting fees could be servers that do nothing at all but sit around and wait for a raid.

Would you still be able to build out a hypothetical system as you mention and be able to handle one data center effectively disappearing? Or does that mean taking your information and essentially doubling it, to have a second backup source?

Take note. This is actually a useful response that people benefit from reading. As opposed to the several one liners already posted that all say variants of "your hosting is expensive". Thanks for taking the time to write this response up.

Don't you think web crawling (for users with archival accounts, where the full text + associated images/resources for each bookmark is stored and indexed on Pinboard's servers) is probably using up more of those resources than actually serving up pages? I don't think 10 pages per second or whatever is really the relevant metric for this app.

jambo is not Maciej Cegłowski (username "idlewords"), who runs Pinboard. So asking him for technical details won't be very helpful.

Incidentally, Maciej is banned from HN,[1] due to not being nice enough to someone.

1: https://twitter.com/#!/pinboard/status/111332316458135553

Maciej is not currently banned from HN (he was, briefly, after an impolitic comment about a post, but Paul Graham unbanned him within moments of me questioning the ban).

Maciej seems pretty annoyed by what happened, and I don't blame him. But there you go. Shit happens!

He's basically one of the friendliest people I've met on the Internet, though, so if you want to help him out you can just ping him on Twitter.

You use the word "friendliest" to describe a guy who three comments ago called someone a "douchebag" because his post was too long?

He's got a history of similar behavior. He really is a bit of a troll. He certainly doesn't follow the HN guideline of "Don't say things you wouldn't say in a face to face conversation."

My guess is PG only unbanned him so he wouldn't face the criticism of seeming to censor one of his critics.

I read Maciej's comments compulsively, all of them, and your summary does not square with my experience. At all.

I thought his comment about Sebastian's post was impolitic, but I chose that word carefully.

I don't care to psychoanalyze 'pg or how he chooses to run the site. He's a busy guy and is more than entitled to make moderation decisions that I disagree with.

I don't understand why you felt the need to write your comment at all. What good did it serve?

How do you know he wouldn't say that in person?

Experience. Every Internet Tough Guy I've met is normal/civil in person. Besides, I left out the "Be civil" part of that guideline. Calling someone a "douchebag" for writing a long-winded post wouldn't fit most people's definition of civility.

He didn't call Sebastian a douchebag for writing a long post.

He called Sebastian prolix for writing a long post.

He called Sebastian a douchebag for writing that long post.

If we're going to hellban people for writing individual uncivil posts, that's a problem. But I don't think that's what's going to happen.

You should stop calling people "Internet Tough Guys". You have literally no idea who you're talking about. Comments like yours are Part Of The Problem. Maybe you should re-evaluate the notion of starting whole threads on how much you dislike one particular person.

He didn't 'start whole threads' on how much he dislikes a particular person. He merely pointed out that calling someone a douchebag doesn't fit the guidelines for this site.

I like reading idlewords and think he's a good/interesting writer and am glad he wasn't permanently banned. But your reply here defending his words seems a little emotional and knee-jerk don't you think?

Comments like his "douchebag" comment are Part Of The Problem and that's why he was banned. Your description of his comment as merely "impolitic" is only slightly more ridiculous than you referring to him as the "friendliest" guy you know. I probably shouldn't have pointed it out, but I did.

Regardless, I don't have any vendetta against him or care strongly about him being on the site or not. I realize he's a smart guy who just has a bit of an Uncov streak in him. No big deal.

For what it's worth I was a proud recipient of a 50+ score reply that started with "Fuck you!". Maciej's "douchebag" is clearly nothing in comparison and it hardly warranted the permaban.

You're missing the point of the guideline. Even if that particular individual would say it in person, it's still not considered acceptable behavior.

The comment in question, if anybody is interested:


Thanks. I didn't know idlewords was banned & was assuming he'd be by to answer.

I can second this. I hosted CentSports, a recently acquired side project (~1mm user base, ~500GB dataset), in a full cabinet for under $1,100/mo. At its peak we did about 80 million pageviews per month on a heavily IO-constrained, heavily active OLTP workload (...not nearly as impressive as mmaunder's!)

By the time the doors closed, we had about 16 RUs filled, IIRC. The beefiest DB box had 64GB of RAM in it, which as it turns out is a $1,200 one-time upgrade for colo vs. $1,500/mo extra on dedicated hosting (quick price checks on Newegg and SoftLayer; I'm sure both numbers can come down).

Yes, making the leap into colo can seem like a big up-front cost, but over time these costs do work in your favor. Yes, there is a support cost involved (either with remote hands or you getting up at 3am and replacing that disk yourself). But if your workload and dataset are such that you don't necessarily require a new box spun up in less than 24 hours, the savings could be quite great for you.

While I pay nowhere near this much, and we're not nearly as popular, I still like his solutions. Why? I like to sleep. I'd rather pay 2x as much for a reliable host and a solution that works than be up at 3am fixing stuff.

Hmm. For another data point, http://historio.us, which has about 3k active users, costs $30/mo to run. I'm sure costs wouldn't scale completely linearly with users, but there you go.

HELLO! I recently signed up for your service, tested it out a bit, and really wanted to give you $20 for a year's plan, but was dissuaded by the fact that there are no blog entries since November 25th, and by the fact that indexing of PDFs, a feature request which was made about a year ago in one of your polls, is still not done. Can you tell me the status of the PDF indexing, and/or about the general maintenance and updates that are going on behind the scenes? I like your service a bit more than Pinboard, despite their additional features, and I would still like to commit to yours, in the form of a paying customer, but I first want to know that it hasn't been abandoned, or at least that it's not totally on autopilot. Thanks in advance for a detailed reply.

P.S. I really hate to point this out, but $30 per month does not inspire confidence. How do you handle backups, and switching over to another machine if the main one should fail (as well as other such items, which normally incur an extra cost in this kind of setup)?

Hello! The service hasn't been abandoned, but feature development has slowed down a bit, sadly, hence PDF indexing taking so long. Apart from that, though, everything is working perfectly, and we do change some small things from time to time or upgrade/maintain components. Uptime should be excellent, though; we've had minimal downtime for the past year.

Right now, everything is served from a single server, which is why we get hit by some datacenter maintenance from time to time. Backups are made daily as well as multiple times a week, so that shouldn't be a concern... Let me know if you have any other questions!

I like your interface a bit better than Pinboard's, because yours is more like Google's. I like some other things about yours, although I don't recall them at the moment. However, I do remember that I was able to bookmark PDFs, even if they aren't searchable. So, I can at least add tags.

I see that your Twitter is rather current, but it wouldn't hurt to do a monthly blog update, even something minor, because I assume that there are other people, than myself, who look at these details when it comes to deciding whether to commit to a service. Said commitment is really less about money, and more about the perceived reliability, stability, and long-term viability of the service.

I suggest that, in the export function, you include everything about each bookmark, such as tags, dates, and whatnot, because it will likely make the customer feel better to know that they can get a complete snapshot of their efforts, whether they are leaving the service, or whether they are attempting to occasionally perform some external operation on their data. Giving customers complete ownership of their information is a good differentiator, because most websites don't provide such a thing.

My last question: How do you handle (D)DOS attacks on public bookmarks? I can see this as being a potential problem.

Thank you for providing an alternative to Pinboard!

Hmm, you have a good point about blog updates, it's just that we use Twitter for minor announcements. I'll have to change that, though.

Export already includes all the data except the actual page text, so by downloading that file once in a while you can reconstruct your bookmarks almost perfectly (or just import it into your browser).

There haven't been any DoS attacks yet, but varnish is a champ, so that would be pretty easy to mitigate, depending on the scale...

Varnish is nice for load spikes, but how do you cope with, say, a SYN flood? (Honest question, I'm curious.)

We don't, we haven't needed to yet.

$2k for 15,000 users. Each user is worth 13 cents.

So, it seems that the users paying for the archival service ($25/yr) are the ones keeping the lights on at your bookmarking service. 1,000 such users would bring in $25k. Basically you need 7% of your 15,000 users to break even today.

Each user is worth $2000/15000 ~= 13 cents

My questions:

1. Is it easy to get recurring paying users?

2. In the long run (for any paying web service), what % of the user base do you think will be such users? [Is 20% the max? Or 30%?]

EDIT: Fixed my cost per user

Did you mean to compute the cost/user? If so, that comes out to a little over 13c/month. Still, that comes to $1.60/year, meaning that a user would start to cost money right around the end of the 6th year of usage (assuming hosting costs don't go down).
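Spelling out that arithmetic (a Python sketch; the ~$10 signup fee comes from a sibling comment and is an assumption here):

```python
# Cost per user from the figures in the thread: $2k/mo hosting, 15k active users.
monthly_hosting, active_users, signup_fee = 2000, 15000, 10
cost_per_user_month = monthly_hosting / active_users    # ~$0.133/month
cost_per_user_year = cost_per_user_month * 12           # ~$1.60/year
years_covered = signup_fee / cost_per_user_year         # ~6.25 years of hosting
print(round(cost_per_user_year, 2), round(years_covered, 2))
```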

But if they're not paying for archival, are they really the folks who are causing the high hosting costs? As others have speculated, it would seem that the archival service might be the major reason that hosting is so expensive.

Yes. I computed it in reverse. Thanks.

I think it's the other way: 2k/15k = 13 cents per user. Given that I just paid ~$10 to join, my guess is that I just bought around 6 years of storage.

That said, it still looks quite expensive.

Keep in mind he said active users; there are probably a number of inactive users who paid the initial fee but now don't add to server workload.

Since we seem to be sharing the numbers now, let me add my 2c. I ran the Hamachi service [1] with 3 mil registered accounts off 4 co-located servers at a cost of about $700/mo in total. The servers were mid-range 1U Dell boxes, each costing around $1K. That was back in '04, so I'm sure prices have gone down quite a bit since then.

[1] http://en.wikipedia.org/wiki/Hamachi_(software)

FYI, Maciej is banned from HN - his comments are autodead, so don't expect responses.


It would be enlightening to know a bit more about the actual server architecture of the site. I understand that the storage footprint per user for a bookmarking site is quite big[1], but I still think $2k for 15k users is a fairly high hosting bill.

[1] Since they offer full archiving of bookmarked websites to their premium users.

There's more information about this in an earlier blog post, e.g.: http://blog.pinboard.in/2011/08/a_short_rant_about_hosting/

As noted in that post, Pinboard's hosting costs are so high because it runs on (multiple?) beefy dedicated servers (8-16 cores, 24-48GB RAM) costing $500-$1000/month each.

Presumably he has gone this route in order to keep a large database mostly in-memory, but I'd be interested in hearing more about the application constraints that led to this architecture. Some applications certainly need beefy servers, but they have to provide a lot of bang for your buck to compete with the 10-50 VPSs you could rent for the same cost.

When you need 48GB of RAM, having 50 VPSs with 1GB of RAM doesn't do you much good, not to mention the i/o issues he describes.

I agree completely. It's the genesis of "needing 48GB of RAM" that I'm curious about. The I/O issue with shared hosting, as I understood it, was that it took a long time to rebuild a busted RAID volume. This, to me, seems like a different symptom with the same underlying cause: there is a big server in the middle whose operation is critical to the site (otherwise, having 1 of N replicas take a long time to rebuild wouldn't be a big deal). For some sites, this big-server-in-the-middle is fundamentally necessary to do what the site does. I would like to know what it is about Pinboard that puts them in this boat. This is an honest question, not a sneaky way of saying "surely they don't need a server that big."

Hint: you don't need 48GB of RAM to run a social bookmarking site.

Thanks for sharing the data.

Can you break this down into how many actual servers you have? Is providing a high level of redundancy the source of the high costs?

Why do the S3 numbers fluctuate so much? That implies a lot of transient data. At $100/month, you are storing ~50MB per active user?
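For what it's worth, the ~50MB/user figure checks out roughly under 2011-era S3 standard-tier pricing (the ~$0.14/GB-month rate below is an assumption, not a number from this thread):

```python
# Back-of-the-envelope check of the ~50 MB/user estimate above.
s3_bill_per_month = 100
price_per_gb_month = 0.14                             # assumed 2011-era S3 standard tier
stored_gb = s3_bill_per_month / price_per_gb_month    # ~714 GB stored
mb_per_user = stored_gb * 1024 / 15000                # ~49 MB per active user
print(round(mb_per_user))
```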

That's an amazing amount of money to be spending on hosting.

You should be spending nearer to $200/mo total for hosting a service like this with that number of users, IMHO.

Why do you think so? Not all sites are created equal. And Pinboard is definitely on the lean side.

Try hosting a cluster of Magento 'daily deal' sites on Amazon infrastructure, with shared storage and isolated RDS instances for each region. You end up with $15-20k per month for god damned bloated Magento installs.

...then don't use magento.

What I said was that for the functionality pinboard provides, and the number of users using it, the hosting costs are extremely high.

If the hosting costs are high because of inefficient software, or bad architecture decisions, then those should be changed.

> If the hosting costs are high because of inefficient software, or bad architecture decisions, then those should be changed.

If anyone here believes this, make your best estimate as to how many man-months you need for a re-architecture and how many hundreds of dollars you'll save in hosting costs, then do the division to get your effective hourly rate. I'll pay you that plus 50% for contract programming work.

In the example, the current hosting is $24k/yr. If that can be slashed to $2.4k/yr, you've saved $21.6k a year in hosting.

For $21.6k/yr I'd say it's worth a week or two re-architecting.

Yes, it's a different game if you're profitable and $21.6k is negligible, but if you're a startup you should be spending time to optimize things.

The other point is one of scaling. If you're paying $2k/mo to support 15k users, when you scale to 15m users, you could be paying $2m/mo unless you fix things early on.

What makes you think that in "a week or two" you are able to cut costs by a factor of ten without sacrificing performance and reliability?

Well, there are a lot of people here expressing surprise at the cost of hosting a small user base of simple data in an uncomplicated problem domain.

It looks an order of magnitude too expensive to us, there's probably some simple thing wrong in the architecture.

Maybe we're all missing a key complexity of the service. The Archival service might be it.

I'm imagining the full text search of the archive to be a factor, but then again I don't know much about search.

I am not using it personally, but I administer and maintain a cluster along with other people. Magento is brain dead.

As for high hosting bills, build your own equivalent version and then share your own hosting bill data.

Well, my own data point is Mibbit. And I can tell you I certainly don't spend that much on hosting.

Still, easy to criticize without knowing the full facts...

Maciej, can you share that Google doc in a way that would let us view it without needing to create an account with Google?

Saved as HTML and reuploaded: http://pastehtml.com/view/b6z5s6bc5.html


You don't need an account to view the spreadsheet - at least not at the time of posting this.

What % of the users are paying? What is the average rate they each pay? (Since I know you had the "each additional person pays $.01 more" thing for a while.)

What was your rationale behind these ISPs vs say pure cloud (EC2/rackspace/stormondemand etc)? Or what made you pick each option?

The answer to your second question is covered here: http://blog.pinboard.in/2011/08/a_short_rant_about_hosting/ ("Why not go with Linode/AWS/[other virtualized hosting]?")

tl;dr - I/O performance

