By their own numbers, they do about 14.3m pageviews/day. Let's assume half of that is AdBlocked. A single 300x250 at an exceptionally low $0.05 CPM would yield $3,575/day in ad revenue, or $107,250/month. I'll guarantee that reddit could pull far, far higher CPMs. Like, an order of magnitude or two higher.
Edit: I'm off by an order of magnitude, see below. Point stands, though, that there is a lot of ad money they're leaving on the table.
(14,300,000 x $0.05 x 0.5)/1000 = $357.50/day
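The corrected arithmetic can be checked with a short sketch. The pageview count and the 50% ad-block share are the figures from the comment; the function name is made up for illustration, and the $2.00 CPM is the higher rate suggested elsewhere in the thread, included only for comparison.

```python
def daily_ad_revenue(pageviews_per_day, cpm_usd, blocked_fraction):
    """Revenue per day from one ad unit; CPM is the price per 1,000 impressions."""
    impressions = pageviews_per_day * (1 - blocked_fraction)
    return impressions * cpm_usd / 1000


PAGEVIEWS = 14_300_000  # pageviews/day, as quoted in the thread
BLOCKED = 0.5           # assumed share of pageviews hidden by ad blockers

for cpm in (0.05, 2.00):
    per_day = daily_ad_revenue(PAGEVIEWS, cpm, BLOCKED)
    print(f"${cpm:.2f} CPM: ${per_day:,.2f}/day, ${per_day * 30:,.2f} per 30 days")
```

At $0.05 CPM this gives the $357.50/day figure; the monthly number simply assumes 30 days of identical traffic.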
I still think Reddit could easily pay their costs by running ads, not sure why they are not doing so.
If they dropped in AdSense units, there would be an uproar for about a week, then people would adapt and life would go on with a significantly bolstered revenue stream. Worst that happens is that more users would still be adblocking, but given their page view volume compared to their expenses, I can't imagine that even that would be a net loss.
Uh... therefore reddit gold. That's the point.
The key to this is user retention, and if reddit users are loyal to reddit (which they seem to be, especially those willing to pull their credit cards) I wouldn't be surprised to see retentions on the order of 6 months or more.
What do you base the 2% on?
For Reddit, yes you'll get a number of die-hard fans who want to subscribe. But I don't believe that there is a big enough value proposition for the userbase to do that en masse. They may as well just set up their own reddit since the code is open source (and wouldn't be that hard to recreate if it wasn't).
It might be different if:
A). Reddit wasn't spending crazy money on servers
B). Reddit wasn't owned by a multinational corporation
C). Reddit hadn't pandered to, and cultivated, a staunchly anti-advertising userbase.
Also from their latest blog post, it looks like they're just spending the 'reddit gold' money on more hardware! Instead of fixing the underlying issues.
The 'news' angle is a silly one, but they could definitely think up features that people would pay for that are not available right now in the free product.
The real issue with reddit making money from advertising (aside from the ad blocking) is that the CPMs that are quoted here (between $2 and $9) are not realistic for their number of pageviews. By the time all the unsold inventory is taken out you probably end up with a $0.05 eCPM or maybe 10 cents per click (and that would be pretty good).
I'm really interested in how much they were paying before for their bandwidth and hosting if going to EC2 actually lowered their costs. They must have had the worst deal on the net for that to be true.
Right now, $30K / month buys you 20 (very) fat servers and 20 Gbps flat rate, managed hosting.
I'd really like to see someone make the case they can get that kind of performance out of EC2 for a similar cost.
And hiring a designer! Because, you know, that'll fix their load issues.
(The point stands, though, that "basic" doesn't mean "light").
Whitelisting on adblock will let through a tiny bit of CPM based revenue, but people won't click in the numbers that 'normal' people do.
The only way I can think of to cultivate an anti-ad userbase is to not have ads, but Reddit has had advertising for a long time. For a while it was even a bit obnoxious.
Still though, I agree. At something like a $2 CPM (which is NOT that hard to get for someone their size) they'd still be pulling in insane money for just one ad unit.
No one seems to be mentioning other revenue streams: user stats, customised reddits, merchandise sales, ...?
The users now have an option to not see ads (ie. buy a gold subscription), so they can't really complain about having a single ad slot on the page for non-gold users.
To go further, they could not show the ad slot at all if there isn't a campaign running on the site (ie. don't show 'backfill') which would further increase the value of the slot.
$100k+ a month from that ad slot would be very very easy to achieve, and with minimal sales work (they could outsource it to Doubleclick or another network). Sites like Techcrunch and Gawker make millions from ad revenue with a fraction of the traffic - reddit could hit 6 figures a month with a single unit that isn't even displayed all the time.
It would be a perfect balance of retaining the style of the site along with bringing in revenue
TBH, when CN bought reddit, this is what I thought they would do
What's digg getting? A bit of googling shows that most people who get front-paged see ECPM of <$1...
Why do you think that reddit would be so much higher?
(I agree, they should be showing more ads)
That'll cover the servers, easy, but what about salaries?
I have no idea what constitutes normal ad rates for a site because all the projects I've worked on have been pathological cases in either direction.
What would some realistic numbers for something like Reddit be?
Here you go. Took a bit. Also it's risen about $1 in the last year.
http://www.thathigh.com - got any tips for me? how do you usually pursue direct advertisers?
(1820 * 57) + (910 * 23) = $124,670/yr in reservation fees
(2800 * 57) + (1400 * 23) = $191,800/3yr in reservation fees
Of course this assumes they're doing 3 year reservations, but I don't see any reason not to on their XL instances (I could be wrong though, not sure how open Conde Nast is to logic like this).
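The two reservation totals can be reproduced with a few lines. The counts (57 and 23) and the per-instance fees are taken from the figures above; which instance type each group corresponds to is not stated here, so the grouping labels are an assumption.

```python
# (instance count, 1-year fee, 3-year fee) in USD, per the figures above.
# Treating the first row as the larger instance type is an assumption.
FLEET = [
    (57, 1820, 2800),
    (23, 910, 1400),
]

one_year_total = sum(count * fee_1y for count, fee_1y, _ in FLEET)
three_year_total = sum(count * fee_3y for count, _, fee_3y in FLEET)

print(f"1-year reservations: ${one_year_total:,} per year")
print(f"3-year reservations: ${three_year_total:,} per 3 years "
      f"(about ${three_year_total / 3:,.0f} per year)")
```

Spread over its term, the 3-year reservation works out to roughly half the yearly fee of the 1-year option, which is the point of the comparison. Hourly usage charges are not included in either total.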
What was the limiting factor on dedicated? CPU? ram? bandwidth?
Maybe you just had a really bad dedicated server deal? Were they reasonably priced?
Even VPSs are cheaper than Amazon.
I started out on VPSs and saved a lot moving to dedicated servers once it made sense to. For the dedi servers I use I don't have to pay for RAM monthly, and I get bandwidth at a good price.
The kind of money you're paying on servers is just obscene. It wouldn't matter if Reddit had the revenue of course...
I wonder why google, yahoo and facebook don't run their site on ec2... if it's cheaper.
It's a very interesting case to analyze though - perhaps acquired too soon, never had enough pressure on monetization until it was too late... Questions over how well the site was architected. Sounds like they're using a lot of 'new' unproven 'hip' things. Cassandra? :/
Seems like the founders and YC have been extremely quiet about the problems... It'd be interesting to hear their take on things.
Also can't imagine how Conde Nast could be happy with things.
EC2 only came about because Amazon run their servers on it, and they had so much spare capacity, so they sell it.
If by EC2 you mean "bunches of servers", sure.
On GoGrid you can buy cloud servers, or you can rent dedicated servers (and you can intermix the two). The latter are quite a bit less expensive for a given quantum of resources, while the former obviously offer greater dynamic flexibility (with a significant premium).
Actually considering the terrible I/O rate of services like EC2, dedicated often offers a dramatic advantage.
>They have a whole pile of virtualised servers they can turn on and off by the minute.
But they don't. So they use none of the upside, and have all of the downside. Yay!
That's not true at all. We shut down machines when we are over capacity (rarely) and we often have to bring up a bunch of new machines when there is a traffic spike.
Well, for one, they are A LOT bigger than us. But you'd be surprised who DOES run on EC2. The biggest one I'm allowed to tell you about is Netflix. Their entire streaming service is run off EC2. I guess they're idiots too, huh?
For startups, and Reddit, the difference does (should) matter.
EC2 certainly isn't always cheaper, but it also certainly can be cheaper.
Even without that, keep in mind dedicated servers have investments in hardware management. That's a huge cost. Plus, when you're a fast-paced company, the ability to move quickly is invaluable, which EC2 definitely does allow, but dedicated does not.
Even without that, managed dedicated servers are still often more expensive. Rackspace costs $420/mo for their cheapest dedicated, which is roughly equivalent to 2 small EC2 instances (~$140/mo). The Planet has a similar(ish) machine for $184/mo.
 - http://www.rackspace.com/managed_hosting/configurations.php
 - http://www.theplanet.com/dedicated-hosting.aspx
That said, don't forget that both the Rackspace and The Planet boxes come with 2 terabytes of transfer which would be $300 from Amazon. When you factor that in, suddenly Rackspace becomes competitive and The Planet vastly cheaper.
Softlayer will sell you a quadcore box with 8GB of ram and 4 terabytes of transfer a month for $200.
Yes, the dedicated servers might be less. But when one of them breaks, I have to wait for the provider to fix it. On EC2, I can replace it in 5 minutes.
> I started out on VPSs and saved a lot moving to dedicated servers once it made sense to. For the dedi servers I use I don't have to pay for RAM monthly, and I get bandwidth at a good price.
EC2 doesn't charge for RAM either and the bandwidth is at a great price.
> The kind of money you're paying on servers is just obscene.
It's really not that much more than other hosting providers, and they offer features that the other ones don't. The two biggest being the speed with which I can add new machines and the speed I can add more disk.
* Layered Tech: $169 (2 TB/month)
* The Planet: $149 (1.5 TB/month)
* Superb Hosting: $119 (2 TB/month)
* Hostway: $99 (2 TB/month)
* Server Beach: $75 (1.2 TB/month)
* Cari.net: $60 (1.3 TB/month)
* Amazon EC2: $244.40
If a dedicated server dies, just spin up some VPSes while you order a new dedicated server or get it fixed :/ It's not a great problem for the rare hardware failure.
The dedicated servers I use get me 5TB transfer for around $100/month. That would cost me around $1,000 on Amazon.
In any event. You're wasting money. Reddit could easily be hosted for $3k/mo or so.
I think you could get 75 Dell R410s with 8 cores each and 16G of memory for under $300k. A top of the line colo will run you about $2k a month for a rack and a half with power and room to run all 75 boxes. Where you run into trouble is if the company you buy from charges you for the box cost every year. Getting charged as if you are re-buying every year would work out to something like $27k a month. Those boxes will easily last 3 years and if you span it out over 3 years you are looking at $10k a month. That is half what you are paying for EC2.
With 75 good quality machines I bet you wouldn't see more than three hardware failures a year, if that. All 3 of those would probably be drive failures. I imagine the redundancy of the software would handle that without a problem.
I agree that the convenience of EC2 is very nice. We use EC2 a lot but just in a much more "elastic" way. We have found it is cheaper to colo boxes if the demand is constant.
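As a rough check of those two monthly figures, here is a minimal sketch. It uses only the numbers in the comment ($300k of hardware, $2k/month colo); the function name is illustrative, and staff time, spares and bandwidth are left out.

```python
def monthly_cost(hardware_usd, amortize_months, colo_usd_per_month):
    """Hardware spread evenly over its amortization period, plus colo fees."""
    return hardware_usd / amortize_months + colo_usd_per_month


HARDWARE = 300_000  # 75 servers, upper bound quoted in the comment
COLO = 2_000        # rack space and power per month

rebought_yearly = monthly_cost(HARDWARE, 12, COLO)
over_three_years = monthly_cost(HARDWARE, 36, COLO)

print(f"Charged as if re-bought every year: ${rebought_yearly:,.0f}/month")
print(f"Amortized over three years:         ${over_three_years:,.0f}/month")
```

The first case lands on $27k a month and the second on a little over $10k, matching the estimates in the comment.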
> Actually they're even lower
Yeah I like that you declare knowing the costs of Reddit's servers.
2) You're right, I mis-spoke and should have said "they may be" instead of "they're". That being said, I would hope any frequent internet dweller wouldn't argue semantics and would give me the benefit of the doubt with respect to tone and inflection.
The others are quite clearly marked as estimates, this one not really, and the second estimate from the top in the reddit thread (yielding an estimate of $22,390.37/mo) got the following comment by one jedberg:
> Yes, once again, you are totally accurate. That is almost exactly what it costs to run reddit, as of today. However, with our projected growth, we're looking to be closer to 350K by the end of the year.
Note that the comment based that estimate on 1-year reserved servers, which jedberg's comment suggests is broadly correct, rather than on 3-year reservations.
If it's truly a python issue, it should not be taken lightly by entrepreneurs. I hope it's not...I'm an avid user of both ruby and python, and want to believe they can both be used to create successful and maintainable (in terms of effort and cost) sites.
Let's compare this with Facebook. In October 2009, Facebook announced they had 30k servers. In September 2009, there was a rough estimate that Facebook served 200 billion pageviews per month. That implies 73k pageviews per second, or 2 pageviews per second per server. Clearly the pageviews are a rough estimate, but even if facebook served 1 trillion pageviews per month they still wouldn't be beating reddit for efficiency.
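The per-server figure is easy to re-derive. The sketch below assumes a 30-day month, which puts the per-second total at about 77k, slightly above the 73k quoted and well within the noise of a rough estimate. The function name is made up for illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def pageviews_per_second_per_server(pageviews_per_month, servers):
    """Average pageviews handled per second by each server in the fleet."""
    return pageviews_per_month / SECONDS_PER_MONTH / servers


SERVERS = 30_000  # Facebook's server count as quoted for October 2009

estimated = pageviews_per_second_per_server(200e9, SERVERS)
generous = pageviews_per_second_per_server(1e12, SERVERS)

print(f"200 billion/month: {estimated:.2f} pageviews/sec/server")
print(f"1 trillion/month:  {generous:.2f} pageviews/sec/server")
```

This averages over every server in the fleet, including databases, caches and anything else that never serves a page directly, so it is a measure of whole-site efficiency rather than web server throughput.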
I have a feeling if you run the numbers for any other highly dynamic site at scale, you'll find that amortized over every server in use, you won't get a lot better than 10 pageviews per second.
10 pageviews per second is pretty lame IMHO. Crazy crazy server costs. Amazon is crazy expensive! Why use them? Either Reddit need to drastically change the way they do things, or they deserve to die.
Facebook also doesn't count all webserver HITS as pageviews, whereas most of Reddit's do count.
Beyond that, pageviews for Facebook may eat more resources (CPU/RAM) than Reddit on average due to photo uploading and other misc. things on Facebook. This means a server is working harder at 2 pageviews/sec/server for Facebook.
How is that even relevant? Images cost in bandwidth, not in computing power.
> Does Reddit find things only specific to your account
Uh yes, on every single submission and comment. The admins clearly stated that what used the most resources was the voting system. As well as every single user (whether you marked them as friends) and the list of links itself, which is extracted from the user's own list of reddits.
There is barely anything on a logged in user's page which isn't at least in part specific to that user's account.
> I could really go on for a long time
No, I don't think so, you've been 0/everything so far, it's time to stop.
Sounds like you need to do some STFU yourself, having clearly never come anywhere near this problem domain. Using standard tools to process and resize user-uploaded images can easily soak up all the CPU time and memory you could throw at it.
I recently spent a bunch of time rewriting a client's image processing pipeline from the usual O(n)-space ImageMagick crap that loads the full uncompressed image into memory a few times over to be O(1)-space, doing streaming downsampling by exploiting the compression implementations of JPEG and PNG. It was more than worth it: even with the PNG implementation being a bit more CPU intensive now, it's a lot nicer having it operate in constant space without fear of swap and the OOM-killer.
Your personalized news feed and what you can see depends on the settings of all your friends and those friends.
Reddit is just a simple forum site...
Facebook have revenue, and profit. They can afford to buy more servers and have them sitting idle.
Reddit should be trying to run on an optimal number of servers. Firstly, cheap hosting (which is NOT Amazon), and secondly, fewer servers. 80 is just crazy.
As someone who's worked in a data centre before, I can say hardware problems only increase as you scale up. We had around 350 servers with an average of three drives each, and we probably went to the data centre to swap dead hard drives on average three times a week - SAS, SATA, or SCSI. Never mind the time I spent testing motherboards, swapping CPUs out, etc.
Owning your own servers means dealing with maintenance and downtime. Virtualizing 10 servers on one physical server is great until that one physical server's RAM starts acting flaky, and then you have to take those 10 servers offline, or shift the VMs onto your other hardware.
Renting your servers typically means paying a monthly cost for the hardware, including RAM and disk, long after the fees have paid the DC back their costs.
For Reddit, being able to bring N extra nodes for X purpose (mysql, cassandra, web serving, etc.) with a few clicks in a few minutes is likely a huge draw for them. It means they can grow gradually (instead of having to shell out another $10-20k for a new server), and it means they can dynamically adjust to meet their needs. For example, if Monday is a busier day than Saturday by a significant stretch (and if their architecture allows it), they can bring more nodes online early Monday morning to handle the load. Take them offline until Thursday evening to handle the 'It's Friday, to hell with doing work' rush I'm sure they get, and so on.
That said, this looks, sounds and smells like a database scaling issue more than anything. It is interesting to ponder the long-term costs of running a site like this on EC2 versus your own maintained server farm.
That web server is running separate from databases and other resources. It never sees load. We're using a static language, but even if ruby was 10x slower, it wouldn't matter.
On the other hand, change the contents of a stored procedure so that it doesn't line up with indexing properly, and page load goes from <0.1s to >4s.
What I'm really trying to say here is... prove it.
Actually, you do! :) http://code.reddit.com
But to help you out, I'll tell you that a good chunk of the expensive loops are written in C.
This also calls into question the original assumption, that posters believed Reddit had far too many wasteful servers to handle their service, and that it was because they use a dynamic language.
Scaling is caching and architecture, not writing your app in Java.
I'm also not sure what makes a Facebook game's domain more "widely cachable". 100% of users are logged in. The vast majority of actions taken are changing state. Any app touchpoint is writing to the database. Page caching is nearly impossible.
It's a lot less different than you think.
You obviously know how to scale an app if you were pulling 16 million pageviews a day, and I don't intend to discount that at all. I just mean to point out that while fundamentally the same problem, reddit has to deal with a version of that particular problem that most applications don't begin to approach.
I never understood this. Does reddit really need to spend the capital making sure I see a stranger's upvote the moment it occurs? A 60 second delay to refresh pages in batches seems perfectly reasonable. Perhaps with a client side script to mark my own upvotes so the system doesn't look like it is losing my selections.
Warbook was a Facebook application written in Ruby on Rails I ran by myself in late 2007 - 2008. It grew to over 16 million pageviews a day. At the time it was more pageviews than Twitter.
I scaled it using the following stack: Perlbal for load balancing, lighttpd for static assets, Mongrel for dynamic requests, Memcached for caching, and MySQL for relational data storage.
I used two medium instances for load balancing, one medium instance for asset hosting, 15 small instances for mongrel, one XL instance for memcached, and one XL instance for MySQL.
I used memcached as a "write-through" cache. Everything in cache was considered fresh. Every write of a cachable object would write to both MySQL and memcached. Every read of a cachable object would start with memcached first and fall back to MySQL. This reduced reads on the database by 95%.
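A minimal sketch of that write-through pattern, for readers who haven't seen it. Plain dicts stand in for memcached and MySQL, the class and key names are hypothetical, and repopulating the cache after a miss is an assumption; this is not the Warbook code.

```python
class WriteThroughCache:
    """Write-through cache sketch: dicts stand in for memcached and MySQL."""

    def __init__(self):
        self.cache = {}     # stand-in for memcached
        self.database = {}  # stand-in for MySQL
        self.db_reads = 0   # counts reads that had to go to the database

    def write(self, key, value):
        # Every write goes to both stores, so cached entries are never stale.
        self.database[key] = value
        self.cache[key] = value

    def read(self, key):
        # Try the cache first and only fall back to the database on a miss.
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value  # assumption: repopulate after a miss
        return value


store = WriteThroughCache()
store.write("user:1", {"gold": 120})
store.read("user:1")            # served from the cache
del store.cache["user:1"]       # simulate an eviction
store.read("user:1")            # falls back to the database once
print(store.db_reads)
```

Because writes keep the cache current, the database only sees reads for keys that were evicted or never cached, which is how the read load drops so sharply.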
Total hosting costs were ~$2,000 a month.
Facebook giveth and Facebook taketh away.
Mibbit is written in Java (Custom written framework and server), and handles traffic just fine. Java is insanely efficient for network IO.
It's an apples to frogs comparison, but if you measure 'page views' then Mibbit does bajillions, on a handful of servers.
That's cute, but was your web app as hard to cache as Reddit?
Scaling a fully static website is trivial, and the more dynamism you include, the harder it becomes.
There is barely anything static on a logged in Reddit user's page.
USERCLASS = SHA1(SUBREDDIT1 XOR SUBREDDIT2 XOR SUBREDDIT3)
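One way to read that formula, as a hedged sketch: treat each subscribed subreddit as a numeric ID, XOR the IDs together, and hash the result to get a cache key shared by every user with the same subscriptions. The function name and the IDs are hypothetical.

```python
import hashlib
from functools import reduce
from operator import xor


def user_class(subreddit_ids):
    """Cache key for a set of subscriptions: SHA-1 of the XOR of the IDs."""
    combined = reduce(xor, subreddit_ids, 0)
    return hashlib.sha1(str(combined).encode("ascii")).hexdigest()


# XOR is order-independent, so the subscription order does not matter.
print(user_class([3, 17, 42]) == user_class([42, 3, 17]))
```

XOR can also map different subscription sets to the same value (1 XOR 2 equals 3), so a real key would more likely hash the sorted list of IDs instead.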
For logged in users, you're still going to need the voting status for the current user on every single submission as well as that submission's hidden status for the current user to decide whether or not a submission should be displayed in the listing.
I believe they almost never hit the DB directly, so these are probably recached immediately (or submitted to both the cache and the DB at the same time), but that still means quite a lot of traffic.
And what/who pay for that?
In addition to that, the software behind HN doesn't do nearly as much as the reddit software: there is no automatic checking of new messages, there are no sub reddits you need to work on, there are only so many votes to be processed, etc.
I have no clue about who pays for what, but I guess PG pays.
When your userbase is filled with college kids and anti-capitalists, you are going to have a tough time making money.
After all, they're aiming for 2% of the userbase subscribing. That's a large number when they're offering basically nothing of value in return. I know other 'freemium' businesses get 1-2% subscribers, but surely that's when they're offering a real improvement in the service, such as Dropbox's 2GB -> 20GB.
There is a good argument that advertising actively moves you away from the EMH which is one of the strongest arguments in favor of capitalism.
I said anti-capitalist because many people were against advertisements and the gold accounts. While there may be many people lurking on reddit that are not anti-capitalist, it's a sentiment I see in almost every topic related to making money on that site.
The users of reddit are wide and varied. There is a significant number of redditors who have demonstrated their willingness to pay reddit, me among them, and don't regret it.
Along the way, I realized it would be trivial to make a mobile HN reader, so I spent some time doing that http://toadjaw.com/hn
Keep up your website. Maybe you should put an ad on it, I would click it once a day to help :)
Whilst it sounds like a lot of money, remember that the whole site's annual EC2 bill could easily be paid for several years by one of the Newhouse brothers (who own Advance Publications, who own Conde Nast) selling one of their Reubens or probably just giving up their bank interest for a month (est. about ¼-million USD per day for the poor one: 1 day of 2% annual interest on $4.5 billion).
I don't want to argue that at all. I'd argue that bailing them out as if they're a charity is wrong given that supporting it would be pocket change for the owners - less than they'd pay for a painting that they probably don't even bother to look at (I know that these paintings are not to look at, humour me).
Renting an empty rack is about 200 EUR/month, and you can fill it with back to back mounted Atoms with SSDs for less than 40 kUSD. 160 cores and 320 GByte RAM with some 16 TByte/s peak throughput for some ~2 kW is not that bad for the price.