

Google on today’s massive Google+ spam influx: “We ran out of disk space” - suneliot
http://venturebeat.com/2011/07/09/google-on-massive-google-spam-influx-we-ran-out-of-disk-space/

======
ChuckMcM
Interesting. Given the minimum size of a Colossus cluster I find the
explanation Vic gives unsatisfying. That being said I'm pretty impressed with
the overall product, it is the best attempt yet to unseat Facebook, and I
predict it will if Facebook can't come up with a credible response quickly.

The killer feature is that it's blended with Gmail, and since a lot of people
keep gmail open all the time it means you get notified and you see stuff. Not
as common to keep one's facebook page open.

~~~
mlinsey
_since a lot of people keep gmail open ... Not as common to keep one's
facebook page open._

Do you have data on this? I'd think it would be close, if not the opposite.

Edit for clarification:

In my social circles, what you said is probably true, but I know many others
use FB messages more than email (any email, not just gmail). Facebook has more
pageviews total, and Facebook has probably around 3X as many active users (I
couldn't find numbers at the same point in time, Gmail was at around 200
million last November as per the WSJ, Facebook was at 500 million last July
and 750 million last week)

Don't get me wrong, I agree with you that G+ notifications in Gmail (and on
Google Search!) are hugely powerful, and will mean G+ engagement among its
users will stay quite high. But when you talk about "unseating" Facebook, you
have to first come to grips with just how entrenched it is, compared to social
networks that rose and fell before it. It is so entrenched that Gmail
integration alone will not be enough - Facebook is substantially bigger than
Gmail.

~~~
ChuckMcM
You are correct in that I was generalizing when I should not have been.

I've used Google Apps for work and since a lot of communication comes through
email that means Gmail is open (or at least getting notifications with the
talk gadget). I would not be surprised if you were correct that many folks
leave Facebook open all the time.

~~~
riffraff
it's worse than that: among my non-technical friends email is used only for
"serious stuff" such as talking with teachers (or students, for my friends who
teach) and for work. For the others facebook messages have supplanted email.

I am obviously not sure this is a global trend, but I keep using the test "who
did you receive an email from, recently, who isn't work or a fellow technical
person?". Results are somewhat scary.

~~~
wisty
Yeah, email is now the snail-mail of the mid-2000s.

~~~
ChuckMcM
Interesting, this means the spammers will have to change venues to reach that
part of their demographic. That should be giving them something to think
about.

------
tybris
Not so long ago hard disk space was abundant, then programmers realized hard
disk space was abundant.

------
adaml_623
Shouldn't there be a Beta tag on the Google Plus icon? Perhaps they left it on
Gmail for so long they didn't think anybody would pay attention to it anymore.

------
craigmccaskill
Some quick napkin math on numbers:

If you're to believe Eric Schmidt when he says 'millions' and put that at 3
million users (being generous), guessing that each user uploads 20 megabytes
of content (again, generous) that's:

3x10^6 × 20 MB

6x10^7 MB or

60TB

Sanity check or does that not seem like a lot of resources to allocate to a
project of this size?

Edit: formatting
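
The napkin math above can be checked with a few lines of Python. This is only a sketch: the user count and per-user upload size are the guesses from the comment, decimal units (1 TB = 1,000,000 MB) are assumed, and the function name `total_storage_tb` is made up for illustration.

```python
# Back-of-envelope storage estimate.
# Assumption: decimal units, 1 TB = 1,000,000 MB.
MB_PER_TB = 10**6


def total_storage_tb(users: int, mb_per_user: float) -> float:
    """Return the total logical storage in terabytes."""
    return users * mb_per_user / MB_PER_TB


# 3 million users at 20 MB each (both figures are guesses).
print(total_storage_tb(3_000_000, 20))  # 60.0
```

Changing either guess scales the result linearly, so even 10x more users or 10x more data per user only reaches 600TB.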

~~~
carbonica
60TB? That's about a couple grand worth of space. Why would that be a lot of
space to Google?

~~~
whiskers
Even using 3TB consumer grade SATA disks you'd need 20 of them, an enclosure
for 20 disks costs a _lot_ more than a couple of grand.

~~~
moe
Not really.

[http://shopping.yahoo.com/721280117-supermicro-sc216-e2-r900...](http://shopping.yahoo.com/721280117-supermicro-sc216-e2-r900ub-system-case/)

~~~
whiskers
That's a chassis for 2.5" disks so you'd be looking at 60x1TB disks, and that
would mean 3 of those enclosures. Now rack them somewhere and add power -
still much more than a couple of grand, we haven't even paid for the disks
yet...

~~~
carbonica
Yeah, I was a tad hyperbolic in just referring to the disks. I would expect
the costs to be around $20/GB/year when you also factor in power - bigger
drives are making a difference, but the other factors always cost more than
the disks themselves.

It doesn't change the fact that 60TB is _tiny_ for a company whose every
product involves storing enormous quantities of data and serving them at
monstrous scale.

~~~
ChuckMcM
And according to the GFS paper there are three copies of every chunk in a GFS
cluster so that is 180TB, and they probably don't depend on one GFS cluster to
meet their availability guidelines so if you had two that is really 360TB
(180TB * 2).
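
As a rough sketch of the replication arithmetic: three replicas per chunk is the figure cited from the GFS paper, while the two-cluster setup is a guess. The function name `raw_capacity_tb` and its parameters are hypothetical, and 180TB * 2 works out to 360TB.

```python
def raw_capacity_tb(logical_tb: int, replicas: int = 3, clusters: int = 2) -> int:
    """Raw disk needed when every chunk is stored `replicas` times
    in each of `clusters` independent clusters."""
    return logical_tb * replicas * clusters


print(raw_capacity_tb(60, clusters=1))  # 180 (one cluster, three copies)
print(raw_capacity_tb(60))              # 360 (two clusters, three copies each)
```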

And the amazing part is if you are in an open event where Google is talking
about their infrastructure in general terms you will realize that that has to
be mouse nuts compared to the amount of 'spinning rust' they have going on at
any one time.

------
oflannabhra
So I guess "field testing" is the new beta testing? Google watered down beta
by applying it to finished products to the degree that they've had to invent a
new term to fill its function.

------
benologist
I still have this happen, and every time I tell myself right, today I'm going
to put in a warning system that'll stop this happening .... and then I clear
up a ton of space and move on.

HTTP.sys logging has been a real pain in my ass, every time I deploy a new
server I forget to disable it and we do so much traffic it fills the drive
completely overnight.

~~~
jackowayed
Sounds like you need to automate setting up a new server

~~~
pavel_lishin
Or at least automate the logging process. 5-minute log monitor that compresses
old logs, and maybe moves them off the system onto a cloud server once space
starts to run out, while also e-mailing the people responsible.
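
A minimal sketch of that kind of monitor, using only the Python standard library. The function names, the `*.log` pattern, the 90% threshold and the one-day age cutoff are all assumptions for illustration. Moving files to a cloud server and sending e-mail are left out, with `notify` standing in for whatever alerting is actually used.

```python
import gzip
import shutil
import time
from pathlib import Path
from typing import Callable, List, Optional


def disk_used_fraction(path: str) -> float:
    """Fraction of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def compress_old_logs(log_dir: Path, max_age_seconds: float,
                      now: Optional[float] = None) -> List[Path]:
    """Gzip every *.log file older than `max_age_seconds`, then delete the original."""
    now = time.time() if now is None else now
    compressed = []
    for log in sorted(log_dir.glob("*.log")):
        if now - log.stat().st_mtime < max_age_seconds:
            continue  # still recent, leave it alone
        target = log.with_name(log.name + ".gz")
        with log.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        log.unlink()
        compressed.append(target)
    return compressed


def run_once(log_dir: Path, threshold: float = 0.9,
             max_age_seconds: float = 86400,
             notify: Callable[[str], None] = print) -> None:
    """One monitoring pass: compress old logs, then warn if the disk is still nearly full."""
    compress_old_logs(log_dir, max_age_seconds)
    used = disk_used_fraction(str(log_dir))
    if used >= threshold:
        notify(f"Disk {used:.0%} full on the volume holding {log_dir}")
```

Calling `run_once(Path(...))` from a scheduler every five minutes would give roughly the behaviour described; the scheduling itself is not shown here.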

~~~
benologist
Yep that would help, notifications are built into about 1/2 my platform now
... gradually getting them more robust. :)

------
staunch
Running out of disk space is probably the single biggest category of server
problem that occurs.

------
robryan
Seems today and yesterday (for me and those I have invited anyway) there has
been no issue with creating an account straight away. So I'm guessing this has
sent the total users on the service up pretty fast.

------
senthilnayagam
running out of disk space is devops 101, it is high time these colossus
startups start publishing which CMM level standard they meet

~~~
vegai
Are you joking?

~~~
senthilnayagam
no, I am pretty serious, down votes don't change reality. there are
apps/services which are signing on 99.999% availability.

given their expertise in managing 200 million+ gmail accounts, I would
consider it a bad fumble at 5 million+ users

~~~
laz
Mentioning devops undermines your credibility. What service is 5 nines? How is
that measured?

In general, Google SRE just gets it done. Sometimes people screw up. It
happens.

