
Gmail Services Global Outage - nedsma
https://outage.report/gmail
======
postit
I'm sure that wasn't correlated with my selecting 100k messages today, marking
them as read and archiving them all - but the timing was perfect

~~~
jlawson
It was correlated - it wasn't caused.

Sorry - grammar eye twitch!

------
zulgan
We managed to centralize everything, email, git, even the web. I understand
99.99 looks fine, but it is somewhat sad to see half the world without email.

~~~
airstrike
I'd be happy to move away from Gmail. Unfortunately, I happen to hate spam so
that's not really an option.

~~~
reaperducer
_I'd be happy to move away from Gmail. Unfortunately, I happen to hate spam
so that's not really an option_

That was what kept me on GMail for so long. But about a month ago I moved
several accounts to FastMail (at the urging of others on HN), and have been
pleasantly surprised by the results.

FM seems to have a lot fewer false positives, and the amount of spam that gets
through seems only marginally more than with GMail.

FM offers a 30-day free trial, which even includes using your own domain. That
I found surprising. Usually trials are so restricted you can't really get a sense
of what you're getting into.

~~~
eeeeeeeeeeeee
Did not have the same results regarding spam on Fastmail. I still use them for
a mostly private email address that friends have, but it never did a great job
with a highly public email address.

It doesn’t look like they’re doing much beyond what I did 10+ years ago when I
was running high volume mail servers and it was never enough:
[https://www.fastmail.com/help/technical/spamchecks.html](https://www.fastmail.com/help/technical/spamchecks.html)

------
ts330
Well Microsoft had their annual mail outage the other day... I guess they
could have coordinated a bit better to get them done and out of the way at the
same time.

~~~
wil421
Microsoft has had at least 2 major Azure outages that affected their SSO
product. My system was unreachable by anyone but admins. Our systems engineer
couldn’t do anything but wait on Microsoft to fix the issue on their end.

~~~
bharrison
We spent ~4 hours troubleshooting an Office365 deployment before realizing the
SSO outage was the cause.

The simultaneous fury and relief when we successfully logged users on the next
day having made no functional changes was a watershed moment in our self-
hosted services commitment.

------
mattlondon
There do not appear to be any problems today in Switzerland. "Global" might be
overstating it?

Edit: actual status page says "significant subset of users":
[https://www.google.com/appsstatus#hl=en-GB&v=status](https://www.google.com/appsstatus#hl=en-GB&v=status)

~~~
puzzle
Given how many services it affected and the mention of 404 errors, I would
suspect a GFE bug or bad configuration that started rolling out worldwide
(hence the geographically diverse, but not 100% spread of the issue). A decade
ago, Tuesdays were a special day for GFEs, but that hasn't been the case for
years now. Perhaps it's just the Tuesday curse that persists. :-)

My money is either on that or the static content service.

~~~
bamboozled
GFE?

~~~
skj
"Google Front End", the process directly between the open internet and
internal Google services.

------
Felz
As someone working on a mail service of my own, this is heartening to see.
Even Google messes up sometimes.

~~~
Waterluvian
Yup. And that's by design. Stability is asymptotic and increasingly expensive
to reach toward 100%. There's a very intentional "this is good enough" point.

~~~
x3tm
What are the main factors that could lead to such a situation?

~~~
gizzlon
From the first SRE book [1]:

    
    
       "The error budget stems from the observation that 100% is the wrong reliability target for basically everything 
       (pacemakers and anti-lock brakes being notable exceptions). 
       
       In general, for any software service or system, 100% is not 
       the right reliability target because no user can tell the 
       difference between a system being 100% available and 99.999% available. 
       There are many other systems in the path between 
       user and service (their laptop, their home WiFi, their ISP, 
       the power grid…) and those systems collectively are far less 
       than 99.999% available. Thus, the marginal difference 
       between 99.999% and 100% gets lost in the noise of other 
       unavailability, and the user receives no benefit from the 
       enormous effort required to add that last 0.001% of 
       availability.
    
       If 100% is the wrong reliability target for a system, what, 
       then, is the right reliability target for the system? This 
       actually isn’t a technical question at all—it’s a product question.."
    

[1] [https://landing.google.com/sre/sre-book/chapters/introductio...](https://landing.google.com/sre/sre-book/chapters/introduction/)

[https://landing.google.com/sre/books/](https://landing.google.com/sre/books/)
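
The "lost in the noise" argument in that quote can be made concrete with a rough sketch. It assumes the systems between user and service fail independently and sit in series, so their availabilities multiply. The per-component numbers and the names `path` and `end_to_end` are made up for illustration; they are not from the SRE book.

```python
from math import prod

# Hypothetical availabilities for everything between the user and the service.
# These figures are illustrative assumptions, not measurements.
path = {
    "laptop": 0.999,
    "home_wifi": 0.999,
    "isp": 0.999,
    "power_grid": 0.9995,
}

def end_to_end(service: float, components) -> float:
    """Availability the user sees, assuming independent failures in series."""
    return service * prod(components)

five_nines = end_to_end(0.99999, path.values())
perfect = end_to_end(1.0, path.values())

# The two results differ by about a thousandth of a percentage point,
# while the path itself already costs far more than that.
print(f"service at 99.999%: {five_nines:.5%}")
print(f"service at 100%:    {perfect:.5%}")
```

With any plausible numbers for the path, the gap between the two printed values is much smaller than the unavailability the path contributes on its own, which is the point the quote is making.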

~~~
null_content
We don't tolerate houses collapsing out of nowhere, brakes failing over the
course of normal usage and planes falling out of the sky during routine
flights.

But for some reason, we HAVE TO tolerate software crapping itself once a year?

I don't accept this logic. This is just a sign of how sloppy the industry has
become.

This is the reason your phone becomes obsolete after 2 years, whereas your car
can continue to run after multiple decades of abuse.

~~~
mattzito
I think this is a false equivalency. If we're talking about "service
unavailability", planes break all the time. Houses have to be vacated because
of flooding, fire, insect infestation. Brakes do fail. Just like with
software, we accept a certain level of risk in exchange for cost/convenience
efficiencies (e.g. we don't want our planes to fall out of the sky, but we're
okay with getting stranded in Phoenix for 24 hours because of a busted landing
gear).

~~~
CydeWeys
Also, brakes contribute to service unavailability. Brake pads need to be
replaced on average every 50k miles, which takes the average driver 4 years.
And let's say the average length of time your car is at the mechanic's to fix
brakes is 3 days. That's 3 days of unavailability every 4 years just for brake
pad replacements, or 99.8% availability (two nines!), just because of brake
pad repairs. Add in all the other required car maintenance, and depending on
the reliability of the vehicle, and you might be down into one nine territory.
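
For anyone who wants to check the arithmetic, here is a minimal sketch using the same figures (3 days in the shop every 4 years). The helper names `availability` and `nines` are just illustrative, and the year length of 365.25 days is an assumption.

```python
import math

def availability(downtime_days: float, period_days: float) -> float:
    """Fraction of the period during which the thing is usable."""
    return 1.0 - downtime_days / period_days

def nines(a: float) -> float:
    """Count the leading nines in an availability fraction, e.g. 0.9995 -> 3."""
    if a >= 1.0:
        return math.inf
    # Round first so that float error on values like 0.999 does not lose a nine.
    return math.floor(round(-math.log10(1.0 - a), 9))

# 3 days of brake work every 4 years, taking a year as 365.25 days.
brake_pads = availability(downtime_days=3, period_days=4 * 365.25)

print(f"{brake_pads:.3%}")   # 99.795%
print(nines(brake_pads))     # 2
```

Under the same assumptions, total maintenance only has to exceed about 14.6 days per 4 years (1% of 1461 days) to drop the car below 99%, which is the "one nine" case.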

Gmail going down is like your car being in the shop. It's not equivalent to a
plane crashing; the equivalent there would be the entire contents and history
of your Gmail account being unrecoverably deleted, and you yourself had no
backups. Of course, I'd still much rather have that happen a hundred times
than be in one fatal plane crash ..

~~~
x3tm
Gmail seems to have 3 nines, although I couldn't find a better reference than
this [1], where other services are included:

> [Google's infrastructure] delivers Gmail and other services to hundreds of
> millions of users with 99.978% availability and no scheduled downtime.

[1]
[https://support.google.com/googlecloud/answer/6056635?hl=en](https://support.google.com/googlecloud/answer/6056635?hl=en)

PS. 99.978% availability translates as a downtime of ~ 2 hours/year total. Not
bad! But it's when things break that we realize how performant and reliable
they actually are.

Edits: various typos.

~~~
tracker1
I'm consistently amazed how good Google and Facebook are at staying up.
They're two services where I don't think I've really experienced a broad
outage. Of course with Facebook's data designs, there's sometimes quirkiness
as a result, but it's rarely completely off for me.

Google, I think I've only really noticed it offline once in the past 10 years
or so. Not complaining at all.

------
nedsma
Signed in accounts are not affected. However, if you sign out and attempt to
sign in again, you'll get 404.

~~~
cristianbica
Workaround is going to
[https://mail.google.com/mail?labs=0](https://mail.google.com/mail?labs=0)

------
moviuro
[https://inbox.google.com](https://inbox.google.com) though programmed for
termination looks unaffected here. Maybe worth a shot?

~~~
TurningCanadian
I'm really going to miss Inbox. Its interface is so much cleaner than gmail's.

~~~
Drew_
It's also been untouched by ads for its whole lifetime.

------
kyrra
It's fixed now. Not sure of the cause, but it seems to have only impacted GSuite
related services? (not global to Google)

[https://www.google.com/appsstatus#hl=en&v=status](https://www.google.com/appsstatus#hl=en&v=status)

------
air7
Ha. Apparently this site (outage.report) is used by scammers to lure victims.

>Discussion| Please don't call "support numbers" posted below — most probably
it's a scam. Make sure to report and "downvote" such posts.

------
bovermyer
I am not having any problems accessing Gmail.

However, I'm now worried that I might not be receiving email that I otherwise
should be...

------
gtt
Semi-related, but can anyone suggest a good open source email client? I'm
working on several computers with macOS/Linux, and Gmail is slow on all of them
except my deep learning rig, so I'm looking for something to replace it.

~~~
andyjohnson0
I've been happily using Thunderbird for 10+ years. Supports Windows, Mac, and
Linux natively.

[https://www.thunderbird.net](https://www.thunderbird.net)

~~~
gtt
In my experience it crashes a few times a week and has an abandoned feel. Is
there any modern fork?

~~~
tethys
Mozilla is starting to throw resources at Thunderbird again, so it will
hopefully get better this year.
[https://blog.mozilla.org/thunderbird/](https://blog.mozilla.org/thunderbird/)

------
theduality
Issue affects my GMail account, but thankfully not any of my GSuite accounts.

------
nkozyra
Strange to get a 404 error there. This happens once or twice a year with
Gmail, does it not?

------
steve1977
I guess that's why they've just announced a price increase ;)

------
zerop
Have been consistently getting 404 on Gmail post login. App works fine.

------
jatsign
Got into gmail, but youtube login gives me a 404.

------
sidcool
No problems for GSuite users in India.

------
franky_g
No problems here in Leeds, UK

------
steventhedev
What are the chances this is related to the G+ phase out? The timing is a
little too convenient.

------
franky_g
No problems in Leeds UK

------
rusk
No issues here ...

------
sneilan
They just announced a price hike for personal accounts from 5 to 6 dollars per
user too.

