Hacker News
Moving the NYT Games Platform to Google Cloud With Zero Downtime (nytimes.com)
180 points by jprob on Dec 7, 2017 | 59 comments

> We found that some web customers were unable to access the puzzle, and found the cause of the problem to be App Engine’s limit on the size of outbound request headers (16KB). Users with a large amount of third-party cookies had their identity stripped from the proxied request. We made a quick fix to proxy only the headers and cookies we needed and we were back in action.

That’s pretty funny. Users of NYT are sending request headers with sixteen kilobytes of tracking data. Maybe that’s the real problem eh?
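For anyone wondering what the quoted fix ("proxy only the headers and cookies we needed") might look like, here is a minimal sketch. It is not NYT's code: the allowlists, the `session_id` cookie name, and both function names are made up for illustration, and only the 16KB figure comes from the quote.

```python
# Hypothetical allowlists; the thread does not say which names NYT actually forwards.
ALLOWED_HEADERS = {"accept", "content-type", "user-agent"}
ALLOWED_COOKIES = {"session_id"}
MAX_HEADER_BYTES = 16 * 1024  # the 16KB outbound limit mentioned in the quote


def filter_cookie_header(value):
    """Keep only allowlisted name=value pairs from a Cookie header string."""
    kept = []
    for part in value.split(";"):
        part = part.strip()
        name, sep, _ = part.partition("=")
        if sep and name in ALLOWED_COOKIES:
            kept.append(part)
    return "; ".join(kept)


def filter_headers(headers):
    """Return a new dict with allowlisted headers and a pruned Cookie header."""
    kept = {}
    for name, value in headers.items():
        lower = name.lower()
        if lower == "cookie":
            cookies = filter_cookie_header(value)
            if cookies:
                kept["Cookie"] = cookies
        elif lower in ALLOWED_HEADERS:
            kept[name] = value
    return kept


def header_bytes(headers):
    """Approximate wire size, counting 'Name: value' plus CRLF for each header."""
    return sum(len(f"{n}: {v}\r\n".encode("utf-8")) for n, v in headers.items())
```

The point of an allowlist over a blocklist is that new third-party cookies can't push the proxied request back over the limit.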

I wonder which news website has the largest amount of trackers. If I let the CNN home page sit open in chrome, I can come back an hour later and find thousands of requests blocked by uBlock.

Try a Cracked.com article. I left one open for 10 minutes and had 50,000 requests and 80MB transferred.

Holy crap, you weren't kidding. Loaded their main page, 12MB transferred in the first minute before I noped out of there.

How times change.

That would've taken half an hour on a dial-up connection.

> Users of NYT are sending request headers with sixteen kilobytes of tracking data.

How do you see this? What tools in the browser?

Well nyt saw it by proxying user traffic. You could do the same with a chrome extension, similar to uBlock.

If you use chrome just look at network under the developer tools.

If the website is free then you are the product. Not sure why it’s a surprise these companies aim to maximize the value they can derive from their product (aka our data).

OP said it was funny, not that it was a surprise.

It's not free. NYT relies on a subscription model.

The mini-crossword is free as well as some of their full crosswords.

Some. I actually cancelled my subscription with them because of the horrendously large Google Home ads on all the paid crosswords. When I asked them about it I got about four non-answers — it's ridiculous that even if you pay they still serve you ads.

Newspapers have always included ads even if you pay; not sure why anyone would expect “on the internet” to change that.

Because I pay specifically for only the crossword part. I bought a book of 50 (of NYT puzzles) for the same price, and it came with no ads. Much better deal, if you ask me.

I'm a paid subscriber to NYT (so it's not free) but I'm still prone to this issue

If you're a paid subscriber then I think it's morally OK to use an ad blocker on them. You're paying for what you're consuming, and in all likelihood you're only earning them a few cents to maybe a dollar from ads anyways. What you spend on your subscription drives way more value.

Something I haven't thought of before - with paper newspapers, there is no way to opt out of ads, and those are also subscription based. Why is the internet different? There is no opt-out for newspapers. Presumably they are getting less revenue than the customer agreed upon if the ads aren't displayed or aren't clicked on. Is using an ad blocker effectively stealing?

Note: I use an ad blocker primarily as a means against malware.

If internet ads were as unobtrusive as newspaper print ads (I'm not talking about the shitty free tabloids), a lot fewer people would be using ad-blockers.

It’s not a surprise, but it is a certain kind of schadenfreude to see a bug caused purely by the sheer amount of trackers they’re forcing into their users’ browsers. It might have been a good time for them to do some internal reflection.

And btw it’s not free; there is a paywall and you can subscribe to the NYT.

NYT isn't forcing these trackers into users' browsers; ad banners are. You might say that's a distinction without a difference, but I disagree. Until you work with programmatic ad stuff it's difficult to fathom just how stupid it is, but also how unavoidable it is if you want to make money.

They could try a subscription model.

Glibness aside, there is probably more written about NYT's revenue from ads vs subscription than any other major media company.

IIRC they recently made more money from digital subs than they made from ads for the first time ever. But if they removed ads tomorrow it would still destroy the business as it currently operates.

Is anyone else curious why the NYT uses Medium? Their own website is literally about reading stuff

(Sorry if this is off-topic)

"How we did X" is "not worthy" of the main brand.

It's an engineering blog post, and those usually serve double duty as both informative and useful for recruiting ("Look at the cool stuff we are building! Come be a part of it!"). Case in point, the post ends with "we’re currently hiring for a variety of roles and career levels".

In addition to not being really appropriate for nytimes.com, I'm guessing that publishing content there brings along a lot of extra cruft that is probably not necessary for a post like this (advertising, paywall system, isolating it from the "real" NYTimes content, etc.). Easier to just throw it up on Medium and call it a day.


Our CTO made our first post to Medium explaining the move: https://open.nytimes.com/introducing-the-new-open-blog-23eba...

This is really fascinating to me. Is this because engineers who they want to recruit dislike the New York Times brand, or because readers of the New York Times don't want to read things as informal as transparent blog posts about internal NYT decisions?

It's very easy for me to see something like blog.newyorktimes.com with a similar design / community philosophy as Medium, but would that somehow cheapen the experience for NYT readers? Or does NYT just not see itself as a "hip tech company" like Medium? I have endless questions about this, haha.

It seems to me like there's a lot of unstated assumptions hiding in "not appropriate for nytimes.com". Some things mentioned include -- "advertising, paywall system, isolating it from the "real" NYTimes content, etc.". This is absolutely baffling to me! I would be much more inclined to read regular NYT content were it not for these things.

I think you're over-thinking this. There isn't a huge crossover audience for engineer blog posts and general NYT audience, and I imagine the engineering blog posts do not go through the same editorial process content on nytimes.com does. That alone makes the case for using a different domain.

I'm being a cheeky detractor of NYT here. I think "candid, engineering-style blogposts" are the future of news, and ancient vehicles like New York Times are long dead. I think the "general NYT audience" is participating in #FakeNews, and they should radically reconsider their information diet.

As a vivid example of this, consider James Bridle breaking the "Youtube exploitative kid videos" story way before, and in greater depth than, any major publication. This is actually the future of news, and pretending like aging institutions like the New York Times are remotely relevant anymore is longshot wishful thinking.

Editorialization, fact-checking, and cultural leadership have important roles to play, and I'm excited to see these features unbundled into separate services. I'm long on services like Verrit and Snopes, and wish that I, as an independent publisher, could pay an intern to get official statements, cross-check narratives with history, and perform some of these functions. As is, I think people are operating under the delusion that ONLY NYT-style institutions can perform these functions, which baffles me.

(Actually, the future is probably more like James posting on jamesbridle.com, and then aggregating it through sites like Hacker News. But what do I know, I'm just a millennial who doesn't understand all these big partisan topics like modern journalism)


Hmm. I'm going to disagree with that! I think "candid, engineering-style blogposts" are and will continue to be great for an engineering audience, but I'm very skeptical that they will be great for a wide audience. For instance, I read and was fascinated by the YouTube kids post, but I do not know anyone outside of the tech industry that read it. And you're wrong to say he reported it way before, the NYT published this two days previous:


and Bridle's post itself links to reporting by New York Magazine from 2016. I don't dispute that his post goes into more detail, I just dispute that longer automatically equals better. Someone with domain knowledge reporting a story in great depth and a major publication reporting a simplified version for mass consumption is certainly not a new model.

I'd also strongly disagree that NYT is an ageing institution unable to adapt to this modern tech reality. John Herrman writes some of the most perceptive pieces about the state of tech out there:


(and a minor quibble: I don't think the post linked here and the Youtube Kids post are in any way comparable. The engineering writeup is not news in any way, shape or form, it's just a guide to how NYT implemented something)

Thank you for this good response. I didn't notice the NYT covering this story before, because I have cut NYT out from my life due to their malicious, partisan reporting. So maybe I should be less bold about my evaluations of them and just continue to enjoy my personally-curated, high-information-dense feed.

Doesn't this lead to vendor lock-in? All these Google proprietary services seem like they would be a big issue if they decide for whatever reason to migrate away from GCP.

They can rewrite it again on the next cool lang (rust,kotlin etc) using nanoservices and some new per-column/second-pricing db.

Distributing data across the /tmp directories of many AWS lambda functions is the future of storage.

Wait...you can do that?

It's a trade-off between time to market and risk of vendor lock-in. Also, a typical tech stack gets fully or partially rewritten every couple of years.

Which ones? There are ways to use way, way more Google services in your architecture. They have a bunch of industry standard stuff there, presumably that's how they were even able to migrate from AWS in the first place.

well if you just use appengine without using a vendor lock-in service, i.e. using cloudsql instead of datastore, etc. then you probably won't run into trouble. but it looks like appengine still has its momentum (they actually added java8 support lately)

It’s actually double lock-in, so 2x worse.

You used to have to just be afraid of lock-in, which I don’t think is as big an issue as it sometimes seems.

But with Google, you’re not only locked in but might be LOCKED OUT when they kill your product.

Don’t be ridiculous. Google killing a feed reader is way different from Google killing a cloud service with paying customers and SLA agreements.

Like the QPX Express API?

Interesting point. However that’s not a google cloud product and never had an SLA (the QPX FAQ says “we do not guarantee support”). It’s also a unique case because of its reliance on third party data vendors.

If google starts killing their cloud products, I will eat my socks. Just let me wash them first.

?? This happened before.

This is from their legal agreement:

7.1 Discontinuance of Services. Subject to Section 7.2, Google may discontinue any Services or any portion or feature for any reason at any time without liability to Customer.

7.2 Deprecation Policy. Google will announce if it intends to discontinue or make backwards incompatible changes to the Services specified at the URL in the next sentence. Google will use commercially reasonable efforts to continue to operate those Services versions and features identified at https://cloud.google.com/terms/deprecation without these changes for at least one year after that announcement, unless (as Google determines in its reasonable good faith judgment)

So technically they can do it, though their enterprise customers likely have stronger agreements that require at least X time (probably 1 year) notice

Of course they can do it. I’m sure similar language exists in AWS and Azure agreements.

Look, I hate a lot of what Google stands for and where it’s going. But I find it very implausible they’ll kill any non-beta products that are part of google cloud platform. GCP is poised to take the place of AdWords as the google golden goose, helping them to diversify from their heavy reliance on advertising for revenue. They do not want to screw that up.

I’m sure they are well aware of the uprising that would cause amongst developers, aka the core customers of GCP. It would be a stupid move in a highly competitive cloud market, effectively telegraphing the fact that you can not rely on GCP services to exist in perpetuity. Their competitors would likely respond by re-implementing the shut down product with a compatible API so they could literally steal disgruntled users from GCP.

If you’re really concerned about this, the solution is pretty simple: don’t use GCP. If you want to use it, then only rely on the very core services that google clearly has strong incentives not to kill. Those would likely be VMs and any products that have an equivalent at another cloud vendor.

Silly statement. Google is in the cloud business and this is very different than a free product they offer.

The Flights API they just killed? Custom searches, which many websites paid for, which they killed?

Google has a habit of killing things, no matter if you pay for it and your business relies on it, or not.

If I remember correctly, they were required to keep that API up for a specified amount of time after the acquisition and they have kept it longer than that.

And that’s an excuse how?

Many of their cloud APIs are also acquisitions. The entire Firebase product line, and the Fabric.io product line are acquisitions.

Should we expect those to also disappear suddenly?

I think the difference is that the QPX was the byproduct of an acquisition (ITA Software) while Firebase and Fabric.io were the desired targets in those respective acquisitions.

Usually when a large company relies heavily on a cloud provider, they have an additional contract that specifies, among other things, advanced warning of any pending shutdown, often measured in years, to give them enough time to adjust and also to appease their shareholders and auditors.

And even without this, Google has a history of proactively notifying paying customers years in advance of termination of a commercial enterprise service. The Search Appliances are a perfect example -- EOL was announced a couple years ago but support has persisted for existing customers and only next spring will they finally be fully unsupported. Moreover, Google is actively offering migration plans & assistance to move GSA customers to the new Cloud Search service, or even to third party indexers like Elastic.

I get the gist of the OP's complaint, but like you said, that behavior pattern is just not tenable in the kind of operating environment Google Cloud finds itself in these days.

Disclaimer: I work for Google Cloud, but not on any of the aforementioned products.

If anyone wants to try collaborating on crosswords in real-time, try


You can upload .puz files or let it download from NYT with your subscription, then share the link with friends.

(Web only for now, sorry.)

Since there are people from NYT here, can you point me in a direction to help figure out why my streaks have been all messed up the past few months? Not sure if it's an app bug or something on the backend, but puzzles are retroactively being marked as being completed perfectly when they're not. Email in bio if you'd like to discuss more.

What does zero downtime mean?

No interruptions of services?

Or just that people could still log in all the time?

Pretty sure they mean people could log in again after the switch.

I can't imagine what the purpose would be of capturing session data for each logged in user and transferring that over... I wouldn't even expect that of a Fortune 500 company moving platforms.

If that is what they did, it warrants a post on its own.

Imagine if you were halfway through the puzzle when the cutover happened, and then you lost your entire puzzle state and the board was reset. For the die-hard crossword players, this would be devastating.

Imho, in that case "zero downtime" is the wrong term.

Because it implies that nobody's running session went "down".

That's much harder because otherwise you'd just start a new service parallel to the other one, and flip a switch that directs all new logins to the new service.
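To make the "parallel service plus a switch" idea concrete, here is a rough sketch of that simpler approach. It is purely illustrative and not what the article describes NYT doing: `CutoverRouter`, the backend labels, and the in-memory assignment table are all hypothetical.

```python
class CutoverRouter:
    """Routes sessions to an old or new backend during a migration (illustrative only)."""

    def __init__(self):
        self.cutover_done = False
        self.assignments = {}  # session id -> "old" or "new"

    def flip(self):
        """Flip the switch: sessions first seen after this go to the new backend."""
        self.cutover_done = True

    def backend_for(self, session_id):
        # Sessions that already exist stay where they started, so nothing in
        # flight has to be migrated; only newly seen sessions follow the switch.
        if session_id not in self.assignments:
            self.assignments[session_id] = "new" if self.cutover_done else "old"
        return self.assignments[session_id]
```

Note what this does not do: it never moves a running session's state across, which is exactly the harder part being discussed here.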

It means users' puzzle and game progress was never interrupted, along with login sessions. A half-played puzzle before the cutover could be picked up as it was afterward.
