Hacker News
Twitter migrates data to Google Cloud (cloud.google.com)
680 points by theBashShell 88 days ago | 234 comments



Disclosure: I work at Google Cloud (and directly with Derek and the Twitter team).

I’ll try to edit this page tomorrow when I’m at a computer, but there’s much more information in Derek’s talk at NEXT [1]. They (rightfully) didn’t want to get into a detailed “this is what we saw on <X>”, but Derek alludes to their careful benchmarking across providers.

While you should always assume smart people make economically reasonable decisions, Derek’s point about savings is about list price differences that result from total system performance (and not any sort of special discounting). I’m hoping a follow-up talk will let us say more about how the migration is going, while this talk was focused on the decision to move part of (!) their Hadoop environment to GCP.

[1] https://m.youtube.com/watch?v=4FLFcWgZdo4


FWIW I'm not a big cloud fan, but I was tasked with finding the "least worst" cloud provider in terms of predictable throughput and total cost (incl. dev/ops time), and GCP came out as a clear winner despite heavy financial incentives from both Azure and AWS. (We're a very large corp.)

Due to our new allegiance to Google Cloud I've been given a little more privileged access to engineers (after the fact), and I can tell that I definitely made the right choice. The people I spoke to favoured having a clean/clear backend with actual quality-of-life features; but since those don't fit nicely into feature comparison charts, people often think that GCP is less mature or featureful.

I'm a real convert, and our company uses all three of the big cloud providers in some fashion, but my team only deals with GCP and we've had the fewest headaches.

What I’m trying to say is: you guys are doing great. I’m really happy with the product for my use-cases.


I have a 240+ core cluster running on GCE now, with an additional App Engine frontend for some API items, and I loathe the environment. I've used Heroku, AWS, Azure, DigitalOcean, and IBM SoftLayer in the past. I worked on service readiness and billing on Azure at Microsoft.

If it works for you, that's good, but it wouldn't be my first choice even if I found it to cost less than the alternatives.


Interesting; where does it fall down for you?


Yeah bump. I rarely read about people not liking GCP (aside from their customer service of course) but would love to hear any issues you may be having.


As someone who's worked with multiple cloud providers (at very large scales), I would not recommend Google Cloud, simply because of support issues I've had and clearly misrepresented service capabilities and limits.

I will agree that GCP has some nice quality of life features, and is certainly favorable for a small project compared to some other providers, but I find it hard to trust google's ability to keep up with the demands of a large organization.


Same here. I'm the tech lead responsible for cloud infrastructure at a company with a large cloud presence, and have been doing this for years. Google has better technology in a lot of ways, but awful customer support and even customer treatment. A billing glitch on their side tore down all of our infrastructure and data at a previous employer, without so much as a "we're sorry". With AWS, if such a thing happens, you have an email from your account rep.


In what way(s) do you find GCP better than AWS? My company was using both for a while but has migrated more towards AWS lately. But I'm just learning my way around cloud development now so I don't have insight into the major differences between providers.


GCP has the fastest, simplest, and cheapest primitives for building. Organization and project hierarchy combined with IAM permissions (integrated with G Suite if you use it) make security and access easy. Every project has its own namespace and can be transferred to different owners or billed separately.

VMs don't have a mess of instance types, just standard but customizable CPU/RAM with any disks and local SSDs attached. Live-migrated, billed to the second, and automatically grouped per CPU/RAM increment for billing discounts.
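As an aside, the automatic billing discounts mentioned above (GCE's sustained-use discounts) are easy to sketch numerically. A minimal illustration, assuming the published N1 incremental schedule where each successive quarter of the month is billed at 100%, 80%, 60%, and 40% of the base rate; the function name is made up:

```python
def sustained_use_cost(base_monthly_rate: float, fraction_of_month: float) -> float:
    """Effective cost under a GCE-style sustained-use discount.

    Assumes the N1 incremental schedule: each successive quarter of
    the month runs at 100%, 80%, 60%, then 40% of the base rate,
    which works out to a 30% discount for a full month of use.
    """
    tiers = [1.0, 0.8, 0.6, 0.4]  # incremental rate per quarter-month
    cost, remaining = 0.0, fraction_of_month
    for rate in tiers:
        used = min(remaining, 0.25)
        cost += base_monthly_rate * used * rate
        remaining -= used
        if remaining <= 0:
            break
    return cost

# a $100/month VM left on all month costs $70 (30% off, automatically)
assert abs(sustained_use_cost(100.0, 1.0) - 70.0) < 1e-9
```

No reservations or upfront commitments are involved; the discount just shows up on the bill.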

Networking is 2 Gbps/core up to 16 Gbps/instance, with low latency and high throughput regardless of zone, and doesn't need placement groups. VPCs are globally connected and can be peered and shared easily across projects so that networking is maintained in one place across multiple teams. Fast global load balancing with a single IP across many protocols and immediate scaling.
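That per-core bandwidth scaling (2 Gbps per vCPU up to a per-instance cap, per the figures in the comment above) is simple arithmetic; a hypothetical helper:

```python
def egress_cap_gbps(vcpus: int) -> int:
    # 2 Gbps per vCPU, capped at 16 Gbps per instance,
    # matching the figures quoted in the comment above
    return min(2 * vcpus, 16)

assert egress_cap_gbps(2) == 4    # small VM: core-limited
assert egress_cap_gbps(32) == 16  # big VM: hits the instance cap
```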

Storage is fast with uncapped bandwidth and has strongly consistent listings. BigQuery, BigTable and lots of other services have "no-ops" so there's no overhead other than getting your work done. Support is also flat rate and not a ripoff percentage.

The major downsides to GCP are the lack of managed services, limited and outdated versions for what they do offer, poor documentation and broken SDKs, and dealing with the opinionated approach of their core in-house products. This usually means as a startup that you are more productive on AWS or Azure because you can get going in a few clicks with a vast ecosystem, while GCP is a better home for larger companies that have everything built and just need strong core offerings with operational simplicity.


Your reasoning and conclusion resonate a lot with my own findings for a project we tried to bootstrap. GCP's GKE and Istio integration is becoming compelling enough to consider them for containerized workloads; EKS isn't quite there.

One of the things we are struggling with is finding a qualified partner who can help us build our new project on GCP. We don't have a mature infrastructure team, and while we try to bootstrap one, we would like to rely on partners to help us move the needle. Even partners on GCP's premier list don't have case studies of migrating monoliths to GCP; most of them talk about G Suite migration, which isn't quite the same.

AWS wins this battle. They have far more mature partners with a better track record and thought leadership. I wonder if you see this the same way. Maybe partnership is not crucial for you? Some insights from the community would help.


Yes. AWS biggest advantage now is the giant marketplace of vendors and partners so you can get help and managed services for just about anything.

In my experience, most startups just want managed options they can run themselves rather than engaging partners but if that's what you need then AWS will have more companies to offer, although GCP does have qualified partners. I recommend contacting one of the GCP developer advocates either in this thread or on twitter for help, or email me separately and I'll put you in touch.

Also, I haven't worked with them, but https://shinesolutions.com/ has put out plenty of articles and case studies that suggest they're pretty capable; that might work for you.


Just curious, which support tier are you using from GCP?

For an early-stage startup/indie developer, GCP's support model is not appealing. $100 per user for the development role is unacceptable (at least to us, at the stage we're at now).


We use the production roles, but you don't have to sign up every single person, just those who file tickets and interact with support.

If you're very early then you can also try the older support pricing: https://cloud.google.com/support/premium/

If that's still too expensive then you can probably rely on the free support and forums until you have more spend and revenue.


Depending where you're based, there are some very good partners that can help here. It really depends on what level of cooperation you're looking for from "here, you do it" to "just give us guidance along the way".

https://cloud.google.com/solutions/migration-center/ is a reasonable jumping off point. Sorry if that didn't come up more clearly.


We have talked to Velostrata and some other GCP partners. It's hard to assess their usefulness. Unfortunately, they don't have much of open source credibility. Picking by case studies seems like picking out a car by reading advertising material.


I did some perf testing of VMs across Azure, GCP, and AWS.

GCP was always exactly the same perf, on the button every time.

Azure was all over the place: fast, then slow, then fast, then slow.

AWS was in the middle.

If you want guaranteed perf, go with GCP, 100%.


> The major downsides to GCP are the lack of managed services, limited and outdated versions for what they do offer, poor documentation and broken SDKs, and dealing with the opinionated approach of their core in-house products. This usually means as a startup that you are more productive on AWS or Azure because you can get going in a few clicks with a vast ecosystem, while GCP is a better home for larger companies that have everything built and just need strong core offerings with operational simplicity.

That is spot on. After a year of research and testing on GCP and AWS, I concluded with the same thing. AWS is much more startup/indie-friendly.


Our blocker in using GCP is that they do not offer a managed Oracle Database. AWS does with RDS.

We hate Oracle as much as the next person, but we're locked in. And there's no way we're going to the "Oracle Cloud".


That's more on Oracle than Google, and probably unlikely to ever happen at this point. They've already increased licensing costs to make it more expensive to run in other clouds.


You know, that's kind of weird. You'd think they'd be really worried about the general trend and would be introducing smaller, cheaper versions of cloud Oracle. Are they just that convinced that people will stick with Oracle? I wonder what makes them think that. I'm not familiar enough with Oracle to take a guess.


What about running Oracle on a Google instance with a regional disk? Sure, it's not as sexy as a managed service, but functionally it would be very similar.


Not only is it not sexy, it's probably the most work. You're now administering the server, storage, and the database; you're the DBA for your Oracle instance and databases; Oracle loves to hammer this scenario on licensing (they'd rather you pay for their "cloud"); and you're paying a premium for underlying hardware that needs to be managed remotely.

I can understand why this is a no-go for the GP.


Yeah, I understand it too; We're currently being squeezed by MS licensing in GCP (but it's basically free in Azure!).

This is the cost of lock-in.

Personally I'd rather have the expertise on staff than pay Oracle (and Microsoft) for this shitty behaviour.


What are you talking about? Google support absolutely is "some rip-off percentage". If you're saying it isn't, you clearly aren't working with it at any sort of scale.

Edit: I'm being downvoted but it's true, if you need enterprise support you're paying the same percentages you would on any other cloud service.


https://cloud.google.com/support/#support-options

The role-based support is flat-rate but you're right that the Enterprise tier is $15k or percentage spend. From my experience, most companies are fine with the 1-hour production tier.

I haven't worked with GCP beyond 6-figure scale, but I don't think that matters. If you need the 15-minute response times and the TAM guidance then that's the fee, but at least you can opt out if you don't.


Hey! I'm actually glad you asked. :)

So, obviously my experience is based on my use-case so things that are important to me are predictability of resources and consistency of data.

From a resource perspective (GCP Compute), there seems to be a tendency to set CPU affinity for certain cores, often in the same NUMA zone on the host. This means my tick rate doesn't get hijacked on certain cores randomly. That's quite nice honestly, and it's something that on-prem hyperconverged VM solutions don't manage. The NUMA-zone/same-zone affinity affects our performance quite drastically, as certain applications (such as in-memory databases or game worlds) allocate large chunks of memory and then do little updates millions of times a second. This means memory bandwidth is very important.
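For the curious, CPU pinning of this kind is observable from user space on Linux. A minimal sketch using Python's `os.sched_setaffinity`; this illustrates affinity in general, not GCP's actual hypervisor mechanism:

```python
import os

# Linux-only: pin this process to a single CPU so the scheduler
# can't bounce it across cores (and thus across NUMA nodes).
if hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)   # CPUs we may currently run on
    target = min(allowed)               # pick one deterministically
    os.sched_setaffinity(0, {target})   # pin to that CPU
    assert os.sched_getaffinity(0) == {target}
    os.sched_setaffinity(0, allowed)    # restore the original mask
```

Inside a guest, tools like `numactl --hardware` show whether the VM's topology lines up with what the hypervisor exposes.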

GCP also has live migration of instances, which is 100% transparent to the application (even the CPU clock seems to be moved over), unlike my colleagues on AWS, who get 24h notifications of host machine maintenance. I haven't had to "deal" with even a single instance outage since starting on GCP a year ago. But I'd assume that if AWS hasn't solved this problem yet, they will soon.

Things like VPC networking being inter-zone by default is something that just makes sense and solves some headaches that I hear other teams having.

From a storage performance perspective, on AWS we were hard-capped at 5 Gbps per instance to S3; no such hard limit exists from GCP instances to Cloud Storage, so we're able to make use of it (in fact, I can often saturate my 10 Gbit interface limit even to non-Google instances). An AWS sales rep told us this hard limit can't be removed no matter how much we pay.

I can't attest to the difference in APIs, since I use terraform and that abstracts away anything I would be using.

Regarding storage again: we had reps from AWS talking to us at length, and they would not guarantee that fsync() would be honoured on elastic storage. They stated that "it should never be needed", but honestly I'm not comfortable with that answer. I mean, my bare-metal database instances haven't gone down in 3 years, but that doesn't mean I'm going to start trading consistency for raw throughput.
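For context, "honouring fsync()" means that once fsync() returns, the data survives a power loss. A minimal durable-write sketch in Python (the helper name is made up):

```python
import os
import tempfile

def durable_write(path: str, data: bytes) -> None:
    """Write data and force it to stable storage before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # user-space buffers -> kernel page cache
        os.fsync(f.fileno())  # page cache -> storage device; the device
                              # (or cloud volume) must honour this flush

# usage: write a commit record durably, then read it back
with tempfile.TemporaryDirectory() as d:
    p = os.path.join(d, "wal.bin")
    durable_write(p, b"commit-record")
    with open(p, "rb") as f:
        assert f.read() == b"commit-record"
```

The question to the reps was whether that last flush actually reaches non-volatile storage on the elastic volume, or is silently absorbed.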

There's a lot more but these are the most important things that I can think of right now.


Hi! I'm from the EC2 team.

The fixed performance EC2 instances (in contrast to instances with burstable performance) all have dedicated physical CPU allocations, and 1:1 affinity from vCPU to the underlying CPU. Similarly, memory allocations are fixed, pre-allocated, and are aligned to the NUMA properties of the host system. When combined with both our Xen hypervisor and Nitro hypervisor, this provides practically all of the performance of the host system with regard to memory latency and bandwidth.

See: https://twitter.com/_msw_/status/1045032259189760000

With our latest 25 Gbps and 100 Gbps capable instances, EC2 to S3 traffic can make use of all of the bandwidth provided to the instance.

Please send me more information about the interaction you had on the EBS side, msw at amazon dot com. All writes to EBS are durably recorded to nonvolatile storage before the write is acknowledged to the operating system running in the EC2 instance. fsync() does not necessarily force unit access, and explicit FUA / flush / barriers are not required for data durability (unlike some storage devices or cloud storage systems). Perhaps there was confusion about the question that was asked.


Can attest to very little CPU performance loss via OpenMP applications over bare metal. Maybe as little as 5% for our use case (numerical modeling of weather impacts).


Re: NUMA, I had thought it was fairly straightforward to peg VMs to particular NUMA nodes (or declare CPU or memory affinity) in VMware. I agree GCP does it well (better than the other public clouds!), but I've also seen this done well on-prem.


I have too, which is why I singled out hyper-converged.

In hyper-converged environments the storage and the compute live together on a service mesh, so some processing is done on the hypervisor to manage storage. Those processes don't have an affinity (in the case of VMware, for example), so a guest VM core sometimes gets suspended for a couple of hundred clock ticks.


Hi! I'm from the EC2 engineering team.

This is a super challenging problem for general purpose virtualization stacks. EC2 has been working to avoid any guest VM interruptions. With the Nitro hypervisor, there are no management tasks that share CPUs with guest workloads.

See more here, including data from a real customer with real-time requirements running on EC2 under the Nitro hypervisor: https://youtu.be/e8DVmwj3OEs?t=1796


That's a good point, I have seen cases where we've separated a portion of nodes for VSAN or ScaleIO out to a separate cluster to avoid this behaviour on latency sensitive workloads. These nodes don't need a ton of RAM and you'd better have good E-W bandwidth for them, but it's always a tradeoff...


Hi! I'm from the EC2 team.

This is a great tool to help uncover if CPU and memory NUMA within a VM aligns to the physical host or not:

https://www.cl.cam.ac.uk/research/srg/netos/projects/ipc-ben...

Paper from the USENIX 2012 conference: http://anil.recoil.org/papers/drafts/2012-usenix-ipc-draft1....

EC2 instances have been NUMA optimized since we launched our CC1 instances in 2010. I encourage you to try ipc-bench on your cloud provider of choice, if CPU and NUMA affinity is important to your workload.


Regarding the 5 Gbps cap, they said at re:Invent 2017 that they increased it to 25 Gbps. Did that never happen?

https://m.youtube.com/watch?v=9x8hz1oRWbE


Yes it did: https://aws.amazon.com/blogs/aws/the-floodgates-are-open-inc...

However it may depend on instance type and other factors. This complexity is a problem with AWS.


What 24-hour host machine maintenance? Maybe once a year I get an email a month in advance that a server will be rebooted, or I can do it myself... I stop/start that week and don't get another email for another year...


Three years on GCP here... I’ve not once received one of these notifications. In my prior 7 years on AWS, these happened all the time, especially when you run a lot of VMs in different zones/regions (probably less visible if you are all in one zone).

Live migrations is amazing.

Now if only Google paid more attention to certain features of their GLB... amazing tech with showstopper bugs means amazing tech I can’t use. (Also means it’s not so amazing)


It depends on a lot, like region, instance type, security issues, etc. No notices is still better than any, and live migration also helps with reliability by moving your VM if possible instead of it going down with the host.


I think he means the notice is sent 24h before the scheduled maintenance. They don’t always provide a lot of advance notice.


That's why I'm confused, 7 years on AWS and never had less than 1 month notice.


Sounds like you've never seen an actual hardware failure; you've only seen scheduled maintenance. You don't get any notice at all with host failures. You get an email like this:

"We have important news about your account (AWS Account ID: XXXX). EC2 has detected degradation of the underlying hardware hosting your Amazon EC2 instance (instance-ID: i-XXXX) in the us-east-1 region. Due to this degradation, your instance could already be unreachable. After 2018-12-28 19:00 UTC your instance, which has an EBS volume as the root device, will be stopped."

The stop date is 2 weeks after the email is sent, but as the message states, the instance is likely already dead. It was already dead in this case. GCP will live migrate your instance to another host.


That's the email I get. I had one in 2018, but it was 4 weeks from the received date, not 2. Only a couple of our servers are older than 12 months though; we usually generate new AMIs and roll them out to keep them up to date.


I wouldn't jinx it.

Like the parent said: we used to receive these notices quite frequently, and the amount of notice varied wildly. I should ask the teams who currently/primarily use AWS; perhaps the situation has changed?

But it was not uncommon to have 24hrs of notice, especially for instances in US-EAST2 for some reason.


We use us-west-2, but I'm touching wood nonetheless.


Can you explain more about that 5 Gbps per instance to S3? Is that documented anywhere? How many prefixes were you using?


This was 5 Gbps until January of last year, when it was increased to 25 Gbps:

https://aws.amazon.com/blogs/aws/the-floodgates-are-open-inc...



Aside from pricing (different workloads yield different results), I'd say that:

GCP has much more modern APIs (they got to do it "right" the first time around, while AWS had to learn from their "mistakes").

GCP is less mature with documentation, but still excellent (AWS is first class). I've found bugs, and others I know have run into straight-up wrong docs; a quick message to support yielded the correct information and a docs update within the week.

GCP is far more "opinionated" about how you should run a service. AWS is opinionated as well, but less so. What this means is that while GCP will sell you resources in traditional way, to go GCP native or from scratch really requires buying into the GCP "way" of doing things more than AWS does. Basically, K8S or go home.

We use GCP and I enjoy it quite a bit. I have extensive AWS experience as well, but no strong preference.


GCP is not done "right the first time". Most of the interesting things you see in GCP are 3rd-5th generations of internal products.


How is that a contradiction? You're saying that by the time the products became external APIs, they were already battle-tested, right?


That's true, but I wasn't trying to contradict, only to explain why some of Google's products feel so polished from early access. It also explains why some designs may feel awkward: they were designed assuming the user would be a Google product, with all the ballast of building for a billion users.


We are struggling to maintain price parity on Google Cloud versus both Azure and AWS.

AWS now has the new ARM machines in production, which are cheaper. AWS also allows yearly prepaid managed databases, while Google still charges per-second billing (with some monthly discount). My AWS TCO comes out much, much lower (almost 30%) than GCP if I take committed-use/prepayment discounts into account.


Interesting; maybe it's not solving the same use-cases for you then :)

For us: on AWS/Azure we had to benchmark each instance when it came up, and if it performed poorly it had to be reaped and redeployed, constantly. The increased overhead of dealing with things "the AWS way" was by itself enough to offset the pricing difference, even with the steep AWS discount we were getting.

Understanding administrative overhead is complicated, but I'm sure you're taking the cost of humans into account (or you've already absorbed the cost of automating it?) :)

As with all things; use what works for you. I'm just very happy with GCP over the others coming from bare metal.


It depends on use cases; I run a data-heavy use case. We deploy using Docker Swarm, so our redeployment is fairly well taken care of.

In terms of benchmarking, this is not something we care about (since we don't have that much CPU utilization), so it's probably easier for me.

For my use case, AWS has better administrative tools than Google. I use Route 53, SES, and S3; in all three, no other products come close (transfer acceleration... I'm looking at you).

The one place that I agree with you is Dataproc. That's far superior to EMR in terms of spin-up/down.


I dislike the concept of committing to use. It creates a disincentive to optimize jobs, because "we've already paid for three years of this".


It's up to you to see it the other way around: if you optimize jobs you can fit future workloads in the infrastructure you've already paid for.

I prefer to think of reserved instances as a purely financial trick.
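One way to make that financial trick concrete is a break-even calculation: a reservation only pays off if actual utilization exceeds the ratio of the reserved rate to the on-demand rate. Illustrative numbers and a hypothetical helper, not real prices:

```python
def breakeven_utilization(on_demand_hourly: float, reserved_hourly: float) -> float:
    """Fraction of hours an instance must actually be busy before the
    reservation beats paying on-demand (illustrative sketch)."""
    return reserved_hourly / on_demand_hourly

# at $0.06/h effective reserved vs $0.10/h on-demand, the reservation
# wins once the instance runs more than 60% of the time
assert abs(breakeven_utilization(0.10, 0.06) - 0.6) < 1e-9
```

Below that utilization the commitment is a loss, which is exactly the disincentive to optimize that started this subthread.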


The pay-as-you-go reductions give you more flexibility about when you make optimisations to your code though and help you handle uncertainty around this.

For example, if I have a new workload that won't fit on existing hardware with AWS, I can either: a) reserve the new instances, or b) not reserve on the assumption I will be able to optimise in the 'near' future.

In practice this is hard to know, but AWS forces you to make that call up front, whereas Google Cloud lets you defer it without being punished later for making the 'wrong' choice.


Right, but that requires your new workloads to fit well with the resources you have committed to. If you've committed to many small machines, they may not be well suited to an especially heavy workload. If you've skewed your commitment towards a large amount of memory per compute (or vice versa), you can find yourself with lots of RAM (or compute) and no natural use-case.

At the bottom line, commitment impedes change.


Your point is valid, thanks for the answer.


From a startup POV, I'll give you the other perspective.

I now have a significantly reduced opex that saves me a lot of money. If I grow very fast (and outgrow it)... I really don't care, because I will most likely have had a financing event.

If I dont grow fast enough to justify the spend, I have bigger problems.

In addition, I have to commend Google and Azure here: the way they do committed use is very flexible. They price it on the number of units (cores, RAM, whatever), so if you outgrow it, you still get the discount plus the full cost of the additional units. On AWS, if you outgrow, you have to frikking sell off your machines on their marketplace and buy new ones.

The only problem is that Google doesn't do this for databases (which are very expensive).


> On AWS, if you outgrow, you have to frikking sell off your machines on their auctions and buy new ones.

I don't believe this is entirely true. The AWS reserved credits are good within the machine family. So a t2 credit is good for all t2 instance sizes. Not as flexible as CPU/RAM credits, but more so than it was in the past.


True, but in multiples of instances: 2x t2.xlarge = 1x t2.2xlarge. But if you have 3x t2.xlarge, the swap can't partially cover 2x t2.2xlarge, so you have to sell one of the t2.xlarge.
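The instance-size arithmetic here follows AWS's published normalization factors for size-flexible reserved instances (nano = 0.25, doubling per size up through 2xlarge = 16). A quick sketch with a hypothetical helper:

```python
# AWS normalization factors (per size, within one instance family)
NORM = {"nano": 0.25, "micro": 0.5, "small": 1, "medium": 2,
        "large": 4, "xlarge": 8, "2xlarge": 16}

def ri_units(size: str, count: int) -> float:
    """Normalized units provided by `count` reservations of `size`."""
    return NORM[size] * count

assert ri_units("xlarge", 2) == ri_units("2xlarge", 1)      # 2x t2.xlarge == 1x t2.2xlarge
assert ri_units("xlarge", 3) - ri_units("2xlarge", 1) == 8  # 3x leaves an odd xlarge over
```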

It's not very nice; that's why they have a Reserved Instance Marketplace: https://aws.amazon.com/ec2/purchasing-options/reserved-insta...

What is new is convertible reserved instances. I haven't used them, so you may be right there. But I still maintain the GCP/Azure way is light-years better.


Which is why Amazon's own tool suggests buying smalls or nanos.

I do think GCP and Azure do it better, though.


I like Google's implementation, since you commit to a certain amount of CPU and RAM, not instance types. So you can commit to a 4-CPU/15 GB RAM system and, a year later, double the size of the VM, and you still get the cheaper price on the first 4 CPU/15 GB of RAM. And it's per project. (I would love it if I could purchase per organization.)
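A sketch of that resource-based commitment model (`split_billing` is a made-up name, not a GCP API): usage up to the committed amount gets the discount, and the overflow bills on-demand.

```python
def split_billing(used_vcpus: float, committed_vcpus: float):
    """Split vCPU usage into committed (discounted) and on-demand parts."""
    committed = min(used_vcpus, committed_vcpus)
    on_demand = max(0.0, used_vcpus - committed_vcpus)
    return committed, on_demand

# commit to 4 vCPUs, later double the VM to 8: the first 4 stay discounted
assert split_billing(8, 4) == (4, 4.0)
```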


Yes, GCP has the best billing model. You purchase CPU/RAM capacity and it works the same whether on-demand, sustained use discount, or commitments.

Separating capacity from instance type makes it much more natural and easy to use for your actual requirements.


From the Twitter Hadoop to GCP talk that Boulos linked:

- >500k cores

- >300PB storage

- >12,500 cluster size

- >1T messages per day

And there's also this other talk, "How Twitter Migrated its On-Prem Analytics to Google Cloud" - focused on their migration to BigQuery:

- https://www.youtube.com/watch?v=sitnQxyejUg

- 20 TB/ day of raw log data, >100k events/sec

- Loading ~1TB/hour into BigQuery.

- Serving 5,000+ complex queries / second. p99 ~300ms

Disclosure: I'm Felipe Hoffa and I work for Google Cloud https://twitter.com/felipehoffa.


At this scale I'm a little surprised Twitter saved money on the migration. My assumption was that at this scale the upfront cost of migration trumped any kind of difference in annual maintenance, especially considering you still need an SRE team anyway who knows big data well enough to solve any and all software-related problems.


At that scale the team itself likely costs peanuts.


Yeah, but you still need the IT team too. Not sure; from my conversations at cough cough other media giants, the economics of moving to cloud don't always work out at massive scale.


Oh, certainly. I was talking more about the fact that the monthly cost of 125,000 instances likely beats the cost of your IT team, whatever its size.

I’d be interested in their cost/benefit analysis for this one.


Yeah, what I mean is it's not just cloud costs vs. local hosting costs. You have to come up with different processes for SRE and IT (removing mundane hardware work, accounting for cloud idiosyncrasies), benchmark, make sure your hosting cloud can handle scaling up to some upper-bound estimate of your future user base, etc. At a company the size of Twitter these become legitimate concerns. I've been throttled on certain services of certain cloud providers at 600 data nodes, lol... Of course Twitter can negotiate higher allowances, but massive scaling doesn't happen with the flick of a switch; sometimes you need big, long-term, expensive contracts.


12,500, not 125,000. Still a lot ;)


Automation. The Twitter team is full of SV veterans who ran into many of these problems at scale years ago. At least it used to be.


> 1T messages per day

How? I mean there are 7B people on earth. Are people that active on Twitter? Or does "messages" mean something else here?


A message in this case probably doesn't mean a tweet. A message is probably just a message on the bus/in the queue, and any action on Twitter probably produces one: likes, retweets, tweets, follows, sign-ups, everything.


Oh if the likes are included then it makes sense.


Bots, news organizations, and teenagers. ~143 messages/day per human doesn't seem unreasonable.
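For what it's worth, the arithmetic checks out:

```python
# 1 trillion messages/day spread over ~7 billion people
per_person = 1e12 / 7e9
assert round(per_person) == 143  # ~143 messages/day per human
```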


There are tons of bots on Twitter


A message probably counts every tweet a person views at a minimum.


It's stunning how many resources we as a species are wasting on something of so little net positive value to humanity as Twitter.


I'm not a fan of or user of Twitter, but I can't deny that anything that allows people to communicate instantly, easily, and broadly is a benefit to humanity.


So in your opinion Facebook is a benefit to humanity also? Which conflicts with nearly every study of it ever done?


Only on HN would I see a comment start with

> Disclosure: I work at Google Cloud (and directly with Derek and the Twitter team).

Thanks for making this community awesome. I want to work at Google or with Google one day. One of the few companies I have always admired and always will.


It is an interesting point to bring up, but nonetheless discount pricing is not out of the question.

It is possible that while GCP gives good value based on the writing on the tin, someone else might beat that by a good margin under special terms.

That being said, I have reason to think very few could beat Google at the compute or storage game, and it would definitely be much harder to beat the big G on the networking side of things.


Companies like Twitter would get a significant discount from any provider, though I imagine AWS as market leader would have less incentive than GCP to go quite as deep.

Disclosure: I worked for GCP until early 2015, but I haven't worked for Google since then, have no inside info on this deal and am not relying on anything from my time at Google for this comment.


I wrote a blog post on our implementation of cloud sync and migration to Google Cloud / Firebase.

https://getpolarized.io/2019/01/03/building-cloud-sync-on-go...

I like the Google Cloud Storage API the most, honestly. Firebase is really nice, but the Storage API is very polished and basically does exactly what you need it to do without the complexity of S3.


I tried to use both GCP and AWS and got killed with egress charges from GCP. I suppose I've got to pick one and stick with it, so I picked AWS.


Both clouds charge significantly for egress. Were you sending a lot of network traffic to SaaS vendors that were hosted in AWS but not GCP? That would make your choice pretty natural, assuming you wanted to stick with those vendors.
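Egress on both clouds is metered per GB, so the damage is easy to estimate up front. A sketch with a hypothetical rate (real prices vary by tier and destination):

```python
def egress_cost_usd(gb_transferred: float, rate_per_gb: float) -> float:
    """Rough internet-egress bill; the rate is a placeholder, not a quote."""
    return gb_transferred * rate_per_gb

# moving 10 TB out at a hypothetical $0.08/GB
assert abs(egress_cost_usd(10_000, 0.08) - 800.0) < 1e-6
```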


Yes, the majority of my clients are on AWS, so I have to stick with it.


Well, most probably Twitter might not be able to cope with future attacks, or with the cost of mitigating them.


> Derek’s point about savings is about list price differences that result from total system performance (and not any sort of special discounting).

Obviously that's the thing you'd choose to highlight in a public Google-hosted presentation after a strategic partnership that is a bonanza in marketing for Google Cloud. It wouldn't work so well to tell a room full of engineers "We moved to Google cuz they gave us a fat discount, one you can't get because you're not Twitter!"

The fact that you allocated a top level URL for it is embarrassing (https://cloud.google.com/twitter/). Can you imagine if Amazon advertised https://aws.amazon.com/netflix/?


> The fact that you allocated a top level URL for it is embarrassing (https://cloud.google.com/twitter/). Can you imagine if Amazon advertised https://aws.amazon.com/netflix/?

It's a case study. Google didn't create a one-off URL for Twitter, but have the same URL pattern for all their public case studies:

- https://cloud.google.com/chilean-healthcare/

- https://cloud.google.com/now-ims/

- etc.

Amazon does exactly the same, except that their URL is slightly different; but I didn't know there was a URL police that determines which URLs are considered embarrassing for case studies and which are not:

https://aws.amazon.com/solutions/case-studies/netflix/


Reading most comments here I feel like nobody wants to give a tiny bit of credit to either Twitter or Google, which I find disenchanting.

It is true that any cloud provider would get good publicity by being able to say that Twitter runs on their cloud, but that is precisely why we shouldn't just shrug this off as a publicity/business decision: just as Google might have tried to persuade Twitter, the other cloud providers will most certainly have tried their best too. If Twitter still went with GCP, then I think it is only fair to assume that there must have been some real advantage in going with GCP. I don't think anyone who says this must have been a pure business decision is being entirely honest with themselves, because a cloud which cannot handle Twitter's volume is no good even if it were entirely free.

I have nothing to do with Twitter (actually, I don't even like Twitter because it is such a toxic social media platform) and I don't like many things which Google does, but as an engineer I can only say, after using Azure, AWS and GCP for many years in a commercial setting, that GCP is indeed years ahead. The quality, speed and reliability of GCP is second to none from my experience, and I honestly couldn't say the same about AWS, and especially not about Azure.


I'm not worried about GCP's quality or technical merits.

Google's management is what scares me. I'd never build a business around a company with so little follow-through, commitment to product longevity, or focus.


> The quality, speed and reliability of GCP is second to none from my experience

Can you be more specific about where GCP's reliability is years ahead of the others? I'm guessing the numbers will be pretty comparable on all platforms, so if we want to declare outright winners it would be useful to cite a number showing where you experienced this.


The quality, speed and reliability of GCP is second to none?

I hope this is a joke. As someone who started on GCP and got tired of the TERRIBLE quality and reliability of their docs, among other issues, saying that they are ahead is an absolute joke.

I've found some corner cases with AWS, but got quick resolution even on things they have in "beta" - and consistency from documentation through to system is high.

Secondarily, AWS seems to support even old tech FOREVER. I have a SimpleDB-based app. That tech is 9 years old now. Still ticking. When you are building stuff up over time and can't afford to rebuild on the new hotness every other year, this is nice.

Anyone know the revenue AWS and GCP generate? Seriously hard to believe GCP is so many years ahead in this space from my own experience.


The scale of these megacorps is crazy! I wonder how you even plan to move that much data. I know that Amazon has a service called Snowmobile for stuff like this. They probably did something similar here.

"You can transfer up to 100PB per Snowmobile, a 45-foot long rugged sized shipping container, pulled by a semi-trailer truck."

https://www.youtube.com/watch?v=8vQmTZTq7nw


This is Derek, from the video. Sorry I am just joining the conversation now.

For this use case, we set up 800 Gbps of interconnect with Google.


So at 300PB it took about a month to transfer? Sounds like an interesting project in itself.


Yeah! We’ll be operating in a hybrid model steady-state, so the links aren’t just for the one-time copy.


That's insane.

With the interconnect being faster than the network cards by an order of magnitude or more (I assume, unless there is special hardware), how is it structured to utilize all of the 800 Gbps?


We are using dedicated clusters to push data over the links.

A member of our team proposed a session for Google NEXT '19 that goes into the architecture we are using in a lot more depth. We're waiting to see if that talk is accepted or not, but in any case we're planning to share more soon.


Wow!


Migrations like these take time and happen step by step.

First you can try to replicate some data and move read instances. After that, the rest follows step by step.

It's really not possible to do that overnight, or even in months.

FYI: I'm currently migrating an infrastructure twice the size of Twitter's from AWS to Azure right now.


Isn't exporting 600TB of data from AWS ridiculously expensive?


Twitter's number is in PB, not TB. At that scale, assume that none of the normal pricing and procedures apply.


600TB would come out to around $50,000, which for Twitter is peanuts. That's their bandwidth pricing, though; I guess they'd use a Snowmobile export (if that exists?) instead.


For reference, 300PB (the amount referenced in the article) using the same pricing would cost $25M. Still might be a reasonable one-time cost for migration for a large corporation, though, even disregarding the "bulk" rates they'd see at that scale.


AWS Snowmobile cost is $0.005/GB per month which comes out to $1.5M/month for 300PB for however long the data is on the truck.
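The arithmetic behind these figures is easy to sketch (assuming a flat ~$0.083/GB egress rate near AWS's first pricing tier, and the $0.005/GB-month Snowmobile figure cited above; real bills would use tiered and negotiated rates):

```python
# Back-of-the-envelope cost sketch; rates are assumptions, not quotes.
EGRESS_PER_GB = 0.083            # rough AWS internet egress, first tier
SNOWMOBILE_PER_GB_MONTH = 0.005  # Snowmobile storage, per GB per month

def egress_cost(terabytes):
    """Dollars to send `terabytes` out over the network."""
    return terabytes * 1000 * EGRESS_PER_GB

def snowmobile_cost(petabytes, months=1):
    """Snowmobile storage cost for `petabytes` over `months`."""
    return petabytes * 1_000_000 * SNOWMOBILE_PER_GB_MONTH * months

print(f"600 TB egress: ${egress_cost(600):,.0f}")        # ~$50k
print(f"300 PB egress: ${egress_cost(300_000):,.0f}")    # ~$25M
print(f"300 PB Snowmobile: ${snowmobile_cost(300):,.0f}/month")  # $1.5M/month
```

All three numbers quoted in this sub-thread fall out of the same two rates.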


> Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.

https://what-if.xkcd.com/31/


The mention of "SneakerNet" reminded me of old times when we had a bunch of old Macintoshes (pre-OSX) connected with a proprietary local network called LocalTalk. It was so unreliable that we had a couple of 3.5" floppy disks that we used to transfer files when it didn't work; the disks were labelled "LocalWalk". :D


Is there anything new here? Twitter announced this move back in May of last year: https://blog.twitter.com/engineering/en_us/topics/infrastruc...

The video linked on the page is also from last August: https://www.youtube.com/watch?v=T1zjmNAuMjs

Throw a 2018 tag on this?


Correct me if I'm wrong, but this doesn't say what Twitter moved to the cloud. It could be literally anything, and may not in fact be core tweet or user data.

This strikes me as a team (or teams) outgrowing the storage options that were offered internally and choosing to outsource to fit their needs. Isolated use for one business case, and not indicative of a broad movement to the cloud on the part of the whole org.

I could be wrong, of course.

I'd like to know more. And I'd certainly like to know what the engineers at Twitter think.


The post on Twitter's side goes into more detail: https://blog.twitter.com/engineering/en_us/topics/infrastruc...

Relevant technical bits:

> Today, we are excited to announce that we are working with Google Cloud to move cold data storage and our flexible compute Hadoop clusters to Google Cloud Platform. This will enable us to enhance the experience and productivity of our engineering teams working with our data platform.

> Twitter runs multiple large Hadoop clusters that are among the biggest in the world. In fact, our Hadoop file systems host more than 300PB of data across tens of thousands of servers.


You are not wrong. The message here is that whatever was transferred, Google was able to handle hundreds of petabytes of data. It could just be billions of photos of cats.


billions of photos of cats is invaluable data, mind you


Well Twitter is now part of the official White House record, isn't it?


'cold storage'

It's going to be log data, backups, and users' original uploaded data (rather than re-encoded versions) that they hope will be valuable one day.


you missed this part "and our flexible compute Hadoop clusters"


Every post like this is just PR at the end of the day. Google’s goal is to prove to potential customers they can play with the big boys (AWS and Azure).


If you are using containers I would argue GCP is the place to be.

If you are still using full VMs, AWS is the place to be.

Azure is what I am running into when I get contracts in the midwest and I am really not sure why they are using it :)


If you're working with retail customers (retail giants included), AWS is pretty much banned in order to not pay a competitor, so you end up on Azure.


Heh.. forgot that startup I worked at and ended up at Rackspace because of that :)


The next global Twitter downtime?

Their credit card fails at Google Payments, the account gets marked for fraud, and no one in support can help them recover. Twitter is gone for good.


I don't think any decent size Google Cloud customer pays by credit card.


I think it was a joke.


This. GCP may be good and all, but it scares me to hell how 1) Liberal they are with blocking payment accounts 2) The fact that your payment account is tied to everything! If it goes down, your entire account is locked. Anything on GCP goes down. Hell GSuite goes down. Say goodbye to your accounts, your purchases and your identities.

If you use Google Domains or Fi as well, good luck getting your number and email. You're basically locked out of all your accounts, with minimal chance of recovery.


One of my customers recently (in the past month) had their credit card maxed out, and GCP/GSuite continued working for well over a week. We just kept receiving daily emails about payment issues until the card had its limit increased.

This was on the zero-support GCP/GSuite package.


This doesn't happen with medium or large sized orgs that use a reseller.


This has long been fixed.


Every few weeks there's a story about something in Google's automated machinery failing a customer with no recourse except to post an angry blog/twitter/hackernews note and hope it gets noticed. The core issue (bad and powerless customer service) isn't fixed, and all that changes is the particular way it may bite you.


When you're Twitter sized, that was never a problem to begin with. For mere mortals, that continues to be a problem.


Where's twitter gonna twit if they go down?


6-7 months isn't too long[1]. If this is the fix you were talking about, it happened last year. According to Wikipedia, GCP started in 2008 and became generally available in 2011 as App Engine.

It took them 7-8 years to solve billing?

[1]https://medium.com/@serverpunch/why-you-should-not-use-googl...


This is analytics data. It likely won't bring down the site, but would delay internal reports and ETL pipelines.


Not even close.


I've scoured the comments and article, but I might have missed it... What did Twitter migrate from?? Did they have their own data centre before moving to Google Cloud?



Another step backward for the decentralisation of the Internet... not that Twitter was decentralised in the first place, but it's unsettling to see Google taking over more and more of the "big pieces" of Internet.


Funnily enough, Twitter and Google were/are neighbors in a QTS datacenter in Atlanta.


Just moving across the street then? It's probably the fastest data transfer ever by physically moving all the hard drives!


Any inside stories on what Twitter discovered when they benchmarked Google against other providers? I imagine they must have done a very exhaustive evaluation.


For a company of Twitter's size, benchmarks likely played a small role compared to price negotiations. Being able to say Twitter runs in your cloud is worth a lot to any cloud provider. They're likely operating at steep discounts and probably played off all the major providers to get the best offer.


Agreed with all of what you said, but also, performance in some circumstances is a discount. One of the Google Cloud employees said "Derek’s point about savings is about list price differences that result from total system performance (and not any sort of special discounting)."

I don't have any internal numbers, but imagine if Twitter could meet its needs with an N-node Redshift cluster or an N/2-node Dataproc cluster, because Dataproc is so much faster at flipping the bits or whatever Twitter wants to do (probably network performance really). Google could charge more per node and still come out cheaper because of performance.
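A toy illustration of that point (all numbers here are made up, not anyone's actual pricing): a provider whose nodes are twice as fast can charge 50% more per node and still come out cheaper overall.

```python
# Hypothetical hourly pricing for two providers. Provider B's nodes are
# assumed to be 2x as fast, so half as many are needed for the same job.
nodes_a, price_a = 100, 1.00   # provider A: 100 nodes at $1.00/hr
nodes_b, price_b = 50, 1.50    # provider B: 50 nodes at $1.50/hr each

total_a = nodes_a * price_a    # $100/hr total on A
total_b = nodes_b * price_b    # $75/hr total on B, despite the higher list price
print(total_a, total_b)
```

That is how "list price" savings can exist without any special discounting.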


Or, Google thought the publicity was worth a gigantic discount.

(EDIT: The provided quote from a Twitter employee hints at this directly: "the savings")


I did a gig for <huge content producer/delivery company> and at the time mostly did AWS implementations (I work at AWS now full time, for what it's worth). The client was really interested in going with Google Cloud because they were promised essentially free storage, with no limit.

This is a company with more petabytes of video footage in their data lake than I've seen before and GCP wanted to give them free storage for the prestige (and possible upside, longterm) of having Big Company on their list.


"savings" does not imply discounts, mind you. GCP is actually cheaper on most fronts compared to AWS, Azure.

Quote from reply above: "Derek’s point about savings is about list price differences that result from total system performance (and not any sort of special discounting)"


Google may be playing the long game. I've seen an effect on my team after moving from physical hardware and centralized capacity provisioning to GCP with immediate gratification hardware. It's easier to spend money so we ask for more stuff more often.


It is the Jevons paradox [1]. I have seen it happening as well when I migrated a client from a standard cloud architecture to a serverless architecture.

>>> In economics, the Jevons paradox (sometimes Jevons effect) occurs when technological progress or government policy increases the efficiency with which a resource is used (reducing the amount necessary for any one use), but the rate of consumption of that resource rises due to increasing demand.

[1] https://en.wikipedia.org/wiki/Jevons_paradox


A comment from a Google Cloud employee seems to suggest otherwise.

> Derek’s point about savings is about list price differences that result from total system performance (and not any sort of special discounting).


There's a lot of ways that both can be true. For example, metering variables that are friendly to how Google does things.


This is an interesting case, sorta like Amazon going to NYC. If you give discounts and lower your profit you get something, plus PR. If you don't, you lose it all.

But if everyone asks for the Twitter discount, you're screwed. However, by next year no one can compare the offerings side by side: better machines, cheaper offerings, etc.


This is Derek.

We shared a bunch of stuff in this video: https://youtu.be/4FLFcWgZdo4

One of the interesting benchmarks was GridMix performance at scale. And overall performance for throughput-oriented workloads like Hadoop was generally strong. Most cloud benchmarking I have seen sticks to micro-benchmarks and single-node benchmarks, so they miss this.


It would be simplistic to assume such a decision is tech-driven. In my experience, it is mostly business-driven: whoever offers the best deal at the lowest price gets the business.

You'd be surprised, if you are a big enough customer (even better if you are a reputable one), how much catering a cloud provider would be willing to do.


Hadoop is an offline storage mechanism used for analytics; random-access benchmarks barely matter compared to the overall cost of storage+compute, so long as access times aren't atrocious.


Perhaps they did a solid analysis, but Twitter doesn't appear to be a rational actor. Look at Jack Dorsey's recent interview as an example, at one point he names 3 separate things as priority #1: https://www.huffingtonpost.com/entry/jack-dorsey-twitter-int...

Apparently this is a trend going back years, it doesn't seem to be doing anything good for Twitter or Square: https://sg.finance.yahoo.com/news/big-twitter-investor-defen...


People have to come to realize that nothing that gets said at this level should be taken literally. At the C-level of Fortune 500 companies, everything is crafted to create a narrative.

People claiming that Twitter or Facebook has a political bias are missing the real point. For example, I think that Zuck really, truly, doesn't give a fink about political bias. How could he? He'd never get any sleep! In his position, all he cares about is the money. If that takes making this or that comment or public statement or testimony to Congress, it doesn't matter. It's all geared towards and carefully crafted towards keeping the money flowing.

In that sense, he and the top brass at a place like Facebook are all political whores, so to speak. Selling their convictions for whatever keeps the spigot open.


You mean Square. Stripe is led by the Collison brothers but is in the same industry.


Sorry, all the payments companies just kinda blend together for me (having had to deal with too many of 'em). Corrected the post above.

On a sidenote, Square, Stripe, Payment Ninja, <insert payments startup> so far seems to be lipstick on the pig that is the underlying debit/credit networks, often with a hefty price for said lipstick. The US badly needs a better version of Zelle that supports business transactions.


For recurring/invoice transactions, you could use GoCardless: https://gocardless.com/en-ca/pricing/ We don't use debit/credit card networks but bank account->bank account networks like ACH, SEPA, etc.

Not yet available in the US. Coming soon though.


Well hopefully google does not decide to abandon google cloud :)


Hopefully Google doesn't abandon search!


Hopefully Google doesn't abandon human civilization, and choose intelligent machines over biological ones!


Search+Ads is to Google as Windows+Office is to Microsoft.

Those are the sacred cows.


Microsoft has been hacking off bits of dependency on Windows these days in order to push and support their cloud business, to the point I would argue Microsoft is aggressively trying to push people off of Windows+Office and onto Azure+365. Microsoft would far rather you use Linux on their cloud platform over having you use Windows literally anywhere else.

I don't think Windows counts as a sacred cow anymore at Microsoft.


They are king of search and distant third in the cloud game.


Huge win for Google. I hope this helps Twitter focus on features and not firefighting infrastructure issues.


Does Twitter need more features? Perhaps they could focus more on customer support and moderation.


A downvote button would be nice. The ability to edit a tweet would also be nice. It’s 2019 after all, and we can edit things on other platforms. But yes, moderation would be great. Maybe they can leverage Google’s machine learning hardware and smarts.


Just makes the eventual Google acquisition one bit easier.


What would Google gain from the acquisition that they don't already have?

AFAIK Twitter allows search indexing and is already a publisher on their network.


>>What would Google gain from the acquisition that they don't already have?

Google will "enhance the tweets" by showing 4-5 "relevant ads" before you even see the tweet. Tweethancer


They don't provide their realtime firehose for free, though.


Surely that cost isn't anywhere near $24B, which is the market cap for Twitter.


For $24B, you get Twitter (an asset), plus its Firehose. I was mostly pointing out that Twitter has data that Google does not.


Yeah, they've failed at every attempt at making a pure social network (they bought YouTube, which is a kind of social network). With Microsoft buying GitHub, it would be prudent of them to buy Twitter before someone else does.


That crossed my mind the minute I read the headline too. I can imagine folks at Google getting excited about this first step...


It will be interesting to watch. My hunch is that after looking at the shit storm that is happening around Facebook, Google may be having second thoughts about being in the social networking business (youtube excluded).


With Twitter's current state and Google's moves to kill their social side, this is unlikely. Why buy a moderate sized social network when you just killed your similarly sized social network?


If you're referring to G+, I don't think G+ was a "moderate sized social network" any more than twitter (although in opposite directions).

Also, imo, calling them "similarly sized" would be incorrect, except perhaps in signed-up users (of which Google got a ton from "log in with Google and your account is already set up"), not daily active users.


300 million monthly active users on G+ to Twitter's 335 million monthly active users, they were nearly identical in size.

https://en.wikipedia.org/wiki/Google%2B#Growth

https://en.wikipedia.org/wiki/Twitter (see sidebar)


Google may have been exaggerating these numbers. From https://www.blog.google/technology/safety-security/expeditin...

> Google+ currently has low usage and engagement: 90 percent of Google+ user sessions are less than five seconds


Is that counting actual use of Google Plus, or does it count anything that connects to it (comment boxes, YouTube, etc.)?

Hell, the Chromecast screensaver photos are hosted on Google Plus.


Don’t forget that they ran logins and notifications for everything through the Google+ interface and did things to juice the numbers like uncontrollable push notifications every time someone you’d ever exchanged email with joined (hi random person who bought a bookcase on Craigslist!).

The G+ numbers looked suspicious if you ran any sort of public website, where e.g. I saw metrics for referrals or shares at least an order of magnitude lower than Facebook or Twitter.


Wow, really?? In that case, what caused G+ to fail? I was always under the impression that it was due to (low) DAUs.


The Wikipedia article goes on to talk about G+'s very low engagement. I wouldn't be surprised if Twitter's engagement rates were somewhere near Facebook's.

> User engagement on Google+ was low compared with its competitors; ComScore estimated that users averaged just 3.3 minutes on the site in January 2012, versus 7.5 hours for Facebook.[22][23]

(https://en.wikipedia.org/wiki/Google%2B#Growth)


That makes you question whether they actually had 300 million "active" users.


High MAUs doesn't necessarily translate to high DAUs.


Indeed, and high DAUs doesn't translate to a high number of engaged users either. Just a 301 HTTP redirect through a domain as part of a chain can count as DAU and mean nothing.


yeah i visited g+ primarily for the active dev communities oriented around google products. def just once or twice a week and the activity on those "circles" was pretty slow


Slow growth/lack of user traction was what caused Google to pull the plug. It's a similar overarching problem to Twitter's, but more one of users being disinterested than actively repulsed by the content the most popular users of the service post...


Huh. I was under the impression that at a higher scale of operations, building your own cloud is more economical. I guess money is not an issue for Twitter.


They are moving batch jobs to GCP, because the occasional burst loads and the often-underutilized capacity there make using a flexible compute fabric (aka a cloud provider) a better choice.

If they are moving everything, well, Netflix did that too. Maybe opportunity cost, maybe having your engineers work on your core product instead of running VMs is cheaper altogether.


Much of Netflix’s spend would presumably be on CDNs, which they run themselves not on AWS.


Uh, correct me if I'm wrong, but all the CDNs that Netflix purchases also need to have a large storage cache backing them, right? Meaning each CDN Netflix uses for local caching also requires a colocated datastore to circumvent their centralized-bandwidth issue.


You're not wrong per se, but Netflix takes a similar route to Google Global Cache in that they provide the hardware and place it inside the networks of ISPs, etc. So it's a CDN in the sense that it's a distributed content delivery network, but not in the sense that they just use large traditional CDN providers.

Netflix provides massive storage boxes to ISPs that serve content from within the network of the ISP the user is connecting from. This can save the ISP a lot of external traffic so they generally want to do this to save costs and meet customer demands. YouTube does a similar thing.

They call it OpenConnect:

https://openconnect.netflix.com/en_gb/


I don't think Netflix puts their content on CDNs. A prerequisite for Netflix entering a country is that AWS has a datacenter in that country (or, for smaller countries, near that country). For example, Netflix only offered services in Australia when AWS opened an Australian datacenter. If Netflix were distributing their content via CDNs, then it wouldn't matter so much where AWS datacenters are, it would only matter where the CDN edge nodes were. I suspect that Netflix has far too much content to host it economically on a CDN.


The reason Netflix would potentially wait for a proximate AWS datacenter is because all of their apps, backend services, and interface UIs are served from EC2 instances; all of the actual content delivery is in fact handled by their FreeBSD-based OpenConnect appliances. In other words, no, Netflix doesn't put their content on other, third-party CDNs like Cloudfront, Fastly, Limelight Networks, etc., but they do absolutely serve it all from their own, custom-built CDN/hardware.

[1]: https://openconnect.netflix.com/ [2]: https://fosdem.org/2019/schedule/event/netflix_freebsd/ [3]: https://news.ycombinator.com/item?id=11129627 [4]: https://www.slideshare.net/aspyker/container-world-2018 [etc.]


not sure about GCP, but for other cloud providers Hadoop clusters are reserved hourly and don't actually work well for saving money on batched computes. This is due to Hadoop clusters requiring physical data colocation to meet performance needs (i.e. avoid non-rack-local maps) - even if you were to come up with a by-the-hour compute payment mechanism, you would need a by-the-long-term data storage mechanism that could persist to the point that you could spin up co-located compute capable of operating on that storage... not nearly a trivial problem


I'm guessing Twitter runs their own infra for the really heavy throughput stuff. They're big enough that I'd imagine economies of scale kicked in a long time ago.

It sounds like they moved stuff like logging/metrics and query systems onto GCP, which makes sense because as another poster said utilization is probably bursty.


Indeed, these workloads are bursty. They also tend to involve running lots of different processing frameworks over the same data, which makes the value proposition for separating compute and storage stronger.


This thread, and site, tells the tale. Google is using this for marketing purposes. They already have some trite 2 minute marketing ad running as well. The deal that Twitter received is going to come with extensive incentives unavailable to typical operations. Twitter also was likely able to play the typical providers against each other. 'Oh, well AWS is offering me [x,y,z]. Can you do better?'


Well yeah but that is the case for anyone large enough.


side note, if you want to know why distributed infrastructure engineering has been the place to be recently, think about the fact that engineers making $200-500k/year are making decisions that can save big corporations $1-5mm per year in compute/storage costs


The issue with GCP and with Google in general is support. I'm certain that's not an issue for Twitter as there are probably multiple people on call just for the Twitter deal. But it might mislead the average company down a bad path believing there is decent support with GCP. Any custom issues (like billing) and it's a headache dealing with Google.


What Twitter is paying for and the level of service they receive from Google, is very different from the typical GCP pricing and service level that would be available for say, your company...

Realistically, it can't be considered as the same service, can it?


One thing to remember with the benchmarks is that when evaluating a cloud provider, the benchmarks are useless if talked about with absolute numbers or with regards to hardware specs. Because they're so horizontally scalable, it's more a question of average cost per compute operation or cost per (peta) byte.

Both AWS and Google clouds are perfectly capable of running Twitter, and running any software that Twitter uses, even if the number of machines is different. The only "benchmark" applicable is actually the total negotiated cost of the machines required to get the job done.

So it's not about being able to do things on Google cloud with fewer processors or less RAM or faster hard drives - Google was willing to give them a lower total cost of ownership for reasons known only to Google.

Do not assume that Google (or anybody else) will give you a similar preferential deal. Ignore the "benchmarks".


I don't think it's purely a measure of cost per compute operation. That assumes you would run the same software on any cloud using plain VMs or a service that slightly abstracts them. That may be true in this case since Twitter will continue to use Hadoop/Spark, but sometimes you can get a real advantage from switching to a service that only one cloud provider offers. For instance, someone in these comments pointed out that Twitter already migrated their anayltics workload to BigQuery. Evaluating BigQuery vs Redshift vs something else is not as clear cut as cost per compute operation.


If you want to check which cloud is best for you, run your application on all of them, measure your average+median+p95 cost per whatever (tweet/post/reader/dollar of revenue/ticket/?). Then factor in which platform is easier to work with, has better tooling and community support, because until you hit a certain scale your time will always be more expensive.
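That measurement step can be sketched in a few lines (the per-unit costs below are made-up placeholders; `statistics.quantiles` needs Python 3.8+):

```python
import statistics

# Hypothetical cost per unit of work in dollars (e.g. daily spend divided by
# tweets served); substitute your own measured numbers.
costs = [0.00012, 0.00015, 0.00011, 0.00040, 0.00013, 0.00014, 0.00090, 0.00012]

mean = statistics.mean(costs)
median = statistics.median(costs)
# quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
p95 = statistics.quantiles(costs, n=20, method="inclusive")[-1]

print(f"mean=${mean:.5f} median=${median:.5f} p95=${p95:.5f}")
```

Run the same calculation on each provider and compare the distributions, not just the averages, since tail costs are usually what hurt.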

After you do all that, just start with Heroku and Postgres with some AWS Lambda, then move to ALB + AWS Fargate + RDS|Aurora / DynamoDB as you get bigger, then to NLB + ECS with a cluster of 20-80 On Demand and Spot Instances, then to a cluster of ARM instances on Spot.

If you need to build your own datacenter at that point, you'll know. And you would have built a 12Factor app to make all of the above work, so migrating will be easy.


How are they gonna move all that 300 PB of data to Google's data center? A fleet of semitrailer trucks?


For instance, AWS has 50-terabyte devices that they ship from your own data center to their data center: https://aws.amazon.com/snowball/


They also have the 100PB Snowmobile.

https://aws.amazon.com/snowmobile/


Disclosure: I work on Google Cloud (with Twitter).

Pipes! That data isn't static either. As Derek said in a sibling comment, they set up hundreds of Gbps of peering. Each 100 Gbps gets you a petabyte per day (roughly). Planning the move and building the tooling to keep stuff synchronized takes longer than the actual transfer once you've got, say, 800 Gbps (just 30 days ish). And again, you're going to have lots of data churn per month, so you want pipes anyway.
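The petabyte-per-day figure checks out on a napkin (assuming decimal units and 100% link utilization, which real links won't sustain):

```python
def transfer_days(petabytes, gbps):
    """Days to move `petabytes` of data over a `gbps` link at full utilization."""
    bytes_total = petabytes * 1e15
    bytes_per_second = gbps * 1e9 / 8
    return bytes_total / bytes_per_second / 86400

print(round(transfer_days(1, 100), 2))  # ~0.93 days per PB on 100 Gbps
print(round(transfer_days(300, 800)))   # ~35 days for 300 PB on 800 Gbps
```

So 300 PB over 800 Gbps lands right around the "30 days ish" mentioned above.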


I'm curious what happens to everyone at Twitter who was employed in managing their old data centre.


Well I'm sure they still have amazing career prospects just based on their resumes.


Sure, but what's the actual process for dealing with them? Will they be let go in small batches as the infrastructure moves over to GCP, or what?


This isn't all of Twitter's workloads moving to GCP. Seems like just cold storage and some aspects of their Hadoop workloads.


So what will happen to the small Cloud team in NY? I thought they were developing their own dc.


This really makes me wonder if Twitter finally made up their mind who they will merge with.


I too was thinking this, does this mean a future acquisition is not out of the question? Frankly, I find a Google owned Twitter to be a terrifying prospect. Not that Jack is doing a great job there...but I'd much rather they be owned by multiple companies in a partnership than by one company. Twitter, imo, is far far more powerful than Facebook in driving world events given the leadership and journalists on the platform.


I think by this announcement it is probably safe to say that even at the size of Twitter, it is still more economical to hand your operations to the bigger players.

Hosting your own infrastructure makes less and less sense going forward, the cloud is the new king now.


Probably not safe to conclude anything. Twitter has very different requirements than something like a bank, an insurance company, an auto manufacturer, etc.


I don't believe this is the nail in the coffin for hosting your own infrastructure. It should still be evaluated on a case by case basis.


More beneficial for Google I believe.


I know my comment won't move this discussion forward in any fashion, but I have to ask anyway.

What was this data, and why did they feel they needed to keep it?

I mean - 300PB is nothing to sneeze at. I can understand wanting to keep important data around (IP, for instance, or recent server logs for the last 30 days or even a year) - but this amount seems to go well beyond that.

In fact, it almost seems like they have stored every single tweet ever written. In addition to probably all of their logs, and who knows what else.

Why are they keeping all of that information? If it is all the tweets ever written - they don't have copyright to it, so such information wouldn't belong to them...

I'm not a big twitter user - can I search for a tweet on their system that I (or someone else) made say...5 years ago or longer? If this is a bunch of tweets - is that why they keep them around?

Are people in general aware of this? That is - if they are tweets - that Twitter has stored all of them, forever and ever, and that they are all still accessible (if not by the general public, then by law enforcement at the very least)?

If they are tweets - do users have any recourse at being able to remove those tweets (aside from those that they likely are legally bound to keep - I am thinking of the current POTUS's tweets, for instance)?

What is the ultimate purpose behind keeping all of that data? Even if it wasn't a bunch of tweets, whatever it is can't really be that useful otherwise, except for maybe the most recent stuff created in the past year, at best. Things like that - logs, etc - might be destined for a rotated "cold storage" and would ultimately "age out". Accounting records (financial data, etc) probably isn't that great a percentage of the data (if at all?) - but whatever was there, too, would likely need to be saved for a greater range - but even there maybe only 10-15 years worth (and it certainly wouldn't even comprise a fraction of the bulk).

So what is it? Why is it being kept? To what purpose? While 300PB isn't that great of an amount in physical terms (that is, the storage medium probably doesn't take up a great amount of space in a datacenter), it still isn't anything tiny, either (especially depending on what medium was used - I mean, I doubt they are storing it all on a ton of 256GB microSD cards - though that does bring up an interesting idea/thought to mind - but I digress).

I'm of the thought that companies shouldn't just "throw away" data - but it should be tempered by "importance to humanity over time". Not everything should be kept - but, for instance, it would have been nice if we (humanity) had kept around the original transmissions from the surface of the moon, or the blueprints to the rocket that took humans there. But we probably don't need all the various interoffice memoranda and fluff at the various NASA contractors (we probably don't need the accounting records - but BOMs might be useful).

That's just a couple of examples I can think of off the top of my head, but history is rife with instances of companies imploding or being bought out or restructured in such a manner that data is just "destroyed" instead of kept. Or lost, or otherwise made unretrievable to humanity.

Here with Twitter - an active company to be sure - we have the seemingly exact opposite - almost like r/datahoarders were in charge.

...and that should concern people a bit, regardless of which company it is - but especially a company like Twitter.


I think the vast majority of people would expect Twitter to keep all the tweets they've made...


This feels like marketing material. Twitter jumped over to GCP because of the offer they received.


The first clue was that the news was being hosted on https://cloud.google.com/


Next step - acquire twitter?


I definitely wonder if Twitter's management considered this. Any price for Twitter goes down by at least a few million if no major cloud migration is necessary.

That said, a few million is a rounding error on a price Google would pay for Twitter.


wow. talk about "loss of service challenges".


[Removed]


LOL


I have had a hard time dealing with twitter engineering since the account activity API debacle.



