Team in London started working the problem and called in reinforcements from elsewhere;
Upper management (me and one other person) got involved because the incident was serious and not being resolved fast;
I spoke with the network team in London, who seemed to have a good handle on the problem and how they were working to resolve it, but we decided to wake a couple of other smart folks up to make sure we had all the best people on it;
Problem got resolved through the diligence of an engineer in London getting through to and talking with DQE;
Some people went back to bed;
Tom worked on writing our internal incident report so that details were captured fast and people had visibility. He then volunteered to be point on writing the public blog (1415 UTC);
Folks in California woke up and got involved with the blog. A ton of people contributed to it from around the world, with Tom fielding all the changes and ideas;
Very senior people at Cloudflare (including legal) signed off and we posted (1958 UTC).
No one had an axe to grind with Verizon. We were working a complex problem affecting a good chunk of our traffic and customers. Everyone was calm and collected and thoughtful throughout.
Shout out to the Support team who handled an additional 1,000 support requests during the incident!
The incident itself and the lack of response (for HOURS) from Verizon's side are absolutely unacceptable. It's 2019: filtering ALL of your customers' routes against - at least - the IRR (including the legacy ones connected to the old router in the closet) and having a responsive 24/7 NOC contact in PeeringDB are a matter of course.
Proper carriers like NTT go above and beyond simple IRR filtering nowadays with things like peerlock (http://instituut.net/~job/peerlock_manual.pdf).
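The peerlock idea is simple: the big transit networks should never show up in AS paths learned from your ordinary peers, so you drop any route whose path contains one of them unless that specific neighbor is authorized to carry it. A minimal sketch of that filter logic (the ASNs and the per-neighbor policy here are illustrative, not anyone's real config):

```python
# Peerlock-style AS-path filtering sketch. LOCKED_ASNS are large networks
# whose routes should only arrive via explicitly authorized neighbors;
# the policy table below is hypothetical.

LOCKED_ASNS = {174, 701, 1299, 2914, 3356}  # example Tier-1 ASNs

# Which locked ASNs each neighbor is allowed to have in its AS paths.
ALLOWED = {
    2914: {2914},  # e.g. NTT itself may announce paths containing AS2914
}

def accept_route(neighbor_asn: int, as_path: list) -> bool:
    """Reject a route if its AS path contains a locked ASN that this
    neighbor is not authorized to carry."""
    allowed = ALLOWED.get(neighbor_asn, set())
    for asn in as_path:
        if asn in LOCKED_ASNS and asn not in allowed:
            return False
    return True

# A small peer leaking a path through a Tier-1 gets rejected:
print(accept_route(64500, [64500, 701, 13335]))  # False
# The locked network's own announcement is accepted:
print(accept_route(2914, [2914, 13335]))         # True
```

That one rule would have stopped this leak at the first peerlock-enabled border, since the leaked paths transited a Tier-1 via a small customer.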
AT&T uses RPKI and was completely unaffected: https://twitter.com/Jerome_UZ/status/1143276134907305984
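For anyone unfamiliar, RPKI origin validation (RFC 6811) boils down to checking an announcement against signed ROAs: right origin AS and no more specific than the ROA's maxLength means valid; covered by a ROA but failing either check means invalid (drop it). A toy sketch with made-up ROA data (real validators fetch cryptographically signed ROAs from the RPKI repositories):

```python
import ipaddress

# Illustrative ROA table: (prefix, maxLength, authorized origin ASN).
# These entries are examples, not real RPKI state.
ROAS = [
    (ipaddress.ip_network("104.16.0.0/12"), 16, 13335),
]

def validate(prefix: str, origin_asn: int) -> str:
    """Return 'valid', 'invalid', or 'not-found' per RFC 6811 semantics."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_net, max_len, roa_asn in ROAS:
        if net.version == roa_net.version and net.subnet_of(roa_net):
            covered = True
            if net.prefixlen <= max_len and origin_asn == roa_asn:
                return "valid"
    return "invalid" if covered else "not-found"

print(validate("104.20.0.0/16", 13335))   # valid: right origin, within maxLength
print(validate("104.20.0.0/16", 701))     # invalid: wrong origin AS
print(validate("104.20.32.0/20", 13335))  # invalid: more specific than maxLength
print(validate("198.51.100.0/24", 64500)) # not-found: no covering ROA
```

The "invalid: more specific than maxLength" case is exactly what a route-optimizer leak produces, which is why networks that drop invalids were unaffected.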
EdgeCast used to be my favourite CDN. Not sure how they are doing now.
I love the shaming of Verizon without the sugar coat. Divisive for sure, but a welcome one.