My clients were affected by this outage, but I was impressed by the way Cloudflare handled it:
[…] In the summer of 2019, Cloudflare experienced a significant outage. We had to take down our service and quickly put it back up globally. Additionally, the incident was highly visible since customer websites became inaccessible. As incident planning was solidly in place, the team simply followed the game plan:
- Get the right people in a conference room right away
- Mobilise the cross-functional group
- Give everybody a specific job to do (note taking, decision making, technical analysis, customer communication, and so on)
- Be available to customers immediately to explain and offer assistance
About a week later, we published a detailed post-mortem. This transparent, detailed communication generated a great deal of goodwill with customers and industry partners. It all came from having a clear incident response plan in place from the start.