Hava Blog

Design for failure: lessons learnt from the Sydney AWS outage

Posted by Rebecca Rumble on June 6, 2016

Sydney’s wild weather brought down an availability zone in AWS’s AP-SOUTHEAST-2 Region on Sunday night.

Websites went down, customer service calls went up, Twitter went nuts, engineers scrambled to find workarounds and management started asking “Why?”

If your website crashed, you know by now that it’s probably because your application wasn’t designed to survive the failure of an availability zone.

One outage should not be a reason to start thinking that the cloud isn’t right for you, or that you should switch service providers. But it should make you revisit your architecture.

Failure in cloud services is inevitable, regardless of your provider. Outages happen, so you must design for failure. The availability of any single piece of infrastructure should not determine the availability of your application. 100% uptime should be achievable even when your cloud provider has an outage, however large that outage is.
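To put rough numbers on the gap between infrastructure availability and application availability, here is a minimal sketch in Python. It rests on two loudly stated assumptions: the 99% per-zone figure is purely illustrative (it is not an AWS number), and zones are treated as failing independently, which real outages do not always respect. The function name `combined_availability` is made up for this example.

```python
def combined_availability(zone_availability: float, zones: int) -> float:
    """Chance that at least one of `zones` zones is up.

    Assumes every zone has the same availability and that zones fail
    independently of each other (a simplification).
    """
    if not 0.0 <= zone_availability <= 1.0:
        raise ValueError("zone_availability must be between 0 and 1")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    # The application is down only when every zone is down at once.
    all_zones_down = (1.0 - zone_availability) ** zones
    return 1.0 - all_zones_down


# Illustrative figure only: each zone is assumed to be up 99% of the time.
for zones in (1, 2, 3):
    print(f"{zones} zone(s): {combined_availability(0.99, zones):.6f}")
```

Under those assumptions, every extra zone multiplies the chance of a total outage by the chance that one more zone is also down. That is why an application spread across zones can be far more available than any single zone it runs in.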