{"id":40730,"date":"2023-10-20T15:09:40","date_gmt":"2023-10-20T15:09:40","guid":{"rendered":"http:\/\/startupsmart.test\/2023\/10\/20\/massive-amazon-web-services-outage-caused-by-human-error-startupsmart\/"},"modified":"2023-10-20T15:09:40","modified_gmt":"2023-10-20T15:09:40","slug":"massive-amazon-web-services-outage-caused-by-human-error-startupsmart","status":"publish","type":"post","link":"https:\/\/www.startupsmart.com.au\/uncategorized\/massive-amazon-web-services-outage-caused-by-human-error-startupsmart\/","title":{"rendered":"Massive Amazon Web Services outage caused by “human error” – StartupSmart"},"content":{"rendered":"
\"Watching<\/div>\n

A massive outage of Amazon Web Services on\u00a0Tuesday that led\u00a0to disruptions for huge numbers of cloud-based businesses and websites was caused\u00a0by a technician inputting an incorrect command, Amazon says.<\/p>\n

The Amazon Web Services (AWS) facility in Northern Virginia suffered outages on February 28, causing a number of websites that\u00a0rely on its services to\u00a0slow down or experience\u00a0intermittent outages.<\/p>\n

AWS is one of the biggest providers of cloud-computing services for websites and businesses all over the world.<\/p>\n

It offers\u00a0database storage, on-demand content delivery, and other essential infrastructure services.<\/p>\n

Prominent businesses like\u00a0Xero, Slack and Square were affected by the issue, as were websites like Atlassian\u2019s Bitbucket, GitHub, and Kickstarter,\u00a0according to VentureBeat.<\/em><\/a>\u00a0Hundreds of other websites were also affected.<\/p>\n

The services are now back online. AWS released a statement<\/a>\u00a0today\u00a0addressing the cause of the outage, blaming it on an incorrect command that was entered by a technician whilst debugging the company\u2019s Simple Storage Service (S3).<\/p>\n

\u201cAt 9:37AM PST, an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process,\u201d AWS said.<\/p>\n

\u201cUnfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended.\u201d<\/p>\n

This set off a chain reaction: the removed servers had been supporting two other S3 subsystems, both of which then had to be restarted.<\/p>\n

\u201cWhile these subsystems were being restarted, S3 was unable to service requests,\u201d AWS wrote.<\/p>\n

Companies should implement \u201ccontrol\u201d systems<\/h3>\n

The company has said it will be making \u201cseveral changes\u201d such as system safeguards as a result of the incident.<\/p>\n

This course of action\u00a0is essential, says Michael McKinnon, a cyber security expert at Sense of Security.<\/p>\n

\u201cWhat AWS is doing is implementing a control, which are mechanics [that] companies can use to prevent a certain action from happening,\u201d McKinnon told SmartCompany<\/em>.<\/p>\n

\u201cThere was really no protection in place beforehand to stop someone at AWS from taking that certain action [leading to the outage], so now they\u2019re implementing a technical control to prevent it.\u201d<\/p>\n

McKinnon believes the significant failure stemmed from the nature of AWS\u2019s infrastructure, a system built and maintained by the company itself.<\/p>\n

\u201cNaturally there will be an occasion like this, an unintended consequence of the system discovered by human error,\u201d he says.<\/p>\n

Unfortunately for SMEs, McKinnon says there\u2019s not much to be done about human error, even with \u201call the systems in place\u201d.<\/p>\n

\u201cHuman error is human error, and businesses will suffer. You can have all the systems in place, but at some point something will rely on human decisions to be made,\u201d he says.<\/p>\n

\u201cYou can\u2019t know all the exposure points, even if you do the best brainstorming possible you still won\u2019t identify all of them.\u201d<\/p>\n

\u201cGood businesses build systems around people\u201d<\/h3>\n

In situations like system outages, McKinnon says it\u2019s best to deal with the fallout \u201cswiftly and efficiently\u201d and implement any measures possible to stop it from happening again.<\/p>\n

As for what those measures are, \u201cgood businesses build systems around people\u201d, McKinnon says.<\/p>\n

\u201cLook at these things from the perspective of what the business could have done. Did you provide enough training?\u201d he says.<\/p>\n

\u201cIt\u2019s best to always build your system to cater to human error.\u201d<\/p>\n

Finally, when communicating outages or errors to customers, it\u2019s important for businesses to differentiate the scenario from a data breach or hack, McKinnon believes.<\/p>\n

\u201cIt\u2019s important to convey it as an outage not a breach, as it\u2019s a common misconception. \u2018Oh no they\u2019ve been hacked\u2019 is a default view, so it\u2019s good to reassure people,\u201d he says.<\/p>\n

\u201cPaint human error in a way where we can deal with it. It is and always will be a fundamental tech issue, and it\u2019s part of who we are.\u201d<\/p>\n

SmartCompany<\/em> contacted Amazon Web Services Australia but it had no\u00a0further comment on the outage.<\/p>\n

This article was originally published on SmartCompany<\/a>.<\/em><\/p>\n

","protected":false},"excerpt":{"rendered":"

A massive outage of Amazon Web Services on\u00a0Tuesday that led\u00a0to disruptions for huge numbers of cloud-based businesses and websites was<\/p>\n","protected":false},"author":2,"featured_media":61107,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/www.startupsmart.com.au\/wp-json\/wp\/v2\/posts\/40730"}],"collection":[{"href":"https:\/\/www.startupsmart.com.au\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.startupsmart.com.au\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.startupsmart.com.au\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.startupsmart.com.au\/wp-json\/wp\/v2\/comments?post=40730"}],"version-history":[{"count":0,"href":"https:\/\/www.startupsmart.com.au\/wp-json\/wp\/v2\/posts\/40730\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.startupsmart.com.au\/wp-json\/wp\/v2\/media\/61107"}],"wp:attachment":[{"href":"https:\/\/www.startupsmart.com.au\/wp-json\/wp\/v2\/media?parent=40730"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.startupsmart.com.au\/wp-json\/wp\/v2\/categories?post=40730"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.startupsmart.com.au\/wp-json\/wp\/v2\/tags?post=40730"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}