Several customers of Amazon Web Services (AWS) were hit hard by an outage because the platform's complexity and market dominance make it difficult to back up their data with other providers, Reuters reported.
Amazon said that an impairment of several network devices in its AWS Virginia data center region caused the prolonged outage on Tuesday. The outage temporarily interrupted streaming platforms Netflix and Disney+, trading app Robinhood Markets and even Amazon’s own e-commerce site, which makes heavy use of AWS.
An Amazon spokesperson said that the issues had been resolved.
The trail of damage from a network problem in a single region, called US-EAST-1, underscored how difficult it is for companies to spread their cloud computing across providers.
Amazon is the world’s biggest cloud computing firm with 32 percent of the overall market, according to Canalys. Rivals like Microsoft’s Azure (21 percent share) and Alphabet’s Google (8 percent share) are trying to lure AWS customers to use parts of their clouds, often as a backup.
Crafting a complex online service that can be shifted from one provider to another in case of emergency is far from simple, said Naveen Chhabra, a senior analyst with research firm Forrester. Rather than being a singular cloud, AWS is composed of hundreds of different services, from basic building blocks like computing power and storage to advanced services like high-speed databases and artificial intelligence training.
Any given website, Chhabra said, might use several dozen of those individual services, each of which must work for the site to function. It is difficult to make a backup on another cloud provider because some services are proprietary to AWS and some work very differently at another provider.
Another issue that makes it hard for businesses to diversify is that AWS makes it relatively cheap to send data into its cloud, but then charges higher prices for “egress fees” to get data out of its cloud to take to a rival.
“That amplifies issues like this (outage) when they happen,” said Matthew Prince, chief executive of internet security firm Cloudflare. “A more resilient cloud is one where egress fees are eliminated and customers can be multi-cloud. I think that would actually increase the faith customers have in the cloud.”
AWS itself has critical dependencies among its own services, which are linked in ways that can cause one to fail when another fails, said Angelique Medina, head of product marketing at Cisco Systems’ ThousandEyes. That is because AWS’s complex services are often built on top of its own more basic services. A problem in a basic function like networking can cascade through the services that depend on it.
AWS earlier said “the outage was affecting some of our monitoring and incident response tooling, which is delaying our ability to provide updates.”
Medina said AWS has critical services clustered in its US-EAST-1 region, where another outage last year also had a widely felt impact.
Chhabra, the Forrester analyst, said Amazon has done a lot of heavy lifting to make its own services resilient. What Amazon does not do for its customers is build applications in a way that can withstand an outage by tapping multiple locations or providers.
“It’s this tradeoff you always have between something that is decentralized, something that’s secure and something that’s useable,” said Charly Fei, product lead for Inter Blockchain Communication at The Interchain Foundation, which is focused on technologies for decentralizing computing. “It’s not something where you’ll ever get a perfect solution that gets all three.”