Blink is Amazon's line of surveillance cameras, which can be found in many households. I am a bit suspicious of the Blink cameras, since they tie into Amazon via a user account. But there's more. As it currently stands, Blink cameras are on strike in certain regions because an Amazon data center is “under fire.” Addendum: It seems that it wasn't a fire but a failure of a data center cooling unit. This shows how shaky the whole cloud construction is.
I don’t have a lot of information – just a fresh tweet from June 10, 2021, in which Kevin Beaumont asks if the Blink camera is down for other users.
Was it a fire in the data center?
After some research, Beaumont then found that an Amazon (AWS) data center in Europe was apparently on fire. The news about this fire can be found in the following tweet.
The details are given in the following status message: an increase in API errors is reported in an Availability Zone (euc1-az1) due to increased ambient temperatures.
Issue with EU Amazon Elastic Compute Cloud (Frankfurt)
That’s a really nice way of communicating that – the cloud is starting to sweat. You can also say it more drastically: In Frankfurt, the hut with the Amazon Web Cloud is on fire and nothing works anymore – even the surveillance cameras are on strike – no pictures of the fire.
They avoid writing about a fire, but I live in the suburbs of Frankfurt. It is midnight on a nice summer night and the temperature is about 20 degrees Celsius, which hardly explains “increased ambient temperatures.”
Addendum: A quick search on Twitter showed that some people were affected because their applications suddenly couldn't use the AWS cloud. One quote:
Engineers at numerous companies in Europe are now being woken up by their alerting systems to save businesses from catastrophic consequences.
Kinesis Data Streams, Kinesis Firehose, Amazon Relational Database Service, and AWS CloudFormation were also affected. This fits with the recent incident in which a software bug in the Fastly CDN knocked numerous websites off the Internet (see my German post Software-Bug Grund für Fastly-Ausfall).
Addendum 2: In the meantime I found the following status message on Twitter. Due to the increased ambient temperature, a few EC2 instances in a single Availability Zone (euc1-az1) in the EU Central 1 region have shut down (power cycled).
They write that the ambient temperature is dropping and that most of the affected EC2 instances are back in operation. They are also trying to reroute users to other Availability Zones.
A cooling unit was down
Addendum 3: The Register has had this article about the event online for three hours now. There they quote:
While temperatures continue to return to normal levels, engineers are still not able to enter the affected part of the Availability Zone. We believe that the environment will be safe for re-entry within the next 30 minutes, but are working on recovery remotely at this stage.
A 4:12PM update reported that staff were still unable to enter the site for safety reasons.
It seems that a toxic gas could have been released into the data center rooms, so it's not safe to enter. Or a CO2 fire suppression system has been triggered.
AWS has updated its incident report and revealed that the incident was caused by “failure of a control system which disabled multiple air handlers in the affected Availability Zone.” According to The Register, the air handlers that cool the data center stopped working. Then the “ambient temperatures began to rise” to unsafe levels, so AWS servers and networking kit shut down.
“Unfortunately, because this issue impacted several redundant network switches, a larger number of EC2 instances in this single Availability Zone lost network connectivity,” the AWS update adds. Then they wrote:
While our operators would normally have been able to restore cooling before impact, a fire suppression system activated inside a section of the affected Availability Zone.
When this system activates, the data center is evacuated and sealed, and a chemical is dispersed to remove oxygen from the air to extinguish any fire.
AWS staff had to wait for the local fire department to arrive and attest that the building was safe, as The Register reports. Once that signoff was secured, AWS says “the building needed to be re-oxygenated before it was safe for engineers to enter the facility and restore the affected networking gear and servers.” Well, that’s the risk of the cloud.
Amazon Elastic Compute Cloud (Amazon EC2)
According to this page, the Amazon Elastic Compute Cloud (Amazon EC2) web service provides secure, scalable computing capacity in the cloud. The service is designed to make cloud computing easier for developers. Amazon EC2’s simple web service interface is meant to make it effortless to obtain and configure capacity. It gives you complete control over your compute resources and lets you run them in Amazon’s proven compute environment. Highlights:
- An SLA-guaranteed availability of 99.99% in each Amazon EC2 region. Each AWS region consists of at least 3 Availability Zones.
- The AWS model of regions and Availability Zones is recommended by Gartner as a proven approach for running enterprise applications that require high availability.
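To put the 99.99% figure into perspective, here is a small Python sketch of the arithmetic behind such availability numbers. It is only an illustration under stated assumptions: the function names are my own, the measurement period (a 365-day year) is assumed because the highlight above does not say over which period the 99.99% is measured, and the per-zone value of 99.9% in the example is purely hypothetical. The second function also assumes that zones fail independently of each other, which real incidents do not always respect.

```python
# Assumption: a non-leap year as measurement period, for illustration only.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the downtime (in minutes) that a given availability still permits."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_minutes


def combined_availability(per_zone: float, zones: int) -> float:
    """Probability that at least one of `zones` zones is up.

    ASSUMPTION: zone failures are statistically independent. A shared
    control system or shared network gear would break this assumption.
    """
    if not 0.0 <= per_zone <= 1.0:
        raise ValueError("per_zone must be a fraction between 0 and 1")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    return 1.0 - (1.0 - per_zone) ** zones


if __name__ == "__main__":
    # 99.99% over a 365-day year leaves a bit less than an hour of downtime.
    print(f"{allowed_downtime_minutes(0.9999):.1f} minutes per year")
    # Hypothetical per-zone availability of 99.9%, spread over 3 zones.
    print(f"{combined_availability(0.999, 3):.9f}")
```

So 99.99% still leaves room for roughly 52.6 minutes of downtime per year, and the benefit of several Availability Zones only materializes if an application is actually spread across them and their failures are independent.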
Amazon operates two data centers for Amazon Web Services (AWS) in Frankfurt. That’s all I could find out so far.