Microsoft Azure Virtual Machines: Outage for 6 hours (2021/10/13)

On October 13, 2021, Microsoft Azure experienced a disruption affecting virtual machine services for more than six hours (05:12 UTC to 11:45 UTC). Here is some information about it.



The service disruption apparently did not affect all customers using Microsoft Azure Virtual Machines, but only a subset, as Microsoft writes on its status history page:

Virtual Machines – Mitigated (Tracking ID 0NC_-L9G)

Summary of impact: Between 05:12 UTC and 11:45 UTC on 13 Oct 2021, a subset of customers using Windows Virtual Machines may have received failure notifications when performing service management operations – such as start, create, update, delete. Deployments of new VMs and any updates to extensions may have failed. Non-Windows Virtual Machines, and existing running Windows Virtual Machines should not have been impacted by this issue. Additionally, services with dependencies on Windows VMs may have also experienced similar failures when creating resources.

Affected customers received error messages for Windows virtual machines when attempting service management operations such as start, create, update, and delete. Deploying new VMs and updating extensions could also fail.

Interestingly, virtual machines running an operating system other than Windows, such as Linux, as well as Windows virtual machines that were already running, were not affected by this glitch.

The cause was that calls made during service management operations failed because the required version data of an artifact could not be retrieved: the backend compute resource provider (CRP) was unable to retrieve a required VMGuestAgent from the repository.

The underlying reason was that the release architecture of the VM Guest Agent extension had just been migrated to a new platform, as part of a migration of the old backend systems for service management. The new platform uses the latest Azure Resource Manager (ARM) capabilities. The issue was resolved by setting the affected extensions to the correct expected level (public in this case).
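To make the failure mode easier to picture, here is a small toy model in Python. It is only a sketch and has nothing to do with Microsoft's actual code: the class names, the version number, and the "private" starting level are all assumptions made up for illustration. The status report only states that the correct level was public. The idea is that a lookup for the agent's version data fails as long as the artifact is not at the expected level, and succeeds once the level is corrected.

```python
from dataclasses import dataclass


class ArtifactNotFound(Exception):
    """Raised when version data for an artifact cannot be retrieved."""


@dataclass
class Artifact:
    name: str
    version: str
    level: str  # hypothetical visibility level, e.g. "public" or "private"


class Repository:
    """Toy artifact repository; not a real Azure API."""

    def __init__(self):
        self._artifacts = {}

    def publish(self, artifact):
        self._artifacts[artifact.name] = artifact

    def set_level(self, name, level):
        self._artifacts[name].level = level

    def get_version(self, name):
        # Assumption: only artifacts at the "public" level are visible
        # to the caller; anything else looks like a missing artifact.
        artifact = self._artifacts.get(name)
        if artifact is None or artifact.level != "public":
            raise ArtifactNotFound(name)
        return artifact.version


def start_windows_vm(repo):
    """Stand-in for a service management operation on a Windows VM."""
    try:
        version = repo.get_version("VMGuestAgent")
    except ArtifactNotFound:
        return "failed: agent version data unavailable"
    return f"started with VMGuestAgent {version}"


repo = Repository()
# Hypothetical state after the migration: artifact at the wrong level.
repo.publish(Artifact("VMGuestAgent", "1.0", level="private"))
before_fix = start_windows_vm(repo)

# The mitigation described above: set the level to public.
repo.set_level("VMGuestAgent", "public")
after_fix = start_windows_vm(repo)
```

In this model, VMs that are already running never call the lookup, which fits the observation that existing running Windows VMs were not affected.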


