Microsoft 365: Service degradation (August 5, 2024)

On August 5, 2024, another disruption is affecting Microsoft 365 features and services, probably for various users worldwide. Microsoft has already confirmed the problem under MO851360 in the status area for Microsoft 365 administrators. The effects seem to vary from user to user: some are experiencing sluggish access to Microsoft 365 features and services, while others are unable to access them at all.


German blog reader Andreas P. sent me the content of an email (thanks for that) in which Microsoft confirms the problem.

Users may be unable to access or use Microsoft 365 services and features

ID: MO851360

Issue type: Advisory

Status

Investigating

Impacted services

Microsoft 365 suite

Details

Title: Users may be unable to access or use Microsoft 365 services and features
User impact: Users may be unable to access or use Microsoft 365 services and features.
More info: Impacted services include but may not be limited to Microsoft Teams.
Current status: We're investigating a potential issue with Microsoft 365 and checking for impact to your organization. We'll provide an update within 30 minutes.

X also references the above status entry, while the Office 365 status page showed no glitches when I checked.

Addendum: The colleagues at Bleeping Computer report here that the disruption began at 18:22 UTC (19:22 CET) and lasted for over two hours. During this time, some services in North and South America were unavailable. In the Azure

Between 18:22 and 19:49 UTC on 5 August 2024, a subset of customers experienced intermittent connection errors, timeouts, or latency while connecting to Microsoft services that leverage Azure Front Door (AFD), as a result of an issue that impacted multiple geographies. The issue was limited to internal Microsoft services hosted on AFD, and did not impact external commercial customers using AFD.

What do we know so far?

We engaged several teams to investigate, and determined that the impact was caused by a recent configuration change made by an internal service team that uses AFD. Once that change was understood as the trigger event, we initiated a rollback of the configuration change which fully mitigated all customer impact.

How did we respond?

  • 18:22 UTC – Customer impact began, triggered by the configuration change.
  • 18:42 UTC – Monitoring detected issues, engineers engaged to investigate.
  • 19:23 UTC – Rollback of the change was initiated.
  • 19:25 UTC – Rollback of the change was completed.
  • 19:49 UTC – Customer impact mitigated, confirmed through telemetry.

What happens next?

Our team will be completing an internal retrospective to understand the incident in more detail. We will publish a Preliminary Post Incident Review (PIR) within approximately 72 hours, to share more details on what happened and how we responded. After our internal retrospective is completed, generally within 14 days, we will publish a Final Post Incident Review with any additional details and learnings.
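
The timeline in Microsoft's statement can be turned into durations with a few lines of arithmetic. The following is only a minimal sketch: the timestamps are copied from the statement above, while the dictionary keys and the helper names `minutes_between` and `summarize` are my own labels for illustration and do not come from Microsoft.

```python
from datetime import datetime

# Timestamps (UTC, 5 August 2024) copied from the timeline in Microsoft's statement.
# The key names are hypothetical labels chosen for this sketch.
TIMELINE = {
    "impact_began": "18:22",
    "detected": "18:42",
    "rollback_started": "19:23",
    "rollback_completed": "19:25",
    "mitigated": "19:49",
}


def minutes_between(start: str, end: str) -> int:
    """Return the whole minutes between two HH:MM times on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def summarize(timeline: dict) -> dict:
    """Derive the main intervals (in minutes) from the incident timeline."""
    return {
        "time_to_detect": minutes_between(timeline["impact_began"], timeline["detected"]),
        "detect_to_rollback": minutes_between(timeline["detected"], timeline["rollback_started"]),
        "rollback_duration": minutes_between(timeline["rollback_started"], timeline["rollback_completed"]),
        "rollback_to_mitigation": minutes_between(timeline["rollback_completed"], timeline["mitigated"]),
        "total_impact": minutes_between(timeline["impact_began"], timeline["mitigated"]),
    }


if __name__ == "__main__":
    for name, minutes in summarize(TIMELINE).items():
        print(f"{name}: {minutes} min")
```

By this calculation, the impact window in the Azure statement is 87 minutes (18:22 to 19:49 UTC), with 20 minutes until monitoring detected the issue. That is somewhat shorter than the "over two hours" reported above, which may refer to the Microsoft 365 advisory as a whole rather than to the Azure Front Door impact alone.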

