Why numerous IT systems around the world failed due to two errors on July 19, 2024

On July 19, 2024, numerous Windows IT systems failed worldwide. Operations at airports came to a standstill, banks could no longer work, trains were canceled, and companies (e.g. Tegut) sent their employees home because their IT systems were no longer working. However, this was not a cyber attack but the simultaneous occurrence of two independent errors. On the one hand, there was an outage in the Microsoft cloud that affected Microsoft 365 and related services. On the other hand, a faulty update for a security solution from the US provider CrowdStrike caused Windows systems to crash with a blue screen. Here is a brief follow-up.



IT systems, including critical infrastructure, down

What readers reported to me in the early hours of July 19, 2024, sounded dramatic. Blog reader Michael wrote in a short message: "Something is going on in the world… Berlin Airport and other airports are probably completely down". Readers subsequently sent me messages by email, via Facebook and on X. Someone wrote to me on X: "Microsoft must be having real problems right now. See the posts on Twitter under Microsoft. BSODs. Probably something to do with Crowdstrike."

Many outages worldwide

The German-language reports told me that the IT of critical infrastructure was also affected, airports among them. On X, a reader pointed me to the following tweet: at most airports in India everything is down, and the passenger received a handwritten boarding pass.

IT outage: handwritten boarding pass

A German article reports that airports, trains, radio stations etc. are affected. Others reported that gas stations and banks are also affected. It wasn't just German airports that were hit; I've also seen reports that hospitals have been affected. Websites of insurance companies, telecommunication providers and stock exchanges were reported to be partially unavailable. The checkout systems of several retailers were down.

Outages have also been reported for Amazon Web Services (AWS); Mercedes and Continental AG are affected as well. The same applies to the airlines Turkish Airlines, KLM, Lufthansa and Ryanair. It is unclear to me which outages are due to CrowdStrike and which are due to Microsoft 365 and the technical disruptions described below.



The Washington Post has also published a summary of the outages. The colleagues at Bleeping Computer have compiled further examples from the USA and around the world: very little is working at the moment, and even fire departments and police are affected.

BitLocker grounds administrators

A BlueSky post states that the CrowdStrike fix must be deployed manually on each individual machine. In the post "Windows systems throw BSOD due to faulty CrowdStrike update" I explained that there are other workarounds. However, these cannot always be applied. A reader pointed me to a tweet that reads:

Too bad everything I have is Bitlocker encrypted, including the Window Server hosting the Bitlocker recovery keys! Having to type a 48 digit number on 5K machines, my carpel tunnel will scream and my eyes will melt.

Someone is facing the mess of needing a BitLocker recovery key to boot each of 5,000 machines. The above post belongs to a thread, and a blog reader notes:

Ultimately, the systems can't be fixed via patch at all. This is manual work. Have fun when a few people look after thousands of distributed systems. The workaround is also only applicable if you have the Bitlocker unlock key, for example…

That sounds like quite a lot of weekend work and the systems are likely to be down for longer unless the administrators come up with another trick.

Two independent events

At first it looked like the Microsoft cloud problems and the faulty CrowdStrike update were related. But they are two separate events that happened to occur at the same time or overlapped.

  • The US cyber security company CrowdStrike rolled out an update for its security software that caused Windows hosts to crash with a blue screen (BSOD). This took numerous IT systems offline worldwide. I have compiled the details known so far in the blog post "Windows systems throw BSOD due to faulty CrowdStrike update".
  • At Microsoft, on the other hand, a configuration change in Azure led to problems with the cloud. As a result, the Microsoft 365 services (OneDrive, SharePoint etc.) were no longer accessible and the apps from this suite no longer worked. I have reported on this situation in the article "Worldwide outage of Microsoft 365 (July 19, 2024)".

The consequences are likely to be serious for some companies, and I assume that the global IT disruptions will lead to losses running into billions.

Wake-up call to rethink the IT infrastructure

I mentioned it earlier: the fact that even critical infrastructure such as airports or hospitals was affected should be a clear wake-up call for IT managers. The German professor Dennis Kipker, who works in the field of cyber security, wrote on X:

Absolute worst case and reason why we should not place unlimited trust in the cloud! From KRITIS to the state to industrial companies. The best example of how the mantra "IT from the socket" is a fallacy!

He is alluding to the fact that more and more IT services are being outsourced to the cloud and nobody really understands what is happening where. I would put it this way: fundamental principles of IT reliability are constantly being violated. A "single point of failure" means that nothing works at airports, petrol stations stop working, trains can no longer be dispatched, checkout systems in shops fail, and half of all companies are no longer able to work.

Imagine if there had been a supply chain attack on CrowdStrike and, instead of a faulty update, malware had been introduced that then loaded kill software or ransomware. When will the responsible CIOs realize that things can't go on like this? If a determined cyber actor pulls off something like this via a supply chain attack, they can plunge the world into the digital abyss. Or how do you see it?

Similar articles:
Worldwide outage of Microsoft 365 (July 19, 2024)
Windows systems throw BSOD due to faulty CrowdStrike update
CrowdStrike analysis: Why an empty file led to BlueSceen
Review of the CrowdStrike incident, the biggest computer glitch of all time

