Why is the quality of Windows 10 poor? In my blog posts I discuss some of the reasons from time to time. Now there is a new voice: former Microsoft employee @Barnacules explains in a YouTube video why, from his point of view, the quality of the current operating system, and especially of its updates, 'goes down the drain' and is currently 'in the basement'.
I was on the road for the weekend and therefore mostly offline. But German blog reader Rudi K. thankfully sent me an e-mail on Sunday informing me about the topic. I'll pull things together a bit, describe the ex-Microsoft employee's points of criticism and put the whole thing in context. Because it's not that new: the ex-Microsoft employee confirms what I've posted in several articles on my German blog in the past.
The view of the ex-Microsoft employee
In a video, former Microsoft employee Jerry Berg, aka @Barnacules, has presented his point of view on the quality problems of Windows 10. Background: Berg spent 15 years at Microsoft as a Senior Software Development Engineer in Test (SDET), where he worked as a developer and tester on Vista and Windows Server. When exactly he left Microsoft is not quite clear from the video. Here is the video by Berg – whose presentation style takes a little getting used to for my taste – but which shows some of the problems Microsoft's management has been trying to address with certain decisions.
I have summarized below the key points Berg identifies in his video as problems for the quality of Windows 10. But have a look at the video for yourself; maybe I missed something, weighted it wrong, or skimmed it too quickly. Here is my interpretation of the whole thing:
- Microsoft's core problem is that there used to be dedicated test departments for the products. There the products were tested intensively, so that most bugs were found. Only when these departments gave their OK was software released to end users. However, these test teams were let go and disbanded as part of the big wave of layoffs in 2014/2015. I had reported on this several times here in the blog (see the link list at the end of the article and my German article Scheitert Microsofts neuer Entwicklungs-Workflow? from 2016).
- According to Berg (and Microsoft's own statements), the company today relies more on short tests by developers in virtual machines, on telemetry data, and on errors being noticed by users (consumers) – or by Insiders.
- A cornerstone of the (missing) 'quality assurance' is currently the Windows Insiders, who are supposed to take over the task of the former test departments. But the problem is, the Windows Insiders don't report all errors. And when errors are reported, the Insiders often cannot provide Microsoft with the information needed to reproduce the error. In other words: Microsoft is flying blind and hopes to muddle through somehow according to the principle 'eyes closed and push on'.
- Berg explains in the video that telemetry is good for performance measurements and tuning. Sometimes bugs can also be detected via telemetry. This does not work if the crash happens outside an instrumented module. A core problem of Microsoft's whole telemetry approach: in case of crashes, mini dumps (a memory dump of the crashing module) may be sent. But these dumps often don't contain the required information. Complete kernel dumps with the data of all processes would be too huge; they are practically unmanageable and can't be transmitted via the Internet (60 to 120 GB of data would have to be transferred).
- So Microsoft gets only part of the information via telemetry, and in case of crashes the developers can only try to use that data to reproduce the crash. If the developers find a problem, a patch is developed. But the developers are never sure whether it really fixes the original issue. So the patch goes to the Windows Insiders for testing, in the hope that they will somehow report something back or that telemetry will provide new insights.
- The problem is that the mini-dump data is often not sufficient to locate the problem. So the Microsoft developers are looking for a needle in a haystack, and when they find and fix a bug, they can't be sure it was the bug that triggered the crash report.
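To make the mini-dump point above a bit more concrete: a Windows mini dump is a small, structured file whose header begins with the documented 4-byte signature 'MDMP', and it captures only a slice of process state rather than the full system memory that a complete dump would hold. Here is a minimal sketch in Python that checks only that signature; the file name is a hypothetical example, and this is an illustration, not part of Berg's video or Microsoft's tooling:

```python
# First 4 bytes of a Windows MINIDUMP_HEADER.
MINIDUMP_SIGNATURE = b"MDMP"

def is_minidump(path):
    """Return True if the file starts with the Windows minidump signature."""
    with open(path, "rb") as f:
        return f.read(4) == MINIDUMP_SIGNATURE

# Hypothetical usage:
# is_minidump(r"C:\Windows\Minidump\012345-6789-01.dmp")
```

A mini dump like this is typically well under a megabyte, which is exactly why it can be uploaded via telemetry – and also why, as Berg argues, it often lacks the context needed to reproduce a crash, unlike the 60 to 120 GB complete dumps he mentions.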
In the video, Berg suggests that Microsoft first check whether the collected telemetry data can be used to find a cause. If the cause of a crash cannot be determined from the mini dumps, the 60 to 120 GByte files of a complete dump should be requested from selected testers.
In the video Berg also explains that the Windows Insiders do not represent all the hardware and software in the field. After users got pretty upset about the first Windows 10 feature updates, Microsoft now rolls them out in waves, hoping to become aware of major issues early and avoid major catastrophes. If a machine doesn't get a feature update because Microsoft considers it 'not ready', this means, according to Berg, translated: Microsoft cannot tell from the telemetry data whether the new build will run on that hardware.
Ex-developer calls for a test department
Berg's bottom line is: many of the problems found today at some point via telemetry would have been found earlier by a test team. That's why the quality of earlier products was – at least as I perceived it – better. Today Microsoft can't find the bugs, or when something is found, they can't be sure whether it's a reported bug or just a new one. That's bad, of course, but it explains to me why, despite bug-fix updates, people keep finding that a problem isn't really fixed.
This would also explain why Microsoft fails to deliver properly tested updates even for its own hardware, such as the Surface models, and why serious bugs go undetected. Berg suggests that Microsoft should once again set up a test department to test Windows 10 on different hardware.
Windows as a Service as another problem
I can agree with much of what Berg addresses in his video. However, Berg ignores one relevant aspect: with Windows as a Service, Microsoft has lost sight of the needs of its users. Management and developers are chasing timelines that have nothing to do with reality. Features are presented that nobody needs – and after a while they are suddenly removed again.
We had a working Start menu in Windows 7. Now we have a Start menu in Windows 10 that causes constant trouble. We had working update controls in Windows up to Windows 8.1. Now Microsoft is experimenting with feature updates that users can defer, with optional updates, and with driver updates that users may control. But we are still not back to the comfort and transparency we had in Windows 7/8.1. It's interesting to see how Jerry Berg, from his experience, outlines what is causing the current Windows 10 update mess.
Pingback: Microsoft Eases Windows 10 Previews for Windows Server Update Services Users — Redmondmag.com
Wow, this all makes so much sense. As an IT and networking professional, I have noticed an increasing number of problems with Windows 10, Home and Professional, as well as Windows Server 2016. The 1903 update caused a lot of people a lot of problems.
That's why I'm moving away from Windows; it has become a huge uncontrolled mess.