The software bug disease

By Jean-Claude Elias - May 23, 2019 - Last updated at May 23, 2019

The recent admission by aircraft industry giant Boeing that there was indeed a bug in the software systems of its 737 Max airplane is sad, regrettable but — quite unfortunately — not a new phenomenon in the software industry.

There have been bugs, or programming errors, for as long as there has been software. Examples abound in the 60 years or so of modern computing history, in all fields of application. The severity and the eventual consequences range from very mild, even funny in some cases, to extremely dramatic in others, as in an air crash.

Public awareness of IT and high-tech today is such that people know very well what a software bug is, and how it enters a programme through human error. They also know that such errors are impossible to avoid completely, that the human error factor will always be there.

However, the obvious question that most of us would ask is this: why is the process of debugging a programme, of making it 100 per cent bug-free by having it extensively verified and thoroughly tested by several different teams, time and again, not applied systematically and scrupulously, as it should be, before critical applications, those on which lives depend, are actually put to work?

The question is simple but the answer is not. In fact, there are several answers.

The first is that debugging a large software system is a daunting task. Think of the most common of these systems, the one that most people in the world use: MS-Windows. It is a gigantic piece of code, developed over the years by different teams. Such code can run to millions of lines (yes, millions). The turnover of the successive development teams adds another dimension to the complexity, and therefore the fragility, of the project. We all know that MS-Windows is not 100 per cent bug-free, that each version may iron out previous mistakes but introduce new ones, and we accept living that way.

Moreover, software developers and coders sometimes argue that if they had to do complete and perfect testing before releasing their product, it would take years and years, and that this is therefore not a commercially viable approach. Such an argument should not be acceptable in what we call critical applications. It is one thing to experience a bug in your bank account, while chatting with a friend over Skype, or while watching a Netflix video stream, and quite another when it hits aboard a flying aircraft or in a busy hospital operating room.

The current trend towards driverless vehicles — among other heavily IT-dependent high-tech trends — implies more and more software, constantly more complex programming, and larger coding size. Will the industry rush to release software for unmanned cars for commercial reasons or will it spend the time it takes to produce clean, thoroughly tested applications?

A few weeks ago I downloaded and installed the very latest version of my software music player, J. River Media Centre. The manufacturer was very professional and warned us users that this new release might not yet be perfectly debugged, and that it was up to each of us either to go for it immediately or to wait for its complete debugging, keeping in mind that users who are willing to try it out as it is help debug it through their testing and feedback.

I bravely went for it from the start, experienced a couple of minor bugs (sudden shutdowns), and six weeks later I was happy to receive the fully debugged copy from the maker. This is perfectly fair and ethical, but again in no way acceptable in critical applications. My music player certainly does not qualify as anything “critical”.

The above is nothing new. The industry knows very well what is critical and what is not, and what methodology to apply in the first case. It is therefore not about knowing what is right, but about doing it by the book. Maybe it is a management issue after all! On a very large scale, certainly.
