Financial Times FT.com

Testers aim to kill off dreaded blue screens

By Mary Branscombe

Published: November 22 2006 02:00 | Last updated: November 22 2006 02:00

Sometimes the "blue screen of death" closes down Windows and all applications without giving the user a chance to save their data. Sometimes the hourglass on the PC - or the spinning "beachball of doom" on a Mac - goes round for so long that the only option is to reboot.

Either way, frustration, lost productivity and helpdesk support costs add up.

Hardware failure is inevitable: everything wears out in the end. But a team at Microsoft's research labs in Cambridge headed by Byron Cook is finding that software can be made much more reliable by changing the way code is tested - and by looking for approaches that might not solve the whole problem but solve enough of it to be useful.

Researchers have spent decades using techniques, known as formal methods, to prove whether software will do what it is supposed to. But they are complicated and cumbersome.

However, automation is making them easier to use: creating a test for the 30,000 lines of code in a device driver takes 10 minutes with an automated tool, but would take a lifetime if done manually.

There are already tools to check program code to make sure it will not crash, including a software verifier developed by Dr Cook's team that Microsoft uses to check device drivers.

Intel has invested heavily in tools to verify circuit designs and Airbus uses them to check the software that controls its aeroplanes, for example.

Now Dr Cook is concentrating on proving whether a program will finish what it sets out to do or get caught in a loop.

Although a program that hangs could cause the whole system to crash if it is using up more and more resources as it tries to complete its task, that will not always happen. The program will still be working in the background and there is the possibility it could eventually finish the task. On the other hand, it might be caught in an infinite loop caused by a flaw in the code.
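The difficulty of telling a slow program from a stuck one by observation alone can be seen in a small Python sketch. The function name `run_with_budget` and the toy loop are illustrative assumptions, not taken from any Microsoft tool: the loop counts up in fixed steps until it hits a target exactly, and an observer gives up after a fixed number of iterations.

```python
def run_with_budget(start, target, step, budget):
    """Run `while i != target: i += step`, giving up after `budget` iterations.

    Hypothetical helper for illustration only.
    Returns ("finished", n) if the loop exits after n iterations,
    or ("still running", budget) if the observer ran out of patience.
    """
    i = start
    iterations = 0
    while i != target:
        if iterations == budget:
            return ("still running", iterations)
        i += step
        iterations += 1
    return ("finished", iterations)


# A quick task: reaches 10 in five steps of 2.
quick = run_with_budget(0, 10, 2, budget=1000)

# A slow but healthy task: it would finish after 5,000 steps.
slow = run_with_budget(0, 10_000, 2, budget=1000)

# A flawed task: stepping by 2 from 0 can never land on 11.
stuck = run_with_budget(0, 11, 2, budget=1000)

print(quick, slow, stuck)
```

To the observer, the slow task and the flawed one look identical: both are simply "still running". Telling them apart requires reasoning about the code itself rather than watching it execute.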

Dr Cook's team is developing a tool called Terminator that can tell the difference by looking for "liveness properties" that guarantee that "good things eventually happen".

Even though many of the problems developers want to check for are to do with "liveness", it has not been seriously tackled before because computing pioneer Alan Turing showed that you cannot always tell the difference.

"Imagine all of the world's programs and imagine trying to sort them into the ones that will finish and ones that won't. What Turing proved is that you couldn't come up with a 100 per cent accurate procedure for separating those programs - and it scared people off," explains Dr Cook. While he has not solved what is known as "the halting problem" he believes there are still useful results: "From a practical point of view, it's a solved problem."
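One way to picture a checker that is useful without being complete is a tool that gives a firm answer for a narrow class of programs and says "unknown" for everything else. The Python sketch below is a toy under stated assumptions and is not how Terminator works: the `CountingLoop` class and `check_termination` function are hypothetical, and the only programs understood are integer loops of the form `while x > bound: x += step`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountingLoop:
    """Toy model (an assumption for illustration) of the integer loop
    `while x > bound: x += step`, for an arbitrary starting value of x."""
    bound: int
    step: int


def check_termination(program):
    """Answer only when the answer is certain; otherwise say "unknown"."""
    if not isinstance(program, CountingLoop):
        # Outside the class this checker understands: no claim is made.
        return "unknown"
    if program.step < 0:
        # While the loop runs, x - bound is a positive integer that shrinks
        # on every iteration, so it must reach zero or below: the loop exits.
        return "terminates"
    # With step >= 0, any start value above the bound never gets back under it.
    return "may run forever"


print(check_termination(CountingLoop(bound=0, step=-3)))
print(check_termination(CountingLoop(bound=0, step=1)))
print(check_termination("some arbitrary program text"))
```

The checker never gives a wrong "terminates" verdict, yet it makes no attempt to sort every possible program, which is the task Turing showed to be impossible. Widening the class of programs that get a definite answer is what makes such a tool practically useful.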

Initially, Terminator is checking device drivers - the programs that handle what happens when users move a mouse, type on the keyboard, scan in an image or use any other peripheral device. They are a good place to start because the code that governs them is short enough to deal with but also because they are a potential problem: "You want a driver to stop eventually; you don't want it to be stuck responding to a mouse movement forever," says Dr Cook.

"Drivers are critical to a computer. When a device driver hangs the whole machine is useless - and they are frequently buggy." Next, he plans to move on to the functions Windows provides to other applications.

Dr Cook's techniques have already made Windows more reliable. Within a few years of beginning his research on crashes, the resulting tools were finding bugs in device drivers written by Microsoft and by hardware manufacturers.

And Terminator has found problems in code going into Windows Vista. Teams within Microsoft are now building more tools based on his research.

"In the next three to four years we will build tools for every program a programmer writes and we'll be able to handle 95 per cent of those cases. In 10 years, all the software used will have been in some way affected by the general research area I work in."

Just as today's programs can be digitally signed to show who wrote them, in future they could include signatures promising they will not hang.

"You can imagine in the future a possible operating system that runs device drivers that have been proved correct in a fast mode, while drivers that have not might even run in a sandbox [a safe area for untrusted programs], for security."
