Ask programmers what they fear most, and one answer is the hidden fatal bug—a serious programming error so deeply embedded in their code that it could go undiscovered for years. Then, suddenly, a program thought to be rock-solid just crashes and burns.
Consider what happened early this year when the U.S. Air Force tried to fly the first brand-new F-22 fighter planes to Japan. Halfway across the Pacific, the high-tech jets crossed the international date line. Suddenly, their navigation and communications systems crashed and refused to reboot. The most advanced (and expensive) warplanes in the world were helpless. The crippled fighters had to follow their air tankers back to Hawaii. Later, programmers found a software bug that choked the onboard computers when the longitude instantly switched from 180 degrees west to 180 degrees east. Do similar bugs lurk in our PCs? Duh! And new bugs are coming from an unexpected source.
Mr. Edward A. Lee, an engineering professor at the University of California, Berkeley, has written a well-researched but troubling technical paper, “The Problem with Threads.” Lee found that multithreaded programs running perfectly on single-processor systems could behave differently or even freeze on multiprocessor systems. In fact, Lee’s group carefully wrote and tested an experimental program that ran without problems on a single processor for four years. Then, one day, it simply froze. The problem: a thread deadlock, or “threadlock,” that appeared when the program ran on dual processors. Threadlocks happen when two threads of execution within a program each hold a resource, typically a lock, that the other needs. Each waits for the other to let go, so neither can proceed, and the program can halt. Threadlocks are well known and programmers commonly test for them. What’s scary is that the university’s experts believed they had eradicated all possible threadlocks from their program, a belief reinforced by four years of trouble-free operation. But dual processors spelled doom.
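For readers who want to see the mechanism, here is a minimal Python sketch of a threadlock. It is not taken from Lee's paper or his group's program; the names run_pair, worker, lock_a and lock_b are invented for illustration. Two threads each grab one lock and then ask for the other's. A barrier forces the unlucky timing that real programs hit only occasionally, and a timeout stands in for the freeze so the demo finishes instead of hanging.

```python
import threading


def run_pair(opposite_order, timeout=0.2):
    """Run two threads that each need both locks.

    Returns a dict mapping thread name to True if that thread obtained
    its second lock, or False if it gave up after `timeout` seconds.
    """
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    # Hypothetical test aid: the barrier forces the bad interleaving that
    # would otherwise show up only rarely. It is used only in the
    # opposite-order case, where each thread starts with a different lock.
    sync = threading.Barrier(2) if opposite_order else None
    outcome = {}

    def worker(name, first, second):
        with first:
            if sync:
                sync.wait()  # both threads now hold their first lock
            # A plain second.acquire() would wait forever here; the
            # timeout lets the demo report the problem and move on.
            got = second.acquire(timeout=timeout)
            if got:
                second.release()
            if sync:
                sync.wait()  # keep holding `first` until both attempts end
        outcome[name] = got

    t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
    if opposite_order:
        t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
    else:
        t2 = threading.Thread(target=worker, args=("t2", lock_a, lock_b))
    for t in (t1, t2):
        t.start()
    for t in (t1, t2):
        t.join()
    return outcome


if __name__ == "__main__":
    print("opposite order:", run_pair(opposite_order=True))
    print("same order:    ", run_pair(opposite_order=False))
```

With opposite_order=True, neither thread gets its second lock: each is waiting on the one the other holds. With opposite_order=False, both threads take the locks in the same order and both succeed. Agreeing on a fixed lock order is one common defense, though it is only one of several and says nothing about how Lee's group approached their own code.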
From the software’s point of view, a dual-core or dual-threaded processor looks like a dual-processor system. PCs with dual-core processors are now commonplace, and quad-core processors are becoming popular. As programmers write increasingly complex multithreaded code to take advantage of these multicore systems, threadlocks that were hidden on smaller systems will begin causing more strange behavior and mysterious crashes.