After decades of fitful progress, parallel processing is suddenly hot and will soon be commonplace on ordinary PCs. For applications rich in data-level parallelism, performance is soaring by leaps and bounds.
Multicore CPUs from Intel and AMD are all good, but the game-changers are the next-gen GPUs from Nvidia and AMD/ATI. These chips are evolving from highly specialized 3D-graphics processors for games into broader computing engines for nongame software. Nvidia is leading the charge with a new GPU architecture that, for the first time, supports general-purpose computing as strongly as it supports graphics.
Nvidia’s new Fermi GPUs will support error-correction codes (ECC), up to one terabyte of addressable memory, concurrent kernels, and faster double-precision floating-point math. These features are largely unnecessary for 3D graphics but vital for high-performance general-purpose computing. (In fact, ECC slows down graphics processing, which is why it can be disabled in Fermi chips sold for the consumer market.)
With Nvidia’s CUDA development tools, programmers are accelerating some tedious media-processing tasks, such as video transcoding. CUDA uses the GPU’s programmable 3D-graphics shaders as massively parallel processor cores, delivering performance that today’s PC processors can’t match. In addition, GPUs are finding new applications in scientific computing, financial analysis, medical imaging, energy exploration, and engineering.
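For readers who want to see what data-level parallelism looks like in code, here is a minimal sketch in Python. It is not CUDA and makes no claim about how Nvidia's tools work internally; it only illustrates the underlying idea that the same operation is applied independently to every element, so the work can be split into chunks and handed to separate workers. The function names (`saxpy_chunk`, `parallel_saxpy`) and the `workers` parameter are hypothetical, chosen for this illustration. Note that in standard CPython the threads will not actually make this arithmetic faster; the point is the decomposition, which a GPU carries out across hundreds of cores.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_chunk(a, xs, ys):
    """Apply the same operation, a*x + y, to every element of one chunk."""
    return [a * x + y for x, y in zip(xs, ys)]


def parallel_saxpy(a, xs, ys, workers=4):
    """Split the inputs into contiguous chunks and process them concurrently.

    Illustrative only: each chunk is independent of the others, which is
    the property that data-parallel hardware exploits.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if not xs:
        return []
    # Ceiling division so every element lands in exactly one chunk.
    size = (len(xs) + workers - 1) // workers
    starts = range(0, len(xs), size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields chunk results in the order the chunks were submitted.
        parts = pool.map(
            lambda i: saxpy_chunk(a, xs[i:i + size], ys[i:i + size]), starts
        )
        return [value for part in parts for value in part]
```

Calling `parallel_saxpy(2, [1, 2, 3, 4, 5], [10, 20, 30, 40, 50])` divides the five elements among the workers and reassembles the per-chunk results in their original order. Because no element depends on any other, the number of workers can change without affecting the answer.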
Other developments are equally exciting. Microsoft’s DirectCompute brings a parallel-processing API to millions of mainstream PCs running Vista and Windows 7. The new OpenCL standard makes parallel programming easier and less proprietary. Apple’s Snow Leopard (Mac OS X 10.6) supports OpenCL and Apple’s Grand Central Dispatch technology (now open source), allowing programmers to distribute workloads across multicore CPUs and GPUs.
Intel is busy, too. With its own new GPU (Larrabee) on the way, Intel has acquired two small companies specializing in software tools for parallel programming—RapidMind and Cilk Arts. RapidMind is especially cool, because its software bridges GPUs, multicore x86 processors, and even IBM’s Cell Broadband Engine.
Parallel processing is spreading to the masses, and parallel-programming tools are catching up with parallel-processing hardware. When these trend lines finally converge, we’ll wonder why it took so long.
Tom Halfhill was formerly a senior editor for
magazine and is now an analyst for