2009 Technology Watch List
Posted 01/05/09 at 03:00:00 PM | by The Maximum PC Staff
We know, you just got your rig right where you want it, complete with a primo CPU, a kick-ass videocard config, and seemingly limitless storage. So forgive us if we dangle the temptation of better, faster hardware in front of your face. We’re just doing our job. Over the last few weeks, we’ve been grilling our industry contacts for news of what computing delights await power users in the months and years to come. And delightful the future is: CPUs with eight cores, GPUs that run games as a pastime, mobos with both SLI and CrossFire support, and hard drives so large your data will feel puny and inadequate. And that’s just part of it.
Look at it this way: Our 2009 technology preview gives you advance warning about the hardware that will soon occupy your dreams, so you can start saving your pennies and plotting your next upgrade path today.
[ Editor's Note: This feature originally ran in our December 2008 issue. The Intel Core i7 section has been expanded and incorporated into our full review, which you can find here. ]
CPUs
Intel takes a bold approach to processor architecture, multi-core computing

As a buttoned-down company, Intel rarely likes to make sweeping changes, but its upcoming Core i7 CPU is a major break from the past. Gone is the ancient front-side bus that connects all of the current-gen CPU cores. Instead, cores will communicate via a high-speed crossbar switch, and different CPUs will communicate via a high-speed interconnect.
Also on the outs is the need for an external memory controller. Intel, which has relied on gluing two dual-core chips together under the heat spreader to make its quad-core CPUs, is now placing all four cores on a single die.
Even overclocking, which was once verboten to even talk about within 10 miles of Intel’s HQ, is now automatically supported. Intrigued? You should be. Intel’s Core i7 is the most radical new design the company has undertaken in decades.
An Inside Job
One of Core i7’s most significant changes is the inclusion of an integrated memory controller. Instead of memory accesses going from the CPU across a relatively slow front-side bus to the motherboard chipset and finally to the RAM, an IMC will eliminate the need for a front-side bus and external memory controller. The result is dramatically lower latency than was found in the Core 2 and Pentium 4 CPUs.
Why can’t the memory controller on the motherboard simply be pushed to higher speeds to match an IMC? Remember, when you’re talking about a memory controller residing directly in the core, the signals have to travel mere millimeters across silicon that’s running at several gigahertz. With an external design, the signals have to travel out of the CPU to a memory controller in the chipset an inch or so away. It’s not just distance, either—the data is traveling across a PCB at far, far slower speeds than it would if it were within the CPU. In essence, it’s like having to go from an interstate to an unpaved, bumpy road.
Of course, if you’re an AMD loyalist, you’re probably bristling at the thought of Intel calling an IMC an innovation. After all, AMD did it first. So doesn’t that make AMD the pioneer? We asked Intel the same question. The company’s response: One: An IMC isn’t an AMD invention and, in fact, Intel had both an IMC and graphics core planned for its never-released Timna CPU years before the Athlon 64. Two: If AMD’s IMC design is so great, why does the Core 2 so thoroughly trash it with an external controller design? In short, Intel’s message to the AMD fanboys is nyah, nyah!
Naturally, you’re probably wondering why Intel thinks it needs an IMC now. Intel says the more efficient, faster execution engine of the Core i7 chip benefits from the internal controller more than previous designs. The new design demands boatloads of bandwidth and low latency to keep it from starving as it waits for data.
Memory A Trois
The Core i7 CPU is designed to be a very wide chip capable of executing instructions with far more parallelism than previous designs. But keeping the chip fed requires tons of bandwidth. To achieve that goal, the top-end Core i7 CPUs will feature an integrated tri-channel DDR3 controller. Just as you had to populate both independent channels in a dual-channel motherboard, you’ll have to run three sticks of memory to give the chip the most bandwidth possible. This does present some problems for board vendors though, as standard consumer mobos have limited real estate. Most performance boards will feature six memory slots jammed onto the PCB, but some will feature only four. On these four-slot boards, you’ll plug in three sticks of RAM and use the fourth only if you absolutely have to, as populating the last slot will actually reduce the bandwidth of the system. Intel, in fact, recommends the fourth slot only for people who need more RAM than bandwidth. With three 2GB DIMMs, though, most enthusiast systems will feature 6GB of RAM as standard.
Although it may change, Core i7 will officially support DDR3/1066, with higher unofficial speeds supported through overclocking. Folks hoping to reuse DDR2 RAM with Intel’s budget chips next year can forget about it. Intel has no plans to support DDR2 with a Core i7 chip at this point, and with DDR3 prices getting far friendlier to the wallet, we don’t expect the company to change its mind.
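To put rough numbers on the tri-channel design, here is a back-of-the-envelope sketch in Python. It assumes the standard DDR3/1066 figures of roughly 1,066 million transfers per second over a 64-bit (8-byte) channel, which the text above does not spell out. The function name is ours, and real-world throughput will land below these theoretical peaks.

```python
# Theoretical peak memory bandwidth for DDR3/1066 (illustrative sketch).
# Assumptions not stated in the article: about 1,066.67 million transfers
# per second per channel and a 64-bit (8-byte) wide channel.

DDR3_1066_MEGATRANSFERS = 1066.67  # million transfers per second
CHANNEL_WIDTH_BYTES = 8            # 64-bit channel


def peak_memory_bandwidth_gbs(channels):
    """Return the theoretical peak bandwidth in GB/s for a channel count."""
    per_channel = DDR3_1066_MEGATRANSFERS * CHANNEL_WIDTH_BYTES / 1000
    return per_channel * channels


for channels in (1, 2, 3):
    bandwidth = peak_memory_bandwidth_gbs(channels)
    print(f"{channels} channel(s): {bandwidth:.1f} GB/s")
```

Under those assumptions, filling all three channels works out to roughly 25.6GB/s of peak bandwidth, versus about 17.1GB/s for a dual-channel setup at the same memory speed.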
Hyper-Threading Revisited
A CPU core can execute only one instruction thread at a time. Since that thread will touch on only some portions of the CPU, resources that are not used sit idle. To address that, Intel introduced consumers to Hyper-Threading with its 3.06GHz Pentium 4 chip. Hyper-Threading, more commonly called simultaneous multi-threading, partitioned the CPU’s resources so that multiple threads could be executed simultaneously. In essence, a single-core Pentium 4 appeared as two CPUs to the OS. Because it was actually just one core dividing its resources, you didn’t get the same performance boost you would receive from adding a second core, but Hyper-Threading did generally smooth out multitasking, and in applications that were optimized for multi-threading, you would see a modest performance advantage. The problem was that very few applications were coded for Hyper-Threading when it was released and performance could actually be hindered. Hyper-Threading went away with the Core 2 series of CPUs, but Intel has dusted off the concept for the new Core i7 series because the transistor cost is minimal and the performance benefits stand to be far better than what the Pentium 4 could ever achieve.
Intel toyed with the idea of redubbing the feature Hyper-Threading 2 but decided against it, as the essential technology is unchanged. So why should we expect Hyper-Threading to be more successful this go around? Intel says it’s due to Core i7’s huge advantage over the Pentium 4 in bandwidth, parallelism, cache sizes, and performance. Depending on the application, the company says you can expect from 10 to 30 percent more performance with Hyper-Threading enabled. Still, Intel doesn’t force it down your throat because it knows many people still have mixed feelings about the feature. The company recommends that you give it a spin with your apps. If you don’t like it, you can just switch it off in the BIOS. Intel’s pretty confident, however, that you’ll leave it on.
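If you want to see what “appears as two CPUs to the OS” means in practice, the short Python sketch below does the arithmetic and then asks the operating system how many logical processors it sees. The helper function is our own, and the two-threads-per-core figure is an assumption based on how Hyper-Threading worked on the Pentium 4.

```python
import os


def logical_processors(physical_cores, threads_per_core):
    """Return how many CPUs the OS would see for a given core layout."""
    return physical_cores * threads_per_core


# Assumption: Hyper-Threading exposes two threads per physical core.
print("Pentium 4 with Hyper-Threading:", logical_processors(1, 2))
print("Quad-core Core i7 with Hyper-Threading:", logical_processors(4, 2))

# os.cpu_count() reports logical processors, not physical cores,
# and may return None if the count cannot be determined.
print("Logical processors on this machine:", os.cpu_count())
```

On a machine with Hyper-Threading switched off in the BIOS, the last line should report the physical core count instead.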
Tomorrow's Performance Today

You can’t recompile the world. That’s the lesson Intel learned with the Pentium 4, which kicked ass with optimized code but ran like a Yugo with legacy apps. And even with Intel’s nearly limitless resources, it couldn’t get every developer to update software for the P4.
Intel took those lessons to heart with the stellar Core 2 and continues in that vein with Core i7, which is designed to run even existing code faster. That’s largely due to the Hyper-Threading, massive bandwidth, and low latency in the new chip, but other touches also help.
Loops are a common programming construct for repeating the same task in a CPU. With Core i7, an improved loop detector will save power and boost performance by detecting larger loops and caching what the program asks for. Intel also polished its branch prediction algorithms. Branches are the yes/no decisions a CPU faces as it runs a program, and the CPU guesses which way each one will go so it can keep working ahead. If the CPU guesses wrong on what the program wants, the assembly-line-like pipeline inside the CPU must be cleared and the process started anew. New SSE4.2 instructions also make their way into Core i7, but they will be of little benefit to desktop users. Since Intel is designing the chip for server use as well, the new instructions are mainly to help speed up supercomputing and server-oriented workloads.
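Intel has not detailed its prediction algorithms here, so the Python sketch below uses a textbook stand-in instead: a two-bit saturating counter, which predicts “taken” or “not taken” based on recent history. It is not Core i7’s actual predictor. The function name and test patterns are ours, and the only goal is to show how wrong guesses pile up on unpredictable code.

```python
# Textbook two-bit saturating counter, NOT Intel's actual predictor.
# States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".

def count_mispredictions(outcomes, initial_state=2):
    """Count wrong guesses for a sequence of branch outcomes (True = taken)."""
    state = initial_state
    mispredictions = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            mispredictions += 1  # in hardware, this is a pipeline flush
        # Nudge the counter toward the real outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return mispredictions


# A loop branch: taken nine times, then falls through, repeated three times.
loop_pattern = ([True] * 9 + [False]) * 3
# An erratic branch that flips direction every time.
erratic_pattern = [True, False] * 15

print("Loop branch misses:", count_mispredictions(loop_pattern), "of", len(loop_pattern))
print("Erratic branch misses:", count_mispredictions(erratic_pattern), "of", len(erratic_pattern))
```

Both patterns contain 30 branches, but the steady loop costs this simple predictor only three wrong guesses while the erratic branch costs 15, which is why better prediction translates into fewer pipeline flushes.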

The main takeaway is that while some of the changes are radical, Intel is being pragmatic with its chip design—you won’t have to go out and buy new software to experience the CPU’s performance potential.
Making Better Connections
With a Hyper-Threaded quad core, even enthusiasts are unlikely to see the need for a multi-processor machine; nevertheless, one of the new features in Core i7 directly addresses a weakness in Intel’s current lineup when it comes to multi-CPU machines. Intel currently uses a front-side-bus technology to tie its multiprocessor machines together. As you might imagine, problems arise when two quad-core CPUs are sharing a single front-side bus. With so many cores churning so much data, the front-side bus can become gridlocked. Intel “fixed” this issue by building chipsets with two front-side buses. But what happens when you have a machine with four or eight CPUs? Since Intel couldn’t keep adding front-side buses, it took another page from AMD’s playbook by building in direct point-to-point connections over what it calls the QuickPath Interconnect. Server versions of Core i7 feature two QPI connections (desktop versions get just one), which can each talk at about 25GB/s, or double what a 1,600MHz front-side bus can achieve. AMD fans, of course, will point out that the fastest iteration of AMD’s chip-to-chip conduit, dubbed HyperTransport 3.1, is twice as fast as the current QPI.
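The “about 25GB/s” and “double” figures can be checked with some quick arithmetic, sketched here in Python. The bus widths and the QPI transfer rate are our assumptions rather than numbers from the paragraph above: a 64-bit front-side bus, and a QPI link running at 6.4 billion transfers per second with 2 bytes of data per transfer in each direction.

```python
# Peak-bandwidth arithmetic for a front-side bus versus a QPI link.
# Assumptions not stated in the article: the FSB is 64 bits (8 bytes) wide;
# a QPI link moves 2 bytes per transfer in each direction at 6,400 million
# transfers per second.

def peak_gbs(megatransfers_per_sec, bytes_per_transfer):
    """Return theoretical peak bandwidth in GB/s."""
    return megatransfers_per_sec * bytes_per_transfer / 1000


fsb_gbs = peak_gbs(1600, 8)          # 1,600MHz front-side bus
qpi_one_way_gbs = peak_gbs(6400, 2)  # one direction of a QPI link
qpi_total_gbs = 2 * qpi_one_way_gbs  # both directions combined

print(f"Front-side bus: {fsb_gbs:.1f} GB/s")
print(f"QPI, both directions: {qpi_total_gbs:.1f} GB/s")
print(f"QPI advantage: {qpi_total_gbs / fsb_gbs:.1f}x")
```

Note that the QPI total counts traffic in both directions at once, while the front-side bus is a single shared path.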
QPI combined with the on-die memory controller will also make an Intel server or workstation a NUMA, or non-uniform memory access, design. Since each CPU has a direct link to its own bank of memory, what happens if CPU 1 needs to access something that’s stored in the RAM being controlled by CPU 2? In this case, it must go across the QPI link to the second CPU’s memory controller, which fetches the data from its RAM. This will slow things down a bit, but Intel says its tests indicate that even in this scenario, the memory access is still faster than what is possible with the current front-side-bus multiprocessor design.
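A toy model makes the “non-uniform” part easier to picture. In the Python sketch below, two CPUs each own half of the system’s memory, and a request for the other CPU’s memory pays an extra penalty for the QPI hop. Every number here is hypothetical and chosen only for illustration; none of them are Intel’s figures.

```python
# Toy NUMA model with two CPUs (numbered 0 and 1 here).
# All sizes and latencies are hypothetical, picked only for illustration.

LOCAL_LATENCY_NS = 60    # hypothetical: a CPU reading its own RAM
QPI_HOP_PENALTY_NS = 40  # hypothetical: extra cost of crossing the QPI link
MEMORY_PER_CPU_GB = 6    # hypothetical: RAM attached to each CPU


def owning_cpu(address_gb):
    """Return which CPU's memory controller owns this address (in GB)."""
    return int(address_gb // MEMORY_PER_CPU_GB)


def access_latency_ns(requesting_cpu, address_gb):
    """Return the modeled latency for a CPU touching a given address."""
    if owning_cpu(address_gb) == requesting_cpu:
        return LOCAL_LATENCY_NS
    # Remote access: cross QPI, then use the other CPU's memory controller.
    return LOCAL_LATENCY_NS + QPI_HOP_PENALTY_NS


print("CPU 0 reads its own RAM:", access_latency_ns(0, 2), "ns")
print("CPU 0 reads CPU 1's RAM:", access_latency_ns(0, 8), "ns")
```

The same address costs a different amount depending on which CPU asks for it, which is the whole idea behind NUMA.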
The Power Within
It’s a known fact that overclocking can decrease the life of your CPU; thus, Intel has always discouraged end-users from overclocking its CPUs. With Core i7, Intel reverses its stance and actually overclocks the CPU for you! Of course, Intel would not describe its Turbo mode as overclocking, and, technically, it isn’t. While pushing your 2.66GHz Core 2 Quad to 3.2GHz would likely strain its thermal and voltage specs, the new Core i7 CPUs feature an internal power control unit that closely monitors the power and thermals of the individual cores.
This wouldn’t help by itself, though. Intel designed the Core i7 to be very aggressive in power management. With the previous Core 2, power to the CPU could be lowered only so far before the chip would crash. That’s because while you can cut power to large sections of the execution core, the cache can tolerate only so much decrease in power before blowing up. With Core i7, Intel separates the power circuits, so the cache can be powered independently. This lets Intel cut power consumption and thermal output even further than before. Furthermore, while the Core 2 CPUs required that all the cores be idle to reduce voltage, with Core i7, individual cores can be turned off if they’re not in use.
Turbo mode exploits the power savings by letting an individual core run at increased frequencies if needed. This again follows Intel’s mantra of improving performance on today’s applications. Since a majority of today’s applications are not threaded to take full advantage of a quad core with Hyper-Threading, Turbo mode’s “overclocking” will make these applications run faster. For more information on how you’ll set up Turbo mode, read our sidebar below.










