Core i7 Dissected and Benchmarked! Does Intel’s Next-Generation Chip Live Up to the Hype? Hell Yeah!

Maximum PC Staff

Core i7 Up Close

Tick tock? More like ding-dong, mutha—shut your mouth. What baby? We’re talkin’ about Core i7.

Our apologies to Isaac Hayes, but if he were alive, we’re almost certain he would have been tapped to hammer out a theme song for Intel’s most significant CPU launch in, well, ever.

Why is this CPU more significant than the 8088, Pentium, or Pentium M? As the second new chip produced after a series of embarrassing losses to archrival AMD, the Core i7 will tell the world whether Intel is prepared to ride the momentum of its Core 2 launch with another winning chip or whether it's content to rest on its laurels, as it did with the Pentium 4.

Core i7 also represents a major new direction for Intel, which has stubbornly clung to the ancient front-side-bus architecture and discrete memory controller for years. Indeed, with its triple-channel integrated DDR3 memory controller and chip-to-chip interconnect, the block diagram of a Core i7 looks more like an Athlon 64 than a Core 2 chip.

Intel actually has three quad-core Core i7 CPUs ready: the top-end 3.2GHz Core i7-965 Extreme Edition, the performance-oriented 2.93GHz Core i7-940, and the midrange 2.66GHz Core i7-920. For the most part, all three are exactly the same except for clock speeds, multiplier locking (only the Extreme is unlocked), and QuickPath Interconnect speed. See the chart below for details.

The bigger issue is how Core i7 performs. To find out, we ran the Extreme 965 against AMD’s fastest proc as well as Intel’s previous top gun in a gauntlet of benchmarks. Read on for the results.

Intel takes a bold approach to processor architecture, multi-core computing

As a buttoned-down company, Intel rarely likes to make sweeping changes, but its upcoming Core i7 CPU is a major break from the past. Gone is the ancient front-side bus that connects all of the current-gen CPU cores. Instead, cores will communicate via a high-speed crossbar switch, and different CPUs will communicate via a high-speed interconnect.

Also on the outs is the need for an external memory controller. Intel, which has relied on gluing two dual-core chips together under the heat spreader to make its quad-core CPUs, is now placing all four cores on a single die.

Even overclocking, which was once verboten to even talk about within 10 miles of Intel's HQ, is now automatically supported. Intrigued? You should be. Intel's Core i7 is the most radical new direction the company has taken in decades.

An Inside Job

One of Core i7’s most significant changes is the inclusion of an integrated memory controller. Instead of memory accesses going from the CPU across a relatively slow front-side bus to the motherboard chipset and finally to the RAM, an IMC will eliminate the need for a front-side bus and external memory controller. The result is dramatically lower latency than was found in the Core 2 and Pentium 4 CPUs.

Why can’t the memory controller on the motherboard simply be pushed to higher speeds to match an IMC? Remember, when you’re talking about a memory controller residing directly in the core, the signals have to travel mere millimeters across silicon that’s running at several gigahertz. With an external design, the signals have to travel out of the CPU to a memory controller in the chipset an inch or so away. It’s not just distance, either—the data is traveling across a PCB at far, far slower speeds than it would if it were within the CPU. In essence, it’s like having to go from an interstate to an unpaved, bumpy road.

Of course, if you’re an AMD loyalist, you’re probably bristling at the thought of Intel calling an IMC an innovation. After all, AMD did it first. So doesn’t that make AMD the pioneer? We asked Intel the same question. The company’s response: One: An IMC isn’t an AMD invention and, in fact, Intel had both an IMC and graphics core planned for its never-released Timna CPU years before the Athlon 64. Two: If AMD’s IMC design is so great, why does the Core 2 so thoroughly trash it with an external controller design? In short, Intel’s message to the AMD fanboys is nyah, nyah!

Naturally, you’re probably wondering why Intel thinks it needs an IMC now. Intel says the more efficient, faster execution engine of the Core i7 chip benefits from the internal controller more than previous designs. The new design demands boatloads of bandwidth and low latency to keep it from starving as it waits for data.

Memory a Trois

The Core i7 CPU is designed to be a very wide chip capable of executing instructions with far more parallelism than previous designs. But keeping the chip fed requires tons of bandwidth. To achieve that goal, the top-end Core i7 CPUs will feature an integrated tri-channel DDR3 controller. Just as you had to populate both independent channels in a dual-channel motherboard, you’ll have to run three sticks of memory to give the chip the most bandwidth possible. This does present some problems for board vendors though, as standard consumer mobos have limited real estate.

Most performance boards will feature six memory slots jammed onto the PCB, but some will feature only four. On these four-slot boards, you’ll plug in three sticks of RAM and use the fourth only if you absolutely have to, as populating the last slot will actually reduce the bandwidth of the system. Intel, in fact, recommends the fourth slot only for people who need more RAM than bandwidth. With three 2GB DIMMs, though, most enthusiast systems will feature 6GB of RAM as standard.

Although it may change, Core i7 will support DDR3/1066, with higher unofficial speeds supported through overclocking. Folks hoping to reuse DDR2 RAM with Intel’s budget chips next year can forget about it. Intel has no plans to support DDR2 with a Core i7 chip at this point, and with DDR3 prices getting far friendlier to the wallet, we don’t expect the company to change its mind.
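
For the math-minded, here's a rough back-of-the-envelope sketch of what tri-channel DDR3/1066 is theoretically good for, assuming the standard 64-bit-wide DDR3 channel; these are peak numbers, not what you'll see in a benchmark.

    /* Back-of-the-envelope peak bandwidth for a tri-channel DDR3/1066 setup.
       Assumes the standard 64-bit (8-byte) DDR3 channel; real-world numbers are lower. */
    #include <stdio.h>

    int main(void) {
        const double transfers_per_sec = 1066e6; /* DDR3/1066 = 1,066 million transfers/sec */
        const double bytes_per_transfer = 8.0;   /* 64-bit channel width */
        const int channels = 3;                  /* Core i7's tri-channel controller */

        double per_channel = transfers_per_sec * bytes_per_transfer / 1e9;
        printf("Per channel:  %.1f GB/s\n", per_channel);            /* ~8.5 GB/s  */
        printf("Tri-channel:  %.1f GB/s\n", per_channel * channels); /* ~25.6 GB/s */
        return 0;
    }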

Hyper-Threading Revisited

A CPU core can execute only one instruction thread at a time. Since that thread will touch only some portions of the CPU, resources that are not used sit idle. To address that, Intel introduced consumers to Hyper-Threading with its 3.06GHz Pentium 4 chip. Hyper-Threading, Intel's take on simultaneous multi-threading, partitioned the CPU's resources so that multiple threads could be executed simultaneously. In essence, a single-core Pentium 4 appeared as two CPUs to the OS. Because it was actually just one core dividing its resources, you didn't get the same performance boost you would receive from adding a second core, but Hyper-Threading did generally smooth out multitasking, and in applications that were optimized for multi-threading, you would see a modest performance advantage.
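
If you want to see what the OS sees, here's a minimal sketch (assuming Linux or another POSIX system) that prints the number of logical processors; a Hyper-Threaded quad-core Core i7 should report eight.

    /* Quick sketch: how many logical processors the OS sees (POSIX/Linux).
       A Hyper-Threaded quad-core Core i7 reports eight. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        long logical = sysconf(_SC_NPROCESSORS_ONLN); /* logical CPUs currently online */
        printf("Logical processors visible to the OS: %ld\n", logical);
        return 0;
    }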

The 45nm-based Core i7 will pack all four cores on a single die. The cores will communicate via a high-speed crossbar switch. An integrated memory controller and QuickPath Interconnect links to other CPUs also make the Core i7 very AMD-like.

The problem was that very few applications were coded for Hyper-Threading when it was released and performance could actually be hindered. Hyper-Threading went away with the Core 2 series of CPUs, but Intel has dusted off the concept for the new Core i7 series because the transistor cost is minimal and the performance benefits stand to be far better than what the Pentium 4 could ever achieve.

Intel toyed with the idea of redubbing the feature Hyper-Threading 2 but decided against it, as the essential technology is unchanged. So why should we expect Hyper-Threading to be more successful this go around? Intel says it’s due to Core i7’s huge advantage over the Pentium 4 in bandwidth, parallelism, cache sizes, and performance. Depending on the application, the company says you can expect from 10 to 30 percent more performance with Hyper-Threading enabled. Still, Intel doesn’t force it down your throat because it knows many people still have mixed feelings about the feature. The company recommends that you give it a spin with your apps. If you don’t like it, you can just switch it off in the BIOS. Intel’s pretty confident, however, that you’ll leave it on.

Tomorrow’s Performance Today

You can’t recompile the world. That’s the lesson Intel learned with the Pentium 4, which kicked ass with optimized code but ran like a Yugo with legacy apps. And even with Intel’s nearly limitless resources, it couldn’t get every developer to update software for the P4.

Intel took those lessons to heart with the stellar Core 2 and continues in that vein with Core i7, which is designed to run even existing code faster. That’s largely due to the Hyper-Threading, massive bandwidth, and low latency in the new chip, but other touches also help.

Loops, a staple of programming, repeat the same task in the CPU over and over. With Core i7, an improved loop detector will save power and boost performance by detecting larger loops and caching what the program asks for. Intel also polished its branch prediction algorithms. Branch predictions are the yes/no guesses a CPU makes about where a program is headed. If the CPU guesses wrong about what the program wants, the assembly-line-like pipeline inside the CPU must be cleared and the process started anew. New SSE4.2 instructions also make their way into Core i7, but they will be of little benefit to desktop users. Since Intel is designing the chip for server use as well, the new instructions are mainly intended to speed up supercomputing and server-oriented workloads.
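
For the curious, one of the SSE4.2 additions is a hardware CRC32 instruction aimed at storage and networking checksums rather than desktop apps. A minimal sketch using the compiler intrinsic, assuming a compiler with SSE4.2 support (e.g., gcc -msse4.2), looks like this:

    /* Minimal sketch of the SSE4.2 CRC32 instruction via compiler intrinsics.
       Build with SSE4.2 enabled, e.g., gcc -msse4.2 crc.c */
    #include <stdio.h>
    #include <nmmintrin.h> /* SSE4.2 intrinsics */

    int main(void) {
        const unsigned char data[] = "Maximum PC";
        unsigned int crc = 0;
        for (size_t i = 0; i < sizeof(data) - 1; i++)
            crc = _mm_crc32_u8(crc, data[i]); /* one CRC32 step per byte */
        printf("CRC32-C: 0x%08x\n", crc);
        return 0;
    }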

The main takeaway is that while some of the changes are radical, Intel is being pragmatic with its chip design—you won’t have to go out and buy new software to experience the CPU’s performance potential.

Making Better Connections

With a Hyper-Threaded quad core, even enthusiasts are unlikely to see the need for a multi-processor machine; nevertheless, one of the new features in Core i7 directly addresses a weakness in Intel's current lineup when it comes to multi-CPU machines. As you know, Intel currently uses front-side-bus technology to tie its multiprocessor machines together. As you might imagine, problems arise when two quad-core CPUs have to share a single front-side bus.

With so many cores churning so much data, the front-side bus can become gridlocked. Intel “fixed” this issue by building chipsets with two front-side buses. But what happens when you have a machine with four or eight CPUs? Since Intel couldn’t keep adding front-side buses, it took another page from AMD’s playbook by building in direct point-to-point connections over what it calls a Quick Path Interconnect. Server versions of Core i7 feature two QPI connections (desktop versions get just one), which can each talk at about 25GB/s, or double what a 1,600MHz front-side bus can achieve. AMD fans, of course, will point out that the fastest iteration of AMD’s chip-to-chip conduit, dubbed HyperTransport 3.1, is twice as fast as the current QPI.
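
Here's the rough math behind those figures, assuming QPI moves 2 bytes per transfer in each direction and the front-side bus is 64 bits wide; both are theoretical peaks:

    /* Rough math behind the claim: QPI vs. a 1,600MHz front-side bus (theoretical peaks).
       Assumes QPI carries 2 bytes per transfer in each direction and a 64-bit-wide FSB. */
    #include <stdio.h>

    int main(void) {
        double qpi_gbs = 6.4e9 * 2.0 * 2.0 / 1e9; /* 6.4 GT/s x 2 bytes x 2 directions = 25.6 GB/s */
        double fsb_gbs = 1600e6 * 8.0 / 1e9;      /* 1,600 MT/s x 8 bytes = 12.8 GB/s */
        printf("QPI link (both directions): %.1f GB/s\n", qpi_gbs);
        printf("1,600MHz front-side bus:    %.1f GB/s\n", fsb_gbs);
        return 0;
    }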

QPI combined with the on-die memory controller will also make an Intel server or workstation a NUMA, or non-uniform memory access, design. Since each CPU has a direct link to its own pool of memory, what happens if CPU 1 needs to access something that's stored in the RAM controlled by CPU 2? In that case, it must go over the QPI link to the second CPU's memory controller to fetch the data. This will slow things down a bit, but Intel says its tests indicate that even in this scenario, the memory access is still faster than what is possible with the current front-side-bus multiprocessor design.
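
NUMA-aware software sidesteps that penalty by keeping hot data on the local node. As a rough illustration only (assuming Linux with the libnuma library, not anything Intel mandates), allocating a buffer on a specific node looks like this:

    /* Rough libnuma sketch: keep a buffer on a specific node so the CPU using it
       doesn't have to reach across QPI for the data. Link with -lnuma on Linux. */
    #include <stdio.h>
    #include <numa.h>

    int main(void) {
        if (numa_available() < 0) {
            printf("Not a NUMA system\n");
            return 0;
        }
        int node = numa_node_of_cpu(0);            /* which node owns CPU 0 */
        size_t size = 64 * 1024 * 1024;            /* 64MB scratch buffer */
        void *buf = numa_alloc_onnode(size, node); /* allocate it on that node */
        /* ... run the work that touches buf on a thread pinned to that node ... */
        numa_free(buf, size);
        return 0;
    }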

The Power Within

It’s a known fact that overclocking can decrease the life of your CPU; thus, Intel has always discouraged end-users from overclocking its CPUs. With Core i7, Intel reverses its stance and actually overclocks the CPU for you! Of course, Intel would not describe its Turbo mode as overclocking, and, technically, it isn’t. While pushing your 2.66GHz Core 2 Quad to 3.2GHz would likely strain its thermal and voltage specs, the new Core i7 CPUs feature an internal power control unit that closely monitors the power and thermals of the individual cores.

This wouldn't help much by itself, though. Intel also designed the Core i7 to be very aggressive about power management. With the previous Core 2, power to the CPU could be lowered only so far before the chip would crash. That's because while you can cut power to large sections of the execution core, the cache can tolerate only so much of a decrease in power before blowing up. With Core i7, Intel separates the power circuits so the cache can be run independently. This lets Intel cut power consumption and thermal output even further than before. Furthermore, while the Core 2 CPUs required that all the cores be idle to reduce voltage, with Core i7, individual cores can be turned off if they're not in use.

Turbo mode exploits the power savings by letting an individual core run at increased frequencies if needed. This again follows Intel’s mantra of improving performance on today’s applications. Since a majority of today’s applications are not threaded to take full advantage of a quad core with Hyper-Threading, Turbo mode’s “overclocking” will make these applications run faster. For more information on how you’ll set up Turbo mode, read our sidebar below.

Intel’s Turbo Mode Technology

Turbo mode might sound like a feature left over from the TV series Knight Rider, but it’s more neat than cheesy. You already know that Core i7 CPUs closely monitor the power and thermals of the chip and use any leftover headroom to overclock the individual cores as needed. But just how does it work?

From what we’ve surmised by examining an early BIOS, you will be able to set each type of core scenario based on how far you want to overclock, given the load. For example, with applications that push one thread, you could set the BIOS to overclock, or rather, turbo that single core by perhaps three multipliers over stock. You would do the same for two-, three-, and four-core scenarios.
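
As a rough worked example, assuming Core i7's 133MHz base clock and the 965's stock multiplier of 24 (core speed is simply multiplier times base clock), a few Turbo bins translate like so:

    /* Rough worked example: Turbo multipliers to clock speed, assuming the 133MHz base clock. */
    #include <stdio.h>

    int main(void) {
        const double base_mhz = 133.33; /* Core i7 base clock */
        const int stock = 24;           /* Core i7-965 default multiplier */
        for (int m = stock; m <= stock + 3; m++) /* stock plus up to three Turbo bins */
            printf("Multiplier %d -> %.2f GHz\n", m, m * base_mhz / 1000.0);
        return 0;
    }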

The good news is that you’ll get fine-grain control over the Turbo mode in the upcoming Core i7 CPUs.

The BIOS will also take into account the thermal rating, or TDP, of the cooling system you're using. If you're using, say, a heatsink rated for a 150-watt TDP, the BIOS will overclock to higher levels than it would with a 130-watt unit. You would manually set the heatsink's rating in the BIOS, as there's no way for the heatsink to communicate with the motherboard directly.

New Socket on the Block

So all this CPU goodness and performance will drop right into that $450 LGA775 board you just bought, right? Of course not. Ung’s Law dictates that the minute you buy expensive hardware, something better will arrive that makes what you just bought obsolete.

Intel isn't doing this just to piss people off (although a history of such behavior has had that result). Since Core i7 moves the memory controller directly into the CPU, Intel added a load of pins that go directly to the memory modules. The new standard bearer for performance boxes is the LGA1366 socket. It looks functionally similar to LGA775, with the obvious addition of more pins. More pins also mean a bigger socket, which means your fancy heatsink is also likely headed to the recycle bin. LGA1366 boards space the heatsink mounts a tad wider, just enough to make your current heatsink incompatible. There's a chance that some third-party heatsink makers will offer updated mounts to make your current heatsink work, but that's not known yet.

What will be interesting to heatsink aficionados is Intel’s encouragement that vendors rate the heatsinks using a unified thermal rating that will be tied to the Turbo mode settings. For more information, see the Turbo mode sidebar below.

The Second Coming

Intel is adopting more than just AMD’s integrated memory controller with its new Core i7 chips; it’s also adopting AMD’s abandoned Socket 940/754 two-socket philosophy. For the high end, the LGA1366 socket will offer tri-channel RAM and a high-performance QPI interface. For mainstream users, Intel will offer a dual-channel DDR3 design built around a new LGA1066 socket late next year. LGA1066 isn’t just about shedding one channel of DDR3 though; LGA1066-based CPUs will also bring direct-attach PCI Express to the table.

Instead of PCI Express running through the chipset, as it does with existing Core 2 and the new performance Core i7, PCI-E will reside on the die of LGA1066 CPUs. With the PCI-E in the CPU itself, Intel will reuse its fairly slow DMI interface to connect the CPU to a single-chip south bridge. The two chips Intel will introduce are the quad-core Lynnfield and the dual-core Havendale. Havendale CPUs will actually feature a newly designed graphics core inside the heat spreader that will talk to the CPU core via a high-speed QPI interface. Both chips will feature Hyper-Threading on all cores.

Many AMD users got a royal screwing when the company abandoned both Socket 940 and Socket 754 for a unified Socket 939; could Intel do something similar? We asked Intel point blank whether LGA1366 would eventually be abandoned for LGA1066; the company told us it fully intends to support both platforms.

The Core i7 Family

Spec | Core i7-965 Extreme Edition | Core i7-940 | Core i7-920
Clock Speed | 3.2GHz | 2.93GHz | 2.66GHz
L2 Cache | 1MB | 1MB | 1MB
L3 Cache | 8MB | 8MB | 8MB
Process | 45nm | 45nm | 45nm
Transistors | 731 million | 731 million | 731 million
QPI Speed | 6.4GT/s | 4.8GT/s | 4.8GT/s
Multiplier Lock | No | Yes | Yes
Default Multiplier | 24 | 22 | 20
Volume Pricing | $999 | $562 | $284

Core i7 Versus the World

To test the Core i7’s mettle, we threw it in the ring with the two quad-core class leaders available today: AMD’s 2.6GHz Phenom X4 9950 Black Edition and Intel’s 3.2GHz Core 2 Extreme QX9770. We paired each with its respective top-end chipset: a 790FX board for the Phenom X4 and an X48 for the Core 2, while the Core i7 partnered with an Intel DX58SO board using the new X58 chipset. All three systems were outfitted with an Nvidia GeForce 8800 GTX card, the same graphics driver, a Western Digital 150GB Raptor 10K hard drive, and the 64-bit edition of Windows Vista Home Premium.

For RAM, we couldn’t use the same components in all three systems; the Phenom uses DDR2 while both Intel CPUs use DDR3; the Core i7’s triple-channel DDR3 requires three DIMMs for maximum bandwidth while the Core 2 needs just two. Our solution favored the Phenom and Core 2: We populated the Phenom X4 with 4GB of Patriot DDR2/800 and the Core 2 with 4GB of Corsair DDR3/1333, each receiving a pair of 2GB modules. The Core i7 made do with three 1GB DDR3/1066 DIMMs from Qimonda. The Core i7 officially supports DDR3 at 1066 at this point, so we stuck with stock speeds, although motherboard vendors tell us they’re able to hit far higher DDR3 speeds.

We selected a combination of tests that stress memory performance, computational abilities, and real-world performance. The vast majority of the application tests are multithreaded. The gaming tests, beyond 3DMark Vantage, reflect performance optimized for dual-core CPUs, at best. For our real-world gaming tests, we turned down graphics and resolutions to the minimum to remove the GPU as a bottleneck.

The Upshot

If we had to describe the Core i7 in one word, it would be monster. The CPU is to benchmarks as Godzilla is to downtown Tokyo.

Take, for example, the Core i7 Extreme 965 versus the Phenom X4 9950 Black Edition. It’s no surprise that the Core i7 throws the Phenom X4 through a couple of concrete walls and right into a telephone pole. We witnessed performance differences of 87 percent, 95 percent, and even 133 percent over the fastest Phenom X4 part. AMD’s best and brightest part was utterly crushed by Intel’s new baby. Naturally, some folks will argue that it’s unfair to put a $1,000 chip against one that sells for $174, but we don’t feel that way. The Phenom X4 9950BE is AMD’s fastest CPU. If AMD doesn’t feel comfortable selling it at higher clocks, that’s AMD’s problem. Sure, we could overclock the Phenom part to 3GHz, but we could also overclock the Core i7. In the interest of a more competitive landscape, let’s just hope AMD’s 45nm CPU—due out soon—puts some pep back in the company’s step because the situation is getting beyond ugly.

A more closely matched fight was expected between the Core i7-965 Extreme Edition and Intel’s own Core 2 Extreme QX9770, both of which churn along at 3.2GHz. Nevertheless, the Core i7 managed to maul its sibling in several benchmarks. In our MainConcept H.264 encoding test, the Core i7 was 55 percent faster. In ProShow Producer, the Core i7 completed its runs about 25 percent faster. Using WinRAR to compress a folder of digital RAW files, the Core i7 was 43 percent faster. In other tests, especially gaming, the QX9770 closed the spread down to single digits, but for the most part, the Core i7 was from 14 to 20 percent faster than its Penryn counterpart.

Not everything came up roses for the Core i7, however. We saw the Core i7 cough up a hair ball in FEAR with an odd 51fps compared with the QX9770’s 122fps and a shocking 239fps from the Phenom. Intel says this is the result of a USB bug, as a duplicate system in its lab performed as expected. A more believable result was in World in Conflict: The Core i7 reached 250fps versus the QX9770’s 220 and the Phenom’s 136.

Even an Arthur Andersen accountant would have to declare the Core i7 the new champion after peeping our benchmark table. From encoding performance to 3D rendering to gaming, the Core i7’s more efficient core, boatloads of memory bandwidth, and low RAM latency make it a shockingly fast CPU.

Benchmarks

Benchmark | Core i7-965 Extreme | Phenom X4 9950 BE | Core 2 Extreme QX9770
MainConcept (min:sec) | 15:58 | 31:37 | 24:49
MainConcept Pro (min:sec) | 10:08 | 18:44 | 14:49
ProShow Producer 3.1 (min:sec) | 10:19 | 20:10 | 12:52
Premiere Pro CS3 (min:sec) | 10:17 | 16:27 | 11:26
Photoshop CS3 (min:sec) | 1:50 | 2:48 | 1:55
Cinebench 10 32-bit | 15,398 | 8,179 | 12,175
Cinebench 10 64-bit | 18,963 | 10,431 | 13,849
Valve Map Compilation (min:sec) | 2:05 | 2:47 | 1:56
ScienceMark Overall | 2,091.22 | 1,608.74 | 1,920.2
ScienceMark Membench (MB/s) | 13,312 | 7,279 | 8,559.5
PCMark Vantage x64 Overall | 7,510 | 5,724 | 6,423
PCMark Vantage Overall | 6,705 | 5,299 | 5,961
SiSoft Sandra RAM Bandwidth (GB/s) | 18.15 | 9.73 | 7.4
SiSoft Sandra RAM Latency (ns) | 77 | 95 | 79
Everest Ultimate MEM Read (MB/s) | 15,167 | 6,701 | 8,252
Everest Ultimate MEM Write (MB/s) | 12,041 | 4,856 | 8,490
Everest Ultimate MEM Copy (MB/s) | 15,583 | 7,760 | 8,426
Everest Ultimate MEM Latency (ns) | 39.2 | 64.7 | 66.7
WinRAR 3.80 (min:sec) | 9:44 | 18:11 | 13:57
POV-Ray 3.7 (min:sec) | 6:48 | 11:52 | 8:08
3DMark06 Overall | 12,859 | 11,639 | 12,906
3DMark06 CPU | 5,638 | 3,532 | 4,717
3DMark Vantage | 7,516 | 7,301 | 7,588
3DMark Vantage CPU | 39,725 | 26,709 | 32,446
3DMark Vantage GPU | 5,917 | 5,877 | 6,044
FEAR (FPS) | 51 | 239 | 122
Quake 4 (FPS) | 228.0 | 152.3 | 206.6
Valve Particle Test (FPS) | 161 | 69 | 111
Crysis 1.2 10x7 very low CPU1 (FPS) | 164 | 112 | 153
World in Conflict (FPS) | 250 | 136 | 220
NOTES: How we tested. We used matched GeForce 8800GTX cards for all three platforms, matched Western Digital 150GB Raptors, Windows Vista Home Premium 64-bit and the same graphics drivers. The Core 2 Quad had 4GB of DDR3/1333, the Phenom X4 BE 9950 had 4GB of DDR2/800 and the Core i7 had 3GB of DDR3/1066.

Top End Showdown

As you know, all Core i7s are pretty much the same chip. There's no cache size difference and no disabled Hyper-Threading. The only differences between the chips are clock speed, QPI speed, and "overspeed protection." Overspeed protection is simply the multiplier lock: non-Extreme Core i7s will not let you change the multiplier ratio willy-nilly the way the Extreme will. As for QPI speed, the 965 Extreme runs at 6.4GT/s, while the 920 and 940 communicate with the chipset at 4.8GT/s.

The performance upshot is that the Extreme is the fastest. No surprise there, Sherlock. What does surprise us, though, is the difference in speed in some benchmarks. In our MainConcept test, for example, we saw the 965 encode our high-def video about 24 percent faster than the 940. What's odd is that the 965 offers just 9 percent more clock speed than the 940. We saw a similar result in the Cinebench 10 test, where the 965 was about 14 percent faster than the 940.

In other tests, we saw standard clock-speed splits. In PCMark Vantage, for example, the 965's 9 percent clock spread gave it about an 11 percent performance spread. In POV-Ray, the 965's 20 percent clock advantage over the 920 turned in a score about 22 percent faster. Other tests saw fairly minimal advances for the 965. For example, our ProShow Producer test was virtually a tie between the 2.93GHz part and the 3.2GHz chip, which leads us to believe we have a bottleneck in our configuration or a coding issue at work.

Benchmarks

Benchmark | 2.66GHz Core i7-920 ($284) | 2.93GHz Core i7-940 ($562) | 3.2GHz Core i7-965 Extreme ($999)
MainConcept (min:sec) | 21:40 | 19:50 | 15:58
MainConcept Pro (min:sec) | 12:21 | 11:19 | 10:08
ProShow Producer 3.1 (min:sec) | 11:10 | 10:16 | 10:19
Premiere Pro CS3 (min:sec) | 12:39 | 11:41 | 10:17
Photoshop CS3 (min:sec) | 2:05 | 2:03 | 1:50
Cinebench 10 32-bit | 12,632 | 13,793 | 15,398
Cinebench 10 64-bit | 15,217 | 16,651 | 18,963
Valve Map Compilation (min:sec) | 2:32 | 2:21 | 1:50
ScienceMark Overall | 1,710.1 | 1,884.69 | 2,091.22
ScienceMark Membench (MB/s) | 12,737 | 13,028 | 13,312
PCMark Vantage x64 Overall | 6,616 | 6,767 | 7,510
PCMark Vantage Overall | 5,347 | 6,043 | 6,705
SiSoft Sandra RAM Bandwidth (GB/s) | 18.07 | 18.09 | 18.15
SiSoft Sandra RAM Latency (ns) | 79 | 78 | 77
Everest Ultimate MEM Read (MB/s) | 14,449 | 14,841 | 15,167
Everest Ultimate MEM Write (MB/s) | 11,627 | 14,788 | 12,041
Everest Ultimate MEM Copy (MB/s) | 15,039 | 15,011 | 15,583
Everest Ultimate MEM Latency (ns) | 38.7 | 37.0 | 39.2
WinRAR 3.80 (min:sec) | 10:52 | 10:45 | 9:44
POV-Ray 3.7 (min:sec) | 8:18 | 7:42 | 6:48
3DMark06 Overall | 12,407 | 12,559 | 12,859
3DMark06 CPU | 4,620 | 5,035 | 5,638
3DMark Vantage | 7,450 | 7,453 | 7,516
3DMark Vantage CPU | 34,909 | 35,548 | 39,725
3DMark Vantage GPU | 5,902 | 5,868 | 5,917
FEAR (FPS) | 132 | 235 | 51*
Quake 4 (FPS) | 144.6 | 156.2 | 228.0
Valve Particle Test (FPS) | 131 | 143 | 161
World in Conflict (FPS) | 223 | 232 | 250
QPI | 4.8GT/s | 4.8GT/s | 6.4GT/s
NOTES: How we tested. We used a single GeForce 8800GTX, a 150GB Western Digital Raptor, Windows Vista Home Premium 64-bit edition and 3GB of DDR3/1066 for all of our tests. *We had issues running FEAR on the Core i7 Extreme part.

Budget Processor Showdown: Core 2 Quad vs. Core i7

It is no longer politically correct to call your thrift-minded friends by any of the many offensive low-cost names people have used over the years, but you can forward your cheapskate geek friends this link and tell them that even they can participate in the latest technology trends without feeling like they're getting a raw deal. That's because Intel's new 2.66GHz Core i7-920 is a great deal.

To find out how well the 920 would do, we put the $284 chip against the $316 2.83GHz Core 2 Quad Q9550. The upshot is that in almost every benchmark, the 920 was faster. In some tests, the slight clock speed advantage of the Q9550 put it ahead, but not by much. Oddly, we did see the 920 lose in Quake 4. Quake 4 is optimized for dual cores at best, but we didn't expect the 920 to lose here. Clearly there's something going on on the gaming side that we'll have to continue to investigate. We must also point out that our decision to limit the Core i7 to its stock DDR3/1066 speeds may be hobbling the chip.

Our recommendation is that you go with the Core i7 platform if you're concerned about future upgrades. With Core i7 here, Intel is likely to rapidly push the Core 2 platform aside, so you'll never see a CPU faster than the 3.2GHz Core 2 Extreme QX9770 for it. The Core i7, however, will continue to climb in clock speeds for the next few years. Where the Core 2 platform plays better is with ultra-budget shoppers. With Core 2 boards priced from $50 on up and CPUs in the sub-$100 arena, you can actually start at far lower prices than with Core i7. But if you are concerned about upgrades, the Core i7 is the way to go.

Benchmarks

Benchmark | 2.83GHz Core 2 Quad Q9550 ($316) | 2.66GHz Core i7-920 ($284)
MainConcept (min:sec) | 27:40 | 21:40
MainConcept Pro (min:sec) | 16:28 | 12:21
ProShow Producer 3.1 (min:sec) | 15:18 | 11:10
Premiere Pro CS3 (min:sec) | 12:51 | 12:39
Photoshop CS3 (min:sec) | 2:04 | 2:05
Cinebench 10 32-bit | 10,837 | 12,632
Cinebench 10 64-bit | 12,288 | 15,217
Valve Map Compilation (min:sec) | 2:10 | 2:32
ScienceMark Overall | 1,715.67 | 1,710.1
ScienceMark Membench (MB/s) | 7,105 | 12,737
PCMark Vantage x64 Overall | 5,945 | 6,616
PCMark Vantage Overall | 5,460 | 5,347
SiSoft Sandra RAM Bandwidth (GB/s) | 6.9 | 18.07
SiSoft Sandra RAM Latency (ns) | 81 | 79
Everest Ultimate MEM Read (MB/s) | 8,006 | 14,449
Everest Ultimate MEM Write (MB/s) | 7,075 | 11,627
Everest Ultimate MEM Copy (MB/s) | 7,334 | 15,039
Everest Ultimate MEM Latency (ns) | 66.4 | 38.7
WinRAR 3.80 (min:sec) | 14:48 | 10:52
POV-Ray 3.7 (min:sec) | 9:08 | 8:18
3DMark06 Overall | 12,583 | 12,407
3DMark06 CPU | 4,276 | 4,620
3DMark Vantage | 7,459 | 7,450
3DMark Vantage CPU | 30,615 | 34,909
3DMark Vantage GPU | 6,034 | 5,902
FEAR (FPS) | 114 | 132
Quake 4 (FPS) | 180.3 | 144.6
Valve Particle Test (FPS) | 100 | 131
World in Conflict (FPS) | 188 | 151
NOTES: How we tested. We used matched GeForce 8800GTX cards  for both platforms, matched Western Digital 150GB Raptors, Windows Vista Home Premium 64-bit and the same graphics drivers. The Core 2 Quad had 4GB of DDR3/1333 and the Core i7 had 3GB of DDR3/1066.

Core i7 Features Dissected

The Core i7 CPU sports some unique features—we test their merits

Hyper-Threading: The Next Generation

Hyper-Threading got a bad rap under Pentium 4 for being more a hindrance than a help to performance. Our tests then showed that HT generally helped, but the lack of threaded applications made the feature pretty near worthless. Intel has reintroduced Hyper-Threading with the Core i7 and says it’s worth another look. We ran a handful of our multithreaded applications with HT both on and off and determined that this time around, it’s good stuff. We generally saw a healthy double-digit boost in performance with HT enabled. Using the latest version of ProShow Producer, we actually took a 26 percent hit by turning off Hyper-Threading. MainConcept’s encoder experienced a drop of 17 percent without Hyper-Threading. So, if you ask us, you oughta leave it on.

Benchmarks
Hyper-Threading | HT On | HT Off
MainConcept (min:sec) | 15:58 | 19:13
ProShow Producer 3.5 (min:sec) | 10:42 | 14:28
Cinebench 10 32-bit | 15,398 | 13,451
Cinebench 10 64-bit | 18,963 | 16,613
POV-Ray (min:sec) | 6:48 | 6:56
3DMark Vantage CPU | 39,725 | 35,623
NOTES: Best Scores in Bold

Tinkering with Turbo Mode

Intel’s Turbo Mode gives the user fine-grain control over individual cores. By shutting down individual cores that aren’t used during, say, a single-threaded game, you can pick up what is essentially free performance by overclocking, or rather, Turboing, from 3.2GHz to 3.8GHz. We dialed up the allowable, um, Turbos from the stock 24 to 27 to see if the feature works. Indeed it does. In our mostly single-threaded Photoshop CS3 test and World in Conflict, we saw the scaling you’d expect from a 10 percent overclock. Since we didn’t choose to overclock for two threads, we didn’t see much of a change in Quake 4. Our verdict is that it’s a worthwhile proposition, the caveat being that you will need liquid cooling or a big, fat heatsink to truly exploit its potential.

Benchmarks
Turbo Mode | Off | On
Photoshop CS3 (min:sec) | 1:50 | 1:42
Quake 4 (FPS) | 228 | 228
World in Conflict (FPS) | 250 | 272
NOTES: Best Scores in Bold

Tri-Channel Memory Tested

Core i7’s tri-channel DDR3 memory controller presents a radical alternative to the standard dual-channel configurations. Since the controller lets you run single, dual, or tri mode, we decided to take a look at the actual bandwidth offered by each scenario and the resulting real-world impact. Using three Qimonda 1GB DDR3/1066 DIMMs and a single Corsair 2GB DDR3/1600 DIMM (set at DDR3/1066), we ran two RAM benchmarks and Quake 4. The upshot is that for the best performance, you should populate three channels.

Benchmarks
Tri-Channel | 3 DIMM 3GB DDR3 | 2 DIMM 2GB DDR3 | 1 DIMM 2GB DDR3 | 1 DIMM 1GB DDR3
SiSoft Sandra RAM Bandwidth (GB/s) | 18.15 | 12.7 | 7.1 | 6.75
Everest Ultimate MEM Read (MB/s) | 15,167 | 14,388 | 8,317 | 8,236
Everest Ultimate MEM Write (MB/s) | 12,041 | 13,590 | 8,285 | 8,187
Everest Ultimate MEM Copy (MB/s) | 15,583 | 14,848 | 9,062 | 7,798
Quake 4 (FPS) | 228.0 | 172.4 | 213 | 167
