True performance enthusiasts have had a very difficult choice this past year. Go for maximum core and thread count using an older core microarchitecture, or cheap out and get almost the same (or better) performance in most apps and games using the mainstream Sandy Bridge chip.
That, in a nutshell, has been the enthusiasts’ dilemma ever since Intel introduced the Sandy Bridge chip in January 2011. Well, those days are behind us now that Intel has finally, finally released its Sandy Bridge-E (for Enthusiast) chip. With one simple chip, the new 3.3GHz Core i7-3960X, Intel has neatly folded up all those worries and put them into a nice little blue box stamped with the Intel logo.
A TRUE ENTHUSIASTS' CPU
Boiled down to the simplest of terms, if the quad-core 3.4GHz Core i7-2600K (or its new sibling the 3.5GHz Core i7-2700K) was the best chip out there, the Core i7-3960X is now the bestest. That’s because the Core i7-3960X is simply a Core i7-2600K with two additional cores.
Actually, that’s not really accurate. As an enthusiast chip, there are no graphics cores in the Core i7-3960X. And while the Core i7-2600K is limited to just 16 PCIe 2.0 lanes, the Core i7-3960X sports 40. Even better, those 40 lanes of PCIe support are PCIe 3.0 compliant. Out the gate, however, Intel (or its lawyers, anyway) is reluctant to label them as PCIe 3.0 until it actually has enough PCIe 3.0 cards to test.
As to the cores, you already know about them. They’re Sandy Bridge cores and include AVX and AES-NI instruction-set goodness. Turbo Boost 2.0 on these models will take the top-end 3.3GHz Core i7-3960X to 3.9GHz. The cores are built using Intel’s 32nm process and, well, there are two more of them turned on.
Besides the added cores, enthusiasts will also be thrilled by the memory support: To keep those cores fed, Intel is using a new quad-channel memory controller. The memory controller seems significantly faster than previous iterations, too. While the tri-channel memory controller in the original LGA1366 didn’t blow our socks off (over a dual-channel configuration), the quad-channel controller in the Core i7-3960X has us stunned. In our tests, we found that it offered nearly 100 percent more memory bandwidth than the Core i7-990X’s triple-channel configuration.
PSSST, IT'S REALLY EIGHT CORES INSIDE
Intel isn’t making the Core i7-3960X just to satiate the appetites of speed freaks. The chip is mostly intended to be sold as a Xeon workstation CPU. So it shouldn’t surprise you that the Core i7-3960X is actually an eight-core chip. Yup, that’s right; looking at the block map of the chip, you can see that the new CPU has two sections blocked out where cores seven and eight go. Why leave them off? Intel officially says the decision was based on its desire to balance clock speeds, thermals, and power needs. We suspect that it’s really because Intel doesn’t need those two extra cores at this point. Not to telegraph too much, but AMD hasn’t posed much performance competition yet. By leaving cores off now, Intel can always introduce octo-core chips later if it needs to be more competitive. There could also truly be a thermal concern, as unsubstantiated rumors (are there any other kind?) initially told of Intel’s new chip pushing an unheard‑of 180‑watt thermal rating.
Yeah, we know what you’re thinking already, because we asked the same thing ourselves: Can you unlock those two other cores? Negative, Ghostrider. Intel has laser-cut those cores off in the die, so unless someone has the smallest‑possible soldering gun, we’d bet a box of adamantium claws that it’s impossible.
MEET THE NEW PLATFORM
As is Intel’s modus operandi, the company has a new socket. While the switch from LGA1156 to LGA1155 certainly pissed off customers, the LGA1366 crowd can hardly complain. LGA1366 launched with the original Core i7-965 Extreme Edition way back in 2008. For Intel to even support a socket that long is almost unheard of. So, with Core i7-3960X, Intel is introducing its new LGA2011.
Why the extra pins? The additional pins in the socket are to support the quad-channel memory and the relocation of the PCIe lanes from the core-logic chipset to the CPU core (à la Sandy Bridge and Lynnfield). For the most part, enthusiasts will be tickled pink with the beastly new socket, the quad-channel memory, and PCIe 3.0. What they won’t be happy with is the SATA 6Gb/s situation. The new X79 chipset features a Serial Attached SCSI controller that can support up to 10 drives in SATA 6Gb/s, but at the 11th hour, the feature was switched off due to compatibility concerns. Instead, we’re left with an X79 peripheral controller hub that’s pretty much a weak-sauce retread of the P67 and Z68’s PCH: two SATA 6Gb/s and four SATA 3Gb/s ports. You can certainly argue that you don’t need more than two SATA 6Gb/s ports since they’re only useful for SSD drives, but we think it stinks, especially as we had been teased by thoughts of motherboards bursting with SATA 6Gb/s. We expect initial boards to be limited in SATA 6Gb/s ports due to the last-minute switch, but in a few months, board vendors will tack on additional ports using third-party controllers. If anything, the SATA 6Gb/s features on boards and how they’re implemented will separate the men from the boys in mobo land.
MEET THE SANDY BRIDGE-E FAMILY
For the LGA2011 platform, Intel is introducing three new chips. The top-end Core i7-3960X sells for $990. Yup, that’s $9 cheaper than the existing Core i7-990X chip (gee thanks, Intel!) that this Extreme chip is meant to replace. Intel is also introducing two other chips. The mid-tier 3.2GHz Core i7-3930K will sell for $555. Besides the lower stock clock, the chip will shed some of the L3 cache, for a total of 12MB. For the budget enthusiast, Intel has plans to release a quad-core, Hyper-Threaded Sandy Bridge-E with 10MB of L3 cache early next year. Prices of the Core i7-3820 haven’t been released, but we’re pretty sure it’ll slot in at about $300. The part is “partially unlocked,” meaning it will have limited overclocking features, and is likely intended as a way to get entry-level enthusiasts in the X79 game.
The good news for enthusiasts is that Intel has no plans to step away from offering blistering‑fast chips with cutting-edge technology, despite all the focus on tablets and smartphones these days. Hallelujah.
|Spec||Intel Core i7-2600K||Intel Core i7-990X||Intel Core i7-3960X||AMD Phenom II X6 1100T||AMD FX-8150|
|Turbo Clock (Max)||3.8GHz||3.7GHz||3.9GHz||3.7GHz||3.9GHz (4.2GHz)|
|TDP||95 watts||130 watts||130 watts||125 watts||125 watts|
|Cores / Threads||4 / 8||6 / 12||6 / 12||6||8|
|Total L2 Cache||1MB||1.5MB||1.5MB||3MB||8MB|
|Total L3 Cache||8MB||12MB||15MB||6MB||8MB|
|Transistor Count||995 million||1.17 billion||2.27 billion||904 million||2 billion|
|Socket||LGA1155||LGA1366||LGA2011||Socket AM3||Socket AM3+|
|Memory Controller||Dual Channel DDR3/1333||Tri-Channel DDR3/1066||Quad-Channel DDR3/1600||Dual Channel DDR3/1333||Dual Channel DDR3/1866|
3DMark11 is considered a GPU test, but its overall score is actually created from "the graphics score, Physics score, and the Combined score using a weighted harmonic mean." That basically means it's still a test that is weighted heavily toward graphics performance. For our test, remember that we used identical GeForce GTX 580 cards, all running the same graphics drivers. In the end, the new Sandy Bridge E part was the fastest, but not by a huge margin, because 3DMark11's overall score is so heavily weighted toward the GPU in the slot, not the CPU in the socket. For this run, we used the default Performance preset. Higher is better here.
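To see why a weighted harmonic mean keeps a big CPU-driven gain from moving the overall number much, here is a minimal Python sketch. The weights and sub-scores are our own illustrative assumptions, not figures from the benchmark's documentation or from our test runs.

```python
def weighted_harmonic_mean(scores, weights):
    """Weighted harmonic mean: sum(w) / sum(w / s) over positive scores."""
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and the same length")
    if any(s <= 0 for s in scores):
        raise ValueError("scores must be positive")
    return sum(weights) / sum(w / s for w, s in zip(weights, scores))


# ASSUMPTION: illustrative weights in (graphics, physics, combined) order.
# They are not taken from the benchmark's documentation.
WEIGHTS = (0.75, 0.15, 0.10)

# ASSUMPTION: made-up sub-scores for two rigs sharing the same graphics card.
# Only the physics (CPU-heavy) sub-score differs, and it doubles.
slower_cpu = weighted_harmonic_mean((6000, 5000, 5500), WEIGHTS)
faster_cpu = weighted_harmonic_mean((6000, 10000, 5500), WEIGHTS)

if __name__ == "__main__":
    gain = (faster_cpu / slower_cpu - 1) * 100
    print(f"Overall gain from doubling the physics score: {gain:.1f}%")
```

With these invented numbers, doubling the physics sub-score lifts the overall score by less than 10 percent, which is the same flattening effect we see in the overall chart.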
3DMark11's physics score for the new Sandy Bridge E shows just how heavily the overall score is weighted toward the GPU. 3DMark11's Physics test "focuses on CPU performance by simulating rigid body physics with a large number of objects. This test runs at a fixed screen resolution for all presets. There is no post processing, volumetric lighting, or tessellation."
Here, Sandy Bridge E simply crushes the competition to dust, even smashing its cousin the Core i7-990X to tiny bits, which surprised us as both chips have the same core and thread counts. It's quite possible the insane amount of bandwidth available to the Sandy Bridge E chip accounts for some of this, or its crazy-ass big cache. When we get a chance, we'll switch the X79 to dual-channel mode and rerun the test. For now though, Sandy Bridge E stands tall. Higher is better here.
The popular 7-Zip utility features a built-in benchmark that measures how fast a processor can compress and decompress a file. The performance is presented as MIPS, and you can vary the size of the load and how many threads you want it to use. The number here uses the maximum number of threads available on a processor, so the eight-core FX-8150 ran with eight threads and the six-core, Hyper-Threaded parts ran with 12 threads. We saw the older Core i7-990X and its replacement run dead even. There's good news for AMD, though: the FX-8150 gets a little payback on the Core i7-2600K by besting it despite having a lower price. For those hoping to replicate this at home, just fire up 7-Zip and run the benchmark with a 64MB workload and the maximum number of threads your CPU supports. Higher is better here.
For our Bibble 5 test, we take a folder of 210 or so RAW/CR2 files shot with a Canon 5D Mark II and output them to JPEG. Bibble has always been a wondrously multi-threaded application, and the Sandy Bridge E again is the top performer by a respectable margin. While people have widely dinged the FX-8150 for middling single-threaded chops, the eight-core processor actually pulls dead even with the four-core, Hyper-Threaded Core i7-2600K CPU. That's not a bad showing for the new AMD part at all. We actually tested a number of our benchmarks to make sure disk I/O wasn't hampering them. This unfortunately wasn't one of them (at least this time; we have tested it on an SSD in the past). We'll revisit this with a couple of the chips and a latest-generation SSD to make sure I/O isn't hurting the performance of the new Sandy Bridge E part. Lower is better here.
On the Intel side, there are really only two cores in action here: the older Westmere core that's in the Core i7-990X "Gulftown" and the newer Sandy Bridge core that powers both the Core i7-2600K and the Core i7-3960X. Both are 32nm parts, but there is a difference under the heat spreader. To get an idea of how each core stands on its own, we used Cinebench 10 to run a single-threaded render of its benchmark test. Both the Sandy Bridge and Sandy Bridge E parts run away from the older Westmere core in the Core i7-990X and are actually pretty close to each other. Why did the Sandy Bridge E part best its cheaper sibling? This can be attributed to the larger cache (15MB in the Core i7-3960X vs. 8MB in the Core i7-2600K) and also the memory bandwidth differences between the quad-channel and dual-channel machines. There is also a slight clock speed difference: the Core i7-3960X will turbo to 3.9GHz, while the Core i7-2600K will turbo up to 3.8GHz. The upshot is that, given the similar clocks under Turbo (assuming Intel hasn't tweaked its Turbo Boost 2.0 even more), the Sandy Bridge E cores perform similarly to Sandy Bridge cores, with an edge going to the newer chip and its beefed-up cache and memory bandwidth. On the AMD front, you can also clearly see what people are concerned about with FX parts. Despite the FX-8150's higher base clock, Turbo Clock, and Maximum Turbo Clock, and its larger cache, the older Phenom II X6 breaks even with it. Higher is better here.
We've said that the older Core i7-990X part is a monster when it comes to 3D rendering so what does that make the newer Core i7-3960X part? Godzilla? Stomping and romping through your 3D renders? Perhaps. Basically, if your time is money and you render for a living, the new Sandy Bridge E is the chip to have. Higher is better here.
The newer Cinebench 11.5 shows a similar advantage for the six-core chips. The Sandy Bridge E chip can even outpace dual quad-core Xeon W5590 chips (that's 16 threads). It easily outpaces its six-core cousin. On the AMD front, the FX-8150 doesn't fare well, as its score is just barely faster than that of the older Phenom II X6 chip despite it having eight cores. We suspect that the Cinebench render is heavy on floating-point resources, which may be a disadvantage for the FX-8150, whose eight cores share floating-point units. Higher is better here.
For our CyberLink Espresso 6.5 test, we take the same 1920x1080 MPEG2 file that we use in our MainConcept Reference test and convert it to an MPEG4 file suitable for viewing on an iPhone 4. CyberLink Espresso can run on a GPU, the CPU, or Intel's QuickSync technology, which is only in the Sandy Bridge processors. The score is how long the conversion takes in seconds. The Core i7-3960X is again the winner (tired of hearing that yet?) but not by an overwhelming amount when compared to the Core i7-990X. Oddly, the FX-8150 is the slowest chip of the pack, as it just barely trails the Phenom II X6.
As part of our test, we also shifted the workload from running on the CPU to the GeForce GTX 580 that we used in all of our test rigs. We figured that since the workload was shifted to the GPU, the scores should all be the same, but surprise, surprise: the Core i7-3960X comes in significantly faster than the other chips here. We're really not sure why, but we are talking to CyberLink about why we're seeing the performance spreads. For kicks, we also ran it on the Core i7-2600K's integrated graphics chip using the QuickSync technology, which is dedicated hardware that Intel has set aside for video transcoding and encoding. The result using QuickSync was, interestingly, faster than all of the CPUs save the Core i7-3960X and the GeForce GTX 580. So maybe integrated graphics doesn't suck anymore, does it?
For our gaming testing, we like to run our games at low resolutions and with features turned down so as not to bottleneck the CPU with the GPU. This would simulate what kind of performance you would get if you had a fantasy GPU that you somehow snatched out of the year 2015 to run in your machine today. With Far Cry 2 the Sandy Bridge E again runs away with it. We suspect the large cache and memory bandwidth help the Core i7-3960X zoom past all others here.
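The logic behind testing at low resolution can be sketched with a toy model in Python: assume each frame takes as long as the slower of the CPU and GPU stages. The millisecond figures below are hypothetical, chosen only to illustrate the idea, and real games overlap these stages in messier ways.

```python
def frames_per_second(cpu_ms, gpu_ms):
    """Toy model: frame time is set by the slower of the CPU and GPU stages."""
    return 1000.0 / max(cpu_ms, gpu_ms)


# ASSUMPTION: hypothetical per-frame costs in milliseconds, not measurements.
CPU_MS = {"fast CPU": 4.0, "slow CPU": 8.0}
GPU_MS = {"high resolution": 20.0, "low resolution": 2.0}


def compare():
    """Return the modeled frame rate for every (resolution, CPU) pairing."""
    return {
        (res, cpu): frames_per_second(cpu_cost, gpu_cost)
        for res, gpu_cost in GPU_MS.items()
        for cpu, cpu_cost in CPU_MS.items()
    }


if __name__ == "__main__":
    for (res, cpu), fps in compare().items():
        print(f"{res:>15} | {cpu}: {fps:.0f} fps")
```

In this model both CPUs land on the same frame rate at high resolution because the GPU stage dominates, while at low resolution the faster CPU doubles the frame rate. That is why we turn the settings down when we want to see what the processor is doing.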
For our Handbrake test, we used a video file shot with a Canon EOS 5D Mk II at 30fps (the 24fps firmware wasn't available at the time) and tasked Handbrake with encoding it to H.264 using the high profile setting. Handbrake is a widely used free encoder that favors using the CPU for encoding due to the consistent quality over GPU encodes. The Core i7-3960X again leads the pack by an impressively healthy margin over the six-core Core i7-990X chip. The eight-core FX-8150 also holds up its head when compared to the Core i7-2600K chip. We say this, despite the Core i7-2600K being a four-core chip with Hyper-Threading, because of the lower pricing of the FX-8150.
In looking for a benchmark that uses the new AVX instruction set in the Sandy Bridge and Sandy Bridge E cores, we came across the Intel Burn Test. Created by, ummm, AgentGOD at XtremeSystems.org, it's intended as a stress test but it uses AVX to perform linear algebra calculations. It's apparently based on Intel's own Linpack math library thus it's called the Intel Burn Test. As such, we ran it mostly to compare the Intel procs.
Looking at the chart, the Core i7-3960X demolishes the competition, but the Core i7-2600K also does pretty well, as it also has the AVX instruction set aboard. Despite having six cores, the older Core i7-990X lacks AVX and is easily eclipsed by the Core i7-2600K part. We did say that we ran this to gauge the Intel parts, but the FX-8150 also happens to feature AVX instruction set support, while the Phenom II X6 doesn't. Obviously, this is an Intel test, but the FX-8150 does pretty well overall compared to the Core i7-990X and especially against its older sibling. AVX support in applications is in its infancy today, but as with all instruction sets, expect that to expand.
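A back-of-the-envelope calculation shows how a four-core chip with AVX can out-muscle a six-core chip without it in a Linpack-style workload. The Python sketch below multiplies cores by base clock by peak double-precision operations per cycle. The per-cycle figures (4 for 128-bit SSE, 8 for 256-bit AVX) and the Core i7-990X's 3.46GHz base clock are our assumptions rather than numbers from our test runs, and Turbo is ignored.

```python
# ASSUMPTION: peak double-precision FLOPs per core per clock cycle.
#   128-bit SSE (Westmere):     2-wide add + 2-wide multiply = 4
#   256-bit AVX (Sandy Bridge): 4-wide add + 4-wide multiply = 8
SSE_FLOPS_PER_CYCLE = 4
AVX_FLOPS_PER_CYCLE = 8


def peak_gflops(cores, base_ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS at base clock, ignoring Turbo and memory limits."""
    return cores * base_ghz * flops_per_cycle


# (cores, base clock in GHz, FLOPs per cycle); the 990X clock is an assumption.
CHIPS = {
    "Core i7-990X": (6, 3.46, SSE_FLOPS_PER_CYCLE),
    "Core i7-2600K": (4, 3.4, AVX_FLOPS_PER_CYCLE),
    "Core i7-3960X": (6, 3.3, AVX_FLOPS_PER_CYCLE),
}

if __name__ == "__main__":
    for name, spec in CHIPS.items():
        print(f"{name}: {peak_gflops(*spec):.1f} GFLOPS peak")
```

On paper, the quad-core Core i7-2600K comes out ahead of the six-core Core i7-990X, and the Core i7-3960X leads both, which matches the ordering in the chart. Real results land below these ceilings, so treat this as a rough guide only.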
PCMark7 is the latest in Futuremark's popular all-around performance tests. It's not designed to test just the CPU, as it factors storage and graphics into its score, too. The overall score is based on storage performance tests in Windows Defender, importing pictures, and gaming, as well as video playback and transcoding performance, image manipulation, web browsing, and decrypting. Since we use the same hard drives and GPUs in our tests, the graphics and storage components should be mostly even. The differences you see here should mostly be the result of image manipulation, web browsing, decrypting, and video transcoding. The Core i7-3960X just edges the Core i7-2600K and yields an advantage over the Core i7-990X. We've read reports that the test is multi-threaded, but we suspect it tops out with but four cores. Again, the score is heavily weighted by storage subsystems. Since we used the same Raptors across the board, our scores are significantly lower than those using an SSD.
The Persistence of Vision Raytracer, or POV-Ray, is a freeware raytracing app that reaches all the way back to the 1980s and drew inspiration from the Amiga. The two Intel six-cores again take the lead and, surprise, the FX-8150 just pushes the Core i7-2600K aside. The six-core Phenom II X6 also has a good showing.
We've long used ProShow Producer 3.5 to benchmark machines. The application is a popular slideshow creator used by professional photographers. It's multi-threaded but we've found for some time that it doesn't seem to scale beyond four cores. For our test, we take a folder of 200 or so JPEG files shot with a 21MP Canon EOS 5D Mk II, add random transitions, a sound track and output it to a 1080p video file.
Again, the Core i7-3960X offers seriously fast performance, coming in about 20 percent faster than the Core i7-990X part and the Core i7-2600K, which are about tied. There's also good news for AMD here. While its top-end CPU can't compete head-on with the Core i7-2600K part, the chip is showing fairly good improvements in encoding tasks. For example, look at the score from the six-core Phenom II X6: it's simply dragging butt, while the FX-8150 closes the gap with Intel.
We've been using Sony Vegas Pro for the last couple of years to measure how fast a machine is at rendering a video from this popular non-linear editor. We normally run Vegas Pro 9, but decided to update to Vegas Pro 10 for this set of processors. (We're also looking at Premiere Pro CS5 but didn't have a benchmark assembled in time.) For the workload, we use a video shot with a Canon EOS 5D Mark II, apply numerous filters to it, and spit it out to Windows Media Video at a high-bitrate setting.
The Core i7-3960X, sigh, again kicks ass with a render time almost 30 percent faster than the Core i7-990X processor. Against the Core i7-2600K, you're looking at almost a 45 percent performance difference. On the AMD side, we're again seeing the FX-8150 offer significant performance increases over the Phenom II X6 in video-related duties. It's not enough to beat the Core i7-2600K, but it ain't bad in a category that Intel has ruled for a long time.
For memory bandwidth we tapped the good old SiSoft Sandra. We haven't been blown away by the triple channel performance of the Core i7-990X so we didn't know what to expect here. The answer is whoa mama! A huge increase in memory bandwidth from the quad-channel Core i7-3960X. It's roughly twice the bandwidth of the Core i7-990X with its triple-channel setup. For the record, we tested the quad-channel CPU with four 4GB DIMMs of DDR3/1600, the tri-channel with three 2GB DIMMs of DDR3/1600 and all three dual-channel rigs with two 4GB DIMMs of DDR3/1600. And no, we didn't test the FX-8150 at DDR3/1866 speeds. Believe us when we say that that little bit of memory bandwidth won't suddenly propel it past the Core i7 parts.
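For context, the theoretical peak for each configuration is easy to work out: channels times transfers per second times 8 bytes, since a DDR3 channel is 64 bits wide. The Python sketch below runs that arithmetic for the DIMM setups we tested. The function name is ours, and these are paper ceilings, not Sandra results.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    A DDR3 channel is 64 bits wide, so each transfer moves 8 bytes.
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000.0


# Configurations as tested in the article, all at DDR3/1600.
CONFIGS = {
    "quad-channel (Core i7-3960X)": 4,
    "tri-channel (Core i7-990X)": 3,
    "dual-channel rigs": 2,
}

PEAKS = {name: peak_bandwidth_gbs(ch, 1600) for name, ch in CONFIGS.items()}

if __name__ == "__main__":
    for name, gbs in PEAKS.items():
        print(f"{name}: {gbs:.1f} GB/s theoretical peak")
    quad = PEAKS["quad-channel (Core i7-3960X)"]
    tri = PEAKS["tri-channel (Core i7-990X)"]
    print(f"quad vs. tri on paper: {quad / tri:.2f}x")
```

On paper, the fourth channel is only worth about a third more bandwidth than three channels at the same DIMM speed. Seeing roughly double in Sandra suggests the new controller is also making far better use of the channels it has.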
Can a faster CPU make a difference on a GPU test that's nearly completely reliant on the GPU? Yes! No, not at all, actually. So why does this chart from the DX11 Unigine 2.5 bench say otherwise? Look closer at it and you'll see the oldest trick in the book: a chart that makes minor differences seem huge. We left this in because Excel defaulted to this setting when we created the chart, and to show you how easy it is to warp things.
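The trick is simple arithmetic. When a chart's axis starts somewhere above zero, the bars show only the part of each score above that starting point, so the visual ratio balloons. Here is a small Python sketch with invented scores (not our Unigine results) that compares the drawn bar lengths with and without a truncated axis.

```python
def apparent_ratio(score_a, score_b, axis_min=0.0):
    """Ratio of drawn bar lengths when the value axis starts at axis_min."""
    if axis_min >= min(score_a, score_b):
        raise ValueError("axis_min must be below both scores")
    return (score_a - axis_min) / (score_b - axis_min)


# ASSUMPTION: invented scores that differ by well under 1 percent.
FAST, SLOW = 1510.0, 1500.0

honest = apparent_ratio(FAST, SLOW)                    # axis starts at zero
warped = apparent_ratio(FAST, SLOW, axis_min=1490.0)   # axis starts just below

if __name__ == "__main__":
    print(f"Axis from 0:    bar A is {honest:.3f}x the length of bar B")
    print(f"Axis from 1490: bar A is {warped:.3f}x the length of bar B")
```

With the axis starting at zero the two bars are practically the same length. Start the axis at 1490 and the same scores produce one bar twice as long as the other.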
This (below) is actually what you should expect from benchmarks (and games) that are bottlenecked solely by the GPU. There is virtually no difference between the Core i7-3960X and the lowly Phenom II X6 1100T chip. That's despite us using a $500 GPU—a GeForce GTX 580—for our tests. Would it make a difference if we ran three GTX 580 cards in tri-SLI? Unlikely. Today's games are still highly reliant on the graphics card rather than the CPU. That's not true in all cases though. Many games are finally starting to use enough threads to require a quad-core chip and a small number of games will actually use six cores or more to enhance your game play experience. So why this chart? We just want to emphasize to those who are the 100 percent gamers that your best investment is the fastest GPU you can get. Frankly, we don't think there's such a thing as a 100 percent gamer because video encoding, transcoding, image editing, and other chores that rely heavily on a fast CPU are performed by all enthusiasts.
Valve's particle benchmark dates back to the first quad-core chips. We still use it as a gauge of CPU performance on particle effects in gaming, though. The benchmark doesn't really scale beyond four cores, so we've been surprised by the performance of the six-core Core i7-990X. We've always attributed that to either the large L3 cache or the additional memory bandwidth the chip has over dual-channel configurations. The performance of the Core i7-3960X may back that up, as it has even more memory bandwidth and a huge amount of L3 cache. But then we get to the FX-8150, which has more cache than the Phenom II X6. Maybe it's time to call Scooby and Shaggy and have them truck over here in the Mystery Machine so they can find Old Man Withers hiding down in the silicon.