Study: More Cores Not Always Better
For a long time, both Intel and AMD relied on ever-increasing clock speeds for each new processor release. That remains the case today, but to a much lesser degree. Case in point: Intel's long-retired Northwood line topped out at 3.4GHz, 200MHz faster than the zippiest Core i7 processor currently on the market.
Chip design has shifted so that multiple cores are now the main factor, supported by larger caches, on-die memory controllers, expanded instruction sets, and other secondary improvements. It's all well and good that AMD and Intel are on the same page, which puts the onus on software developers to catch up, but at least one group of researchers believes we're headed for an unpleasant surprise.
According to Sandia National Laboratories, performance is going to start tapering off significantly as chip makers keep piling on more cores. Sandia came to that conclusion by running a simulation consisting of "key algorithms for deriving knowledge from large data sets." In the simulation, moving from four to eight cores yielded few performance gains. But the real kicker is that going beyond eight cores resulted in a performance drop. By the time the count reaches 16 cores, Sandia warns, "a steep decline is registered as more cores are added."
So what the heck is going on that would cause multiple cores to stumble so unexpectedly? It comes down to a bottleneck in memory bandwidth. And what's scary is that while the bottleneck is no secret, "it isn't an issue to which the industry has a known solution, and the problem is often ignored."
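A back-of-the-envelope way to see why shared memory bandwidth can cap multicore scaling is sketched below. This is not Sandia's simulation: the function name and the bandwidth figures are made-up assumptions, and the toy model only reproduces the plateau, not the outright decline Sandia reports beyond eight cores.

```python
def bandwidth_limited_speedup(cores, demand_per_core_gbs, total_bandwidth_gbs):
    """Speedup over one core when all cores share one memory bus.

    Hypothetical toy model: each core needs demand_per_core_gbs of memory
    traffic to run at full speed, and the bus delivers at most
    total_bandwidth_gbs no matter how many cores are asking.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Number of cores the bus can keep fully fed.
    fed_cores = total_bandwidth_gbs / demand_per_core_gbs
    # Throughput measured in "full-speed cores" for one core and for many.
    one_core = min(1.0, fed_cores)
    many_cores = min(float(cores), fed_cores)
    return many_cores / one_core


if __name__ == "__main__":
    # Illustrative numbers only: 4 GB/s per core against a 16 GB/s bus.
    for n in (1, 2, 4, 8, 16):
        print(n, "cores ->", bandwidth_limited_speedup(n, 4.0, 16.0), "x")
```

With these assumed figures the speedup stops growing at four cores, because the bus is already saturated; extra cores just wait on memory.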
We're still quite a ways off from 16-core processors making it into the mainstream, and developers have yet to fully tap into even dual-core processors on a consistent basis. So while there's plenty of time to come up with a solution, chip makers haven't yet started doing so, according to Sandia.

Image Credit: TechReport.com
djdandan
January 19, 2009 at 5:07am
By the time 16 cores are about to be standard we will be using DDR4 with quad channel or better so who cares about your pessimistic predictions?? Sheesh, some people just like to stir. Please at least get somewhat realistic with your predictions!
I Jedi
January 16, 2009 at 10:32am
There was a similar article a month or two ago on MaximumPC that described this exact problem, too. They'll come up with something eventually. Although, I can't imagine why someone would want or need 16 cores in one processor... besides the ability to run huge math calculations or server-based stuff.
StahnAileron
January 17, 2009 at 9:10pm
Just give it time. As things go, we'll eventually see consumer-level software that NEEDS the processing power. Just wait until the day when everyone is running an OS that today would be considered server grade, but would be just a run-of-the-mill OS at that point. (Though in theory you can say that about Linux now to some extent...But its market penetration is still a bit low, no? At least at the consumer level...Not sure about the enterprise level.)
And there's always PC Gaming to help drive the tech advancements for hardware anyway.
It was only a couple decades ago that, "640k should be enough for anyone." Look where we are now...^_~
StahnAileron
January 15, 2009 at 11:25pm
Most supercomputers have hundreds, if not thousands, of cores, yes...But they're also composed of dozens or hundreds of NODES, each of which can contain several individual computers with their own dedicated systems. An HPC system may have the aggregate equivalent of several terabytes/sec of memory bandwidth, but it's NOT a collective bandwidth ALL cores can access. Each individual core/CPU has only a small fraction of the aggregate bandwidth available to it.
Saying an HPC system has a lot of memory bandwidth is like saying it has a very high clock speed because you added all the individual core speeds together. (2 x 4GHz cores does NOT equal the equivalent of an 8GHz single core.)
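To put rough numbers on the aggregate-versus-per-core distinction, here is a small sketch. The node count, per-node bandwidth, and cores per node are hypothetical figures chosen only for illustration, and the even split between cores is a simplifying assumption.

```python
def aggregate_bandwidth_gbs(nodes, node_bandwidth_gbs):
    """Headline figure: every node's memory bandwidth added together."""
    return nodes * node_bandwidth_gbs


def per_core_bandwidth_gbs(node_bandwidth_gbs, cores_per_node):
    """What one core can actually draw on, assuming an even split within
    its own node (a core cannot use another node's memory bus)."""
    return node_bandwidth_gbs / cores_per_node


if __name__ == "__main__":
    # Hypothetical cluster: 200 nodes, 20 GB/s per node, 8 cores per node.
    print("aggregate:", aggregate_bandwidth_gbs(200, 20.0), "GB/s")
    print("per core: ", per_core_bandwidth_gbs(20.0, 8), "GB/s")
```

In this made-up cluster the aggregate works out to 4,000 GB/s, while each core sees only 2.5 GB/s.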
Cache is based on SRAM (far as I recall. Correct me if I'm wrong anywhere. Always up for learning more), and that's both expensive as hell and inefficient in terms of silicon use. (DRAM is preferred for high-density RAM use because of how much denser you can pack it together compared to SRAM...Assuming similar fab size, i.e. 65nm.)
I don't see cache-type RAM being used as main system RAM any time soon...Not unless cost and density have been addressed. (Current mainstream SRAM is 6T, no? I don't recall if lower T-count SRAM designs have been implemented in practice yet...)
And with SRAM, there's the problem of physical distance, especially at high frequencies. Signals can only travel so far in the fraction of a billionth of a second that one clock cycle of a modern CPU lasts. SRAM can run pretty quick, but once you move it a good way from the CPU, you negate its speed advantage because the signal starts to lag. Then your price vs performance ratio would suffer, I'd think.
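The distance limit is easy to estimate. The sketch below computes how far a signal can travel in one clock cycle; the function name is hypothetical, and the `velocity_factor` for real wiring is an assumed parameter (1.0 is the speed of light in a vacuum, the upper bound).

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def distance_per_cycle_cm(clock_hz, velocity_factor=1.0):
    """Centimetres a signal covers in one clock cycle.

    velocity_factor is the assumed fraction of light speed at which the
    signal propagates; real traces are slower than a vacuum.
    """
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    metres = SPEED_OF_LIGHT_M_PER_S * velocity_factor / clock_hz
    return metres * 100


if __name__ == "__main__":
    # One-way distance per cycle at 3 GHz, vacuum upper bound.
    print(round(distance_per_cycle_cm(3e9), 2), "cm")
    # Same clock with an assumed velocity factor of 0.5.
    print(round(distance_per_cycle_cm(3e9, 0.5), 2), "cm")
```

At 3GHz the upper bound is roughly 10 cm per cycle one way, and a request plus its reply has to cover the distance twice.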
One of the problems I see is the overhead needed to orchestrate the distribution of data on massively multicore systems. I mean, if the OS can't take advantage of all the cores available to it and/or efficiently dole out work to the cores, what's the point?
Of course there's the software application side as well...(It'll be the day when something like Notepad can make use of multiple cores, LoL)
Maybe they can start designing consumer systems with dedicated Memory controllers for EACH core or something...Maybe like how the CPU cache designs are used, but at the system RAM level using DRAM. (L1 RAM with dedicated connection to a core, Shared L2 RAM for full system use.)
It'll be interesting to see how they solve the memory bandwidth problem though. They got past the performance versus core speed/thermal issue by going to multi-core...Let's see where they go with RAM.
mikemckay
January 15, 2009 at 10:05pm
i just have to throw this out there.....haven't we seen this all before?
i mean now you can get single video cards with many hundreds of cores (or stream processors...whatever) and more than 100GB/s bandwidth to the memory and THEY still seem to scale well......are we supposed to assume that system memory will never advance past triple channel ddr3?
those tests just seem like they might be a little unrealistic
devin3627
January 15, 2009 at 3:06pm
Is that talking about the motherboard/DDR3 FSB? What if you had 3000MHz RAM...? There's gotta be an instruction code around this that uses L3 cache, externally?
AndyYankee17
January 15, 2009 at 12:46pm
Depends heavily on what you're trying to do. Realtime things like games are hitting a block with multicores, but non-realtime things like encoding/decoding, rendering, and compiling really fly with multiple cores.
decapitor
January 15, 2009 at 12:25pm
That's a pretty vague argument, especially since clusters and supercomputers have been rocking thousands of cores for quite some time now and helping the scientific community crunch data at incredible speeds. I just xgridded five 8-core Mac Pros and my MPI code is significantly faster than on one machine. I mean, of course, no matter how well optimized your code is, you are going to have diminishing returns as you add cores, but the performance always increases. Just put me in the "doubters" bin for now.
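The "diminishing returns, but always increasing" picture matches Amdahl's law, sketched below. The 95% parallel fraction is an assumed example value, and the formula deliberately ignores memory-bandwidth contention and communication costs, which is exactly where Sandia's result departs from it.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n).

    parallel_fraction (p) is the share of the work that can be spread
    across cores; the rest runs serially no matter how many cores exist.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed workload that is 95% parallel; 40 cores is five 8-core machines.
    for n in (1, 8, 16, 40):
        print(n, "cores ->", round(amdahl_speedup(0.95, n), 2), "x")
```

Under this model the speedup keeps rising with every added core but can never exceed 1 / (1 - p), which is 20x for a 95% parallel workload.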
QUINTIX256
January 15, 2009 at 12:21pm
'report: roadblock ahead for multicore processors'
Also, "key algorithms for deriving knowledge from large data sets" sounds to me like a lot of cache misses. If the CPU is spending a whole lot more time waiting for the cache to be swapped out than actually crunching data, then yes, it is much more a measure of memory bandwidth than a measure of processor performance.
You can have your recession. I'm not participating. (vs The economy is suffering, let's starve it!)
patrickmaher
January 15, 2009 at 12:30pm
Was thinking the same thing.
http://www.maximumpc.com/article/news/report_roadblock_ahead_multicore_processors
















