AMD’s Radeon HD 7970: 4.3 Billion Transistors of Pure Performance
AMD moves its high-end GPU family to 28nm, delivering stunning performance and impressive efficiency
We knew this was coming. We saw all the signs: The rumors. The price drops on existing videocards. The tweaked versions of old standbys masquerading as “new” GPUs. But more than anything, it’s been too long since we’ve had something fresh to sink our teeth into. And as has been the case in each of the last several big product launches, AMD is serving the first course.
Eric Demers, CTO of AMD’s Graphics Division, began talking about the company’s Graphics Core Next (GCN) earlier this summer. He described a new GPU architecture that would take graphics to the next level. He promised a GPU-compute monster that would remain highly scalable, so versions could be built into future generations of AMD APUs. The first iteration of Graphics Core Next comes in the form of the Radeon HD 7970, and it marks a substantial architectural shift for Radeon graphics. We’ll examine the overall architecture first, and then we’ll dive into the hardware specifics of the Radeon HD 7970.

Goodbye VLIW
Previous AMD GPU generations used very long instruction words (VLIW), a way of tightly packing multiple GPU instructions in order to move them around the GPU and memory efficiently. VLIW went through a couple of tweaks, including a change to a four-word VLIW scheme from a five-word scheme. VLIW was well tuned for the modern generation of programmable graphics, but it wasn’t so hot for GPU compute.
With AMD betting the farm on Fusion, which inherently takes advantage of a GPU’s parallel-compute capability, the company needed a more flexible architecture. So AMD discarded VLIW in favor of something the company calls GCN Quad SIMD (single instruction, multiple data). Instead of a single VLIW instruction plus four math operations for the ALU (arithmetic logic unit), the GPU uses four SIMDs and a single ALU operation. The four SIMDs can do the same work as a single VLIW, but they can also act independently when needed.
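A toy model can make the difference concrete. The sketch below is not AMD's scheduler; it is a simplified illustration that assumes a VLIW bundle can only pack independent operations from one thread, while each of the four SIMDs issues from its own wavefront. The function names and the sample workloads are hypothetical.

```python
def vliw_utilization(independent_ops_per_step, slots=4):
    """Fraction of VLIW slots filled, assuming each bundle can only pack
    operations from a single thread that do not depend on each other."""
    filled = sum(min(n, slots) for n in independent_ops_per_step)
    return filled / (slots * len(independent_ops_per_step))


def quad_simd_utilization(ready_wavefronts, simds=4):
    """Fraction of SIMDs kept busy, assuming each SIMD issues one
    operation per step from its own wavefront."""
    return min(ready_wavefronts, simds) / simds


# Hypothetical workloads: a graphics shader with plenty of independent
# math per step, and a compute kernel dominated by dependent operations.
graphics_shader = [4, 4, 4, 4]
compute_kernel = [1, 1, 2, 1]

print(vliw_utilization(graphics_shader))  # all four slots filled every step
print(vliw_utilization(compute_kernel))   # most slots sit idle
print(quad_simd_utilization(4))           # four wavefronts keep four SIMDs busy
```

In this model the VLIW design only does well when a single thread offers four independent operations at a time, which is common in graphics and less so in compute code. The SIMD design stays busy as long as enough wavefronts are ready.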
GCN marks a major shift in how AMD GPUs operate, behaving more like a general-purpose vector processor than a pure graphics engine. What’s more, each basic building block, called a GCN Compute Unit, includes a scalar coprocessor that can behave like a traditional—but non-pipelined—CPU. AMD has beefed up the caches that are distributed throughout the GPU. Each GCN core (yes, AMD is calling them cores) has its own dedicated L1 read/write cache. Each group of four cores shares a 32KB instruction cache and a 16KB scalar data cache. All the cores communicate over a shared bus to a partitioned L2 cache that can be sized differently depending on the graphics card and particular GPU die.
AMD intends for GCN to serve as the basis for several product families. The first product, code-named Tahiti, is aimed at gaming enthusiasts who want maximum frame rates while enabling maximum eye candy. The next product, code-named Pitcairn, will supersede the Radeon HD 6800 series. Pitcairn will be followed by a series code-named Cape Verde, which AMD believes will redefine the segment now held by products such as the Radeon HD 6700 series.
Code-name Tahiti
AMD took advantage of TSMC’s new 28nm manufacturing process to build its new high-end GPU. The Radeon HD 7970 sports 4.3 billion transistors in a surprisingly small 365mm² die. AMD product marketing manager Devon Nekechuk tells us AMD’s 28nm yields have been both “good” and “predictable.”
Tahiti is assembled from 32 GCN compute units, which translates to 2,048 stream processors, each of which is based on AMD’s new SIMD-plus-scalar architecture. The existing Radeon HD 6970, by contrast, is equipped with just 1,536 stream processors and doesn’t benefit from the new architecture. The 7970 includes 768KB of L2 cache and eight render back-ends capable of pushing 32 color ROPs per clock and 128 Z/stencil ROPs per clock cycle. The existing 6970 provides the same quantity of render back-ends, but the newer card boasts higher throughput and much-improved efficiency. The 7970 also features a 384-bit interface to 3GB of GDDR5 memory and a PCIe 3.0 interface. Peak memory bandwidth is 264GB/s.
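The figures in this paragraph can be sanity-checked with a few lines of arithmetic. The sketch below uses the numbers quoted above plus one assumption the article does not state: an effective GDDR5 data rate of 5.5Gbps per pin. Under that assumption, the 264GB/s figure works out as memory bandwidth over the 384-bit bus.

```python
# Figures quoted in the article
COMPUTE_UNITS = 32
STREAM_PROCESSORS_7970 = 2048
STREAM_PROCESSORS_6970 = 1536
BUS_WIDTH_BITS = 384

# Assumption (not stated in the article): effective GDDR5 data rate per pin
DATA_RATE_GBPS = 5.5


def stream_processors_per_cu(total, units):
    """Stream processors in each compute unit."""
    return total // units


def memory_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bits per transfer times rate, over 8."""
    return bus_width_bits * data_rate_gbps / 8


def percent_increase(new, old):
    """Relative increase of new over old, in percent."""
    return (new - old) / old * 100


print(stream_processors_per_cu(STREAM_PROCESSORS_7970, COMPUTE_UNITS))
print(memory_bandwidth_gb_s(BUS_WIDTH_BITS, DATA_RATE_GBPS))
print(round(percent_increase(STREAM_PROCESSORS_7970, STREAM_PROCESSORS_6970), 1))
```

The same arithmetic shows that the jump from 1,536 to 2,048 stream processors is an increase of roughly one third.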
Tahiti also implements a feature known as partially resident textures. Local graphics memory is used as a kind of big cache for texture data, and very large textures can be streamed in on demand. This improves performance in game engines that use features such as virtual texturing or mega-textures: Texture sizes can be as large as 32TB (yes, terabytes).
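The idea of treating local memory as a cache for a much larger texture can be sketched in software. The example below is only an illustration: the `TileCache` class, the tile coordinates, and the least-recently-used eviction policy are assumptions made for the sketch, not a description of how Tahiti's hardware decides what stays resident.

```python
from collections import OrderedDict


class TileCache:
    """Toy stand-in for local graphics memory holding texture tiles.

    Eviction is least-recently-used, which is an assumption of this sketch.
    """

    def __init__(self, capacity_tiles):
        self.capacity = capacity_tiles
        self.tiles = OrderedDict()  # tile_id -> tile data, oldest first
        self.misses = 0

    def fetch(self, tile_id, load_tile):
        """Return a tile, streaming it in through load_tile on a miss."""
        if tile_id in self.tiles:
            self.tiles.move_to_end(tile_id)  # mark as most recently used
            return self.tiles[tile_id]
        self.misses += 1
        data = load_tile(tile_id)
        self.tiles[tile_id] = data
        if len(self.tiles) > self.capacity:
            self.tiles.popitem(last=False)  # drop the least recently used tile
        return data


def load_from_disk(tile_id):
    """Hypothetical loader; a real engine would read from storage here."""
    return f"texels for tile {tile_id}"


cache = TileCache(capacity_tiles=2)
for tile in [(0, 0), (0, 1), (0, 0), (1, 1), (0, 1)]:
    cache.fetch(tile, load_from_disk)

print(cache.misses)       # number of tiles that had to be streamed in
print(list(cache.tiles))  # tiles currently resident
```

Only the tiles a frame actually touches need to be resident, which is why the full texture can be far larger than the card's 3GB of memory.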
The Radeon HD 6970 is oft criticized for its weak tessellation performance, especially when compared to Nvidia’s GeForce GTX 580 series. AMD has beefed up the GCN’s tessellator by improving the reuse of vertices, improving its off-chip buffering performance, and providing larger parameter caches. AMD predicts overall tessellation performance will be as much as 4x better than the 6970, depending on the application.
On the compute side, Tahiti uses dual asynchronous compute engines, which can independently schedule and dispatch work to improve multitasking. The compute engines can work in parallel with the graphics command processor, and AMD reports that context switching is “fast.” The GPU also features dual built-in DMA engines, and AMD suggests the chip can saturate a PCIe 3.0 x16 bus when running compute chores.
Floating-point performance is fully IEEE compliant, and the 7970 is capable of pumping out up to 947 double-precision gigaflops. It is the first GPU to support OpenCL 1.2, DirectCompute 1.1 and C++ AMP in hardware.
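The 947-gigaflop figure can be reproduced from the stream processor count under three assumptions that the article does not spell out: a 925MHz core clock, a fused multiply-add counted as two floating-point operations per clock, and double precision running at one quarter of the single-precision rate. A rough sketch:

```python
STREAM_PROCESSORS = 2048  # from the article

# Assumptions (not stated in the article)
CORE_CLOCK_GHZ = 0.925      # reference core clock
FLOPS_PER_CLOCK = 2         # one fused multiply-add counted as two operations
DOUBLE_PRECISION_RATE = 0.25  # double precision at 1/4 the single-precision rate


def peak_gflops(stream_processors, clock_ghz, flops_per_clock, rate=1.0):
    """Theoretical peak throughput in gigaflops."""
    return stream_processors * clock_ghz * flops_per_clock * rate


single = peak_gflops(STREAM_PROCESSORS, CORE_CLOCK_GHZ, FLOPS_PER_CLOCK)
double = peak_gflops(STREAM_PROCESSORS, CORE_CLOCK_GHZ, FLOPS_PER_CLOCK,
                     DOUBLE_PRECISION_RATE)

print(round(single))  # single-precision peak, in gigaflops
print(round(double))  # double-precision peak, in gigaflops
```

These are theoretical peaks. Real workloads rarely keep every stream processor issuing a multiply-add on every clock.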
Video processing has also been improved. Given the right application, Tahiti can evaluate 7.6 terapixels per second (peak), and it can transcode 1080p video faster than real time.
blackdog
December 30, 2011 at 11:28am
Check this website by AMD and see what it says the estimated value of the card is; look for the Jan. 3rd prize: https://docs.google.com/document/pub?id=1khomiLMw0uaTt1Hv7fJFAl0Wf96XgqJNvJKF5Xmktsc
Like this:
AMD Radeon 7970 GPU
Estimated Value: $329
EthicSlave
December 30, 2011 at 9:47am
I think a PCI-e spec upgrade will become a large factor for those dealing with multi monitor setups.
To date this has been the only way to moderately saturate the current pci-e spec.
Having trouble with super-high resolutions/fps? Upgrade to PCIe 3.0 with enough lane bandwidth to support your current multi-monitor setup, and make sure each lane has enough bandwidth too.
BLACKCELL
December 28, 2011 at 8:13am
None of these companies are building their hardware based on what we need; it's always a guess. I'm just saying, wtf am I going to actually do with 3GB of GDDR5 memory? lol. I don't need to be all maxed out to the point that I blow a fuse in the house. It's all pointless. You know there is such a thing as cloud gaming, with systems processing all this data and speed that we will just be able to stream to a monitor, mouse, and keyboard. Soon it won't matter who the top dog is, AMD or INTEL, or what video card will dish out spinning stars that pop off a supernova's ass cheeks! Pretty soon it will all just come down to internet speed and how big a screen you want to play it on. THINK, MMPC USERS. Fanboying any one company is just stupid as well. Hell, I like AMD and Intel, I use them both, and so will my kids. The point is, how can you call yourself a true IT guy/girl if you just stick to one type of IT tree? That's like saying I only eat apples, while knowing I secretly also love grapes, but I won't eat them because they're not the apples I grew up on. Grow up and love it all, because soon it will just be another thing that we remind our kids about: "Kids, back in the day we used to enjoy records and tapes." "Dad, that's old, now we have mp3s and streaming movies and HD2000dpi and 4D screens."
But in the end I still love it all, so stop harping on each other and talk about hacking it or making that thing run naked on a bucket of ice water!!!!
ashinms
December 31, 2011 at 2:53pm
Brand affinity is like religion. Logically, it makes no sense and causes people to make counterintuitive and downright wrong decisions, but there is a natural drive in all of us to "find our camp," so to say; to have an enemy and an ally. Fanboyism also promotes competition. Anyone from the outside will almost always choose Sandy Bridge over Bulldozer. Why? Because it's what the sales guy told them to get, plain and simple. It makes more sense, so it's what he'd suggest, but I'm building my next computer around Bulldozer. Why? Because I like AMD, plain and simple. Would Sandy Bridge make more sense? Probably, but I just don't think the twenty extra dollars I'd save would be worth jumping ship for.
ashinms
December 24, 2011 at 4:34am
This thing is gonna be a compute BEAST when put in with trinity.
JohnP
December 23, 2011 at 4:43pm
I dunno. I broke out the calculator and found that if I bumped the speed of the NVidia 580 up by the difference in clock speeds, the NVidia 580 matches the AMD 7970 benchmarks almost precisely. That means the new architecture from AMD MAKES NO DIFFERENCE to the card at all except for being a bit less power hungry. Now the power has everything to do with the shrink in die size, so that too is a wash once NVidia moves to a smaller die size.
Bottom Line: AMD's newest family of videocards is only (once again) matching what NVidia has had out in production for 2 years now. Man, first Bulldozer, now Radeon 7970. Both Intel and NVidia have no real reason to bump up their production schedule of Ivy Bridge and Kepler if AMD continues on this way. I despair for AMD. They just HAVE TO DO BETTER and by a lot or they will continue to lose market share.
MastaGuy
December 27, 2011 at 9:22am
But what happens if you overclock the 7970?
Then it will be more powerful than a GTX 580.
loyd
December 25, 2011 at 11:17am
Unfortunately, Nvidia can't match the clock speeds with the current Fermi. And we haven't heard a whisper about Kepler, though it's always possible that Nvidia is sandbagging until they see what AMD is doing. So AMD has the lead for the time being. And that 3W deep idle is something to think about, too.
Remember also that these are beta drivers on a brand new architecture, so we'll see some growth in future driver releases. Finally, AMD suggests that the reference 7970s have enough headroom to hit 1GHz, depending on chip yields and custom coolers.
JohnP
December 25, 2011 at 1:50pm
Loyd, the power consumption decrease with the Radeon 7970 is mainly due to the die shrink to 28nm. NVidia is planning on a die shrink of their existing Fermi architecture before Kepler is released:
http://news.softpedia.com/news/Nvidia-Kepler-Is-On-Track-Samples-Arrived-In-House-240284.shtml
Another effect of the die shrink is that clock speed usually increases as there is less heat created at the lower voltage needed with a smaller transistor.
The third change that is not revolutionary is the bump of the 7970's memory bus to 384 bits (matching the 580) from the 6970's 256 bits, along with 3GB of GDDR5 memory vs the GTX 580's 1.5GB and the 6970's 2GB.
The final non-revolutionary change is bumping the number of stream processors by 33%, from 1,536 to 2,048.
Again, breaking out my calculator, the 33% bump in the number of stream processors ALONE accounts for the benchmark differences between the 7970 and the 6970.
The higher benchmark, however, does not show ANY OTHER large speed bumps that SHOULD HAVE OCCURRED due to the increase in the memory bus size, the higher amount of memory, compute performance, texture fill rate, or finally the NEW ARCHITECTURE.
If I add up all the increases in the technology, I would have expected benchmarks in excess of 50-60% over the previous generation. Perhaps I am naive in how much to expect but, hell, a doubling of transistor count should have produced a lot more than a 35% increase. Add the new architecture, smaller die size, and more memory and I am underwhelmed.
Yeah, I have seen some remarkable changes in certain benchmarks with driver updates, but expecting more than 10% is pretty unrealistic.
I would love to see AMD successful, but there is not the gee whiz that I was expecting. I may be disappointed with NVidia's Kepler also, but having AMD being 1st out the door should be a dazzler, not an evolution. AMD needs a clear winner here and I just don't see it.
ashinms
December 24, 2011 at 4:41am
How many times does it need to be said that these are beta drivers and thus not performance optimized? That, and the goal wasn't exactly a massive FPS increase in the first place. It says in the article that they were concentrating on improving tessellation and GPGPU performance, which is what this GPU delivers.
lordfirefox
December 23, 2011 at 11:01pm
I'd like to know how you were able to benchmark an AMD card that doesn't exist yet in the wild. And synthetic benchmarks are a joke when compared to real-world usage data. So wherever you're getting your numbers baffles me.
Is an end user going to despair over missing .001 or .002 FPS just because he uses an AMD GPU over an nVidia one? Probably not. Because as long as the games run smoothly and play well they're not going to care.
I've been using ATI/AMD for years and I've never had any problems with their hardware. I can say the same for nVidia too but I like saving money and if that means I can get by with AMD with great performance then I'm going to continue to buy AMD.
Oh and I love how people, particularly young gamers, have this fantasy of nVidia and Intel teaming up especially when the two companies oppose each other on just about every front. In other words it's not going to happen.
JohnP
December 24, 2011 at 12:51am
Uhh, I used the data IN THIS ARTICLE called BENCHMARKS. I would assume that you missed this...
I have nothing against AMD and ATI. I have 4 computers in my family and two of them have ATI Radeon graphics cards in them. None have AMD uP chips any longer as they are not performance chips no matter how low they price them.
For my HTPC, I do not care about raw speed in graphics as much as the capabilities of the card and have an ATI card in it. For my high end gaming machines, I surely do care about performance, esp if I move up to that 30" monitor that I am looking at from the two machines with 27" monitors. The capabilities of running multiple monitors for high end games is also in my future.
Looking at the beta BENCHMARKS of the 7970 (even though they are rough and ready ones), I just don't see the WOW factor in AMD's latest card. The only things it seems to have going for it are the high transistor count and faster clock speeds. Not only should it have trounced the 580 in ALL of the benchmarks, it should have done so by a hell of a lot more than just the bump in clock speed (which it DOES NOT). There is no compelling reason to upgrade to an AMD Radeon 7970 from an NVidia 580 and that is a shame.
This is ATI's LATEST GRAPHICS CARD using their NEXT GEN ARCHITECTURE. As such, the card should have at least a 20-30% edge on the current gen cards if it is to succeed for the next few years in stomping the competition (hell, even keeping up). If it is a meh card priced about the same as the competition or only slightly faster than the opposition's current gen card (as this is), ATI is going to be very vulnerable to ANY speed bump of NVidia next gen cards coming out in the next couple of months.
AMD/ATI has to be at the top of their game NOW in order to give credible evidence to their stockholders and buyers that they have a viable and strong future. They are losing market share, they have consistently not made their forecast numbers, and they seem to have boosted their profits only by decreasing their head count and selling off assets. This is a dangerous slope for them that may not turn out well.
Dang, IMHO this is a well reasoned and well written critique, Heh.
warptek2010
January 08, 2012 at 10:22pm
Sorry, John. A very unscientific way of going about your numbers, I have to say. For example, using the math alone you left scaling issues out of the equation entirely. We're dealing with a completely new architecture that does away with something that might otherwise be a bottleneck... namely VLIW, along with other enhancements. To test for geometric scaling increases you MUST have optimized drivers and optimized benchmarks that take advantage of the new architecture. After all, we're talking here about video cards, right? At its most basic and fundamental function, i.e. displaying a 2D image, ANY modern video card can do that job, from the lowest $35.00 bargain basement card to the currently most expensive.
What sets these higher end cards apart is their ability to render 3d images, as well as other higher functions like multiple monitors etc....
Do the REAL world test first, not on paper.
lordfirefox
December 23, 2011 at 3:13pm
Gonna wait on AMD still... Their cards have been too amazing for me to want to switch over now.
In other words it's all subjective opinion.
ashinms
December 23, 2011 at 6:33am
I was gonna get a third 5770.... but I may just wait and try out one of these bad boys. Still a dilemma, though... My entire planned upgrade (FX8120, AM3+ motherboard, a whole new RAM setup, a third 5770, plus a new hard drive) would still cost less than one of these cards...
chipwatcher
December 22, 2011 at 7:11pm
Now that a consumer product (potentially a high-volume production item) is being introduced that saturates the PCIe 3.0 x16 bus, it might make sense to introduce a working PCIe x32 bus on high-end motherboards. The specifications for this have been in the PCIe specs since PCIe 1.0, yet to my knowledge no one has built such an interface. With SSD companies like FIO and OCZ producing products that are approaching the PCIe 3.0 x16 bus limits, as well as the need for higher bus throughput for InfiniBand, 40Gb and 100Gb Ethernet, and PCIe switches, a quick (6 to 12 months) means to produce working hardware, including cabling, to double the throughput of a PCIe 3.0 x16 bus sounds intriguing.
Considering that working PCIe 4.0 buses are at least 2 and probably 3 years away from seeing limited production, a short-term solution to the need for additional throughput should be seriously looked at. Of course, if an x32 bus is introduced using the PCIe 3.0 specifications, there should be no overriding reason that the upcoming 4.0 specification shouldn’t increase the x32 bus speed.
blkpanthr
December 28, 2011 at 7:04am
There aren't enough PCIe lanes on mainstream CPUs to bother...
I think a better solution would be dynamic lane allocation so it uses as many as needed up to the max available...
Keith E. Whisman
December 22, 2011 at 4:46pm
I wonder when will video cards require so much bandwidth that they take up two x16 slots using a daughter card and ribbon cable plugged into an adjacent x16 slot? That day will bring sli and crossfire to an end but I bet it will come eventually unless they devise another slot that mimics vesa local bus in its sheer size as VLB was huge.
Marthian
December 23, 2011 at 8:52am
Considering this card is PCI Express 3.0 compatible and PCIe 4.0 is already in the works, I don't really think so.
Dexter243
December 22, 2011 at 4:16pm
I was looking at a new 6790. I think I'll wait a tad longer and see what they have for $150 in a 7xxx card.
Can already play BF3 on high with my 5670 on a 22" with an Athlon II X4 at 3.3GHz.
ashinms
December 24, 2011 at 7:07am
That's what I love about BF3: Scalability. 5770, athlon II X4 @ 2.8, same settings except eyefinity, still playable framerates.
LatiosXT
December 22, 2011 at 9:23am
The real interesting battle will be for the midrange $200-$300 section. But let's just hope that Kepler doesn't become another Fermi.
The Corrupted One
December 22, 2011 at 10:11am
Yeah, but I hope the 6850's successor lives up to the mantle of having an insane cost to performance ratio
The Corrupted One
December 22, 2011 at 8:49am
I wonder how the architecture change will affect bitcoin mining?
The question is, can it run Cr..(gets shot)
ABouman
December 22, 2011 at 10:30am
*gets shot with an arrow to the knee (sorry, couldn't help myself)
chart2006
December 22, 2011 at 7:48am
I'm excited to see the final product and tweaked drivers. Makes me wonder what to expect with Nvidia as well. Most likely I'll eventually get the 7970 later next year over Nvidia just because of the price/power/performance ratio. I like how the long idle drops to 3 watts, which is very impressive.
FrancesTheMute
December 22, 2011 at 8:32am
They were using early beta drivers, probably doesn't support Eyefinity very well yet.
std error
December 22, 2011 at 11:24pm
I browsed around... tom's, hardocp and hardware heaven have eyefinity testing.
Pretty good performance for the card actually: 35.5 fps on ultra at 5760x1200 for BF3, with SSAO only though.
win7fanboi
December 22, 2011 at 6:59am
Nice perf/power consumption ratio. Can't believe Shogun 2 is this brutal on the GPUs. I have a 6850 and had no idea it has such a low frame rate.
praetor_alpha
December 22, 2011 at 5:14am
How long is the card? Looking at trends over the past few generations, I'd say it has to be at least 15 inches.
h e x e n
December 22, 2011 at 4:54am
Pretty impressive. It will be interesting to see how these cards scale after release.
The increase in performance is no more astounding than the last series change, but what's exciting to me is pushing the core clock over 1GHz... on air. It will be cool to see what kind of overhead the cards have out of the gate.
Gumby
December 21, 2011 at 11:37pm
Seriously impressive performance per watt. And this is on early drivers. As the manufacturing process improves for these chips, drivers improve for this GPU, and the card builders improve the cooler and overclocking, this card has great potential to improve even over what we see right now. It could be a real game-changer, and I am not just talking about your frames-per-second. I would love to see a 6GB version down the road. Also, I would like to see some Crossfire benchmarks when more cards and more mature drivers are available. Even better, I would love to see Nvidia's response. Competition is great. Too bad AMD does not compete with Intel as well as they compete with Nvidia.
austin43
December 21, 2011 at 9:31pm
Gonna wait on Nvidia still...Their cards have been too amazing for me to want to switch over now.