At the very end of May we saw NVIDIA’s first effort to expand Fermi beyond the $300 space with the GeForce GTX 465, a further cut-down GF100 core priced at launch at $279. Unfortunately for NVIDIA, it wasn’t even a lackluster launch – while GF100 performs quite well with most of its functional units enabled (i.e. GTX 480), disabling additional units isn’t doing the GPU any favors. Furthermore, disabling those units does little to temper the chip’s high power draw – something that’s only reasonable on the higher-end cards – resulting in a card that ate a lot of power while losing to AMD’s Radeon HD 5850.

In short, the GTX 465 is a lesson in how you can only cut down a GPU so far. NVIDIA went too far, and ended up with a part that had GTX 285 performance and GTX 470 power consumption.

Today NVIDIA is back in the saddle with something entirely new: GF104 and the GTX 460. The second member of the Fermi family is ready for its day in the sun, and in many ways it’s nothing like we expected. Designed from the start as a smaller chip than GF100, GF104 is the basis of the GTX 460 line of products which fix the GTX 465’s ills while delivering the GTX 465’s performance. It’s what the GTX 465 should have been, and it’s priced as low as $199. And as we’ll see, it’s the first NVIDIA card in a long time that we can give a glowing review for.

| | GTX 480 | GTX 465 | GTX 460 1GB | GTX 460 768MB | GTX 285 |
|---|---|---|---|---|---|
| Stream Processors | 480 | 352 | 336 | 336 | 240 |
| Texture Address / Filtering | 60/60 | 44/44 | 56/56 | 56/56 | 80/80 |
| ROPs | 48 | 32 | 32 | 24 | 32 |
| Core Clock | 700MHz | 607MHz | 675MHz | 675MHz | 648MHz |
| Shader Clock | 1401MHz | 1215MHz | 1350MHz | 1350MHz | 1476MHz |
| Memory Clock | 924MHz (3696MHz data rate) GDDR5 | 802MHz (3208MHz data rate) GDDR5 | 900MHz (3.6GHz data rate) GDDR5 | 900MHz (3.6GHz data rate) GDDR5 | 1242MHz (2484MHz data rate) GDDR3 |
| Memory Bus Width | 384-bit | 256-bit | 256-bit | 192-bit | 512-bit |
| Frame Buffer | 1.5GB | 1GB | 1GB | 768MB | 1GB |
| FP64 | 1/8 FP32 | 1/8 FP32 | 1/12 FP32 | 1/12 FP32 | 1/12 FP32 |
| Transistor Count | 3B | 3B | 1.95B | 1.95B | 1.4B |
| Manufacturing Process | TSMC 40nm | TSMC 40nm | TSMC 40nm | TSMC 40nm | TSMC 55nm |
| Price Point | $499 | $249 | $229 | $199 | N/A |

GF104, the heart of the GTX 460 series being launched today, is the first waterfall part of the Fermi family. As we saw with AMD’s Radeon HD 5000 series last year and NVIDIA’s GeForce 9000 series before that, NVIDIA is in the process of taking the base GF100 design and scaling it down into smaller, lower-performing GPUs suited to lower-priced video cards for the larger markets.

The final tally for GF104 is 1.95 billion transistors, occupying slightly more die space than AMD’s Cypress in the 5800 series. For comparison, this is about 200 million fewer transistors than AMD’s Cypress, or 550 million more than NVIDIA’s older GT200 GPU that powered the GeForce GTX 200 series. This makes GF104 the biggest GPU we’ve seen for the prices NVIDIA is targeting, a sign of the increasing pricing pressure between NVIDIA and AMD.

GF104, like GF100 before it, is not initially being shipped in a “full” configuration. The chip has 2 Graphics Processing Clusters (GPCs) containing 4 SMs each, for a total of 8 SMs adding up to 384 CUDA cores. The GeForce GTX 460 will be shipping with 1 of the 8 SMs disabled, leaving it with 336 enabled CUDA cores. NVIDIA tells us that the reason they’re shipping the first GF104 parts with a disabled SM is yields – they wouldn’t be able to meet the demand for cards if they only shipped cards with 8 functional SMs. Unlike GF100, outright poor yields don’t appear to be a huge factor here. Our impression from discussing the issue with NVIDIA is that GF104 is yielding around where it should be for a chip of its size, with NVIDIA choosing to forgo selling “full” chips at a higher price in order to sell more chips overall. In any case, it gives them some room for expansion in the future should they decide to release a “full” GF104-based product.
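The SM arithmetic above can be checked with a short sketch. The per-SM core count is not quoted directly in the text; it is derived here from the 384-core total spread across 8 SMs, and the constant and function names are purely illustrative.

```python
# Shader configuration of GF104 as described in the text.
GPCS = 2                 # Graphics Processing Clusters on the die
SMS_PER_GPC = 4          # SMs in each GPC
FULL_CUDA_CORES = 384    # CUDA cores on a fully enabled GF104

TOTAL_SMS = GPCS * SMS_PER_GPC                 # 8 SMs in total
CORES_PER_SM = FULL_CUDA_CORES // TOTAL_SMS    # derived value, not quoted in the text


def enabled_cuda_cores(disabled_sms: int) -> int:
    """Return the CUDA core count with the given number of SMs disabled."""
    if not 0 <= disabled_sms <= TOTAL_SMS:
        raise ValueError("disabled_sms must be between 0 and TOTAL_SMS")
    return (TOTAL_SMS - disabled_sms) * CORES_PER_SM


print(enabled_cuda_cores(0))  # "full" GF104
print(enabled_cuda_cores(1))  # GTX 460 as shipped, with 1 SM disabled
```

With one SM disabled the result matches the 336 CUDA cores quoted for the GTX 460.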

Perhaps the most surprising thing about GF104 is that it’s not a simple reduced version of GF100, as AMD did with the Evergreen series. Instead NVIDIA made some very significant changes to the design of their SMs for GF104, resulting in a waterfall product that’s undoubtedly Fermi but also notably different from GF100. There’s a lot to discuss here, so we’ll get more into this in a bit.

Moving on to the cards, NVIDIA is launching 2 cards today. At $229 there is the GeForce GTX 460 1GB, the closest thing we’ll see to a “full” GF104 part for the time being. The GTX 460 1GB has 7 of 8 SMs enabled along with all 32 ROPs, with a 256-bit memory bus connecting the GPU to 1GB of GDDR5. The card is clocked at 675MHz core, 1350MHz shader, and 900MHz (3.6GHz effective) memory. The TDP for this part is 160W, with an unofficial idle power draw in the 20W-30W range.

The other GeForce GTX 460 being launched today is the GeForce GTX 460 768MB at $199, a slightly further cut-down card. As NVIDIA’s ROPs are closely tied to their memory controllers, the only way to reduce the amount of memory on a card is to disable memory controllers along with the ROPs. As a result the GTX 460 768MB has less memory than the GTX 460 1GB, but also only 24 ROPs connected to a 192-bit memory bus. The shaders remain unchanged, giving the GTX 460 768MB the same compute/shading abilities as the GTX 460 1GB, but only 75% of the ROP capability and memory bandwidth. The clocks are unchanged from the GTX 460 1GB: 675MHz core, 1350MHz shader, and 900MHz (3.6GHz effective) memory.
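To make the 75% figure concrete, here is a rough sketch that compares the two cards using the usual peak-bandwidth formula (bus width in bytes multiplied by the effective data rate). The formula and the resulting GB/s numbers are theoretical peaks computed for illustration, not figures quoted by NVIDIA, and the function and variable names are made up for this example.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_mhz: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mhz * 1e6 / 1e9


DATA_RATE_MHZ = 3600  # 900MHz GDDR5, 3.6GHz effective on both cards

# ROP count and bus width for each card, taken from the text.
gtx460_1gb = {"rops": 32, "bus_width_bits": 256}
gtx460_768mb = {"rops": 24, "bus_width_bits": 192}

bw_1gb = peak_bandwidth_gb_s(gtx460_1gb["bus_width_bits"], DATA_RATE_MHZ)
bw_768mb = peak_bandwidth_gb_s(gtx460_768mb["bus_width_bits"], DATA_RATE_MHZ)

rop_ratio = gtx460_768mb["rops"] / gtx460_1gb["rops"]
bandwidth_ratio = bw_768mb / bw_1gb

print(f"GTX 460 1GB:   {bw_1gb:.1f} GB/s")
print(f"GTX 460 768MB: {bw_768mb:.1f} GB/s")
print(f"ROP ratio: {rop_ratio:.0%}, bandwidth ratio: {bandwidth_ratio:.0%}")
```

Because the memory clocks are identical, the bandwidth ratio reduces to the ratio of the bus widths, which is the same 3:4 ratio as the ROP counts.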

Given these differences, we’re a bit dumbfounded by the naming. With the differences in memory and the differences in the ROP count, the two GTX 460 cards are distinctly different. If NVIDIA changed the clockspeeds in the slightest, we’d have the reincarnation of the GTX 275 and GTX 260. NVIDIA’s position is that the cards are close enough that they should have the same name, but this isn’t something we agree with. One of these cards should have had a different model number – probably the 768MB card with something like the GTX 455. The 1GB card does not eclipse the 768MB card, but this is going to lead to a lot of buyer confusion. The best GTX 460 is not the $199 one.

Today’s launch will be a mixed bag in terms of availability. $199 has long been known to be a critical price point with buyers, which is what makes this card so important for NVIDIA as it allows them to finally tap that market once more. However, to get there they’re using their entire initial run of GF104 to build the 768MB versions of the GTX 460. There should be plenty of 768MB cards available for today’s launch, but the bulk of 1GB cards are roughly 2 weeks late (1 or 2 may show up early if the vendor does rush shipping). So what we have is a hard launch for the GTX 460 768MB, but a soft launch for the GTX 460 1GB. We’re not entirely thrilled with this – particularly as we believe the 1GB cards to be the better buy – but if nothing else it’s better than the GTX 480 launch.

Today’s launch will also result in an interesting mix of price points. NVIDIA has lowered the MSRPs on the GTX 470 and GTX 465, while AMD’s prices have been slowly drifting down over the last month too. As a result we end up with roughly the following:

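July 2010 Video Card MSRPs

| NVIDIA | Price | AMD |
|---|---|---|
| | $700 | Radeon HD 5970 |
| GeForce GTX 480 | $500 | |
| | $400 | Radeon HD 5870 |
| GeForce GTX 470 | $330 | |
| | $300 | Radeon HD 5850 |
| GeForce GTX 465 | $250 | |
| GeForce GTX 460 1GB | $230 | |
| GeForce GTX 460 768MB | $200 | Radeon HD 5830 |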

With these prices AMD and NVIDIA both have themselves comfortably stratified until you drop below $250. AMD doesn’t have anything between the 5850 and 5830, leaving a price gap of $80-$100. Meanwhile the 5830 is priced directly against the GTX 460 768MB. NVIDIA’s pricing will be taking advantage of this gap, while giving the 5830 a run for its money at $200.

93 Comments

  • Howard - Monday, July 12, 2010

    What?
  • Zok - Monday, July 12, 2010

    Excellent writeup! I really enjoyed you going into depth on the architectural changes. I couldn't agree more that it's superb to see NVIDIA get back into the efficiency game - whether it be performance/price or performance/watt (and, by extension, temperature). Here's to hoping that AMD was sitting on something to combat this!

    P.S. Small typo: For everything but the high-end, this year is a feature yet and not a performance year.
  • thekimbobjones - Monday, July 12, 2010

    Let the price war begin.
  • homerdog - Monday, July 12, 2010

    "Here we use the DX11 renderer and turn on self shadowing ambient occlusion (SSAO) to its highest setting, which uses a DX11 ComputeShader."

    I don't think that's what SSAO stands for. Sorry for the nitpick.
  • chizow - Monday, July 12, 2010

    Yeah I believe the proper term is Screen Space Ambient Occlusion but self shadowing is how its often explained to give an idea of what it is.
  • gentlearc - Monday, July 12, 2010

    The graphs shown are leaving out too many new derivatives of cards, making is good for contrasting results, but poor for consistent data comparison. Conveniently left out are many cards in one graph that are in another. I'm disappointed in your presentation and find you've concentrated too much on the presentation of your article.
  • Ryan Smith - Monday, July 12, 2010

    Out of curiosity, what's not in our graphs that you'd like to see? At 2560 we run a limited number of cards because most cards are too slow to post a passable framerate, otherwise at 1920 and 1680 we have the complete 5700/5800/5900 series, GTX 400 series, GTX 200 series, and Radeon 4800 series, along with a 3870 and 8800GT. Is there something else you would like?
  • SpaceRanger - Monday, July 12, 2010

    What I'd like to see is the ATI card that is in direct competition with this highlighted as well. Having to search for the 5830 or 5850 out of all those bars turned me off.
  • estaffer - Monday, July 12, 2010

    need some cheese with your whine?
  • SpaceRanger - Monday, July 12, 2010

    Sure.. a good Gruyère please...
