Around this time last year the air was full of rumors about nVidia GPUs. Did they have anything at all to challenge the Evergreen lineup that AMD had thrown down? We heard that TSMC [Taiwan Semiconductor Manufacturing Company] was having trouble making the giant GF100 die on its troublesome 40nm process. We saw a rush of articles about the mock-up card NV used at its GTC 2009 keynote, and so much more. Finally, in the opening weeks of 2010, we saw the GF100. It was fast, it was big, and it was also a power-hungry GPU that put out a ton of heat.
To put it simply, the GTX480 with its GF100 was not what nVidia wanted and not what we all expected. However, not long after we were slightly disappointed [but in many ways still impressed] by the performance of the GF100, nVidia dropped a revamped GPU on us: the GF104, which showed up in the fast and nimble GTX460. This inexpensive GPU also featured a streamlined design that helped reduce heat and power consumption while maintaining performance.
What’s New?
The new GPU under the hood of the GTX580 is something of a new design. While the Shader Model configuration of each shader processor found on the GTX580 is the same as we saw on the GTX480, there are some improvements to the way it works. One of the first is support for 16-bit floating point texture filtering. nVidia has also dropped in support for new tile formats that improve Z-cull efficiency. As a final 'improvement', the GeForce GTX580 simply has more shader, texture and CUDA cores and a much faster clock speed to boot.

nVidia GF110 die is hidden below this heatsink
For those of you planning on dropping a window AC unit into your PC case, I would hold off for now. nVidia did some tweaking to the overall design of the GPU, which has allowed them to drop in 512 CUDA cores [all 16 SM clusters] while pulling less power and generating less heat than its warm-blooded older brother, the GTX480. They did this [according to nVidia] by re-engineering the Fermi die at the transistor level.
Again according to nVidia [you would have to tear down a GPU to really know for certain], they have adjusted the type of transistor and are using low-leakage [read: somewhat slower] transistors where the instructions do not need full speed and, of course, faster ones where data throughput is more critical. This type of reconfiguration makes the GPU much more power efficient because it reduces the amount of power leakage across the board. Combined with an improved cooling system [although it looks the same on the surface], this means that you are also going to be generating less heat [we will talk more about this later].

But even these items are not all there is to the new GTX580. As you have heard us say before, AMD is great at bringing things to the market, but they often do not have the resources to pursue them. One of these items is DirectX 11. We have to give full credit to AMD for having the first full family of DX11 GPUs; they beat nVidia to the punch by a few months. However, AMD did not have the same kind of funds that nVidia has and has not been able to help developers offset the cost of game production in the way that nVidia does and has. As such, we are seeing nVidia pull ahead in DX11 implementation. In the GTX580 this shows up as a new distributed tessellation system. nVidia uses 16 PolyMorph Raster Engines in the GTX580 and pushes the workload out across all of them to improve not only performance but also efficiency. As one AAA lead engine designer told us:
"With DirectX 11 hardware, Tessellation is finally usable. When AMD, nVidia and Microsoft asked us what do we need to use Tessellation, we listed five mandatory features. It took AMD the same amount of time to make Tessellation unit usable as nVidia, which did not had Tessellation at all... and don't get me started on N-Patches."
Hawx2 Tessellation Wire frame image
The final specs of the GTX580 come out something like this:
GPU Clock Speed 772MHz
512 Cores at 1,544MHz
6 x 64-bit memory controllers [nVidia lists this as 384-bit]
1.5GB GDDR5 memory at 1002MHz QDR [4.008 billion transfers per second]
192.4GB/s memory bandwidth
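As a quick sanity check, the bandwidth figure can be reproduced from the other numbers in the list above. This is only a back-of-the-envelope sketch in Python: the function and constant names are our own, and it assumes "QDR" means four transfers per memory clock, which is what the bracketed 4.008 billion transfers per second implies.

```python
# Back-of-the-envelope check of the GTX580 memory bandwidth figure.
# All names here are illustrative; the input numbers come from the spec list.

MEMORY_CONTROLLERS = 6         # number of memory controllers
CONTROLLER_WIDTH_BITS = 64     # width of each controller
MEMORY_CLOCK_MHZ = 1002        # GDDR5 memory clock
TRANSFERS_PER_CLOCK = 4        # assumption: "QDR" = 4 transfers per clock


def bus_width_bits(controllers, width_bits):
    """Total memory bus width in bits."""
    return controllers * width_bits


def transfer_rate_gt_per_s(clock_mhz, transfers_per_clock):
    """Effective transfer rate in billions of transfers per second."""
    return clock_mhz * transfers_per_clock / 1000


def bandwidth_gb_per_s(bus_bits, rate_gt_per_s):
    """Peak bandwidth: bytes moved per transfer times transfers per second."""
    return bus_bits / 8 * rate_gt_per_s


bus_bits = bus_width_bits(MEMORY_CONTROLLERS, CONTROLLER_WIDTH_BITS)
rate = transfer_rate_gt_per_s(MEMORY_CLOCK_MHZ, TRANSFERS_PER_CLOCK)
bandwidth = bandwidth_gb_per_s(bus_bits, rate)

print(f"Bus width: {bus_bits}-bit")
print(f"Transfer rate: {rate:.3f} GT/s")
print(f"Peak bandwidth: {bandwidth:.1f} GB/s")
```

The six 64-bit controllers add up to the 384-bit bus nVidia quotes, and 48 bytes per transfer at 4.008 billion transfers per second lands on the listed figure after rounding. This is a theoretical peak, not what a game will actually see.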
The interesting bit about this design is that we received press releases from board vendors citing clock speeds as high as 850MHz for the GPU and 1.7GHz for those 512 cores, i.e. the same level of overclockability we saw on the smaller GF104 GPU in the GeForce GTX 460.
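To put those vendor numbers in perspective, here is a rough Python sketch of the headroom they imply over the reference clocks. The names are ours, and it treats the quoted 1.7GHz as exactly 1,700MHz, which is an assumption since the press releases only said "as high as".

```python
# Rough overclock headroom implied by the vendor press releases.
# Names are illustrative; 1.7GHz is assumed to mean exactly 1700MHz.

REFERENCE_CORE_MHZ = 772
REFERENCE_SHADER_MHZ = 1544
VENDOR_CORE_MHZ = 850
VENDOR_SHADER_MHZ = 1700


def overclock_percent(reference_mhz, overclocked_mhz):
    """Clock increase over the reference speed, as a percentage."""
    return (overclocked_mhz - reference_mhz) / reference_mhz * 100


core_gain = overclock_percent(REFERENCE_CORE_MHZ, VENDOR_CORE_MHZ)
shader_gain = overclock_percent(REFERENCE_SHADER_MHZ, VENDOR_SHADER_MHZ)

# The CUDA cores run at twice the GPU clock at both speeds.
reference_ratio = REFERENCE_SHADER_MHZ / REFERENCE_CORE_MHZ
vendor_ratio = VENDOR_SHADER_MHZ / VENDOR_CORE_MHZ

print(f"Core clock gain: {core_gain:.1f}%")
print(f"Shader clock gain: {shader_gain:.1f}%")
print(f"Shader-to-core ratio: {reference_ratio:.0f}x reference, {vendor_ratio:.0f}x vendor")
```

Both clocks move by the same roughly 10 percent because the 512 cores stay at double the GPU clock in the reference and the vendor figures alike.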
© 2009 - 2011 Bright Side Of News*, All rights reserved.