The Architecture - Abridged
We are going to say this right off and say it quickly: the point of this review is not to banter on about the fine points of the architecture. If you're interested in that, we recommend heading over to our article dedicated solely to a deep dive into this architecture and future architectures.
So, what exactly makes the Kepler architecture different from, say, Fermi? First, it has to do with the SMX unit, which is NVIDIA's update to their previous generation SM unit. The Kepler SMX features significantly more texture, geometry and shader processing capability than Fermi's SM: each SMX has 192 shader cores, compared to 32 shader cores per SM in Fermi. That is 6 times as many shaders per unit, along with 2 times the performance per watt. Because each unit is so much larger, there are only 8 SMX units in the Kepler GK104, compared to 16 SM units in Fermi. So, what NVIDIA has done is increase the number of shaders per unit by a factor of 6 but cut the number of units in half, netting a threefold increase in shader cores from Fermi to Kepler (1536 versus 512).
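For readers who want to check the shader math, here is a quick sketch in Python. It uses only the unit and core counts quoted above; the function and variable names are ours, purely for illustration.

```python
# Shader core counts as quoted in the review.
FERMI_SM_UNITS = 16
FERMI_CORES_PER_SM = 32
KEPLER_SMX_UNITS = 8
KEPLER_CORES_PER_SMX = 192


def total_cores(units: int, cores_per_unit: int) -> int:
    """Total shader cores on the chip: units times cores per unit."""
    return units * cores_per_unit


fermi_total = total_cores(FERMI_SM_UNITS, FERMI_CORES_PER_SM)
kepler_total = total_cores(KEPLER_SMX_UNITS, KEPLER_CORES_PER_SMX)

# Ratio of cores per unit (SMX vs SM) and ratio of total cores per chip.
per_unit_ratio = KEPLER_CORES_PER_SMX / FERMI_CORES_PER_SM
total_ratio = kepler_total / fermi_total

print(f"Fermi:  {fermi_total} cores")
print(f"Kepler: {kepler_total} cores")
print(f"Per-unit ratio: {per_unit_ratio:.0f}x, total ratio: {total_ratio:.0f}x")
```

Six times the cores per unit across half as many units works out to three times the cores overall.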
In addition to the complete overhaul of the shader units, NVIDIA has also changed the memory controller from a 384-bit memory bus in Fermi to a 256-bit memory bus in Kepler, which may be one of the reasons why NVIDIA was able to pack so many shader cores into such a small chip. They have also raised memory speeds to 6Gbps, which is one of the highest speeds in the industry for a graphics chip. Fermi operated at 4Gbps, so this represents a 50% increase in speed, but due to the narrower memory bus, Kepler's memory bandwidth is actually slightly lower than Fermi's (192.26GB/s vs 192.4GB/s).
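The bandwidth trade-off is easy to verify by hand: peak bandwidth is the bus width in bytes multiplied by the per-pin data rate. The sketch below does this in Python. The nominal 6Gbps and 4Gbps rates come from the text above; the slightly higher "exact" rates (6.008Gbps and 4.008Gbps) are our assumption, taken from the reference memory clocks of the GeForce GTX 680 and GTX 580, and are used only to show where the decimals in the quoted figures come from.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


# Nominal per-pin rates quoted in the review.
kepler_nominal = memory_bandwidth_gbs(256, 6.0)
fermi_nominal = memory_bandwidth_gbs(384, 4.0)

# Assumed reference clocks (not stated in the review): 6008 MHz and 4008 MHz effective.
kepler_exact = memory_bandwidth_gbs(256, 6.008)
fermi_exact = memory_bandwidth_gbs(384, 4.008)

print(f"Nominal: Kepler {kepler_nominal:.1f} GB/s vs Fermi {fermi_nominal:.1f} GB/s")
print(f"Exact:   Kepler {kepler_exact:.2f} GB/s vs Fermi {fermi_exact:.2f} GB/s")
```

At the nominal rates the two come out identical, so the 50% faster memory only just cancels out the one-third narrower bus.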
The Kepler architecture also features twice as many texture units, 128 compared to Fermi's 64, as well as more than double the texel fill-rate, which increases from 49.4 gigatexels/s to 128.8 gigatexels/s. This increase is where NVIDIA's Kepler architecture stands to significantly improve its graphical performance. Kepler also supports DirectX 11.1, which NVIDIA doesn't consider that big of a deal, but it does put them on par with AMD.
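The fill-rate jump is bigger than the doubling of texture units because the core clock went up as well. Here is a rough sketch of the arithmetic in Python, assuming one texel per texture unit per clock. The core clocks (1006 MHz for Kepler, 772 MHz for Fermi) are not stated in this section; they are our assumption, based on the reference GeForce GTX 680 and GTX 580 specifications.

```python
def texel_fill_rate_gtexels(texture_units: int, core_clock_mhz: float) -> float:
    """Peak texel fill-rate in gigatexels/s, assuming one texel per unit per clock."""
    return texture_units * core_clock_mhz / 1000


# Texture unit counts are from the review; the clocks are assumed reference values.
kepler_fill = texel_fill_rate_gtexels(128, 1006)
fermi_fill = texel_fill_rate_gtexels(64, 772)

print(f"Kepler: {kepler_fill:.1f} gigatexels/s")
print(f"Fermi:  {fermi_fill:.1f} gigatexels/s")
print(f"Ratio:  {kepler_fill / fermi_fill:.2f}x")
```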
The one thing we did notice was that NVIDIA left out almost anything about Kepler's HPC performance, which we found quite interesting considering that Fermi was a huge improvement in HPC performance in both single and double precision. NVIDIA only mentioned that Kepler's single precision performance had nearly doubled, from 1581 GFLOPS to 3090 GFLOPS, with no mention of double precision performance... and as we will explain in our in-depth architecture analysis, there's a reason why.
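The single precision figures can be reproduced from the shader counts with a back-of-the-envelope calculation: each shader core can retire one fused multiply-add, counted as 2 floating point operations, per clock. The Python sketch below assumes shader clocks of 1006 MHz for Kepler and 1544 MHz for Fermi (whose shaders run at twice the core clock). Those clocks are our assumption from the reference GeForce GTX 680 and GTX 580 specifications, not figures given in this section.

```python
def single_precision_gflops(shader_cores: int, shader_clock_mhz: float) -> float:
    """Peak single precision GFLOPS, counting one FMA (2 FLOPs) per core per clock."""
    return shader_cores * 2 * shader_clock_mhz / 1000


# Core counts follow from the review (8 x 192 and 16 x 32); clocks are assumed.
kepler_gflops = single_precision_gflops(8 * 192, 1006)
fermi_gflops = single_precision_gflops(16 * 32, 1544)

print(f"Kepler: {kepler_gflops:.0f} GFLOPS")
print(f"Fermi:  {fermi_gflops:.0f} GFLOPS")
print(f"Ratio:  {kepler_gflops / fermi_gflops:.2f}x")
```

Three times the shader cores yields only about twice the throughput because Kepler's shaders no longer run at double the core clock.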
© 2009 - 2013 Bright Side Of News*, All rights reserved.