AMD’s Bulldozer processor architecture has been in the works for many years and has been delayed many times, prompting all sorts of speculation. The "Zambezi" FX processor is one part of the Scorpius platform, which combines the "world’s first consumer 8-Core desktop processor" with the already available 9-Series motherboard chipsets and Radeon HD 6000 graphics processors.
AMD heavily stresses pairing the right graphics card with the right CPU to obtain the proper balance of computing strength for the task you are trying to accomplish. In many cases, a powerful GPU will improve application performance in ways that a CPU simply cannot manage alone. A powerful CPU is nonetheless necessary to support the GPU in graphically intensive applications. Bear in mind that we’re not talking just about games now, since even the majority of contemporary Internet browsers are GPU accelerated, thanks to HTML5, WebGL and Adobe Flash 11.0. AMD claims that this new CPU will offer a new level of performance at an attractive price point.
Today we will be reviewing whether or not the FX Series Processors and Platform deliver on that value proposition and how they hold up to the rest of the processors out there.
FX Series Processor Details
AMD has redesigned the way that it builds its processors, which meant that the Bulldozer modules of each processor were redesigned in order to make them friendlier towards future use. These changes include the decision to share the fetch, decode and floating point pipelines between the two cores of a module and, as mentioned before, the L2 cache.
AMD’s Zambezi/Orochi die consists of four Bulldozer modules, 16MB of cache and four HyperTransport links. The dual-channel memory controller is on the right.
AMD redesigned the floating point units inside the chip so that they can support new instruction sets, which allows for more sharing between the two cores of a module. There are two 128-bit FMAC (Fused Multiply Accumulate) units shared per module, which amounts to two 128-bit instructions per core (in theory) or one 256-bit instruction per dual-core module. The primary additions over the Phenom II series of processors are 128-bit and 256-bit AVX execution as well as SSSE3, SSE4.1 and SSE4.2. All of these instruction sets have already been available on Intel’s latest Sandy Bridge architecture, so in that sense AMD is finally catching up to Intel.
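As a rough, back-of-the-envelope sketch of what the shared FPU implies for peak throughput, the Python snippet below counts floating-point lanes per module. It assumes every 128-bit FMAC can retire one fused multiply-add per cycle, counted as two floating-point operations per lane. That is our simplification for illustration, not a figure supplied by AMD, and the function name and parameters are our own.

```python
# Back-of-the-envelope peak throughput for a shared-FPU design.
# Assumption (ours, not AMD's): every 128-bit FMAC retires one fused
# multiply-add per cycle, counted as 2 floating-point operations per lane.

def peak_gflops(modules, fmacs_per_module, fmac_width_bits, clock_ghz,
                element_bits=64, flops_per_fma=2):
    """Theoretical peak in GFLOPS for the given FPU layout."""
    lanes_per_fmac = fmac_width_bits // element_bits
    flops_per_cycle = modules * fmacs_per_module * lanes_per_fmac * flops_per_fma
    return flops_per_cycle * clock_ghz


# FX-8150: four modules, two 128-bit FMACs per module, 3.6GHz base clock.
double_precision = peak_gflops(4, 2, 128, 3.6)
single_precision = peak_gflops(4, 2, 128, 3.6, element_bits=32)

print(f"Double precision peak: {double_precision:.1f} GFLOPS")
print(f"Single precision peak: {single_precision:.1f} GFLOPS")
```

Under the same assumption, code that does not use fused multiply-add instructions would see at most half of these figures, so real benchmark results should be expected to land well below the theoretical ceiling.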
AMD had originally planned to include the full SSE5 instruction set, but the company cut down the proposed 170-instruction set to increase compatibility with Intel’s AVX. On top of those instruction sets, AMD has added the FMA4 and XOP instruction sets to improve performance in High Performance Computing (for the Opteron core) and multimedia encoding/decoding. These are forward-thinking instruction sets, considering that Intel likely won’t include FMA3 until it releases its Haswell processors in 2013. The XOP instruction set is unique to AMD; it is a revision of the SSE5 instruction set and is complemented by FMA4.
All of this means only one thing: Zambezi’s die is quite big. Built on GlobalFoundries’ 32nm SOI process, Zambezi is a 1.2 billion transistor chip which occupies 315mm² of silicon real estate. By comparison, Intel’s Sandy Bridge is a 995 million transistor chip measuring 216mm². However, Zambezi comes with 16MB of cache, while Sandy Bridge only packs 9MB (1MB L2 + 8MB L3).
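To put the two dies side by side, the short Python sketch below derives transistor density and the area ratio from the figures quoted above. The function name is ours, and the result is only as accurate as the published transistor counts and die sizes.

```python
# Compare die size and transistor density using the figures quoted in the text.

def density_m_per_mm2(transistors, area_mm2):
    """Transistor density in millions of transistors per square millimetre."""
    return transistors / area_mm2 / 1e6


zambezi = density_m_per_mm2(1.2e9, 315)        # 1.2 billion transistors, 315 mm^2
sandy_bridge = density_m_per_mm2(995e6, 216)   # 995 million transistors, 216 mm^2
area_ratio = 315 / 216                         # how much larger Zambezi's die is

print(f"Zambezi:      {zambezi:.2f} M transistors/mm^2")
print(f"Sandy Bridge: {sandy_bridge:.2f} M transistors/mm^2")
print(f"Zambezi's die is {area_ratio:.2f}x the area of Sandy Bridge's")
```

By these numbers Zambezi needs roughly 46% more silicon while packing its transistors less densely, which matters for manufacturing cost as much as for performance.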
The FX line of processors will initially launch with four models, followed by two more. The initial models consist of two 8-core variants, one 6-core variant and one 4-core variant. The clock speeds will be 3.6GHz for the FX-8150 and 3.1GHz for the FX-8120, with the 6-core FX-6100 model coming in at 3.3GHz. Regardless of the actual processor, each core of the FX series processors will be accompanied by 1MB of L2 cache as well as 1MB of L3 cache. See the graph below for the breakdown of the different models, excluding the 4-core variant.
The asterisk denotes the water cooling option, which is currently being trialed in Japan and will come to North America soon.
On the next page, we’ll dig into the characteristics of our Scorpius testing platform.
The AMD FX-8150 processor is the fastest and most powerful processor of the FX line and is the focal point of this product launch. We have paired this CPU with an ASUS 990FX motherboard that goes by the name Crosshair V Formula. In addition to that, we are running an XFX Radeon HD 6950 flashed with an HD 6970 BIOS, thus unlocking all 1536 cores. Since the card isn’t quite either model, it should perform somewhere between a 6950 and a 6970.
We also opted for 4GB of Patriot CAS8 2GHz low-latency RAM, which we then downclocked to 1866MHz to bring it within AMD’s official specifications for our benchmarking. This was all cooled by a Corsair H100 self-contained water cooling unit, which we chose in order to see the full overclocking potential of the FX-8150 processor.
AMD also provided us with an Asetek cooling solution for the FX processor, which will likely be priced around $100 as an added option. By the looks of it, though, it appears to be very similar if not identical to the Antec-branded Asetek solution called the Kuhler 920. Unfortunately we received it too late to actually test it, but we will probably slap it on this system once we finish up our review of the FX-8150, and we look forward to letting you know how it fares.
- AMD FX-8150, 3.6GHz
- ASUS Crosshair V Formula 990FX Motherboard (running latest BIOS)
- 2x2GB Patriot Sector 5 CAS8 DDR3 2GHz running at 1866MHz
- Corsair H100 CPU cooler
- XFX Radeon HD 6950 flashed to 6970 2GB GDDR5
- 120GB OCZ Vertex 3 SSD (latest firmware)
- Dimastech Bench
- CoolerMaster 1100W UCP PSU
The test system ran our standard software setup: the 64-bit version of Windows 7 Ultimate SP1 with all the latest updates and the latest versions of system drivers (incl. Catalyst 11.9).
We will be taking a look at a broad array of benchmarks running on this system at stock clocks as well as at various overclocked settings, to show you guys what kind of value the Bulldozer may or may not have.
SiSoft Sandra 2011 SP5
Sandra’s Arithmetic bench consists of two tests: the Dhrystone ALU test and the Whetstone FPU test. Looking at our results, we can see that the Bulldozer processor falls behind both the i5 2500K and the i7 2600 in the ALU test but then surpasses the 2500K in the FPU test, quite possibly a testament to the new design.
The Sandra Cryptography bench is once again a combination of two benchmarks: the first is a sheer cryptographic bandwidth test, while the other is a hashing bandwidth test. In the first, the 2500K and 2600K are roughly level and the Bulldozer isn’t far behind, while the Phenom II X6 is miles behind. The picture flips for the FX processor in the hashing bandwidth test, where the Intel processors and the AMD Phenom II X6 appear to be at least twice as fast. This seems a bit odd to us, so we will consider it an anomaly for now until we run another hashing test in one of our other benchmarks.
Moving on to the Multi-Core Efficiency benchmark, the Bulldozer processor yields mixed results: it can indeed surpass the 2500K in one test, while in another it appears to be worse than the AMD Phenom II X6 processor. For Bulldozer to score worse than a Phenom II X6 seems ridiculous. Then again, the Bulldozer design is optimized for high clocks, and AMD cut down the L1 cache from 128KB in Phenom to 80KB. Sometimes, that bet pays off. In the Multi-Core Efficiency test, it did not.
In the Multimedia test we see the FX processor take charge and lead both the i7-2600 and i5-2500K by quite a margin. Unfortunately, this large margin is only visible in the Multi-Media Integer test, where the i7-2600 scores a good 25% lower than the Bulldozer chip. This is likely due to AMD’s new processor architecture in addition to having 8 cores; as we called it five months ago, ALU performance is top, while the FPU performance is not. The lead is reversed in the Multi-Media Float test, where the i7 2600 wins by 70% and the 2500K by 4%, although the Bulldozer still shows an improvement of 19% over the Phenom II X6.
For Sandra’s Power Efficiency test we see AMD taking the crown for power efficiency and ALU power performance. The most important test, admittedly, is the ALU power performance test in which the Bulldozer beats the i7 2600 by 12% and the Phenom II X6 1090T by 15%. In the power efficiency test, the Bulldozer does not beat some of the other AMD processors, but it still does beat the i7-2600.
AIDA64 v1.81 Extreme Edition (Bulldozer Optimizations)
For testing the AIDA64 suite, Tamas Miklos provided us with the latest build of the AIDA64 system benchmark and utility suite, which contains optimizations for the Bulldozer architecture. The benchmark already contains all the optimizations for the other tested microarchitectures, such as Intel Sandy Bridge or AMD’s STARS (K10.5). In AIDA64, we ran 13 different tests in order to measure the performance of Bulldozer against the competition as fully as possible. If you look at some of the scores, you will notice that most of the Intel processors were running at lower memory clock speeds than ours. This is because those tests were run at Intel’s officially supported memory clock speeds. This is a point that should be addressed by Intel: they should, like AMD, support higher clock speeds than just 1066 and 1333MHz.
In our first test, we ran AIDA64’s Memory Read test. Here the FX-8150 engaged Turbo Core, which we had already enabled in the BIOS, bumping one of the cores to 4.2GHz. That, combined with the memory clock of 1866MHz, resulted in a speed of 14,721MB/s, just a little short of the i7-2600K processor. This is nearly double that of other AMD processors like the A8-3850. However, bear in mind that the FX-8150 has a maximum theoretical bandwidth of 29,856MB/s, while the i7-2600 peaks at 21,328MB/s. As you can see, Intel created a much more efficient controller in Sandy Bridge.
In our AIDA64 Memory Write test we essentially got the same placement as in our Memory Read test, except that the i7s beat the FX-8150 even though they were running at 1333MHz.
In the Memory Copy test, the FX-8150 processor shone brightly with a score of 18,601MB/s, beating out all previous generations of i7 processors (even the triple-channel one) as well as the other AMD processors. This is most likely attributable to the fact that the AMD FX platform supports higher memory clock speeds without requiring the user to ‘overclock’.
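The theoretical peaks quoted above follow from the memory configuration, and the gap between peak and measured read speed is what the controller efficiency argument rests on. The Python sketch below reproduces the arithmetic, assuming a dual-channel setup with a 64-bit bus per channel and treating the quoted "MHz" figures as transfers per second; the function name is ours.

```python
# Theoretical DDR3 bandwidth versus the measured AIDA64 read result.
# Assumptions: dual-channel memory, 64-bit (8-byte) bus per channel, and the
# quoted "MHz" memory speeds interpreted as mega-transfers per second.

def peak_bandwidth_mb_s(transfer_rate_mt_s, channels=2, bus_width_bits=64):
    """Peak bandwidth in MB/s (1 MB = 10^6 bytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return transfer_rate_mt_s * bytes_per_transfer * channels


fx_peak = peak_bandwidth_mb_s(1866)   # FX-8150 at DDR3-1866
i7_peak = peak_bandwidth_mb_s(1333)   # i7-2600 at DDR3-1333

fx_measured = 14721                   # MB/s, from the Memory Read test
fx_efficiency = fx_measured / fx_peak

# The i7 read result was slightly higher than the FX-8150's, so this is a floor.
i7_efficiency_floor = fx_measured / i7_peak

print(f"FX-8150 peak: {fx_peak} MB/s, efficiency {fx_efficiency:.1%}")
print(f"i7-2600 peak: {i7_peak} MB/s, efficiency at least {i7_efficiency_floor:.1%}")
```

In other words, the FX-8150 extracts a little under half of its theoretical read bandwidth, while the Sandy Bridge controller gets at least about 69% out of slower memory.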
For the Memory Latency test we expected the previous results to reproduce themselves and they did indeed. In terms of memory latency, we were running the memory in the system at CAS8 latency so there was a good chance we would beat out other platforms, but considering that most other platforms were also running lower clock speed RAM there was a good chance their latency would be lower. As a result it is great to see that AMD still has great memory latency and is bringing back the days of the Athlon 64 X2 latency performance.
Following the memory tests, we ran the CPU tests to see where the FX-8150 stacked up against all the other processors, and what we found was pretty interesting. In the CPU Queen test the FX-8150 came in below all of the leading Intel i7 processors, but still above all but one AMD processor.
In the AIDA64 PhotoWorxx test we found some extremely interesting results considering the prevalence of photo editing and use in computing today. In this test, the FX-8150 Zambezi processor was bested only by the Xeon X5550 processor. In every other case the FX-8150 beat all of the i7 processors by a fairly decent margin.
In the Zlib benchmark that we ran in AIDA64, the Bulldozer processor performed pretty well. It is fair to say that it performed within expectations, but it couldn’t quite catch up to the Intel i7 2600 processor. This benchmark appeared to heavily favor processors with more cores, yet the 2600 somehow managed to edge its way into the group.
In the AES benchmark we expected to see a huge improvement in AMD’s performance due to the new instruction sets and added optimizations built into the Bulldozer cores and the Zambezi processor as a whole. We were not in the least bit disappointed: the FX-8150 simply dominated the test and beat out every single processor that had been tested on the AES test, including the i7 990X by a healthy 10%.
In the hashing benchmark we were interested to see whether the FX-8150 would repeat the poor showing we saw in Sandra, and the good news is that it didn’t. The FX-8150 actually placed second overall, beaten only by the Magny-Cours Opteron 2431 12-core processor. This suggests that Sandra perhaps wasn’t quite optimized for AMD’s Bulldozer, whereas the version of AIDA64 we were running came as part of a benchmarking package that we downloaded specifically for Bulldozer benching and monitoring.
We then ran some FPU tests in order to see how well the new Bulldozer FPU stacked up against other processors. What we found was that the FX-8150 didn’t quite reach the performance of the Core i7-2600, but it did beat the Phenom II X6 by a decent margin as well as the A8-3850.
In the FPU Julia test the FX-8150 dropped a bit in terms of its placement and comparison, but it still performed approximately where it was expected. We would’ve liked to see it performing closer to the Core i7 990X, but it doesn’t look like the Julia test really likes the Bulldozer architecture that much.
The FPU Mandel test was effectively more of the same. The FX-8150 performed about 10% faster than the Phenom II X6 1055T processor, but was still quite a ways away from the Core i7 2600 processor in terms of FPU performance.
The FPU SinJulia test was actually the most disappointing. SinJulia uses x87 instructions, and the Bulldozer cores appear not to handle those as well as their predecessor, the Phenom II X6 processor. We’re not quite sure whether this is an architectural flaw or whether an optimization is needed to fix this potential problem.
In Cinebench we really had no idea what to expect, so when we got our results back we were a little disappointed. In Cinebench R11.5 we scored 5.86 points with our setup. This could possibly be attributable to the fact that we did have issues with the ASUS motherboard initially which caused us days worth of delays. We were told that we should expect a score of around 6 on average. As such, this would put the FX-8150 on par with our Core i7-975 processor, not necessarily a comparison we’d like to be drawing considering how much older the Core i7-975 is. Hopefully the FX-8150 will redeem itself in other places.
As expected, we ran 3DMark 11 at all 3 settings and here are our results. There isn’t really anything spectacular. For comparison, in the Performance test our 2600K got a physics score of 8919 and our i5-2500K got a score of 6900, in contrast to the physics subscore of 6481 from the FX-8150 processor. Admittedly, this result is a little disappointing, but there’s a possibility that some performance optimizations could be made to make it run better.
We decided to run PCMark 7 mostly because it is a whole-system benchmark that evaluates the system in the way most users would likely experience it. As such, it tests everything in the system, including the GPU, CPU, hard disks and RAM, in real-world-like scenarios. Admittedly, this is still a very controlled benchmark, but it’s as close as some of these benchmarks will come to being real world.
In PCMark 7 we got a score of 4206, which was neither amazing nor too bad, as we’ve seen scores ranging from 300 to 1800 all the way up to 6000. As we benchmark more and more systems we’ll likely have a better fix on how effectively these scores gauge performance. At this time we’re mostly including it for the sake of record and potential comparison, as PCMark 7 is still relatively new and unused.
LinX, a graphical interface for Linpack, let us run a Linpack benchmark comparable to the one used in the high-performance computing arena. We also used LinX for a nice 4-hour stress test to see how hot the chip would get under our Corsair H100 cooler. With LinX we obtained approximately 30.7894 GFLOPS peak performance after running 464 loops of the test.
We used PassMark as a quick and easy benchmark covering multiple CPU tests, and were able to compare against the Core i7 2600K and the Core i5 2500K processors. The FX-8150 for the most part stays behind the 2600K, with the exception of the Floating Point Math CPU test. We must also say that this version of the application was specifically stated to already be working properly with Bulldozer, unlike many other tests out there. Overall, we can see the FX-8150 mostly staying ahead of the i5 2500K but behind the i7 2600K, which is really where we see AMD trying to position this processor.
Video Encoding Testing
We used two popular multimedia benchmarks for video encoding: Handbrake and X264 HD Benchmark, which are the most commonly used. In our conversations with Intel engineers, they were especially coy about the media skipping Handbrake when testing Bulldozer. We tested Bulldozer with Handbrake and you can see the results for yourselves.
In Handbrake, we took a 1080p source file and converted it to qHD resolution in order to illustrate the transcoding capabilities of the FX-8150 processor. As you can see from the screenshot, we were getting an average of 95.5 FPS and the job took about 21 minutes to complete.
X264 HD Benchmark 4.0
Taking a look at this benchmark we can see a direct comparison between all of the latest processors from both Intel and AMD. Taking a quick glance we can see that the Intel i7 2600K definitely dominates this test without a doubt with the i5 2500K beating the FX-8150 in the first pass but losing to the FX-8150 in the second pass. We’ll consider that a push.
In DiRT 3 we saw some really great performance from the FX-8150 processor, but admittedly the modified HD 6950 we were running took the brunt of the load. With those two combined we were able to attain some solid frame rates with only a single card. This was helped along by the fact that DiRT 3 was actually using all eight cores of the FX-8150, which kept frame rates from dropping to uncomfortable levels. We ran DiRT 3 at 1920×1080 at Ultra settings with 4x MSAA and had an average of 66 FPS with a minimum of 51 FPS and a maximum of 75 FPS.
As you can see from the screen shots, we got around 50% overall CPU utilization and at times it got all the way up to 60%. Needless to say, we’re really excited to see CPUs getting more loving especially considering how many cores tend to sit idle in a lot of applications where they could be used to improve performance and framerates.
We also ran DiRT 3 at ultra low settings at 640×480 to see if the CPU would be up to the task of running the game as the lower the resolution the more the load goes to the CPU rather than the GPU. In this scenario we saw DiRT 3 running at an average of 116 FPS with a minimum of 73 FPS and a maximum of 134 FPS.
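One way to read the two DiRT 3 runs together is to convert frame rates to frame times and compare the low-resolution run against the 1080p run. The Python sketch below does that with the numbers quoted above. Treating the 640×480 result as an approximate CPU-side ceiling is our simplification, and the names used are ours.

```python
# Convert the DiRT 3 frame rates to frame times and estimate CPU headroom.
# Assumption: at 640x480 on ultra low settings the GPU is barely loaded, so
# that run approximates how fast the CPU alone can feed frames.

def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


dirt3 = {
    "1920x1080 Ultra, 4x MSAA": {"min": 51, "avg": 66, "max": 75},
    "640x480 ultra low": {"min": 73, "avg": 116, "max": 134},
}

for label, fps in dirt3.items():
    print(f"{label}: avg {frame_time_ms(fps['avg']):.1f} ms per frame, "
          f"worst case {frame_time_ms(fps['min']):.1f} ms")

# How much faster the CPU-bound run is than the 1080p run.
headroom = dirt3["640x480 ultra low"]["avg"] / dirt3["1920x1080 Ultra, 4x MSAA"]["avg"]
print(f"CPU headroom over the 1080p result: {headroom:.2f}x")
```

A ratio well above 1 suggests the 1080p result is limited by the graphics card rather than by the FX-8150, which fits the observation that the modified HD 6950 took the brunt of the load.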
In Metro 2033, we decided to crank up the eye candy level even more and have some fun with the game. Unfortunately, since we were only running one HD 6950, we ran the game at the High preset instead of Ultra. Nevertheless, we still ran it at 1920×1080 and managed to attain an average frame rate of 50 FPS with a minimum of 23 FPS and a maximum of 92 FPS.
We did this because at Ultra settings the game would admittedly be unplayable, as it requires two graphics cards or a dual-chip graphics card to run at that level. Running Metro 2033 at the lowest settings at 800×600, the frame rates increased to an average of 139 FPS and a maximum of 172 FPS, but the minimum still sat at a low 24 FPS.
For F1 2011 we ran the same tests as we did in Metro 2033 and DiRT 3, but we also decided to test the canned benchmark just to give you guys, our readers, at least one canned benchmark. We usually just play through a full level or race in a game and capture our frame rates through FRAPS. In the game’s own benchmark we got a minimum of 44 FPS and an average of 54 FPS. Not bad considering that we were running the game at absolutely maximum settings at 1920×1080.
This is the precalculated benchmark that comes with the game…
Since F1 2011 is also a Codemasters title and uses a variant of the EGO engine used in DiRT 3, it appears that Codemasters worked closely with AMD again to make sure its games make the best of the Zambezi FX-8150 processor, because we saw CPU utilization in F1 2011 similar to what we saw in DiRT 3. The likely explanation is that the two games share very similar, if not identical, engines that make good use of all of the cores made available by the FX-8150 processor.
In terms of overclocking, we really didn’t have that much time to spend overclocking the FX-8150 but in the short time that we did, we managed to get some pretty nice overclocks as well as overclocking performance out of the chip. The primary reason why we believe that overclocking is so important is because of the value proposition that AMD has placed on their brand and their products as a whole. Considering the fact that the FX-8150 is priced at $245 there is no doubt that AMD is continuing towards a value oriented segment but with more performance behind it.
The important part, of course, for any value based product is extra mileage after purchase. In this case, it comes in terms of overclockability. In our experience, the FX-8150 was not that hard to overclock up to 4.6GHz. It was actually extremely simple and required little to no voltage increases. We re-ran many of our benchmarks at 4.6GHz to show the benefits of a quick and easy overclock as well as the performance increases one can reap from having an unlocked processor that overclocks relatively easily.
AMD FX-8150 processor overclocked to 4.8GHz pushed the score from 5.86 to 7.76 points
In Cinebench R11.5 we went from a score of 5.86 at stock to 7.47 at 4.6GHz. This showed nearly a 1 to 1 ratio of clock performance to Cinebench score improvement as our overclock was 27% and our Cinebench performance increase was also 27%.
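The near 1-to-1 scaling can be checked directly from the quoted clocks and scores. The Python sketch below computes both percentage gains and their ratio; the helper name is ours.

```python
# Check how closely the Cinebench R11.5 score tracks the clock speed increase.

def pct_gain(before, after):
    """Percentage increase from `before` to `after`."""
    return (after - before) / before * 100


clock_gain = pct_gain(3.6, 4.6)     # stock 3.6GHz -> 4.6GHz overclock
score_gain = pct_gain(5.86, 7.47)   # Cinebench R11.5 points
scaling = score_gain / clock_gain   # 1.0 would be perfectly linear scaling

print(f"Clock gain: {clock_gain:.1f}%")
print(f"Score gain: {score_gain:.1f}%")
print(f"Scaling efficiency: {scaling:.2f}")
```

A scaling efficiency this close to 1.0 indicates that Cinebench at this clock is limited almost entirely by core frequency rather than by memory or other parts of the platform.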
AMD FX-8150 processor wasn’t stable at 4.8GHz, but at 4.6GHz it scored quite nicely: 41.94 GFLOPS
In LinX we saw a linear increase in floating-point performance: clocking the CPU at 4.6GHz is a 27% increase in clock, which resulted in the same 27% increase in performance.
Performance did not increase as much while overclocking the processor by 1.2GHz, up to 4.8GHz
In 3DMark 11 we saw a solid increase in performance in the Entry level test, going from a composite score of E7896 to E8445 with only an increase in CPU clock speed. Performance scores rose from P5544 to P5678, while the Extreme scores rose by a mere 10 points from X1864 to X1874, clearly showing a GPU bottleneck.
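Expressed as percentages, the shrinking benefit from Entry to Extreme is easy to see. The Python sketch below computes the gain for each preset from the composite scores quoted above; the dictionary layout is ours.

```python
# Percentage gain from the CPU overclock in each 3DMark 11 preset.
# Each entry maps a preset to its (stock, overclocked) composite score.
scores = {
    "Entry (E)": (7896, 8445),
    "Performance (P)": (5544, 5678),
    "Extreme (X)": (1864, 1874),
}

gains = {
    preset: (after - before) / before * 100
    for preset, (before, after) in scores.items()
}

for preset, gain in gains.items():
    print(f"{preset}: +{gain:.2f}%")
```

The heavier the preset leans on the graphics card, the less an extra gigahertz of CPU clock contributes, which is what a GPU bottleneck looks like in the numbers.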
AMD FX-8150 at 4.8GHz significantly increases performance in an almost linear fashion
We also ran the x264 HD benchmark to witness the performance increases in HD video encoding and we saw the FX-8150 best the 2500K in both pass 1 and pass 2 and beat the 2600K in Pass 2. This illustrates that the FX-8150 does clock pretty well, but it does require quite a bit of clocking in order to be up to or beyond the level of the 2600K in a good amount of tests.
From our experience overclocking the FX-8150, 4.6GHz appears to be the overclocking sweet spot, as it required very little effort and voltage bumping to attain. Once we started moving up towards 4.7GHz and 4.8GHz, we found ourselves pushing the voltages in larger and larger increments, and the performance gained began to diminish as we consumed more and more power per clock. In our short time overclocking the FX-8150 we managed to attain a stable clock speed of 4.8GHz (roughly a 33% OC), as illustrated by our CPU-Z screenshot as well as our Cinebench score of 7.76, which beat our previous 7.47 at 4.6GHz. All of these overclocks were attained on all eight cores with turbo disabled.
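The diminishing returns past 4.6GHz also show up in the Cinebench numbers. The Python sketch below looks only at the final step from 4.6GHz to 4.8GHz and at the total overclock relative to the 3.6GHz stock clock, using the scores quoted above. It ignores voltage and power, which we did not record per step, and the variable names are ours.

```python
# Marginal Cinebench R11.5 gain for the last overclocking step (4.6 -> 4.8GHz).
stock_clock = 3.6                      # GHz
prev_clock, prev_score = 4.6, 7.47     # sweet-spot overclock
top_clock, top_score = 4.8, 7.76       # highest overclock reached

total_overclock = (top_clock / stock_clock - 1) * 100
step_clock_gain = (top_clock / prev_clock - 1) * 100
step_score_gain = (top_score / prev_score - 1) * 100
step_scaling = step_score_gain / step_clock_gain

print(f"Total overclock over stock: {total_overclock:.1f}%")
print(f"Last step: +{step_clock_gain:.2f}% clock for +{step_score_gain:.2f}% score")
print(f"Scaling efficiency of the last step: {step_scaling:.2f}")
```

The last 200MHz returns noticeably less than 1-to-1 scaling, on top of the larger voltage increments it required.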
Power and Heat
Since we were using an H100, heat was not much of an issue. As we stated before, the FX-8150 ran under full load for 4 hours without breaking 44C and none of the cores got any hotter than 35C. This is most likely a testament to the H100’s cooling capability, but as we can see in the hardware monitor screenshot the FX-8150 never went over that 44C barrier. In terms of power consumption, the FX-8150 does well at idle, consuming on average only around 85W.
Under load, though, the CPU consumed as much as 209W, which is more than what we’ve seen from AMD processors in the past. This is a little disappointing in terms of energy efficiency, but we are optimistic that there will be further revisions to this design, perhaps with a die shrink, that will improve the power consumption greatly. Then again, it does still overclock quite well and I’d be afraid to go back and check the power consumption of the chip at 4.8GHz.
The AMD FX-8150 processor is without a doubt the latest and greatest product coming out of AMD’s processor division. At the price of $245 it is clear that AMD recognizes that their performance is not on par with a $300+ chip and to be frank, at $245 it does deliver quite a bit of value especially in terms of overclocking.
The FX-8150 will likely be a very good companion for anyone considering building a powerful gaming machine that is capable of running all the latest games without a hitch. There is the concern of using too much power, but it isn’t off the scales to the point where one must worry about the PSU choice. Another thing to consider is that not only do you save $50 to $80 on the price of the processor, but you also save about the same if not more on the motherboard, as most AMD 990FX motherboards are much less expensive than their P67 and Z68 counterparts.
For example, building a system based on the AMD FX-8150 means paying $245.99 for the processor and $224.99 for the Crosshair V Formula motherboard. The Core i7 2600K goes for $314.99, but the biggest difference is the motherboard: the comparable Maximus IV Extreme goes for $349.99. The total price difference is roughly $194, which is the difference between buying a high-end Radeon HD 6970 2GB and a mainstream HD 6870.
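For readers who want to redo the comparison with their own local prices, the Python sketch below totals the CPU and motherboard cost of each platform using the prices quoted above. Prices are kept in cents to avoid floating-point rounding, and the data layout is ours.

```python
# Platform cost comparison (CPU + motherboard), prices in US cents.
platforms = {
    "AMD FX-8150 + Crosshair V Formula": {"cpu": 24599, "motherboard": 22499},
    "Core i7 2600K + Maximus IV Extreme": {"cpu": 31499, "motherboard": 34999},
}

# Sum the component prices for each platform.
totals = {name: sum(parts.values()) for name, parts in platforms.items()}

for name, cents in totals.items():
    print(f"{name}: ${cents / 100:.2f}")

difference = (totals["Core i7 2600K + Maximus IV Extreme"]
              - totals["AMD FX-8150 + Crosshair V Formula"])
print(f"Savings with the AMD platform: ${difference / 100:.2f}")
```

Most of the gap comes from the motherboard rather than the processor: $125 of the difference is on the board and $69 is on the CPU.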
AMD’s new FX-8150 processor is not entirely the game changer that many people were expecting, but it does put AMD back into the game in a way that they really weren’t in it before. With their heavy focus on gaming performance and added value there is no doubt that AMD will be working harder and harder with every single revision to improve performance in ways that consumers ultimately benefit from. With the planned release of Piledriver-core based A-Series and FX-Series processors in 2012 consumers can expect to see a minimum 10% to 15% improvement over Bulldozer’s performance today.
AMD’s new Bulldozer architecture is a sea change in thinking for the company and a drastic change in the way the company thinks about processors. We hope that this brings more innovation and better products and we’re glad to see that they’re back to providing a quality product for a reasonable price.