Sunday, May 19, 2013

OPINION: Are Benchmarks Worthless?




Real World Dilemma
Whenever a company struggles in a particular set of usage scenarios, its marketing department masks the shortcomings by describing scenarios in which your computing experience should be peachy, provided that "you're not a 3D designer", "you're not a hard-core gamer" or "you're not a professional photographer". Sadly for those caveats, and we do hear the gaming one quite a bit, reality is something else. For starters, here is a quick example.

Why are Consumers Buying Tablets?
Back in 2009, we ran a simple experiment comparing the experience of using an Apple iPod and a Microsoft Zune HD. We masked both devices to hide the manufacturer and asked users to pick the one that gave them the better experience. The result was pretty devastating when you know the market shares of both: out of 10 users, the vast majority picked the Zune HD over the iDevice, yet the glory of marketing and the promise of a perfect experience let Apple win in the market. In the meantime, Apple launched the iPod Touch, powered by the same iOS that powers the iPhone and iPad, and the user experience changed for the better.

When the Apple iPad came along, it was not a new concept. In fact, Microsoft had been selling Windows XP Tablet PC Edition for almost a decade, yet the product stayed in vertical markets and never spread to the mainstream, mainly due to its lack of touch friendliness.

Apple rode on the popularity of the iPhone, but also on something else: the experience of browsing web pages on its devices was much better than on netbooks (almost all of which were powered by Intel Atom processors), and even on some low-end notebooks running weak CPUs from Intel and AMD. Today, the financial results companies report explain the convergence: the bottom price threshold for an acceptable compute experience has been reached. Consumers want a smooth user experience starting at $499. If you don't offer a smooth experience for that amount of money or more, don't show up to the party.

Delusion #1: You don't need Powerful Graphics for work in Windows or Mac OS
This is the single most common explanation given by marketing staff when it comes to parts that rely on the whole-system experience rather than just number crunching. A smooth user interface and workflow are paramount for today's users. With all kudos to Android tablets, they still haven't reached the smoothness of Apple products, even though we saw the Apple iPad 3, i.e. "the new iPad", start to use SSD as a cache and in some cases not be as smooth as it should be.

This is an amazing setup... it is also very expensive. Can it actually deliver user experience promised by Apple and Intel?

Thus, let's say you are a user who just shelled out $3000 for a top-of-the-line MacBook Air and a 27" Thunderbolt Display, and connected the two with a $49 Thunderbolt cable. You know you won't play games, because you have an integrated graphics part. However, once the laptop is connected to the display and you are working in, let's say, two instances of Safari with about 10 tabs each, you'll start noticing that YouTube videos don't play as smoothly as they should, and scrolling isn't as smooth as you would have expected. The usual explanation is "oh, the applications are not accelerated", but should you care about that when you just spent $3000 on your computing experience? Looks are one thing; the ability of integrated graphics to simply refresh 4.9 million pixels 60 times a second is another.
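The pixel count above can be sanity-checked with a little arithmetic. The sketch below assumes a 13-inch MacBook Air panel at 1440x900 running alongside the Thunderbolt Display at 2560x1440; the laptop model is not named here, so treat that resolution as an assumption rather than the tested configuration.

```python
# Back-of-the-envelope pixel throughput for a laptop panel plus external display.
# Assumption: 13-inch MacBook Air panel at 1440x900 (the exact model is not stated).
LAPTOP_PANEL = (1440, 900)
# Native resolution of the 27-inch Thunderbolt Display.
EXTERNAL_PANEL = (2560, 1440)
REFRESH_HZ = 60


def pixel_count(resolution):
    """Return the number of pixels for a (width, height) resolution."""
    width, height = resolution
    return width * height


total_pixels = pixel_count(LAPTOP_PANEL) + pixel_count(EXTERNAL_PANEL)
pixels_per_second = total_pixels * REFRESH_HZ

print(f"Total pixels across both panels: {total_pixels:,}")
print(f"Pixels refreshed per second at {REFRESH_HZ} Hz: {pixels_per_second:,}")
```

Under these assumptions the two panels add up to just under five million pixels, which means the integrated graphics has to push roughly 300 million pixels every second before any application does real work.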

A good example of smoothness is a notebook powered by AMD's Bobcat APU. We have checked a ton of netbooks and "lightbooks" and saw that a cheap computer can offer a smooth Windows 7 UI experience. Pushing the point a bit further, both AMD Llano and Trinity APUs offer a great experience, and you can even play computer games at native resolution with full API compliance.

For example, at the recently held Trinity Reviewers Day, AMD ran a series of surveys among the press and analysts. The company showed two systems side by side and asked attendees to vote on which one delivered the smoother experience. For productivity, the company ran Microsoft Excel, Internet Explorer and Word. Out of 30 people, 24 said that the AMD Fusion A10 performed better than the Intel Core i5 processor, while five were undecided. The video shake removal test relied on MotionDSP technology, and here the Trinity A10 won 25 to 3 over the Core i5, with two undecided. The final test was file compression, for which the latest version of WinZip was used (OpenCL support included). 26 members of the press and analysts said the Fusion A10 system won, one said Intel did better and three were undecided.
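To put those votes on a common scale, the counts can be turned into percentages. This is only a quick tally sketch: the test labels are our own shorthand, and the Intel count for the productivity test is not reported, so it is inferred here as the remainder of the 30 attendees.

```python
# Votes from the side-by-side tests as (AMD A10, Intel Core i5, undecided).
# Assumption: the Intel count for the productivity test is not stated in the
# text, so it is inferred as whatever remains of the 30 attendees.
ATTENDEES = 30
votes = {
    "productivity": (24, ATTENDEES - 24 - 5, 5),
    "video shake removal": (25, 3, 2),
    "file compression": (26, 1, 3),
}


def shares(counts):
    """Convert raw vote counts into percentages rounded to one decimal."""
    total = sum(counts)
    return tuple(round(100 * count / total, 1) for count in counts)


for test_name, counts in votes.items():
    amd, intel, undecided = shares(counts)
    print(f"{test_name}: AMD {amd}%, Intel {intel}%, undecided {undecided}%")
```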

Delusion #2: You don't need CPU power at all
The second delusion, which we hear from companies that have a strong GPU but lack sheer CPU compute performance, is that CPU performance does not matter anymore. For example, Intel will demolish its competition in sheer performance. If you take a look at the table below, you'll see that the Intel Core i5-2500K runs circles even around the 8-core FX-8150, let alone current and upcoming quad-core processors such as the A8-3870 (Llano) and A10-5800K (Trinity).

Leaked Performance table shows AMD Fusion A10-5800K Series CPU fighting against Llano based APUs and Intel Sandy Bridge CPUs.



To say CPU performance is not important is a severe misjudgment. Today, more than ever, you need compute power to use secure banking, work on complex spreadsheets, or simply have a smooth experience editing photos (provided that you don't have a high-power GPU). Furthermore, if you are a prosumer, you will need as much CPU power as you can get, because you cannot lose business while waiting for magically accelerated software to appear. In our own experience of high-end graphics benchmarking, we have always found that the system with the best performing processor also delivers the highest graphical performance, even in tests where the CPU is not as critical. This is because at a certain point the processor becomes a bottleneck to all of that graphical computation, and in some cases even the fastest processors need to be overclocked.

Let Me Entertain You
After you are done with work, what you want is to be entertained (yes, more and more often people want to be entertained during work as well). The most popular entertainment options are social networks, social games, videos and games. The defense that people today don't game is borderline ridiculous, since more people play computer games than see movies in the cinema. The PC gaming industry is the largest branch of entertainment, with over $30 billion in annual revenue from software and hardware. Add in all the consoles and gaming-oriented platforms such as tablets and smartphones, and you get an industry almost larger than Hollywood and the music industry... combined (note - not the commercial aspect, marketing etc - we're looking at product sales only, game boxes and movie ticket / blu-ray sales).

How do you benchmark that video experience? How do you benchmark games? It is easy to say "you can't", as some analysts like to write. But that is simply not true. The question is, though - can your laptop play back a 1080p video you purchased on your desktop, even though it only has a 1366x768 display? Here at BSN*, we've had our fair share of notebooks and even desktops that could not pass muster in dynamic picture resizing, an option we consider to be an absolute must. How many images can be opened in Photoshop before the system slows down to a crawl?

And what settings in computer games make for a smooth gaming experience, exactly? All of these values are what testers on the sites you read, and appreciate or criticize, set out to measure. There are no silver bullets that can pin down the real-world usage model of each and every user, but there are exact things you can measure, and cases where the performance, and more importantly your experience, will be decided by something you did not expect - a hard drive that goes to sleep, a loud optical drive with a slow spin-up time, not enough USB 3.0 ports for fast transfer... and all of these can be measured.
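As a rough illustration of how such surprises show up in numbers, a repeated timing run can be enough: a drive that has gone to sleep tends to appear as one slow outlier among otherwise similar results. The sketch below is a minimal harness and not a tool we use; `example_workload` is a hypothetical stand-in for a real task such as opening a file or resizing a picture.

```python
import statistics
import time


def time_workload(workload, runs=5):
    """Return wall-clock durations in seconds for repeated runs of workload."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


def example_workload():
    # Hypothetical stand-in for a real task; replace with the operation
    # you actually care about (file copy, image resize, application launch).
    sum(i * i for i in range(100_000))


samples = time_workload(example_workload)
print(f"median: {statistics.median(samples):.4f} s")
print(f"worst:  {max(samples):.4f} s")
```

Reporting both the median and the worst run is a deliberate choice: the median describes the typical experience, while the worst run is where a slow spin-up or a sleeping drive would show.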

Solving the Benchmark Dilemma, Car Style
When it comes to benchmarking computing devices, we actually don't need to reinvent the wheel. The dilemma we face as an IT industry is very similar to the one faced by car testers and journalists. The car industry has the same issues - for years, standards bodies had their own sets of measurements, for which the car makers tweaked and optimized their vehicles, only to have those same bodies change the measurements every time an erroneous calculation occurred. Be that MPG ratings, which recently caused a lot of shuffling behind the scenes, or EuroNCAP crash test ratings, which were changed in a way that made most five-star rated cars lose their rank, with some even deemed more dangerous than before.

We live in an evolving industry, and every new generation will push the boundaries much more aggressively than the car industry does. However, just like our computers connect to displays, car tires connect to the road.

Computing reviewers are adopting the same metrics, and those who aren't need to adopt the same approach: verify whether the performance matches the officially released figures (practically every computer component or system now comes with a review guide that discloses both theoretical and measured performance), decide what the customized workloads are (no two magazines test cars in exactly the same way, just as no two testers drive a car in the same way) and work out how much value per dollar you are getting.

Car industry loves to test on this near perfect circle in Italy, recently acquired by VW. Should we disregard cars not tested here?

Miles to the Gallon or Liters per 100 Kilometers turn into Performance Per Watt, i.e. how much juice you have to pay for to run your workload on the computer. 0-60 mph, i.e. 0-100 kph, is akin to boot time or time to wake for a system. Driving the car on a track is equal to running Unigine's Heaven benchmark or 3DMark, whose performance score takes both the CPU and the GPU into account. Taking the car on an open road and measuring its highway or city performance is what applications such as Adobe Creative Suite, Blackmagic DaVinci Resolve, CyberLink MediaEspresso, PCMark and nearly any well coded video game (taking certain developer relations into account) stand for.
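The car analogy boils down to a few simple ratios. The sketch below shows one way they could be computed; the two systems and every number in it are hypothetical placeholders rather than measured results, and `SystemResult` is an illustrative name, not part of any benchmark suite.

```python
from dataclasses import dataclass


@dataclass
class SystemResult:
    name: str
    benchmark_score: float  # any score where higher is better
    avg_power_watts: float  # average power draw during the run
    price_usd: float
    boot_seconds: float     # the "0-60 mph" of a computer

    @property
    def perf_per_watt(self) -> float:
        """Fuel-economy analogue: score delivered per watt consumed."""
        return self.benchmark_score / self.avg_power_watts

    @property
    def perf_per_dollar(self) -> float:
        """Value analogue: score delivered per dollar spent."""
        return self.benchmark_score / self.price_usd


# Hypothetical systems with made-up numbers, for illustration only.
systems = [
    SystemResult("System A", 6000, 120, 800, 25),
    SystemResult("System B", 4500, 75, 500, 18),
]

for s in systems:
    print(f"{s.name}: {s.perf_per_watt:.1f} points/W, "
          f"{s.perf_per_dollar:.2f} points/$, boots in {s.boot_seconds} s")

most_efficient = max(systems, key=lambda s: s.perf_per_watt)
print(f"Most efficient: {most_efficient.name}")
```

In this made-up case the system with the lower raw score comes out ahead on both efficiency and value, which is exactly the kind of result a single headline number hides.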

Benchmarks versus Real World: I am sorry, but there are no victors here. You have to measure performance taking both sides into account, and you have to select the right tools for the job. There are benefits to measuring both, as in many cases real-world tests become much more subjective, especially as settings and applications vary between testers.

Conclusion: Vote with Your Wallet
At the end of the day, what decides whether or not a product is good enough for the market is the effort reviewers have put into writing a review of it. A good review needs to be based on principles similar to those of automotive reviews: cover what is both above and below the hood/bonnet, and test the vehicle in both controlled and uncontrolled environments.

Should you stop trusting the benchmarks today? No, you should not. But you also need to take things at face value, because the subjective experience is the most important one. And seeing your $3000 MacBook Air plus 27" Thunderbolt Display run like a turtle during simple web browsing is reason enough to go back to the store.

Real-world benchmarks are the ones to be trusted, while it is always good to rely on selected synthetic benchmarks for comparable numbers, such as the ones from ElcomSoft, FinalWire, Futuremark, SiSoft or Unigine.

When you purchase your next computer, or a car, or just about anything measurable, think of just one thing… is this product really worth my hard earned ______________ (insert your local currency)?



© 2009 - 2013 Bright Side Of News*, All rights reserved.

