nVidia just made an announcement that is significant on many levels. On one hand, it is a commitment to the growth segment of the Tesla business unit; on the other, it is a confirmation of something both AMD and nVidia have been preaching for years – the key to the datacenter is efficiency, and in an increasingly visual world, using CPUs for visual computing is about as efficient as driving a 1MPG [one mile per gallon] car.

First of all, to clear things up: nVidia and nVidia-owned mental images [written without caps, but we'll use caps for clarity, sub. Ed.] launched a GPU-based RealityServer, not the first RealityServer. RealityServer is a quite successful product line from Mental Images, but the problem was that the nVidia-owned subsidiary was earning money by selling software that required clusters with thousands of CPUs – not quite what nVidia had in mind. In its third generation, RealityServer is going the GPGPU route [GPU Computing] and for the first time is deploying as a mixed hardware/software platform. This is also the first time nVidia is putting its stamp on RealityServer, rather than watching Mental Images just sell server-side software.

3D goes into Cloud: meet 3D Cloud Computing
During the conference, Dan Vivoli went on to discuss the problems faced by current CPU-based visual cloud users – for instance, Lucasfilm is creating a movie that takes 23 hours to render a single frame using the world’s fastest CPU. Even with all the code optimizations, Lucasfilm is getting a single frame in 23 hours, resulting in painfully long calculations that prolong the project’s time-to-market. It takes 575 hours [24 days] to create a single second of that movie, 34,500 hours [4 years] to create one minute, and a massive three million hours of CPU time to generate a 90-minute animated movie [354 years]. Naturally, those 354 years are divided and brought down to a few weeks by using tens of thousands of processing cores, and this is where nVidia wants to step in.
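Those numbers are easy to sanity-check. A minimal Python sketch, assuming the 25 frames per second implied by the quoted 575-hour figure [23 h x 25 = 575 h]; the constants are the article's own:

```python
# Sanity-check of the render-time arithmetic quoted above.
# Assumption: the movie runs at 25 frames per second, which is
# what the quoted 575-hour-per-second figure implies.

HOURS_PER_FRAME = 23                 # Lucasfilm's single-frame render time
FRAMES_PER_SECOND = 25               # implied frame rate (our assumption)

hours_per_second = HOURS_PER_FRAME * FRAMES_PER_SECOND   # 575 h (~24 days)
hours_per_minute = hours_per_second * 60                 # 34,500 h (~4 years)
hours_per_movie = hours_per_minute * 90                  # 90-minute feature

years_of_cpu_time = hours_per_movie / (24 * 365)
print(hours_per_movie, round(years_of_cpu_time))         # 3105000 354
```

The 354-year figure in the text only adds up if the total is measured in hours, not minutes – hence the correction above.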

Current RealityServer customers requested more computational power and nVidia is stepping up to the plate

This is only one small part of the whole equation. Augmented Reality is pushing into the mainstream, but don’t think that its advanced calculations can be done on-the-fly on the device. Given that the data needs to come through the cellular network, it is no wonder that several Augmented Reality applications are already rendered on mobile supercomputing sites, such as NASCAR’s display of fluid dynamics during every race.

nVidia also discussed business scenarios such as MyDeco [www.mydeco.com], the world’s largest home improvement website, featuring over 100,000 3D rooms, with users creating around 200 new rooms a day. MyDeco features the world’s largest collection of home furnishing products from top manufacturers. This is one of those billion-dollar websites you might never have heard of, but the business is booming.

Both nVidia and the companies in its ecosystem expect a breakthrough with 4G LTE technology, featuring bandwidth of up to 100Mbps when you are stationary, or up to 10Mbps when you are moving [switching between base stations]. With these speeds coming as early as 2010, mobile developers such as Ubermind [www.ubermind.com] are expecting a large jump in the need for computational power. The company presented a cloud-based car configurator app for the iPhone. The principle of the app is quite simple: it streams data from the server and you can customize a photo-realistic 3D car with various equipment, ranging from the exterior color and alloys to the interior – a quite similar experience to the Lamborghini Reventon demo from nVision 2008, but this time around everything happens on the cellphone. Company representatives did not disclose who might be the launch customer, but given the design of the demonstrated car, we would point the finger in the direction of Germany.

Markets where nVidia sees its advantage: notice that high-ASP segments were basically underrepresented

3D Cloud computing doesn’t stop at emerging products and markets such as iPhone apps. For instance, BAA Heathrow was given as an example where the 2nd-generation, CPU-based RealityServer was used during the development of Terminal 5. Terminal 5 consisted of around 10,000 AutoCAD files and 5,000 SolidWorks files. By using the previous generation of RealityServer, BAA Heathrow saw how many issues their original design had, and went on a wild ride fixing the elevators that didn’t fit and numerous other architectural and engineering errors. We won’t go into the clusterf**k that happened on the T5 opening day with the conveyor belts, but somehow Dan missed the punch-line "If only they had used the GPU?" Ah well.

The software – iray
Demonstration of iray’s Global Illumination – watch the lights

The same image again, this time with no blinds – finally photo-realistic and mathematically correct lighting

The person behind Mental Images is none other than Rolf Herken, founder of the company [he serves as both CEO and CTO], a visionary who wasn’t satisfied with the realism of the GPU industry and created a software framework where reality comes true – and that’s no pun. His team squeezed all the available performance out of the CPU side and is now naturally migrating to the GPU. According to Rolf, iray is the world’s first physically correct renderer. This means no tricks are used in production mode: you get the same results whether running on a CPU or on a GPU. Iray is also one of the first fully accurate Global Illumination software applications, and the company doesn’t hide the fact that its Mental Ray software powers 70% of the world’s movie production [iray is a part of Mental Ray], with Autodesk, Dassault Systèmes and Parametric Technology being open to full integration with their software suites.

In order to demonstrate the capabilities of iray, Rolf ran a demo of a photo-realistic office space and started to change parameters. There was a noticeable delay between the commands and the actual change – but nothing lasted more than 10-15 seconds. The demo ran on a 16-GPU node, the computing equivalent of a 640-CPU cluster.

In order to generate this on a typical workstation, Rolf claimed that even with the fastest CPU available to date, traditional CPU-side rendering would take 200 minutes. With RealityServer 3.0, that time is cut to mere seconds, and the productivity level visibly increases from one generation of software to the next.
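For a back-of-the-envelope feel for those claims, the quoted figures can be combined as follows. The 15-second value is the upper bound of the delay we observed in the demo; the ratios are our own illustration, not nVidia’s:

```python
# Rough illustration using the figures quoted in the text.
CPU_CORES_EQUIVALENT = 640          # claimed equivalence of the demo node
GPUS_IN_NODE = 16                   # GPUs in the demo node

cores_per_gpu = CPU_CORES_EQUIVALENT // GPUS_IN_NODE     # 40 CPU cores per GPU

cpu_render_seconds = 200 * 60       # claimed CPU-only render time (200 min)
gpu_render_seconds = 15             # upper bound of the observed demo delay

speedup = cpu_render_seconds / gpu_render_seconds        # ~800x
print(cores_per_gpu, speedup)
```

In other words, nVidia is effectively claiming that each Tesla GPU stands in for roughly 40 CPU cores on this workload.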

The big news about iray is the announcement that a free version, dubbed Iray Developer Edition, will become available at the same time as the platform, on November 30th, 2009. If you use iray for your own, non-commercial purposes, you won’t be paying a dime. If you decide to leave academia and advance to commercial applications, nVidia offers several licensing models.

RealityServer 3.0
As the name says, the RealityServer 3.0 platform is a combination of two things: Tesla RS servers and mental images’ iray software. The iray ray tracer is now GPU-accelerated, but for those who don’t require photo-realistic precision, it is possible to combine ray tracing with the OpenGL API.

nVidia RealityServer overview

For those who want to use the streaming capabilities, RealityServer comes with – again, GPU-accelerated – video compression, which guarantees a sustainable bit-rate. Still, seeing is believing.

Mental Images believes that most of its current RealityServer customers will jump at the GPU-based 3.0, given the power savings and performance benefits provided by the utilization of more efficient hardware. RealityServer is compatible with the standard CAD/DCC formats of all major commercial applications such as 3ds Max, Maya, CATIA and SolidWorks – as well as operating inside a multi-OS cluster.

Meet the hardware – Tesla RS
The hardware platform is naturally based on the Tesla S1070 1U server, consisting of four Tesla cards [4.14 TFLOPS single-precision, 345 GFLOPS IEEE 754-compliant double-precision]. Tesla RS is divided into three products: Tesla RS M, L and XL. During the press conference, Dan Vivoli [SVP Marketing, in charge of all professional business following the recent reorganization] joked that the inspiration for the model naming was clothing. Given that one of the targeted markets for RealityServer is the fashion industry [Apparel Styling], one might say – no wonder.

nVidia Tesla RS Lineup

Tesla RS M is based around eight GPUs supporting tens of concurrent users. The primary market for the Tesla RS M is a development team working on a 3D cloud app or similar visual application, and its selected beta testers.
Tesla RS L aims to replace the existing RealityServers inside companies such as BAA Heathrow, V.A.G. [Volkswagen Audi Group], Daimler AG, Airbus SAS, Boeing, Lucasfilm and many more. The RS L consists of 32 Tesla GPUs and is able to support hundreds of concurrent users. The main idea behind the Tesla RS L is to get large development teams working together on a project.

The top-end Tesla RS XL targets cloud computing providers and as such does not come in a specific configuration. Tesla RS XL starts with a 128-GPU cluster [four Tesla RS L nodes] and moves onwards. According to nVidia, this product targets thousands of concurrent users, serving as a consumer service platform.
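Scaled from the S1070’s quoted 4.14 TFLOPS per four GPUs [~1.04 TFLOPS per GPU], the raw single-precision throughput of the three tiers works out roughly as follows – our own extrapolation, not official nVidia figures:

```python
# Approximate single-precision throughput per Tesla RS tier,
# extrapolated from the S1070 figure of 4.14 TFLOPS over four GPUs.
SP_TFLOPS_PER_GPU = 4.14 / 4

# GPU counts per configuration, as given at the press conference
LINEUP = {"Tesla RS M": 8, "Tesla RS L": 32, "Tesla RS XL": 128}

for model, gpus in LINEUP.items():
    print(f"{model}: {gpus} GPUs, ~{gpus * SP_TFLOPS_PER_GPU:.1f} TFLOPS SP")
```

That puts even the entry-level RS M at over 8 TFLOPS of raw single-precision compute, and a baseline RS XL cluster north of 130 TFLOPS.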

Wrap up
nVidia showed that the company wants to grow the Tesla business, and that’s a given. Speculation that nVidia will abandon the GeForce market will probably meet its end after nVidia launches yet another dual-GPU based graphics card. It is good to see that the company has finally started to utilize the companies it acquired a while ago. In our off-the-record conversations over the course of the years, we learned that the two most common complaints about nVidia in this commercial aspect were the lack of support for the double-precision format and for ECC. The first was fixed with double-precision support [which was shown in the demo, btw], and now the GT300 Fermi is coming to market with support for ECC.