BRIGHT SIDE OF NEWS
Wednesday, May 22, 2013
Comments on article
Intel Larrabee finally hits 1TFLOPS - 2.7x faster than nVidia GT200!
Comments
Riiight...
by:
Anonymous
on
12/7/2009
"1 for Intel and a kick in the short and curlies to nVidia and AMD!
Microsoft should consider Larrabee for XBOX 720 since SONY has gone with Imagination Technologies PowerVR Series 6 for PS4"
You really have no idea what you're on about...
ah well...
http://www.reuters.com/article/idUSTRE5B51QR20091206?type=technologyNews
by:
rvalencia
on
12/7/2009
GT5 Prologue (demo) = 1080p mode is 1280x1080 (2xAA) in-game while the garage/pit/showrooms are 1920x1080 with no AA. 720p mode is 1280x720 (4xAA)
Refer to http://forum.beyond3d.com/showthread.php?t=46241
GT5 Prologue's in-game 1080p is fake i.e. not real 1920x1080.
by:
rvalencia
on
12/7/2009
From Beyond3D's forum, a developer was able to achieve ~1 TFlops (1000 GFlops) using Radeon HD 4870 for the SGEMM benchmark.
http://forum.beyond3d.com/showthread.php?t=54842
Next few days it's cancelled!!!
by:
Anonymous
on
12/4/2009
LOL
dude
by:
Anonymous
on
12/3/2009
SGEMM has nothing to do with QCD, dude - it was 'the other' number that was for QCD - the one that was, what.. 8 GFlops?
Anyway, SGEMM is a full (dense) square matrix multiply. QCD, or the Dirac solver, is all about sparse matrices, and until lately it has been variants of the conjugate gradient algorithm (sparse matrix times full vector multiply) with an almost diagonal sparse matrix => a general-purpose dense multiply is crazy for it either way.
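To make the contrast with a dense multiply concrete, here is a minimal sketch of conjugate gradient on an almost diagonal matrix. It is a toy stand-in, not a Dirac solver: the tridiagonal test matrix, the function names, and the tolerance are all assumptions chosen for illustration, and a real lattice QCD operator is larger, complex-valued, and structured differently.

```python
# Toy sketch: conjugate gradient where the only matrix operation is a
# sparse matrix times dense vector product. The matrix is tridiagonal and
# stored as three diagonals, so each product costs O(n) instead of O(n^2).

def tridiag_matvec(lower, diag, upper, x):
    """Multiply a tridiagonal matrix (given by its diagonals) by vector x."""
    n = len(diag)
    y = [0.0] * n
    for i in range(n):
        y[i] = diag[i] * x[i]
        if i > 0:
            y[i] += lower[i - 1] * x[i - 1]
        if i < n - 1:
            y[i] += upper[i] * x[i + 1]
    return y

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def conjugate_gradient(lower, diag, upper, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite tridiagonal A."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x starting at zero
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = tridiag_matvec(lower, diag, upper, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

# Hypothetical test problem: the 1D Laplacian tridiag(-1, 2, -1).
n = 8
diag = [2.0] * n
off = [-1.0] * (n - 1)
b = [1.0] * n
x = conjugate_gradient(off, diag, off, b)
```

The solver never forms or multiplies a full n-by-n matrix, which is why an SGEMM figure says little about this kind of workload.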
Re. Theo
by:
Anonymous
on
12/2/2009
Theo,
I'm a CUDA developer. If you've met CUDA devs who think this business is about tricks and magic, they just didn't know what they were talking about.
I would have to say that 95% of all the techniques can be found in the programming guide and best practices guide.
There are a few performance enhancers that can be used on both Nvidia and AMD/ATI cards that aren't really documented. According to my reading, analogous optimizations can be found both on the Cell BE and on regular Intel processors.
There is nothing in the closet about GPGPU programming these days. Everything is well documented, with more and more development tools on the way, because that's the way Nvidia and others want it.
There is nothing mystical about GPGPU programming other than that it's new and it's scaring the sh*t out of Intel. Like someone mentioned, they are about 2 years behind.
Larrabee is a graphics card?
by:
Greg442
on
12/2/2009
Wait, I'm confused, Larrabee is a graphics card? Not a CPU with graphics integrated? Why in the hell would Intel think it was a good idea to compete with ATI/NVidia building graphics cards? Only a fool would buy a first generation, completely "experimental" graphics card from Intel. This is an "EPIC FAILURE" considering the fact the R&D on Larrabee is rumored to exceed the cost of both ATI 5000 Series and NVidia Fermi... ridiculous
half the story
by:
Anonymous
on
12/2/2009
Ok, I understand that GPU computing isn't about video display, but about number crunching, but the rest of the actual video cards that perform this feature ALSO render video. Come back, Intel, when your card actually does both. Otherwise why slap any name on it such as Larrabee....
Larrabee
by:
Anonymous
on
12/2/2009
Is a Graphics Card - it is not a CPU/GPU - it requires a CPU. The 1 TFlop is a huge milestone, now we need to see Crysis 2 running on it with all the shaders, anti-aliasing, physics, etc.
1 for Intel and a kick in the short and curlies to nVidia and AMD!
Microsoft should consider Larrabee for XBOX 720 since SONY has gone with Imagination Technologies PowerVR Series 6 for PS4
Theo, GT5 @ 1080p *IS* a trick
by:
Anonymous
on
12/2/2009
... and a dirty one at that. In reality the RSX renders it at 1280x1080, but anamorphically (rectangular pixels instead of square ones). The picture then gets stretched sideways by the PS3's internal HW scaler to what would *appear* to be 1920x1080, while the rectangular pixels become square ones so the picture does not look funny.
Wipeout HD uses the same HW scaler trick to scale on-the-fly and make it look like 1920x1080 @ solid 60 fps, when most of the time it's somewhere between 1920 and 1280 x 1080. Apparently the dev got a lot of praise for this stunt. I have to admit, it is sorta impressive as a technical solution, but as a gamer I'd be loath to see this happen often. And they had the nerve to present it as a fully-fledged Full HD experience.
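As a rough check on the numbers in this comment, the short sketch below works out how much the scaler has to stretch a 1280x1080 frame and what share of a true 1920x1080 frame is actually rendered. The function names are made up for illustration; only the resolutions come from the comment.

```python
# Sketch: arithmetic behind the anamorphic 1280x1080 -> 1920x1080 stretch.

def horizontal_stretch(render_width, output_width):
    """Factor by which each rendered pixel is widened by the scaler."""
    return output_width / render_width

def rendered_pixel_fraction(render_w, render_h, output_w, output_h):
    """Share of the output frame's pixel count that is really rendered."""
    return (render_w * render_h) / (output_w * output_h)

stretch = horizontal_stretch(1280, 1920)                      # 1.5x sideways
fraction = rendered_pixel_fraction(1280, 1080, 1920, 1080)    # about two thirds

print(f"stretch: {stretch}x, rendered pixels: {fraction:.1%} of Full HD")
```

So the GPU shades roughly two thirds of the pixels a native 1920x1080 frame would need, and the scaler makes up the rest.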
RE: Standards...
by:
Theo Valich
on
12/2/2009
HPC standards are the ones GPU needs adjusting to, not the other way around.
Bear in mind that everybody is using their own
4K by 4K matrix is an industry standard, and if different CPU vendors go by it, there is no pardon given if a GPU wants to capture not a small slice, but a lion's share of the multi-billion dollar business.
Graphics was a lot of "tricks'n'hoes", as one developer colorfully put it. Whenever I talk with gamedevs about their title, I hear about the amount of tricks that have to be put inside "because the HW can't support it" or "because the routine would be too slow on slower HW", etc. Only clearing those tricks out can lead to better performance.
When you see a PlayStation 3 game rendered in 1080p, especially Gran Turismo 5 - bear in mind that it is a cut-down GeForce 7900 GTX with a 128-bit interface.
The development of PS3 games is the result you get when you have access to low-level hardware and there are no secrets to go around. The time has come for PC hardware to be open and clean, and let's see what kind of applications we can have without ghosts in closets.
Ed.
Larrabee scaling problem?
by:
Anonymous
on
12/2/2009
The demo utilized only half of the available cores. Why? Is the Larrabee architecture incapable of scaling any further? Is it due to thermal/power limitation? Or both?
by:
Anonymous
on
12/2/2009
300 GFLOPS for RV770?? That's TOTALLY UN-OPTIMIZED!! Some guy @ Beyond3D squeezed 2.7 TFLOPS in FP32 Matrix Mul!!!
by:
Anonymous
on
12/2/2009
If nvidia is scalar and ati is vector based, then comparing them directly would be pointless. Can someone confirm this?
re:
by:
Anonymous
on
12/2/2009
Yes, CUDA is easier. Nvidia is constantly offering more and more support to developers, they are really taking HPC seriously.
I believe this is also a huge benefit for gaming since it should also mean that game developers can squeeze more juice out of the hardware.
I saw people mentioning that some of the older AMD/ATI cards had serious trouble with coalescing memory accesses (using the on-chip to off-chip bandwidth efficiently), which really drives down efficiency.
CUDA is scalar, AMD is vector based...
by:
Anonymous
on
12/2/2009
So it means that GT200 is slower than RV770 theoretically, but easier to write code for. In other words RV770 is fast, but writing a real world application that would take 100% of its computing potential would be a very difficult task. Am I right? Is this why CUDA is so popular?
easier
by:
Anonymous
on
12/2/2009
x86 my a*s, with a 100 vector ops required to reach even a small percent of theoretical peak. I bet writing fast code will be as complicated as writing good shaders right now, that is keeping mem/alu ratio, hot texture caches, slot occupancy etc
by:
Anonymous
on
12/2/2009
A more efficient implementation of SGEMM on ATI cards has achieved 92% theoretical peak.
http://cerberus.fileburst.net/showthread.php?s=fbfd66aadcfb503bc6e82afcf1f4fcc4&t=54842
In the right hands, the 5970 should be able to get 3+ TFLOPs. And you can buy the card NOW...sorta.
by:
Anonymous
on
12/2/2009
It basically tries to do all the hardware stuff modern GPUs do in software instead. Also it's based off the x86 instruction set, so they believe it'll be easier for programmers to work with.
what..
by:
Anonymous
on
12/2/2009
What is this Larrabee again ??
Is it a GPU ?
Is it a CPU ?
Is it supposed to be a CPU/GPU combo ?
1.8 > 0.9
by:
Anonymous
on
12/2/2009
Yes, there is an implementation running at 2/3 on an AMD card.
Thus it is doing around 1.8 TFLOPs. 1.8 TFLOPs is more than 0.9xx TFLOPs (pun intended).
by:
Anonymous
on
12/2/2009
My mistake, but the fact remains.
down
by:
Anonymous
on
12/2/2009
'do you?' silly me
heh
by:
Anonymous
on
12/2/2009
You don't get it, don't you? A good dense matrix multi implementation on HD 4800, 5800 reaches 2/3 peak performance.
by:
Anonymous
on
12/2/2009
Oh goodness, wasn't everyone talking about how this was gonna totally flop (no pun intended)
by:
Anonymous
on
12/2/2009
You don't get it, do you? 2.72 TeraFLOPS for HD 5870 is just theoretical. The real number is lower in real world cases.
by:
Anonymous
on
12/2/2009
AMD has 4.64 TeraFLOPS card out NOW named Radeon HD 5970
btw Radeon HD 5870 is a 2.72 TeraFLOPS card :p
Also...
by:
Anonymous
on
12/2/2009
Also, the C1060 shown in the slides has 240 cores at 1.3 GHz each, all with fused multiply-add. This gives a total theoretical peak of:
240 * 1.3 * 2 = 624 GFLOPS. So reaching close to 400 is all to bad. That is probably at 200W. Also, if I recall correctly, hpcWire.com stated that 1 TFLOPS was the overclocked number, 800 if not overclocked. In other words, the watts for the overclocked 32-core 1 TFLOPS LRB might well be pretty high.
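The peak figure above is just cores times clock times operations per cycle, which can be sketched as follows. The function name is hypothetical, and the inputs are the ones quoted in this comment (240 cores, 1.3 GHz, 2 FLOPs per cycle, roughly 400 GFLOPS achieved), not independently verified values.

```python
# Sketch: theoretical peak throughput from the figures quoted in the comment.

def peak_gflops(cores, clock_ghz, flops_per_core_per_cycle):
    """Theoretical peak in GFLOPS: cores * GHz * FLOPs per core per cycle."""
    return cores * clock_ghz * flops_per_core_per_cycle

c1060_peak = peak_gflops(cores=240, clock_ghz=1.3, flops_per_core_per_cycle=2)

# Fraction of that peak reached by the "close to 400" GFLOPS result.
achieved_fraction = 400 / c1060_peak

print(f"peak: {c1060_peak:.0f} GFLOPS, achieved: {achieved_fraction:.0%} of peak")
```

Under these assumptions the SGEMM result lands at a bit under two thirds of the theoretical peak.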
4kx4k dense matrix product
by:
Anonymous
on
12/2/2009
4870 reaches 880 GFlops, been there done that
http://forum.beyond3d.com/showthread.php?t=54842
How do they get 1 TFLOPS?
by:
Anonymous
on
12/2/2009
16 (cores) * 2.0 (GHz) * 8 (vector width, right?) * 2 (fused multiply-add) = 512 GFLOPS.
That sounds like the "417 GFLOPS using half the cores on the prototype".
So the 1 TFLOPS card must have 32 cores, or am I missing something? The real question is what the GFLOPS/W and GFLOPS/$ are. If the 1 TFLOPS is simply a case of sticking two 16-core LRBs together, then wouldn't that compare in some manner to simply using 2 GPUs instead of one? Hence, TFLOPS/W and TFLOPS/$ are probably what we are all interested in, not simply the max TFLOPS on SGEMM.
Or?
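The guess that 1 TFLOPS implies 32 cores can be checked by inverting the same formula. This sketch takes the comment's own assumptions at face value (2.0 GHz, a vector width of 8, and 2 FLOPs per lane per cycle for fused multiply-add); none of these are confirmed Larrabee specifications, and the function names are made up for illustration.

```python
import math

# Sketch: how many cores the commenter's assumptions imply for 1 TFLOPS.

def peak_gflops(cores, clock_ghz, vector_width, flops_per_lane=2):
    """Theoretical peak in GFLOPS under the stated assumptions."""
    return cores * clock_ghz * vector_width * flops_per_lane

def cores_needed(target_gflops, clock_ghz, vector_width, flops_per_lane=2):
    """Smallest whole number of cores whose peak reaches target_gflops."""
    per_core = clock_ghz * vector_width * flops_per_lane
    return math.ceil(target_gflops / per_core)

half_chip = peak_gflops(cores=16, clock_ghz=2.0, vector_width=8)
needed = cores_needed(target_gflops=1000, clock_ghz=2.0, vector_width=8)

print(f"16 cores: {half_chip:.0f} GFLOPS peak, cores for 1 TFLOPS: {needed}")
```

With these inputs each core contributes 32 GFLOPS of peak, so 1000 GFLOPS needs 31.25 cores, which rounds up to the 32 the comment suggests.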
@ first two posters
by:
Anonymous
on
12/2/2009
You forgot to ask!
Can it run Crysis?
5870 at 1.8 TFLOPs
by:
Anonymous
on
12/2/2009
I posted this before but it seems my comment didn't show up??
Anyways I've read that SGEMM on AMD/ATI 5870 has reached 67 % of theoretical performance.
This would mean that the 5870 is doing 0.67*2.72 TFLOPs ~= 1.8 TFLOPs on SGEMM. Almost twice the performance of LRB.
The most interesting question is how general purpose LRB will be. But it seems they will be about 2 years behind the competition when their product finally arrives.
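The estimate in this comment is peak throughput multiplied by the reported efficiency. A minimal sketch, using only the figures quoted in the thread (2.72 TFLOPS peak, 67 % efficiency, and the 1 TFLOPS Larrabee headline number); the function name is an assumption for illustration.

```python
# Sketch: sustained SGEMM throughput as peak * efficiency, per the comment.

def sustained_tflops(peak_tflops, efficiency):
    """Estimated sustained throughput for a given fraction of peak."""
    return peak_tflops * efficiency

hd5870_sgemm = sustained_tflops(peak_tflops=2.72, efficiency=0.67)

# Ratio against the 1 TFLOPS Larrabee figure from the article headline.
ratio_vs_larrabee = hd5870_sgemm / 1.0

print(f"HD 5870 SGEMM estimate: {hd5870_sgemm:.2f} TFLOPS, "
      f"{ratio_vs_larrabee:.2f}x the Larrabee figure")
```

This is only as good as the 67 % figure itself, which the comment reports second-hand.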
by:
Kyocera
on
12/2/2009
Now, the price is missing.
And the release date of this Larrabee; and how well it will play games.
© 2009 - 2011 Bright Side Of News*, All rights reserved.