Original Link: https://www.anandtech.com/show/4012/nvidias-geforce-gtx-580-the-sli-update
NVIDIA's GeForce GTX 580: The SLI Update
by Ryan Smith on November 10, 2010 10:00 AM EST
Picking up immediately from where we left off yesterday with our review of NVIDIA’s new GeForce GTX 580, we have a second GTX 580 in house courtesy of Asus, who sent over their ENGTX580. With our second GTX 580 in hand we’re taking a look at GTX 580 SLI performance and more; we’ll also be taking a look at the voltage/power consumption relationship on the GTX 580, and at clock-normalized benchmarking to see just how much of the GTX 580’s improved performance is due to architecture and additional SMs, and how much is due to the clockspeed advantage.
| Asus ENGTX580 | GTX 580 | GTX 480 | GTX 460 1GB |
Stream Processors | 512 | 512 | 480 | 336 |
Texture Address / Filtering | 64/64 | 64/64 | 60/60 | 56/56 |
ROPs | 48 | 48 | 48 | 32 |
Core Clock | 782MHz | 772MHz | 700MHz | 675MHz |
Shader Clock | 1544MHz | 1544MHz | 1401MHz | 1350MHz |
Memory Clock | 1002MHz (4008MHz data rate) GDDR5 | 1002MHz (4008MHz data rate) GDDR5 | 924MHz (3696MHz data rate) GDDR5 | 900MHz (3600MHz data rate) GDDR5 |
Memory Bus Width | 384-bit | 384-bit | 384-bit | 256-bit |
Frame Buffer | 1.5GB | 1.5GB | 1.5GB | 1GB |
FP64 | 1/8 FP32 | 1/8 FP32 | 1/8 FP32 | 1/12 FP32 |
Transistor Count | 3B | 3B | 3B | 1.95B |
Manufacturing Process | TSMC 40nm | TSMC 40nm | TSMC 40nm | TSMC 40nm |
Price Point | ~$510 | $499 | ~$420 | ~$190 |
As you may recall from our launch article yesterday, NVIDIA would only make a second GTX 580 available to us for SLI testing if we also accepted and reviewed a high-end gaming system, an offer which we declined. As a result we were unable to look at GTX 580 SLI performance right away. However Asus quickly came to our aid and sent us one of their first GTX 580s, giving us a second card to work with both for SLI testing and as a second data point. Since yesterday afternoon we’ve been busy at work seeing what a pair of NVIDIA’s latest and greatest are capable of doing.
It shouldn’t come as any surprise that as a launch-day card, the ENGTX580 is a near carbon copy of the GTX 580 reference design. Asus is using the reference PCB and cooler, and is differentiating the card through a very token 10MHz factory overclock and the possibility of a much greater overclock through voltage adjustment using their SmartDoctor utility (which we do not have in hand at this time). At this point the factory overclock has us scratching our heads however, as this is the second Asus card we’ve received with such an overclock. We’re not ones to look a gift horse in the mouth when it comes to a free performance boost, but a 10MHz (1.3%) core overclock? It’s the very definition of a token overclock – it’s not enough of an overclock to actually make a difference in performance. We’re still trying to get to the bottom of this one…
About Last Night
Prior to the actual launch of the GTX 580, we were concerned about what the availability would be like. With NVIDIA engaging in such a quick development cycle for GF110 and being unwilling to discuss launch quantities, we didn’t think they could do it. We’re glad to report that we were wrong, and the GTX 580 has been in steady supply since the launch yesterday morning. Kudos to NVIDIA for proving us wrong here and hitting a hard launch – it’s the kind of action that helps to make up for the drawn out launch of the GTX 480 and GTX 470.
Actually getting a GTX 580 is turning out to be a curious affair however. When we first saw Newegg post their GTX 580s for sale our jaws dropped, as they were all $50-$80 over NVIDIA’s MSRP; the GTX 580 is already an expensive card and selling it over MSRP isn’t doing NVIDIA any favors. However after checking out MWave, Tiger Direct, the EVGA Store, and others, we saw at least one card at MSRP at each store. Were NVIDIA and their partners price gouging, or was it something else? The truth is often in the middle.
At this point Newegg is the 800lb gorilla of computer parts; they have the largest volume and as far as we can figure they get the bulk of the launch cards allocated to the United States. So what they’re doing is usually a good barometer of what pricing and availability is going to be like – except for this week. As it turns out Newegg is running a 10% sale on all video cards via a well-known promo code; and as best as we can tell, rather than excluding the GTX 580 from their sale, they simply hiked up the price on all of their GTX 580 cards so that prices were at or around MSRP after the promo code was applied. The end result is that the cards look like they’re going for well over MSRP when they’re not. Judging from pricing at Newegg and elsewhere it looks like there is some slight gouging going on (we can only turn up a couple of cards that are actually at $499 instead of $509/$519), but ultimately GTX 580 prices aren’t astronomical like they appeared at first glance. After this stunt, this will probably go on the record as being one of the weirder launches.
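The promo-code arithmetic described above can be checked with a short Python sketch. The $499 MSRP, the $50-$80 markups, and the 10% discount come from the text; the function names are illustrative, and the sketch assumes the promo code is a flat percentage off the listed price.

```python
MSRP = 499.00          # NVIDIA's GTX 580 MSRP, from the article
PROMO_DISCOUNT = 0.10  # Newegg's 10% video card promo code, from the article


def price_after_promo(listed_price, discount=PROMO_DISCOUNT):
    """Price actually paid once a flat percentage promo code is applied."""
    return listed_price * (1.0 - discount)


def listed_price_for_target(target_price, discount=PROMO_DISCOUNT):
    """Listed price that lands exactly on target_price after the promo code."""
    return target_price / (1.0 - discount)


if __name__ == "__main__":
    # The article reports listings $50-$80 over MSRP.
    for markup in (50, 80):
        listed = MSRP + markup
        paid = price_after_promo(listed)
        print(f"listed ${listed:.2f} -> ${paid:.2f} after promo")
    # Listed price that would net exactly MSRP after the code.
    print(f"break-even listing: ${listed_price_for_target(MSRP):.2f}")
```

Under these assumptions a listing $50-$80 over MSRP works out to roughly $494-$521 after the code, which is consistent with prices landing at or around MSRP.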
Asus’s ENGTX580: A Second Data Point
With a second GTX 580 in hand we have a second data point to look at with respect to the GTX 580’s physical attributes. As we’ve noted time and time again, with the GeForce 400 (and now, 500) series, NVIDIA has moved to having a range of VIDs for each product instead of only a single VID for every card. The result is that much like CPUs the power consumption and resulting cooling/noise properties of a product can vary from card to card.
Our reference GTX 580 shipped with a load voltage of 1.037v, notably higher than the sub-1v load voltage of our reference GTX 480; that it still draws less power is a solid example of how NVIDIA has been able to reduce leakage on their GPUs. By luck our Asus GTX 580 comes with a different voltage, 1.000v, giving us some idea of what the VID range is going to be for the GTX 580 and what a card with a “good” GPU might be like.
GeForce GTX 480/580 Voltages
Ref 480 Load | Ref 580 Load | Asus 580 Load |
0.959v | 1.037v | 1.000v |
Not surprisingly, with a lower load voltage our Asus card consumes less power in all of our tests. We’ll just jump right in to the charts here and dissect things.
Under Crysis, system power consumption is 20W lower, putting this GTX 580 under the Radeon HD 5970 instead of over it, and also within 10W of the 6850CF, the GTX 470 (with its fused off SMs), and even the GTX 285. Going by power consumption this card is only slightly worse than the GTX 285, a far cry from the GTX 480 and the 421W system power consumption we see with it.
The situation is much the same with Program X, where power consumption has dropped 26W to 426W. Here it’s not quite as close to the GTX 470, but it’s still only a dozen watts or less off of the GTX 285 and 6850CF.
However it turns out the effect on temperature and noise isn’t as great as we’d assume. These two aspects are of course dependent on each other, as temperatures drive fan speeds and vice versa. For whatever reason our Asus GTX 580 gets slightly warmer and slightly louder than our reference GTX 580, even though we’ve already determined that power consumption – and hence heat dissipation – is lower. Some of this may come down to BIOS programming by Asus, but at the moment we don’t really have a great explanation for why power consumption can drop but temperatures can slightly rise. At the moment we’re entertaining the idea that the difference may be in assembly, and that the reference GTX 580 has a different thermal paste application than the Asus card.
In any case, from these two data points we can clearly determine that power consumption can differ from that of our reference card; whether temperatures and noise can differ as well is still in question. Ultimately we’d like to find out the full VID range of the GTX 580, if only to get an idea of how our cards compare to the complete spectrum of possibilities.
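For a rough sense of how much the voltage gap alone could matter, the Python sketch below applies the textbook first-order CMOS approximation that dynamic power scales with V² × f. That model is an assumption brought in for illustration, not something measured in this article: it ignores leakage (which also changes with voltage), the memory and power delivery circuitry, and the rest of the system, so it cannot be compared directly to the wall-power figures above. The voltages and core clocks are taken from the tables above; the function name is hypothetical.

```python
# Load voltages and core clocks from the article's tables.
REF_580_LOAD_V = 1.037
ASUS_580_LOAD_V = 1.000
REF_580_CORE_MHZ = 772.0
ASUS_580_CORE_MHZ = 782.0


def dynamic_power_ratio(v_new, v_ref, f_new=1.0, f_ref=1.0):
    """Ratio of dynamic power under the first-order P ~ C * V^2 * f model.

    Assumes identical switched capacitance on both chips and ignores
    leakage entirely; this is an approximation, not a measurement.
    """
    return (v_new / v_ref) ** 2 * (f_new / f_ref)


if __name__ == "__main__":
    voltage_only = dynamic_power_ratio(ASUS_580_LOAD_V, REF_580_LOAD_V)
    with_clock = dynamic_power_ratio(
        ASUS_580_LOAD_V, REF_580_LOAD_V, ASUS_580_CORE_MHZ, REF_580_CORE_MHZ
    )
    print(f"voltage only:       {voltage_only:.3f}x reference dynamic power")
    print(f"with 10MHz overclock: {with_clock:.3f}x reference dynamic power")
```

Under this simplified model the lower VID would cut GPU dynamic power by several percent even after accounting for the 10MHz factory overclock, which is at least directionally consistent with the lower system power readings.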
GTX 580 SLI: Setting New Dual-GPU Records
Today’s main event of course is the performance of the GTX 580 in SLI mode. We hope that it doesn’t spoil things for anyone when we say that the GTX 580 in SLI is setting new records for dual-GPU performance in our charts, a natural consequence of pairing up what was already the fastest single GPU card on the market. Since the results are going to be rather self-explanatory, we’ll skip the running commentary here and stick to the charts.
There are two situations where the GTX 580 SLI doesn’t handily beat everything else: Metro 2033, and Civilization V. The latter appears to be yet another incident where NVIDIA’s apparently faulty Civ5 SLI profile is robbing an SLI setup of performance, while Metro 2033 is a more interesting case. At 1920 the 580 SLI is well in the lead, but at 2560 SLI scaling is breaking down, letting the 5870CF take a slight lead.
Meanwhile in other cases we’re clearly running into CPU limits even at 2560, as both Wolfenstein and HAWX are definitely hitting the wall; though these are already two of our fastest games before including SLI. The good news is that this leaves plenty of performance for eye candy options, as NVIDIA’s fantastic but expensive Transparency AA and Supersample AA options for DX10 and DX11 are still available. For the IQ nuts out there that won’t settle for anything less than the best, we managed to get the 580 SLI running Crysis with all Enthusiast settings and 4x SSAA at a playable framerate of 42.8fps – albeit at 1680x1050. Perhaps next year’s 28nm die shrink will unlock enough performance that we can seriously start considering SSAA at the very high end?
As for power, temperature, and noise, the results are in-line with where we’d expect them to be considering we’re pairing up high-end cards. Compared to the GTX 480 everything is peachy; idle power is down 55W(!), load power is down 40-80W, gaming temperatures are down 10C, and even load noise is way down. Here we see the same 7dB drop as a single GTX 580, bringing the GTX 580 SLI in below the 5970, a single GTX 480, and only slightly above a single GTX 285. Bear in mind that we’re running our cards directly next to each other here to look at the worst case scenario, so given some spacing everything here would be even quieter. Truth be told, we did not really have high hopes here, as we expected the lack of a PCB ventilation hole to take its toll; we’re pleasantly surprised as a result.
On the flipside, we’re still looking at a lot of power consumption – GTX 580 doesn’t change the fact that GF100/110 cards are in their own little universe in SLI compared to the next most power hungry setup, a 5870CF. Meanwhile noise isn’t bad, but if you’re used to a single card then this will probably catch you off guard. So the usual concerns stand with the GTX 580 SLI: make sure you have a solid high wattage power supply, an airy case, and ideally a motherboard with an x16 PCIe slot located farther away from the first one.
Normalized Clocks: Separating Architecture & SMs from Clockspeed Increases
While we were doing our SLI benchmarking we got several requests for GTX 580 results with normalized clockspeeds in order to better separate what performance improvements were due to NVIDIA’s architectural changes and enabling the 16th SM, and what improvements are due to the 10% higher clocks. So we’ve quickly run a GTX 580 at 2560 with GTX 480 clockspeeds (700MHz core, 924MHz memory) in order to capture this data. Games that benefit most from the clockspeed bump are going to be memory bandwidth or ROP limited, while games showing the biggest improvements in spite of the normalized clockspeeds are games that are shader/texture limited or benefit from the texture and/or Z-cull improvements.
We’ll put 2 charts here, one with the actual framerates and a second with all performance numbers normalized to the GTX 480’s performance.
Games showing the lowest improvement in performance with normalized clockspeeds are BattleForge, STALKER, and Civilization V (which is CPU limited anyhow). At the other end are HAWX, DIRT 2, and Metro 2033.
STALKER and BattleForge hold consistent with our theory that games that benefit the least when normalized are ROP or memory bandwidth limited, as both games only see a pickup in performance once we ramp up the clocks. And on the other end HAWX, DIRT 2, and Metro 2033 still benefit from the clockspeed boost on top of their already hefty boost thanks to architectural improvements and the extra SMs. Interestingly Crysis looks to be the paragon game for the average situation, as it benefits some from the arch/SM improvements, but not a ton.
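The normalization described in this section can be sketched in a few lines of Python. The framerates below are made-up placeholders chosen only to show the arithmetic; they are not results from our charts, and the function names are hypothetical.

```python
def normalize(fps, baseline_fps):
    """Express a framerate relative to a baseline (1.0 == baseline)."""
    return fps / baseline_fps


def split_gain(fps_480, fps_580_at_480_clocks, fps_580_stock):
    """Split the GTX 580's gain over the GTX 480 into two multiplicative steps.

    Returns (arch_sm_gain, clock_gain, total_gain) as fractions, where
    (1 + arch_sm_gain) * (1 + clock_gain) == 1 + total_gain.
    """
    arch_sm_gain = normalize(fps_580_at_480_clocks, fps_480) - 1.0
    clock_gain = normalize(fps_580_stock, fps_580_at_480_clocks) - 1.0
    total_gain = normalize(fps_580_stock, fps_480) - 1.0
    return arch_sm_gain, clock_gain, total_gain


if __name__ == "__main__":
    # HYPOTHETICAL framerates for illustration only, not measured data.
    arch, clock, total = split_gain(
        fps_480=40.0, fps_580_at_480_clocks=44.0, fps_580_stock=48.4
    )
    print(f"architecture + 16th SM: {arch:+.1%}")
    print(f"clockspeed:             {clock:+.1%}")
    print(f"total:                  {total:+.1%}")
```

A game that is mostly ROP or memory bandwidth limited would show a small first term and a larger second one, while a shader or texture limited game would show the opposite pattern.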
A subset of our compute benchmarks is much more straightforward here; Folding@Home and SmallLuxGPU improve 6% and 7% respectively from the increase in SMs (theoretical improvement: 6.7%), and then after the clockspeed boost reach 15% faster. From this it’s a safe bet that when GF110 reaches Tesla cards, the performance improvement for Tesla won’t be as great as it was for GeForce, since the architectural improvements were purely for gaming purposes. On the flip side, with so many SMs currently disabled, if NVIDIA can get a 16 SM Tesla out, the performance increase should be massive.
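The theoretical figures for the compute case follow directly from the spec table at the top of the article. The Python sketch below works them out under the assumption of perfect scaling with both stream processor count and shader clock; real workloads rarely scale perfectly, so treat the result as an upper bound rather than a prediction. The function name is hypothetical.

```python
# Values from the spec table at the top of the article.
SP_GTX580, SP_GTX480 = 512, 480            # stream processors
SHADER_GTX580, SHADER_GTX480 = 1544.0, 1401.0  # shader clock, MHz


def ideal_gains(sp_new, sp_old, clk_new, clk_old):
    """Ideal (perfect-scaling) gains from extra shaders and higher clocks.

    Returns (sm_gain, clock_gain, combined_gain) as fractions.
    """
    sm_gain = sp_new / sp_old - 1.0
    clock_gain = clk_new / clk_old - 1.0
    combined_gain = (1.0 + sm_gain) * (1.0 + clock_gain) - 1.0
    return sm_gain, clock_gain, combined_gain


if __name__ == "__main__":
    sm, clock, combined = ideal_gains(
        SP_GTX580, SP_GTX480, SHADER_GTX580, SHADER_GTX480
    )
    print(f"extra SM:  {sm:+.1%}")
    print(f"clocks:    {clock:+.1%}")
    print(f"combined:  {combined:+.1%}")
```

The ideal gain from the 16th SM alone is roughly 6.7%, in line with the 6% and 7% measured; the ideal combined gain comes out somewhat above the 15% we measured, which is what one would expect if scaling is slightly less than perfect.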