Original Link: https://www.anandtech.com/show/10067/ashes-of-the-singularity-revisited-beta
Ashes of the Singularity Revisited: A Beta Look at DirectX 12 & Asynchronous Shading
by Daniel Williams & Ryan Smith on February 24, 2016 1:00 PM EST

We’ve been following DirectX 12 for about 2 years now, watching Microsoft’s next-generation low-level graphics API go from an internal development project to a public release. Though harder to use than earlier high-level APIs like DirectX 11, DirectX 12 gives developers more control than ever before, and those who can tame it can unlock performance and rendering techniques simply not possible with earlier APIs. With the CPU bottlenecks of DirectX 11 coming into full view as single-threaded performance gains have slowed and CPUs have added cores instead, DirectX 12 could not have come at a better time.
Although DirectX 12 was finalized and launched alongside Windows 10 last summer, we’ve continued to keep an eye on the API as the first games are developed against it. As developers need the tools before they can release games, there has been an expected lag period between the launch of Windows 10 and the release of games using the API, and we are finally nearing the end of that period. Consequently we’re now getting a clearer picture of what to expect from games utilizing DirectX 12 as they approach their launch.
There are a few games vying for the title of the first major DirectX 12 game, but at this point I think it’s safe to say that the first high profile game to be released will be Ashes of the Singularity. This is because the developer, Oxide, has specifically crafted an engine and a game meant to exploit the abilities of the API – large numbers of draw calls, asynchronous compute/shading, and explicit multi-GPU – putting it a step beyond adding DX12 rendering paths to games that were originally designed for DX11. As a result, both the GPU vendors and Microsoft itself have used Ashes and earlier builds of its Nitrous engine to demonstrate the capabilities of the API, and this is something we’ve looked at with both Ashes and the Star Swarm technical demo.
Much like a number of other games these days, Ashes of the Singularity has been in a public beta via Steam early access, while its full, gold release on March 22nd is fast approaching. To that end Oxide and publisher Stardock are gearing up to release the second major beta of the game, and the last beta before the game goes gold. At the same time they’ve invited the press to take a look at the beta and its updated benchmark ahead of tomorrow’s early access release, so today we’ll be taking a second and more comprehensive look at the game.
The first time we poked at Ashes was to investigate an early alpha of the game’s explicit multi-GPU functionality. Though only in a limited form at the time, Oxide demonstrated that they had a basic implementation of DX12 multi-GPU up and running, allowing us to not only pair up similar video cards, but dissimilar cards from opposing vendors, making a combined GeForce + Radeon setup a reality. This early version of Ashes showed a lot of promise for DX12 multi-GPU, and after some additional development it is now finally being released to the public as part of this week’s beta.
Since that release Oxide has also been at work both cleaning up the code to prepare it for release, and implementing even more DX12 functionality. The latest beta adds greatly improved support for another of DX12’s powerhouse features: asynchronous shading/compute. By taking advantage of DX12’s lower-level access, games and applications can directly interface with the various execution queues on a GPU, scheduling work on each queue and having it executed independently. Async shading is another of DX12’s optimization features, allowing certain tasks to be completed in less time (lower throughput latency) and/or to better utilize a GPU’s massive arrays of shader ALUs.
Between its new functionality, updated graphical effects, and a significant amount of optimization work since the last beta, the latest beta for Ashes gives us quite a bit to take a look at today, so let’s get started.
More on Async Shading, the New Benchmark, & the Test
As we’ve previously covered the principles of asynchronous shading in depth, we’re not going to completely reiterate what it does and what it’s for in this preview. However for our non-regular readers, here is a quick high-level overview of async shading.
GPUs are, at their most fundamental levels, a large collection of arithmetic logic units (i.e. CUDA Cores/Stream Processors) combined with various other scheduling and fixed function graphics hardware. Because graphics rendering is an embarrassingly parallel problem, GPUs are able to easily subdivide the work in processing a scene into multiple parts, meaning it is relatively easy to scale up the performance of a GPU by adding more ALUs. At the same time because any given graphics operation is likely being applied to a large number of pixels at once, ALUs are grouped together to execute a single instruction over multiple pieces of data (SIMD), which greatly limits the independence of the ALUs, but in turn also allows them to be packed far more densely.
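To illustrate the SIMD point, consider the sketch below – our own conceptual example, not game code. On a CPU this loop runs serially, but on a GPU each iteration would map to one lane of a wavefront/warp, with a whole group of ALUs executing the same instruction in lockstep over different data.

```cpp
#include <cstdint>
#include <cstddef>

// Hypothetical example: darken every pixel of an 8-bit-per-channel image
// by 50%. One instruction stream applied to many data elements -- exactly
// the access pattern that lets GPUs pack ALUs densely into SIMD groups.
void DarkenImage(uint8_t* pixels, size_t byteCount)
{
    for (size_t i = 0; i < byteCount; ++i)
        pixels[i] /= 2; // same operation, different data, no inter-pixel dependency
}
```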
In a traditional (DX11 and earlier) graphics rendering scenario, a GPU will be occupied with one job/task at any given time, time-sharing the GPU if necessary in order to let multiple applications use it. By and large this is fine, especially as games are run full-screen and in a near-exclusive manner. However even within a single application the same rules apply: with certain exceptions, the GPU can only handle one task at a time. So if a game wishes to execute multiple tasks, it must execute them serially, one after another.
Again, in a traditional environment all of this is fine. However as GPUs have advanced they have begun to test the limits of a single execution queue. As GPUs add ever more ALUs, even an embarrassingly parallel workload begins to break down, and the more ALUs there are to fill, the harder it is to keep a GPU filled. Meanwhile new paradigms such as virtual reality have come along, where certain operations such as time warping require far less latency than the traditional high-throughput/high-latency execution model of a GPU allows. Thus GPU developers and software developers alike have needed the means to concurrently execute multiple jobs on a GPU’s ALUs, and this is where asynchronous shading comes in.
Whereas the traditional model is serial execution, asynchronous shading means executing multiple jobs over the ALUs at the same time. By implementing multiple queues within a GPU’s thread scheduler, a GPU executing jobs in an asynchronous manner can potentially run several jobs at once; more queues present more options for scheduling work. Doing so can allow a GPU to be better utilized – by filling underutilized ALUs with additional, related work – and at the same time work queues can be prioritized so that more important queues are finished sooner, if not outright as soon as possible.
Meanwhile on the API side of matters, while this functionality has been implemented in GPUs for a few years now, DirectX 11 and earlier APIs aren’t built for this paradigm and are unable to submit work to multiple queues. As a result this functionality has gone largely unused. But along with modernizing multi-core rendering, DirectX 12 also modernizes work queuing, and for the first time in DirectX gives developers the ability to issue work to multiple queues. There are a number of limitations here – in particular, only one queue can access the non-ALU graphics hardware – but overall it gives both GPU developers and game developers the tools to further improve performance and better implement certain rendering algorithms and technologies.
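To make the multi-queue model concrete, here is a minimal sketch – our own illustration against the Direct3D 12 API, not Oxide’s code – of creating a compute queue alongside the usual direct (graphics) queue. Device creation and error handling are omitted for brevity.

```cpp
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Create a graphics (direct) queue plus a separate compute queue on the
// same device. Work submitted to the compute queue may run concurrently
// with the direct queue on hardware that supports asynchronous
// execution; on other hardware the driver may serialize the two.
void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& directQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    // The direct queue accepts graphics, compute, and copy work, and is
    // the only queue type with access to the fixed-function graphics
    // hardware -- the "only one queue can access non-ALU hardware" limit.
    D3D12_COMMAND_QUEUE_DESC directDesc = {};
    directDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    device->CreateCommandQueue(&directDesc, IID_PPV_ARGS(&directQueue));

    // The compute queue accepts compute and copy work only.
    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));
}
```

Command lists recorded against each queue type can then be submitted via ExecuteCommandLists on the respective queue, with fences used wherever one queue’s work depends on the other’s.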
Like DirectX 12’s other headlining features, async shading is a powerful tool, but it’s one whose potency will depend on the hardware it’s being executed on and what a game is attempting. Async shading itself is a bit of a catch-all term – not unlike calling a CPU/SoC a multi-core CPU – and can mean any number of things depending on the context. Hardware can have a different number of queues, different rules on how resources are shared, different rules on how queues are scheduled, etc. So not all async shading capable hardware is the same, and there will be varying levels of how much work can actually be done concurrently. At the same time from a throughput perspective async shading can only fill ALUs that aren’t already being fully utilized, so the upper limit to its benefits is whatever resources aren't already being used.
All of this is in turn closely tied to the actual application being run, and how much of its shading/compute tasks can actually be executed concurrently. An application that can issue work to multiple queues but doesn’t actually have much work to issue to multiple queues will not benefit as much, whereas an application that can fill up multiple queues may benefit more. Ultimately it’s a technology whose benefit will vary on a case-by-case basis, with Ashes of the Singularity being one possible way to use the technology.
FAQs & More
Jumping back into the real world and the business that surrounds it: even though Ashes is still in beta, as one of the first games to use DX12 async shading it’s going to be a big deal.
AMD's Radeon Technologies Group for their part has been heavily promoting Ashes, and for them the release of this latest beta is definitely a major event. From a marketing standpoint RTG has long been touting the benefits of low-level APIs and async shading, and this latest beta of Ashes brings the first potential killer app one step closer. Meanwhile from a technical perspective it’s fair to say that Ashes via DX12 addresses many of RTG’s perceived weak spots over the past few years: driver CPU utilization, multi-GPU performance, and GPU shader/ALU utilization. A successful game with both DX12 and DX11 rendering paths gives RTG a prime opportunity to show that these problems are resolved under DX12, so RTG is keen to show that off. To that end, we do want to quickly note that while this beta is being handled through Oxide/Stardock, RTG has also sent the press their own thoughts in a new Ashes benchmark guide.
Meanwhile Oxide is distributing their own benchmark guide to the press for this latest beta. At the very end of the guide is a FAQ, which gives a good overview of what new functionality has been implemented in the latest beta, and what the developer’s policies are on IHV relations. Also included is a brief summary of Oxide’s plans to support Vulkan in the future.
I’ve heard you allow source access to vendors? Is this true?
Yes. Oxide and Stardock want our game to run as fast as possible and with as few issues as possible on everyone’s hardware. Thus, we have an open door policy. For security reasons, we can’t dive into details, but we should be clear that this level of source access is almost unprecedented in the game industry. It is not common industry practice to share source code with IHVs.
The basic way it works is that we have branches in our code tree. Unfortunately, we can’t give complete unrestricted access to our entire source tree to everyone for legal reasons (our lawyers would rather us not share source at all, but we overrode them ;)), but we have created a special branch where not only can vendors see our source code, but they can even submit proposed changes. That is, if they want to suggest a change, our branch gives them permission to do so. Naturally, any changes will be carefully reviewed by us, and we don’t think the process has ever been simpler. However, we stress that such changes are relatively rare and typically consist of bug fixes.
This branch is synchronized directly from our main branch, so it’s usually less than a week behind our very latest internal main software development branch. IHVs are free to make their own builds, or test the intermediate drops that we give our QA. Typically, IHVs receive builds about the same time as our own QA department. However, because they can make their own builds, IHVs can end up with builds that are more current than our own QA department’s.
Obviously, Oxide and Stardock are taking a huge risk in giving this level of source access to everyone. We have significant IP in our code base which we must actively protect. However, we’re strong believers in being transparent about our development process. Our hope is that sharing this information will make everyone’s products better.
Does Oxide optimize specifically for any hardware?
Oxide primarily optimizes at an algorithmic level, not for any specific hardware. We also take care to avoid the proverbial “glass jaws” that all hardware has. However, we do not write or tune our code with any specific GPU in mind. We find this is simply too time consuming, and we must run on a wide variety of GPUs. We believe our code is very typical of a reasonably optimized PC game.
How much performance should I gain from a second graphics card in my computer?
This depends on your video cards. We expect around 70% scaling if you use two of the same card. However, mixing cards can vary the results: you will never get more than twice the speed of the slowest video card, so if one card is much faster than the other, you may be better off just using the faster card alone. If you are mixing and matching cards, we recommend running the benchmark in single GPU mode first, then matching cards which have similar single GPU scores.
Why do multiple GPUs matter?
Multiple GPU configurations are increasingly common amongst gamers. Moreover, explicit multi-GPU allows users with a reasonably new video card to greatly improve their performance by buying a second card, even if it is a different brand or model. This will begin to matter more as gamers migrate to 4K and higher resolution displays.
Where can I get detailed benchmark results?
In documents\my games\ashes of the singularity you will find a Benchmarks directory. Within it, the detailed log is kept. This log stores not only the aggregate timings, but also the timings of every individual frame of the run.
Where can I change settings?
In documents\my games\ashes of the singularity you will see a settings.ini file. Within that, you can see many different settings to try. This should only be done by very technical, advanced users. Should you put the game into a state where the settings prevent it from loading, you can always delete the settings.ini file and the game will regenerate it with default values.
Is Oxide still supporting Mantle?
Oxide is migrating the effort spent on Mantle to supporting the upcoming Vulkan API. We have no solid timetable for Vulkan support at this time, however.
How close to final is the code?
In the era of digital updates, nothing is ever really final. However, the code is nearing release form ahead of our launch on March 22nd. We expect few changes related to graphics rendering to occur before release.
Does Oxide/Stardock have some sort of business deal with any IHV with regards to Async Compute? Is Oxide promoting this feature because of some kind of marketing deal?
No. We have no marketing or business agreement to pursue or implement this feature. We pursued multiple command queues (also known as async compute) because it is a new capability in D3D12 and Windows 10. That is, we implemented it entirely of our own accord and curiosity. Oxide is committed to exploiting as many capabilities of DX12 as possible.
In the previous benchmark, were you using async compute?
We had very basic support for this feature. During the process of development for multi-GPU, we realized that some of the lessons learned and code written could be applied to async compute. Thus, Benchmark 2 has a much more advanced implementation of this feature.
Do you have any recommended settings?
All of the presets are appropriate for a certain class of hardware. Internally, our expectation is that a user with a high end video card would run at Extreme at 1600p. Though our game will attempt to auto-detect settings appropriate for the video card, we tend to be a bit conservative with this detection and let users turn settings up.
Ashes of the Singularity Benchmark 2.0
Along with the new functionality introduced in this week’s beta, this release also contains a modified version of the benchmark distributed with the previous version of Ashes. The new benchmark is still 3 minutes long and many of the camera tracks/unit placements are identical, but this latest version utilizes the models of another of the game’s factions, implements newer graphics effects, and is overall intended to be a more strenuous benchmark (performance optimizations notwithstanding).
The Test
And with that out of the way, let’s dive into benchmarking. As we don’t typically benchmark beta games, we want to reiterate that this is a true beta. We’ve already seen the performance of Ashes shift significantly since our last look at the game, and while the game is much closer to completion now, it is not yet final. Further optimizations or driver releases will likely alter the performance of the game further, so nothing here should be considered definitive about how the final game will perform.
CPU: Intel Core i7-4960X @ 4.2GHz
Motherboard: ASRock Fatal1ty X79 Professional
Power Supply: Corsair AX1200i
Hard Disk: Samsung SSD 840 EVO (750GB)
Memory: G.Skill RipjawZ DDR3-1866 4 x 8GB (9-10-9-26)
Case: NZXT Phantom 630 Windowed Edition
Monitor: Asus PQ321
Video Cards: AMD Radeon R9 Fury X, ASUS STRIX R9 Fury, AMD Radeon R9 285, AMD Radeon HD 7970, NVIDIA GeForce GTX Titan X, NVIDIA GeForce GTX 980 Ti, EVGA GeForce GTX 960, NVIDIA GeForce GTX 780 Ti, NVIDIA GeForce GTX 680
Video Drivers: NVIDIA Release 361.91, AMD Radeon Software 16.1.1 Hotfix
OS: Windows 10 Pro
DirectX 12 Single-GPU Performance
We’ll start things off with a look at single-GPU performance. For this, we’ve grabbed a collection of RTG and NVIDIA GPUs covering the entire DX12 generation, from GCN 1.0 and Kepler to GCN 1.2 and Maxwell. This will give us a good idea of how the game performs both across a wide span of GPU performance levels, and how (if at all) the various GPU generational changes play a role.
Meanwhile unless otherwise noted, we’re using Ashes’ High quality setting, which turns up a number of graphical features and also utilizes 2x MSAA. It’s also worth mentioning that while Ashes does allow async shading to be turned off and on, this option is on by default unless turned off in the game’s INI file.
Starting at 4K, we have the GeForce GTX 980 Ti and Radeon R9 Fury X. On the latest beta the Fury X has a strong lead over the normally faster GTX 980 Ti, beating it by 20% and coming close to hitting 60fps.
When we drop down to 1440p and introduce the last generation’s flagship video cards, the GeForce GTX 780 Ti and Radeon R9 290X, the story is much the same. The Fury X continues to hold a 10fps lead over the GTX 980 Ti, giving it an 18% lead. Similarly, the R9 290X has an 8fps lead over the 780 Ti, translating into a 19% performance lead. This is a significant turnabout from where we normally see these cards, as the 780 Ti traditionally holds a lead over the 290X.
Meanwhile looking at the average framerates with different batch count intensities, there admittedly isn’t much remarkable here. All cards take roughly the same performance hit with increasingly larger batch counts.
Finally at 1080p, with our full lineup of cards we can see that RTG’s lead in this latest beta is nearly absolute. The 2012 flagship battle between the 7970 and the GTX 680 puts the 7970 in the lead by 12%, or just shy of 4fps. Elsewhere the GTX 980 Ti does close on the Fury X, but RTG’s current-gen flagship remains in the lead.
The one outlier here is the Radeon R9 285, which is the only 2GB RTG card in our collection. At this point we suspect it’s VRAM limited, but it would require further investigation.
DirectX 12 Multi-GPU Performance
Shifting gears, let’s take a look at multi-GPU performance on the latest Ashes beta. The focus of our previous article, Ashes’ support for DX12 explicit multi-GPU makes it the first game to support pairing up RTG and NVIDIA GPUs in an AFR setup. Like traditional same-vendor AFR configurations, Ashes’ AFR setup works best when both GPUs are similar in performance, so although this technology does allow for some unusual cross-vendor comparisons, it does not (yet) benefit from pairing up GPUs that widely differ in performance, such as a last-generation video card with a current-generation video card. Nonetheless, running a Radeon and a GeForce card together is an interesting sight, if only for the sheer audacity of it.
Meanwhile the significant performance optimizations between the last beta build and this latest build have had an equally significant knock-on effect on multi-GPU performance as compared to the last time we looked at the game.
Even at 4K a pair of GPUs ends up being almost too much at Ashes’ High quality setting. All four multi-GPU configurations are over 60fps, with the fastest Fury X + 980 Ti configuration nudging past 70fps. Meanwhile the lead over our two fastest single-GPU configurations is not especially great, particularly compared to the Fury X, with the Fury X + 980 Ti configuration only coming in 15fps (27%) faster than a single GPU. The all-NVIDIA comparison does fare better in this regard, but only because of GTX 980 Ti’s lower initial performance.
Digging deeper, what we find is that even at 4K we’re actually CPU limited according to the benchmark data. Across all four multi-GPU configurations, our overclocked hex-core Core i7-4960X can only set up frames at roughly 70fps, versus 100fps+ for a single-GPU configuration.
Top: Fury X. Bottom: Fury X + 980 Ti
The increased CPU load from utilizing multi-GPU is to be expected, as the CPU now needs to spend time synchronizing the GPUs and waiting on them to transfer data between each other. However dropping to 70fps means that Ashes has become a surprisingly heavy CPU test as well, and that 4K at high quality alone isn’t enough to max out our dual GPU configurations.
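For a sense of where that synchronization overhead comes from, below is a minimal sketch – our own illustration, not Oxide’s implementation – of one way two devices can coordinate under DX12 explicit multi-adapter, using a fence shared across adapters. In real code the fence would be created once rather than per frame, and the actual cross-GPU resource copy is omitted.

```cpp
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Sketch: make GPU B's queue wait until GPU A's queue has finished a
// frame's work, via a fence that is shared across both devices.
void SyncAcrossAdapters(ID3D12Device* deviceA, ID3D12CommandQueue* queueA,
                        ID3D12Device* deviceB, ID3D12CommandQueue* queueB,
                        UINT64 frameFenceValue)
{
    // Create a fence on device A that other adapters are allowed to open.
    ComPtr<ID3D12Fence> fenceA;
    deviceA->CreateFence(0,
        D3D12_FENCE_FLAG_SHARED | D3D12_FENCE_FLAG_SHARED_CROSS_ADAPTER,
        IID_PPV_ARGS(&fenceA));

    // Export the fence as an NT handle and open it on device B.
    HANDLE fenceHandle = nullptr;
    deviceA->CreateSharedHandle(fenceA.Get(), nullptr, GENERIC_ALL,
                                nullptr, &fenceHandle);
    ComPtr<ID3D12Fence> fenceB;
    deviceB->OpenSharedHandle(fenceHandle, IID_PPV_ARGS(&fenceB));
    CloseHandle(fenceHandle);

    // GPU A signals the fence once its previously submitted work is done...
    queueA->Signal(fenceA.Get(), frameFenceValue);

    // ...and GPU B's queue stalls until that signal arrives before running
    // anything submitted after this point. This handshake, plus the data
    // transfer itself, is the overhead that single-GPU rendering avoids.
    queueB->Wait(fenceB.Get(), frameFenceValue);
}
```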
Cranking up the quality setting to Extreme finally gives our dual-GPU configurations enough of a workload to back off from the CPU performance cap. Once again the fastest configuration is the Fury X + 980 Ti, which lands just short of 60fps, followed by the Fury X + Fury configuration at 55.1fps. In our first look at Ashes’ multi-GPU scaling we found that having a Fury X card as the lead card resulted in better performance, and this has not changed for the newest beta; the Fury continues to be faster at reading data off of other cards. Still, the gap between the Fury X + 980 Ti configuration and the 980 Ti + Fury X configuration has closed somewhat as compared to last time, and now stands at 11%.
Backing off from the CPU limit has also put the multi-GPU configurations well ahead of the single-GPU configurations. We’re now looking at upwards of a 65% performance boost versus a single GTX 980 Ti, and a smaller 31% performance boost versus a single Fury X. These are smaller gains for multi-GPU configurations than we first saw last year, but that’s also very much a consequence of Ashes’ improved performance across the board. Though we didn’t have time to test it, Ashes does have one higher quality setting – Crazy – which may drive a larger wedge between the multi-GPU configurations and the Fury X, though the overhead of synchronization will always present a roadblock.
DirectX 12 vs. DirectX 11
Now that we’ve had the chance to look at DirecX 12 performance, let’s take a look at things with DirectX 11 thrown into the mix. As a reminder, while the two rendering paths are graphically identical, the DirectX 12 path introduces the latter’s multi-core scalability along with asynchronous shading functionality. The game and the underlying Nitrous engine is designed to take advantage of both, but particularly the multi-core functionality as the game pushes some very high batch counts.
Given that we had never benchmarked Ashes under DirectX 11 before, what we had been expecting was a significant performance regression when switching to it. Instead what we found was far more surprising.
On the RTG side of matters, there is a large performance gap between DX11 and DX12 at all resolutions, increasing with the overall performance of the video card being tested. Even on the R9 290X and the 7970, using DX12 is a no-brainer, as it improves performance by 20% or more.
The big surprise however is with the NVIDIA cards. For the more powerful GTX 980 Ti and GTX 780 Ti, NVIDIA doesn’t gain anything from the DX12 rendering path; in fact they lose a percent or two in performance. This means that they have very good performance under DX11 (particularly the GTX 980 Ti), but it’s not doing them any favors under DX12, where, as we’ve seen, RTG has a rather consistent performance lead. In the past NVIDIA has gone to some pretty extreme lengths to optimize the CPU usage of their DX11 driver, so this may be the payoff from general optimizations, or even a round of Ashes-specific optimizations.
Breaking down the gains on a percentage basis at 1080p, the most CPU-demanding resolution, we find that the Fury X picks up a full 50% from DX12, followed by 29% and 23% for the R9 290X and 7970 respectively. Meanwhile at the opposite end of the spectrum are the GTX 980 Ti and GTX 780 Ti, who lose 1% and 3% respectively.
Finally, right in the middle of all of this is the GTX 680. Given what happens to the architecturally similar GTX 780 Ti, this may be a case of GPU memory limitations (this is the only 2GB NVIDIA card in this set), as there’s otherwise no reason to expect the weakest NVIDIA GPU to benefit the most from DX12.
Overall then this neatly illustrates why RTG in particular has been so gung-ho about DX12, as Ashes’ DX12 path has netted them a very significant increase in performance. To some degree, however, this is a glass half-full/half-empty situation; RTG gains so much from DX12 in large part because of their poorer DX11 performance (especially on the faster cards), but on the other hand a “simple” API change has unlocked a great deal of GPU power that wasn’t otherwise being used and vaulted them well into the lead. As for NVIDIA, is it that their cards don’t benefit from DX12, or is it that their DX11 driver stack is that good to begin with? At the end of the day Ashes is just a single game – and a beta game at that – but it will be interesting to see whether this is a one-off situation or if it becomes recurring.
The Performance Impact of Asynchronous Shading
Finally, let’s take a look at Ashes’ latest addition to its stable of DX12 headlining features; asynchronous shading/compute. While earlier betas of the game implemented a very limited form of async shading, this latest beta contains a newer, more complex implementation of the technology, inspired in part by Oxide’s experiences with multi-GPU. As a result, async shading will potentially have a greater impact on performance than in earlier betas.
Update 02/24: NVIDIA sent a note over this afternoon letting us know that asynchronous shading is not enabled in their current drivers, hence the performance we are seeing here. Unfortunately they are not providing an ETA for when this feature will be enabled.
Since async shading is turned on by default in Ashes, what we’re essentially doing here is measuring the penalty for turning it off. Not unlike the DirectX 12 vs. DirectX 11 situation – and possibly even contributing to it – what we find depends heavily on the GPU vendor.
All NVIDIA cards suffer a minor regression in performance with async shading turned on. At a maximum of -4% it’s really not enough to justify disabling async shading, but at the same time it means that async shading is not providing NVIDIA with any benefit. With RTG cards on the other hand it’s almost always beneficial, with the benefit increasing with the overall performance of the card. In the case of the Fury X this means a 10% gain at 1440p, and though not plotted here, a similar gain at 4K.
These findings go hand-in-hand with one of the basic performance goals of async shading: improving GPU utilization. At 4096 stream processors the Fury X has the most ALUs of any card on these charts, and given its performance in other games, the numbers we see here lend credence to the theory that RTG isn’t always able to reach full utilization of those ALUs, particularly in Ashes. In that case async shading could be a big benefit going forward.
As for the NVIDIA cards, that’s a harder read. Is it that NVIDIA already has good ALU utilization? Or is it that their architectures can’t do enough with asynchronous execution to offset the scheduling penalty for using it? Either way, when it comes to Ashes NVIDIA isn’t gaining anything from async shading at this time.
Meanwhile pushing our fastest GPUs to their limit at Extreme quality only widens the gap. At 4K the Fury X picks up nearly 20% from async shading – though a much smaller 6% at 1440p – while the GTX 980 Ti continues to lose a couple of percent from enabling it. This outcome is somewhat surprising since at 4K we’d already expect the Fury X to be rather taxed, but clearly there’s quite a bit of shader headroom left unused.
Closing Thoughts
Wrapping up our second look at Ashes of the Singularity and our third overall look at Oxide’s Nitrous engine, it’s interesting to see where things have changed and where they have stayed the same.
Thanks to the general performance optimizations made since our initial look at Ashes, the situation for multi-GPU via DirectX 12 explicit multi-adapter is both very different and very similar. On an absolute basis it’s now a lot harder to max out a multi-GPU configuration; with reasonable quality settings we’re CPU limited even at 4K, requiring that we further increase the rendering quality. This more than anything else handily illustrates just how much performance has improved since the last beta. On the other hand it’s still the most unusual pairing – a Radeon R9 Fury X with a GeForce GTX 980 Ti – that delivers the best multi-GPU performance, which just goes to show what RTG and NVIDIA can accomplish when working together.
As for the single GPU configurations, I’m not sure things as they currently stand could be any more different. NVIDIA cards have very good baseline DX11 performance in Ashes of the Singularity, but they mostly gain nothing from Ashes’ DX12 rendering path. RTG cards on the other hand have poorer DX11 performance, but they gain a significant amount of performance from the DX12 rendering path. In fact they gain so much performance that against traditional competitive lineups (e.g. Fury X vs. 980 Ti), the RTG cards are well in the lead, which isn’t usually the case elsewhere.
Going hand-in-hand with DX12, RTG’s cards are the only products to consistently benefit from Ashes’ improved asynchronous shading implementation. Whereas our NVIDIA cards see a very slight regression (with NVIDIA telling us that async shading is not currently enabled in their drivers), the Radeons improve in performance, especially the top-tier Fury X. This by itself isn’t wholly surprising given some of our theories about Fury X’s strengths and weaknesses, but for Ashes of the Singularity performance it further compounds on the other DX12 performance gains for RTG.
Ultimately Ashes gives us a very interesting look at the state of DirectX 12 performance for both RTG and NVIDIA cards, though no more and no less. As we stated at the start of this article this is beta software and performance is subject to change – not to mention the overall sample size of one game – but it is a start. For RTG this certainly lends support to their promotion of and expectations for DirectX 12, and it should be interesting to see how things shape up in March and beyond once the gold version of Ashes is released, and past that even more DirectX 12 games.