35 Comments
JoJoman88 - Thursday, April 9, 2015 - link
But can it play GTA V at 4k 60Hz?
testbug00 - Friday, April 10, 2015 - link
Unless GTA V has been made to run on the Phi architecture or on a Core CPU without a GPU, no.
Whether it can run at 60Hz depends on what screen(s) the supercomputer is plugged into. If that screen is 60Hz, and GTA 5 can run at all, it will be running at 60Hz.
tuxRoller - Thursday, April 9, 2015 - link
Well, I guess this means we won't be seeing 1 exaflop by 2020.
Still, pretty impressive efficiency from IBM/Nvidia.
Morawka - Thursday, April 9, 2015 - link
Mostly Nvidia there, but something's got to push those cores, I suppose.
Morawka - Thursday, April 9, 2015 - link
Maybe Intel will take the hint and support NVLink, or come up with something comparable with a lot of bandwidth as a CPU-to-GPU interconnect.
testbug00 - Friday, April 10, 2015 - link
You make it sound like they're not developing something like that. PCIe4 will have just under 16GBps transfer per lane. NVLink is about 20GBps per lane.
Now, NVLink shipping products should launch at least 3-4 months before PCIe4. However, at worst, Intel should end up with 75-80% of the performance. I would guess Intel probably has something at least on par with NVLink in production, however.
Knights Landing can be socketed and used as the main system CPU, btw.
Kevin G - Friday, April 10, 2015 - link
Intel seems to be content with using QPI for internal chip communication and Omnipath for external node-to-node communication. Considering that these supercomputers will be using silicon photonics at some point, it wouldn't surprise me long term if Intel merges QPI and Omnipath into a physical optical link with support for the higher layers of both.
At the moment, IBM seems content with electrical signaling for internal usage, but they have already used massive optical switches in POWER7 to provide a single system image for the cluster (though the entire system was not coherent, it did have a flat memory address space throughout). If IBM wants to go optical for inter-chip communication, nVidia will also have to step up.
testbug00 - Friday, April 10, 2015 - link
Dunno what Knights Landing uses, but I'm pretty sure they don't use QPI. Happy to be wrong.
IntelUser2000 - Saturday, April 11, 2015 - link
One article says they decided not to make a dual-socket version, which means only single socket is available.
That means either a single-socket version or a card version. Neither needs QPI. Of course, there's a third one which adds the external OmniPath interconnect.
extide - Monday, April 13, 2015 - link
Knights Landing can use either PCIe or QPI, depending on the form factor. (Card vs chip only)
SarahKerrigan - Friday, April 10, 2015 - link
NVLink is also memory-coherent with the host - no translation required, unlike PCIe. This results in *very* large latency reductions when talking to the attached accelerators.
trsohmers - Friday, April 10, 2015 - link
You seem to be confusing GB (gigabyte) and gigabit. PCIe Gen 4 is 1969MB/s (1.9GB/s) per lane, with a max of 16 lanes per interface... giving you almost 32GB/s. NVLINK is also 16 lanes, but gives a full interface speed of 80GB/s, or 5GB/s per lane.
Intel's "competitor" is their socket-to-socket Quick Path Interconnect interface, which is a parallel bus (unlike PCIe and NVLINK, which are serial) and has speeds ranging from 32GB/s to ~80GB/s in chips today. Unlike what Kevin G says in the comments section here, Intel QPI is *not* used for internal chip communication; it is used for uncore, motherboard, and socket-to-socket communication.
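[Ed: a quick back-of-the-envelope sketch of the per-lane arithmetic in this comment. The PCIe 4.0 figure falls out of the 16 GT/s signaling rate and 128b/130b line encoding; the NVLink per-lane figure is simply the 80GB/s-over-16-lanes number quoted above.]

```python
def pcie_lane_gbps(gt_per_s: float, enc_num: int, enc_den: int) -> float:
    """Usable GB/s per lane, per direction: transfer rate scaled by
    line-encoding efficiency, then bits -> bytes."""
    return gt_per_s * enc_num / enc_den / 8

# PCIe 4.0: 16 GT/s per lane with 128b/130b encoding
pcie4_lane = pcie_lane_gbps(16, 128, 130)   # ~1.97 GB/s
pcie4_x16 = pcie4_lane * 16                 # ~31.5 GB/s

# NVLink, as quoted in the comment: 80 GB/s across a 16-lane interface
nvlink_lane = 80 / 16                       # 5 GB/s per lane

print(f"PCIe 4.0: {pcie4_lane:.2f} GB/s/lane, {pcie4_x16:.1f} GB/s x16")
print(f"NVLink:   {nvlink_lane:.1f} GB/s/lane, 80 GB/s total")
```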
iAPX - Thursday, April 16, 2015 - link
This is *NOT* about bandwidth; it's all about latency. nVidia plans to launch an incredible offer with really low latency to enable efficient communication between GPUs in a server. Stay tuned ;)
tuxRoller - Friday, April 10, 2015 - link
Wow, are you that much of a fanboy? I gave credit to both.
Power9 should be pretty awesome, despite being on a larger node than Intel's.
A single chip supports almost 100 threads, and these threads are a good deal heavier than those that make up a wave (also much finer-grained control).
Lastly, capi and nvlink will allow for an HSA-type workload.
For embarrassingly parallel tasks Pascal will be awesome, and for more dynamic workloads it'll be far better than Maxwell.
extide - Monday, April 13, 2015 - link
Where are you reading about thread counts on POWER9? -- POWER8 supports 96 threads in a single socket (for the big version) -- so POWER9 may end up being even more.
It's difficult to compare those big nasty POWER chips to Xeons, though, as they typically run in much higher TDP ranges.
Jalek99 - Thursday, April 9, 2015 - link
There'll be many in the government wanting to use it for decryption of the 14 years of data they've gathered on everyone, foreign and domestic. Once the communications are intercepted and stored, the data-mining begins. Everything a person does becomes charted on a graph, financial transactions or travel or anything. Thus, as data like bookstore receipts, bank statements, and commuter toll records flow in, the NSA is able to paint a more and more detailed picture of someone’s life.
Have fun convincing people like Hatch, McCain, and Graham that what Nixon did, using such information for political purposes, shouldn't be accommodated.
redviper - Thursday, April 9, 2015 - link
No one reasonable would use Aurora or Summit for decryption tasks. For one thing, these machines haven't been co-designed with that in mind. These machines aren't on classified networks, and it is pretty likely that the NSA has something better suited than either of these giant machines. Plus, anyone logged into either machine can see what else is in the queue. This is just stupid paranoia, and this kind of crap undermines very real issues with the surveillance state.
Jalek99 - Thursday, April 9, 2015 - link
It's largely quoted text from the Wired article that came a year before Snowden, which many similarly dismissed as baseless paranoia.
redviper - Friday, April 10, 2015 - link
You sound like an anti-vaxxer, unable to digest information that you don't agree with.tuxRoller - Friday, April 10, 2015 - link
Logic fail. Just because one very specific conspiracy theory is true doesn't make the rest any more or less likely.
MrSpadge - Friday, April 10, 2015 - link
What redviper is saying is not that the NSA & government won't do what you said - in fact, since Snowden we know full well that they will. But the point is that this won't happen on a machine as public as Aurora. Something secret and architected for the task is what they'll use.
mdcsd - Friday, April 10, 2015 - link
If you're talking about brute-force decryption, there is no way these machines would even scratch the surface.
Tewt - Friday, April 10, 2015 - link
Cray Land, never heard of them before. What do they make?xthetenth - Friday, April 10, 2015 - link
Intel and Cray land contract for 2 Dept. of Energy Supercomputers.
Never heard of landing a contract?
Krysto - Friday, April 10, 2015 - link
Obviously Intel's payoff for the US government restricting them from selling supercomputer chips to China - and Nvidia's loss. Also still a loss to the US government and Intel in the end, if China starts building its own competing chips and actually bans Intel and AMD from the market, just like Russia did.
Ktracho - Friday, April 10, 2015 - link
My concern over this timeframe is what the implications are for the consumer space. Will Intel essentially stop supporting discrete graphics 3 years from now? PCIe Gen 3 may no longer have enough bandwidth for high-end cards by then, so will Intel support an interface with enough bandwidth that third parties can use effectively, or are they banking on their own integrated graphics being enough for everyone? If so, will gamers have to choose something like ARM or POWER9 (which would be super expensive) if they want discrete graphics?
testbug00 - Friday, April 10, 2015 - link
Given that PCIe gen 1.1 x16 can handle a 980 with only an average 4-5% performance loss (900p to 2160p), and that PCIe 2/3 x16/x8 have over 99% average scaling... and PCIe 4 x16 gives ~4 times the bandwidth of those, it shouldn't be an issue for quite a while.
And if Intel does that, everyone is going to recommend going AMD (given they're still around) or perhaps even ARM chips, depending on how that develops.
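[Ed: a rough sketch of the scaling argument in this comment - per-direction usable bandwidth after line encoding. The figures are approximate, and the real-world impact also depends on how much traffic actually crosses the bus.]

```python
# Usable bandwidth in GB/s per direction:
# rate (GT/s) * encoding efficiency / 8 bits-per-byte * lane count
GBps = {
    "PCIe 1.1 x16": 2.5 * 8 / 10 / 8 * 16,      # 4.0 GB/s (8b/10b)
    "PCIe 2.0 x16": 5.0 * 8 / 10 / 8 * 16,      # 8.0 GB/s (8b/10b)
    "PCIe 3.0 x8":  8.0 * 128 / 130 / 8 * 8,    # ~7.9 GB/s (128b/130b)
    "PCIe 4.0 x16": 16.0 * 128 / 130 / 8 * 16,  # ~31.5 GB/s (128b/130b)
}

for link, bw in GBps.items():
    print(f"{link}: {bw:.1f} GB/s")

# PCIe 4.0 x16 is roughly 4x a PCIe 2.0 x16 or PCIe 3.0 x8 link,
# which is the "~4 times the bandwidth" claim above.
```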
Ktracho - Friday, April 10, 2015 - link
Part of my concern is that AMD may no longer be viable for the gaming market in that timeframe, and Intel may decide to reduce access to PCIe lanes, at least in their consumer-grade CPUs, as they do on some lower-performance CPUs. If this happens, will ARM be good enough or viable for gaming? Not every company is serious about gaming. Will Intel be the next company to take Apple's lead?
testbug00 - Friday, April 10, 2015 - link
ARM will be good enough for gaming. 3 of Apple's Cyclone+ cores manage to be about as fast as 2 of Intel's Core M cores. Not sure exactly what clocks they were each hitting... But A57 is approx 35-45% behind Cyclone+ clock for clock (based on single-threaded benchmarks; I don't know if either chip throttled (sorry!)).
A72 should bring that to around the IPC of Cyclone. If it hits under 2GHz clockspeed I would be shocked. In current games that would be a problem, but with APIs getting more and more multithreaded, you could see the performance loss not being very high.
Never mind, the A72 is for late 2015/2016. So by 2018 I imagine ARM can likely get a design at the performance of a consumer i5-i7 SKU. Hopefully.
MrSpadge - Friday, April 10, 2015 - link
No, PCIe is not going anywhere except to generation 4, with again double the bandwidth. Sure, at some point really massive integrated GPUs may be enough for most. But there's always going to be a market for "accelerators" in some form in the near- to mid-term future. This includes GPUs and other stuff.
testbug00 - Friday, April 10, 2015 - link
His question was more about Intel supporting it. How fast Gen 4 is doesn't matter if Intel doesn't use it at all. Broadwell offers a total of 12 PCIe 2.0 lanes: 2x4, 4x1. That drops the performance of a 980 by 9% at 1440p. Now, those are all laptop parts so far, so the PCIe implementation there is likely tuned for maximum power saving. Hopefully desktop chips bring more.
http://www.anandtech.com/show/8814/intel-releases-...
Zotamedu - Thursday, April 16, 2015 - link
That's Broadwell-U, the low-power SoC version with a maximum TDP of 28 W. They are not comparable with the larger Haswell or upcoming Skylake chips. Skylake will have even more total bandwidth than Haswell because they will increase the bandwidth to the PCH when they update to DMI 3.0. So that's PCIe 3.0 x16 from the CPU, and double the bandwidth to the PCH at 40 Gb/s. PCIe 4.0 is expected to show up in Skylake-E. Bandwidth to GPUs isn't that much of a bottleneck currently. PCIe 3.0 x16 is enough for most dual-GPU systems today, and for larger builds there's always the s2011 platforms with more lanes. A doubling of bandwidth with PCIe 4.0 will make sure bandwidth is not a problem. They are moving faster than the GPU manufacturers can.
Why would Intel invest in a product designed by Nvidia for HPC use? NVLink is not made to replace PCIe in gaming computers. It's designed for HPC, and even there it's a niche product.
mehrotrasc - Friday, April 10, 2015 - link
Congratulations to all members of the team for an excellent product, which will lead to many breakthrough real-time applications...
dahippo - Saturday, April 11, 2015 - link
Can I drop an A-bomb faster, or is that the wrong department :)
Sorry Testbug, it was a joke. I always get a kick out of people who think stuff like this would apply to their regular PC. Guess it was my inside joke then.