SC20: Intel Offers Aurora Update as Argonne Developers Use Intel Xe-HP GPUs in Lieu of ‘Ponte Vecchio’

Editor’s Note: The following story is an update of a previous version that was released yesterday.

Intel and Argonne National Laboratory have announced that they will use GPUs based on the Intel Xe-HP microarchitecture, along with the Intel oneAPI toolkits, to develop scientific applications for the Aurora exascale system while awaiting the later delivery of Intel’s 7nm “Ponte Vecchio” GPUs, which will power Aurora when the delayed system is deployed in 2022. The goal: use Intel’s heterogeneous programming environment so that scientific applications are ready for the scale and architecture of the Aurora supercomputer when it is deployed.

Jeff McVeigh, Intel VP/GM for data center XPU products and solutions, spoke with us in an interview today about developments at Intel that bear directly on Aurora, including the planned delivery of Ponte Vecchio (unchanged); the outlook for Aurora (forecast yesterday by industry analyst firm Hyperion Research); Intel’s cross-architecture “XPU” strategy and its impact on Argonne’s ongoing Aurora application development work; and the upcoming release (next month) of the first production version of oneAPI, Intel’s cross-architecture programming model for CPUs, GPUs, FPGAs and other accelerators.

The Aurora system (A21), for which Intel is the prime contractor, was originally due to be deployed in 2021 as the first exascale-class system in the United States. In July, however, Intel announced that Ponte Vecchio would be delayed by at least six months, which means that HPE-Cray’s Frontier, powered by AMD CPUs and GPUs and destined for Oak Ridge National Laboratory, will be the country’s first exascale system.

Yesterday, at its HPC Market Update, industry analyst firm Hyperion Research said it expects Aurora to ship about 12 months behind schedule. But McVeigh told us today that those expectations are too pessimistic.

“We didn’t agree with what they said. I think they said delivery in 2022 and then it will go online in 2023,” McVeigh said. “I think that was a bit aggressive, which means we assume it will be sooner.”

Regarding Ponte Vecchio, McVeigh said Intel’s shipping forecasts have remained unchanged since its statement in late July.

“We haven’t changed from what was sort of a window from late 2021 to the first half of 2022,” McVeigh said. “I think that is still the window we describe for this work. We still have a lot to do. Of course, that’s more than a year from now, but we’re pretty happy with some of the changes we’ve made based on the July changes. So I’m not going to provide any further public information on this at this point, but we’re glad where that is going.”

He also declined to add to Intel’s earlier statements about possibly outsourcing the manufacturing of Ponte Vecchio to a third party, widely believed to be either TSMC or Samsung.

“We’re not describing any specific outside partnerships,” said McVeigh, “but we’ve been public about using both in-house and outside foundries for the product. So that’s still the case, (but) we still don’t give … details of who it is.”

In its announcement yesterday, Intel said that researchers at the Argonne Leadership Computing Facility (ALCF) are using software development platforms based on Intel Xe-HP GPUs as part of the Aurora Early Science Program and the Exascale Computing Project to prepare applications, libraries and infrastructure for Aurora. Intel and Argonne teams are collaborating to co-design, test and validate several exascale applications.

“With access to Intel Xe-HP GPUs and development tools, Argonne developers can perform software optimizations for Intel CPUs and GPUs and examine scenarios that would be difficult to replicate in software-only environments,” Intel said in its announcement. “The Xe-HP GPUs serve as a development vehicle for the Intel Xe-HPC GPUs (i.e. Ponte Vecchio) used in the Aurora system.”

All of this is part of Intel’s heterogeneous “XPU” strategy of supporting multiple architectures, including GPUs from AMD and Nvidia, with oneAPI playing a central role, McVeigh said.

“Intel is changing,” he said. “We are moving from what is called a CPU-centric world to an XPU-centric world. We really realize that there are just different workloads, and that one architecture is not suitable for all. Often there are workloads with a high proportion of scalar code, matrix or vector code, or even spatial code, and that really drove a lot of our acquisition activity. And we’ve done a lot of what I would call organic development in the GPU space to really bring a full portfolio of capabilities to our customers in the ecosystem, so they can combine the ones best suited to their requirements.”

However, McVeigh also said that developing Aurora applications with the oneAPI programming model on Intel Xe-HP GPUs, rather than on Ponte Vecchio Xe-HPC GPUs, will still leave programming work to be done after Ponte Vecchio and Aurora ship.

“A rule of thumb we usually cite is that you start with about 80 percent of the code that was optimized for a prior architecture,” said McVeigh. “The fact that Xe-HP is based on the same architecture gives us confidence that this level of support will be there. Now, there are many elements of Ponte Vecchio – it has a very large cache, it has different interconnect technologies – and those things require a lot of work. But in terms of base code, it should just work.”

He said Argonne application developers will have about a quarter (i.e., three months) of optimization time to maximize performance on Ponte Vecchio. “But they’ll be able to get things working right away, based on the fact that there is the same type of software support (Xe-HP and Xe-HPC), though they’ll really want to optimize items around memory access and intercommunication and all of those things.”

Examples of the development work include:

  • The EXAALT project, which enables exascale molecular dynamics for fusion and fission energy problems.
  • The QMCPACK project, which develops exascale quantum Monte Carlo algorithms to improve predictions about complex materials.
  • The GAMESS project, which develops ab initio fragmentation methods to address computational chemistry challenges, such as heterogeneous catalysis problems, more efficiently.
  • The ExaSMR project, which develops high-fidelity exascale modeling capabilities for the complex physical phenomena that occur in operating nuclear reactors, in order to ultimately improve their design.
  • The HACC project, which develops extreme-scale cosmological simulations at exascale that allow scientists to simultaneously analyze observational data from state-of-the-art telescopes to test different theories.
