Intel’s 7nm slip raises questions about Ponte Vecchio GPU, Aurora supercomputer

During its second-quarter earnings call, Intel announced a one-year delay in its 7nm process technology, which is expected to shift CPU product timing by approximately six months relative to previous expectations. The main problem is a defect mode in the 7nm process that degraded yield, said Intel CEO Bob Swan during the earnings call on July 23, 2020.

“We’ve root-caused the issue and believe that there are no fundamental roadblocks,” said Swan. “But we have also invested in contingency plans to protect ourselves against further uncertainties in the schedule. We have mitigated the impact of the process delay on our product plan by taking advantage of improvements in design methodology such as die disaggregation and advanced packaging.”

The 7nm delay may affect Intel’s plans to deliver the Aurora supercomputer to Argonne National Laboratory on schedule by the end of 2021. Intel, the prime contractor, is building the machine with HPE/Cray using Intel’s 10nm Sapphire Rapids Xeon CPU and its 7nm data center GPU, code-named Ponte Vecchio.

As the centerpiece of Aurora, the front runner in the U.S. exascale program, Ponte Vecchio would herald Intel’s entry into the data center GPU market. The GPU was also supposed to be Intel’s first 7nm product.

The Ponte Vecchio release is now slated for “late 2021 or early 2022,” and Intel is working with external fabs on at least some elements of the GPU.

Bob Swan on the second-quarter earnings call:

“We will continue to invest in our future process technology roadmap, but we will pragmatically and objectively use the process technology that will provide our customers with the best possible predictability and performance, be it in our process, in an external foundry process, or a combination of both. Our advanced packaging technologies combined with our disaggregated architecture give us tremendous flexibility in using the process technology that best serves our customers. For example, our data center GPU design, Ponte Vecchio, will now be released in late 2021 or early 2022, using external and internal process technologies in combination with our world-leading packaging technologies.”

Swan stated that the “first Intel-based 7nm product” would be a client CPU in late 2022 or early 2023.

I suppose “first Intel-based 7nm product” can be parsed creatively, if not naturally (with “Intel-based” modifying the process, not the product), but it looks like Ponte Vecchio, or at least its GPU die, will not be built on 7nm at launch, and certainly not on Intel’s node, which will not be ready until a year later. There is speculation that Intel could move production to TSMC (or Samsung) as part of its “contingency plans”.

Aurora node design as presented by Intel’s Raja Koduri at SC19

Given the performance and power goals the Department of Energy has set for Aurora, and the timeframe of late 2021 or early 2022, TSMC’s 5nm node is a likely candidate.

In the quote above, Swan refers to Ponte Vecchio “using external and internal process technologies”. The I/O die and the GPU die (and the memory stack) can be implemented on different nodes. AMD does this with its Epyc CPUs. For example, Rome uses 7nm CPU cores and a 14nm I/O die.

Again from Swan:

“Originally, Ponte Vecchio’s architecture included an I/O-based die, connectivity, a GPU, and some memory tiles, all packaged together. This is the design of Ponte Vecchio. Right from the start, we’ve done some of these tiles on the inside and some of these tiles on the outside, and again used packaging technology as evidence of how we combine different designs into one package. So that was the design from the start … this design disaggregation gives us a lot of flexibility.

“If we go on now, we can think about introducing Ponte Vecchio with…. I think I said some of these tiles are inside and out from the start. Now we can judge whether we are replacing one of our tiles with a third-party foundry or not. Again, this is the beauty and value of this disaggregation and design methodology, which gives us much more optionality and flexibility. In the event that there is a process slip, we can buy something instead of doing everything ourselves.”

Swan puts a positive spin on the disaggregated design of the GPU, but swapping out the GPU compute die, as Intel apparently needs to do, isn’t a minor change. The process node correlates directly with the performance and power goals of the Ponte Vecchio GPU and, in a broader sense, of the Aurora system, with the GPUs providing most of the performance and accounting for a good portion of the power demand.

Intel reported that the 10nm Sapphire Rapids is on track for the second half of 2021, and that its “Intel-based 7nm data center CPU” is planned for the first half of 2023.

Aurora plays a central role in the United States’ exascale plans. The current implementation, known as Aurora21, has been positioned as the first U.S. exascale machine, although Intel has not publicly committed to any one-exaflop Linpack target.

Aurora isn’t the only exascale machine being developed in the U.S. Oak Ridge National Laboratory (with HPE and AMD) aims to stand up the Frontier system, with 1.5 exaflops (minimum peak), in the same timeframe (late 2021), and the uncertainty surrounding Intel’s data center GPU has increased the likelihood that Oak Ridge will take the lead in the U.S. exascale rollout. Lawrence Livermore National Laboratory (also with HPE and AMD) plans to field El Capitan a year later, in late 2022, targeting two exaflops. All three systems use HPE’s Cray Shasta architecture.

Aurora was originally conceived in 2015 as a pre-exascale supercomputer. The DOE CORAL contract stipulated that a 180-petaflops (peak) machine built from Intel Xeon Phi Knights Hill processors and second-generation OmniPath fabric technology would be deployed at Argonne in 2018. Those plans were thrown out when Intel pulled back from and eventually abandoned Phi and OmniPath development. The contract was then redefined and expanded.

The 7nm delay announcement contrasted with Intel’s strong financial results for the second quarter. The company’s revenue of $19.7 billion for the quarter was up 20 percent year over year. Data-centric revenue rose 34 percent and accounted for 52 percent of total revenue. Earnings rose 22 percent to $5.11 billion, as reported in the Wall Street Journal, but the stock fell 18 percent on news of the delay and lowered third-quarter guidance, and had not recovered as of this writing.
