Hot Chips Over the past two years, AMD has steadily expanded its computing portfolio. This included the addition of FPGAs and smartNICs through the acquisition of Xilinx, announced in 2020, and of Pensando earlier this year.
At the Hot Chips conference this week, AMD offered a glimpse of how it plans to combine these technologies to build a better infrastructure processor for the cloud, enterprise datacenter, and telco network.
The sheer number of VMs and container workloads, combined with faster networking demands and heightened security requirements, has placed a tremendous burden on CPUs that, at their heart, are designed to run tenant workloads, AMD Senior Fellow Jaideep Dastidar explained in his keynote address.
On top of the VMs or containers themselves, the CPUs also have to contend with VM-to-VM networking and switching, storage virtualization, and encryption – to name just a few non-core workloads.
“To the rescue came smartNICs and DPUs, because they helped offload those workloads off the host CPU,” he said.
However, most smartNICs are based on high-speed, fixed-function ASICs, programmable FPGAs, general compute cores, or a combination of at most two of these technologies. Usually this means compromising on performance or flexibility somewhere, Dastidar argued.
ASIC-like, fixed-function logic is an attractive option because it offers the best performance-per-watt of any smartNIC platform, he said, “but that is attractive only if you use it for fixed functions.”
Embedded processors, he added, offer a more flexible alternative, but suffer from many of the same limitations as the CPU.
Because of this, many DPUs and smartNICs on the market today utilize a combination of fixed-function logic and highly programmable Arm-CPU cores to accelerate workloads. Examples include Marvell’s Octeon 10 and Nvidia’s BlueField DPUs.
And then there are FPGAs, which fall somewhere between ASICs and general-purpose processors in terms of sheer compute and programmability. Intel has found success with its FPGA-based IPUs here.
The best of three worlds
With Xilinx now in-house, AMD could have pursued any of the three approaches. Instead, Dastidar said he and his team plan to combine FPGAs, ASICs, and general compute cores to achieve the greatest flexibility and performance possible while minimizing power consumption.
The idea here is pretty straightforward: interconnect each of the accelerators and abstract the complexity associated with programming for a heterogeneous compute platform in software.
According to Dastidar, this allows them to run workloads on the ASIC, FPGA, general compute cores, or break the workload up and distribute it across all three.
“It doesn’t matter if one particular heterogeneous component is solving the function or a combination of heterogeneous functions. From a software perspective, it should just look like one SoC,” he said.
With that said, AMD expects certain workloads to favor certain components. “We apply ASIC logic where it does best: crypto offload, DMA offload, even full network data-plane offload,” Dastidar added.
For customers who want to add functions that won’t change frequently, “you can also completely hot add or remove new accelerator functions in the programmable logic.”
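To make the "one SoC" idea concrete, here is a minimal Python sketch of how software might hide the three engines behind a single call. It is not AMD's software stack: the `SmartNic` class, its method names, and the `compression` and `telemetry` functions are hypothetical, and the only placement rule taken from the talk is that crypto, DMA, and network data-plane offload go to the ASIC logic.

```python
from dataclasses import dataclass, field
from enum import Enum


class Engine(Enum):
    ASIC = "asic"    # hardened, fixed-function logic
    FPGA = "fpga"    # programmable logic
    CORES = "cores"  # general-purpose Arm cores


# The ASIC entries mirror the examples quoted in the article.
# The function names themselves are invented for illustration.
ASIC_FUNCTIONS = frozenset({"crypto", "dma", "network-dataplane"})


@dataclass
class SmartNic:
    """Toy model of a SoC that presents three engines as one device."""
    fpga_functions: set = field(default_factory=set)

    def hot_add(self, function: str) -> None:
        """Load an accelerator function into the programmable logic."""
        self.fpga_functions.add(function)

    def hot_remove(self, function: str) -> None:
        """Unload an accelerator function from the programmable logic."""
        self.fpga_functions.discard(function)

    def place(self, function: str) -> Engine:
        """Pick an engine: ASIC first, then FPGA, then the Arm cores."""
        if function in ASIC_FUNCTIONS:
            return Engine.ASIC
        if function in self.fpga_functions:
            return Engine.FPGA
        return Engine.CORES

    def run(self, pipeline: list) -> list:
        """Split a workload into stages and place each one; the caller
        never has to name an engine."""
        return [(stage, self.place(stage)) for stage in pipeline]


nic = SmartNic()
nic.hot_add("compression")  # hypothetical hot-added function
placement = nic.run(["crypto", "compression", "telemetry"])
for stage, engine in placement:
    print(f"{stage} -> {engine.value}")
```

In this toy model a single workload ends up spread across all three engines, and removing `compression` from the programmable logic would simply send that stage to the Arm cores instead.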
For general compute, AMD curiously won’t be relying on the Zen core architecture that underpins its Ryzen and Epyc portfolio and will instead employ Arm processor cores.
Specifically, the smartNIC features 16 A78-AE cores for high-performance workloads and four R52 cores for low-power and lights-out operations.
Memory is supplied to the card itself via up to four 32GB LPDDR5 or DDR5 memory DIMMs.
Finally, networking and host communication will be provided by a pair of 200Gbps interfaces and 16 lanes of PCIe 5.0 or CXL 2.0 connectivity.
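A quick back-of-the-envelope check shows how those figures fit together. The sketch below uses the port count, lane count, and DIMM sizes quoted above, plus two facts from the PCIe 5.0 specification: 32 GT/s per lane and 128b/130b line encoding. The result is a per-direction ceiling before packet and protocol overhead, so usable throughput will be lower; the variable names are ours.

```python
# Figures from the article
NETWORK_PORTS = 2
PORT_GBPS = 200
PCIE_LANES = 16
DIMM_COUNT = 4
DIMM_GB = 32

# PCIe 5.0 signalling: 32 GT/s per lane with 128b/130b encoding
PCIE5_GT_PER_LANE = 32
PCIE5_ENCODING_EFFICIENCY = 128 / 130


def pcie5_gbps(lanes: int) -> float:
    """Per-direction line rate in Gbps, before protocol overhead."""
    return lanes * PCIE5_GT_PER_LANE * PCIE5_ENCODING_EFFICIENCY


network_gbps = NETWORK_PORTS * PORT_GBPS
host_gbps = pcie5_gbps(PCIE_LANES)
max_memory_gb = DIMM_COUNT * DIMM_GB

print(f"Network, both ports:  {network_gbps} Gbps")
print(f"PCIe 5.0 x{PCIE_LANES} to host: {host_gbps:.0f} Gbps per direction")
print(f"Maximum card memory:  {max_memory_gb} GB")
```

On paper, then, the 16-lane host link (roughly 504Gbps each way) has some headroom over the 400Gbps the two network ports can deliver together, and a fully populated card carries 128GB of memory.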
Offload all the things
AMD anticipates numerous use cases for its smartNICs, including offloading common virtualization tasks like Open vSwitch and Virtio.
“Open vSwitch as the traditional SDN application. It’s very useful for VM-to-VM communication and VM migration. However, OVS does tend to tax the whole CPU,” Dastidar said, explaining that one implementation might be to have the OVS control plane running on the host CPU and offload the actual switching functionality to the smartNIC.
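The split Dastidar describes can be pictured as a flow cache: the card forwards packets that match a known flow and only consults the host when it sees something new. The Python sketch below is a simplified illustration of that pattern, not Open vSwitch code and not AMD's implementation; the class names, the MAC-to-port table, and the `vm-a`/`vm-b` labels are all hypothetical.

```python
class HostControlPlane:
    """Stands in for the OVS control plane left running on the host CPU."""

    def __init__(self, mac_to_port: dict):
        self.mac_to_port = mac_to_port
        self.upcalls = 0  # how many times the host was consulted

    def resolve(self, src: str, dst: str):
        """Decide the output port for a new flow; None means drop."""
        self.upcalls += 1
        return self.mac_to_port.get(dst)


class NicDatapath:
    """Stands in for the switching function offloaded to the smartNIC."""

    def __init__(self, control: HostControlPlane):
        self.control = control
        self.flow_table = {}  # (src, dst) -> output port

    def forward(self, src: str, dst: str):
        key = (src, dst)
        if key not in self.flow_table:
            # Miss: ask the host once, then cache its decision on the card.
            self.flow_table[key] = self.control.resolve(src, dst)
        # Hit: forwarded without touching the host CPU.
        return self.flow_table[key]


control = HostControlPlane({"vm-b": 2})
nic = NicDatapath(control)

ports = [nic.forward("vm-a", "vm-b") for _ in range(1000)]
print(f"packets forwarded: {len(ports)}, host upcalls: {control.upcalls}")
```

Only the first packet of the flow reaches the host in this model; the other 999 are handled on the card, which is the CPU saving the offload is meant to deliver.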
“Virtio Net is very popular because it allows you to paravirtualize… the network adapter for VMs. We support two models for Virtio Net. One is bare metal Virtio, where the entire control plane and data plane are consumed by the smartNIC SoC… we also support the vDPA model, where the data path is in the adaptive smartNIC SoC, but the control plane can remain on the host.”
According to Dastidar, the smartNIC can also work in reverse. This enables external hosts to access resources, like NVMe storage over the network, effectively enabling resources in one server to be accessed by another.
“What we have within the SoC is a virtual switch with different endpoints. These PCIe endpoints are actually presenting themselves for the different smartNIC services,” he explained.
AMD says the smartNIC will be based on a TSMC 7nm process, but it remains unclear when we can expect to see the first of these three accelerators from the vendor in the market. ®