The Inherent Freedom of Heterogeneous Systems
by Zvonimir Bandic, Senior Director, Next Gen Platform Technologies, Western Digital Research, Office of CTO, Western Digital Corporation

Few can deny that the choke points for many of today’s increasingly sophisticated systems continue to be hardware cost and power consumption. It’s no surprise that system performance has tapered off for lack of “architectural vision.” One way out of this dilemma is heterogeneous computing. The concept rests on the premise that a system can use more than one kind of processor or core. Adding dissimilar co-processors, each with specialized processing capabilities for specific tasks, raises both performance and energy efficiency.

The Rise of Open, Memory-Centric Architecture

Taking heterogeneity to the next level calls for an architectural platform built around a memory-centric architecture. This means that heterogeneous systems can now reside in the same memory address space and share memory coherently. It also means that one processor can write data in place, without first copying it to a separate address space, while other processors read from the same location. Access is coordinated using atomic operations, so every processor sees a consistent view of memory. A key advantage is that this enables re-use of a considerable amount of legacy software.
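The idea of in-place sharing coordinated by atomic operations can be sketched in a few lines of Python. This is only an analogy under stated assumptions: threads inside one process stand in for processors that share an address space, a `bytearray` stands in for shared memory, and a `threading.Lock` stands in for a hardware atomic read-modify-write. The names `atomic_add`, `worker`, and `run` are hypothetical and are not part of any Western Digital or RISC-V API.

```python
import struct
import threading

# One buffer plays the role of the single shared address space.
shared = bytearray(8)        # room for one 64-bit counter
view = memoryview(shared)    # zero-copy window onto the same bytes
lock = threading.Lock()      # assumption: stands in for a hardware atomic operation


def atomic_add(offset: int, delta: int) -> int:
    """Add delta to the 64-bit value stored at offset, in place; return the new value."""
    with lock:
        (old,) = struct.unpack_from("<Q", view, offset)
        new = old + delta
        struct.pack_into("<Q", view, offset, new)
        return new


def worker(iterations: int) -> None:
    """Simulated processor: updates the shared counter without making a private copy."""
    for _ in range(iterations):
        atomic_add(0, 1)


def run(num_workers: int = 4, iterations: int = 10_000) -> int:
    """Start the workers, wait for them, and read the final counter value in place."""
    struct.pack_into("<Q", view, 0, 0)  # reset the counter
    threads = [
        threading.Thread(target=worker, args=(iterations,))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    (total,) = struct.unpack_from("<Q", view, 0)
    return total
```

Because every update goes through `atomic_add`, no increment is lost: `run(4, 1000)` returns 4000 regardless of how the threads interleave. In real hardware the lock would be replaced by an atomic instruction and the coherence protocol would keep each processor's cached copy consistent.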

New Freedom to Connect RISC-V Compute Nodes: OmniXtend™ protocol

The open, memory-centric architecture gives today’s designers far more flexibility. Earlier architectures left designers hamstrung in what they could and could not connect. Under the new paradigm, a diverse array of RISC-V compute nodes can be connected to universally shared memory (NUMA) using standardized and open coherence protocols such as Western Digital’s OmniXtend™ (see Figure 1), a new open approach to providing cache-coherent memory over an Ethernet fabric. This leverages the full power and promise of heterogeneous computing on an open-source architecture supported by a full spectrum of hardware, including the ubiquitous Ethernet physical layer and programmable Ethernet switches such as the Barefoot Tofino P4-programmable switch. Designers get a free hand to attach CPUs, AI devices, memories, and other devices and systems. This new breed of heterogeneous systems can also easily be made compatible with other ISA platforms while belonging to the same memory domain. Companies can therefore bring low-cost, high-capacity memory solutions to systems that use the new memory-centric architecture, and the entire ecosystem can be built around innovative new peripherals beyond memories, including accelerator systems for AI workloads.

Figure 1: Memory-centric architecture with OmniXtend, built on a P4-programmable Ethernet switch and the ubiquitous Ethernet PHY. It allows large numbers of RISC-V compute nodes to connect to universally shared memory (NUMA) using standardized and open coherence protocols. The concept also enables aggregation and disaggregation of memory through a memory appliance, that is, a memory-heavy node.
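To make "cache-coherent memory over a fabric" more concrete, the following toy model shows the kind of bookkeeping a coherence protocol performs. It is a generic directory-style sketch and not the OmniXtend protocol or its message format: the class names `Line` and `HomeNode`, the message labels, and the node names are all hypothetical. A memory-side home node tracks which compute nodes cache each address, recalls the writable copy before serving another reader, and invalidates read-only copies before granting write permission.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Line:
    value: int = 0                             # copy held by the memory node
    sharers: set = field(default_factory=set)  # nodes holding a read-only copy
    owner: Optional[str] = None                # node holding the writable copy


class HomeNode:
    """Toy memory-side directory; every name here is illustrative."""

    def __init__(self):
        self.lines = {}      # addr -> Line
        self.caches = {}     # node name -> {addr: cached value}
        self.messages = []   # log of simulated fabric messages

    def attach(self, node: str) -> None:
        self.caches[node] = {}

    def _line(self, addr: int) -> Line:
        return self.lines.setdefault(addr, Line())

    def _recall_owner(self, line: Line, addr: int) -> None:
        # Pull the writable copy back to memory and drop it from the owner's cache.
        if line.owner is not None:
            line.value = self.caches[line.owner].pop(addr)
            self.messages.append(("writeback", line.owner, addr))
            line.owner = None

    def read(self, node: str, addr: int) -> int:
        if addr in self.caches[node]:          # local hit, no fabric traffic
            return self.caches[node][addr]
        line = self._line(addr)
        self._recall_owner(line, addr)
        line.sharers.add(node)
        self.caches[node][addr] = line.value
        self.messages.append(("grant_read", node, addr))
        return line.value

    def write(self, node: str, addr: int, value: int) -> None:
        line = self._line(addr)
        if line.owner != node:
            self._recall_owner(line, addr)
            for other in line.sharers - {node}:   # invalidate stale copies
                self.caches[other].pop(addr, None)
                self.messages.append(("invalidate", other, addr))
            line.sharers.clear()
            line.owner = node
            self.messages.append(("grant_write", node, addr))
        self.caches[node][addr] = value
```

For example, if `cpu0` and `cpu1` have both read address `0x40` and an accelerator then writes 7 to it, the two CPU copies are invalidated, and the next read from `cpu0` triggers a write-back from the accelerator and returns 7. A real protocol adds details this sketch leaves out, such as message ordering, retries, and cache-line granularity.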

Open Sourcing the Core

To ensure maximum efficiencies, today’s vast silos of data must be closer to their compute “geographies.” Proprietary CPUs can no longer handle the demands of vastly sophisticated systems. RISC-V-based designs allow open standard interfaces to be used, which, in turn, enable specialty processing, memory-centric designs, unique storage, and flexible interconnect applications. One can foresee the emergence of new data-centric applications such as the Internet of Things (IoT), secure processing, industrial controls and more. This innovative, new open approach will finally deliver cache-coherent memory over an Ethernet “fabric.”

Open Sourced RISC-V Cores and Ecosystem Enablement

Building open, networked cache-coherency schemes such as Western Digital’s OmniXtend requires a whole ecosystem of building blocks: processors, accelerators and compute cores. The RISC-V instruction set architecture is open, which makes it possible to build such an ecosystem. As part of this process, Western Digital recently released the SweRV Core™ and an associated instruction set simulator (ISS). The SweRV Core is a 32-bit, 2-way superscalar, mostly in-order core with a 9-stage pipeline, and it offers superior performance compared to current open-source RISC-V cores.

One is left with the inescapable conclusion that any truly viable system must be based on platforms that include a memory-centric architecture. Heterogeneous systems should all reside in the same memory domain and share memory in a coherent way. In addition, legacy software should be able to piggy-back on this new architecture to take advantage of the many benefits it provides today’s systems designers.

Western Digital, SweRV Core, and OmniXtend are registered trademarks or trademarks of Western Digital Corporation or its affiliates in the US and/or other countries. All other marks are the property of their respective owners.