ICSA Colloquium Talk - 26/11/2020
Helium Vector Architecture for Cortex-M: The challenges (and opportunities) of being small
The performance demands of traditional DSP workloads such as audio processing and sensor fusion have been increasing. Together with the emergence of new workloads such as ML at the edge, this has created a need for significantly greater DSP capability, even in the smallest embedded devices. To address this, we have introduced Arm® Helium™ technology, the new vector extension for Arm Cortex®-M processors. This clean-sheet architecture eliminates the need for separate DSPs and the complex heterogeneous development environments they introduce. Last year alone, the Arm ecosystem shipped over 16 billion Cortex-M-based devices. As such, Helium represents a significant enhancement to a pervasive architecture.
This talk will describe how Helium provides a scalable solution that is applicable to everything from high-performance embedded applications right down to the most constrained devices. The most constrained Cortex-M processors often have incredibly short pipelines and no caches or branch predictors. To illustrate the problems this can cause, and our solutions to them, we compare some common features of the Scalable Vector Extension (SVE) and Helium, such as support for complex math operations and predication. We describe why Helium takes a different approach to implementing these features and the benefits it brings to smaller-scale processors.
To increase performance while still getting the most out of every gate and every byte/s of available bandwidth, VLIW-based DSPs expose the microarchitecture to software. This, however, comes at the cost of software portability. We will present how Helium achieves the same goals by taking a balanced approach that exposes some microarchitectural details, but in a way that still preserves software portability. We explain the knock-on effects this had on the architecture and, in particular, on the exception model, where a key goal was maintaining the deterministic, low-latency interrupts that Cortex-M is known for.
Tom is a Fellow within Arm Research where he leads the embedded architecture group. He was the architect behind both Arm’s Helium vector extension and the TrustZone security architecture for Cortex-M. For his PhD at Durham University, Tom used just-in-time compilation techniques to provide abstraction in reconfigurable computing systems. When not in the office, Tom can be found jumping out of perfectly good planes for some wingsuiting or formation skydiving.