30.6.5. ARM SVE

Scalable Vector Extension.

Examples:

To understand it, start with the execution example in Figure 1 of: https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf

aarch64 only, newer than ARM NEON.

It is called Scalable because it does not specify the vector width! Therefore we don’t have to worry about new vector width instructions every few years! Hurray!

The instructions then allow:

  • incrementing loop index by the vector length without explicitly hardcoding it

  • when the last loop iteration is reached, the trailing elements that do not fill a whole vector get automatically masked out by the predicate register, and have no effect
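
The two points above can be sketched in plain C. This is only an emulation under stated assumptions, not real SVE code: whilelt, vla_add, MAX_LANES and the lanes parameter are hypothetical names for illustration. On real hardware the lane count comes from the CPU, and the predicate comes from an instruction such as WHILELT.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Upper bound for the emulated vector: 2048 bits / 32-bit lanes.
 * Hypothetical constant, only needed to size the predicate array. */
#define MAX_LANES 64

/* Emulates predicate generation: lane j is active iff i + j < n. */
static void whilelt(bool *pred, size_t lanes, size_t i, size_t n) {
    for (size_t j = 0; j < lanes; j++)
        pred[j] = (i + j < n);
}

/* dst[k] = a[k] + b[k] for k in [0, n), `lanes` elements per iteration.
 * `lanes` stands in for the hardware vector length, which the loop
 * body never hardcodes. */
void vla_add(uint32_t *dst, const uint32_t *a, const uint32_t *b,
             size_t n, size_t lanes) {
    bool pred[MAX_LANES];
    if (lanes == 0 || lanes > MAX_LANES)
        return;
    /* Step the index by the vector length, whatever it happens to be. */
    for (size_t i = 0; i < n; i += lanes) {
        whilelt(pred, lanes, i, n);
        for (size_t j = 0; j < lanes; j++) {
            /* Inactive lanes neither load nor store, so the tail
             * past n is never touched. */
            if (pred[j])
                dst[i + j] = a[i + j] + b[i + j];
        }
    }
}
```

Calling vla_add with n = 7 and lanes = 4 runs two iterations, and the second one has only three active lanes. Changing lanes to 16 gives the same result in a single iteration, which is the whole point of not fixing the vector width.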

Added to QEMU in 3.0.0 and gem5 in 2019 Q3.

The Linux kernel reports SVE support in /proc/cpuinfo as the sve feature flag.
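
A minimal sketch of checking that flag from C, assuming you already have the Features line of /proc/cpuinfo in a string. has_feature is a hypothetical helper, not a kernel or libc API. It matches whole tokens only, so a flag such as sve2 does not count as sve.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if `name` appears as a whole whitespace-separated token
 * in `line`, e.g. a "Features : ..." line copied from /proc/cpuinfo. */
bool has_feature(const char *line, const char *name) {
    size_t len = strlen(name);
    const char *p = line;
    if (len == 0)
        return false;
    while ((p = strstr(p, name)) != NULL) {
        /* The token must start at the line start or after a separator... */
        bool starts = (p == line) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        /* ...and must end at a separator or at the end of the string. */
        char end = p[len];
        bool ends = end == '\0' || end == ' ' || end == '\t' || end == '\n';
        if (starts && ends)
            return true;
        p++;
    }
    return false;
}
```

On a real system, getauxval(AT_HWCAP) together with HWCAP_SVE is likely the more robust check, since it does not depend on text parsing.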

SVE support is indicated by ID_AA64PFR0_EL1.SVE, which is dumped by baremetal/arch/aarch64/dump_regs.c.
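
For reference, the SVE field sits at bits [35:32] of ID_AA64PFR0_EL1, and a nonzero value means SVE is implemented. A small sketch of decoding a dumped register value follows. id_aa64pfr0_sve is a hypothetical helper name, and reading the register itself (mrs at EL1 or above) is left out.

```c
#include <stdint.h>

/* Extracts ID_AA64PFR0_EL1.SVE, bits [35:32], from a raw register value.
 * 0 means SVE is not implemented, nonzero means it is. */
unsigned id_aa64pfr0_sve(uint64_t id_aa64pfr0_el1) {
    return (unsigned)((id_aa64pfr0_el1 >> 32) & 0xf);
}
```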

Using SVE normally requires setting the CPACR_EL1.FPEN and CPACR_EL1.ZEN bits, which as of lkmc 29fd625f3fda79f5e0ee6cac43517ba74340d513 + 1 we also enable in our Baremetal bootloaders, see also: aarch64 baremetal NEON setup.
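
As a rough illustration of which bits are involved: FPEN is CPACR_EL1[21:20] and ZEN is CPACR_EL1[17:16], and setting both fields to 0b11 stops FP/SIMD and SVE instructions from trapping at EL0 and EL1. The sketch below only computes the new register value. cpacr_enable_fp_sve and the macro names are hypothetical, and this is not a copy of what the lkmc bootloaders do.

```c
#include <stdint.h>

/* Field positions in CPACR_EL1 (macro names are illustrative). */
#define CPACR_EL1_ZEN_SHIFT  16 /* SVE trap control, bits [17:16] */
#define CPACR_EL1_FPEN_SHIFT 20 /* FP/SIMD trap control, bits [21:20] */

/* Returns `cpacr` with FPEN and ZEN both set to 0b11 (no trapping).
 * The result would then be written back with msr cpacr_el1 followed
 * by an isb, which this sketch does not do. */
uint64_t cpacr_enable_fp_sve(uint64_t cpacr) {
    return cpacr
         | (UINT64_C(3) << CPACR_EL1_FPEN_SHIFT)
         | (UINT64_C(3) << CPACR_EL1_ZEN_SHIFT);
}
```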