33.11.2. aarch64 baremetal NEON setup
Inside baremetal/lib/aarch64.S there is a chunk of code that enables floating point operations:
mov x1, 0x3 << 20
msr cpacr_el1, x1
isb
CPACR_EL1 is documented in the ARMv8 architecture reference manual, section D10.2.29 "CPACR_EL1, Architectural Feature Access Control Register".
Here we set the CPACR_EL1.FPEN bits to 3 (binary 11), which enables floating point operations:
11 This control does not cause any instructions to be trapped.
We later also added an enable for the CPACR_EL1.ZEN bits, which are needed for ARM SVE.
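As a rough illustration of the register value involved, the following C sketch builds a CPACR_EL1 value from its FPEN and ZEN fields. The helper name `cpacr_el1_value` and the macro names are hypothetical and do not come from baremetal/lib/aarch64.S. The bit positions (FPEN at bits [21:20], ZEN at bits [17:16]) are taken from the architecture manual's description of the register.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions in CPACR_EL1: FPEN is bits [21:20], ZEN is bits [17:16]. */
#define CPACR_EL1_FPEN_SHIFT 20
#define CPACR_EL1_ZEN_SHIFT  16

/* Hypothetical helper: build a CPACR_EL1 value from the 2-bit FPEN and ZEN fields. */
static uint64_t cpacr_el1_value(unsigned fpen, unsigned zen)
{
    assert(fpen <= 3 && zen <= 3); /* both fields are only 2 bits wide */
    return ((uint64_t)fpen << CPACR_EL1_FPEN_SHIFT) |
           ((uint64_t)zen << CPACR_EL1_ZEN_SHIFT);
}
```

With this sketch, `cpacr_el1_value(3, 0)` gives the same `0x3 << 20` constant that the assembly loads into x1, and `cpacr_el1_value(3, 3)` additionally sets both ZEN bits.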
Without CPACR_EL1.FPEN, the printf:
printf("got: %c\n", c);
compiled to:
str q0, [sp, #80]
which uses NEON registers, and goes into an exception loop.
It was a bit confusing because there was a previous printf:
printf("enter a character\n");
which did not blow up because GCC compiles it into puts directly since it has no arguments, and that does not generate NEON instructions.
The last instructions executed were found in GDB with:
while(1)
  stepi
end
or by hacking the QEMU CLI to contain:
-D log.log -d in_asm
I could not find any previous NEON instruction executed, so this led me to suspect that some NEON initialization was required:
- http://infocenter.arm.com/help/topic/com.arm.doc.dai0527a/DAI0527A_baremetal_boot_code_for_ARMv8_A_processors.pdf "Bare-metal Boot Code for ARMv8-A Processors"
- https://community.arm.com/processors/f/discussions/5409/how-to-enable-neon-in-cortex-a8
- https://stackoverflow.com/questions/19231197/enable-neon-on-arm-cortex-a-series
We then tried to copy the code from the "Bare-metal Boot Code for ARMv8-A Processors" document:
// Disable trapping of accessing in EL3 and EL2.
MSR CPTR_EL3, XZR
MSR CPTR_EL3, XZR
// Disable access trapping in EL1 and EL0.
MOV X1, #(0x3 << 20) // FPEN disables trapping to EL1.
MSR CPACR_EL1, X1
ISB
but it entered an exception loop at MSR CPTR_EL3, XZR.
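When a system register write faults like this, one thing worth checking is the exception level the code is running at. The C sketch below decodes a raw CurrentEL value, which on bare metal could be read with `mrs x0, CurrentEL`, and applies the general rule that a register with an `_ELx` suffix is only accessible from ELx or higher. Both function names are hypothetical, and the predicate is a simplification that ignores trap controls and security state.

```c
#include <assert.h>
#include <stdint.h>

/* CurrentEL holds the exception level in bits [3:2]; the other bits are reserved. */
static unsigned current_el_decode(uint64_t current_el_raw)
{
    return (unsigned)((current_el_raw >> 2) & 0x3);
}

/* Hypothetical predicate: can code running at exception level `el` access a
 * register whose name ends in _EL<reg_el>? Simplified: ignores trap
 * configuration and security state. */
static int can_access_el_register(unsigned el, unsigned reg_el)
{
    assert(el <= 3 && reg_el <= 3);
    return el >= reg_el;
}
```

Under these assumptions, a raw CurrentEL value of 0x4 decodes to EL1, for which `can_access_el_register(1, 3)` is false (as for CPTR_EL3) and `can_access_el_register(1, 1)` is true (as for CPACR_EL1).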
We then found out that QEMU starts in EL1, and so we kept just the EL1 part, and it worked. Related: