33.10.3.1. ARM WFE and SEV instructions

The WFE and SEV instructions are just hints: a compliant implementation can treat them as NOPs.

Concrete examples of the instruction can be seen at:

However, likely no implementation actually treats them as NOPs (TODO confirm), since:

  • WFE is intended to put the core in a low power mode

  • SEV wakes up cores from a low power mode

and power consumption is key in ARM applications.

Quotes supporting the above, from the ARMv8 architecture reference manual db G1.18.1 "Wait For Event and Send Event":

The following events are WFE wake-up events:

[…]

  • An event caused by the clearing of the global monitor associated with the PE

and ARMv8 architecture reference manual db E2.9.6 "Use of WFE and SEV instructions by spin-locks":

ARMv8 provides Wait For Event, Send Event, and Send Event Local instructions, WFE, SEV, SEVL, that can assist with reducing power consumption and bus contention caused by PEs repeatedly attempting to obtain a spin-lock. These instructions can be used at the application level, but a complete understanding of what they do depends on a system level understanding of exceptions. They are described in Wait For Event and Send Event on page G1-5308. However, in ARMv8, when the global monitor for a PE changes from Exclusive Access state to Open Access state, an event is generated.

Note This is equivalent to issuing an SEVL instruction on the PE for which the monitor state has changed. It removes the need for spinlock code to include an SEV instruction after clearing a spinlock.
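The quoted behaviour can be sketched in C with inline assembly. This is only a minimal sketch, not code taken from the ARM manual: the names wfe_lock_acquire and wfe_lock_release are hypothetical, GCC or Clang is assumed for the inline assembly and the __atomic builtins, and the non-AArch64 branch is a plain busy-wait fallback added so that the sketch also builds on other hosts. The idea is that the waiter marks the lock address in its monitor with LDAXR before sleeping in WFE, so the plain store in the release path is enough to generate the wake-up event.

```c
#include <stdint.h>

/* Hypothetical helper: acquire a lock word that is 0 when free, 1 when held. */
static inline void wfe_lock_acquire(volatile uint32_t *lock)
{
#if defined(__aarch64__)
    uint32_t tmp;
    __asm__ volatile(
        "sevl\n"                 /* set the local event so the first WFE falls through */
        "1: wfe\n"               /* wait for an event */
        "ldaxr %w0, [%1]\n"      /* exclusive load: marks the address in the monitor */
        "cbnz %w0, 1b\n"         /* lock still held: go back to waiting */
        "stxr %w0, %w2, [%1]\n"  /* try to claim the lock, status 0 means success */
        "cbnz %w0, 1b\n"         /* lost the exclusive access: retry */
        : "=&r"(tmp)
        : "r"(lock), "r"(1u)
        : "memory");
#else
    /* Assumption: portable fallback for non-AArch64 hosts, plain busy-wait. */
    while (__atomic_exchange_n(lock, 1u, __ATOMIC_ACQUIRE) != 0u) {
    }
#endif
}

/* Hypothetical helper: release the lock. Note that there is no SEV here. */
static inline void wfe_lock_release(volatile uint32_t *lock)
{
#if defined(__aarch64__)
    /* The store clears the monitors of the waiting PEs, which generates the event. */
    __asm__ volatile("stlr wzr, [%0]" : : "r"(lock) : "memory");
#else
    __atomic_store_n(lock, 0u, __ATOMIC_RELEASE);
#endif
}
```

The leading SEVL makes the first WFE return immediately, so the lock is always checked at least once before the core can sleep.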

The recommended ARMv8 spinlock implementation is shown at http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/ch01s03s02.html where, as explained in that section, WAIT_FOR_UPDATE is a macro that expands to WFE. TODO: SEV is used explicitly in those examples via SIGNAL_UPDATE. Where is the example that shows how SEV can be eliminated thanks to the implicit monitor events?
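For comparison, the explicit-SEV pattern can be sketched in C roughly as follows. This is a hedged illustration and not the code from the linked ARM document: the macro names WAIT_FOR_UPDATE and SIGNAL_UPDATE are borrowed from it, but their expansions here are assumptions, and sev_spinlock, sev_spin_lock, sev_spin_trylock and sev_spin_unlock are hypothetical names. On non-AArch64 hosts both macros expand to nothing, which remains a valid spinlock precisely because the hints may be treated as NOPs.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__aarch64__)
/* Assumed expansions: wait for an event / publish the store, then wake waiters. */
#define WAIT_FOR_UPDATE() __asm__ volatile("wfe" ::: "memory")
#define SIGNAL_UPDATE()   __asm__ volatile("dsb ish\n\tsev" ::: "memory")
#else
/* Hints treated as NOPs: the loop degrades to a plain busy-wait. */
#define WAIT_FOR_UPDATE() ((void)0)
#define SIGNAL_UPDATE()   ((void)0)
#endif

/* Hypothetical lock type: the flag is set while the lock is held. */
typedef struct {
    atomic_flag held;
} sev_spinlock;

/* Returns true if the lock was free and is now held by the caller. */
static inline bool sev_spin_trylock(sev_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static inline void sev_spin_lock(sev_spinlock *l)
{
    while (!sev_spin_trylock(l)) {
        /* Lock is held: sleep until some core signals an update. */
        WAIT_FOR_UPDATE();
    }
}

static inline void sev_spin_unlock(sev_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
    /* Wake any cores that are waiting in WAIT_FOR_UPDATE. */
    SIGNAL_UPDATE();
}
```

If the SEV arrives after a waiter has failed to take the lock but before it executes WFE, the event stays pending and the WFE returns immediately, so the wake-up is not lost.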

In QEMU 3.0.0, SEV is a NOP, and WFE might be too, but I’m not sure, see: https://github.com/qemu/qemu/blob/v3.0.0/target/arm/translate-a64.c#L1423

    case 2: /* WFE */
        if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
            s->base.is_jmp = DISAS_WFE;
        }
        return;
    case 4: /* SEV */
    case 5: /* SEVL */
        /* we treat all as NOP at least for now */
        return;

TODO: what does the WFE code do? How can it not be a NOP if SEV is a NOP? The comment at https://github.com/qemu/qemu/blob/v3.0.0/target/arm/translate.c#L4609 might explain why, but I only understand about 30% of it ;-)

 * For WFI we will halt the vCPU until an IRQ. For WFE and YIELD we
 * only call the helper when running single threaded TCG code to ensure
 * the next round-robin scheduled vCPU gets a crack. In MTTCG mode we
 * just skip this instruction. Currently the SEV/SEVL instructions
 * which are *one* of many ways to wake the CPU from WFE are not
 * implemented so we can't sleep like WFI does.
 */

For gem5 however, if we comment out the SEV instruction, then the simulation actually exits with simulate() limit reached, so the CPU truly never wakes up. This is more realistic behaviour, since gem5 is more focused on simulating a realistic microarchitecture and power consumption.

The following Raspberry Pi bibliography helped us get this sample up and running:

For how userland spinlocks and mutexes are implemented see Userland mutex implementation.