24.6.4. gem5 restore checkpoint with a different CPU
gem5 can switch to a different CPU model when restoring a checkpoint.
A common combo is to boot Linux with a fast CPU, make a checkpoint and then replay the benchmark of interest with a slower CPU.
This can be observed interactively in full system with:
./run --arch aarch64 --emulator gem5
Then in the guest terminal after boot ends:
sh -c 'm5 checkpoint;sh'
m5 exit
And then restore the checkpoint with a different slower CPU:
./run --arch aarch64 --emulator gem5 --gem5-restore 1 -- --caches --cpu-type=DerivO3CPU
And now you will notice that everything happens much more slowly in the guest terminal!
An even more direct and minimal way to observe this is with userland/freestanding/gem5_checkpoint.S, which was mentioned at gem5 checkpoint userland minimal example, plus some logging:
./run \
  --arch aarch64 \
  --emulator gem5 \
  --static \
  --trace ExecAll,FmtFlag,O3CPU,SimpleCPU \
  --userland userland/freestanding/gem5_checkpoint.S \
;
cat "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"
./run \
  --arch aarch64 \
  --emulator gem5 \
  --gem5-restore 1 \
  --static \
  --trace ExecAll,FmtFlag,O3CPU,SimpleCPU \
  --userland userland/freestanding/gem5_checkpoint.S \
  -- \
  --caches \
  --cpu-type DerivO3CPU \
  --restore-with-cpu DerivO3CPU \
;
cat "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"
At gem5 2235168b72537535d74c645a70a85479801e0651, the first run does everything in AtomicSimpleCPU:
...
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1f92 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e40 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e30 WriteReq
0: SimpleCPU: system.cpu: Tick
0: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
500: SimpleCPU: system.cpu: Tick
500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+4 : movz x1, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
1000: SimpleCPU: system.cpu: Tick
1000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+8 : m5checkpoint : IntAlu : flags=(IsInteger|IsNonSpeculative|IsUnverifiable)
1000: SimpleCPU: system.cpu: Resume
1500: SimpleCPU: system.cpu: Tick
1500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+12 : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
2000: SimpleCPU: system.cpu: Tick
2000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+16 : m5exit : No_OpClass : flags=(IsInteger|IsNonSpeculative)
and after restore we see, as expected, a single ExecEnable instruction executed amidst O3CPU noise:
FullO3CPU: Ticking main, FullO3CPU.
79000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+12 : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 FetchSeq=1 CPSeq=1 flags=(IsInteger)
82500: O3CPU: system.cpu: Removing committed instruction [tid:0] PC (0x400084=>0x400088).(0=>1) [sn:1]
82500: O3CPU: system.cpu: Removing instruction, [tid:0] [sn:1] PC (0x400084=>0x400088).(0=>1)
82500: O3CPU: system.cpu: Scheduling next tick!
83000: O3CPU: system.cpu:
which is the movz after the checkpoint. The final m5exit does not appear due to DerivO3CPU logging insanity.
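Since the O3CPU flag floods the trace, it can help to filter it down to the executed instructions before comparing the two runs. The following is only a sketch: the helper names exec_only and exec_tick_deltas are hypothetical and not part of this repository, and they assume the FmtFlag line format shown in the traces above, where each line starts with the tick followed by a colon.

```shell
# Hypothetical helpers, not part of the repository.
# Both read the files given as arguments, or stdin if there are none.

# Keep only the executed-instruction lines of a gem5 trace.
exec_only() {
  grep -F 'ExecEnable:' "$@"
}

# Print the tick distance between consecutive executed instructions.
exec_tick_deltas() {
  awk '/ExecEnable:/ {
    tick = $1
    sub(/:$/, "", tick)          # "500:" -> "500"
    if (seen) print tick - prev  # nothing to print for the first match
    prev = tick
    seen = 1
  }' "$@"
}

# Usage, with the trace path obtained as in the commands above:
# exec_only "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"
# exec_tick_deltas "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"
```

On the first trace above, exec_tick_deltas would print 500 four times, since the five instructions execute at ticks 0, 500, 1000, 1500 and 2000. On the restored trace, exec_only leaves just the movz line at tick 79000.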
Bibliography: