24.6.4. gem5 restore checkpoint with a different CPU
gem5 can switch to a different CPU model when restoring a checkpoint.
A common combo is to boot Linux with a fast CPU, make a checkpoint and then replay the benchmark of interest with a slower CPU.
This can be observed interactively in full system with:
./run --arch aarch64 --emulator gem5
Then in the guest terminal after boot ends:
sh -c 'm5 checkpoint;sh'
m5 exit
Then restore the checkpoint with a different, slower CPU:
./run --arch aarch64 --emulator gem5 --gem5-restore 1 -- --caches --cpu-type=DerivO3CPU
And now you will notice that everything happens much more slowly in the guest terminal!
An even more direct and minimal way to observe this is with userland/freestanding/gem5_checkpoint.S, which was mentioned at "gem5 checkpoint userland minimal example", plus some logging:
./run \
  --arch aarch64 \
  --emulator gem5 \
  --static \
  --trace ExecAll,FmtFlag,O3CPU,SimpleCPU \
  --userland userland/freestanding/gem5_checkpoint.S \
;
cat "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"

./run \
  --arch aarch64 \
  --emulator gem5 \
  --gem5-restore 1 \
  --static \
  --trace ExecAll,FmtFlag,O3CPU,SimpleCPU \
  --userland userland/freestanding/gem5_checkpoint.S \
  -- \
  --caches \
  --cpu-type DerivO3CPU \
  --restore-with-cpu DerivO3CPU \
;
cat "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"
At gem5 2235168b72537535d74c645a70a85479801e0651, the first run does everything in AtomicSimpleCPU:
...
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1f92 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e40 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e30 WriteReq
0: SimpleCPU: system.cpu: Tick
0: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
500: SimpleCPU: system.cpu: Tick
500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+4 : movz x1, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
1000: SimpleCPU: system.cpu: Tick
1000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+8 : m5checkpoint : IntAlu : flags=(IsInteger|IsNonSpeculative|IsUnverifiable)
1000: SimpleCPU: system.cpu: Resume
1500: SimpleCPU: system.cpu: Tick
1500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+12 : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
2000: SimpleCPU: system.cpu: Tick
2000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+16 : m5exit : No_OpClass : flags=(IsInteger|IsNonSpeculative)
and after restore we see, as expected, a single ExecEnable instruction executed amidst O3CPU noise:
FullO3CPU: Ticking main, FullO3CPU.
79000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+12 : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 FetchSeq=1 CPSeq=1 flags=(IsInteger)
82500: O3CPU: system.cpu: Removing committed instruction [tid:0] PC (0x400084=>0x400088).(0=>1) [sn:1]
82500: O3CPU: system.cpu: Removing instruction, [tid:0] [sn:1] PC (0x400084=>0x400088).(0=>1)
82500: O3CPU: system.cpu: Scheduling next tick!
83000: O3CPU: system.cpu:
which is the movz after the checkpoint. The final m5exit does not appear due to DerivO3CPU logging insanity.
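To cut through the O3CPU noise, one option is to keep only the ExecEnable lines of the trace. The following is a minimal sketch: exec_only is a hypothetical helper name that is not part of this repository, and the sample file only contains lines copied from the excerpt above, so that the filter can be tried without running gem5:

```shell
# Hypothetical helper (not part of this repository):
# print only the executed-instruction lines of a gem5 trace file.
# The ": ExecEnable: " marker is taken from the trace excerpts above.
exec_only() {
  grep -F ': ExecEnable: ' "$1"
}

# Sample trace made of lines copied from the post-restore excerpt above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
79000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+12 : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 FetchSeq=1 CPSeq=1 flags=(IsInteger)
82500: O3CPU: system.cpu: Removing committed instruction [tid:0] PC (0x400084=>0x400088).(0=>1) [sn:1]
82500: O3CPU: system.cpu: Removing instruction, [tid:0] [sn:1] PC (0x400084=>0x400088).(0=>1)
82500: O3CPU: system.cpu: Scheduling next tick!
EOF

# Only the movz line should be printed.
exec_only "$sample"
```

On a real run, the same helper could presumably be pointed at the trace file with exec_only "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)".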
Bibliography: