24.6.4.1. gem5 fast forward
Besides switching CPUs after a checkpoint restore, fs.py also has the --fast-forward
option to automatically run the script from the start on a less detailed CPU, and switch to a more detailed CPU at a given tick.
This is generally useless compared to checkpoint restoring because:
- checkpoint restore lets you run multiple contents after the restore, and restore to multiple different system states, which you almost always want to do
- we generally don’t know the exact tick at which the region of interest will start, especially as the binaries change. It is much easier to just instrument the content with a checkpoint m5op
But let’s give it a try anyway with userland/freestanding/gem5_checkpoint.S, which was mentioned at gem5 checkpoint userland minimal example:
./run \
  --arch aarch64 \
  --emulator gem5 \
  --static \
  --trace ExecAll,FmtFlag,O3CPU,SimpleCPU \
  --userland userland/freestanding/gem5_checkpoint.S \
  -- \
  --caches --cpu-type DerivO3CPU \
  --fast-forward 1000 \
;
cat "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"
At gem5 2235168b72537535d74c645a70a85479801e0651 we see something like:
0: O3CPU: system.switch_cpus: Creating O3CPU object.
0: O3CPU: system.switch_cpus: Workload[0] process is 0
0: SimpleCPU: system.cpu: ActivateContext 0
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x40 WriteReq
...
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1f92 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e40 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e30 WriteReq
0: SimpleCPU: system.cpu: Tick
0: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
500: SimpleCPU: system.cpu: Tick
500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+4 : movz x1, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
1000: SimpleCPU: system.cpu: Tick
1000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+8 : m5checkpoint : IntAlu : flags=(IsInteger|IsNonSpeculative|IsUnverifiable)
1000: O3CPU: system.switch_cpus: [tid:0] Calling activate thread.
1000: O3CPU: system.switch_cpus: [tid:0] Adding to active threads list
1500: O3CPU: system.switch_cpus: FullO3CPU: Ticking main, FullO3CPU.
1500: O3CPU: system.switch_cpus: Scheduling next tick!
2000: O3CPU: system.switch_cpus: FullO3CPU: Ticking main, FullO3CPU.
2000: O3CPU: system.switch_cpus: Scheduling next tick!
2500: O3CPU: system.switch_cpus: ... FullO3CPU: Ticking main, FullO3CPU.
44500: ExecEnable: system.switch_cpus: A0 T0 : @asm_main_after_prologue+12 : movz x0, #0, #0 : IntAlu : D=0x00000000000
48000: O3CPU: system.switch_cpus: Removing committed instruction [tid:0] PC (0x400084=>0x400088).(0=>1) [sn:1]
48000: O3CPU: system.switch_cpus: Removing instruction, [tid:0] [sn:1] PC (0x400084=>0x400088).(0=>1)
48000: O3CPU: system.switch_cpus: Scheduling next tick!
48500: O3CPU: system.switch_cpus: ...
We can also compare that to the same log but without --fast-forward
and other CPU switch options:
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e40 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e30 WriteReq
0: SimpleCPU: system.cpu: Tick
0: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
500: SimpleCPU: system.cpu: Tick
500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+4 : movz x1, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
1000: SimpleCPU: system.cpu: Tick
1000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+8 : m5checkpoint : IntAlu : flags=(IsInteger|IsNonSpeculative|IsUnverifiable)
1000: SimpleCPU: system.cpu: Resume
1500: SimpleCPU: system.cpu: Tick
1500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+12 : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
2000: SimpleCPU: system.cpu: Tick
2000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+16 : m5exit : No_OpClass : flags=(IsInteger|IsNonSpeculative)
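Comparing the two logs by eye gets tedious once the O3CPU debug output is included, so here is a minimal Python sketch that keeps only the ExecEnable records and reports which CPU object executed each instruction and at which tick. The names ExecRecord and parse_exec_records are made up for this illustration, and the regular expression is only modeled on the trace lines shown in this section, so it may need adjusting for other trace flags or gem5 versions.

```python
import re
from typing import List, NamedTuple


class ExecRecord(NamedTuple):
    """One executed instruction as reported by an ExecEnable trace line."""
    tick: int
    cpu: str          # e.g. system.cpu or system.switch_cpus
    location: str     # symbol plus offset, e.g. asm_main_after_prologue+12
    disassembly: str  # e.g. movz x0, #0, #0


# Modeled on lines such as:
# 500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+4 : movz x1, #0, #0 : IntAlu : ...
EXEC_RE = re.compile(
    r"^\s*(?P<tick>\d+): ExecEnable: (?P<cpu>[\w.]+): "
    r"A\d+ T\d+ : @(?P<location>\S+)\s+: (?P<disassembly>.*?)\s+: "
)


def parse_exec_records(trace_text: str) -> List[ExecRecord]:
    """Return the ExecEnable records of a trace, ignoring every other line."""
    records = []
    for line in trace_text.splitlines():
        match = EXEC_RE.match(line)
        if match:
            records.append(ExecRecord(
                tick=int(match["tick"]),
                cpu=match["cpu"],
                location=match["location"],
                disassembly=match["disassembly"].strip(),
            ))
    return records


if __name__ == "__main__":
    import sys
    # Usage: python parse_exec.py path/to/trace.txt
    with open(sys.argv[1]) as trace_file:
        for record in parse_exec_records(trace_file.read()):
            print(record.tick, record.cpu, record.location, record.disassembly)
```

Running this on both trace files makes the difference easy to spot: in the logs above, the movz at asm_main_after_prologue+12 is reported by system.switch_cpus at tick 44500 with --fast-forward, and by system.cpu at tick 1500 without it.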
Therefore, it is clear that what we wanted happened:
- up until tick 1000, SimpleCPU was ticking
- after tick 1000, O3CPU started ticking
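The same conclusion can be checked mechanically rather than by reading the log. The sketch below records, for each CPU object, the first and last tick at which it printed a tick message. It assumes that a SimpleCPU line whose message is exactly "Tick" and an O3CPU line containing "Ticking main" mark a CPU tick, as in the log above. The function name cpu_tick_ranges is hypothetical.

```python
import re
from typing import Dict, Tuple

# Generic trace line: <tick>: <debug flag>: <object name>: <message>
LINE_RE = re.compile(r"^\s*(\d+): (\w+): ([\w.]+): (.*)$")


def cpu_tick_ranges(trace_text: str) -> Dict[str, Tuple[int, int]]:
    """Map each CPU object name to the (first, last) tick at which it ticked."""
    ranges: Dict[str, Tuple[int, int]] = {}
    for line in trace_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        tick, flag, obj, message = (
            int(match[1]), match[2], match[3], match[4].strip()
        )
        # Assumption: these two messages are what "ticking" looks like in
        # the SimpleCPU and O3CPU debug output shown in this section.
        simple_tick = flag == "SimpleCPU" and message == "Tick"
        o3_tick = flag == "O3CPU" and "Ticking main" in message
        if not (simple_tick or o3_tick):
            continue
        first, last = ranges.get(obj, (tick, tick))
        ranges[obj] = (min(first, tick), max(last, tick))
    return ranges


if __name__ == "__main__":
    import sys
    # Usage: python cpu_ticks.py path/to/trace.txt
    with open(sys.argv[1]) as trace_file:
        for obj, (first, last) in cpu_tick_ranges(trace_file.read()).items():
            print(f"{obj}: first tick {first}, last tick {last}")
```

For a clean switch we expect the last tick of system.cpu to be no later than the first tick of system.switch_cpus, which is what the --fast-forward log above shows (1000 and 1500 respectively).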
Bibliography: