24.6.4.1. gem5 fast forward
Besides switching CPUs after a checkpoint restore, fs.py also has the --fast-forward option to automatically run the script from the start on a less detailed CPU, and switch to a more detailed CPU at a given tick.
This is generally useless compared to checkpoint restoring because:
- checkpoint restore allows running multiple contents after the restore, and restoring to multiple different system states, which you almost always want to do
- we generally don’t know the exact tick at which the region of interest will start, especially as the binaries change. It is much easier to just instrument the content with a checkpoint m5op
But let’s give it a try anyway with userland/freestanding/gem5_checkpoint.S, which was mentioned at gem5 checkpoint userland minimal example:
./run \
  --arch aarch64 \
  --emulator gem5 \
  --static \
  --trace ExecAll,FmtFlag,O3CPU,SimpleCPU \
  --userland userland/freestanding/gem5_checkpoint.S \
  -- \
  --caches --cpu-type DerivO3CPU \
  --fast-forward 1000 \
;
cat "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"
At gem5 2235168b72537535d74c645a70a85479801e0651 we see something like:
0: O3CPU: system.switch_cpus: Creating O3CPU object.
0: O3CPU: system.switch_cpus: Workload[0] process is 0
0: SimpleCPU: system.cpu: ActivateContext 0
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x40 WriteReq
...
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1f92 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e40 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e30 WriteReq
0: SimpleCPU: system.cpu: Tick
0: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
500: SimpleCPU: system.cpu: Tick
500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+4 : movz x1, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
1000: SimpleCPU: system.cpu: Tick
1000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+8 : m5checkpoint : IntAlu : flags=(IsInteger|IsNonSpeculative|IsUnverifiable)
1000: O3CPU: system.switch_cpus: [tid:0] Calling activate thread.
1000: O3CPU: system.switch_cpus: [tid:0] Adding to active threads list
1500: O3CPU: system.switch_cpus:
FullO3CPU: Ticking main, FullO3CPU.
1500: O3CPU: system.switch_cpus: Scheduling next tick!
2000: O3CPU: system.switch_cpus:
FullO3CPU: Ticking main, FullO3CPU.
2000: O3CPU: system.switch_cpus: Scheduling next tick!
2500: O3CPU: system.switch_cpus:
...
FullO3CPU: Ticking main, FullO3CPU.
44500: ExecEnable: system.switch_cpus: A0 T0 : @asm_main_after_prologue+12 : movz x0, #0, #0 : IntAlu : D=0x00000000000
48000: O3CPU: system.switch_cpus: Removing committed instruction [tid:0] PC (0x400084=>0x400088).(0=>1) [sn:1]
48000: O3CPU: system.switch_cpus: Removing instruction, [tid:0] [sn:1] PC (0x400084=>0x400088).(0=>1)
48000: O3CPU: system.switch_cpus: Scheduling next tick!
48500: O3CPU: system.switch_cpus:
...
We can also compare that to the same log but without --fast-forward and other CPU switch options:
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e40 WriteReq
0: SimpleCPU: system.cpu.dcache_port: received snoop pkt for addr:0x1e30 WriteReq
0: SimpleCPU: system.cpu: Tick
0: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
500: SimpleCPU: system.cpu: Tick
500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+4 : movz x1, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
1000: SimpleCPU: system.cpu: Tick
1000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+8 : m5checkpoint : IntAlu : flags=(IsInteger|IsNonSpeculative|IsUnverifiable)
1000: SimpleCPU: system.cpu: Resume
1500: SimpleCPU: system.cpu: Tick
1500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+12 : movz x0, #0, #0 : IntAlu : D=0x0000000000000000 flags=(IsInteger)
2000: SimpleCPU: system.cpu: Tick
2000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+16 : m5exit : No_OpClass : flags=(IsInteger|IsNonSpeculative)
Therefore, it is clear that what we wanted happened:
- up until tick 1000, SimpleCPU was ticking
- after tick 1000, O3CPU started ticking
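Rather than reading the trace by eye, the same conclusion can be checked with a short script. The following is a minimal sketch, not part of the repository: the helper name `tick_ranges` is hypothetical, and the sample lines are shortened copies of lines from the log above. It assumes the trace was produced with FmtFlag, so that every line has the form `tick: Flag: sim_object: message`.

```python
import re

# Matches gem5 trace lines produced with FmtFlag, for example:
#   500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+4 : movz ...
# Continuation lines without a leading tick (e.g. "FullO3CPU: Ticking main")
# do not match and are skipped.
TRACE_RE = re.compile(r"^\s*(\d+): (\w+): ([\w.]+):")


def tick_ranges(lines, flag="ExecEnable"):
    """Return {sim_object: (first_tick, last_tick)} for lines with `flag`."""
    ranges = {}
    for line in lines:
        m = TRACE_RE.match(line)
        if m is None or m.group(2) != flag:
            continue
        tick, obj = int(m.group(1)), m.group(3)
        first, _ = ranges.get(obj, (tick, tick))
        ranges[obj] = (first, tick)
    return ranges


# Shortened sample taken from the --fast-forward log above.
sample = [
    "0: SimpleCPU: system.cpu: Tick",
    "0: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue : movz x0, #0, #0",
    "500: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+4 : movz x1, #0, #0",
    "1000: ExecEnable: system.cpu: A0 T0 : @asm_main_after_prologue+8 : m5checkpoint",
    "1000: O3CPU: system.switch_cpus: [tid:0] Calling activate thread.",
    "FullO3CPU: Ticking main, FullO3CPU.",
    "44500: ExecEnable: system.switch_cpus: A0 T0 : @asm_main_after_prologue+12 : movz x0, #0, #0",
]

for obj, (first, last) in tick_ranges(sample).items():
    print(f"{obj}: executed instructions from tick {first} to tick {last}")
```

On a real run, the lines of the file reported by `./getvar --arch aarch64 --emulator gem5 trace_txt_file` could be passed in place of `sample`. If `system.cpu` stops executing at the tick where `system.switch_cpus` is activated, the switch happened as intended.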
Bibliography: