38.12. Simultaneous runs

When long simulations sweep across multiple system parameters, running several of them in parallel becomes essential.

This is especially true for gem5, which runs much slower than QEMU and cannot use multiple host cores to speed up a single simulation: https://github.com/cirosantilli2/gem5-issues/issues/15. The only way to parallelize is therefore to run multiple instances at once.

This also combines well with Build variants.

First shell:

./run

Another shell:

./run --run-id 1

and now you have two QEMU instances running in parallel.

The default run id is 0.

Our scripts solve two difficulties with simultaneous runs:

  • port conflicts, e.g. GDB and gem5-shell

  • output directory conflicts, e.g. traces and gem5 stats overwriting one another

Each run gets a separate output directory. For example:

./run --arch aarch64 --emulator gem5 --run-id 0 &>/dev/null &
./run --arch aarch64 --emulator gem5 --run-id 1 &>/dev/null &

produces two separate m5out directories:

echo "$(./getvar --arch aarch64 --emulator gem5 --run-id 0 m5out_dir)"
echo "$(./getvar --arch aarch64 --emulator gem5 --run-id 1 m5out_dir)"

and the gem5 host executable stdout and stderr can be found at:

less "$(./getvar --arch aarch64 --emulator gem5 --run-id 0 termout_file)"
less "$(./getvar --arch aarch64 --emulator gem5 --run-id 1 termout_file)"

Each line is prefixed with the time at which it appeared, in seconds since the start of the program.
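For a parameter sweep, the per-run commands above can be wrapped in a loop. The following is a minimal sketch, not part of the scripts themselves: the launch_runs function and the RUN variable are hypothetical names introduced here for illustration, and only the ./run options already shown above are used.

```sh
# Sketch: start N gem5 runs in parallel, one run ID each, then wait for all.
# RUN is a hypothetical override hook (defaults to ./run); it is not an
# option of the scripts, it only makes the command easy to swap out.
RUN="${RUN:-./run}"

launch_runs() {
  # $1: number of runs to start; run IDs go from 0 to $1 - 1.
  i=0
  while [ "$i" -lt "$1" ]; do
    "$RUN" --arch aarch64 --emulator gem5 --run-id "$i" >/dev/null 2>&1 &
    i=$((i + 1))
  done
  # Block until every background run has exited.
  wait
}
```

Calling launch_runs 4 would start run IDs 0 to 3, each with its own m5out directory and termout file as described above.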

To get more meaningful output directory names for later inspection, you can use a non-numeric string for the run ID and indicate the port offset explicitly:

./run --arch aarch64 --emulator gem5 --run-id some-experiment --port-offset 1

--port-offset defaults to the run ID when that is a number.
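When several named experiments run at once, each one needs its own port offset. A minimal sketch under that assumption follows; launch_named_runs and RUN are hypothetical names used only for this illustration, and the experiment names are whatever you pass in.

```sh
# Sketch: start one gem5 run per experiment name, each with a distinct
# explicit port offset, then wait for all of them.
# RUN is a hypothetical override hook (defaults to ./run).
RUN="${RUN:-./run}"

launch_named_runs() {
  # Start at 1 so that offset 0 stays free for a default ./run.
  offset=1
  for name in "$@"; do
    "$RUN" --arch aarch64 --emulator gem5 \
      --run-id "$name" --port-offset "$offset" >/dev/null 2>&1 &
    offset=$((offset + 1))
  done
  wait
}
```

For example, launch_named_runs small-cache big-cache would use port offsets 1 and 2 respectively.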

As with the CPU architecture, you need to pass the --run-id option to any script that needs to know runtime information, e.g. GDB step debug:

./run --run-id 1
./run-gdb --run-id 1

To run multiple gem5 checkouts, see: Section 38.13.3.1, “gem5 worktree”.

Implementation note: we create multiple namespaces for two things:

  • run output directory

  • ports

    • QEMU allows setting all ports explicitly.

      If a port is not free, QEMU just crashes.

      We assign a contiguous port range for each run ID.

    • gem5 automatically increments ports until it finds a free one.

      gem5 60600f09c25255b3c8f72da7fb49100e2682093a does not seem to expose a way to set the terminal and VNC ports from fs.py, so we just let gem5 assign the ports itself, and use --run-id only to match what it assigned. Both of those ports appear in the gem5 config.ini.

      The GDB port can be set with the gem5.opt --remote-gdb-port option, but it does not appear in config.ini.
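For the QEMU case above, assigning a contiguous port range per run ID comes down to simple arithmetic. The sketch below only illustrates the idea: the base port, the range width, and the port_for helper are hypothetical and are not taken from the scripts.

```sh
# Sketch of a contiguous port range per run.
# Both values below are made up for illustration.
BASE_PORT=40000     # first port of the range for port offset 0
PORTS_PER_RUN=10    # how many ports each run reserves

# Print the port for service slot $2 (0, 1, 2, ...) of the run whose
# port offset is $1.
port_for() {
  echo $((BASE_PORT + $1 * PORTS_PER_RUN + $2))
}

port_for 0 1   # → 40001
port_for 1 1   # → 40011
```

Because the ranges do not overlap as long as the slot number stays below PORTS_PER_RUN, two runs with different port offsets never ask for the same port.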