35.2.3.3.2. Benchmark gem5 single file change rebuild time
This is the critical development parameter, and is dominated by the link time of huge binaries.
To benchmark it better, make a comment-only change to:
vim submodules/gem5/src/sim/main.cc
then rebuild with:
./build-gem5 --arch aarch64 --verbose
and then copy the link command to a separate Bash file. Then you can time and modify it easily.
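The timing step can be sketched as a small Bash helper. This is only a minimal sketch: the `bench` function and the `link.sh` file name are hypothetical, with `link.sh` standing for the file into which you pasted the link command. It relies on the Bash `time` keyword and the `TIMEFORMAT` variable to print wall-clock seconds per run.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of LKMC: run a command several times and
# print the wall-clock time of each run. The first runs warm up the disk
# cache, so the later runs are the ones worth comparing.

# Make the Bash `time` keyword print only the elapsed real time in seconds.
TIMEFORMAT='%R'

bench() {
  local runs="$1"
  shift
  local i=1
  local elapsed
  while [ "$i" -le "$runs" ]; do
    # `time` reports on stderr of the surrounding group, so capture that.
    # The output of the command itself is discarded.
    if elapsed="$( { time "$@" >/dev/null 2>&1; } 2>&1 )"; then
      echo "run $i: ${elapsed}s"
    else
      echo "run $i: ${elapsed}s (command failed)"
    fi
    i=$((i + 1))
  done
}

# Assumption: link.sh holds the link command copied from the verbose build.
if [ -f link.sh ]; then
  bench 3 bash link.sh
fi
```

Because the command output is discarded, run the link command once by hand first to confirm that it succeeds before trusting the numbers.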
Some approximate reference values on a 2017 Lenovo ThinkPad P51, LKMC d4b3e064adeeace3c3e7d106801f95c14637c12f + 1 (doing multiple runs to warm up disk caches):
- opt
  - unmodified: 10 seconds
  - LDFLAGS_EXTRA=-fuse-ld=gold: 6 seconds. Huge improvement! Note that in general you have to do a full rebuild or else the link may fail: https://sourceware.org/bugzilla/show_bug.cgi?id=23869 More info on gold:
- debug
  - unmodified: 14 seconds. Why so much slower than unmodified opt?
  - -fuse-ld=gold: fails with "internal error in read_cie, at ../../gold/ehframe.cc:919" on Ubuntu 18.04 with all GCC versions. https://sourceware.org/bugzilla/show_bug.cgi?id=23869
- fast
  - --force-lto: 1 minute. Slower as expected, since more optimizations are done at link time. --force-lto is only used for fast, and it adds -flto to the build.
- opt LDFLAGS_EXTRA=-s: stripping the executable greatly reduces link time, but you get no symbols
Using ramfs made no difference: the kernel must already be caching the files in memory very efficiently.
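Before trying a -fuse-ld= option, it can help to check that the corresponding linker binary is installed at all. The sketch below assumes the usual binary names that the compiler driver looks for (`ld.bfd`, `ld.gold`, `ld.lld`). The function names are hypothetical, and finding a binary on the PATH does not guarantee that a given GCC version accepts that -fuse-ld= value.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of LKMC: report which alternative linkers
# are present on the PATH.

# Succeeds if a binary named ld.<name> can be found.
have_linker() {
  command -v "ld.$1" >/dev/null 2>&1
}

report_linkers() {
  local name
  for name in bfd gold lld; do
    if have_linker "$name"; then
      echo "ld.$name: found at $(command -v "ld.$name")"
    else
      echo "ld.$name: not found"
    fi
  done
}

report_linkers
```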
In addition to the link time, scons startup time can also be considerable:
On LKMC 220c3a434499e4713664d4a47c246cb81ee0a06a gem5 63e96992568d8a8a0dccac477b8b7f1370ac7e98 (Sep 2020):
- opt
  - default link: 18.32user 3.99system 0:22.33elapsed 99%CPU (0avgtext+0avgdata 4622908maxresident)k
  - LDFLAGS_EXTRA=-fuse-ld=lld (after a build with default linker): 6.74user 1.81system 0:03.85elapsed 222%CPU (0avgtext+0avgdata 7025292maxresident)k
  - LDFLAGS_EXTRA=-fuse-ld=gold: 7.70user 1.36system 0:09.44elapsed 95%CPU (0avgtext+0avgdata 5959152maxresident)k
  - LDFLAGS_EXTRA=-fuse-ld=gold -Wl,--threads -Wl,--thread-count=8: 9.66user 1.86system 0:04.62elapsed 249%CPU (0avgtext+0avgdata 5989916maxresident)k
    Arghhh, it does not use multiple threads by default… https://stackoverflow.com/questions/5142753/can-gcc-use-multiple-cores-when-linking/42302047#42302047
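To compare such measurements, the elapsed field of the time output can be converted to seconds and divided. The following is a rough sketch with hypothetical function names (`elapsed_seconds`, `speedup`). It assumes the default GNU time output format shown above, where the elapsed field looks like `0:22.33elapsed` or `h:mm:ss.ssoelapsed` without the stray letter, that is, colon-separated numbers followed by the word `elapsed`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of LKMC.

# Print the elapsed wall-clock time, in seconds, from one line of
# default GNU time output such as:
#   18.32user 3.99system 0:22.33elapsed 99%CPU ...
elapsed_seconds() {
  echo "$1" | awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /elapsed$/) {
        sub(/elapsed$/, "", $i)
        # Fields are [hours:]minutes:seconds, so fold them left to right.
        n = split($i, t, ":")
        s = 0
        for (j = 1; j <= n; j++) s = s * 60 + t[j]
        printf "%.2f\n", s
      }
    }
  }'
}

# Print how many times faster $2 seconds is than $1 seconds.
speedup() {
  awk -v base="$1" -v other="$2" 'BEGIN { printf "%.2f\n", base / other }'
}

# Values copied from the measurements above.
base="$(elapsed_seconds '18.32user 3.99system 0:22.33elapsed 99%CPU')"
lld="$(elapsed_seconds '6.74user 1.81system 0:03.85elapsed 222%CPU')"
echo "lld speedup over default link: $(speedup "$base" "$lld")x"
```

With the numbers above, the last line reports a speedup of 5.80x for lld over the default linker.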