24.1. gem5 vs QEMU
-
advantages of gem5:
-
simulates a generic, more realistic CPU, optionally pipelined and out-of-order, cycle by cycle, including a realistic DRAM memory access model with latencies, caches and page table manipulations. This allows us to:
-
do much more realistic performance benchmarking, which makes absolutely no sense in QEMU, since QEMU is purely functional
-
make certain functional observations that are not possible in QEMU, e.g.:
-
use Linux kernel APIs that flush cache memory, like those needed for DMA, which are crucial for driver development. In QEMU, the driver would still work even if we forgot to flush caches.
-
spectre / meltdown:
-
It is not of course truly cycle accurate, as that:
-
would require exposing proprietary information of the CPU designs: https://stackoverflow.com/questions/17454955/can-you-check-performance-of-a-program-running-with-qemu-simulator/33580850#33580850
-
would make the simulation even slower (TODO confirm by how much)
but the approximation is reasonable.
It is used mostly for microarchitecture research purposes: when you are making a new chip technology, you don’t really need to specialize enormously for an existing microarchitecture, but rather to develop something that will work with a wide range of future architectures.
-
-
runs are deterministic by default, unlike QEMU, which needs a special record and replay mode that requires first recording a run once and then replaying it
-
gem5, at least on ARM, appears to implement more low level CPU functionality than QEMU, e.g. QEMU only added EL2 in 2018: https://stackoverflow.com/questions/42824706/qemu-system-aarch64-entering-el1-when-emulating-a53-power-up. See also: Section 33.10.1, “ARM exception levels”
-
gem5 offers more advanced logging, even for non-microarchitectural things that QEMU also models in some way, e.g. memory access tracing, see: QEMU trace memory accesses. This is because QEMU’s binary translation optimizations reduce visibility.
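The determinism claim above can be checked by running the same gem5 command twice with different output directories and comparing the two stats.txt files. Here is a minimal Python sketch of that comparison. It assumes the usual stats.txt layout of one "name value # description" entry per line, and it assumes that stats whose name starts with "host" (e.g. host_seconds, or hostSeconds in newer gem5 versions) describe the host machine and are therefore expected to differ. The function names parse_stats and diff_stats are made up for this example, they are not part of gem5.

```python
import re

# Assumption: stats whose name starts with "host" measure the host machine
# (wall clock time, memory usage, ...) and legitimately differ between runs.
HOST_STAT = re.compile(r"^host", re.IGNORECASE)


def parse_stats(text):
    """Map stat name -> value string for every non-host line of a stats dump.

    If the text contains several dumps, later values overwrite earlier ones.
    """
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing description
        parts = line.split()
        if len(parts) < 2 or line.startswith("-"):
            continue  # blank lines and "---------- Begin ..." markers
        name = parts[0]
        if HOST_STAT.match(name):
            continue
        # Keep every value column, since distribution stats have several.
        stats[name] = " ".join(parts[1:])
    return stats


def diff_stats(text_a, text_b):
    """Return {name: (value_a, value_b)} for stats that differ between runs."""
    a, b = parse_stats(text_a), parse_stats(text_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    import sys

    # Usage: python diff_stats.py run1/stats.txt run2/stats.txt
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
            differences = diff_stats(f1.read(), f2.read())
        for name, (va, vb) in differences.items():
            print(f"{name}: {va} != {vb}")
        sys.exit(1 if differences else 0)
```

If gem5 is deterministic for the given workload, the script prints nothing and exits with status 0. Any simulated stat that shows up in the output would be worth investigating.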
-
-
disadvantages of gem5:
-
slower than QEMU, see: Section 35.2.1, “Benchmark Linux kernel boot”
This implies that the user base is much smaller, since there are no Android devs.
Instead, we have only chip makers, who keep everything that really works closed, and researchers, who can’t version track or document code properly >:-) And this implies that:
-
the documentation is more scarce
-
it takes longer to support new hardware features
Well, not that AOSP is that much better anyway.
-
-
not sure: gem5 has a BSD license while QEMU is GPL
This suits chip makers that want to distribute forks with secret IP to their customers.
On the other hand, the chip makers tend to upstream less, and the project becomes more crappy on average :-)
-
gem5 is way more complex and harder to modify and maintain
The only hairy thing in QEMU is the binary code generation.
gem5 however has tended towards horrendously intensive code generation in order to support all its different hardware types.
gem5 also has a complex Python interface which is also largely auto-generated, which greatly increases the maintenance complexity of the project: Embedding Python in another application.
This is done so that reconfiguring platforms can be done quickly without recompiling, and it is amazing when it works, but the maintenance costs are also very high. For example, the pybind11 compilation of several trivial param_ files accounted for 50% of the build time at one point: pybind11 accounts for 50% of gem5 build time.
All of this also makes it hard to set up an IDE for developing gem5: gem5 Eclipse configuration
The feelings of helplessness this brings are well summarized by the following CSDN article https://blog.csdn.net/maokelong95/article/details/85333905:
Found DPRINTF based debugging unable to meet your needs?
Found GDB based debugging unfriendly to human beings?
Want to debug gem5 source with the help of modern IDEs like Eclipse?
Failed in getting help from GEM5 community?
Come on, dude! Here is the up-to-date tutorial for you!
Just be ready for THE ENDLESS NIGHTMARE gem5 will bring!
-