30.1.3.1. ARM Thumb encoding
Thumb examples are available at:
For both of them, we can check that we are in thumb from inside GDB with:
- disassemble, and observe that some of the instructions are only 2 bytes long instead of always 4 as in ARM
- print $cpsr & 0x20, which is 32 (0x20, the T bit of the CPSR) on thumb and 0 otherwise
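The CPSR check above is a mask on bit 5, the Thumb state (T) bit. As a minimal sketch of the same arithmetic outside GDB, assuming a hypothetical helper named is_thumb and made-up CPSR values that were not captured from a real run:

```python
T_BIT_MASK = 0x20  # bit 5 of the CPSR is the Thumb state (T) bit


def is_thumb(cpsr: int) -> bool:
    """Return True if the T bit is set in the given CPSR value."""
    return (cpsr & T_BIT_MASK) != 0


# Illustrative CPSR values only (assumption): they differ just in bit 5.
print(is_thumb(0x60000030))  # T bit set
print(is_thumb(0x60000010))  # T bit clear
print(0x60000030 & 0x20)     # the masked value itself is 32 when the bit is set
```

Inside GDB, print ($cpsr >> 5) & 1 would give a plain 0 or 1 instead of the masked value.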
You should contrast those examples with similar non-thumb ones of course.
We also note that thumbness of those sources is determined solely by the .thumb_func directive, which implies that there must be some metadata to allow the linker to decide how that code should be called:
- for the freestanding example, this is determined by the least significant bit of the entry point address in the ELF header, as mentioned at: https://stackoverflow.com/questions/20369440/can-start-be-the-thumb-function/20374451#20374451
We verify that with:
./run-toolchain --arch arm readelf -- -h "$(./getvar --arch arm userland_build_dir)/arch/arm/freestanding/linux/hello_thumb.out"
The Linux kernel must use that bit to decide whether to put the CPU in thumb mode: that could be done simply with a regular BX.
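The same check can be done without readelf. The following is only a sketch, assuming a 32-bit ELF file: entry_is_thumb and make_fake_header are hypothetical helper names, and the two entry addresses are invented for the demo rather than read from hello_thumb.out.

```python
import struct


def entry_is_thumb(elf_header: bytes) -> bool:
    """Return True if bit 0 of e_entry is set in a 32-bit ELF header."""
    if elf_header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf_header[4] != 1:
        raise ValueError("only 32-bit ELF (ELFCLASS32) is handled here")
    # e_ident[5] is EI_DATA: 1 = little endian, 2 = big endian.
    endian = "<" if elf_header[5] == 1 else ">"
    # In ELF32, e_entry is a 4-byte field at offset 24.
    (e_entry,) = struct.unpack_from(endian + "I", elf_header, 24)
    return (e_entry & 1) == 1


def make_fake_header(e_entry: int) -> bytes:
    """Build a synthetic little-endian ELF32 header prefix for the demo."""
    e_ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    # e_type = 2 (ET_EXEC), e_machine = 40 (EM_ARM), e_version = 1
    return e_ident + struct.pack("<HHII", 2, 40, 1, e_entry)


# Made-up entry addresses (assumption): one odd, one even.
print(entry_is_thumb(make_fake_header(0x10055)))
print(entry_is_thumb(make_fake_header(0x10054)))
```

For a real executable, the first 28 bytes of the file are enough, for example entry_is_thumb(open(path, "rb").read(28)).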
- on the non-freestanding one, the linker uses some ELF metadata to decide that main is thumb and jumps to it appropriately: https://reverseengineering.stackexchange.com/questions/6080/how-to-detect-thumb-mode-in-arm-disassembly
TODO details. Does the linker then resolve thumbness with address relocation? Doesn't this imply that the compiler cannot generate BL (never changes state) or BLX (always changes state) across object files, only BX (target state controlled by the lowest bit of the target address)?
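To make the three branch behaviours in that question concrete, here is a toy model. It is only a sketch: the function names are hypothetical, BLX means the immediate form (the register form of BLX selects the state from bit 0 like BX), and link register updates are ignored. It does not answer the TODO about what the linker does.

```python
from typing import Tuple


def bx(target: int) -> Tuple[int, bool]:
    """Model BX: bit 0 of the target selects Thumb and is cleared from the PC."""
    return target & ~1, bool(target & 1)


def bl(target: int, thumb_now: bool) -> Tuple[int, bool]:
    """Model BL with an immediate: the current state is kept."""
    return target, thumb_now


def blx_imm(target: int, thumb_now: bool) -> Tuple[int, bool]:
    """Model BLX with an immediate: the state always flips."""
    return target, not thumb_now


# Each call returns (new PC, in Thumb state?). Addresses are made up.
print(bx(0x10055))              # odd address: enters Thumb at 0x10054
print(bx(0x10054))              # even address: enters ARM at 0x10054
print(bl(0x10054, False))       # stays in ARM
print(blx_imm(0x10054, False))  # switches from ARM to Thumb
```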