25. Gensim
Source at: https://github.com/gensim-project/gensim previously at: https://bitbucket.org/gensim/gensim
MIT-licensed binary translation simulator, so a bit like an MIT-licensed QEMU.
Video showing it boot Linux fast: https://www.youtube.com/watch?v=aZXx17oYumc
Its name is unfortunately completely overshadowed by unrelated software with the same name: https://radimrehurek.com/gensim/
TODO: advantages over QEMU. As the name implies, they seem to have a nice ISA description language. From a quick look at the internals, it seems to generate LLVM intermediate representation, which sounds good.
Build on Ubuntu 20.04:
git submodule update --init submodules/gensim
sudo apt install libantlr3c-dev
cd submodules/gensim
make
First fails with:
arm-none-eabi-gcc: error: unrecognized -march target: armv5
Let’s try just armv8, who cares about armv5!!!
mkdir build
cd build
cmake -DTESTING_ENABLED=FALSE -DCMAKE_BUILD_TYPE=DEBUGOPT ..
make -j`nproc` model-armv8
Now fails as mentioned at https://bitbucket.org/gensim/gensim/issues/34/build-fails-with-unrecognised-intrinsic:
terminate called after throwing an instance of 'std::logic_error'
  what():  Unrecognised intrinsic: __builtin_abs64
Aborted (core dumped)
Get the failing command with:
make VERBOSE=1 model-armv8
and we see some code generation step:
cd /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/models/armv8 && \
  /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/build/dist/bin/gensim \
  -a /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/models/armv8/aarch64.ac \
  -s module,arch,decode,disasm,ee_interp,ee_blockjit,jumpinfo,function,makefile \
  -o decode.GenerateDotGraph=1,makefile.libtrace_path=/home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/support/libtrace/inc,makefile.archsim_path=/home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/archsim/inc,makefile.llvm_path=,makefile.Optimise=2,makefile.Debug=1 \
  -t /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/build/models/armv8/output-aarch64/
We can see an inclusion path:
gensim/models/armv8/aarch64.ac:
  ac_isa("isa.ac");
gensim/models/armv8/isa.ac:
  ac_execute("execute.simd");
and where gensim/models/armv8/isa.ac contains __builtin_abs64 usages.
Rebuilding with -DCMAKE_BUILD_TYPE=DEBUG and running gensim under GDB shows that the error comes from a call to gci.GenerateExecuteBodyFor(body_str, *action);, so it looks like some cases are missing in the function SSAIntrinsicStatementWalker::EmitFixedCode of gensim/src/generators/GenCInterpreter/InterpreterNodeWalker.cpp, e.g. there should be one for __builtin_abs64.
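To make the suspected failure mode concrete, here is a minimal C++ sketch of a name-based intrinsic dispatcher that throws the same kind of std::logic_error when a name has no handler. This is not gensim's actual code: EmitterTable, make_table, emit_intrinsic, try_emit, the __example_intrinsic name and the emitted llabs(...) string are all hypothetical, chosen only to illustrate how one missing case produces the "Unrecognised intrinsic" abort.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical emitter: takes the argument expression as text and returns
// the C fragment that the generated interpreter would contain.
using Emitter = std::function<std::string(const std::string &)>;
using EmitterTable = std::map<std::string, Emitter>;

// Build a table of known intrinsics. with_abs64 toggles the handler that
// appears to be missing, so both the broken and the fixed case can be shown.
EmitterTable make_table(bool with_abs64) {
    EmitterTable table;
    // Made-up intrinsic name, standing in for the cases that do exist.
    table["__example_intrinsic"] = [](const std::string &arg) {
        return "example(" + arg + ")";
    };
    if (with_abs64) {
        // Assumed fix: emit a 64-bit absolute value call for the argument.
        table["__builtin_abs64"] = [](const std::string &arg) {
            return "llabs(" + arg + ")";
        };
    }
    return table;
}

// Look up the handler by name and fail loudly if there is none.
std::string emit_intrinsic(const EmitterTable &table,
                           const std::string &name,
                           const std::string &arg) {
    auto it = table.find(name);
    if (it == table.end()) {
        throw std::logic_error("Unrecognised intrinsic: " + name);
    }
    assert(static_cast<bool>(it->second));  // every entry must be callable
    return it->second(arg);
}

// Convenience wrapper: return the emitted code or the error text.
std::string try_emit(const EmitterTable &table,
                     const std::string &name,
                     const std::string &arg) {
    try {
        return emit_intrinsic(table, name, arg);
    } catch (const std::logic_error &e) {
        return std::string("error: ") + e.what();
    }
}
```

With make_table(false), try_emit(table, "__builtin_abs64", "x") returns the error text, and with make_table(true) it returns the emitted fragment. If the real EmitFixedCode is structured along these lines, the fix would be a single extra case, but that is a guess from the backtrace rather than something confirmed in the gensim sources.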
This is completely broken academic code! They must be using an out-of-tree version of part of the tool and forgot to commit it.