Holdec Version 2.0

After more than 8 years a new version of the holistic decompiler (holdec) is available. A lot has changed during this time: 64-bit became the standard for PCs, AArch64 (aka arm64) became widespread in mobile devices and, with the M1 chip from Apple, is making the jump to the desktop, and last but not least the NSA open-sourced their decompiler Ghidra. The development of holdec also didn’t stop during this time. A more detailed (but still incomplete) list of changes is in the Changelog. I want to go over the biggest changes here in more detail.

Link to the download page.

Support for x64

A lot of work went into another rewrite of the x86 disassembler to support x64. Hopefully this third version will last a bit longer. Of course the new registers and longer integer values had to be added.

Support for AArch64/A64

To extend the existing 2.5 architectures (M68k and x86/x64) with another one I had to decide which one to add. I chose AArch64/A64 since the majority of mobile devices used it at the time. In retrospect I think it is a nice ISA since it is easy to understand and decode and has SIMD built in rather than added as an extension like x86 or RISC-V.

Of course the support starts with a disassembler. Since I prefer a pure Java solution and none was available I “had” to write a new disassembler. This time I open-sourced it (jarmo). I used objdump from GNU binutils as a reference implementation. And of course every complex piece of software has bugs. In general it was a nice experience getting to know such a modern ISA.

Support for x87

While AArch64/A64 is pretty new stuff, the x87 FPU instructions (stack based, all starting with an “f”) are quite old. Still, they are used in older binaries. I executed my usual cycle of: looking at the problem, creating a test program, running other decompilers on the test program, filing bugs for the other decompilers, implementing the feature in holdec.

One thing I struggled with quite often was how to model each instruction. I wrote more about this issue in a separate post.

The decompiler doesn’t perform the usual arithmetic transformations like “x+x => 2x”. There is however a new option “-f/--eval-float-ops” which performs floating point evaluation of concrete values. This includes basic arithmetic operators but also functions like sin() or sqrt().
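To illustrate what evaluating only concrete values means, here is a minimal sketch. It is not holdec's code: the expression classes (Const, Var, Op) and the fold method are hypothetical names invented for this example. An operation is replaced by its result only when all of its arguments are constants. Anything involving a symbolic value, such as x+x, is left alone.

```java
final class FloatFoldSketch {
    // Hypothetical expression model; holdec's real data structures are not shown in this post.
    interface Expr {}

    static final class Const implements Expr {
        final double value;
        Const(double value) { this.value = value; }
    }

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class Op implements Expr {
        final String name;
        final Expr[] args;
        Op(String name, Expr... args) { this.name = name; this.args = args; }
    }

    /** Folds an operation only when every argument is a concrete value. */
    static Expr fold(Expr expr) {
        if (!(expr instanceof Op)) return expr;
        Op op = (Op) expr;
        Expr[] args = new Expr[op.args.length];
        boolean concrete = true;
        for (int i = 0; i < args.length; i++) {
            args[i] = fold(op.args[i]);          // fold the operands first
            concrete &= args[i] instanceof Const;
        }
        if (concrete && args.length == 1) {
            double a = ((Const) args[0]).value;
            switch (op.name) {
                case "sqrt": return new Const(Math.sqrt(a));
                case "sin":  return new Const(Math.sin(a));
                default: break;
            }
        }
        if (concrete && args.length == 2) {
            double a = ((Const) args[0]).value;
            double b = ((Const) args[1]).value;
            switch (op.name) {
                case "+": return new Const(a + b);
                case "-": return new Const(a - b);
                case "*": return new Const(a * b);
                case "/": return new Const(a / b);
                default: break;
            }
        }
        // Symbolic or unknown: keep the operation, no algebraic rewriting.
        return new Op(op.name, args);
    }
}
```

Folding sqrt(16.0) + x with this sketch turns the sqrt call into a constant and leaves the addition in place. Note that a Java double has less precision than the 80-bit x87 format, so a real implementation has to decide how to deal with that difference.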

Support for large binaries

Inspired by a blog post about obfuscation techniques used by Snapchat I started looking into this to see how holdec performs. Well, it turns out that the binary is 133 MB and contains 89 MB of code. Since it is AArch64/A64 code, where every instruction is 4 bytes long, this means about 22 million instructions. The disassembler used to perform a linear search through all the possible instructions to decode one. This didn’t scale, and the code now uses a binary tree based approach.
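The difference between the two lookup strategies can be sketched as follows. This is not the actual decoder: the class and method names are made up for this example, and the table holds only four A64 encodings (nop, ret, b, bl) instead of the full instruction set. Each inner node of the tree tests a single bit of the instruction word, so a lookup follows one path down instead of comparing against every pattern.

```java
import java.util.ArrayList;
import java.util.List;

final class DecodeTreeSketch {
    /** One encoding: an instruction word matches when (word & mask) == value. */
    static final class Pattern {
        final int mask, value;
        final String mnemonic;
        Pattern(int mask, int value, String mnemonic) {
            this.mask = mask; this.value = value; this.mnemonic = mnemonic;
        }
        boolean matches(int word) { return (word & mask) == value; }
    }

    /** Inner nodes test one bit; leaves keep the few remaining candidates. */
    static final class Node {
        int bit = -1;                         // -1 marks a leaf
        Node zero, one;
        List<Pattern> candidates = List.of();
    }

    // Tiny illustrative table; a real decoder has many more entries.
    static final List<Pattern> TABLE = List.of(
        new Pattern(0xFFFFFFFF, 0xD503201F, "nop"),
        new Pattern(0xFFFFFC1F, 0xD65F0000, "ret"),
        new Pattern(0xFC000000, 0x14000000, "b"),
        new Pattern(0xFC000000, 0x94000000, "bl"));

    /** The old approach: try every pattern in turn. */
    static String decodeLinear(int word) {
        for (Pattern p : TABLE) if (p.matches(word)) return p.mnemonic;
        return null;
    }

    /** Splits on the highest bit that is fixed in all patterns and separates them. */
    static Node build(List<Pattern> patterns) {
        Node node = new Node();
        for (int bit = 31; bit >= 0 && patterns.size() > 1; bit--) {
            int probe = 1 << bit;
            List<Pattern> zeros = new ArrayList<>(), ones = new ArrayList<>();
            boolean fixedInAll = true;
            for (Pattern p : patterns) {
                if ((p.mask & probe) == 0) { fixedInAll = false; break; }
                ((p.value & probe) == 0 ? zeros : ones).add(p);
            }
            if (fixedInAll && !zeros.isEmpty() && !ones.isEmpty()) {
                node.bit = bit;
                node.zero = build(zeros);
                node.one = build(ones);
                return node;
            }
        }
        node.candidates = patterns;           // no separating bit left: leaf
        return node;
    }

    /** Walks one path through the tree, then verifies the full mask at the leaf. */
    static String decodeTree(Node root, int word) {
        Node node = root;
        while (node.bit >= 0) node = ((word >>> node.bit) & 1) == 0 ? node.zero : node.one;
        for (Pattern p : node.candidates) if (p.matches(word)) return p.mnemonic;
        return null;
    }
}
```

The tree is built once with build(TABLE) and then reused for every instruction word. The final check at the leaf is still needed because the path only tested some of the bits of the mask.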

Loading all these 22 million instructions in the global phase required more heap memory than my computer has. Since binaries will in general only grow larger with time, I rewrote the global phase to use a streaming approach: it disassembles only a few instructions at a time and builds the supporting data structures in memory that are needed to write the method cache.

The second surprise was that holdec identified 1 million functions in the binary. Previously the decompiler wrote one file for each identified function containing all the lines of this function, and all the files were in one directory. Not surprisingly the filesystem doesn’t like a directory with 1 million files. The next step was to use the standard approach of nested directories. This was successful but took a long time and used 11 GB on the hard disk. Quite a waste (of development effort, execution time and disk space). Another iteration replaced this with one simple text file which just contains, for each function, text offsets into the global text file of disassembled instructions. With these changes the global phase runs for 20 minutes for the Snapchat binary I use.
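The idea of one listing file plus one small offset file can be sketched like this. The file layout is an assumption made for the example: each index line has the form "name offset length", with byte offsets into the listing and function names that contain no spaces. The class and method names are hypothetical too; holdec's real format is not described here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

final class FunctionOffsetIndexSketch {
    /** Writes all function texts into one listing plus "name offset length" index lines. */
    static void write(Path listing, Path index, Map<String, String> functions) {
        StringBuilder indexLines = new StringBuilder();
        long offset = 0;
        try (OutputStream out = Files.newOutputStream(listing)) {
            for (Map.Entry<String, String> entry : functions.entrySet()) {
                byte[] text = entry.getValue().getBytes(StandardCharsets.UTF_8);
                out.write(text);
                indexLines.append(entry.getKey()).append(' ').append(offset)
                          .append(' ').append(text.length).append('\n');
                offset += text.length;        // next function starts right after this one
            }
            Files.write(index, indexLines.toString().getBytes(StandardCharsets.UTF_8));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Looks up one function in the index and reads only its byte range from the listing. */
    static String read(Path listing, Path index, String name) {
        try (RandomAccessFile file = new RandomAccessFile(listing.toFile(), "r")) {
            for (String line : Files.readAllLines(index, StandardCharsets.UTF_8)) {
                String[] parts = line.split(" ");
                if (parts.length == 3 && parts[0].equals(name)) {
                    byte[] buffer = new byte[Integer.parseInt(parts[2])];
                    file.seek(Long.parseLong(parts[1]));
                    file.readFully(buffer);
                    return new String(buffer, StandardCharsets.UTF_8);
                }
            }
            return null;                      // function not in the index
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

With this layout there are only two files no matter how many functions exist, and reading one function costs a seek plus one read of exactly its bytes. For 1 million functions a real tool would probably keep the index in memory or search it instead of scanning it line by line as the sketch does.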

So while you can now decompile any subset of the 1 million functions, you have to choose which ones to focus on. During the global phase certain measurements are stored for every function. These include number of blocks, number of instructions, number of jumps (conditional, unconditional, and indexed), number of call instructions, McCabe complexity, number of backward jumps (indicating loops), number of other functions called and number of other functions calling this one. Currently you can only see this information with the new CLI command “-c printTopMethodCache” which prints the Top-10 functions for the various metrics. More support is required here.
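Two of these metrics can be computed from the control flow graph of a function, roughly as in the sketch below. It makes several assumptions that are not taken from holdec: the graph is given as a map from the start address of each block to the start addresses of its successors, every block (including exit blocks with no successors) appears as a key, the graph is connected, and McCabe complexity is the textbook formula E - N + 2. All names are hypothetical.

```java
import java.util.List;
import java.util.Map;

final class FunctionMetricsSketch {
    /** Textbook McCabe complexity E - N + 2 for one connected control flow graph. */
    static int mcCabe(Map<Long, List<Long>> successors) {
        int nodes = successors.size();
        int edges = 0;
        for (List<Long> targets : successors.values()) edges += targets.size();
        return edges - nodes + 2;
    }

    /** Counts edges whose target block starts at or before the source block. */
    static int backwardJumps(Map<Long, List<Long>> successors) {
        int count = 0;
        for (Map.Entry<Long, List<Long>> block : successors.entrySet())
            for (long target : block.getValue())
                if (target <= block.getKey()) count++;
        return count;
    }
}
```

A function consisting of a single while loop has four blocks (entry, loop head, loop body, exit) and four edges, which gives a complexity of 2 and one backward jump from the body to the head.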

With this rather large internal change holdec now supports a very wide range of subjects: from 64-byte demos for MS-DOS to 100 MB mobile applications.

Correctness of x86/x64

As outlined in another post one can test if the semantic model of the decompiler matches a real-world silicon CPU. Other decompilers (reko, snowman, ghidra, hex-rays) have certain issues with the 32-bit and 64-bit versions of the test while holdec is the only one which passes all test methods.

One outcome of the x64 test is that holdec had multiple bugs related to the fact that assigning a 32-bit register in 64-bit mode clears the upper 32 bits. This is not surprising for simple movs (which may even be nops in 32-bit mode) but other cases include rotating with count=0 or CMOVcc when the condition doesn’t hold. On the other hand, the documentation for CMPXCHG states that an assignment happens in both cases, so one could assume that the upper bits are cleared in both cases, but this is not the case.
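A minimal sketch of the register semantics discussed above, under the assumption that a 64-bit register is modeled as a single long. The class and its methods are hypothetical and only cover the plain write and the CMOVcc case. Rotates and CMPXCHG are left out on purpose.

```java
final class X64RegisterSketch {
    private long value;                       // full 64-bit register content

    long read64() { return value; }
    int read32() { return (int) value; }
    void write64(long newValue) { value = newValue; }

    /** In 64-bit mode a write to the 32-bit register clears the upper 32 bits. */
    void write32(int newValue) { value = newValue & 0xFFFFFFFFL; }

    /** A 16-bit write, by contrast, keeps the upper 48 bits unchanged. */
    void write16(int newValue) { value = (value & ~0xFFFFL) | (newValue & 0xFFFFL); }

    /** CMOVcc with a 32-bit destination: the upper half is cleared even if the condition is false. */
    void cmov32(boolean condition, int source) {
        write32(condition ? source : read32());
    }
}
```

The important line is in cmov32: modeling the false case as "do nothing" would leave the upper 32 bits intact, which is the kind of bug described above. Writing the old lower half back through write32 gives the zero extension in both cases.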

Conclusion

As always there are a lot of ideas and not enough time. It is difficult to decide what to work on:

  • increase width: support more CPU architectures / formats
  • go deeper: model existing CPUs in more detail
  • increase robustness: run the decompiler on a large set of binaries and look at errors
  • front-end (CPU architectures from above) vs middle part (e.g. track value sets across blocks) vs output (e.g. better detection of for loops)
  • more tests: more tests which show the features but also the limits of the current decompilers

Let us hope that the next version of holdec does not take THIS much time.

Thank you for reading and please send questions or feedback via email to holdec@kronotai.com or contact me on Twitter.
