Decompiler Maturity Index (DMI). A higher index does not automatically mean a “better” decompiler; it means the decompiler has more features.
Criteria
DMIC stands for DMI Criterion.
Basic
- DMIC-A1: I’m able to build and run the decompiler.
- DMIC-A2: The decompiler is able to work on one subject (maybe provided by the project, otherwise chosen by me).
- DMIC-A3: The project is active: there is at least one commit in the last 3 months.
- DMIC-A4: The project has a history: there are at least 100 commits.
- DMIC-A5: The decompiler is able to detect and output a simple loop.
- DMIC-A6: The decompiler performs simple expression simplifications.
- DMIC-A7: The decompiler produces some basic output for the hexdump subject (or a similarly complex subject).
- DMIC-A8: The decompiler supports ia32 ELF. This is important since a lot of test subjects are in the ia32 ELF format.
- DMIC-A9: The decompiler models all flag changes but also removes unused flag assignments.
- DMIC-A10: The decompiler recognizes the number of arguments for a stack based method call.
- DMIC-A11: The decompiler propagates register values to other statements, both within the same block and into other blocks of the same function.
- DMIC-A12: The decompiler understands indexed jumps using a jump table as generated by switch-case.
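Several of the Basic criteria (A5, A6, A10, A12) are easiest to check against a small, purpose-built subject. The following is only a sketch of what such a subject could look like; the function names are made up for illustration and are not taken from any existing test suite. Whether a compiler really emits a jump table for the switch depends on the compiler and its flags, so that part is an assumption to verify in the disassembly.

```c
/* Hypothetical test subject for some Basic criteria; all names are illustrative. */

/* DMIC-A5: a simple counted loop the decompiler should recover as a loop. */
int sum_below(int n)
{
    int total = 0;
    int i = 0;
    while (i < n) {
        total += i;
        i++;
    }
    return total;
}

/* DMIC-A6: unoptimized code computes this step by step; a decompiler that
   simplifies expressions could print the result as 4 * x. */
int times_four(int x)
{
    int doubled = x + x;
    return doubled + doubled + 0;
}

/* DMIC-A10: with a stack based calling convention such as cdecl on ia32,
   the caller passes three arguments on the stack for this call. */
int add3(int a, int b, int c)
{
    return a + b + c;
}

int call_add3(void)
{
    return add3(1, 2, 3);
}

/* DMIC-A12: a dense switch with unrelated results; compilers often, but not
   always, lower this to an indexed jump through a jump table. */
int classify(int code)
{
    switch (code) {
    case 0: return 7;
    case 1: return 42;
    case 2: return 3;
    case 3: return 19;
    case 4: return 88;
    case 5: return 1;
    default: return -1;
    }
}
```

Compiling this without optimization for ia32 ELF keeps the loop, the redundant arithmetic and the stack pushes visible in the binary, which makes it easier to see what the decompiler itself recovers.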
Intermediate
- DMIC-B1: The decompiler supports more than one CPU architecture. This ensures that the core is quite generic. i386 and AMD64 count as one here.
- DMIC-B2: The decompiler supports more than one executable format.
- DMIC-B3: The decompiler understands advanced i386 opcodes like string operations with rep prefix or the cpuid instruction.
- DMIC-B4: The decompiler has a GUI.
- DMIC-B5: The decompiler understands printf/scanf format strings and passes the correct number of arguments in the function call.
- DMIC-B6: The decompiler outputs a plain counting loop (one whose counter simply increases) as a for loop.
- DMIC-B7: The decompiler detects and outputs short circuit boolean expressions (|| and &&).
- DMIC-B8: The decompiler outputs string/int literals from the data segment when possible (for example, when the data is read-only).
- DMIC-B9: The decompiler supports FPU operations and types.
- DMIC-B10: The decompiler detects and removes dead blocks and jumps which cannot be taken.
- DMIC-B11: The decompiler propagates memory values to other statements.
- DMIC-B12: The decompiler is able to output the fields of a struct when these fields are used.
- DMIC-B13: The decompiler is able to cope with code which casts/uses union to interpret values in a conflicting way.
- DMIC-B14: The decompiler detects local variables and replaces unstructured stack accesses with local variables.
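The source constructs behind some of the Intermediate criteria (B6, B7, B8, B12, B13) can be sketched in a few lines of C. This is a hypothetical subject, not one shipped by any decompiler project; all identifiers are invented for illustration. The union example assumes a 32-bit IEEE 754 `float`, which holds on common platforms but is not guaranteed by the C standard.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* DMIC-B6: a plain counting loop that should come back as a for loop. */
int count_twice(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += 2;
    }
    return total;
}

/* DMIC-B7: short circuit evaluation with both || and &&. */
int zero_or_in_range(int v, int lo, int hi)
{
    return v == 0 || (v >= lo && v <= hi);
}

/* DMIC-B8: a read-only string that usually ends up in a read-only data
   section, so the decompiler can show it as a literal. */
static const char banner[] = "hexdump";

size_t banner_length(void)
{
    return strlen(banner);
}

/* DMIC-B12: field accesses at different offsets of one struct. */
struct point {
    int x;
    int y;
};

int manhattan(const struct point *p)
{
    int dx = p->x < 0 ? -p->x : p->x;
    int dy = p->y < 0 ? -p->y : p->y;
    return dx + dy;
}

/* DMIC-B13: the same storage is written as a float and read as an integer. */
union float_bits {
    float f;
    uint32_t u;
};

uint32_t bits_of(float f)
{
    union float_bits b;
    b.f = f;
    return b.u;
}
```

For example, `manhattan` called on a point with `x = -3` and `y = 4` returns 7, and a decompiler meeting B12 would ideally show two distinct fields instead of raw pointer arithmetic.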
Advanced
- DMIC-C1: The decompiler understands at least one way in which the subject calls the OS directly. Think Linux syscalls, MS-DOS interrupts or Amiga libraries.
- DMIC-C2: The decompiler detects common library methods in statically compiled subjects. Something like FLIRT.
- DMIC-C3: The decompiler knows the signatures of common libraries like libc and applies these signatures.
- DMIC-C4: The decompiler knows some advanced expression transformations. For example, recognizing a division that the compiler implemented as a multiplication.
- DMIC-C5: The decompiler supports SIMD. At a minimum it models SIMD instructions as internal functions which capture their inputs and outputs, or it decomposes them into their “real” semantics.
- DMIC-C6: The decompiler detects the number of elements and the element size of an array from a loop which goes over the array.
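To make C4 and C6 concrete, here is a minimal sketch in C. The first pair of functions shows the well-known reciprocal trick for unsigned 32-bit division by 3: multiplying by 0xAAAAAAAB (which is (2^33 + 1) / 3) and shifting right by 33 gives the same result as `x / 3`. A compiler may pick a different constant or instruction sequence, so treat this as one representative pattern, not the only one. The function and array names are hypothetical.

```c
#include <stdint.h>

/* DMIC-C4: what the source says. */
uint32_t div3_plain(uint32_t x)
{
    return x / 3;
}

/* DMIC-C4: what optimized machine code typically computes instead.
   0xAAAAAAAB == (2^33 + 1) / 3, so the 64-bit product shifted right
   by 33 equals x / 3 for every 32-bit unsigned x. */
uint32_t div3_by_mul(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xAAAAAAABu) >> 33);
}

/* DMIC-C6: the loop bound (8) and the stride (4 bytes per int32_t) are the
   hints from which a decompiler could infer int32_t samples[8]. */
static const int32_t samples[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };

int32_t sum_samples(void)
{
    int32_t total = 0;
    for (int i = 0; i < 8; i++) {
        total += samples[i];
    }
    return total;
}
```

A decompiler meeting C4 would print the body of `div3_by_mul` as `x / 3`; one meeting C6 would declare `samples` as an array of eight 4-byte elements instead of an untyped memory region.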