{"id":546,"date":"2020-11-25T13:16:15","date_gmt":"2020-11-25T12:16:15","guid":{"rendered":"http:\/\/kronotai.com\/wordpress\/?p=546"},"modified":"2020-11-25T13:16:15","modified_gmt":"2020-11-25T12:16:15","slug":"about-passing-floating-point-parameters","status":"publish","type":"post","link":"https:\/\/kronotai.com\/wordpress\/2020\/11\/25\/about-passing-floating-point-parameters\/","title":{"rendered":"About passing floating point parameters"},"content":{"rendered":"\n\t\t\t\t<![CDATA[\n<p class=\"bla\">In i386 Linux software, floating point numbers are passed on the stack and not in registers. This creates two issues:<\/p>\n\n\n<ol class=\"wp-block-list\"><li>How does a decompiler recognize that a floating point number is passed?<\/li><li>How does a decompiler express an individual call?<\/li><\/ol>\n\n\n<p class=\"bla\">There are multiple sources for type recognition in general. For floating point numbers the relevant ones are:<\/p>\n\n\n<ul class=\"wp-block-list\"><li>if the value is in a floating point register (x87 ST*, XMM, SSE,&#8230;) it is with high probability a floating point value. This source is not available when stack memory is used.<\/li><li><strong>caller<\/strong>: if the caller writes the memory from a known floating point value (like from one of the registers) one can assume it is a floating point value<\/li><li><strong>usage<\/strong>: if the called function performs floating point operations (at least loading) with the value it is a floating point value<\/li><li><strong>printf<\/strong>: if other information says that it is a floating point value. The most common such source is a printf format string.<\/li><\/ul>\n\n\n<p class=\"bla\">Let us take a look at these. We choose the C double type because it is 8 bytes long and therefore takes up two &#8220;stack slots&#8221;. 
This makes it easy to tell whether the decompiler sees a function taking two ints or one taking a double parameter.<\/p>\n\n\n<p class=\"bla\">If the caller passes in a float literal, this literal is not distinguishable from two 4-byte integers. If the called function just looks at the &#8220;bytes&#8221;, we call this bytes in the table below.<\/p>\n\n\n<p class=\"bla\">The source code, executable and decompiler output are on <a href=\"https:\/\/github.com\/rfalke\/decompiler-subjects\/tree\/master\/from_holdec\/i386_float_recog\">GitHub<\/a>.<\/p>\n\n\n<p class=\"bla\">The first case (unknown_to_unknown) is not distinguishable from passing around two 4-byte integers in the binary form of the program. Therefore the decompilers can&#8217;t recognize this. It is included as a baseline.<\/p>\n\n\n<table class=\"wp-block-table\"><tbody><tr><td><strong>Caller<\/strong><\/td><td><strong>Usage in function<\/strong><\/td><td><strong>Function<\/strong><\/td><\/tr><tr><td>literal<\/td><td>bytes<\/td><td>unknown_to_unknown<\/td><\/tr><tr><td>literal<\/td><td>double<\/td><td>unknown_to_double<\/td><\/tr><tr><td>double<\/td><td>bytes<\/td><td>double_to_unknown<\/td><\/tr><tr><td>double<\/td><td>double<\/td><td>double_to_double<\/td><\/tr><\/tbody><\/table>\n\n\n<p class=\"bla\">And the decompiler reactions:<\/p>\n\n\n<table class=\"wp-block-table\"><tbody><tr><td><strong>Function<\/strong><\/td><td><strong>reko<\/strong><br \/><\/td><td><strong>Ghidra<\/strong><\/td><td><strong>retdec<\/strong><\/td><\/tr><tr><td>unknown_to_unknown<\/td><td>two uint32<\/td><td>two uint<\/td><td>two uint32_t<\/td><\/tr><tr><td>unknown_to_double<\/td><td>real64<\/td><td>double<\/td><td>two uint32_t<\/td><\/tr><tr><td>double_to_unknown<\/td><td>two uint32<\/td><td>two uint<\/td><td>float80_t and uint32_t<\/td><\/tr><tr><td>double_to_double<\/td><td>real64<\/td><td>double<\/td><td>float80_t<\/td><\/tr><\/tbody><\/table>\n\n\n<p class=\"bla\">We see that reko and Ghidra only use the 
usage information: they look at the function body and not the caller. retdec, on the other hand, looks at the caller but the actual types are wrong.<\/p>\n\n\n<p class=\"bla\">When we look at the calls for functions where the signature is correctly recognized we see that reko passes one argument to unknown_to_double and double_to_double. Ghidra, on the other hand, passes two integers to unknown_to_double while double_to_double is ok. retdec passes one double to double_to_double but in the other cases it doesn&#8217;t produce correct output.<\/p>\n\n\n<hr class=\"wp-block-separator is-style-wide\"\/>\n\n\nWhen we look at the two printf calls the decompilers have to solve two issues:\n\n\n<ul class=\"wp-block-list\"><li>parse the format string to get the types of the parameters (and then output the call according to these types)<\/li><li>support long double (&#8220;%Lf&#8221; in the format string) which is 10 bytes and takes three &#8220;stack slots&#8221;<\/li><\/ul>\n\n\nNone of the three decompilers tested gets this right. 
See my inline comments.\n\n\n<pre class=\"wp-block-preformatted\">\/\/ ===== Original source<br \/>printf(\"unknown: int-a=%d double=%f int-b=%d long double=%Lf int-c=%d\\n\", <br \/>  100, 2.31, 101, (long double)2.32, 102);<br \/><br \/>printf(\"double: int-a=%d double=%f int-b=%d long double=%Lf int-c=%d\\n\", <br \/>  200, 2.41+argc, 201, (long double)(2.42+argc), 202);<br \/><br \/>\/\/ ===== reko<br \/>\/\/ - tLoc3C is not defined<br \/>\/\/ - doesn't realize that %Lf takes 3 stack slots<br \/>\/\/ - 2920577024 aka 0xae147800 is part of the long double<br \/>printf(\"unknown: int-a=%d double=%f int-b=%d long double=%Lf int-c=%d\\n\", <br \/>  100, 2.31, 101, tLoc3C, 2920577024);<br \/><br \/>printf(\"double: int-a=%d double=%f int-b=%d long double=%Lf int-c=%d\\n\", <br \/>  200, rLoc1_134 + g_r804A100, 0xC9, tLoc3C, (word32) (real80) (g_r804A0F8 + rLoc1_134));<br \/><br \/>\/\/ ===== ghidra<br \/>\/\/ - extra uVar3<br \/>\/\/ - int literals instead of floating point literals<br \/>dVar2 = (double)param_1 + 1.24;<br \/>uVar3 = (undefined4)((ulonglong)dVar2 &gt;&gt; 0x20);<br \/>printf(\"unknown: int-a=%d double=%f int-b=%d long double=%Lf int-c=%d\\n\",<br \/>  100, 0x47ae147b, 0x40027ae1, 0x65, <br \/>  0xae147800, 0x947ae147, 0, 0x66, uVar3);<br \/><br \/>\/\/ - doesn't know that the 3 stack slots belong together<br \/>fVar1 = (float10)2.42 + (float10)param_1;<br \/>printf(\"double: int-a=%d double=%f int-b=%d long double=%Lf int-c=%d\\n\",<br \/>  200, (double)((float10)param_1 +(float10)2.41), <br \/>  0xc9, <br \/>  SUB104(fVar1,0), <br \/>  (int)((unkuint10)fVar1 &gt;&gt; 0x20), <br \/>  (char)((unkuint10)fVar1 &gt;&gt; 0x40),<br \/>  0xca, uVar3);<br \/><br \/>\/\/ ===== retdec<br \/>\/\/ - How? Why? 
The only positive is that it gets<br \/>\/\/   the number of extra arguments correct<br \/>printf(\"unknown: int-a=%d double=%f int-b=%d long double=%Lf int-c=%d\\n\", <br \/>  100, 5.9415882152956413e-315, 0x40027ae1, 3.68165152720129934855e-4949L, -1);<br \/><br \/>\/\/ - the double and long double parts are good<br \/>\/\/ - the ints are wrong<br \/>printf(\"double: int-a=%d double=%f int-b=%d long double=%Lf int-c=%d\\n\", <br \/>  200, (float64_t)(v1 + 2.41L), 0, v1 + 2.42L);<br \/><\/pre>\n\n\nWhile floating point numbers have been with us for a long time (the 80387 is from 1987) and we are not doing anything fancy like SIMD, the decompilers are lacking in this regard: retdec is just strange\/broken, ghidra doesn&#8217;t support floating point literals and reko has its own issues.\n\n\n<p class=\"bla\">Thank you for reading and please send questions or feedback via email to holdec@kronotai.com or contact me on <a href=\"https:\/\/twitter.com\/holdecd\">Twitter<\/a>.<\/p>\n]]>\t\t","protected":false},"excerpt":{"rendered":"<p>\t\t\t\t<![CDATA[]]>\t\t <a href=\"https:\/\/kronotai.com\/wordpress\/2020\/11\/25\/about-passing-floating-point-parameters\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,3],"tags":[16,20,21,34,38,43],"class_list":["post-546","post","type-post","status-publish","format-standard","hentry","category-decompiler","category-floating-point","tag-double","tag-format-string","tag-fpu","tag-long-double","tag-printf","tag-x87"],"_links":{"self":[{"href":"https:\/\/kronotai.com\/wordpress\/wp-json\/wp\/v2\/posts\/546","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kronotai.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kronotai.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kronotai.com\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/kronotai.com\/wordpress\/wp-json\/wp\/v2\/comments?post=546"}],"version-history":[{"count":0,"href":"https:\/\/kronotai.com\/wordpress\/wp-json\/wp\/v2\/posts\/546\/revisions"}],"wp:attachment":[{"href":"https:\/\/kronotai.com\/wordpress\/wp-json\/wp\/v2\/media?parent=546"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kronotai.com\/wordpress\/wp-json\/wp\/v2\/categories?post=546"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kronotai.com\/wordpress\/wp-json\/wp\/v2\/tags?post=546"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}