\myheading{Making smallest possible test suite using OpenWBO} \renewcommand{\CURPATH}{MaxSxT/min_test_OpenWBO} My previous example (\ref{set_cover}) was made up. And now I can do it better. % TODO put it to web and add URL here Once upon a time, I wrote my own x86/x64 disassembler. And a code like that is fragile and prone to bugs --- any unnoticed typo can ruin a hour or two. How to test it? Just to be sure, I put as much to tests as possible. I enumerated all possible 2 bytes opcodes, etc. At some point, there were 12531 tests, like: \begin{lstlisting}[style=customc] ... disas_test1(Fuzzy_True, (const byte*)"\xF3\x0F\x58\xF1", 3, "ADDSS XMM6, XMM1"); disas_test1(Fuzzy_True, (const byte*)"\xF3\x0F\x59\xC0", 3, "MULSS XMM0, XMM0"); disas_test1(Fuzzy_True, (const byte*)"\xF3\x0F\x5C\xD7", 3, "SUBSS XMM2, XMM7"); disas_test1(Fuzzy_True, (const byte*)"\xF3\x0F\x5E\xF8", 3, "DIVSS XMM7, XMM0"); disas_test1(Fuzzy_True, (const byte*)"\xF3\x42\x0F\x11\x74\x1D\x98", 3, "MOVSS [RBP+R11-68h], XMM6"); disas_test1(Fuzzy_True, (const byte*)"\xF3\x48\x0F\x2A\xC0", 3, "CVTSI2SS XMM0, RAX"); disas_test1(Fuzzy_True, (const byte*)"\xF3\x48\xAB", 3, "REP STOSQ"); disas_test1(Fuzzy_True, (const byte*)"\xF7\xED", 0x300, "IMUL EBP"); disas_test2_1op(Fuzzy_True, (const byte*)"\xFF\xD0", 0x300, "CALL RAX", 64); disas_test2_2op(Fuzzy_True, (const byte*)"\x00\x4C\x8B\x94", 0x300, "ADD [RBX+RCX*4-6ch], CL", 8, 8); disas_test2_2op(Fuzzy_True, (const byte*)"\x0D\x01\xFF\xFF\xFF", 0x48DB13, "OR EAX, 0FFFFFF01h", 32, 32); disas_test2_2op(Fuzzy_True, (const byte*)"\x0F\xB7\x0D\x4F\x02\x5B\x06", 0x436E96, "MOVZX ECX, [69e70ech]", 32, 16); disas_test2_2op(Fuzzy_True, (const byte*)"\x0F\xB7\x0D\xDE\x01\x5B\x06", 0x436F07, "MOVZX ECX, [69e70ech]", 32, 16); ... \end{lstlisting} % TODO URL % ( \href{https://raw.githubusercontent.com/DennisYurichev/x86_disasm/80f2c7b7137ddce8c9a1168051865ae595a09d56/tests.h}{The file} ) Basically, these are pairs: opcode + disassembled output. I modified my test code so that a test number can be set in arguments, and then prepended this to each test: \begin{lstlisting}[style=customc] ... if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\xF3\x0F\x58\xF1", 3, "ADDSS XMM6, XMM1"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\xF3\x0F\x59\xC0", 3, "MULSS XMM0, XMM0"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\xF3\x0F\x5C\xD7", 3, "SUBSS XMM2, XMM7"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\xF3\x0F\x5E\xF8", 3, "DIVSS XMM7, XMM0"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\xF3\x42\x0F\x11\x74\x1D\x98", 3, "MOVSS [RBP+R11-68h], XMM6"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\xF3\x48\x0F\x2A\xC0", 3, "CVTSI2SS XMM0, RAX"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\xF3\x48\xAB", 3, "REP STOSQ"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\xF7\xED", 0x300, "IMUL EBP"); if (test_all || line==__LINE__) disas_test2_1op(Fuzzy_True, (const byte*)"\xFF\xD0", 0x300, "CALL RAX", 64); if (test_all || line==__LINE__) disas_test2_2op(Fuzzy_True, (const byte*)"\x00\x4C\x8B\x94", 0x300, "ADD [RBX+RCX*4-6ch], CL", 8, 8); ... \end{lstlisting} Also any disassembler, including mine, has a \textit{mega-table} or \textit{mother-table} or \textit{master-table} that contains all opcodes, flags, instruction names, etc. % FIXME: % : \\ % \url{https://github.com/DennisYurichev/x86_disasm/blob/acb6751145dac38258c53412fb51fb709b3f3c6f/x86_tbl_entries.h}. And I added a debugging statement: if a table entry is loaded, its number is being printed. And you'll see why. I'm compiling the code so that it will generate GCOV-statistics: \begin{lstlisting} gcc -fprofile-arcs -ftest-coverage -g -DX86_DISASM_PRINT_INS_TBL_ENTRY x86_disasm_tests.c x86_disas.c x86_register.c -I../octothorpe ../octothorpe/octothorpe.a \end{lstlisting} And I run it 12531 times. A number in 1..12531 range is passed in arguments: \begin{lstlisting} #!/bin/bash for i in $(seq 1 12532); do echo $i rm *gcda* tbl_entry=$(./a.out $i | tail -1) echo $tbl_entry gcov x86_disas mv x86_disas.c.gcov gcovs/$i.$tbl_entry done \end{lstlisting} 12531 files have been created. Filename consists of number1.number2, where number1 is a test number (or line number in tests.h) and number2 is a table entry loaded during testing. The contents is a typical GCOV's output: \begin{lstlisting} ... 1: 651: if (IS_SET (p->new_flags, F_REG32_IS_LOWEST_PART_OF_1ST_BYTE)) #####: 652: mask=0xF8; 1: 653: if (IS_SET (p->new_flags, F_REG64_IS_LOWEST_PART_OF_1ST_BYTE)) #####: 654: mask=0xF8; -: 655: 1: 656: if ((opc & mask) != ins_tbl[p->tbl_p].opc) -: 657: { #####: 658: p->tbl_p++; #####: 659: continue; -: 660: }; -: 661: -: 662: // load second opcode if needed 1: 663: if (IS_SET (p->new_flags, F_OPC2)) -: 664: { -: 665: uint8_t opc2; #####: 666: if (Da_stage1_get_next_byte(p, &opc2)==false) -: 667: { #####: 668: if (dbg_print) #####: 669: printf ("%s() line %d\n", __func__, __LINE__); #####: 670: return false; -: 671: }; ... \end{lstlisting} This is how many times each line was executed during run. \verb|#####| means never. \verb|1| means one. Now the goal: to execute all lines at least once, with the help of as few tests as possible. This is how the problem can be stated in plain English language: \textit{for the line X, the test Y OR the test Z or the test M must be ran}. We generate such (hard) clauses for each lines. Also, we say to MaxSAT solver to find such a solution, where as few tests would be True, as possible. And this is my Python program to do so: \lstinputlisting[style=custompy]{\CURPATH/minimize_tests1.py} I run it: \begin{lstlisting} python minimize_tests1.py > 1.wcnf \end{lstlisting} The resulting \ac{WCNF} file is like: \begin{lstlisting} p wcnf 12532 12832 10000 c lines: c line= 768 10000 3432 6250 9716 9034 2020 4138 1308 5546 3433 4137 2728 7661 12449 9033 10400 4842 9717 4841 5547 8351 11083 2729 6955 11766 12450 2021 1309 6251 11767 6956 10399 11084 8350 7662 0 c line= 375 10000 1126 0 c line= 521 10000 12411 1796 8139 11130 2620 7728 11380 6631 1774 7685 5381 10996 626 6252 8189 4546 6191 11398 3865 ... ... 1 -1 0 1 -2 0 1 -3 0 1 -4 0 ... 1 -12528 0 1 -12529 0 1 -12530 0 1 -12531 0 1 -12532 0 \end{lstlisting} "c" is comment. I'm putting line (of code) number here. So for line 768, any tests among 3432 6250 9716 9034 2020 4138 1308 5546 3433 4137 2728 7661 must be present, at least one. For line 375, a (single) test 1126 must be present. (By the way, it's quite interesting to see what line of code is triggered by a \textit{single} test. Surely, they are must be present in minimized test suite.) All test numbers are then enumerated as soft constraints. Now I'm running the Open-WBO MaxSAT solver on the given \ac{WCNF} file: \begin{lstlisting} open-wbo 1.wcnf > result \end{lstlisting} \begin{lstlisting} c c Open-WBO: a Modular MaxSAT Solver -- based on Glucose4.1 (core version) ... s OPTIMUM FOUND v -1 -2 -3 -4 -5 -6 -7 -8 -9 -10 -11 -12 -13 -14 -15 -16 -17 -18 -19 -20 -21 ... -1125 1126 -1127 ... ... \end{lstlisting} The output consists of all variables. But since a test number is mapped to a line number, this number also mapped to a variable's number. Efficiently, Open-WBO reports, which tests are to be picked (like 1126th). I wrote an utility for that: \lstinputlisting[style=custompy]{\CURPATH/pick_lines_from_test.py} And we found that to cover (almost) all lines of code in my disassembler, only 11 tests are enough! \begin{lstlisting}[style=customc] if (test_all || line==__LINE__) disas_test1(Fuzzy_False, (const byte*)"\x64\xA3\x12\x34\x56\x78", 0x100, "FS: MOV [78563412h], EAX"); if (test_all || line==__LINE__) disas_test1(Fuzzy_False, (const byte*)"\xD9\xEE", 0x123456, "FLDZ"); if (test_all || line==__LINE__) disas_test1(Fuzzy_False, (const byte*)"\xF0\x66\x0F\xB1\x0B", 0x1234, "LOCK CMPXCHG [EBX], CX"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\x44\xAB", 0x4f5b, "STOSD"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\x44\xBC\x90\x90\x90\x90", 0x507c, "MOV ESP, 90909090h"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\x45\xDB\x90\x90\x90\x90\x90", 0x638b, "FIST [R8-6f6f6f70h]"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\x4B\xC8\x90\x90\x90", 0xc848, "ENTER 0FFFFFFFFFFFF9090h, 90h"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\x4F\xBF\x90\x90\x90\x90\x90\x90\x90\x90", 0x10baf, "MOV R15, 9090909090909090h"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\x66\x0F\x38\x3D\xD0", 3, "PMAXSD XMM2, XMM0"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\xF2\x0F\x5A\xCE", 0, "CVTSD2SS XMM1, XMM6"); if (test_all || line==__LINE__) disas_test1(Fuzzy_True, (const byte*)"\xF3\x42\x0F\x11\x74\x1D\x98", 3, "MOVSS [RBP+R11-68h], XMM6"); \end{lstlisting} (It's important to know that my "bloated" tests was not perfect, some lines of code like error reporting are like "dead code" now, but it's OK.) At this point, my readers perhaps could stop reading and reuse my ideas for their own code and tests. But... As I mentioned, a disassembler has a \textit{mega-table}. And we want to touch each its entry during tests at least once, like each line of code. And this is a second version of my program: \lstinputlisting[style=custompy]{\CURPATH/minimize_tests2.py} Now only 301 tests are enough to cover (almost) all lines in my disassembler and to touch (almost) all entries in the \textit{mega-table}. Much better than 12531. % FIXME this: %: \\ %\url{https://github.com/DennisYurichev/x86_disasm/blob/acb6751145dac38258c53412fb51fb709b3f3c6f/tests.h}. Also, Open-WBO seems to be a better tool for the job, it works faster than Z3. Or maybe Z3 can be tuned?