We have finally found a way to break gadgets in C programs. With the help of the LLVM compiler and Intel’s Pin tool, we list the sets of instructions that can potentially form a gadget. We assume that all gadgets end with a “ret” instruction. We then break these gadgets using techniques such as basic block splitting, algebraic re-association and instruction re-ordering.
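As a rough illustration of the algebraic re-association idea, the Python sketch below splits a 32-bit immediate whose encoding contains the byte C3 into two addends whose encodings do not. This is only one possible reading of the technique; the helper names (has_ret_byte, split_immediate) and the small-delta search are assumptions made for this example, not part of our LLVM pass.

```python
import struct

RET_OPCODE = 0xC3  # opcode byte of the near "ret" instruction on x86


def has_ret_byte(value):
    """Return True if the little-endian 32-bit encoding of value contains C3."""
    return RET_OPCODE in struct.pack("<I", value & 0xFFFFFFFF)


def split_immediate(imm):
    """Split imm into (first, delta) with first + delta == imm (mod 2**32),
    such that neither addend contains a C3 byte.

    Hypothetical helper: only small deltas are searched, so a ValueError is
    raised when no such split exists in that range.
    """
    imm &= 0xFFFFFFFF
    if not has_ret_byte(imm):
        return imm, 0  # nothing to do, the constant is already safe
    for delta in range(1, 1 << 16):
        first = (imm - delta) & 0xFFFFFFFF
        if not has_ret_byte(first) and not has_ret_byte(delta):
            return first, delta
    raise ValueError("no C3-free split found with a small delta")


if __name__ == "__main__":
    first, delta = split_immediate(0xC3)
    print(hex(first), hex(delta))
```

Under this sketch, an expression such as x + 0xC3 would be rewritten as (x + 0xC2) + 1, so that no immediate operand carries the “ret” opcode byte. A real pass would also have to stop later optimizations from folding the two constants back together.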
Below is the flow of the complete algorithm.
Steps Involved
1) Compile the C code to assembly and machine code with Clang. Use XED, which is provided with the Pin tool, to map each assembly instruction to its machine code.
Example of Assembly code generated: movl %ecx, 0x80436ec #foo :10:1
Example of Machine code generated: 890DEC360408
Example of Assembly and machine code mapping: 890DEC360408 movl %ecx, 0x80436ec
2) From the mapping, we list the instructions that can form a gadget. The opcode of the “ret” instruction is C3 (hexadecimal).
890DEC360408 movl %ecx, 0x80436ec
3) Combining steps (1) and (2), we obtain the source line numbers that may map to a gadget.
4) We can now use techniques such as basic block splitting and algebraic re-association to remove the gadget.
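Steps (2) and (3) can be sketched in a few lines of Python. The sketch assumes a combined record of the form “hex bytes, assembly, #file :line:col”, which merges the mapping from step (1) with the trailing comment of the assembly example; that record layout, the regular expression and the function names are assumptions for illustration, not the actual output format of XED or Clang. It also treats any C3 byte inside an encoding as a possible gadget end, which is a simplification.

```python
import re

RET_OPCODE = 0xC3  # opcode byte of the near "ret" instruction on x86

# Assumed record layout: "<hex bytes> <assembly> #<file> :<line>:<col>"
# The trailing source-location comment is optional.
RECORD_RE = re.compile(
    r"^(?P<hex>[0-9A-Fa-f]+)\s+(?P<asm>[^#]+?)\s*"
    r"(?:#\s*(?P<file>\S+)\s*:(?P<line>\d+):(?P<col>\d+))?$"
)


def parse_record(text):
    """Split one mapping record into (machine code bytes, assembly, line)."""
    match = RECORD_RE.match(text.strip())
    if match is None:
        raise ValueError("unrecognized record: " + text)
    line = int(match.group("line")) if match.group("line") else None
    return bytes.fromhex(match.group("hex")), match.group("asm"), line


def ret_offsets(code):
    """Return every offset in code at which a C3 byte occurs."""
    return [i for i, byte in enumerate(code) if byte == RET_OPCODE]


def candidate_source_lines(records):
    """Return the sorted source lines whose machine code contains C3."""
    lines = set()
    for text in records:
        code, _asm, line = parse_record(text)
        if ret_offsets(code) and line is not None:
            lines.add(line)
    return sorted(lines)


if __name__ == "__main__":
    records = [
        "890DEC360408 movl %ecx, 0x80436ec #foo :10:1",
        "B8C3000000 movl $0xc3, %eax #foo :11:5",
        "C3 ret #foo :12:1",
    ]
    print(candidate_source_lines(records))
```

The second and third records in the usage example are made up for this sketch: one is an intended “ret”, the other hides a C3 byte inside an immediate operand. The source lines reported this way are the ones that step (4) would then try to rewrite.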
The V8 compiler does not provide any API for translating JavaScript code at the IR level. For the time being, we are working on C programs and assume that a compiler from JavaScript to LLVM IR is not far off. Given the widespread use of LLVM, we don’t think this is far-fetched.