Rice University logo
 
Top blue bar image
 

compiler reassociation

December 2nd, 2012 by ksm2

Cocke et. al. proposed algebraic re-association in 1970s, and concept was extended by shantaram, and keith cooper and preston briggs in their Partial Redundancy Elimination paper (1995). In algebraic re-association, a subtree of the abstract syntax tree computing an arithmetic expression is re-arranged so that there is not change in output but instead exposes more opportunities for constant propagation, PRE, and other compiler optimizations.

For our project, we identify reassociation as one of the strategies to prevent the ‘C3’s from occurring at un-intended places (an actual function return is an intended ‘C3’ in the machine code). We employ the first few steps as done in ‘breaking basic blocks’:

* use clang to generate an executable of the application

* use gdb -x with disas commands to disassemble the code, and search for C3

* map the unintended C3  instructions back to source code.

—this phaser specific changes—-

*reassociate pass in llvm skips over instructions which do not match the special line numbers.

*selects instructions +- 5 from instructions important to us, applies algebraic re-association.

eg: 4+x-5 ===> x-5+4 (this step is algebraic reassociation)

* we were able to see around 7% reduction in RETs with this approach in some benchmarks

* this pass combines with basic block splitting led to around 33% unintended ret reductions for some benchmarks.

More results in final presentation.



Breaking the Gadgets

November 20th, 2012 by dm14

We have finally figured out a way to break the gadgets in C programs. With the Help of LLVM compiler and pin tool from Intel we list those set of instructions which can potentially form a gadget. We assume all gadgets end with a “ret” instruction. We now break these gadgets using techniques like basic block splitting, algebraic re-association and instruction re-ordering.

Below is the flow of the complete algorithm.
Flow for Gadget breaking algorithm

Steps Involved
1) Compile C code to assembly and machine code via Clang. Use XED provided in the pin tool to map assembly and machine code.
Example of Assembly code generated: movl %ecx, 0x80436ec #foo :10:1
Example of Machine code generated: 890DEC360408
Example of Assembly and machine code mapping: 890DEC360408 movl %ecx, 0x80436ec

2) From the mapping, we list those instructions which can form a gadget. The “ret” instruction’s opcode is C3
890DEC360408 movl %ecx, 0x80436ec

3) From steps (1) and (2) we can get source line numbers which might map to a gadget.

4) We can now use techniques like Basic Block Splitting and Algebraic Re association to remove the gadget.

The V8 compiler does not provide any API to translate JS code at the IR level. For the time being we are working on C programs and assume that a compiler from JS to LLVM-IR is not far away from development. Given the widespread use of LLVM, we don’t think this is far fetched.



Controlling order of optimizations

November 8th, 2012 by ksm2

To achieve the objective of controlling order of optimizations; we perform the following steps:

  • We gdb’ed through a simple hello world js application to observe the callpath that occurs when a js is jit’ed with the v8 chrome JIT engine
  • we now understand that the hydrogen.cc present in the v8 engine is the centralized place where optimizations are controlled/implemented.
  • all our experiments should probably take place through the sites in this file.


Deliverables and current progress

November 8th, 2012 by ksm2

After a meeting with Dr. Wallach, we revisited our deliverables to be the following set of items:

  • performance graphs to depict how the order of optimizations affect code size/quality/runtime
  • generate programs without rets; how does this affect the various code features
  • re-ordering of basic blocks
  • shadowing of different parts of address space.

As of today,

  • we have been able to reproduce the exploit the attach in the previous 527 project related to the same
  • understand the v8 code base
  • we have started manipulating the v8 engine to achieve objective for the first deliverable


Welcome! Overview of our 527 project

November 1st, 2012 by ksm2

This blog is related to our COMP 527 project.

In this project, we extend the work done by Chabbi et. al. in Fall 2011 titled Effective Browser
Exploit with Return-Oriented Programming (ROP) via JavaScript JIT. Return-Oriented
programming is based on the notion of using a series of gadgets to achieve a desired goal.
These gadgets are snippets of code extracted from pre-loaded library code such as glibc. ROP
utilizes an existing vulnerability such as buffer overflow to fill the stack with a chain of
procedure calls to required gadgets. The chained procedure calls when knitted
together constitute a substantial exploit which allows intruders to execute malicious code.

Java Script is a popular scripting language used to build the client-side face of web applications.
Hence, JS is the vehicle for our exploit. We plan to demonstrate the exploit performed by Chabbi
et. al. via Mozillas JIT engine, namely ionmonkey, instead of the Chrome V8 engine. We plan to
exploit the fact that ionmonkey loads libstdc++ library into the address space, and allows
client-side JS to access the library.