Abstract
In binary code analysis, current function identification approaches struggle with functions that have no explicit call sites and with handcrafted assembly that lacks standard prologues and epilogues. We propose a new function representation, the reverse extended control flow graph (RECFG), and a RECFG-based method for identifying functions in stripped binary code. A function has at least one return instruction (an instruction that makes the control flow leave the function), so return instructions are a more reliable anchor than the function prologues and epilogues used by traditional methods. We first build RECFGs from every value in a code range that can be interpreted as a return instruction. Then, for each independent RECFG, a multiple-decision method chooses a subgraph as the control flow graph of a function. We evaluate a prototype tool on seven open source applications, 138 binaries from the MASM32 code examples, and 292 binaries from Windows XP SP3. Experimental results show that the proposed method identifies, with high precision and stable recall, functions that current methods cannot identify.
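As an illustration of the first step in the abstract (collecting every value in a code range that can be interpreted as a return instruction), here is a minimal sketch for 32-bit x86. It is not the authors' tool: the function name `find_return_candidates`, the opcode table, and the `base` parameter are assumptions made for this example, and the sketch stops before any RECFG is built.

```python
# Hypothetical opcode table for this sketch: x86 return opcodes mapped to
# their encoded length in bytes.
RETURN_OPCODES = {
    0xC3: 1,  # ret        (near return)
    0xC2: 3,  # ret imm16  (near return, releases imm16 bytes of stack)
    0xCB: 1,  # retf       (far return)
    0xCA: 3,  # retf imm16 (far return, releases imm16 bytes of stack)
}


def find_return_candidates(code, base=0):
    """Return (address, length) for every offset whose byte can be read as a return.

    The scan is deliberately over-approximate: a matching byte may really be
    an operand or data, so each hit is only a candidate for later analysis.
    """
    candidates = []
    for offset, byte in enumerate(code):
        length = RETURN_OPCODES.get(byte)
        # Skip encodings that would run past the end of the code range.
        if length is not None and offset + length <= len(code):
            candidates.append((base + offset, length))
    return candidates


# push ebp; mov ebp, esp; pop ebp; ret; nop; ret 8
sample = bytes.fromhex("5589e55dc390c20800")
for address, length in find_return_candidates(sample, base=0x401000):
    print(hex(address), length)
```

Treating every matching byte as a candidate keeps the scan independent of call sites and prologue patterns. The cost is false positives, which a later selection step has to filter out.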
| Original language | English |
|---|---|
| Pages (from-to) | 793-820 |
| Number of pages | 28 |
| Journal | Journal of Software Maintenance and Evolution |
| Volume | 27 |
| Issue number | 10 |
| State | Published - 1 Oct 2015 |
| Externally published | Yes |
Keywords
- TOPSIS
- function identification
- reverse engineering
- reverse extended control flow graph
- static analysis
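The TOPSIS keyword suggests that the multiple-decision step in the abstract ranks candidate subgraphs by their closeness to an ideal solution. The following is a generic TOPSIS sketch under that assumption. The criteria, weights, and candidate values in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives with TOPSIS; returns closeness values in [0, 1].

    matrix  -- one row per alternative, one column per criterion
    weights -- one weight per criterion
    benefit -- True where larger is better, False where smaller is better
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column (guarding against all-zero columns).
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    # Ideal best and ideal worst value for each criterion.
    columns = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        # Relative closeness to the ideal solution; 0.0 if all rows are identical.
        scores.append(d_worst / total if total else 0.0)
    return scores


# Hypothetical criteria for three candidate subgraphs:
# column 0 = instructions covered (benefit), column 1 = decoding conflicts (cost).
candidates = [[10, 0], [12, 3], [4, 1]]
scores = topsis(candidates, weights=[0.5, 0.5], benefit=[True, False])
chosen = max(range(len(scores)), key=scores.__getitem__)
print(chosen, scores)
```

Ranking by relative closeness lets several partially conflicting criteria be combined into one ordering without hand-tuned thresholds for each criterion.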
Title
Identifying functions in binary code with reverse extended control flow graphs