Presentation Material
Abstract
Each Android app runs in its own VM, and every VM is allocated a limited heap for creating new objects. Neither the app nor the OS differentiates between regular objects and objects that contain security-sensitive information such as user authentication credentials, authorization tokens, en/decryption keys, and PINs. These critical objects, like any other object, are kept in the heap until the OS hits a memory constraint and realizes that it needs more memory. The OS then invokes the garbage collector to reclaim memory from the apps. Java does not provide explicit APIs to reclaim the memory occupied by objects. This leaves a window of time during which security-critical objects live in memory, waiting to be garbage collected. During this window, a compromise of the app can allow an attacker to read the credentials. This is a needless risk every Android application lives with today. To exacerbate the situation, apps today make heavy use of identity providers to implement OpenID/OAuth-based authentication and authorization.
In this paper we propose a novel approach to determine, at every program statement, which security-critical objects will not be used by the app in the future. An Android application, once compiled, has all the information needed to determine this. Using results from our data flow analysis [1], we can flush the security-sensitive information out of the objects immediately after their last use, thereby preventing an attacker who has compromised the app from reading it. This way an app can truly provide defence in depth, protecting sensitive data even after a compromise.
We propose a new tool called Androsia, which uses static program analysis techniques to perform a summary-based [2] interprocedural data flow analysis that determines the points in the program where security-sensitive objects are last used (so that their content can be cleared). Androsia then performs a bytecode transformation of the app to flush out the secrets, resetting the objects to their default values. The data flow analysis associates two flow sets with each statement in the unit control flow graph: one in-set and one out-set. These sets are (1) initialized, then (2) propagated through the unit graph along statement nodes until (3) a fixed point is reached.
We leverage Soot [3], a static Java bytecode analysis framework, to identify the Last Usage Point (LUP) of each object, that is, the point in the program where the object is last used. Detecting LUPs requires analyzing methods in reverse topological order of their actual execution order, which means that a callee method is analyzed before its caller. We construct flow functions for the analysis and use them to propagate the data flow sets [4]. The flow functions are as follows:
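\[
\mathrm{Out}(i) =
\begin{cases}
\emptyset & \text{if } S(i) \text{ is the exit node of the CFG} \\
\bigcup_{S(j) \in \mathrm{succ}(S(i))} \mathrm{In}(j) & \text{otherwise}
\end{cases}
\]
\[
\mathrm{In}(i) = \mathrm{Out}(i) \cup \mathrm{Gen}(i)
\]
\[
\mathrm{Gen}(i) =
\begin{cases}
\{\mathrm{var}(y)\} & \text{if } S(i) \text{ is of the form } x = y \\
\{\mathrm{var}(y)\} & \text{if } S(i) \text{ is of the form } x = \mathrm{if}(y) \\
\{\mathrm{var}(y)\} & \text{if } S(i) \text{ is of the form } x = \mathrm{while}(y) \\
\{p(i)\} & \text{if } S(i) \text{ is of the form } x = f(p) \\
\emptyset & \text{otherwise}
\end{cases}
\]
Here \(\mathrm{succ}(S(i))\) is the set of all successor statements of \(S(i)\).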
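The flow functions above can be exercised on a toy control flow graph. The following is a minimal sketch, not Androsia's implementation: the class `LastUseFlowSketch`, the `Stmt` node type, and the four-statement sample method are all hypothetical, and Gen sets are written by hand instead of being derived from bytecode. It initializes every in-set to empty, propagates backward from successors, and stops at the fixed point.

```java
import java.util.*;

/** Toy backward data flow solver for the In/Out/Gen equations (illustrative names only). */
class LastUseFlowSketch {

    /** One statement node: its text, the variables it reads (Gen), and its successor indices. */
    static final class Stmt {
        final String text;
        final Set<String> gen;
        final List<Integer> succs;

        Stmt(String text, List<String> gen, List<Integer> succs) {
            this.text = text;
            this.gen = new TreeSet<>(gen);
            this.succs = succs;
        }
    }

    /** Iterates In(i) = Out(i) ∪ Gen(i) with Out(i) = ∪ In(successors) until nothing changes. */
    static List<Set<String>> solveIn(List<Stmt> cfg) {
        List<Set<String>> in = new ArrayList<>();
        for (int i = 0; i < cfg.size(); i++) {
            in.add(new TreeSet<>());                      // (1) initialize every in-set to empty
        }
        boolean changed = true;
        while (changed) {                                 // (3) repeat until a fixed point
            changed = false;
            for (int i = cfg.size() - 1; i >= 0; i--) {   // visit statements back to front
                Set<String> next = new TreeSet<>();
                for (int s : cfg.get(i).succs) {
                    next.addAll(in.get(s));               // (2) Out(i): union of successors' in-sets
                }
                next.addAll(cfg.get(i).gen);              // In(i) = Out(i) ∪ Gen(i)
                if (!next.equals(in.get(i))) {
                    in.set(i, next);
                    changed = true;
                }
            }
        }
        return in;
    }

    /** Hypothetical method body: a loop that sends a password, then returns. */
    static List<Stmt> sampleCfg() {
        return Arrays.asList(
            new Stmt("pwd = read()", Collections.<String>emptyList(), Arrays.asList(1)),
            new Stmt("while (n)", Arrays.asList("n"), Arrays.asList(2, 3)),
            new Stmt("send(pwd)", Arrays.asList("pwd"), Arrays.asList(1)),
            new Stmt("return", Collections.<String>emptyList(), Collections.<Integer>emptyList()));
    }

    public static void main(String[] args) {
        List<Stmt> cfg = sampleCfg();
        List<Set<String>> in = solveIn(cfg);
        for (int i = 0; i < in.size(); i++) {
            System.out.println("In(" + i + ") for '" + cfg.get(i).text + "' = " + in.get(i));
        }
    }
}
```

In this sample the in-set of the `return` statement is empty, so `pwd` is no longer needed once the loop exits. Because the equations as stated have no kill set, `pwd` also appears in the in-set of the statement that assigns it.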
AI Generated Summary
The talk presents a static code analysis tool, Androsia, designed to secure in-memory application data in Android applications by automatically clearing sensitive object contents after their last use. The tool addresses the misconception that Android’s garbage collector reliably removes sensitive data: unreachable objects containing secrets can persist in heap memory until system resource constraints trigger collection, and even then their contents are not guaranteed to be overwritten.
The core technique is a summary-based interprocedural data flow analysis implemented using the Soot framework. It performs a live variable analysis across the whole program to identify the last statement where a target object (initially StringBuilder) is used. This involves computing DEF and USE sets for each statement, then deriving live variable entry and exit sets via backward data flow. Method summaries store the last use point for local variables and static field references. A whole-program phase propagates these summaries in reverse topological order to determine the final last-use point for static fields, accounting for all call paths. The tool handles complex scopes (local, static, instance fields) and program structures like loops, adjusting instrumentation points to avoid logical errors—for example, clearing a static field only after all loop iterations complete.
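The reverse topological ordering mentioned above (callees summarized before their callers) can be sketched with a depth-first post-order over a call graph. This is an illustration under stated assumptions, not the tool's code: the class `CalleeFirstOrderSketch`, the method names in the sample graph, and the assumption that the call graph is acyclic (no recursion) are all hypothetical.

```java
import java.util.*;

/** Orders methods so that each callee comes before every method that calls it. */
class CalleeFirstOrderSketch {

    /** Depth-first post-order from the root; assumes the call graph has no cycles. */
    static List<String> calleeFirst(Map<String, List<String>> callGraph, String root) {
        List<String> order = new ArrayList<>();
        visit(root, callGraph, new HashSet<String>(), order);
        return order;
    }

    private static void visit(String method, Map<String, List<String>> callGraph,
                              Set<String> seen, List<String> order) {
        if (!seen.add(method)) {
            return;                                   // already ordered via another call path
        }
        List<String> callees = callGraph.get(method);
        if (callees != null) {
            for (String callee : callees) {
                visit(callee, callGraph, seen, order);
            }
        }
        order.add(method);                            // emitted only after all of its callees
    }

    /** Hypothetical call graph: caller mapped to the methods it invokes. */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("dummyMain", Arrays.asList("onCreate", "onClick"));
        g.put("onCreate", Arrays.asList("login"));
        g.put("onClick", Arrays.asList("login"));
        g.put("login", Arrays.asList("sendToken"));
        return g;
    }

    public static void main(String[] args) {
        System.out.println(calleeFirst(sampleGraph(), "dummyMain"));
    }
}
```

For the sample graph the order is `sendToken`, `login`, `onCreate`, `onClick`, `dummyMain`, so a summary for `login` would already exist when either of its two callers is analyzed.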
The tool operates in two modes: it can automatically instrument the bytecode to insert memory-clearing code (e.g., calling delete(0, length) on a StringBuilder) at the computed last-use point, or it can report the location for manual developer intervention. A demo using a sample app with static StringBuilder fields showed correct instrumentation placement even when usage patterns changed due to loops or additional calls. The analysis pipeline converts Dalvik bytecode to Soot’s Jimple IR, uses FlowDroid to generate a dummy main method simulating Android lifecycle callbacks, performs the analysis, and repackages the instrumented APK.
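For the manual mode, the effect of clearing a StringBuilder at its last-use point might look like the sketch below. The class `ClearAfterLastUseSketch` and its methods are hypothetical. The `delete(0, length())` call is the one named in the summary; the preceding overwrite loop is an added assumption, since `delete` is not documented to overwrite the builder's backing storage and the source does not say whether the tool does so.

```java
/** Hand-written version of clearing a StringBuilder right after its last use. */
class ClearAfterLastUseSketch {

    /** Overwrites every character (assumption), then empties the builder. */
    static void wipe(StringBuilder secret) {
        for (int i = 0; i < secret.length(); i++) {
            secret.setCharAt(i, '\0');            // assumption: scrub contents before deleting
        }
        secret.delete(0, secret.length());        // the clearing call named in the summary
    }

    /** Hypothetical method whose last use of the PIN is reading its length. */
    static int countDigits(StringBuilder pin) {
        int digits = pin.length();                // last use of pin in this method
        wipe(pin);                                // cleared immediately after the last use
        return digits;
    }

    public static void main(String[] args) {
        StringBuilder pin = new StringBuilder("1234");
        int digits = countDigits(pin);
        System.out.println(digits + " digits read, builder length now " + pin.length());
    }
}
```

Placing the `wipe` call by hand is only safe when no caller reads the builder afterwards, which is the question the whole-program phase answers for static fields.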
Practical implications include automating a critical security practice often overlooked by developers, reducing the risk of sensitive data exposure via heap dumps. While currently focused on StringBuilder, the approach is extensible to other object types. Work is ongoing to handle instance fields by tracking encapsulating objects and adding reset methods. The tool is not yet publicly released due to corporate open-sourcing procedures.