Hackers of India

Uncommon Sense: Detecting Exploits with Novel Hardware Performance Counters and ML Magic

By Harini Kannan, Nick Gregory on 05 Aug 2020 @ Blackhat



Abstract

In recent years, exploits like speculative execution, Rowhammer, and Return Oriented Programming (ROP) have been detected using hardware performance counters (HPCs). But to date, only relatively simple and well-understood counters have been used, representing just a tiny fraction of the information we can glean from the system. What’s worse, using only well-known counters as detectors for these attacks has a huge disadvantage: an attacker can easily bypass known counter-based detection techniques with minimal changes to existing sample exploit code.

If we want a viable future for exploit detection, we need to move beyond just scratching the surface of the HPC iceberg. Uncovering the treasure trove of overlooked and undocumented counters is necessary if we are to both build defenses against these attacks and anticipate how an adversary could bypass our defenses.

We’ll begin our journey by walking through our ML-based solution to more effective exploit detection. Using the entire corpus of performance counters for commonly used baseline programs and behaviorally-similar malicious programs, we zero in on the counters we want to use as features for our supervised classifiers. We will then interpret our models to determine how they can effectively detect various exploits using novel performance counters.
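To make the counter-selection step concrete, here is a minimal sketch of one way to rank counters by how well they separate benign from malicious runs. It is not the speakers' method: it swaps in a simple filter-style score (gap between class means divided by pooled standard deviation), and the counter names and the synthetic corpus are hypothetical stand-ins for real measurements.

```python
import random
import statistics


def separation_score(benign, malicious):
    """Gap between class means, in units of pooled standard deviation."""
    mean_gap = abs(statistics.fmean(benign) - statistics.fmean(malicious))
    pooled = ((statistics.pvariance(benign) + statistics.pvariance(malicious)) / 2) ** 0.5
    if pooled == 0:
        # No spread in either class: perfectly separable, or identical.
        return float("inf") if mean_gap > 0 else 0.0
    return mean_gap / pooled


def rank_counters(samples, labels):
    """Rank counters from most to least separating.

    samples: list of dicts mapping counter name -> observed count
    labels:  list of ints, 0 for a benign run and 1 for a malicious run
    """
    scores = {}
    for name in sorted(samples[0]):
        benign = [s[name] for s, y in zip(samples, labels) if y == 0]
        malicious = [s[name] for s, y in zip(samples, labels) if y == 1]
        scores[name] = separation_score(benign, malicious)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def make_synthetic_corpus(n_per_class=200, seed=0):
    """Hypothetical data: only counter_b shifts when the label is malicious."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            samples.append({
                "counter_a": rng.gauss(1000, 50),
                "counter_b": rng.gauss(500 + 300 * label, 40),
                "counter_c": rng.gauss(200, 20),
            })
            labels.append(label)
    return samples, labels


samples, labels = make_synthetic_corpus()
ranking = rank_counters(samples, labels)
top_features = [name for name, _ in ranking[:1]]
```

In a pipeline like the one described above, the top-ranked counters would then be used as features for a supervised classifier. A univariate score like this one cannot see interactions between counters, which is one reason interpreting a trained model can be more informative.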

Finally, we’ll showcase the uncommon and previously ignored performance counters that were lurking in the dark, with so much useful information. The results seen here will emphasize the need for documenting these counters, which were highly significant in our models for attack detection.

AI Generated Summary (may contain errors)

The speaker discusses the results of an experiment involving Intel VTune, in which a single support file was found to contain an event ID with a coarse new response and a helpful description of TVD (Translation Validation Detection). The speaker notes that this EF counter seems to be detecting CL flushes, but its exact purpose is unknown.
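For readers who want to experiment with an undocumented counter themselves, the sketch below builds a raw event string for Linux `perf`. It assumes that "EF" refers to event select 0xEF, which the summary does not confirm. The unit mask 0x01 is a hypothetical placeholder, and the helper names are illustrative. On x86, `perf` accepts raw events as `r` followed by the hexadecimal value of (umask << 8) | event select.

```python
def perf_raw_event(event_select, umask):
    """Encode an x86 event select and unit mask as a perf raw event string."""
    if not (0 <= event_select <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event_select and umask must each fit in one byte")
    config = (umask << 8) | event_select
    return "r{:x}".format(config)


def perf_stat_command(event_select, umask, program):
    """Build an argv list that counts the raw event while running `program`."""
    return ["perf", "stat", "-e", perf_raw_event(event_select, umask), "--"] + list(program)


# Assumed values: event select 0xEF with a placeholder unit mask of 0x01.
command = perf_stat_command(0xEF, 0x01, ["./target_program"])
```

The resulting list can be passed to `subprocess.run` on a machine where `perf` is installed. Whether a given event select and unit mask counts anything meaningful depends on the microarchitecture.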

The speaker presents several theories about what this counter might be doing, including:

  1. Detecting CL flushes: The EF counter could be detecting cache line (CL) flushes, which are a common operation in computer systems.

  2. Embedded stack pivot detection: The processor may keep special store buffers around for the stack since it’s used so commonly. When RSP changes, perhaps it has to invalidate and flush all of those, causing a flood of what are effectively CL flushes.

  3. Return stack buffer mispredict detection: If the caches are preemptively loading based on what’s in the return stack buffer, then every load misses the cache, because the return stack buffer is saying “load this” and then suddenly you’re not returning there.

The speaker emphasizes that these theories are uncertain and more research needs to be done. Future work includes generalizing and automating data collection for all microarchitectures, exploring other PMUs (Performance Monitoring Units), and investigating AMD and ARM architectures.

Lastly, the speaker invites others to share their ideas, run experiments, and collaborate on this topic.