Abstract

Counterfit is a generic automation framework for attacking machine learning models in blackbox settings. It provides a uniform command-line interface for custom attack algorithms and existing adversarial ML libraries (Adversarial Robustness Toolbox, TextAttack, etc.). Built on the Python cmd2 library, its features include:

Built-in scripting
Target creation wizards
Dynamic attack parameters
Post-attack reporting
A “scan” function for vulnerability type assessments
Simple, extensible, and hackable.

Counterfit aims to lower the barrier to entry for offensive security professionals who want to start exploring and attacking ML, either alongside normal operations as another tool in the box or as a vulnerability assessment function. Counterfit was developed by the Azure Trustworthy Machine Learning Red Team for the explicit purpose of attacking ML in production settings, and it can attack models regardless of their location or deployment complexity. Counterfit follows a workflow similar to that of well-known C2 frameworks, so offensive security professionals will feel right at home. Its architecture is simple, making it extensible and hackable by the security community.

Hackers of India

Counterfit: Attacking Machine Learning in Blackbox Settings

Raja Sekhar Rao Dheekonda , Will Pearce

2021/08/04
