Abstract
Counterfit is a generic automation framework for attacking machine learning models in black-box settings. It provides a uniform command-line interface for custom attack algorithms and existing adversarial ML libraries (Adversarial Robustness Toolbox, TextAttack, etc.). Built on the Python cmd2 library, its features include:
- Built-in scripting
- Target creation wizards
- Dynamic attack parameters
- Post-attack reporting
- A “scan” function for vulnerability type assessments
- A simple, extensible, and hackable design
Counterfit aims to lower the barrier to entry for offensive security professionals who want to start exploring and attacking ML, either alongside normal operations as another tool in the box or as a vulnerability assessment function. Counterfit was developed by the Azure Trustworthy Machine Learning Red Team for the explicit purpose of attacking ML in production settings, and it can attack models regardless of their location or deployment complexity. Counterfit follows a workflow similar to that of well-known C2 frameworks, so offensive security professionals will feel right at home. Its architecture is simple, making it extensible and hackable by the security community.
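
To make the idea of a uniform interface over black-box targets more concrete, here is a minimal sketch in plain Python. It is not Counterfit's actual API and does not use cmd2, ART, or TextAttack. Every name in it (Target, ATTACKS, register, scan, and the three toy attacks) is hypothetical and chosen only for illustration. It assumes a target can be reduced to a single prediction function, and it shows how a "scan" might run every registered attack against that target and report which ones changed the prediction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class Target:
    """Hypothetical black-box target: only its prediction function is visible."""
    name: str
    predict: Callable[[Sequence[float]], int]


# Hypothetical registry mapping attack names to attack functions.
AttackFn = Callable[[Target, Sequence[float]], List[float]]
ATTACKS: Dict[str, AttackFn] = {}


def register(name: str) -> Callable[[AttackFn], AttackFn]:
    """Decorator that adds an attack function to the registry under `name`."""
    def wrap(fn: AttackFn) -> AttackFn:
        ATTACKS[name] = fn
        return fn
    return wrap


@register("sign_flip")
def sign_flip(target: Target, x: Sequence[float]) -> List[float]:
    # Toy attack: negate every feature without querying the target.
    return [-v for v in x]


@register("shift")
def shift(target: Target, x: Sequence[float]) -> List[float]:
    # Toy query-based attack: lower all features in fixed steps until the
    # predicted label changes or the query budget (10 steps) runs out.
    original = target.predict(x)
    candidate = list(x)
    for _ in range(10):
        candidate = [v - 0.5 for v in candidate]
        if target.predict(candidate) != original:
            break
    return candidate


@register("tiny_nudge")
def tiny_nudge(target: Target, x: Sequence[float]) -> List[float]:
    # Toy attack that is too weak to matter for the demo target below.
    return [v + 0.01 for v in x]


def scan(target: Target, x: Sequence[float]) -> Dict[str, bool]:
    """Run every registered attack; True means the predicted label changed."""
    original = target.predict(x)
    return {
        name: target.predict(attack(target, x)) != original
        for name, attack in ATTACKS.items()
    }


# Stand-in for a deployed model: labels an input 1 if its features sum above 0.
toy = Target(name="toy_threshold", predict=lambda x: int(sum(x) > 0))

if __name__ == "__main__":
    for attack_name, succeeded in scan(toy, [1.0, 2.0]).items():
        print(f"{attack_name}: {'label changed' if succeeded else 'no change'}")
```

In this sketch, a new attack is one decorated function and a new target is one prediction callable. That is one plausible reading of "simple, extensible, and hackable" and is not a description of how Counterfit is implemented.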