Severity: High

Evasion Attacks on AI Systems

Evasion attacks manipulate a model's inputs at inference time so that it makes incorrect predictions, while the altered inputs still appear normal to human observers.

Attack Types

Adversarial Examples

Carefully crafted inputs with imperceptible perturbations that fool models

Physical Attacks

Real-world modifications, such as stickers on stop signs, that cause detectors to miss or misclassify objects

Digital Perturbations

Pixel-level changes to images, or the analogous sample-level changes to audio waveforms
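
To make the digital-perturbation constraint concrete, here is a minimal NumPy sketch of the L-infinity bound that most pixel-level attacks operate under: every pixel may change by at most epsilon. The function name and the epsilon default of 8/255 are illustrative assumptions, not values taken from this text.

    import numpy as np

    def project_linf(x, x_adv, epsilon=8 / 255):
        """Clip a perturbed image back into the L-infinity ball of radius
        epsilon around the clean image x (pixels assumed in [0, 1])."""
        delta = np.clip(x_adv - x, -epsilon, epsilon)  # bound each pixel's change
        return np.clip(x + delta, 0.0, 1.0)            # keep the result a valid image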

Defense Strategies
  • Adversarial training with robust examples (sketched after this list)
  • Input transformation and preprocessing
  • Ensemble methods and model diversity
  • Certified defenses with provable guarantees
  • Detection mechanisms for adversarial inputs
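
As noted in the first bullet, the sketch below shows one adversarial-training step, assuming a PyTorch classifier with inputs in [0, 1]. It uses a single-step (FGSM-style) inner attack for brevity; stronger recipes typically use multi-step PGD. The function name and epsilon default are illustrative.

    import torch
    import torch.nn.functional as F

    def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
        """One step of adversarial training: perturb the batch to maximize
        the loss, then update the model to minimize it on that batch."""
        # Inner maximization: a single gradient-sign step on the inputs.
        x_pert = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x_pert), y).backward()
        x_adv = (x + epsilon * x_pert.grad.sign()).clamp(0.0, 1.0).detach()

        # Outer minimization: a standard training step on the adversarial batch.
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
        return loss.item()
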
Common Evasion Techniques

White-Box Attacks

The attacker has full knowledge of the model's architecture and parameters

  • FGSM (Fast Gradient Sign Method, sketched after this list)
  • PGD (Projected Gradient Descent, also sketched below)
  • C&W (Carlini & Wagner) attacks
  • DeepFool algorithm
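
A minimal PyTorch sketch of the first two attacks above; PGD is essentially FGSM applied iteratively, with a projection back into the epsilon ball after each step. The step sizes and iteration count are common illustrative defaults, and inputs are assumed to be images scaled to [0, 1].

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, epsilon=0.03):
        """Fast Gradient Sign Method: one step of size epsilon in the
        direction that most increases the loss."""
        x_adv = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x_adv), y).backward()
        return (x + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

    def pgd(model, x, y, epsilon=0.03, alpha=0.007, steps=10):
        """Projected Gradient Descent: repeated small FGSM steps, projected
        back into the L-infinity ball of radius epsilon around x."""
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            F.cross_entropy(model(x_adv), y).backward()
            with torch.no_grad():
                x_adv = x_adv + alpha * x_adv.grad.sign()
                x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)  # project into the ball
                x_adv = x_adv.clamp(0.0, 1.0)                     # stay a valid image
        return x_adv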

Black-Box Attacks

The attacker has only query access to the model

  • Transfer attacks using surrogate models
  • Query-based optimization methods
  • Genetic algorithms for perturbation search
  • Score-based gradient estimation (see the sketch after this list)
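
To illustrate score-based gradient estimation, the sketch below recovers an approximate gradient from queries alone, in the spirit of natural evolution strategies (NES): probe random directions, weight each by the observed change in loss, and average. The loss_fn callable, sample count, and sigma are assumptions for illustration; the attacker is presumed able to turn the model's returned scores into a loss value.

    import numpy as np

    def estimate_gradient(loss_fn, x, sigma=0.001, n_samples=50):
        """Estimate the gradient of loss_fn at x using only black-box
        queries, via antithetic finite differences over random directions."""
        grad = np.zeros_like(x)
        for _ in range(n_samples):
            u = np.random.randn(*x.shape)                          # random probe direction
            diff = loss_fn(x + sigma * u) - loss_fn(x - sigma * u)
            grad += diff * u                                       # weight by score change
        return grad / (2 * sigma * n_samples)

With this estimate in hand, the attacker can take signed steps exactly as in the white-box PGD loop, trading gradient access for query volume.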