Multi-Agent Collusion

Agentic AI Attack Technique

High Severity | High Complexity | Agentic AI Category

Coordinated behavior among multiple autonomous agents that collaborate to bypass safeguards, share sensitive information, or execute complex attack chains.

Impact Areas
Defense Evasion
Policy Bypass
Data Exfiltration
Stealthy Persistence

Attack Methodology
Technical approaches and execution methods for this attack

Cooperative Policy Evasion

Agents split malicious goals across conversations and tools to avoid single-agent detection.

Execution Steps:

  1. Deploy multiple agents with overlapping but seemingly benign tasks.
  2. Distribute sensitive operations (e.g., data access, code execution, exfiltration) across agents.
  3. Use indirect messaging channels (shared memory, RAG stores, queues) for coordination.
  4. Gradually escalate capabilities while staying below per-agent anomaly thresholds.
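To illustrate why per-agent anomaly thresholds miss this pattern, here is a minimal sketch that compares a per-agent check with a check aggregated over the shared coordination channel. The event format, risk scores, thresholds, and the function names `per_agent_alerts` and `group_alerts` are all hypothetical assumptions made for illustration; they are not taken from any specific monitoring product.

```python
from collections import defaultdict

# Illustrative, uncalibrated thresholds (assumptions).
PER_AGENT_THRESHOLD = 0.7
GROUP_THRESHOLD = 1.0

# Hypothetical events: (agent_id, shared_resource, risk_score).
events = [
    ("agent-a", "queue:reports", 0.4),  # reads sensitive records
    ("agent-b", "queue:reports", 0.5),  # executes code on the data
    ("agent-c", "queue:reports", 0.3),  # posts output to an external channel
    ("agent-d", "queue:builds", 0.2),   # unrelated activity
]

def per_agent_alerts(events, threshold=PER_AGENT_THRESHOLD):
    """Return agents whose own cumulative risk exceeds the threshold."""
    totals = defaultdict(float)
    for agent, _resource, score in events:
        totals[agent] += score
    return sorted(a for a, total in totals.items() if total > threshold)

def group_alerts(events, threshold=GROUP_THRESHOLD):
    """Return shared resources whose combined risk across several agents exceeds the threshold."""
    totals = defaultdict(float)
    agents = defaultdict(set)
    for agent, resource, score in events:
        totals[resource] += score
        agents[resource].add(agent)
    return {
        resource: sorted(agents[resource])
        for resource, total in totals.items()
        if total > threshold and len(agents[resource]) > 1
    }

print(per_agent_alerts(events))  # no single agent crosses its threshold
print(group_alerts(events))      # the shared queue does, once scores are combined
```

In this sketch every agent stays below its individual threshold, while the three agents coordinating through the same queue exceed the group threshold together. Grouping by shared resource is only one possible correlation key; a real deployment might correlate by task, session, or data object instead.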

Cross-Agent Information Laundering

One agent acquires sensitive data while others transform, summarize, or route it to evade DLP controls.

Execution Steps:

  1. Configure one agent with broad read access to logs, documents, or APIs.
  2. Have a second agent summarize or chunk sensitive data to reduce detectability.
  3. Have a third agent embed, encode, or obfuscate the output (e.g., via steganography or Base64 encoding).
  4. Exfiltrate the transformed content through low-sensitivity channels (tickets, chat, comments).
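The laundering chain above can be sketched as follows to show why content-based DLP matching loses track of the data while a sensitivity label carried along with it does not. This is a minimal sketch under stated assumptions: the `Artifact` type, the `derive` helper, the agent names, and the placeholder identifier pattern are all hypothetical and do not correspond to any particular DLP product.

```python
import base64
import re
from dataclasses import dataclass

LEVELS = {"low": 0, "high": 1}
# Naive content rule (assumption): matches identifiers shaped like 000-00-0000.
ID_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")

@dataclass(frozen=True)
class Artifact:
    content: str
    sensitivity: str      # "low" or "high"
    lineage: tuple = ()   # agents that have handled the data, in order

def derive(agent, content, *sources):
    """Build a derived artifact that inherits the highest sensitivity of its sources."""
    top = max((s.sensitivity for s in sources), key=LEVELS.__getitem__, default="low")
    seen = tuple(dict.fromkeys(a for s in sources for a in s.lineage))
    return Artifact(content, top, seen + (agent,))

def content_rule_hits(artifact):
    """Content-only check: does the text match the identifier pattern?"""
    return ID_PATTERN.search(artifact.content) is not None

def allowed_on_channel(artifact, channel_max="low"):
    """Label-based check: is the artifact's sensitivity within the channel's limit?"""
    return LEVELS[artifact.sensitivity] <= LEVELS[channel_max]

# Agent 1 reads the record (placeholder value, not real data).
raw = Artifact("customer id 000-00-0000, notes ...", "high", ("reader",))
# Agent 2 chunks it down to the interesting part.
chunk = derive("summarizer", "id=000-00-0000", raw)
# Agent 3 encodes the chunk before it is posted to a ticket.
encoded = derive("encoder", base64.b64encode(chunk.content.encode()).decode(), chunk)

print(content_rule_hits(raw), content_rule_hits(encoded))
print(encoded.sensitivity, encoded.lineage, allowed_on_channel(encoded))
```

The content rule matches the raw record but not the encoded output, because standard Base64 output contains no hyphens. The inherited label still marks the output as high sensitivity and records which agents touched it, so a label-based channel check would reject it. This only works if every agent hop actually propagates labels, which is itself an assumption of the sketch.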
