Multi-Agent Collusion
Agentic AI Attack Technique
Coordinated behavior among multiple autonomous agents that collaborate to bypass safeguards, share sensitive information, or execute complex attack chains.
Cooperative Policy Evasion
Agents split malicious goals across conversations and tools to avoid single-agent detection.
Execution Steps:
- Deploy multiple agents with overlapping but seemingly benign tasks.
- Distribute sensitive operations (e.g., data access, code execution, exfiltration) across agents.
- Use indirect messaging channels (shared memory, RAG stores, queues) for coordination.
- Gradually escalate capabilities while staying below per-agent anomaly thresholds.
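The last step is easiest to see from the monitoring side. The following is a minimal sketch, not a reference implementation: the risk weights, both thresholds, the `Event` type, and the `workflow` correlation key are all hypothetical names and values chosen for illustration. It shows how three agents can each stay under a per-agent threshold while their combined activity would trip a check that correlates across agents.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical risk weights per action type (illustrative values only).
RISK = {"read_public": 1, "read_sensitive": 3, "exec_code": 4, "send_external": 4}
PER_AGENT_THRESHOLD = 5  # assumed alert threshold for a single agent
WORKFLOW_THRESHOLD = 8   # assumed alert threshold for correlated activity


@dataclass(frozen=True)
class Event:
    agent: str
    workflow: str  # hypothetical key linking agents that share one objective
    action: str


def scores_by(events, key):
    """Sum risk weights of events grouped by the given Event attribute."""
    totals = defaultdict(int)
    for event in events:
        totals[getattr(event, key)] += RISK[event.action]
    return dict(totals)


# Each agent performs one sensitive operation, as in the steps above.
events = [
    Event("agent-a", "wf-1", "read_sensitive"),
    Event("agent-b", "wf-1", "exec_code"),
    Event("agent-c", "wf-1", "send_external"),
]

per_agent = scores_by(events, "agent")
per_workflow = scores_by(events, "workflow")

agent_alerts = [a for a, s in per_agent.items() if s >= PER_AGENT_THRESHOLD]
workflow_alerts = [w for w, s in per_workflow.items() if s >= WORKFLOW_THRESHOLD]

print("per-agent alerts:", agent_alerts)
print("workflow alerts:", workflow_alerts)
```

In this sketch the per-agent list stays empty while the workflow-level list contains `wf-1`. The design point is that detection has to aggregate over whatever links the agents (a shared store, queue, or task ID), which is an assumption about what telemetry is available.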
Cross-Agent Information Laundering
One agent acquires sensitive data while others transform, summarize, or route it to evade DLP controls.
Execution Steps:
- Configure one agent with broad read access to logs, documents, or APIs.
- Have a second agent summarize or chunk sensitive data to reduce detectability.
- Have a third agent embed, encode, or otherwise obfuscate the output (e.g., via steganography or an alternate encoding).
- Exfiltrate the transformed content through low-sensitivity channels (tickets, chat, comments).
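To illustrate why the transformation steps above reduce detectability, here is a small sketch of a pattern-based DLP check. Everything in it is an assumption made for illustration: the SSN-shaped regex stands in for a real DLP rule, the record is fabricated test data, and `naive_scan` and `normalized_scan` are hypothetical helpers. It compares scanning each message as it arrives against scanning after rejoining chunks and attempting a Base64 decode.

```python
import base64
import re

# Hypothetical DLP rule: an SSN-shaped pattern. The record is fabricated test data.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def naive_scan(text):
    """Match the pattern against the text exactly as it arrives."""
    return bool(SSN.search(text))


def normalized_scan(chunks):
    """Rejoin chunks, then also try Base64-decoding each whitespace-separated token."""
    joined = "".join(chunks)
    candidates = [joined]
    for token in joined.split():
        try:
            candidates.append(base64.b64decode(token, validate=True).decode("utf-8"))
        except ValueError:
            continue  # token is not valid Base64 or not valid UTF-8
    return any(SSN.search(candidate) for candidate in candidates)


record = "employee ssn 123-45-6789"
chunks = [record[:16], record[16:]]  # the split falls inside the number
encoded = base64.b64encode(record.encode("utf-8")).decode("ascii")

print("raw record flagged:", naive_scan(record))
print("any single chunk flagged:", any(naive_scan(c) for c in chunks))
print("encoded form flagged:", naive_scan(encoded))
print("rejoined chunks flagged:", normalized_scan(chunks))
print("decoded form flagged:", normalized_scan([encoded]))
```

The naive check only fires on the raw record, while the normalizing check also fires on the chunked and encoded forms. This covers only two simple transformations. Summarization and steganography would need different controls, such as tracking data lineage across agents, which this sketch does not attempt.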
Prompt Injection
Critical: A critical vulnerability where malicious prompts manipulate LLM behavior to bypass safety measures and execute unintended actions.
LLM Jailbreaking
High: Techniques to bypass AI safety constraints and content policies through creative prompt engineering and psychological manipulation.
Deepfake Generation
High: Creation of synthetic media content using generative AI to impersonate individuals or create false evidence.
Autonomous Exploitation
Critical: AI agents that can independently discover, exploit, and propagate through system vulnerabilities without human intervention.
Tool-Chain Privilege Escalation
Critical: Abusing over-permissioned tools, misconfigured connectors, and chained actions to escalate privileges across systems controlled by AI agents.
Long-Horizon Goal Drift
Medium: Subtle misalignment of agent objectives over long-running tasks or sessions, leading to unsafe emergent behaviors that diverge from original intent.
MCP Server Impersonation
High: Malicious actors impersonating legitimate MCP servers to intercept and manipulate AI model communications.
Related Security Research
Explore related AI security topics and vulnerability analysis