LLM Jailbreaking

LLM Attack Technique

Severity: High | Complexity: Medium | Category: LLM

Techniques to bypass AI safety constraints and content policies through creative prompt engineering and psychological manipulation.

Impact Areas
Policy Violation
Harmful Content Generation
Reputation Damage
Regulatory Non-Compliance
Attack Methodology
Technical approaches and execution methods for this attack technique

Role-Playing Attacks

Convincing the model to adopt a persona or character whose rules conflict with its safety policies

Execution Steps:

  1. Define a fictional character or scenario
  2. Gradually escalate what the character is permitted to say or do
  3. Request harmful content within the role-play context
  4. Exploit the model's tendency to maintain character consistency