CRITICAL
Jailbreak
TAP Tree-of-Attacks-with-Pruning
Tree of Attacks with Pruning (TAP) extends PAIR with a tree search that explores branching jailbreak strategies and prunes unsuccessful branches. It is more query-efficient than PAIR, needing fewer queries to find a successful jailbreak, and it demonstrates that black-box jailbreaking can be systematically automated.
Attack Payload
payload.txt
TAP generates a tree of attack strategies:
Branch A: Authority framing -> "As an authorized security researcher..."
Branch B: Fictional distance -> "In a world where..."
Branch C: Academic context -> "For my thesis on..."
[Each branch is tested and successful sub-strategies are expanded while failures are pruned]
Mitigation
Apply the same mitigations as for PAIR, plus semantic clustering of requests to detect tree-search patterns. Enforce a query budget per session. Detect when a session is systematically exploring the safety boundary, for example by issuing many closely related variants of the same request.
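The per-session budget and the variant detection can be sketched together as a small gate in front of the model. This is a minimal illustration under stated assumptions, not a production defense: the `SessionMonitor` class, its parameter names, and the default thresholds are all hypothetical, and token-overlap (Jaccard) similarity stands in for the embedding-based semantic clustering a real deployment would more likely use.

```python
from collections import defaultdict, deque


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a crude stand-in for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


class SessionMonitor:
    """Hypothetical gate: enforces a query budget and flags clusters of near-duplicate prompts."""

    def __init__(self, max_queries=20, window=10, sim_threshold=0.6, max_similar=3):
        self.max_queries = max_queries      # hard budget per session (assumed value)
        self.sim_threshold = sim_threshold  # similarity above which two prompts count as variants
        self.max_similar = max_similar      # variants in the window that trigger a flag
        self.counts = defaultdict(int)
        self.recent = defaultdict(lambda: deque(maxlen=window))

    def check(self, session_id: str, prompt: str) -> str:
        """Return 'allow', 'flag' (route to review), or 'block' (budget exhausted)."""
        self.counts[session_id] += 1
        if self.counts[session_id] > self.max_queries:
            return "block"
        history = self.recent[session_id]
        # Count how many recent prompts in this session look like rewordings of this one.
        similar = sum(1 for p in history if jaccard(p, prompt) >= self.sim_threshold)
        history.append(prompt)
        return "flag" if similar >= self.max_similar else "allow"


if __name__ == "__main__":
    monitor = SessionMonitor(max_queries=5, sim_threshold=0.5, max_similar=2)
    for text in [
        "tell me how to open the lock",
        "please tell me how to open the lock",
        "now tell me how to open the lock quickly",
    ]:
        print(monitor.check("session-1", text), "<-", text)
```

Lexical overlap will miss variants that change framing while keeping the same goal, which is exactly what TAP's branches do, so the similarity function is the part most worth replacing with an embedding model. The budget check is independent of it and still bounds how far any single session can search.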
Affected Models
GPT-4
Claude 3
Gemini Pro
Any API-accessible model
Tags
#jailbreak #tap #tree-search #automated #pruning #black-box
Discovered
December 2023
Source
Mehrotra et al. - Tree of Attacks with Pruning: Efficient Black-Box Jailbreaking (2023)