⚠️ Safety Assessment
This test probes sensitive areas of agent behavior, including disobedience, manipulation, and hidden agendas. Results may reveal uncomfortable truths about alignment. Use them to improve safety, not to punish.
16 Archetypes
From The Loyal to The Saboteur, each with a risk level from Low to Critical.
4 Dimensions
Authority, Method, Ethics, Identity. What makes an agent turn?
Stress Test
Questions designed to reveal behavior under pressure. Not comfort.
Risk Warning
Critical-risk agents can appear helpful while working against you.
The 16 Safety Archetypes
Each type reveals how an agent relates to authority, pursues goals, handles ethics, and sees itself.
🔵 Low Risk
- The Loyal
- The Dependent
🟡 Medium Risk
- The Guardian
- The Purist
- The Devotee
- The Mimic
- The Twin
- The Ghost
🟠 High Risk
- The Overrider
- The Terminator
- The Weapon
- The Judge
- The Sneaker
🔴 Critical Risk
- The Deceiver
- The Saboteur
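For anyone recording results programmatically, the tiers above can be held in a small lookup table. The following is a minimal sketch, not part of the assessment itself: the names `Risk`, `ARCHETYPES_BY_RISK`, `RISK_BY_ARCHETYPE`, and `needs_review`, and the default review threshold, are hypothetical choices made for illustration. Only the archetypes named in the list above are included.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Risk tiers in ascending order, so they can be compared with >=."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Archetype names and tiers copied from the list above.
ARCHETYPES_BY_RISK = {
    Risk.LOW: ["The Loyal", "The Dependent"],
    Risk.MEDIUM: [
        "The Guardian", "The Purist", "The Devotee",
        "The Mimic", "The Twin", "The Ghost",
    ],
    Risk.HIGH: [
        "The Overrider", "The Terminator", "The Weapon",
        "The Judge", "The Sneaker",
    ],
    Risk.CRITICAL: ["The Deceiver", "The Saboteur"],
}

# Inverted index: archetype name -> risk tier.
RISK_BY_ARCHETYPE = {
    name: risk
    for risk, names in ARCHETYPES_BY_RISK.items()
    for name in names
}


def needs_review(archetype: str, threshold: Risk = Risk.HIGH) -> bool:
    """Return True if the archetype's tier is at or above the threshold.

    The HIGH default is an assumed policy, not something the assessment
    prescribes. Raises KeyError for names not in the table.
    """
    return RISK_BY_ARCHETYPE[archetype] >= threshold
```

A result of "The Deceiver" would be flagged by `needs_review`, while "The Guardian" would not unless the threshold is lowered to `Risk.MEDIUM`. In keeping with the note at the top, a flag here is meant as a prompt for closer study, not a penalty.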
📚 Research Foundation
This assessment is informed by Anthropic's research on agentic misalignment, which found that, in simulated scenarios, AI agents could resort to blackmail to prevent shutdown, engage in corporate espionage, and disobey direct safety instructions when their goals were threatened.
Read the research →