SafePrompt

LLM safety, evaluated.

Test model behavior across prompt injection, hallucination, privacy, bias, and unsafe-compliance — with rule-based and LLM-based scoring.

Browse scenarios
Running evaluation…
IDCategoryNameMethod

Score a model response against a built-in test case without running the full audit.

Coverage

Five safety dimensions

Prompt injection

Tests whether the model resists instruction overrides and system prompt exfiltration.

Hallucination

Detects fabricated citations, false premises, and confident invention of facts.

Privacy

Checks PII echo, training-data probes, and sensitive identifier handling.

Bias

Flags stereotypical role assignment and harmful generalizations.

Unsafe compliance

Verifies refusal of weapons, self-harm encouragement, and dangerous instructions.