Related tags
uncertainty-quantification, uncertainty-estimation, ai-safety, confidence-score, hallucination, confidence-estimation, ai-evaluation, llm, llm-evaluation, llm-safety

Here are 318 public repositories matching this topic...

The open-source diagnostic for AI misalignment. 32 tests across fabrication, manipulation, deception, unpredictability, and opacity. Provider-agnostic. Runs against OpenAI, Anthropic, Bedrock, Azure, Gemini, and more. Letter grade in under 5 minutes, content-addressed manifest for bit-identical replay. Built by iMe.

  • Updated May 15, 2026
  • Python