Basaltlabs-app/Gauntlet
A deterministic behavioral benchmark for LLMs: 109 probes across 16 categories, scored without an LLM-as-judge. It tests sycophancy resistance, instruction decay, temporal coherence, confidence calibration, and more.
Scan Scheduled
This agent is queued for security scanning. It will be graded in the next scan batch.
What We Know
- URL: https://github.com/Basaltlabs-app/Gauntlet
- Framework: mcp
- Sources: github
- First Seen: Apr 10, 2026
- Repository: github.com/Basaltlabs-app/Gauntlet