LLM eval rubric
Builds a structured eval rubric to grade LLM outputs for a specific task.
Stops vibes-based "did the AI do good?" review.
AI / ML, evals, testing
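As a rough illustration of what a "structured eval rubric" could look like in practice, here is a minimal Python sketch. It is not this tool's actual output format. The `Criterion` class, the `weighted_score` helper, the example criteria, and the 0-4 scale are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One graded dimension of an LLM output (hypothetical structure)."""
    name: str
    description: str
    weight: float        # relative importance of this criterion
    max_score: int = 4   # scores run from 0 to max_score inclusive


def weighted_score(rubric, scores):
    """Combine per-criterion scores into one normalized score in [0, 1]."""
    total_weight = sum(c.weight for c in rubric)
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive number")
    total = 0.0
    for c in rubric:
        if c.name not in scores:
            raise KeyError(f"missing score for criterion: {c.name}")
        s = scores[c.name]
        if not 0 <= s <= c.max_score:
            raise ValueError(f"score for {c.name} is outside 0..{c.max_score}")
        # Scale each score to [0, 1] before applying its weight.
        total += c.weight * s / c.max_score
    return total / total_weight


# Example rubric for an assumed summarization task.
RUBRIC = [
    Criterion("accuracy", "Claims match the source text.", weight=2.0),
    Criterion("completeness", "Covers all key points.", weight=1.0),
    Criterion("tone", "Matches the requested register.", weight=1.0),
]

if __name__ == "__main__":
    grades = {"accuracy": 4, "completeness": 2, "tone": 4}
    print(weighted_score(RUBRIC, grades))
```

Writing down criteria, weights, and a fixed scale up front is what makes grades comparable across outputs and reviewers, which is the contrast with an unstructured "did the AI do good?" judgment.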
Or run in your favourite chatbot
Clicking copies the prompt to your clipboard and opens the chatbot in a new tab. Gemini doesn't accept URL params, so paste manually with Ctrl/Cmd+V.