LLM eval rubric

Builds a structured eval rubric to grade LLM outputs for a specific task.

Replaces vibes-based "did the AI do good?" review with explicit grading criteria.
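As a rough illustration of what "structured" can mean here, a rubric can be modelled as a list of named, weighted criteria, each graded on a fixed scale and combined into one normalized score. The sketch below is an assumption for illustration only, not the format this tool produces: the `Criterion` class, the `weighted_score` function, the 0 to 4 scale, and the three example criteria are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One row of a hypothetical rubric (names and scale are assumptions)."""
    name: str
    description: str
    weight: float        # relative importance; weights need not sum to 1
    max_score: int = 4   # each criterion is graded on 0..max_score


def weighted_score(rubric: list[Criterion], grades: dict[str, int]) -> float:
    """Combine per-criterion grades into a single score in [0, 1]."""
    total_weight = sum(c.weight for c in rubric)
    if total_weight <= 0:
        raise ValueError("rubric needs a positive total weight")
    score = 0.0
    for c in rubric:
        grade = grades[c.name]  # raises KeyError if a criterion was not graded
        if not 0 <= grade <= c.max_score:
            raise ValueError(f"grade for {c.name!r} must be in 0..{c.max_score}")
        score += c.weight * grade / c.max_score
    return score / total_weight


# Example rubric for an imagined summarization task.
RUBRIC = [
    Criterion("accuracy", "No claims that contradict the source.", weight=2.0),
    Criterion("completeness", "Covers every key point of the source.", weight=1.0),
    Criterion("tone", "Matches the requested register.", weight=1.0),
]

if __name__ == "__main__":
    grades = {"accuracy": 4, "completeness": 2, "tone": 4}
    print(weighted_score(RUBRIC, grades))
```

Writing the criteria down with weights forces the reviewer to decide in advance what matters and by how much, which is the part a vibes-based review skips.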

AI / ML, evals, testing

