7BBusyBoss

OpenAI Evals

by openai

OpenAI's framework for benchmarking LLMs, paired with an open-source registry of evals. A widely used test harness for model evaluation.

16,000 stars · 🍴 0 forks · Python · benchmark · evaluation · openai