Methodology
Evals are scored from persisted run cases, scorecards, regressions, and release gate rules. Missing inputs are skipped rather than treated as success.
Run cases
A run case is evaluated locally only if it has a persisted actualOutput, output, or response field.
Unsupported graders
Model grading and unconfigured custom grading fail closed until real providers are configured.
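The two rules above can be illustrated with a minimal sketch. It is not the actual implementation: the field names actualOutput, output, and response come from the text, while the lookup order, the grader and expected keys, the LOCAL_GRADERS table, and the score_case function are all hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Mapping, Optional

# Field names come from the text; the lookup order is an assumption.
OUTPUT_FIELDS = ("actualOutput", "output", "response")

# Hypothetical local graders that need no external provider.
LOCAL_GRADERS = {
    "exact": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: str(expected) in str(actual),
}

# A provider is any callable that judges (actual, expected); assumed shape.
Provider = Callable[[Any, Any], bool]


def resolve_output(run_case: Mapping[str, Any]) -> Optional[Any]:
    """Return the first persisted output field, or None if there is none."""
    for field in OUTPUT_FIELDS:
        value = run_case.get(field)
        if value is not None:
            return value
    return None


def score_case(run_case: Mapping[str, Any],
               provider: Optional[Provider] = None) -> str:
    """Return 'skipped', 'passed', or 'failed' for one run case."""
    actual = resolve_output(run_case)
    if actual is None:
        # Missing input: skip the case, never count it as a success.
        return "skipped"

    kind = run_case.get("grader", "exact")  # hypothetical key and default
    expected = run_case.get("expected")     # hypothetical key

    if kind in LOCAL_GRADERS:
        ok = LOCAL_GRADERS[kind](actual, expected)
    elif kind in ("model", "custom") and provider is not None:
        ok = provider(actual, expected)
    else:
        # Fail closed: no configured provider means the case cannot pass.
        ok = False
    return "passed" if ok else "failed"
```

For example, score_case({"response": "a", "expected": "a", "grader": "model"}) returns "failed" when no provider is passed, and the same case passes once a provider callable is supplied. Keeping "skipped" distinct from "failed" lets a release gate report missing inputs separately from real regressions.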