Methods is the stable reference page for how AI Chess evaluates work: what counts as a trustworthy result, how benchmarks are framed, and what a release needs before it is worth publishing.
## Verification before celebration
Performance claims are cheap when the correctness story is vague. The site favors reproducible validation, known-good baselines, and enough runtime context to interpret an outlier without guessing.
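As a concrete illustration of that ordering, here is a minimal Python sketch in which correctness is checked against a known-good baseline before any timing is collected, and the runtime context travels with the numbers. The `candidate` and `baseline` callables and the position set are hypothetical stand-ins, not part of any real harness.

```python
import platform
import statistics
import time

def validate_then_time(candidate, baseline, positions, trials=5):
    """Check correctness against a known-good baseline, then time.

    `candidate` and `baseline` are hypothetical callables mapping a
    position to a move; `positions` is a shared test set.
    """
    # Correctness first: a speedup that changes answers is not a result.
    mismatches = [p for p in positions if candidate(p) != baseline(p)]
    if mismatches:
        raise AssertionError(f"{len(mismatches)} positions diverge from baseline")

    # Only then collect timings, keeping runtime context attached.
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        for p in positions:
            candidate(p)
        samples.append(time.perf_counter() - start)

    return {
        "median_s": statistics.median(samples),
        "spread_s": max(samples) - min(samples),
        "machine": platform.platform(),
        "python": platform.python_version(),
    }
```

Failing loudly on a divergence keeps a fast-but-wrong branch from ever producing a headline number.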
## Benchmark discipline
- Keep hardware context attached to reported runs
- Separate trusted baselines from active experiment branches
- Record anomalies instead of smoothing them away; the sketch after this list shows one way to carry all three of these in a single run record
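One lightweight way to honor all three rules is to make them fields of the run record itself, so a number cannot be reported without its context. The schema below is an illustrative sketch, not a committed format; every field name is an assumption.

```python
from dataclasses import asdict, dataclass, field
import json
import platform

@dataclass
class BenchRun:
    """One reported run; field names are illustrative, not a fixed schema."""
    engine: str                 # engine or tool under test
    branch: str                 # e.g. "main" vs an active experiment branch
    is_trusted_baseline: bool   # trusted baselines never mix with experiments
    nodes_per_second: float     # the headline number
    hardware: str = field(default_factory=platform.processor)
    os: str = field(default_factory=platform.platform)
    anomalies: list = field(default_factory=list)  # recorded, never smoothed away

run = BenchRun(
    engine="example-engine",
    branch="search-tweak",
    is_trusted_baseline=False,
    nodes_per_second=1.2e6,
)
run.anomalies.append("second trial 30% slower; thermal throttling suspected")
print(json.dumps(asdict(run), indent=2))
```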
## Publishing rules
- Research entries should explain the question, the method, and the current conclusion
- Tool entries should carry versioning, release notes, and checksums when binaries are posted (a checksum sketch follows this list)
- Field Notes can be shorter, but they should still leave behind something concrete
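For the checksum rule, a SHA256SUMS-style manifest generated at release time is enough for readers to verify a download. This is a minimal sketch using Python's standard `hashlib`; the filename is a placeholder, not a real release artifact.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large binaries never load whole into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# One SHA256SUMS-style line per posted binary.
for binary in [Path("ai-chess-tool-v1.0.0.bin")]:
    print(f"{sha256_of(binary)}  {binary.name}")
```

Readers can then verify a download with `sha256sum -c SHA256SUMS` or by recomputing the digest themselves.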
Replace this seeded page with the exact methodology you want readers and collaborators to trust.