arstechnica.com/ai/2024/11/new-secret-math-benchmark-stumps-ai-models-and-phds-alike/ mentions what the official website fails to state clearly:
"The design of FrontierMath differs from many existing AI benchmarks because the problem set remains private and unpublished to prevent data contamination."
So yeah, fuck off.