AGI Benchmark: Assessing AI Models for Catastrophic Harm
Scientists at OpenAI have developed a benchmark called MLE-bench to evaluate how well artificial intelligence (AI) agents perform machine learning engineering tasks and, by extension, whether they could modify their own code and enhance their capabilities without human intervention. This...