Tag: Artificial intelligence benchmarking

AGI Benchmark: Assessing AI Models for Catastrophic Harm

Scientists at OpenAI have developed MLE-bench, a new set of tests that evaluates whether artificial intelligence (AI) agents can modify their own code and enhance their capabilities without human intervention. This...