Measuring progress toward AGI: A cognitive framework

We’re introducing a framework to measure progress toward AGI, and launching a Kaggle hackathon to build the relevant evaluations.

Oran Kelly

Product Manager, Google DeepMind

General summary

Google DeepMind wants to use cognitive science to help measure progress toward Artificial General Intelligence (AGI). Their new paper, "Measuring Progress Toward AGI: A Cognitive Taxonomy," presents a framework for understanding AI systems' cognitive capabilities. You can participate by designing evaluations for key cognitive abilities in their Kaggle hackathon, which has a total prize pool of $200,000.

Summaries were generated by Google AI. Generative AI is experimental.

Artificial General Intelligence (AGI) has the potential to accelerate scientific discovery and help solve some of humanity’s most pressing problems. But it can be difficult to know how close we are to this key milestone, because there’s a lack of empirical tools for evaluating systems’ general intelligence. Tracking progress toward AGI will require a wide range of methods and approaches, and we believe cognitive science provides one important piece of the puzzle.

That’s why today, we’re releasing a new paper, “Measuring Progress Toward AGI: A Cognitive Taxonomy,” that presents a scientific foundation for understanding the cognitive capabilities of AI systems.

Alongside the paper, we are partnering with Kaggle to launch a hackathon, inviting the research community to help build the evaluations needed to put this framework into practice.

Deconstructing general intelligence

Our framework draws on decades of research from psychology, neuroscience and cognitive science to develop a cognitive taxonomy. It identifies 10 key cognitive abilities that we hypothesize will be important for general intelligence in AI systems:

Perception: extracting and processing sensory information from the environment
Generation: producing outputs such as text, speech and actions
Attention: focusing cognitive resources on what matters
Learning: acquiring new knowledge through experience and instruction
Memory: storing and retrieving information over time
Reasoning: drawing valid conclusions through logical inference
Metacognition: knowledge and monitoring of one's own cognitive processes
Executive functions: planning, inhibition and cognitive flexibility
Problem solving: finding effective solutions to domain-specific problems
Social cognition: processing and interpreting social information and responding appropriately in social situations

Figure: ten bubbles, each listing one cognitive faculty, all connected to a central bubble labeled "Cognitive faculties".

To understand AI capabilities across these cognitive abilities, we propose a three-stage evaluation protocol that benchmarks system performance in relation to human capabilities:

Evaluate AI systems across a broad suite of cognitive tasks covering each ability, using held-out test sets to prevent data contamination
Collect human baselines for the same tasks from a demographically representative sample of adults
Map each AI system’s performance relative to the distribution of human performance in each ability
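
As a rough illustration of how the three stages fit together, the sketch below averages an AI system's task scores within each ability and then locates that average within a set of human baseline scores. The percentile mapping, the function names and all of the numbers are assumptions made for this example; the paper may aggregate and compare scores differently.

```python
from bisect import bisect_left, bisect_right
from statistics import mean


def human_percentile(ai_score: float, human_scores: list[float]) -> float:
    """Percentile rank of ai_score within the human baseline (ties count half)."""
    ordered = sorted(human_scores)
    below = bisect_left(ordered, ai_score)
    ties = bisect_right(ordered, ai_score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def cognitive_profile(
    ai_results: dict[str, list[float]],
    human_results: dict[str, list[float]],
) -> dict[str, float]:
    """Map each ability to the AI system's percentile among human scores."""
    profile = {}
    for ability, task_scores in ai_results.items():
        # Stage 1: aggregate the AI system's scores on held-out tasks.
        ai_score = mean(task_scores)
        # Stages 2 and 3: compare against the human baseline for that ability.
        profile[ability] = human_percentile(ai_score, human_results[ability])
    return profile


# Hypothetical scores (percent correct), invented purely for illustration.
ai_results = {
    "memory": [90.0, 80.0],
    "metacognition": [40.0, 60.0],
}
human_results = {
    "memory": [50.0, 60.0, 70.0, 80.0],
    "metacognition": [50.0, 60.0, 70.0, 80.0],
}

profile = cognitive_profile(ai_results, human_results)
print(profile)
```

Reporting one value per ability, rather than a single overall score, keeps the profile uneven where the system is uneven: in this made-up data the system exceeds every human score on memory but sits near the bottom of the distribution on metacognition.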

Going from theory to practice

Defining these cognitive abilities is a crucial first step, but we need more than a framework to measure progress. To put this theory into practice, we are launching a new Kaggle hackathon, “Measuring progress toward AGI: Cognitive abilities.” The hackathon encourages the community to design evaluations for the five cognitive abilities where the evaluation gap is largest: learning, metacognition, attention, executive functions and social cognition.

Participants can use Kaggle's newly launched Community Benchmarks platform to build and test their evaluations against a lineup of frontier models.

We are offering a total prize pool of $200,000: $10,000 awards for the top two submissions in each of the five tracks, and $25,000 grand prizes for the four best overall submissions. Submissions are open March 17 through April 16, and we’ll announce the results on June 1. Head over to the Kaggle website to start building.
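
For readers who want to check the prize arithmetic, here is a short sketch that adds up the awards as described above. The variable names are chosen for this example, and it only totals the stated amounts; it makes no assumption about whether one submission can win both a track award and a grand prize.

```python
# The five hackathon tracks named in the announcement.
TRACKS = [
    "learning",
    "metacognition",
    "attention",
    "executive functions",
    "social cognition",
]

TRACK_AWARD = 10_000      # per winning submission in a track
AWARDS_PER_TRACK = 2      # top two submissions in each track
GRAND_PRIZE = 25_000      # per grand prize
GRAND_PRIZE_COUNT = 4     # four best overall submissions

track_total = len(TRACKS) * AWARDS_PER_TRACK * TRACK_AWARD
grand_total = GRAND_PRIZE_COUNT * GRAND_PRIZE
prize_pool = track_total + grand_total

print(f"Track awards: ${track_total:,}")
print(f"Grand prizes: ${grand_total:,}")
print(f"Total pool:   ${prize_pool:,}")
```

The track awards and the grand prizes each account for half of the pool.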
