Score contribution: google/gemma-4-31B-it

### Model
google/gemma-4-31B-it

### Public score sources
- GPQA Diamond: 84.3 — https://huggingface.co/google/gemma-4-31B-it
- HLE (Humanity's Last Exam): 19.5 — https://huggingface.co/google/gemma-4-31B-it
- AIME 2026: 89.2 — https://huggingface.co/google/gemma-4-31B-it
- MMLU-Pro: 85.2 — https://huggingface.co/google/gemma-4-31B-it
- MMMLU: 85 — https://huggingface.co/google/gemma-4-31B-it
- BigBench Hard (BBH): 91.5 — https://huggingface.co/google/gemma-4-31B-it
- MathVision: 85.6 — https://huggingface.co/google/gemma-4-31B-it
- OmniDocBench 1.5: 91 — https://huggingface.co/google/gemma-4-31B-it
- OpenAI MRCR v2 (8-needle): 66.4 — https://huggingface.co/google/gemma-4-31B-it
- τ³-Bench: 76.9 — https://huggingface.co/google/gemma-4-31B-it
- LiveCodeBench: 80 — https://huggingface.co/google/gemma-4-31B-it

### BenchPress output
- SWE-Lancer IC SWE Diamond Freelance ($): 64363.3
- Vending-Bench 2: 6236.9
- Codeforces Rating: 2315.1
- GDPval (Artificial Analysis ELO): 1776
- Chatbot Arena Elo: 1414.5
- MATH-500: 98.5
- GSM8K: 97.5
- COLLIE: 97.1
