Platform Core Capabilities & Examples

Highlights across authoritative benchmarks and real-world examples.

ARC-AGI-2 Results

On the ultra-difficult ARC-AGI-2 general intelligence test, Gemini 3.0 with thinking mode reaches roughly 35% accuracy, while the other models compared stay below 20%.

On the notoriously hard “Humanity's Last Exam (HLE)” benchmark, Gemini 3.0 scores 32.4%, outperforming GPT-5 (high) and Grok 4.

Gemini 3.0 handles image generation, including SVGs, with ease. Gemini 3.0 Pro's output on the well-known "pelican riding a bicycle" SVG test wowed the community.

Renders that were previously hard, such as a Gundam robot and a Switch controller, now look visibly upgraded and come very close to real product photos.

Gemini 3.0 Pro leads the coding arena by a wide margin.

The article goes as far as calling Gemini 3.0 truly super-intelligent; at minimum, the series makes a clear leap forward.

Gemini 3.0 Pro generated an image from the prompt “the Power Rangers standing in the scene with typical poses the power rangers do”.

Reference: “Gemini 3 internal tests praised as possibly the best frontend dev model”. Examples and numbers are compiled from the article and community tests.