PPL Circuit Coverage Perplexity Gap

Completed gap report for Marin 32B vs Qwen3 32B, with span heatmaps generated from the local scored-document parquet files. A positive gap means Marin is worse than Qwen3; a negative gap means Marin is better.
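The sign convention can be illustrated with a minimal sketch. It assumes the gap is a difference of byte-normalized losses (bits per byte), with Marin as model_a and Qwen3 as model_b. The report does not state the unit or the exact formula, so the normalization and the function names `bits_per_byte` and `perplexity_gap` are hypothetical.

```python
import math


def bits_per_byte(total_nll_nats: float, num_bytes: int) -> float:
    """Convert a summed negative log-likelihood in nats to bits per byte.

    Assumption: losses are normalized by UTF-8 byte count, not token count,
    so models with different tokenizers can be compared.
    """
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    return total_nll_nats / (math.log(2) * num_bytes)


def perplexity_gap(marin_nll_nats: float, qwen_nll_nats: float, num_bytes: int) -> float:
    """Hypothetical gap: model_a (Marin) minus model_b (Qwen3), in bits per byte.

    Positive -> Marin assigns lower probability to the text (worse).
    Negative -> Marin assigns higher probability to the text (better).
    """
    return bits_per_byte(marin_nll_nats, num_bytes) - bits_per_byte(qwen_nll_nats, num_bytes)


# Illustrative numbers only, not taken from the report.
example_gap = perplexity_gap(
    marin_nll_nats=200 * math.log(2),  # 2.0 bits per byte over 100 bytes
    qwen_nll_nats=150 * math.log(2),   # 1.5 bits per byte over 100 bytes
    num_bytes=100,
)
print("Marin is worse" if example_gap > 0 else "Marin is better or tied")
```

Subtracting model_b from model_a keeps the sign aligned with the report's wording: the value grows as Marin falls further behind Qwen3 on the same bytes.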
Source: gs://marin-us-central1/analysis/perplexity_gap/ppl_circuit_coverage/marin_32b-vs-qwen3_32b-d0e561

Marin 32B vs Qwen3 32B

ppl_circuit_coverage · 129,648 docs · 3,977,331 bytes

model_a: marin-community/marin-32b-base
model_b: Qwen/Qwen3-32B

Dataset Slices

Name  Gap  Bar  Marin  Qwen  Docs  Bytes

Span Types

Dataset Groups

Name  Gap  Bar  Marin  Qwen  Docs  Bytes