Performance

| Model | Quant | Size | RADV pp | RADV tg | AMDVLK pp | AMDVLK tg |
|---|---|---|---|---|---|---|
| Gemma-4-E2B | UD-Q4_K_XL | 2.9 GiB | 3382 | 109 | 954 | 99 |
| Gemma-4-E4B | UD-Q4_K_XL | 4.7 GiB | 1828 | 59 | 491 | 57 |
| GPT-OSS-20B-Derestricted | MXFP4 | 11 GiB | 1405 | 77 | 1380 | 77 |
| Qwen3.5-4B | Q8_0 | 4 GiB | 1375 | 37.9 | 510 | 40.0 |
| Gemma-4-26B-A4B | Q8_0 | 25 GiB | 1303 | 47.7 | 790 | 47.1 |
| Qwen3-30B-Instruct-2507 | UD-Q4_K_XL | 16.5 GiB | 1143 | 92 | 936 | 93 |
| Nemotron-3-Nano-30B-A3B New | UD-Q4_K_XL | 21.3 GiB | 1106 | 68 | 776 | 65 |
| Qwen3.5-35B-A3B | Unsloth UD-Q4_K_XL | 21 GiB | 1017 | 59.8 | 686 | 60.4 |
| GLM-4.7-Flash New | UD-Q4_K_XL | 16.3 GiB | 990 | 73 | 529 | 70 |
| Qwen3.5-9B | UD-Q4_K_XL | 5.6 GiB | 972 | 35.7 | 289 | 35.2 |
| Nemotron-Cascade-2-30B-A3B | Q8_0 | 31 GiB | 968 | 54 | | |
| Kimi-Linear-48B-A3B New | Q4_K_M | 28 GiB | 789 | 72 | 574 | 72 |
| Ministral-3-14B New | UD-Q4_K_XL | 7.8 GiB | 696 | 25 | 173 | 25 |
| GPT-OSS-120B | MXFP4 | 59 GiB | 596 | 56 | 661 | 53 |
| Qwen3-Coder-Next-80B | MXFP4 | 41 GiB | 586 | 40 | 462 | 43 |
| Magistral-Small-2509 New | UD-Q4_K_XL | 13.5 GiB | 389 | 15 | 94 | 15 |
| Devstral-Small-2-24B New | UD-Q4_K_XL | 13.5 GiB | 382 | 15 | 94 | 15 |
| Qwen3.5-27B | UD-Q4_K_XL | 16 GiB | 310 | 12.1 | 86 | 11.9 |
| Qwen3.5-122B-A10B | Unsloth UD-Q4_K_XL | 72 GiB | 287 | 22.4 | 197 | 21.9 |
| MiniMax-M2.5 New | Unsloth UD-Q3_K_XL | 94 GiB | 179 | 22 | 164 | 32 |
| Gemma-4-31B | Unsloth UD-Q4_K_XL (Apr 11) | 17.5 GiB | 261 | 11.1 | 70.8 | 11.1 |
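
As a rough way to read the two driver columns against each other, the sketch below divides the RADV pp figure by the AMDVLK pp figure for a few rows copied from the table. The helper name `radv_speedup` and the choice of rows are illustrative only and are not part of the benchmark itself.

```python
# pp figures (RADV, AMDVLK) copied from three rows of the performance table.
# The selection of rows is arbitrary and only for illustration.
PP = {
    "Gemma-4-E2B": (3382, 954),
    "GPT-OSS-20B-Derestricted": (1405, 1380),
    "GPT-OSS-120B": (596, 661),
}


def radv_speedup(radv: float, amdvlk: float) -> float:
    """Return how many times faster RADV is than AMDVLK (values below 1 mean slower)."""
    if amdvlk <= 0:
        raise ValueError("AMDVLK figure must be positive")
    return radv / amdvlk


for model, (radv, amdvlk) in PP.items():
    print(f"{model}: {radv_speedup(radv, amdvlk):.2f}x")
```

A ratio above 1 means RADV processed prompts faster for that model, and a ratio below 1 (as for GPT-OSS-120B) means AMDVLK was faster.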

Quality

| Model | Writing /30 | LRU /10 | FastAPI /8 | LeetCode /59 | Polyglot /65 | Postgres /57 | Cassandra /56 | Combined /285 |
|---|---|---|---|---|---|---|---|---|
| Gemma-4-31B | 27 | 10 | 8 | 59 | 15 | 44 | 38 | 201 |
| Gemma-4-26B-A4B | 28 | 10 | 8 | 59 | 15 | 45 | 29 | 194 |
| Qwen3.5-122B-A10B | 29 | 10 | 8 | 59 | 13 | 36 | 34 | 192 |
| MiniMax-M2.5 New | 26 | 10 | 7 | 59 | 13 | 40 | 30 | 185 |
| GPT-OSS-120B | 20 | 10 | 8 | 59 | 14 | 40 | 31 | 182 |
| Qwen3.5-35B-A3B | 28 | 10 | 8 | 59 | 8 | 32 | 33 | 178 |
| Kimi-Linear-48B-A3B New | 30 | 10 | 8 | 57 | 22 | 26 | 24 | 177 |
| Qwen3.5-27B | 25 | 10 | 8 | 59 | 10 | 34 | 29 | 175 |
| Qwen3-30B-Instruct-2507 | 30 | 10 | 2 | 59 | 13 | 27 | 31 | 172 |
| Qwen3-Coder-Next-80B | 26 | 10 | 2 | 59 | 9 | 33 | 32 | 171 |
| Devstral-Small-2-24B New | 27 | 10 | 2 | 59 | 11 | 29 | 31 | 169 |
| GPT-OSS-20B-Derestricted | 13 | 10 | 8 | 59 | 14 | 37 | 23 | 164 |
| Gemma-4-E2B | 18 | 7 | 8 | 59 | 7 | 27 | 27 | 153 |
| Gemma-4-E4B | 23 | 7 | 8 | 59 | 3 | 26 | 26 | 152 |
| Ministral-3-14B New | 26 | 2 | 2 | 59 | 16 | 23 | 18 | 146 |
| Nemotron-Cascade-2-30B-A3B | 18 | 10 | 8 | 59 | 1 | 22 | 21 | 139 |
| Qwen3.5-9B | 20 | 10 | 0 | 51 | 5 | 28 | 22 | 136 |
| Qwen3.5-4B | 16 | 9 | 8 | 54 | 3 | 17 | 16 | 123 |
| Magistral-Small-2509 New | 20 | 0 | 8 | 30 | 2 | 12 | 35 | 107 |
| Nemotron-3-Nano-30B-A3B New | 20 | 4 | 0 | 46 | 7 | 16 | 16 | 109 |
| GLM-4.7-Flash New | 14 | 0 | 0 | 16 | 0 | 23 | 27 | 80 |
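
The Combined column appears to be the unweighted sum of the seven per-test scores, since the per-test maxima in the header add up to 285. The sketch below works under that assumption and reproduces the Gemma-4-31B row. The names `MAX_SCORES` and `combined_score` are hypothetical helpers, not part of the benchmark tooling.

```python
# Per-test maxima as given in the quality table header.
MAX_SCORES = {
    "Writing": 30,
    "LRU": 10,
    "FastAPI": 8,
    "LeetCode": 59,
    "Polyglot": 65,
    "Postgres": 57,
    "Cassandra": 56,
}


def combined_score(scores: dict[str, int]) -> int:
    """Sum the per-test scores, assuming Combined is an unweighted total."""
    if set(scores) != set(MAX_SCORES):
        raise ValueError("scores must cover exactly the seven tests")
    for test, value in scores.items():
        if not 0 <= value <= MAX_SCORES[test]:
            raise ValueError(f"{test} score {value} is out of range")
    return sum(scores.values())


# Scores copied from the Gemma-4-31B row.
gemma_4_31b = {
    "Writing": 27,
    "LRU": 10,
    "FastAPI": 8,
    "LeetCode": 59,
    "Polyglot": 15,
    "Postgres": 44,
    "Cassandra": 38,
}

print(f"{combined_score(gemma_4_31b)} / {sum(MAX_SCORES.values())}")  # 201 / 285
```

Under this assumption every test contributes in proportion to its maximum, so Polyglot, Postgres, and Cassandra together account for 178 of the 285 available points.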

Key Findings

Partial Results

These models were evaluated before the full benchmark suite was established, so they ran only the writing and LRU cache tests rather than the complete battery. They are no longer on disk and are listed here for historical reference.

Performance

| Model | Quant | Size | RADV pp | RADV tg | AMDVLK pp | AMDVLK tg |
|---|---|---|---|---|---|---|
| Step3.5-Flash | IQ3_XS | 76 GiB | 237 | 32 | | |
| Nemotron-3-Super-120B-A12B | Unsloth UD-Q4_K_XL | 78 GiB | 196 | 10.2 | 139 | 9.86 |

Quality

| Model | Writing /30 | LRU /10 |
|---|---|---|
| Ling-Flash-2.0 | 26 | 2 |
| Nemotron-3-Super-120B-A12B | 25 | 10 |
| Devstral-2-123B | 25 | 2 |
| Solar-Open-100B | 21 | 0 |
| Mistral-Large-2411 | 20 | 2 |