EdgeMatrix Benchmark Dashboard

Comprehensive AI model performance analysis across hardware platforms

Average Performance Gain: 65.9% (across all models)
Average Cost Savings: 34.9% (hardware costs)
Average Power Efficiency: 34.9% (power saving)
Models Tested: 62 configurations
Peak Throughput: 21,015.55 tokens/sec

Showing 193 results
Model | Model Type | Category | Hardware | Size | Precision | Without EdgeMatrix (tokens/sec) | With EdgeMatrix (tokens/sec) | Improvement | Cost Savings | Power Efficiency
Shakti-1B | Shakti Family | Multimodal | GPU: A100 (80GB) | 1.88GB | FP16 | 16,250 | 21,015.55 | 29.3% | 22.7% | 22.7%
Qwen2-VL-2B | Qwen Family | Multimodal | GPU: A100 (80GB) | 4.419GB | FP16 | 12,690.3 | 19,963.4 | 57.3% | 36.4% | 36.4%
DeepSeek-R1-Distill-Qwen-1.5B | DeepSeek Family | Dense | GPU: A100 (80GB) | 3.55GB | FP16 | 12,931.42 | 18,104.78 | 40.0% | 28.6% | 28.6%
Qwen2.5-VL-3B | Qwen Family | Multimodal | GPU: A100 (80GB) | 7.51GB | FP16 | 8,234.11 | 13,903.65 | 68.9% | 40.8% | 40.8%
DeepSeek-R1-Distill-Qwen-1.5B | DeepSeek Family | Dense | GPU: L40s (48GB) | 3.55GB | FP16 | 7,998 | 13,040.77 | 63.1% | 38.7% | 38.7%
Shakti-2.5B | Shakti Family | Dense | GPU: H100 (80GB) | 6.43GB | FP8 | 6,672.57 | 12,403.72 | 85.9% | 46.2% | 46.2%
Shakti-1B | Shakti Family | Multimodal | GPU: L40s (48GB) | 1.88GB | FP16 | 7,570 | 12,064.4 | 59.4% | 37.3% | 37.3%
DeepSeek-R1-Distill-Qwen-1.5B | DeepSeek Family | Dense | GPU: A100 (80GB) | 1.12GB | Q4 | 9,004.67 | 11,412.41 | 26.7% | 21.1% | 21.1%
Shakti-2.5B | Shakti Family | Dense | GPU: H100 (80GB) | 6.43GB | FP16 | 6,615.88 | 11,110.48 | 67.9% | 40.5% | 40.5%
Qwen2-VL-2B | Qwen Family | Multimodal | GPU: L40s (48GB) | 4.419GB | FP16 | 6,509.44 | 10,659.19 | 63.7% | 38.9% | 38.9%
Shakti-4B | Shakti Family | Multimodal | GPU: A100 (80GB) | 7.42GB | FP16 | 7,564 | 10,099.3 | 33.5% | 25.1% | 25.1%
Gemma-2-2B-IT | Gemma Family | Dense | GPU: A100 (80GB) | 5.14GB | FP16 | 7,256.9 | 9,954.13 | 37.2% | 27.1% | 27.1%
Gemma-2-2B-IT | Gemma Family | Dense | GPU: A100 (80GB) | 1.71GB | Q4 | 8,904.67 | 9,929.41 | 11.5% | 10.3% | 10.3%
InternVL2.5-4B | InternVL Family | Multimodal | GPU: A100 (80GB) | 7.42GB | FP16 | 4,576.06 | 9,542.87 | 108.5% | 52.0% | 52.0%
Llama-3-8B | Llama Family | Dense | GPU: H100 (80GB) | 16GB | FP8 | 6,669.55 | 9,490.71 | 42.3% | 29.7% | 29.7%
Gemma-3-4B-IT | Gemma Family | Multimodal | GPU: A100 (80GB) | 8.6GB | FP16 | 5,798.23 | 9,462.91 | 63.2% | 38.7% | 38.7%
Qwen2.5-VL-3B | Qwen Family | Multimodal | GPU: L40s (48GB) | 7.51GB | FP16 | 1,738.95 | 9,188.68 | 428.4% | 81.1% | 81.1%
DeepSeek-R1-Distill-Qwen-1.5B | DeepSeek Family | Dense | GPU: L40s (48GB) | 1.12GB | Q4 | 7,011.01 | 8,527.61 | 21.6% | 17.8% | 17.8%
Llama-3.1-8B | Llama Family | Dense | GPU: H100 (80GB) | 16GB | FP16 | 6,473.07 | 7,956.74 | 22.9% | 18.6% | 18.6%
Shakti-2.5B | Shakti Family | Dense | GPU: A100 (80GB) | 6.43GB | FP16 | 6,120.03 | 7,854.42 | 28.3% | 22.1% | 22.1%
Phi-4-mini-Instruct | Phi Family | Dense | GPU: A100 (80GB) | 7.67GB | FP16 | 5,285.01 | 6,870.52 | 30.0% | 23.1% | 23.1%
Llama-3.2-3B | Llama Family | Dense | GPU: A100 (80GB) | 2.02GB | Q4 | 4,014.23 | 6,852.23 | 70.7% | 41.4% | 41.4%
Phi-4-mini-reasoning | Phi Family | Dense | GPU: A100 (80GB) | 7.67GB | FP16 | 3,105.49 | 6,607.25 | 112.8% | 53.0% | 53.0%
Llama-3.2-3B | Llama Family | Dense | GPU: A100 (80GB) | 6.6GB | FP16 | 3,581.4 | 6,362.89 | 77.7% | 43.7% | 43.7%
Gemma-3-4B-IT | Gemma Family | Multimodal | GPU: A100 (80GB) | 8.6GB | FP16 | 4,998.02 | 6,063.09 | 21.3% | 17.6% | 17.6%
Shakti-4B | Shakti Family | Multimodal | GPU: L40s (48GB) | 7.42GB | FP16 | 2,339 | 5,789 | 147.5% | 59.6% | 59.6%
Gemma-2-2B-IT | Gemma Family | Dense | GPU: L40s (48GB) | 1.71GB | Q4 | 4,501 | 5,699.2 | 26.6% | 21.0% | 21.0%
Shakti-2.5B | Shakti Family | Dense | GPU: L40s (48GB) | 6.43GB | FP16 | 3,122.92 | 5,612.24 | 79.7% | 44.4% | 44.4%
InternVL2.5-4B | InternVL Family | Multimodal | GPU: L40s (48GB) | 7.42GB | FP16 | 2,951.75 | 5,569.34 | 88.7% | 47.0% | 47.0%
Llama-3.2-3B | Llama Family | Dense | GPU: L40s (48GB) | 2.02GB | Q4 | 3,163.2 | 5,183.45 | 63.9% | 39.0% | 39.0%
InternVL2-2B | InternVL Family | Multimodal | GPU: A100 (80GB) | 4.41GB | FP16 | 2,074 | 4,970.63 | 139.7% | 58.3% | 58.3%
Llama-3.1-8B | Llama Family | Dense | GPU: A100 (80GB) | 16GB | FP16 | 3,796.37 | 4,875.86 | 28.4% | 22.1% | 22.1%
Gemma-2-2B-IT | Gemma Family | Dense | GPU: L40s (48GB) | 5.14GB | FP16 | 2,862.5 | 4,698.01 | 64.1% | 39.1% | 39.1%
InternVL2-2B | InternVL Family | Multimodal | GPU: L40s (48GB) | 4.41GB | FP16 | 1,959 | 4,684.38 | 139.1% | 58.2% | 58.2%
Llama-Guard-3-8B | Llama Family | Dense | GPU: A100 (80GB) | 16.07GB | FP16 | 3,528.9 | 4,517 | 28.0% | 21.9% | 21.9%
Llama-3.2-3B | Llama Family | Dense | GPU: L40s (48GB) | 6.6GB | FP16 | 1,676.3 | 4,467.32 | 166.5% | 62.5% | 62.5%
DeepSeek-R1-Distill-Llama-8B | DeepSeek Family | Dense | GPU: A100 (80GB) | 16.06GB | FP16 | 3,424.03 | 4,417.09 | 29.0% | 22.5% | 22.5%
Qwen1.5-MoE-A2.7B-Chat | Qwen Family | MoE | GPU: A100 (80GB) | 28.63GB | FP16 | 4,283.78 | 4,378.79 | 2.2% | 2.2% | 2.2%
Janus-Pro-7B | DeepSeek Family | Multimodal | GPU: A100 (80GB) | 14.84GB | FP16 | 2,654 | 4,235 | 59.6% | 37.3% | 37.3%
Gemma-2-9B-IT | Gemma Family | Dense | GPU: A100 (80GB) | 18.48GB | FP16 | 3,269.69 | 4,184.52 | 28.0% | 21.9% | 21.9%
Phi-4-mini-Instruct | Phi Family | Dense | GPU: A100 (80GB) | 2.49GB | Q4 | 3,121.23 | 4,117.11 | 31.9% | 24.2% | 24.2%
Phi-4-mini-reasoning | Phi Family | Dense | GPU: A100 (80GB) | 2.49GB | Q4 | 3,281.02 | 4,109.32 | 25.2% | 20.2% | 20.2%
LLaVA-OneVision-Qwen2-7B | LLaVA Family | Multimodal | GPU: A100 (80GB) | 16.06GB | FP16 | 1,850 | 4,101.7 | 121.7% | 54.9% | 54.9%
Gemma-3-4B-IT | Gemma Family | Multimodal | GPU: A100 (80GB) | 2.49GB | Q4 | 3,039.12 | 4,007.55 | 31.9% | 24.2% | 24.2%
Qwen3-4B | Qwen Family | Dense | GPU: A100 (80GB) | 2.2GB | Q4 | 3,068.12 | 3,917.55 | 27.7% | 21.7% | 21.7%
Gemma-3-4B-IT | Gemma Family | Multimodal | GPU: L40s (48GB) | 8.6GB | FP16 | 2,220.35 | 3,903 | 75.8% | 43.1% | 43.1%
DeepSeek-R1-Distill-Llama-8B | DeepSeek Family | Dense | GPU: A100 (80GB) | 4.92GB | Q4 | 3,098.77 | 3,884 | 25.3% | 20.2% | 20.2%
Qwen3-4B | Qwen Family | Dense | GPU: A100 (80GB) | 8.1GB | FP16 | 3,105.59 | 3,837.05 | 23.6% | 19.1% | 19.1%
Llama-3.1-8B | Llama Family | Dense | GPU: A100 (80GB) | 4.92GB | Q4 | 3,081.02 | 3,822.92 | 24.1% | 19.4% | 19.4%
Gemma-3-12B-IT | Gemma Family | Multimodal | GPU: A100 (80GB) | 24.32GB | FP16 | 3,061.5 | 3,801.01 | 24.2% | 19.5% | 19.5%
Phi-4-multimodal | Phi Family | Multimodal | GPU: A100 (80GB) | 11.12GB | FP16 | 1,816 | 3,769.45 | 107.6% | 51.8% | 51.8%
Qwen3-8B | Qwen Family | Dense | GPU: A100 (80GB) | 5.03GB | Q4 | 3,208.87 | 3,709.09 | 15.6% | 13.5% | 13.5%
Phi-mini-MoE-instruct | Phi Family | MoE | GPU: A100 (80GB) | 15.3GB | FP16 | 3,555.56 | 3,704.66 | 4.2% | 4.0% | 4.0%
Phi-4-multimodal | Phi Family | Multimodal | GPU: L40s (48GB) | 11.12GB | FP16 | 1,564.63 | 3,641 | 132.7% | 57.0% | 57.0%
Llama-Guard-3-8B | Llama Family | Dense | GPU: A100 (80GB) | 4.92GB | Q4 | 2,967.97 | 3,618.06 | 21.9% | 18.0% | 18.0%
Qwen3-4B | Qwen Family | Dense | GPU: L40s (48GB) | 2.2GB | Q4 | 2,991.43 | 3,587.3 | 19.9% | 16.6% | 16.6%
deepseek-moe-16b-chat | deepseek Family | MoE | GPU: A100 (80GB) | 32.77GB | FP16 | 3,250.71 | 3,556.08 | 9.4% | 8.6% | 8.6%
Qwen2.5-7B | Qwen Family | Dense | GPU: L40s (48GB) | 15.2GB | FP16 | 3,112.53 | 3,554.41 | 14.2% | 12.4% | 12.4%
Phi-4-mini-Instruct | Phi Family | Dense | GPU: L40s (48GB) | 7.67GB | FP16 | 2,116.71 | 3,493 | 65.0% | 39.4% | 39.4%
Phi-4-mini-reasoning | Phi Family | Dense | GPU: L40s (48GB) | 7.67GB | FP16 | 2,013.6 | 3,434.8 | 70.6% | 41.4% | 41.4%
Llama-3.2-1B | Llama Family | Dense | Device: Tesla T4 (16GB) | 808MB | FP16 | 2,793.42 | 3,256.68 | 16.6% | 14.2% | 14.2%
Gemma-2-9B-IT | Gemma Family | Dense | GPU: A100 (80GB) | 5.76GB | Q4 | 2,360.33 | 3,209 | 36.0% | 26.4% | 26.4%
Qwen3-8B | Qwen Family | Dense | GPU: A100 (80GB) | 16.5GB | FP16 | 2,845.2 | 3,159.02 | 11.0% | 9.9% | 9.9%
Qwen3-8B | Qwen Family | Dense | GPU: L40s (48GB) | 5.03GB | Q4 | 2,829.54 | 3,118.23 | 10.2% | 9.3% | 9.3%
Qwen3-4B | Qwen Family | Dense | GPU: L40s (48GB) | 8.1GB | FP16 | 2,696.34 | 3,110.34 | 15.4% | 13.3% | 13.3%
Llama-3.1-8B | Llama Family | Dense | GPU: L40s (48GB) | 4.92GB | Q4 | 2,639.54 | 3,089.13 | 17.0% | 14.6% | 14.6%
Tiiuae/falcon-7b | Falcon Family | Dense | GPU: L40s (48GB) | 14.43GB | FP16 | 2,758.43 | 2,987.55 | 8.3% | 7.7% | 7.7%
Mistral-7B-v0.1 | Ministral Family | Dense | GPU: L40s (48GB) | 14.48GB | FP16 | 2,705.88 | 2,987.04 | 10.4% | 9.4% | 9.4%
Gemma-3-4B-IT | Gemma Family | Multimodal | GPU: L40s (48GB) | 8.6GB | FP16 | 1,749.45 | 2,974.85 | 70.0% | 41.2% | 41.2%
Openchat-3.6-8b-20240522 | OpenChat Family | Dense | GPU: L40s (48GB) | 16.1GB | FP16 | 2,514.41 | 2,920.44 | 16.1% | 13.9% | 13.9%
Gemma-3-12B-IT | Gemma Family | Multimodal | GPU: A100 (80GB) | 7.3GB | Q4 | 2,241.31 | 2,904.86 | 29.6% | 22.8% | 22.8%
Qwen3-8B | Qwen Family | Dense | GPU: L40s (48GB) | 16.5GB | FP16 | 2,428.23 | 2,874.12 | 18.4% | 15.5% | 15.5%
Ministral-3-8B-Base-2512 | Ministral Family | Multimodal | GPU: L40s (48GB) | 17.84GB | FP16 | 2,554.04 | 2,845.69 | 11.4% | 10.2% | 10.2%
Llama-3.1-8B | Llama Family | Dense | GPU: L40s (48GB) | 16GB | FP16 | 1,578.77 | 2,748.51 | 74.1% | 42.6% | 42.6%
Meta-Llama-3-8B | Llama Family | Dense | GPU: L40s (48GB) | 16.07GB | FP16 | 2,548.07 | 2,747.81 | 7.8% | 7.3% | 7.3%
Janus-Pro-7B | DeepSeek Family | Multimodal | GPU: L40s (48GB) | 14.84GB | FP16 | 1,380 | 2,746.76 | 99.0% | 49.8% | 49.8%
LLaVA-OneVision-Qwen2-7B | LLaVA Family | Multimodal | GPU: L40s (48GB) | 16.06GB | FP16 | 1,154.6 | 2,704.22 | 134.2% | 57.3% | 57.3%
Phi-3-mini-4k-instruct | Phi Family | Dense | GPU: L40s (48GB) | 7.64GB | FP16 | 2,551.35 | 2,678.52 | 5.0% | 4.7% | 4.7%
Command-r7b-12-2024 | Command R Family | Dense | GPU: L40s (48GB) | 16.06GB | FP16 | 2,310.67 | 2,628.13 | 13.7% | 12.1% | 12.1%
Yi-9B | Yi Family | Dense | GPU: L40s (48GB) | 17.7GB | FP16 | 2,228.23 | 2,604.29 | 16.9% | 14.4% | 14.4%
Granite-3.0-8b-instruct | IBM Granite Family | Dense | GPU: L40s (48GB) | 16.34GB | FP16 | 2,367.43 | 2,596.86 | 9.7% | 8.8% | 8.8%
Llama-Guard-3-8B | Llama Family | Dense | GPU: L40s (48GB) | 16.07GB | FP16 | 1,528.9 | 2,575 | 68.4% | 40.6% | 40.6%
InternVL2-8B | InternVL Family | Multimodal | GPU: A100 (80GB) | 16.16GB | FP16 | 1,553.84 | 2,423.82 | 56.0% | 35.9% | 35.9%
Deepseek-llm-7b-base | DeepSeek Family | Dense | GPU: L40s (48GB) | 13.8GB | FP16 | 1,972.13 | 2,291.52 | 16.2% | 13.9% | 13.9%
DeepSeek-R1-Distill-Llama-8B | DeepSeek Family | Dense | GPU: L40s (48GB) | 16.06GB | FP16 | 1,552.96 | 2,281.91 | 46.9% | 31.9% | 31.9%
Phi-4-mini-Instruct | Phi Family | Dense | GPU: L40s (48GB) | 2.49GB | Q4 | 1,752.2 | 2,238.41 | 27.7% | 21.7% | 21.7%
Phi-4-mini-reasoning | Phi Family | Dense | GPU: L40s (48GB) | 2.49GB | Q4 | 1,760.3 | 2,149.98 | 22.1% | 18.1% | 18.1%
Gemma-7b | Gemma Family | Dense | GPU: L40s (48GB) | 17.07GB | FP16 | 1,734.53 | 2,061.96 | 18.9% | 15.9% | 15.9%
Tiiuae/falcon-11b | Falcon Family | Dense | GPU: L40s (48GB) | 22.2GB | FP16 | 1,715.25 | 2,001.2 | 16.7% | 14.3% | 14.3%
Gemma-3-4B-IT | Gemma Family | Multimodal | GPU: L40s (48GB) | 2.49GB | Q4 | 1,682 | 1,890.36 | 12.4% | 11.0% | 11.0%
Qwen2.5-14B | Qwen Family | Dense | GPU: L40s (48GB) | 21.47GB | FP16 | 1,664.25 | 1,834.04 | 10.2% | 9.3% | 9.3%
SOLAR-10.7B-v1.0 | SOLAR Family | Dense | GPU: L40s (48GB) | 21.47GB | FP16 | 1,664.25 | 1,834.04 | 10.2% | 9.3% | 9.3%
DeepSeek-R1-Distill-Qwen-14B | DeepSeek Family | Dense | GPU: L40s (48GB) | 29.5GB | FP16 | 1,422.78 | 1,796.35 | 26.3% | 20.8% | 20.8%
Gemma-2-9B-IT | Gemma Family | Dense | GPU: L40s (48GB) | 18.48GB | FP16 | 1,138.09 | 1,723.46 | 51.4% | 34.0% | 34.0%
Gemma-2 9B IT | Gemma Family | Dense | GPU: L40s (48GB) | 18.48GB | FP16 | 1,138.09 | 1,723.46 | 51.4% | 34.0% | 34.0%
InternVL2-8B | InternVL Family | Multimodal | GPU: L40s (48GB) | 16.16GB | FP16 | 1,020.33 | 1,700.65 | 66.7% | 40.0% | 40.0%
Phi-3-medium-4k-instruct | Phi Family | Dense | GPU: L40s (48GB) | 27.92GB | FP16 | 1,388.35 | 1,667.44 | 20.1% | 16.7% | 16.7%
DeepSeek-R1-Distill-Llama-8B | DeepSeek Family | Dense | GPU: L40s (48GB) | 4.92GB | Q4 | 1,298.2 | 1,569.88 | 20.9% | 17.3% | 17.3%
Llama-3.2-3B | Llama Family | Dense | GPU: T4 (16GB) | 6.6GB | FP16 | 1,132.43 | 1,518.19 | 34.1% | 25.4% | 25.4%
Gemma-2-9B-IT | Gemma Family | Dense | GPU: L40s (48GB) | 5.76GB | Q4 | 1,156.5 | 1,428.07 | 23.5% | 19.0% | 19.0%
Gemma-3-12B-IT | Gemma Family | Multimodal | GPU: L40s (48GB) | 24.32GB | FP16 | 941.33 | 1,412.09 | 50.0% | 33.3% | 33.3%
Gemma-3-12B-IT | Gemma Family | Multimodal | GPU: L40s (48GB) | 7.3GB | Q4 | 831.9 | 1,049.86 | 26.2% | 20.8% | 20.8%
Nous-Hermes-13b | Nous Hermes Family | Dense | GPU: L40s (48GB) | 26GB | FP16 | 749.43 | 1,023.61 | 36.6% | 26.8% | 26.8%
Baichuan-13B-Chat | Baichuan Family | Dense | GPU: L40s (48GB) | 26.5GB | FP16 | 688.44 | 993.17 | 44.3% | 30.7% | 30.7%
Llama-Guard-3-8B | Llama Family | Dense | GPU: L40s (48GB) | 4.92GB | Q4 | 769.15 | 978.74 | 27.2% | 21.4% | 21.4%
Llama-2-13b | Llama Family | Dense | GPU: L40s (48GB) | 26GB | FP16 | 663.91 | 935.71 | 40.9% | 29.0% | 29.0%
Llama-3.1-8B | Llama Family | Dense | Device: Tesla T4 (16GB) | 16GB | INT4 | 380.59 | 502.43 | 32.0% | 24.3% | 24.3%
Shakti-250M | Shakti Family | Dense | Device: MacBook Pro M3 (36GB) | 148MB | Q4 | 295 | 385 | 30.5% | 23.4% | 23.4%
Shakti-100M | Shakti Family | Dense | Device: MacBook Pro M3 (36GB) | 126MB | Q4 | 280 | 365 | 30.4% | 23.3% | 23.3%
Shakti-500M | Shakti Family | Dense | Device: MacBook Pro M3 (36GB) | 303MB | Q4 | 215 | 281.43 | 30.9% | 23.6% | 23.6%
SmolLM2-135M | SmolLM Family | Dense | Device: MacBook Pro M3 (36GB) | 105MB | Q4 | 175 | 227.21 | 29.8% | 23.0% | 23.0%
SmolLM2-360M | SmolLM Family | Dense | Device: MacBook Pro M3 (36GB) | 271MB | Q4 | 140 | 182.81 | 30.6% | 23.4% | 23.4%
Qwen2.5-500M | Qwen Family | Dense | Device: MacBook Pro M3 (36GB) | 398MB | Q4 | 135 | 173.82 | 28.8% | 22.3% | 22.3%
Shakti-100M | Shakti Family | Dense | Device: iPhone 14 (6GB) | 126MB | Q4 | 120 | 153.7 | 28.1% | 21.9% | 21.9%
Qwen3-0.6B | Qwen Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 456.11MB | Q4 | 54.03 | 143.27 | 165.2% | 62.3% | 62.3%
Shakti-2.5B | Shakti Family | Dense | Device: MacBook Pro M3 (36GB) | 1.5GB | Q4 | 95 | 128 | 34.7% | 25.8% | 25.8%
Qwen3-0.6B | Qwen Family | Dense | CPU: AMD EPYC 9554 (32 cores, 117GB) | 456.11MB | Q4 | 45.11 | 115.6 | 156.3% | 61.0% | 61.0%
Qwen3-1.7B | Qwen Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 1.28GB | Q4 | 49.22 | 98.73 | 100.6% | 50.1% | 50.1%
DeepSeek-R1-Distill-Qwen-1.5B | DeepSeek Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 1.12GB | Q4 | 42.01 | 97.01 | 130.9% | 56.7% | 56.7%
Qwen3-1.7B | Qwen Family | Dense | CPU: AMD EPYC 9554 (32 cores, 117GB) | 1.28GB | Q4 | 36.44 | 94.34 | 158.9% | 61.4% | 61.4%
Shakti-250M | Shakti Family | Dense | Device: iPhone 14 (6GB) | 148MB | Q4 | 65 | 88.11 | 35.6% | 26.2% | 26.2%
Llama-3.3-70B | Llama Family | Dense | GPU: A100 (80GB) | 42.5GB | Q4 | 48.87 | 84.24 | 72.4% | 42.0% | 42.0%
Llama-3.2-3B | Llama Family | Dense | CPU: AMD EPYC 9554 (32 cores, 117GB) | 2.02GB | Q4 | 32.9 | 82.34 | 150.3% | 60.0% | 60.0%
Qwen3-0.6B | Qwen Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 456.11MB | Q4 | 37.9 | 82.2 | 116.9% | 53.9% | 53.9%
Qwen3-1.7B | Qwen Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 1.28GB | Q4 | 32.77 | 74.23 | 126.5% | 55.9% | 55.9%
Llama-3.2-3B | Llama Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 2.02GB | Q4 | 46.4 | 69.84 | 50.5% | 33.6% | 33.6%
Phi-4-mini-reasoning | Phi Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 2.49GB | Q4 | 25 | 65.37 | 161.5% | 61.8% | 61.8%
Llama-3.2-3B | Llama Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 2.02GB | Q4 | 28.39 | 64.23 | 126.2% | 55.8% | 55.8%
Phi-4-mini-Instruct | Phi Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 2.49GB | Q4 | 26.6 | 63.28 | 137.9% | 58.0% | 58.0%
Shakti-500M | Shakti Family | Dense | Device: iPhone 14 (6GB) | 303MB | Q4 | 45 | 62.4 | 38.7% | 27.9% | 27.9%
Shakti-100M | Shakti Family | Dense | Device: Raspberry Pi 5 (8GB) | 126MB | Q4 | 45 | 60.74 | 35.0% | 25.9% | 25.9%
Gemma-2-2B-IT | Gemma Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 1.71GB | Q4 | 41.31 | 56.62 | 37.1% | 27.0% | 27.0%
Qwen3-0.6B | Qwen Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 456.11MB | Q4 | 29.34 | 55.63 | 89.6% | 47.3% | 47.3%
Qwen3-4B | Qwen Family | Dense | CPU: AMD EPYC 9554 (32 cores, 117GB) | 2.2GB | Q4 | 22.6 | 53.99 | 138.9% | 58.1% | 58.1%
DeepSeek-R1-Distill-Qwen-1.5B | DeepSeek Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 1.12GB | Q4 | 30.78 | 53.74 | 74.6% | 42.7% | 42.7%
Qwen3-4B | Qwen Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 2.2GB | Q4 | 38.04 | 53.11 | 39.6% | 28.4% | 28.4%
DeepSeek-R1-Distill-Qwen-1.5B | DeepSeek Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 1.12GB | Q4 | 29.47 | 50.18 | 70.3% | 41.3% | 41.3%
Shakti-250M | Shakti Family | Dense | Device: Raspberry Pi 5 (8GB) | 148MB | Q4 | 35 | 48.911 | 39.7% | 28.4% | 28.4%
Gemma-3-4B-IT | Gemma Family | Multimodal | CPU: AMD EPYC 9554 (60 cores, 201GB) | 2.49GB | Q4 | 21.99 | 47.4 | 115.6% | 53.6% | 53.6%
Qwen3-4B | Qwen Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 2.2GB | Q4 | 19.7 | 44.01 | 123.4% | 55.2% | 55.2%
Gemma-2-2B-IT | Gemma Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 1.71GB | Q4 | 22.17 | 43.52 | 96.3% | 49.1% | 49.1%
Qwen3-1.7B | Qwen Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 1.28GB | Q4 | 22.87 | 41.76 | 82.6% | 45.2% | 45.2%
Qwen3-8B | Qwen Family | Dense | CPU: AMD EPYC 9554 (32 cores, 117GB) | 5.03GB | Q4 | 16.02 | 39.98 | 149.6% | 59.9% | 59.9%
Llama-3.1-8B | Llama Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 4.92GB | Q4 | 22.67 | 36.81 | 62.4% | 38.4% | 38.4%
Shakti-100M | Shakti Family | Dense | CPU: Intel Xeon Silver 4110 (197GB) | 126MB | Q4 | 21.35 | 36.64 | 71.6% | 41.7% | 41.7%
Llama-Guard-3-8B | Llama Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 4.92GB | Q4 | 13.01 | 35.89 | 175.9% | 63.8% | 63.8%
Llama-3.1-8B | Llama Family | Dense | CPU: AMD EPYC 9554 (32 cores, 117GB) | 4.92GB | Q4 | 19.43 | 34.42 | 77.1% | 43.6% | 43.6%
Qwen3-8B | Qwen Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 5.03GB | Q4 | 28.64 | 34.21 | 19.4% | 16.3% | 16.3%
DeepSeek-R1-Distill-Llama-8B | DeepSeek Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 4.92GB | Q4 | 15.4 | 33.91 | 120.2% | 54.6% | 54.6%
Llama-3.3-70B | Llama Family | Dense | GPU: L40s (48GB) | 42.5GB | Q4 | 19.78 | 33.48 | 69.3% | 40.9% | 40.9%
SmolLM2-135M | SmolLM Family | Dense | Device: Raspberry Pi 5 (8GB) | 105MB | Q4 | 25 | 32.355 | 29.4% | 22.7% | 22.7%
Gemma-2-2B-IT | Gemma Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 1.71GB | Q4 | 14.67 | 31.8 | 116.8% | 53.9% | 53.9%
Gemma-3-4B-IT | Gemma Family | Multimodal | CPU: AMD EPYC 9554 (16 cores, 105GB) | 2.49GB | Q4 | 17.07 | 31.75 | 86.0% | 46.2% | 46.2%
Llama-3.2-3B | Llama Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 2.02GB | Q4 | 16.6 | 31.69 | 90.9% | 47.6% | 47.6%
Phi-4-mini-reasoning | Phi Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 2.49GB | Q4 | 17.17 | 30.28 | 76.4% | 43.3% | 43.3%
Phi-4-mini-Instruct | Phi Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 2.49GB | Q4 | 16.33 | 30.13 | 84.5% | 45.8% | 45.8%
Shakti-500M | Shakti Family | Dense | Device: Raspberry Pi 5 (8GB) | 303MB | Q4 | 22 | 29.54 | 34.3% | 25.5% | 25.5%
Qwen3-8B | Qwen Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 5.03GB | Q4 | 12.65 | 29.11 | 130.1% | 56.5% | 56.5%
SmolLM2-360M | SmolLM Family | Dense | Device: Raspberry Pi 5 (8GB) | 271MB | Q4 | 22 | 28.99 | 31.8% | 24.1% | 24.1%
Qwen3-4B | Qwen Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 2.2GB | Q4 | 13.7 | 28.18 | 105.7% | 51.4% | 51.4%
Shakti-2.5B | Shakti Family | Dense | Device: iPhone 14 (6GB) | 1.5GB | Q4 | 18 | 27.32 | 51.8% | 34.1% | 34.1%
Llama-3.1-8B | Llama Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 4.92GB | Q4 | 17.8 | 26.56 | 49.2% | 33.0% | 33.0%
Shakti-250M | Shakti Family | Dense | CPU: Intel Xeon Silver 4110 (197GB) | 148MB | Q4 | 14.75 | 25.67 | 74.0% | 42.5% | 42.5%
Llama-3.2-1B | Llama Family | Dense | CPU: Intel Xeon Silver 4110 (197GB) | 808MB | Q4 | 10.35 | 24.78 | 139.4% | 58.2% | 58.2%
Phi-4-mini-Instruct | Phi Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 2.49GB | Q4 | 11 | 23.94 | 117.6% | 54.1% | 54.1%
Gemma-3-4B-IT | Gemma Family | Multimodal | CPU: Intel Core i7-14700K (28 cores, 94GB) | 2.49GB | Q4 | 11.63 | 22.48 | 93.3% | 48.3% | 48.3%
Gemma-2-9B-IT | Gemma Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 5.76GB | Q4 | 11.09 | 20.73 | 86.9% | 46.5% | 46.5%
Llama-3.1-8B | Llama Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 4.92GB | Q4 | 13.4 | 19.15 | 42.9% | 30.0% | 30.0%
Phi-4-mini-reasoning | Phi Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 2.49GB | Q4 | 8.55 | 18.8 | 119.9% | 54.5% | 54.5%
Gemma-3-12B-IT | Gemma Family | Multimodal | CPU: AMD EPYC 9554 (60 cores, 201GB) | 7.3GB | Q4 | 10.6 | 18.25 | 72.2% | 41.9% | 41.9%
Qwen2.5-500M | Qwen Family | Dense | Device: Raspberry Pi 5 (8GB) | 398MB | Q4 | 14 | 18.24 | 30.3% | 23.2% | 23.2%
DeepSeek-R1-Distill-Llama-8B | DeepSeek Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 4.92GB | Q4 | 8.79 | 15.83 | 80.1% | 44.5% | 44.5%
Llama-Guard-3-8B | Llama Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 4.92GB | Q4 | 7.98 | 15.26 | 91.2% | 47.7% | 47.7%
Qwen3-8B | Qwen Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 5.03GB | Q4 | 8.34 | 14.29 | 71.3% | 41.6% | 41.6%
Shakti-500M | Shakti Family | Dense | CPU: Intel Xeon Silver 4110 (197GB) | 303MB | Q4 | 4.56 | 14.26 | 212.7% | 68.0% | 68.0%
Gemma-2-9B-IT | Gemma Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 5.76GB | Q4 | 7.92 | 13.97 | 76.4% | 43.3% | 43.3%
DeepSeek-R1-Distill-Qwen-1.5B | Deepseek Family | Dense | CPU: Intel Xeon Silver 4110 | 1.12GB | Q4 | 6.83 | 13.14 | 92.4% | 48.0% | 48.0%
Llama-Guard-3-8B | Llama Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 4.92GB | Q4 | 6.96 | 12.76 | 83.3% | 45.5% | 45.5%
Gemma-3-12B-IT | Gemma Family | Multimodal | CPU: AMD EPYC 9554 (16 cores, 105GB) | 7.3GB | Q4 | 6.75 | 11.06 | 63.9% | 39.0% | 39.0%
Gemma-2-9B-IT | Gemma Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 5.76GB | Q4 | 5.11 | 10.08 | 97.3% | 49.3% | 49.3%
DeepSeek-R1-Distill-Llama-8B | DeepSeek Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 4.92GB | Q4 | 4.88 | 9.58 | 96.3% | 49.1% | 49.1%
Shakti-2.5B | Shakti Family | Dense | CPU: Intel Xeon Silver 4110 (197GB) | 1.5GB | Q4 | 5.12 | 9.35 | 82.6% | 45.2% | 45.2%
Llama-3.2-3B-Instruct | Llama Family | Dense | CPU: Intel Xeon Silver 4110 | 2.02GB | Q4 | 4.11 | 8.76 | 113.1% | 53.1% | 53.1%
Gemma-3-4b-it | Gemma Family | Multimodal | CPU: Intel Xeon Silver 4110 | 2.49GB | Q4 | 2.92 | 8.52 | 191.8% | 65.7% | 65.7%
Gemma-3-12B-IT | Gemma Family | Multimodal | CPU: Intel Core i7-14700K (28 cores, 94GB) | 7.3GB | Q4 | 4.52 | 7.89 | 74.6% | 42.7% | 42.7%
Qwen3-4B | Qwen Family | Dense | CPU: Intel Xeon Silver 4110 | 2.2GB | Q4 | 3.26 | 6.67 | 104.6% | 51.1% | 51.1%
DeepSeek-R1-Distill-Llama-8B | Deepseek Family | Dense | CPU: Intel Xeon Silver 4110 | 4.92GB | Q4 | 2.55 | 6.27 | 145.9% | 59.3% | 59.3%
Llama-3.3-70B | Llama Family | Dense | CPU: AMD EPYC 9554 (32 cores, 117GB) | 42.5GB | Q4 | 2.01 | 5.6 | 178.6% | 64.1% | 64.1%
Llama-3.1-8B | Llama Family | Dense | CPU: Intel Xeon Silver 4110 | 4.92GB | Q4 | 2.23 | 5.34 | 139.5% | 58.2% | 58.2%
Llama-3.3-70B | Llama Family | Dense | CPU: AMD EPYC 9554 (60 cores, 201GB) | 42.5GB | Q4 | 2.7 | 4.52 | 67.4% | 40.3% | 40.3%
Shakti-2.5B | Shakti Family | Dense | Device: Raspberry Pi 5 (8GB) | 1.5GB | Q4 | 3.2 | 4.45 | 39.1% | 28.1% | 28.1%
Llama-3.3-70B | Llama Family | Dense | CPU: Intel Core i7-14700K (28 cores, 94GB) | 42.5GB | Q4 | 2.1 | 4.34 | 106.7% | 51.6% | 51.6%
Llama-3.3-70B | Llama Family | Dense | CPU: AMD EPYC 9554 (16 cores, 105GB) | 42.5GB | Q4 | 1.43 | 2.19 | 53.1% | 34.7% | 34.7%
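
The dashboard does not state how its percentage columns are derived. The sketch below is an inference from the published numbers, not a documented EdgeMatrix formula: it assumes Improvement is the relative throughput gain (with / without - 1) and that Cost Savings and Power Efficiency are both the matching reduction in time per token (1 - without / with). The function names are hypothetical; the throughput values are copied from the Shakti-1B and Qwen1.5-MoE-A2.7B-Chat rows on A100 (80GB).

```python
def improvement_pct(without_tps: float, with_tps: float) -> float:
    """Relative throughput gain in percent (assumed definition)."""
    return (with_tps / without_tps - 1.0) * 100.0


def savings_pct(without_tps: float, with_tps: float) -> float:
    """Reduction in time per token in percent (assumed definition).

    Algebraically this equals g / (1 + g), where g is the fractional
    improvement, so it always stays below 100%.
    """
    return (1.0 - without_tps / with_tps) * 100.0


# (label, tokens/sec without EdgeMatrix, tokens/sec with EdgeMatrix)
sample_rows = [
    ("Shakti-1B, A100, FP16", 16250.0, 21015.55),
    ("Qwen1.5-MoE-A2.7B-Chat, A100, FP16", 4283.78, 4378.79),
]

for label, without_tps, with_tps in sample_rows:
    gain = improvement_pct(without_tps, with_tps)
    saving = savings_pct(without_tps, with_tps)
    print(f"{label}: improvement {gain:.1f}%, savings {saving:.1f}%")
```

Under these assumptions the two sample rows reproduce the table values (29.3% and 22.7%; 2.2% and 2.2%). Every row reports the same figure for Cost Savings and Power Efficiency, which would also explain why both summary averages are 34.9%.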