Models: 147

Active filters: sglang

AxionML/Gemma-4-12B-FP8

Image-Text-to-Text • 12B • Updated 4 days ago • 4.67k • 3

SwitchXDDD/multilingual-eagle3-qwen3-8b

0.4B • Updated 5 days ago • 16 • 1

AxionML/Gemma-4-12B-NVFP4

Image-Text-to-Text • 8B • Updated 4 days ago • 12.8k • 1

Rifky/LFM2.5-8B-A1B-FP8

8B • Updated 4 days ago • 22 • 1

kunhunjon/gemma-4-12B-it-qat-assistant-w4a16-ct

Text Generation • 0.4B • Updated about 4 hours ago • 1

SurfaceData/llava-v1.6-mistral-7b-sglang

Image-Text-to-Text • 8B • Updated Mar 7, 2024 • 13 • 9

SurfaceData/llava-v1.6-vicuna-7b-sglang

Image-Text-to-Text • 7B • Updated Mar 7, 2024 • 13 • 1

tclf90/qwen2.5-72b-instruct-gptq-int4

Text Generation • 73B • Updated May 12, 2025 • 78 • 2

tclf90/qwen2.5-72b-instruct-gptq-int3

Text Generation • 69B • Updated May 12, 2025 • 70

alvarobartt/grok-2-tokenizer

Updated Aug 27, 2025 • 3

unsloth/grok-2

Text Generation • Updated Sep 6, 2025 • 107 • 5

osmapi/MiniMax-M2-THRIFT

173B • Updated Nov 13, 2025 • 1.61k • 35

mradermacher/MiniMax-M2-THRIFT-GGUF

Updated Apr 28 • 2

JasmineBBB/Kimi-Linear-48B-A3B-Instruct-bnb-4bit

Text Generation • 49B • Updated Nov 5, 2025 • 6 • 1

mradermacher/MiniMax-M2-THRIFT-i1-GGUF

173B • Updated Apr 28 • 81 • 10

bartowski/VibeStudio_MiniMax-M2-THRIFT-GGUF

Text Generation • 173B • Updated Nov 20, 2025 • 1.73k • 8

osmapi/MiniMax-M2-THRIFT-55

106B • Updated Dec 3, 2025 • 144 • 5

JinnP/SGLang-EAGLE3-Qwen3-Coder-30B-A3B-Instruct

Text Generation • 0.2B • Updated Nov 25, 2025 • 15 • 2

mradermacher/MiniMax-M2-THRIFT-55-GGUF

106B • Updated Apr 18 • 31 • 2

mradermacher/MiniMax-M2-THRIFT-55-i1-GGUF

106B • Updated Apr 19 • 395 • 2

osmapi/MiniMax-M2-THRIFT-55-MLX-4bit

106B • Updated Dec 2, 2025 • 32 • 2

osmapi/MiniMax-M2-THRIFT-55-MLX-6bit

106B • Updated Dec 3, 2025 • 24

Doradus-AI/MiroThinker-v1.0-30B-FP8

Text Generation • 31B • Updated Dec 5, 2025 • 6 • 4

Doradus-AI/Hermes-4.3-36B-FP8

Text Generation • 36B • Updated Dec 7, 2025 • 6.56k • 3

Doradus-AI/RnJ-1-Instruct-FP8

Text Generation • 9B • Updated Dec 7, 2025 • 623k • 4

QuantTrio/Kimi-K2.5-E304

Image-Text-to-Text • 841B • Updated Feb 2 • 384 • 3

bullpoint/Qwen3-Coder-Next-AWQ-4bit

Text Generation • 14B • Updated Feb 3 • 89.1k • 26

QuantTrio/Qwen3-Coder-Next-E336

Text Generation • 53B • Updated Feb 6 • 132 • 2

QuantTrio/Qwen3-Coder-Next-E400

Text Generation • 63B • Updated Feb 6 • 15 • 2

elon-trump/pixtral-12b-2409-w4a16-gptq

3B • Updated Apr 25