LMX-Omni-5.5B-Lite
LMX-Omni-5.5B-Lite is a Lemonade Mix (LMX) virtual model that offers true all-to-all omni-modality: chat, vision, image generation, speech transcription, and text-to-speech through a single OpenAI-compatible model. It is a curated bundle of component models, sized to run on mainstream hardware (32 GB APUs), served by Lemonade.
Components
| Component | Recipe | Role | Size (GB) |
|---|---|---|---|
Qwen3.5-4B-MTP-GGUF |
llamacpp | Chat + vision (planner LLM) | 3.66 |
SD-Turbo |
sd-cpp | Image generation | 5.21 |
Whisper-Tiny |
whispercpp | Speech transcription | 0.075 |
kokoro-v1 |
kokoro | Text-to-speech | 0.354 |
Total download: ~9.3 GB
Usage
Use it in the Lemonade app
Install Lemonade, then:
lemonade pull LMX-Omni-5.5B-Lite
lemonade run LMX-Omni-5.5B-Lite
Pulling the collection downloads every component above. Once downloaded, the model appears in the Lemonade app's chat, where you can talk to it, generate images, and hear spoken replies.
Use it in any OpenAI-compatible app
Any app that calls /chat/completions and can render multimedia output, such as Open WebUI or
AnythingLLM, can use the model by name. The server runs an internal tool-calling loop across
the components and embeds generated images and audio directly in the assistant message, so to the
app it behaves as a genuine OpenAI-compatible chat model.
Build your own app
LMX models are made for building apps with native omni-modal interactions. Apps that want full control can run the omni tool-calling loop themselves, addressing the planner LLM directly with the same tool definitions. See Lemonade Omni Models for the server-side and client-side orchestration contracts, and Embeddable Lemonade to ship the whole experience inside your own app.
Collection definition
This repository is the source of truth for the collection definition: the self-contained manifest
LMX-Omni-5.5B-Lite.json carries each component's checkpoint, recipe,
labels, and defaults, so the collection is fully reproducible from this repo alone.