⚙️ Prerequisites

  • If you haven’t already, install the Nexa SDK.
  • Below are the GGUF-compatible model types you can experiment with right away.
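
Before running the examples, it can help to confirm that the CLI is actually reachable from your shell. The snippet below is a minimal sketch that only checks for an executable named `nexa` on your PATH, the same name used in the commands in this guide. The `have_cmd` helper is a hypothetical name introduced here for illustration and is not part of the SDK.

```shell
# Hypothetical helper (not part of the SDK): succeeds if the named
# command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Check for the `nexa` executable used by the examples in this guide.
if have_cmd nexa; then
  echo "nexa CLI found: ready to run the examples"
else
  echo "nexa CLI not found: install the SDK first"
fi
```

This check only tells you that the executable exists. It does not verify the installed version or that a model can be downloaded.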

LLM - Language Models

📝 Language models in GGUF format. Try this quick example:
nexa infer nexaml/Qwen3-0.6B
⌨️ Once the model loads, type or paste multi-line text directly into the CLI to chat with it.

LMM - Multimodal Models

🖼️ Language models in GGUF format that also accept vision and/or audio inputs. Try this quick example:
nexa infer NexaAI/Qwen2.5-Omni-3B-GGUF
⌨️ Drag photos or audio clips directly into the CLI. You can even drop multiple images at once!

Supported Model List

We have curated a list of top, high-quality models in GGUF format.
To try other GGUF models, visit Hugging Face, copy the path of any compatible GGUF model (e.g., unsloth/Qwen2.5-VL-3B-Instruct-GGUF), and use it in place of the model path in either of the commands above.
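
As a rough illustration of that substitution, the sketch below assembles the command with the example path mentioned above. The variable names `MODEL` and `CMD` are placeholders chosen for this example. Whether a given model actually loads depends on its compatibility with the SDK, which this snippet does not check.

```shell
# Model path copied from Hugging Face (the example path mentioned above).
MODEL="unsloth/Qwen2.5-VL-3B-Instruct-GGUF"

# Same command shape as the earlier examples, with the new model path.
CMD="nexa infer ${MODEL}"

# Print the assembled command; run it in your terminal to start the model.
echo "${CMD}"
```

To try a different model, change only the value of `MODEL` and leave the rest of the command as it is.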

🙋 Request New Models

Want a specific model? Submit an issue on the nexa-sdk GitHub repository or request it in our Discord/Slack community!