Run GGUF Models

⚙️ Prerequisites

If you haven’t already, install the Nexa SDK.
Below are the GGUF-compatible model types you can experiment with right away.

LLM - Language Models

📝 Language models in GGUF format. Try this quick example:

nexa infer nexaml/Qwen3-0.6B

⌨️ Once the model loads, type or paste multi-line text directly into the CLI to chat with the model.

LMM - Multimodal Models

🖼️ Multimodal models in GGUF format: language models that also accept vision and/or audio inputs. Try this quick example:

nexa infer NexaAI/Qwen2.5-Omni-3B-GGUF

⌨️ Drag photos or audio clips directly into the CLI — you can even drop multiple images at once!

Supported Model List

We have curated a list of top, high-quality models in GGUF format.

LLMs for GGUF

Multimodal for GGUF

To try other GGUF models, visit Hugging Face, copy the path of any compatible GGUF model (e.g., unsloth/Qwen2.5-VL-3B-Instruct-GGUF), and use it in place of the model path in the nexa infer command above.
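
As a minimal sketch of that substitution, the snippet below reuses the nexa infer command shape from the examples above with the Hugging Face path mentioned in the text. The MODEL variable name and the check that nexa is on your PATH are illustrative additions, not part of the SDK, and whether a given model loads depends on its compatibility.

```shell
# Hypothetical variable name; the value is the example Hugging Face path from the text.
MODEL="unsloth/Qwen2.5-VL-3B-Instruct-GGUF"

# Assumes the Nexa SDK install put a `nexa` executable on PATH.
if command -v nexa >/dev/null 2>&1; then
  # Same command as the earlier examples, with the model path swapped in.
  nexa infer "$MODEL"
else
  echo "nexa not found; install the Nexa SDK first" >&2
fi
```

To try a different model, change only the value of MODEL and rerun the command.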

🙋 Request New Models

Want a specific model? Submit an issue on the nexa-sdk GitHub repository or request it in our Discord/Slack community!
