Windows x64 Installation
Download the installer:

Run the downloaded .exe file and follow the installation wizard.

Running Your First Model
Currently, we support LLM and Multimodal models. More model type support is coming soon!

Language Model (LLM)

    nexa infer NexaAI/Qwen3-0.6B

Multimodal Model

    nexa infer NexaAI/Qwen2.5-Omni-3B-GGUF
To try other GGUF models, visit Hugging Face, copy the path of any compatible GGUF model (e.g., unsloth/Qwen2.5-VL-3B-Instruct-GGUF), and replace the model path in the command above.
Currently, only LLM (Large Language Model) and VLM (Vision Language Model) models are within the tested scope. More modalities are coming soon!
Windows ARM64 Installation
Download the installer:

Run the downloaded .exe file and follow the installation wizard.

Running Your First Model
Currently, we support LLM and Multimodal models. More model type support is coming soon!

Language Model (LLM)

    nexa infer NexaAI/Qwen3-0.6B

Multimodal Model

    nexa infer NexaAI/Qwen2.5-Omni-3B-GGUF
To try other GGUF models, visit Hugging Face, copy the path of any compatible GGUF model (e.g., unsloth/Qwen2.5-VL-3B-Instruct-GGUF), and replace the model path in the command above.
Currently, only LLM (Large Language Model) and VLM (Vision Language Model) models are within the tested scope. More modalities are coming soon!
NPU Acceleration (Snapdragon X Elite)
Hardware requirement: The following NPU-accelerated model currently runs only on Qualcomm Snapdragon X Elite laptops.
If you have a Snapdragon X Elite PC, you can run the flagship OmniNeural-4B model with NPU acceleration:

OmniNeural-4B (Multimodal NPU Model)

Voice Input Mode: Once running, record your voice directly in the terminal:

Press CTRL + C to stop recording, then press Enter to send.

File Input: Drag image/audio files into the command line:

    > describe this image '/path/to/image.jpg' '/path/to/audio.wav'

For detailed NPU setup instructions and advanced features, see the NPU Guide.

Linux Installation
Run the following command to download and install:

    curl -fsSL /path/to/install.sh -o install.sh && chmod +x install.sh && ./install.sh
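If the one-line command fails, it can be hard to tell which step went wrong. The sketch below performs the same three steps (download, mark executable, run) separately, with a check between them. The function name `download_and_run` is hypothetical and not part of the installer, and the URL stays the same placeholder used in the command above; substitute the real installer URL.

```shell
# Sketch only: download_and_run is an illustrative helper, not part of the SDK.
download_and_run() {
  local url="$1"
  local script="install.sh"

  # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
  curl -fsSL "$url" -o "$script" || { echo "download failed: $url" >&2; return 1; }

  # Refuse to run an empty download
  [ -s "$script" ] || { echo "downloaded file is empty" >&2; return 1; }

  chmod +x "$script"
  "./$script"
}

# Usage (placeholder URL, as in the command above):
# download_and_run "/path/to/install.sh"
```

Splitting the steps also leaves `install.sh` on disk, so it can be read before it is executed.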
Running Your First Model
Currently, we support LLM and Multimodal models. More model type support is coming soon!

Language Model (LLM)

    nexa infer NexaAI/Qwen3-0.6B

Multimodal Model

    nexa infer NexaAI/Qwen2.5-Omni-3B-GGUF
To try other GGUF models, visit Hugging Face, copy the path of any compatible GGUF model (e.g., unsloth/Qwen2.5-VL-3B-Instruct-GGUF), and replace the model path in the command above.
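As a minimal sketch of that substitution, the helper below passes a Hugging Face model path to `nexa infer`, after first checking that the `nexa` command is on the PATH. The function name `run_model` is hypothetical; only `nexa infer <model path>` and the example model path come from the text above.

```shell
# Sketch only: run_model is an illustrative wrapper, not part of the SDK.
run_model() {
  local model="$1"

  # Stop early with a clear message if the CLI is not installed
  if ! command -v nexa >/dev/null 2>&1; then
    echo "nexa not found on PATH; complete the installation first" >&2
    return 1
  fi

  nexa infer "$model"
}

# Example path from the text above; swap in any compatible GGUF model path.
# run_model "unsloth/Qwen2.5-VL-3B-Instruct-GGUF"
```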
We currently support LLM (Large Language Model) and VLM (Vision Language Model). More modalities are coming soon!