NexaAI Windows ARM64 Setup Guide
This guide demonstrates how to use the NexaAI SDK for various AI inference tasks on NPU devices, including:
- LLM (Large Language Model): Text generation and conversation
- VLM (Vision Language Model): Multimodal understanding and generation
- Embedder: Text vectorization and similarity computation
- Reranker: Document reranking
- ASR (Automatic Speech Recognition): Speech-to-text transcription
- CV (Computer Vision): OCR/text recognition
Prerequisites
1. Install the correct Python version
If you prefer, we also offer a video tutorial for the installation. Check it out here.
NexaAI requires Python 3.11 – 3.13 (ARM64 build) on Windows ARM. Please download and install the official ARM64 Python using the python-3.11.1-arm64.exe installer. Make sure you read the instructions below carefully before proceeding.
❗ IMPORTANT: Make sure you select “Add python.exe to PATH” on the first screen of the installation wizard.
🛑 Make sure you restart the terminal or your IDE after installation.
⚠️ Do not use Conda or x86 builds — they are incompatible with native ARM64 binaries. If you are in a conda environment, run conda deactivate first.
Verify the installation:
If your PATH has been overridden by an environment manager, we recommend running the following commands to restore the PATH variable from system settings.
Python version: 3.11.0 (main, Oct 24 2022, 18:15:22) [MSC v.1933 64 bit (ARM64)]
The expected output must contain version 3.11.0 and architecture ARM64.
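The version and architecture requirements can also be checked programmatically. The following is a minimal sketch, not part of the NexaAI SDK: the helper name interpreter_problems is hypothetical, and it assumes the ARM64 tag appears in sys.version the way it does in the sample output above.

```python
import sys


def interpreter_problems(version_info=sys.version_info, version_text=sys.version):
    """Return a list of problems; an empty list means the interpreter looks suitable."""
    problems = []
    major_minor = tuple(version_info[:2])
    # The guide asks for Python 3.11 through 3.13.
    if not (3, 11) <= major_minor <= (3, 13):
        problems.append(f"Python {major_minor[0]}.{major_minor[1]} is outside 3.11 - 3.13")
    # On Windows, sys.version ends with the compiler tag, e.g. "[MSC v.1933 64 bit (ARM64)]".
    if "ARM64" not in version_text.upper():
        problems.append("interpreter build is not ARM64")
    return problems


print("Python version:", sys.version)
for problem in interpreter_problems():
    print("Problem:", problem)
```

The check reads the build tag from sys.version rather than the host CPU, so an x86 interpreter running under emulation on an ARM machine is still reported as a problem.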
If the output shows AMD64 or an incorrect version, try the following:
- (If you have conda installed) Run conda deactivate to deactivate the current conda environment.
- (If your python executable points to the x86 version) You may need to make the ARM64 Python come before the x86 Python in your PATH.
  - Press the Win key, type env, and press Enter to select the “Edit the system environment variables” setting.
  - Click the “Environment Variables...” button.
  - Select “Path” and click “Edit...”.
  - Find your ARM64 Python installation path, and move it to the top of the list.
  - Click “OK” several times to close all the dialogs and save the changes.
- (If you forgot to select “Add python.exe to PATH” on the first screen of the installation wizard)
  - Run the installation wizard again, follow the instructions to remove the current installation, and then reinstall from the wizard. Make sure to select “Add python.exe to PATH” this time.
2. Create and activate a virtual environment
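As a minimal sketch, a virtual environment can be created with Python's standard-library venv module, and a script can detect whether it is running inside one. The helper names create_env and in_virtual_env are hypothetical and are not part of the NexaAI SDK.

```python
import sys
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    venv.create(env_dir, with_pip=with_pip)
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def in_virtual_env():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


print("Inside a virtual environment:", in_virtual_env())
```

Creating the environment from the ARM64 interpreter keeps the environment on the ARM64 build. Activation still happens in the shell, typically through the activate script in the environment's Scripts directory on Windows.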
3. Install the NexaAI SDK
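After installation, it can help to confirm that the package is visible to the active interpreter. The sketch below uses only the standard library; the distribution name nexaai is an assumption, so adjust it to match the package you actually installed.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "nexaai" is an assumed distribution name; adjust it to match the package you installed.
print("NexaAI SDK version:", installed_version("nexaai"))
```

A result of None usually means the package was installed into a different interpreter or environment than the one currently running.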
4. Verify Your Environment
Run the following code to ensure you have the right environment:
Authentication Setup
Before running any examples, you need to set up your NexaAI authentication token.
Set Token in Code
Replace "YOUR_NEXA_TOKEN_HERE" with your actual NexaAI token from https://sdk.nexa.ai/:
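A minimal sketch of setting the token from code, assuming the SDK reads it from an environment variable. The variable name NEXA_TOKEN and the helper set_token are assumptions for illustration, so confirm the expected name against the SDK documentation.

```python
import os

PLACEHOLDER = "YOUR_NEXA_TOKEN_HERE"


def set_token(token, env_var="NEXA_TOKEN"):
    """Store the token in the process environment, rejecting the unedited placeholder."""
    # NEXA_TOKEN is an assumed variable name; confirm it against the SDK documentation.
    if not token or token == PLACEHOLDER:
        raise ValueError("Replace the placeholder with your token from https://sdk.nexa.ai/")
    os.environ[env_var] = token
```

Call set_token with your real token before loading any model. Rejecting the placeholder early gives a clearer error than a failed authentication later.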
1. LLM (Large Language Model) NPU Inference
Using NPU-accelerated large language models for text generation and conversation. Llama3.2-3B-NPU-Turbo is specifically optimized for NPU.
2. VLM (Vision Language Model) NPU Inference
Using NPU-accelerated vision language models for multimodal understanding and generation. OmniNeural-4B supports joint processing of images and text.
3. Embedder NPU Inference
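Similarity between two embedding vectors is commonly measured with cosine similarity. The sketch below is plain Python and does not call the NexaAI SDK; the short toy vectors stand in for real model output, and the function name cosine_similarity is illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in the range [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat an all-zero vector as having no similarity to anything.
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings of two texts.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Values near 1 indicate vectors pointing in nearly the same direction, which for text embeddings generally corresponds to similar meaning.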
Using NPU-accelerated embedding models for text vectorization and similarity computation. embeddinggemma-300m-npu is a lightweight embedding model specifically optimized for NPU.
4. ASR (Automatic Speech Recognition) NPU Inference
Using NPU-accelerated speech recognition models for speech-to-text transcription. parakeet-npu provides high-quality speech recognition with NPU acceleration.
5. Reranker NPU Inference
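A reranker assigns each candidate document a relevance score for a query, and the final ordering comes from sorting by that score. The sketch below shows only that last step with hypothetical scores; it does not call the NexaAI SDK, and rank_documents is an illustrative name.

```python
def rank_documents(documents, scores, top_n=None):
    """Pair each document with its score and return the pairs, highest score first."""
    if len(documents) != len(scores):
        raise ValueError("documents and scores must have the same length")
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]  # top_n=None keeps every document.


# Hypothetical relevance scores, as a reranking model might return for one query.
docs = ["NPU setup notes", "Cooking recipes", "ARM64 Python install"]
for doc, score in rank_documents(docs, [0.72, 0.05, 0.91], top_n=2):
    print(f"{score:.2f}  {doc}")
```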
Using NPU-accelerated reranking models for document reranking. jina-v2-rerank-npu can perform precise similarity-based document ranking based on queries.
6. Computer Vision (CV) NPU Inference
Run NPU-accelerated computer vision tasks (e.g., OCR/text recognition) on images.
Next Steps
- Explore the API Reference for comprehensive documentation
- Check out the macOS Guide for Apple Silicon optimization
- Visit the Windows x64 Guide for CPU/GPU acceleration