An optimized format for running models on Neural Processing Units (NPUs), enabling efficient, low-latency inference on edge devices and specialized hardware.
First, set your license token, which is required for NPU model access:

```shell
nexa config set license '<your_token_here>'
```
Then run inference with an NPU-optimized model, for example:

```shell
nexa infer NexaAI/qwen3-4B-npu
nexa infer NexaAI/OmniNeural-4B
```