Overview - Documentations

💰 New: Join our Builder Bounty Program — earn up to 1,500 USD for building with NexaSDK.

NexaSDK lets you run any AI model locally on NPUs, GPUs, or CPUs - easiest to use, most flexibile, and with best performance. NexaSDK is powered by NexaML engine, our unified inference engine built from scratch at the kernel level. It is capable of Day-0 support for new model architectures (LLMs, multimodal, audio, vision). NexaML supports 3 model formats: GGUF, MLX, and Nexa AI’s own .nexa format.

Why NexaSDK

Feature	NexaSDK	Ollama	llama.cpp
NPU support	🟢 NPU-first	🔴	🔴
Day-0 support for model formats (GGUF, MLX, NEXA)	🟢 Support Any Model Day-0	🔴	🟡
Full multimodality	🟢 Image, Audio, Text	🟡	🟡
Cross-platform support	🟢 Desktop, Mobile, Automotive, IoT	🟡	🟡
One line of code to run	🟢	🟢	🟡
OpenAI API + Function calling	🟢	🟢	🟢