NexaSDK is an easy-to-use developer toolkit for running any AI model locally — across NPUs, GPUs, and CPUs — powered by our NexaML engine, built entirely from scratch for peak performance on every hardware stack. Unlike wrappers that depend on existing runtimes, NexaML is a unified inference engine built at the kernel level. That is what lets NexaSDK achieve Day-0 support for new model architectures (LLM, VLM, CV, Embedding, Rerank, ASR, TTS). NexaML supports three model formats: GGUF, MLX, and Nexa AI's own .nexa format.

Why NexaSDK

| Feature | NexaSDK | Ollama | llama.cpp | LM Studio |
| --- | --- | --- | --- | --- |
| NPU support | 🟢 NPU-first | 🟡 | 🟡 | 🔴 |
| Android/iOS SDK support | 🟢 NPU/GPU/CPU | 🟡 | 🟡 | 🔴 |
| Linux support (Docker image) | 🟢 | 🟢 | 🟢 | 🔴 |
| Any model in GGUF, MLX, or NEXA format | 🟢 Low-level control | 🔴 | 🟡 | 🔴 |
| Full multimodality support | 🟢 Image, Audio, Text, Embedding, Rerank, ASR, TTS | 🟡 | 🟡 | 🟡 |
| Cross-platform support | 🟢 Desktop, Mobile (Android, iOS), Automotive, IoT (Linux) | 🟡 | 🟡 | 🟡 |
| One line of code to run | 🟢 | 🟢 | 🟡 | 🟢 |
| OpenAI-compatible API + Function calling | 🟢 | 🟢 | 🟢 | 🟢 |
Legend: 🟢 Supported  |  🟡 Partial or limited support  |  🔴 No
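An OpenAI-compatible API means existing OpenAI client code can target a locally served model just by changing the base URL. A minimal sketch of the request shape, using a hypothetical local endpoint and a placeholder model name (both are assumptions, not NexaSDK defaults):

```python
import json

# Hypothetical values -- the real host, port, and model name depend on how
# you start the local server; these are placeholders, not NexaSDK defaults.
BASE_URL = "http://localhost:8080/v1"
MODEL = "your-local-model"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build a standard OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

payload = build_chat_request(MODEL, "Hello!")
# Any OpenAI-compatible client can POST this body to
# BASE_URL + "/chat/completions"; the official openai package works by
# passing base_url=BASE_URL when constructing the client.
print(json.dumps(payload, indent=2))
```

Because the request body follows the OpenAI schema, the same code works unchanged against any of the 🟢 entries in that row.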

Get Started
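A minimal quickstart sketch. The command names below (`nexa pull`, `nexa infer`) are assumptions about the CLI, and `<model-name>` is a placeholder — check `nexa --help` on your install for the exact syntax:

```shell
# Download a model in GGUF, MLX, or .nexa format (model name is a placeholder):
nexa pull <model-name>

# Start an interactive chat with one line:
nexa infer <model-name>
```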

Community