Getting Started
To use the API, first start the NexaSDK Docker container in server mode. Once running, the server listens on http://127.0.0.1:18181 by default. Keep the container running, and make your requests from another terminal or application.
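A minimal startup command might look like the sketch below. The image name, tag, and environment-variable name are assumptions for illustration — use the exact invocation from the NexaSDK documentation for your platform.

```shell
# Hypothetical sketch — image name and NEXA_TOKEN variable are assumptions,
# not confirmed by this page; check the official docs for the real command.
docker run --privileged \
  -p 18181:18181 \
  -e NEXA_TOKEN=YOUR_LONG_TOKEN_HERE \
  nexaai/nexasdk serve
```

The `-p 18181:18181` mapping exposes the server on the default port mentioned above, and `--privileged` is explained in the note below the command.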
You can also access the interactive Swagger UI documentation at
http://127.0.0.1:18181/docs/ui to explore and test the API endpoints directly from your browser.

Replace YOUR_LONG_TOKEN_HERE with your actual Nexa token. You can obtain a token by creating an account at sdk.nexa.ai and generating one in Deployment → Create Token.

The --privileged flag is required for NPU access.

Model Choice
NexaSDK Docker supports Linux ARM64 architecture. For a complete list of supported models and their Hugging Face links, see the Overview page.

API Endpoints
The NexaSDK REST API provides OpenAI-compatible endpoints for various AI tasks. For detailed API documentation including request/response formats, examples, and all available endpoints, please refer to the CLI REST API documentation.

Available Endpoints
- /v1/chat/completions - Creates model responses for conversations (LLM and VLM)
- /v1/embeddings - Creates embeddings for text input
- /v1/reranking - Reranks documents based on query relevance
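Because the endpoints are OpenAI-compatible, they can be exercised with plain curl once the container is running. The sketch below shows the chat and embedding endpoints; the model names are placeholders, not real identifiers — substitute a model from the Overview page.

```shell
# Build a chat request body. "YOUR_MODEL_NAME" is a placeholder.
cat > /tmp/chat_req.json <<'EOF'
{
  "model": "YOUR_MODEL_NAME",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ]
}
EOF

# POST it to the local server; prints a note instead if the container is down.
curl -s http://127.0.0.1:18181/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @/tmp/chat_req.json || echo "server not reachable"

# Embeddings use the same OpenAI-compatible request shape.
cat > /tmp/embed_req.json <<'EOF'
{
  "model": "YOUR_EMBEDDING_MODEL",
  "input": "NexaSDK runs models locally."
}
EOF

curl -s http://127.0.0.1:18181/v1/embeddings \
  -H "Content-Type: application/json" \
  -d @/tmp/embed_req.json || echo "server not reachable"
```

Any OpenAI-compatible client library should also work by pointing its base URL at http://127.0.0.1:18181/v1.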
All API endpoints, request/response formats, and usage examples are documented in the CLI REST API page. The API interface is identical whether running via the CLI or Docker; only the server startup method differs.