Getting Started

To use the API, first start the NexaSDK Docker container in server mode:
```bash
export NEXA_TOKEN="YOUR_LONG_TOKEN_HERE"
docker run --rm -d -p 18181:18181 --privileged \
  -v /path/to/data:/data \
  -v /etc/machine-id:/etc/machine-id:ro \
  -e NEXA_TOKEN \
  nexa4ai/nexasdk serve
```
The server runs on http://127.0.0.1:18181 by default.
Keep the container running and send your requests from another terminal or application.
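Because the container is started detached (`-d`), you can confirm it is up with standard Docker commands, for example:

```bash
# List running containers to confirm the server started.
docker ps

# Inspect the server logs (replace <container-id> with the ID shown by docker ps).
docker logs <container-id>
```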
You can also access the interactive Swagger UI documentation at http://127.0.0.1:18181/docs/ui to explore and test the API endpoints directly from your browser.
Replace YOUR_LONG_TOKEN_HERE with your actual Nexa token. You can obtain a token by creating an account at sdk.nexa.ai and generating one in Deployment → Create Token.
The --privileged flag is required for NPU access.
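With the server running, you can sanity-check it with a quick chat completion request. This is a minimal sketch using the OpenAI-compatible request format; the model identifier below is a placeholder, so substitute a model supported by your deployment (see the Overview page).

```bash
# Minimal smoke test against the local server.
# NOTE: "your-model-name" is a placeholder; replace it with a model
# identifier that your NexaSDK deployment actually serves.
curl http://127.0.0.1:18181/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-name",
    "messages": [
      {"role": "user", "content": "Hello! Can you hear me?"}
    ]
  }'
```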

Model Choice

NexaSDK Docker supports the Linux ARM64 architecture. For a complete list of supported models and their Hugging Face links, see the Overview page.

API Endpoints

The NexaSDK REST API provides OpenAI-compatible endpoints for various AI tasks. For detailed API documentation including request/response formats, examples, and all available endpoints, please refer to the CLI REST API documentation.

Available Endpoints

  • /v1/chat/completions - Creates model responses for conversations (LLM and VLM)
  • /v1/embeddings - Creates embeddings for text input
  • /v1/reranking - Reranks documents based on query relevance
All API endpoints, request/response formats, and usage examples are documented in the CLI REST API page. The API interface is identical whether running via CLI or Docker; only the server startup method differs.
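As a rough illustration of the remaining endpoints, the sketches below show what embeddings and reranking requests might look like. The embeddings body follows the OpenAI format; the reranking body uses a common query/documents convention that is an assumption here, so confirm the exact field names against the CLI REST API documentation or the Swagger UI. Model names are placeholders.

```bash
# Embeddings: OpenAI-style request body (model name is a placeholder).
curl http://127.0.0.1:18181/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-embedding-model",
    "input": "The quick brown fox jumps over the lazy dog."
  }'

# Reranking: the query/documents fields follow a common reranking
# convention and are assumptions; verify them in the Swagger UI.
curl http://127.0.0.1:18181/v1/reranking \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-reranking-model",
    "query": "What is the capital of France?",
    "documents": [
      "Paris is the capital of France.",
      "Berlin is the capital of Germany."
    ]
  }'
```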