> ## Documentation Index
> Fetch the complete documentation index at: https://docs.nexa.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# REST API

> Local OpenAI-compatible API for text generation, embeddings, and more

## **Getting Started**

To use the API, first start the NexaSDK Docker container in server mode:

```bash bash theme={"dark"}
export NEXA_TOKEN="YOUR_LONG_TOKEN_HERE"
docker run --rm -d -p 18181:18181 --privileged \
  -v /path/to/data:/data \
  -v /etc/machine-id:/etc/machine-id:ro \
  -e NEXA_TOKEN \
  nexa4ai/nexasdk serve
```

The server runs on `http://127.0.0.1:18181` by default. Keep the container running, and make your requests from another terminal or application.

<Info>
  You can also access the interactive Swagger UI documentation at [`http://127.0.0.1:18181/docs/ui`](http://127.0.0.1:18181/docs/ui) to explore and test the API endpoints directly from your browser. For a full list of configurable options, refer to the [Quickstart guide](/en/nexa-sdk-docker/quickstart).
</Info>

<Note>
  Replace `YOUR_LONG_TOKEN_HERE` with your actual Nexa token. You can obtain a token by creating an account at [sdk.nexa.ai](https://sdk.nexa.ai) and generating one in **Deployment → Create Token**.
</Note>

<Note>
  The `--privileged` flag is required for NPU access.
</Note>

## Model Choice

NexaSDK Docker supports both Linux ARM64 and x64 architectures. For a complete list of supported models and their Hugging Face links, see the [Quickstart guide](/en/nexa-sdk-docker/quickstart#supported-models).

## **API Endpoints**

The NexaSDK REST API provides OpenAI-compatible endpoints for various AI tasks. For detailed API documentation including request/response formats, examples, and all available endpoints, please refer to the [CLI REST API documentation](/en/nexa-sdk-go/NexaAPI).

### **Available Endpoints**

* **`/v1/chat/completions`** - Creates model responses for conversations (LLM and VLM)
* **`/v1/embeddings`** - Creates embeddings for text input
* **`/v1/reranking`** - Reranks documents based on query relevance

<Info>
  All API endpoints, request/response formats, and usage examples are documented in the [CLI REST API page](/en/nexa-sdk-go/NexaAPI). The API interface is identical whether running via CLI or Docker - only the server startup method differs.
</Info>

<br />

<div class="feedback-wrapper">
  <span class="feedback-label">Was this page helpful?</span>

  <div class="feedback-toggle">
    <input type="radio" name="feedback" id="feedback-yes" class="feedback-input" />

    <label for="feedback-yes" class="feedback-button">
      <img src="https://mintcdn.com/nexaai/g8-zBYnunEyVtcK3/Images/FeedBack/thumbs-up.svg?fit=max&auto=format&n=g8-zBYnunEyVtcK3&q=85&s=0b57c51c8db9940403e7552956e5c30e" alt="Thumbs up" class="feedback-icon" noZoom width="14" height="14" data-path="Images/FeedBack/thumbs-up.svg" />

      Yes
    </label>

    <input type="radio" name="feedback" id="feedback-no" class="feedback-input" />

    <label for="feedback-no" class="feedback-button">
      <img src="https://mintcdn.com/nexaai/g8-zBYnunEyVtcK3/Images/FeedBack/thumbs-down.svg?fit=max&auto=format&n=g8-zBYnunEyVtcK3&q=85&s=ebacf61d57c8259c6df243d329b548b3" alt="Thumbs down" class="feedback-icon" noZoom width="14" height="14" data-path="Images/FeedBack/thumbs-down.svg" />

      No
    </label>
  </div>
</div>
