Prerequisites

Before you begin, make sure you have:
  • Python 3.10
    • If you are using conda, you can create a new environment via:
      conda create -n nexaai python=3.10
      conda activate nexaai
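
You can confirm the environment is active and running the expected interpreter with:
  python --version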
      

Installation

Install the latest NexaAI Python SDK from PyPI. Use the command for your operating system; a quick verification step follows the commands:
  • Windows and Linux:
    pip install nexaai
    
  • macOS:
    pip install 'nexaai[mlx]'
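
To verify the install succeeded, import the package; if this command exits without an error, the SDK is available:
  python -c "import nexaai"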
    

Authentication Setup

Before running any examples, you need to set up your NexaAI authentication token.

Set Token in Environment

Replace "YOUR_NEXA_TOKEN_HERE" with your actual NexaAI token from https://sdk.nexa.ai/:
  • Linux/macOS:
    export NEXA_TOKEN="YOUR_NEXA_TOKEN_HERE"
    
  • Windows (PowerShell):
    $env:NEXA_TOKEN="YOUR_NEXA_TOKEN_HERE"
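
If you want to confirm the token is visible to Python before running the examples, a minimal check using only the standard library looks like this:

Python
import os

# Fail fast if the token was not exported in this shell session
token = os.environ.get("NEXA_TOKEN")
if not token:
    raise RuntimeError("NEXA_TOKEN is not set; export it as shown above.")
print("NEXA_TOKEN is set.")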
    

Running Your First Model

Language Model (LLM)

Python
from nexaai.llm import LLM, GenerationConfig
from nexaai.common import ModelConfig, ChatMessage

# Initialize model (the path assumes the model is already present in the local Nexa cache)
model_path = "~/.cache/nexa.ai/nexa_sdk/models/Qwen/Qwen3-0.6B-GGUF/Qwen3-0.6B-Q8_0.gguf"
m_cfg = ModelConfig()
llm = LLM.from_(model_path, plugin_id="cpu_gpu", device_id="cpu", m_cfg=m_cfg)

# Create conversation
conversation = [ChatMessage(role="system", content="You are a helpful assistant.")]
conversation.append(ChatMessage(role="user", content="Hello, how are you?"))

# Apply chat template and generate
prompt = llm.apply_chat_template(conversation)
for token in llm.generate_stream(prompt, g_cfg=GenerationConfig(max_tokens=100)):
    print(token, end="", flush=True)
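
Streaming yields the reply one text fragment at a time. To keep a multi-turn chat going, you can collect the fragments and append them to the history as an assistant message before the next user turn. A minimal sketch, reusing only the classes above and assuming generate_stream yields plain strings:

Python
# Stream the reply while collecting it for the conversation history
reply = ""
for token in llm.generate_stream(prompt, g_cfg=GenerationConfig(max_tokens=100)):
    print(token, end="", flush=True)
    reply += token
conversation.append(ChatMessage(role="assistant", content=reply))

# A follow-up turn then reuses the accumulated history
conversation.append(ChatMessage(role="user", content="Can you summarize what we discussed?"))
prompt = llm.apply_chat_template(conversation)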

Multimodal Model (VLM)

Python
from nexaai.vlm import VLM, GenerationConfig
from nexaai.common import ModelConfig, MultiModalMessage, MultiModalMessageContent

# Initialize model
model_path = "~/.cache/nexa.ai/nexa_sdk/models/NexaAI/gemma-3n-E4B-it-4bit-MLX/model-00001-of-00002.safetensors"
m_cfg = ModelConfig()
vlm = VLM.from_(name_or_path=model_path, m_cfg=m_cfg, plugin_id="cpu_gpu", device_id="")

# Create multimodal conversation
conversation = [MultiModalMessage(role="system",
                                  content=[MultiModalMessageContent(type="text", text="You are a helpful assistant.")])]

# Add user message with image
contents = [
    MultiModalMessageContent(type="text", text="Describe this image"),
    MultiModalMessageContent(type="image", text="path/to/image.jpg")
]
conversation.append(MultiModalMessage(role="user", content=contents))

# Apply chat template and generate
prompt = vlm.apply_chat_template(conversation)
for token in vlm.generate_stream(prompt, g_cfg=GenerationConfig(max_tokens=100, image_paths=["path/to/image.jpg"])):
    print(token, end="", flush=True)
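
If you want the full description as a single string rather than streamed output, you can join the fragments (again assuming generate_stream yields plain strings):

Python
# Collect the whole streamed reply into one string
config = GenerationConfig(max_tokens=100, image_paths=["path/to/image.jpg"])
description = "".join(vlm.generate_stream(prompt, g_cfg=config))
print(description)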

Embedder

Python
from nexaai.embedder import Embedder, EmbeddingConfig

# Initialize embedder
model_path = "~/.cache/nexa.ai/nexa_sdk/models/NexaAI/jina-v2-fp16-mlx/model.safetensors"
embedder = Embedder.from_(name_or_path=model_path, plugin_id="cpu_gpu")

# Generate embeddings
texts = ["Hello world", "How are you?"]
config = EmbeddingConfig(batch_size=2)
embeddings = embedder.generate(texts=texts, config=config)

for text, embedding in zip(texts, embeddings):
    print(f"Text: {text}")
    print(f"Embedding dimension: {len(embedding)}")

Reranker

Python
from nexaai.rerank import Reranker, RerankConfig

# Initialize reranker
model_path = "~/.cache/nexa.ai/nexa_sdk/models/NexaAI/jina-v2-rerank-mlx/jina-reranker-v2-base-multilingual-f16.safetensors"
reranker = Reranker.from_(name_or_path=model_path, plugin_id="cpu_gpu")

# Rerank documents
query = "What is machine learning?"
documents = ["Machine learning is a subset of AI", "Python is a programming language"]
config = RerankConfig(batch_size=2)
scores = reranker.rerank(query=query, documents=documents, config=config)

for doc, score in zip(documents, scores):
    print(f"[{score:.4f}] {doc}")

Next Steps

