Upload Model

🤝 Share your model and connect with developers, researchers, and users for support and collaboration.


To upload models, you will need to create an account with Nexa AI Hub. Currently, you can upload models through the web interface. After uploading a model, you have control over which files to include and how to create a tag for your model to make it more discoverable; read more below.

Upload Method

Step 0. To upload models to the Hub, visit Nexa AI Hub and make sure you have registered an account: click the "Sign in/Sign up" button on the top right corner.

Step 1. Click the "Upload your model" button on the top right corner.

Step 2. Fill in the model name, parameters, model type, and license (optional).

Step 3. Fill in the model tag name in the highlighted "Model tag name" field; see the tag section below on how to tag your model.

Step 4. Edit the model tag to make your model more discoverable: click "Edit quantization tag" and edit the quantization tag information as needed.

Step 5. Upload files: click the "Upload file" button under the "Files" tab.

Step 6. Edit the README to add more description for your model: click the "Edit" button under the "Model Info" tab.

About Model Tag

Model Tag Name

We recommend choosing model tag names that are self-explanatory. For official models in the Model Hub, we include the model precision in the tag name.

| Model Precision | Bits per Weight (BPW) Approximation |
| --------------- | ----------------------------------- |
| gguf-q2_K | 2 |
| gguf-q3_K_L | 3 |
| gguf-q3_K_M | 3 |
| gguf-q3_K_S | 3 |
| gguf-q4_0 | 4 |
| gguf-q4_1 | 4 |
| gguf-q4_K_M | 4 |
| gguf-q4_K_S | 4 |
| gguf-q5_0 | 5 |
| gguf-q5_1 | 5 |
| gguf-q5_K_M | 5 |
| gguf-q5_K_S | 5 |
| gguf-q6_K | 6 |
| gguf-q8_0 | 8 |
| onnx-int4 | 4 |
| onnx-int8 | 8 |
| onnx-bf16 | 16 |
| onnx-fp16 | 16 |
| onnx-fp32 | 32 |
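Where these values are needed programmatically, for example to feed the RAM estimate further below, the table can be encoded as a simple lookup. This is an illustrative sketch, not part of the Nexa SDK; the dictionary name is made up:

```python
# Approximate bits per weight (BPW) per precision tag, copied from the table above.
# Illustrative only; BPW_BY_TAG is not a Nexa SDK name.
BPW_BY_TAG = {
    "gguf-q2_K": 2,
    "gguf-q3_K_L": 3, "gguf-q3_K_M": 3, "gguf-q3_K_S": 3,
    "gguf-q4_0": 4, "gguf-q4_1": 4, "gguf-q4_K_M": 4, "gguf-q4_K_S": 4,
    "gguf-q5_0": 5, "gguf-q5_1": 5, "gguf-q5_K_M": 5, "gguf-q5_K_S": 5,
    "gguf-q6_K": 6,
    "gguf-q8_0": 8,
    "onnx-int4": 4, "onnx-int8": 8,
    "onnx-bf16": 16, "onnx-fp16": 16, "onnx-fp32": 32,
}
```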

File Format Tag

  • GGUF is an optimized binary format designed for efficient model loading and saving, particularly suited for inference tasks. It is compatible with GGML and other executors. Developed by @ggerganov, the creator of llama.cpp (a widely used C/C++ LLM inference framework), GGUF forms the foundation of the Nexa SDK's GGML component.

  • ONNX is an open standard format for representing machine learning models. It establishes a common set of operators and a unified file format, enabling AI developers to use models across various frameworks, tools, runtimes, and compilers. ONNX shows unique performance advantages on devices with limited RAM (mobile, IoT). The Nexa SDK's ONNX component is built on the onnxruntime framework.
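As a quick local sanity check before uploading, you can probe a file's format. The sketch below is illustrative and not part of the Nexa SDK: the GGUF check relies on the format's 4-byte b"GGUF" magic header, and the ONNX load uses the onnxruntime package mentioned above.

```python
import onnxruntime as ort  # pip install onnxruntime

def is_gguf(path: str) -> bool:
    # GGUF files begin with the 4-byte ASCII magic b"GGUF".
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

def load_onnx(path: str) -> ort.InferenceSession:
    # onnxruntime parses and validates the ONNX graph when the session is created.
    return ort.InferenceSession(path, providers=["CPUExecutionProvider"])
```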

RAM

This metric indicates the minimum random access memory (RAM) necessary for local model execution. You can calculate the approximate RAM required (in bytes) according to the formula:

RAM = Parameters × BPW / 8
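For example, a 7-billion-parameter model tagged gguf-q4_K_M (about 4 BPW) needs roughly 7e9 × 4 / 8 ≈ 3.5 GB. A minimal sketch of the calculation (the function name is illustrative):

```python
def estimate_ram_gb(parameters: float, bpw: float) -> float:
    # parameters * bpw is the total number of bits; / 8 converts to bytes, / 1e9 to GB.
    return parameters * bpw / 8 / 1e9

print(estimate_ram_gb(7e9, 4))  # -> 3.5 (GB) for a 7B model at ~4 BPW
```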

File Size

Displays the total storage space required for the model.
