Upload Model
🤝 Share your model and connect with developers, researchers, and users for support and collaboration.
To upload models, you will need to create an account. Currently, you can upload models through the web interface. After uploading, you control which files to include in your model and how to create model tags to make your model more discoverable; read more below.
Step 0. To upload models to the Hub, visit Nexa Hub and make sure you have registered an account.
Step 1. Click "Upload your model"
Step 2. Fill in the model name, parameters, model type, and license (optional)
Step 4. Edit the model tags to make your model more discoverable
Step 5. Upload files
Step 6. Edit the README to add more description for your model
Model Tag Name
We recommend choosing descriptive model tag names. For official models in the Model Hub, we include the model precision in the tag name:
| Model Tag Name | Precision (bits) |
| -------------- | ---------------- |
| gguf-q2_K      | 2                |
| gguf-q3_K_L    | 3                |
| gguf-q3_K_M    | 3                |
| gguf-q3_K_S    | 3                |
| gguf-q4_0      | 4                |
| gguf-q4_1      | 4                |
| gguf-q4_K_M    | 4                |
| gguf-q4_K_S    | 4                |
| gguf-q5_0      | 5                |
| gguf-q5_1      | 5                |
| gguf-q5_K_M    | 5                |
| gguf-q5_K_S    | 5                |
| gguf-q6_K      | 6                |
| gguf-q8_0      | 8                |
| onnx-int4      | 4                |
| onnx-int8      | 8                |
| onnx-bf16      | 16               |
| onnx-fp16      | 16               |
| onnx-fp32      | 32               |
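The tag naming convention above (`<format>-<quantization>`) can be sketched as a small helper. The function name and validation rules here are illustrative assumptions, not part of the Nexa Hub API:

```python
def make_model_tag(file_format: str, precision: str) -> str:
    """Build a model tag like 'gguf-q4_K_M' or 'onnx-int8'.

    Illustrative helper only; Nexa Hub does not expose this function.
    """
    file_format = file_format.lower()
    if file_format not in ("gguf", "onnx"):
        raise ValueError(f"unsupported file format: {file_format}")
    return f"{file_format}-{precision}"

print(make_model_tag("GGUF", "q4_K_M"))  # gguf-q4_K_M
```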
File Format Tag
Indicates the model's file format: GGUF or ONNX.
RAM
This metric indicates the minimum random access memory (RAM) necessary for local model execution. You can estimate the approximate RAM required from the model's parameter count and weight precision.
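As a rough, hedged estimate, the weights alone occupy about parameters × bits ÷ 8 bytes; the actual requirement also depends on context length, KV cache, and runtime overhead. The 20% overhead factor below is an illustrative assumption, not an official figure:

```python
def estimate_ram_gb(num_params: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    """Rough RAM estimate in GiB: weight bytes times an assumed ~20% runtime overhead.

    The overhead factor is an illustrative assumption, not an official figure.
    """
    weight_bytes = num_params * bits_per_weight / 8
    return weight_bytes * overhead / 1024**3

# A 7B-parameter model quantized to 4 bits needs roughly 3.9 GiB by this estimate.
print(round(estimate_ram_gb(7e9, 4), 1))
```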
File Size
Displays the total storage space required for the model.
Step 3. Fill in the model tag name; see the Model Tag Name section above for guidance on how to tag your model
GGUF is an optimized binary format designed for efficient model loading and saving, particularly suited for inference tasks. It is compatible with GGML and other executors. Developed by Georgi Gerganov, the creator of llama.cpp (a widely used C/C++ LLM inference framework), GGUF forms the foundation of the Nexa SDK's GGML component.
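Every GGUF file begins with the 4-byte magic `GGUF`, so a quick sanity check before uploading might look like the sketch below (the helper name is an illustrative assumption, not part of any SDK):

```python
import os
import struct
import tempfile

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file

def looks_like_gguf(path: str) -> bool:
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC

# Demo with a stand-in file: write the magic plus a little-endian version field.
with tempfile.NamedTemporaryFile(delete=False, suffix=".gguf") as tmp:
    tmp.write(GGUF_MAGIC + struct.pack("<I", 3))  # start of a GGUF v3 header
    name = tmp.name
print(looks_like_gguf(name))  # True
os.remove(name)
```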
ONNX is an open standard format for representing machine learning models. It establishes a common set of operators and a unified file format, enabling AI developers to use models across various frameworks, tools, runtimes, and compilers. ONNX offers distinct performance advantages on devices with limited RAM (mobile, IoT). The Nexa SDK's ONNX component is built upon the onnxruntime framework.