⬆️ Upload Model
🤝 Share your model and connect with developers, researchers, and users for support and collaboration.
To upload models, you will need to create an account with Nexa AI Hub. Currently, you can upload models through the web interface. After uploading a model, you have control over which files to include and how to tag the model to make it more discoverable; read more below.
Step 0. Visit Nexa Hub and make sure you have registered a Nexa AI Hub account.
Step 1. Click "Upload your model"
Step 2. Fill in the model name, parameters, model type, and license (optional)
Step 3. Fill in the model tag name; see the tag section below for how to tag your model
Step 4. Edit the model tag to make your model more discoverable
Step 5. Upload files
Step 6. Edit the README to add a fuller description of your model
Model Tag Name
We recommend choosing model tag names that are descriptive. For official models in the Model Hub, we include the model precision in the tag name (for example, gguf-q4_0).
File Format Tag
GGUF
GGUF is an optimized binary format designed for efficient model loading and saving, particularly suited for inference tasks. It is compatible with GGML and other executors. Developed by @ggerganov, the creator of llama.cpp (a widely-used C/C++ LLM inference framework), GGUF forms the foundation of the Nexa SDK's GGML component.
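For a concrete sense of how a GGUF file is consumed, here is a minimal sketch using the third-party llama-cpp-python bindings rather than the Nexa SDK itself; the model path and prompt are placeholder assumptions:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Load a quantized GGUF model from disk; the path is a placeholder.
llm = Llama(model_path="./llama-2-7b.Q4_0.gguf", n_ctx=2048)

# Run a short completion against the local model.
out = llm("Q: What is the GGUF format? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```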
ONNX
ONNX is an open standard format for representing machine learning models. It establishes a common set of operators and a unified file format, enabling AI developers to use models across various frameworks, tools, runtimes, and compilers. ONNX offers particular performance advantages on devices with limited RAM (e.g., mobile and IoT devices). The Nexa SDK's ONNX component is built upon the onnxruntime framework.
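As an illustration of the ONNX side, here is a minimal onnxruntime sketch; the file name, input shape, and dtype are placeholder assumptions, since they depend on the exported model:

```python
import numpy as np
import onnxruntime as ort  # pip install onnxruntime

# Load an ONNX model; "model.onnx" is a placeholder path.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Inspect what the model expects as input.
inp = session.get_inputs()[0]
print(inp.name, inp.shape, inp.type)

# Run inference with dummy data (assumes a float32 image-like input).
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {inp.name: x})
print(outputs[0].shape)
```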
RAM
This metric indicates the minimum random access memory (RAM) necessary for local model execution. You can calculate the approximate RAM required according to the formula:

RAM (bytes) ≈ number of parameters × BPW / 8

where BPW is the bits-per-weight value for the model's precision (see the table below). This approximates the memory needed to hold the weights; runtime overhead such as the context cache adds to it. A code sketch applying this formula appears after the precision table below.
File Size
Displays the total storage space required for the model.
| Model Precision | Bits per Weight (BPW) Approximation |
|---|---|
| gguf-q2_K | 2 |
| gguf-q3_K_L | 3 |
| gguf-q3_K_M | 3 |
| gguf-q3_K_S | 3 |
| gguf-q4_0 | 4 |
| gguf-q4_1 | 4 |
| gguf-q4_K_M | 4 |
| gguf-q4_K_S | 4 |
| gguf-q5_0 | 5 |
| gguf-q5_1 | 5 |
| gguf-q5_K_M | 5 |
| gguf-q5_K_S | 5 |
| gguf-q6_K | 6 |
| gguf-q8_0 | 8 |
| onnx-int4 | 4 |
| onnx-int8 | 8 |
| onnx-bf16 | 16 |
| onnx-fp16 | 16 |
| onnx-fp32 | 32 |
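To make the table actionable, here is a minimal sketch that turns a precision tag into a weights-only RAM estimate using the formula from the RAM section; the helper name is hypothetical, and the BPW values are transcribed from the table above:

```python
# BPW values transcribed from the precision table above.
BPW = {
    "gguf-q2_K": 2, "gguf-q3_K_L": 3, "gguf-q3_K_M": 3, "gguf-q3_K_S": 3,
    "gguf-q4_0": 4, "gguf-q4_1": 4, "gguf-q4_K_M": 4, "gguf-q4_K_S": 4,
    "gguf-q5_0": 5, "gguf-q5_1": 5, "gguf-q5_K_M": 5, "gguf-q5_K_S": 5,
    "gguf-q6_K": 6, "gguf-q8_0": 8,
    "onnx-int4": 4, "onnx-int8": 8,
    "onnx-bf16": 16, "onnx-fp16": 16, "onnx-fp32": 32,
}

def approx_ram_gb(num_parameters: int, precision_tag: str) -> float:
    """Weights-only estimate: parameters * BPW / 8 bits per byte, in GB."""
    return num_parameters * BPW[precision_tag] / 8 / 1e9

# Example: a 7B-parameter model at gguf-q4_K_M needs roughly 3.5 GB for weights alone.
print(f"{approx_ram_gb(7_000_000_000, 'gguf-q4_K_M'):.1f} GB")
```

Actual memory use at runtime will be somewhat higher than this estimate, since it covers the weights only.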