Remove all downloaded models on your local computer.
nexa clean
Run a model
Run a model on your local computer. If the model file is not yet downloaded, it will be automatically fetched first. For more details, please refer to the Inference page.
By default, nexa run runs GGUF models. Use nexa onnx to run ONNX models.
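For example, with MODEL_PATH standing in for a model identifier from the Nexa Model Hub:

nexa run MODEL_PATH    # runs a GGUF model (the default)
nexa onnx MODEL_PATH   # runs an ONNX model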
Run text-generation models on your local computer.
nexa run MODEL_PATH

usage: nexa run [-h] [-t TEMPERATURE] [-m MAX_NEW_TOKENS] [-k TOP_K] [-p TOP_P] [-sw [STOP_WORDS ...]] [-pf] [-st] model_path

positional arguments:
  model_path            Path or identifier for the model in Nexa Model Hub

options:
  -h, --help            show this help message and exit
  -pf, --profiling      Enable profiling logs for the inference process
  -st, --streamlit      Run the inference in Streamlit UI
  -lp, --local_path     Indicate that the model path provided is the local path, must be used with -mt
  -mt, --model_type     Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
  -hf, --huggingface    Load model from Hugging Face Hub, must be used with -mt

Text generation options:
  -t, --temperature TEMPERATURE
                        Temperature for sampling
  -m, --max_new_tokens MAX_NEW_TOKENS
                        Maximum number of new tokens to generate
  -k, --top_k TOP_K     Top-k sampling parameter
  -p, --top_p TOP_P     Top-p sampling parameter
  -sw, --stop_words [STOP_WORDS ...]
                        List of stop words for early stopping
  --lora_path LORA_PATH
                        Path to a LoRA file to apply to the model
  --nctx NCTX           Maximum context length of the model you're using
Example Command:
nexa run llama2
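The sampling flags above can be combined in a single invocation; a sketch (the values are illustrative, not recommended defaults):

nexa run llama2 -t 0.7 -m 512 -k 40 -p 0.9 -st

This samples at temperature 0.7 with top-k 40 and top-p 0.9, generates at most 512 new tokens, and opens the Streamlit UI.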
Run image-generation model
Run image-generation models on your local computer.
nexa run MODEL_PATH

usage: nexa run [-h] [-i2i] [-ns NUM_INFERENCE_STEPS] [-np NUM_IMAGES_PER_PROMPT] [-H HEIGHT] [-W WIDTH] [-g GUIDANCE_SCALE] [-o OUTPUT] [-s RANDOM_SEED] [-st] model_path

positional arguments:
  model_path            Path or identifier for the model in Nexa Model Hub

options:
  -h, --help            show this help message and exit
  -st, --streamlit      Run the inference in Streamlit UI, can be used with -lp or -hf
  -lp, --local_path     Indicate that the model path provided is the local path, must be used with -mt
  -mt, --model_type     Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
  -hf, --huggingface    Load model from Hugging Face Hub, must be used with -mt

Image generation options:
  -i2i, --img2img       Whether to run image-to-image generation
  -ns, --num_inference_steps NUM_INFERENCE_STEPS
                        Number of inference steps
  -np, --num_images_per_prompt NUM_IMAGES_PER_PROMPT
                        Number of images to generate per prompt
  -H, --height HEIGHT   Height of the output image
  -W, --width WIDTH     Width of the output image
  -g, --guidance_scale GUIDANCE_SCALE
                        Guidance scale for diffusion
  -o, --output OUTPUT   Output path for the generated image
  -s, --random_seed RANDOM_SEED
                        Random seed for image generation
  --lora_dir LORA_DIR   Path to directory containing LoRA files
  --wtype WTYPE         Weight type (f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0)
  --control_net_path CONTROL_NET_PATH
                        Path to control net model
  --control_image_path CONTROL_IMAGE_PATH
                        Path to image condition for ControlNet
  --control_strength CONTROL_STRENGTH
                        Strength to apply ControlNet
Example Command:
nexa run sd1-4
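A sketch of combining the image options above (all values are illustrative; output.png is a hypothetical filename):

nexa run sd1-4 -ns 20 -H 512 -W 512 -g 7.5 -s 42 -o output.png

This generates one 512x512 image using 20 inference steps and guidance scale 7.5, with a fixed seed for reproducibility.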
Run vision-language model
Run vision-language models on your local computer.
nexa run MODEL_PATH

usage: nexa run [-h] [-t TEMPERATURE] [-m MAX_NEW_TOKENS] [-k TOP_K] [-p TOP_P] [-sw [STOP_WORDS ...]] [-pf] [-st] model_path

positional arguments:
  model_path            Path or identifier for the model in Nexa Model Hub

options:
  -h, --help            show this help message and exit
  -pf, --profiling      Enable profiling logs for the inference process
  -st, --streamlit      Run the inference in Streamlit UI, can be used with -lp or -hf
  -lp, --local_path     Indicate that the model path provided is the local path, must be used with -mt
  -mt, --model_type     Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
  -hf, --huggingface    Load model from Hugging Face Hub, must be used with -mt

VLM generation options:
  -t, --temperature TEMPERATURE
                        Temperature for sampling
  -m, --max_new_tokens MAX_NEW_TOKENS
                        Maximum number of new tokens to generate
  -k, --top_k TOP_K     Top-k sampling parameter
  -p, --top_p TOP_P     Top-p sampling parameter
  -sw, --stop_words [STOP_WORDS ...]
                        List of stop words for early stopping
Example Command:
nexa run nanollava
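As a sketch combining the generation flags (values illustrative):

nexa run nanollava -t 0.2 -m 256 -st

This runs nanollava with a low sampling temperature and at most 256 new tokens; the Streamlit UI (-st) can be a convenient way to pair an image with a text prompt.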
Run audio model
Run audio models on your local computer.
nexa run MODEL_PATH

usage: nexa run [-h] [-o OUTPUT_DIR] [-b BEAM_SIZE] [-l LANGUAGE] [--task TASK] [-t TEMPERATURE] [-c COMPUTE_TYPE] [-st] model_path

positional arguments:
  model_path            Path or identifier for the model in Nexa Model Hub

options:
  -h, --help            show this help message and exit
  -st, --streamlit      Run the inference in Streamlit UI, can be used with -lp or -hf
  -lp, --local_path     Indicate that the model path provided is the local path, must be used with -mt
  -mt, --model_type     Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
  -hf, --huggingface    Load model from Hugging Face Hub, must be used with -mt

Automatic Speech Recognition options:
  -b, --beam_size BEAM_SIZE
                        Beam size to use for transcription
  -l, --language LANGUAGE
                        The language spoken in the audio. It should be a language code such as 'en' or 'fr'.
  --task TASK           Task to execute (transcribe or translate)
  -c, --compute_type COMPUTE_TYPE
                        Type to use for computation (e.g., float16, int8, int8_float16)
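A sketch of a transcription run (whisper-tiny is assumed here as a model identifier; check the Nexa Model Hub for exact names):

nexa run whisper-tiny -b 5 -l en --task transcribe -c float16

This transcribes English audio with beam size 5, computing in float16.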
Run embedding model

Generate text embeddings using models on your local computer. Output can be redirected to a file.

Example Commands:

nexa embed mxbai "I love Nexa AI."
nexa embed nomic "I love Nexa AI." >> generated_embeddings.txt
nexa embed nomic-embed-text-v1.5:fp16 "I love Nexa AI."
nexa embed sentence-transformers/all-MiniLM-L6-v2:gguf-fp16 "I love Nexa AI." >> generated_embeddings.txt
Start local server
Start a local server using models on your local computer.
nexa server MODEL_PATH

usage: nexa server [-h] [--host HOST] [--port PORT] [--reload] [--nctx NCTX] model_path

positional arguments:
  model_path            Path or identifier for the model in Nexa Model Hub

options:
  -h, --help            show this help message and exit
  -lp, --local_path     Indicate that the model path provided is the local path, must be used with -mt
  -mt, --model_type     Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
  -hf, --huggingface    Load model from Hugging Face Hub, must be used with -mt
  --host HOST           Host to bind the server to
  --port PORT           Port to bind the server to
  --reload              Enable automatic reloading on code changes
  --nctx NCTX           Maximum context length of the model you're using
Example Command:
nexa server llama2
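To make the server reachable from other machines, bind it explicitly (host and port values here are illustrative):

nexa server llama2 --host 0.0.0.0 --port 8000

This binds the server to all network interfaces on port 8000.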
Run model evaluation
Run evaluation using models on your local computer.
usage: nexa eval model_path [-h] [--tasks TASKS] [--limit LIMIT]

positional arguments:
  model_path            Path or identifier for the model in Nexa Model Hub

options:
  -h, --help            show this help message and exit
  --tasks TASKS         Tasks to evaluate, comma-separated
  --limit LIMIT         Limit the number of examples per task. If <1, limit is a percentage of the total number of examples.
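Example Command (TASKS is a placeholder; substitute a comma-separated list of tasks supported by your installation):

nexa eval llama2 --tasks TASKS --limit 0.1

With --limit 0.1, each task is evaluated on 10% of its examples.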