GGUF
Inference with GGUF models
Text-generation model
Command Usage
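The sketch below shows what an invocation might look like. The `nexa run` entry point and the `llama2` model identifier are assumptions for illustration; the flags are the ones documented under Options.

```
# Hedged sketch: `nexa run` and the model name `llama2` are assumptions;
# the sampling flags (-t, -m, -k, -p) are documented under Options below.
nexa run llama2 -t 0.7 -m 256 -k 40 -p 0.9
```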
Options
-t, --temperature
: Temperature for sampling
-m, --max_new_tokens
: Maximum number of new tokens to generate
-k, --top_k
: Top-k sampling parameter
-p, --top_p
: Top-p sampling parameter
-sw, --stop_words
: List of stop words for early stopping
-pf, --profiling
: Enable profiling logs for the inference process
-st, --streamlit
: Run the inference in a Streamlit UI
-lp, --local_path
: Indicate that the model path is a local path; must be used with -mt (see the example after this list)
-mt, --model_type
: Indicate the type of model being run; must be used with -lp or -hf; choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
-hf, --huggingface
: Load the model from the Hugging Face Hub, using the Hugging Face repo_id as the model path; must be used with -mt
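As referenced in the -lp entry above, here is a sketch of the path-related flag pairings. The command name, the local path, and the repo_id shown are illustrative assumptions; the flag coupling rules come from the Options list.

```
# -lp marks the model path as local and must be paired with -mt.
nexa run /path/to/model.gguf -lp -mt NLP

# -hf loads from the Hugging Face Hub, using a repo_id as the model path,
# and must also be paired with -mt.
nexa run TheBloke/Llama-2-7B-GGUF -hf -mt NLP
```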
Streamlit Interface
Image-generation model
Command Usage
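A sketch of a text-to-image invocation follows. The `nexa gen-image` entry point and the `sd1-4` model identifier are assumptions; the flags are documented under Options.

```
# Hedged sketch: `nexa gen-image` and `sd1-4` are assumptions; the flags
# for steps, size, guidance, seed, and output path are documented below.
nexa gen-image sd1-4 -ns 25 -H 512 -W 512 -g 7.5 -s 42 -o output.png
```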
Options
-i2i, --img2img
: Whether to run image-to-image generation
-ns, --num_inference_steps
: Number of inference steps
-np, --num_images_per_prompt
: Number of images to generate per prompt
-H, --height
: Height of the output image
-W, --width
: Width of the output image
-g, --guidance_scale
: Guidance scale for diffusion
-o, --output
: Output path for the generated image
-s, --random_seed
: Random seed for image generation
--lora_dir
: Path to a directory containing LoRA files (see the sketch after this list)
--wtype
: Weight type (f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0)
--control_net_path
: Path to the ControlNet model
--control_image_path
: Path to the conditioning image for ControlNet
--control_strength
: Strength with which to apply ControlNet
-st, --streamlit
: Run the inference in a Streamlit UI
-lp, --local_path
: Indicate that the model path is a local path; must be used with -mt
-mt, --model_type
: Indicate the type of model being run; must be used with -lp or -hf; choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
-hf, --huggingface
: Load the model from the Hugging Face Hub; must be used with -mt
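As referenced in the --lora_dir entry above, here is a sketch combining the LoRA and ControlNet flags. The command name, model identifier, and every file path shown are illustrative assumptions.

```
# Hedged sketch: all paths below are placeholders, not shipped files.
nexa gen-image sd1-4 \
  --lora_dir ./loras \
  --wtype q4_0 \
  --control_net_path ./controlnet.gguf \
  --control_image_path ./pose.png \
  --control_strength 0.8
```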
Streamlit Interface
Vision-language model
Command Usage
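A sketch of a vision-language invocation follows. The `nexa vlm` entry point and the `nanollava` model identifier are assumptions; the flags are documented under Options.

```
# Hedged sketch: `nexa vlm` and `nanollava` are assumptions; the sampling
# flags are documented under Options below.
nexa vlm nanollava -t 0.7 -m 512
```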
Options
-t, --temperature
: Temperature for sampling
-m, --max_new_tokens
: Maximum number of new tokens to generate
-k, --top_k
: Top-k sampling parameter
-p, --top_p
: Top-p sampling parameter
-sw, --stop_words
: List of stop words for early stopping
-pf, --profiling
: Enable profiling logs for the inference process
-st, --streamlit
: Run the inference in a Streamlit UI
-lp, --local_path
: Indicate that the model path is a local path; must be used with -mt
-mt, --model_type
: Indicate the type of model being run; must be used with -lp or -hf; choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
-hf, --huggingface
: Load the model from the Hugging Face Hub; must be used with -mt
Streamlit Interface
Automatic Speech Recognition model
Command Usage
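A sketch of a transcription invocation follows. The `nexa asr` entry point and the `faster-whisper-tiny` model identifier are assumptions; the flags are documented under Options.

```
# Hedged sketch: `nexa asr` and `faster-whisper-tiny` are assumptions;
# the language, task, beam, compute-type, and output flags are documented below.
nexa asr faster-whisper-tiny -l en --task transcribe -b 5 -c float16 -o ./transcripts
```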
Options
-o, --output_dir
: Output directory for transcriptions
-b, --beam_size
: Beam size to use for transcription
-l, --language
: The language spoken in the audio; should be a language code such as 'en' or 'fr'
--task
: Task to execute (transcribe or translate)
-c, --compute_type
: Type to use for computation (e.g., default, float16, int8, int8_float16)
-st, --streamlit
: Run the inference in a Streamlit UI
-lp, --local_path
: Indicate that the model path is a local path; must be used with -mt
-mt, --model_type
: Indicate the type of model being run; must be used with -lp or -hf; choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
-hf, --huggingface
: Load the model from the Hugging Face Hub; must be used with -mt
Streamlit Interface