GGUF Interface
NexaTextInference
A class used for loading text models and running text generation.
Methods
run(): Run the text generation loop.
run_streamlit(): Run the Streamlit UI.
create_embedding(input): Embed a string.
create_chat_completion(messages): Generate completion for a chat conversation.
create_completion(prompt): Generate completion for a given prompt.
Arguments
model_path (str): Path or identifier for the model in Nexa Model Hub.
local_path (str): Local path of the model. Either model_path or local_path should be provided.
embedding (bool): Enable embedding generation.
stop_words (list): List of stop words for early stopping.
temperature (float): Temperature for sampling.
max_new_tokens (int): Maximum number of new tokens to generate.
top_k (int): Top-k sampling parameter.
top_p (float): Top-p sampling parameter.
profiling (bool): Enable timing measurements for the generation process.
streamlit (bool): Run the inference in Streamlit UI.
Example Code
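The following is a minimal sketch of how the arguments above might be collected and checked before constructing the class. The keyword names come from the Arguments list. The import path `nexa.gguf`, the identifier `"llama2"`, and the helpers `text_inference_kwargs` and `load_text_model` are assumptions for illustration, not part of the documented interface. The import is deferred so the validation step can run without the SDK installed.

```python
# Optional keyword names taken from the Arguments list above.
OPTIONAL_ARGS = {
    "embedding", "stop_words", "temperature", "max_new_tokens",
    "top_k", "top_p", "profiling", "streamlit",
}


def text_inference_kwargs(model_path=None, local_path=None, **options):
    """Validate and collect constructor arguments (hypothetical helper)."""
    if model_path is None and local_path is None:
        raise ValueError("Either model_path or local_path should be provided.")
    unknown = set(options) - OPTIONAL_ARGS
    if unknown:
        raise TypeError(f"Unknown arguments: {sorted(unknown)}")
    kwargs = {"model_path": model_path, "local_path": local_path, **options}
    # Drop unset values so the constructor's own defaults apply.
    return {name: value for name, value in kwargs.items() if value is not None}


def load_text_model(**kwargs):
    """Construct the model from validated arguments (hypothetical helper)."""
    # Assumption: the class is importable from `nexa.gguf`.
    from nexa.gguf import NexaTextInference
    return NexaTextInference(**text_inference_kwargs(**kwargs))


kwargs = text_inference_kwargs(
    model_path="llama2",  # hypothetical Model Hub identifier
    stop_words=[],
    temperature=0.7,
    max_new_tokens=512,
    top_k=50,
    top_p=0.9,
    profiling=True,
)
print(kwargs)
```

With the SDK installed, `inference = load_text_model(**kwargs)` followed by `inference.create_completion(prompt)` or `inference.run()` would exercise the methods listed above.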
NexaImageInference
A class used for loading image models and running image generation.
Methods
txt2img(prompt): Generate images from text.
img2img(image_path, prompt): Generate images from an input image and a text prompt.
run_txt2img(): Run the text-to-image generation loop.
run_img2img(): Run the image-to-image generation loop.
run_streamlit(): Run the Streamlit UI.
Arguments
model_path (str): Path or identifier for the model in Nexa Model Hub.
local_path (str): Local path of the model. Either model_path or local_path should be provided.
output_path (str): Output path for the generated image.
num_inference_steps (int): Number of inference steps.
width (int): Width of the output image.
height (int): Height of the output image.
guidance_scale (float): Guidance scale for diffusion.
random_seed (int): Random seed for image generation.
streamlit (bool): Run the inference in Streamlit UI.
Example Code
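As a rough sketch, the two generation methods can be wrapped in one helper that picks `txt2img` or `img2img` depending on whether a source image is supplied. The helper `generate_image` and the `StubImageInference` class are hypothetical; the stub only mirrors the two documented method signatures so the example runs without a model. The return type of the real methods is not described in this section, so the helper passes the result through unchanged.

```python
def generate_image(inference, prompt, image_path=None):
    """Call txt2img, or img2img when a source image is given."""
    if image_path is None:
        return inference.txt2img(prompt)
    # Argument order follows the documented signature: img2img(image_path, prompt).
    return inference.img2img(image_path, prompt)


class StubImageInference:
    """Stand-in exposing the two documented method signatures."""

    def txt2img(self, prompt):
        return f"txt2img:{prompt}"

    def img2img(self, image_path, prompt):
        return f"img2img:{image_path}:{prompt}"


stub = StubImageInference()
print(generate_image(stub, "a lovely cat"))
print(generate_image(stub, "a lovely cat", image_path="cat.png"))
```

In real use, a `NexaImageInference` instance would take the place of the stub.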
NexaVLMInference
A class used for loading vision-language models (VLMs) and running text generation.
Methods
run(): Run the text generation loop.
run_streamlit(): Run the Streamlit UI.
create_chat_completion(messages): Generate text completion for a given chat prompt.
_chat(user_input, image_path): Generate text about the given image.
Arguments
model_path (str): Path or identifier for the model in Nexa Model Hub.
local_path (str): Local path of the model. Either model_path or local_path should be provided.
stop_words (list): List of stop words for early stopping.
temperature (float): Temperature for sampling.
max_new_tokens (int): Maximum number of new tokens to generate.
top_k (int): Top-k sampling parameter.
top_p (float): Top-p sampling parameter.
profiling (bool): Enable timing measurements for the generation process.
streamlit (bool): Run the inference in Streamlit UI.
Example Code
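One possible pattern, sketched under assumptions, is asking the same question about several images through `_chat(user_input, image_path)`. The helper `ask_about_images` and the `StubVLMInference` class are hypothetical, and the stub stands in for a loaded model. The leading underscore suggests `_chat` is an internal method by Python convention, and its return type is not specified in this section, so the helper stores whatever comes back unchanged.

```python
def ask_about_images(inference, question, image_paths):
    """Ask one question about each image and collect the replies by path."""
    replies = {}
    for image_path in image_paths:
        # Argument order follows the documented signature: _chat(user_input, image_path).
        replies[image_path] = inference._chat(question, image_path)
    return replies


class StubVLMInference:
    """Stand-in exposing the documented _chat signature."""

    def _chat(self, user_input, image_path):
        return f"{user_input} -> {image_path}"


replies = ask_about_images(
    StubVLMInference(),
    "What is in this picture?",
    ["cat.png", "dog.png"],  # hypothetical image paths
)
print(replies)
```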
NexaVoiceInference
A class used for loading voice models and running voice transcription.
Methods
run(): Run the voice transcription loop.
run_streamlit(): Run the Streamlit UI.
transcribe(audio_path): Transcribe the audio file into text.
Arguments
model_path (str): Path or identifier for the model in Nexa Model Hub.
local_path (str): Local path of the model. Either model_path or local_path should be provided.
output_dir (str): Output directory for transcriptions.
compute_type (str): Type to use for computation (e.g., float16, int8, int8_float16).
beam_size (int): Beam size to use for transcription.
language (str): The language spoken in the audio.
task (str): Task to execute (transcribe or translate).
temperature (float): Temperature for sampling.
Example Code
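A small sketch of batch transcription built on `transcribe(audio_path)`. The helper `transcribe_directory`, the `*.wav` pattern, and the `StubVoiceInference` class are assumptions for illustration; the stub replaces a loaded model so the example runs on empty placeholder files. The real return type of `transcribe` is not described here, so results are stored as returned.

```python
import tempfile
from pathlib import Path


def transcribe_directory(inference, audio_dir, pattern="*.wav"):
    """Transcribe every matching file in a directory, keyed by file name."""
    results = {}
    for audio_path in sorted(Path(audio_dir).glob(pattern)):
        results[audio_path.name] = inference.transcribe(str(audio_path))
    return results


class StubVoiceInference:
    """Stand-in exposing the documented transcribe signature."""

    def transcribe(self, audio_path):
        return f"transcript of {Path(audio_path).name}"


# Demonstrate on a temporary directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.wav", "a.wav", "notes.txt"):
        (Path(tmp) / name).touch()
    results = transcribe_directory(StubVoiceInference(), tmp)

print(results)
```

Files that do not match the pattern, such as `notes.txt` here, are skipped.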
Generate Embeddings
Generate text embeddings that can be used in RAG.
Example Code
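To illustrate how embeddings might feed the retrieval step of RAG, here is a minimal sketch that ranks documents by cosine similarity to a query. The functions `cosine_similarity`, `retrieve`, and `toy_embed` are hypothetical. `toy_embed` is a letter-count stand-in so the example runs without a model; in practice `embed` would wrap `create_embedding(input)`. The structure of that method's return value is not described in this section, so extracting the vector from it is left to the caller.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, embed, top_n=1):
    """Return the top_n documents most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_n]]


def toy_embed(text):
    """Stand-in embedding: counts of each letter a-z."""
    text = text.lower()
    return [float(text.count(ch)) for ch in "abcdefghijklmnopqrstuvwxyz"]


docs = ["GGUF model files", "Streamlit user interface", "voice transcription"]
best = retrieve("transcribe voice audio", docs, toy_embed)
print(best)
```

The retrieved documents would then be placed in the prompt passed to a text generation method such as `create_completion(prompt)`.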