βš™οΈ Prerequisites

  • If you haven’t already, install the nexa-SDK by following the installation guide.
  • MLX models only work on MacOS. Verify you have at least 16GB RAM.
  • Below are the MLX-compatible model types you can experiment with right away.

LLM - Language Models

πŸ“ Language models in MLX format. Try out this quick example: Try it out:
./nexa infer nexaml/Qwen3-0.6B-bf16-MLX
⌨️ Once the model loads, type or paste multi-line text directly into the CLI to chat with the model.

LMM - Multimodal Models

πŸ–ΌοΈ Language models that also accept vision and/or audio inputs. LMM in MLX formats. Try out this quick example:
./nexa infer nexaml/gemma-3-4b-it-8bit-MLX
⌨️ Drag photos or audio clips directly into the CLI β€” you can even drop multiple images at once!

Supported Model List

We curate a list of top, high-quality models in MLX format.
Many MLX models in the Hugging Face mlx-community organization have quality issues and may not run locally. For best results, we recommend using models from our collection.

πŸ™‹ Request New Models

Want a specific model? Submit an issue on the nexa-sdk GitHub repository or ask in our Discord/Slack community!