Nexa SDK
Run any AI model locally - text, speech, vision understanding and image generation
Nexa SDK Introduction
What is Nexa SDK?
Nexa SDK is an on-device inference framework that runs any model on any device, across any backend. It runs on CPUs, GPUs, and NPUs, with backend support for CUDA, Metal, Vulkan, and the Qualcomm NPU. It handles multiple input modalities, including text 📝, image 🖼️, and audio 🎧. The SDK includes an OpenAI-compatible API server with support for JSON schema-based function calling and streaming, and it supports model formats such as GGUF, MLX, and Nexa AI's own .nexa format, enabling efficient quantized inference across diverse platforms.
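Because the server speaks the OpenAI API, any standard OpenAI client can be pointed at it. Below is a minimal sketch of a streaming chat request using the openai Python package; the base URL, port, and model name are placeholders rather than values prescribed by Nexa SDK, so substitute whatever your local server actually exposes.

```python
from openai import OpenAI

# Placeholder endpoint and key; the real host/port depend on how you start
# the local server (see https://sdk.nexa.ai for the actual instructions).
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# "local-model" is a placeholder; use the identifier of the model your
# local server has loaded.
stream = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Summarize what an NPU is in one sentence."}],
    stream=True,
)

# Print tokens as they arrive from the local server.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```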
How to use Nexa SDK?
- Download the SDK and follow the instructions at https://sdk.nexa.ai.
- Run the commands in your terminal.
- GitHub repo: https://github.com/NexaAI/nexa-sdk
Why Choose Nexa SDK?
Choose Nexa SDK if you want to run AI models locally on your own device, whether they handle text, speech, or images, with broad hardware and model-format support. It is aimed at developers who need flexible, high-performance on-device inference.
Nexa SDK Features
- ✓ Run multimodal models (text, speech, vision understanding, and image generation) locally
- ✓ Integrate into on-device AI apps
- ✓ First NPU-aware multimodal inference stack
- ✓ Run models from Hugging Face
- ✓ Supports model formats such as GGUF, MLX, and Nexa AI's own .nexa format
- ✓ OpenAI-compatible API server with JSON schema-based function calling (see the sketch after this list)
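The JSON schema-based function calling mentioned above also goes through the OpenAI-compatible endpoint. The following is an illustrative sketch, again with a placeholder base URL and model name and a hypothetical get_weather tool; whether a given local model actually emits a tool call depends on the model itself.

```python
import json
from openai import OpenAI

# Placeholder endpoint and model name; adjust to your local server setup.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# A hypothetical tool described with a JSON schema, in the standard
# OpenAI function-calling format.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:
    # The model chose to call the tool; print the name and parsed arguments.
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    # The model may answer directly instead of calling the tool.
    print(message.content)
```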
Pricing
- Open Source
- Customized pricing for enterprise use