ONNX Runtime

The best 47 ONNX Runtime AI tools - Free & Paid

Explore 47 AI tools for ONNX Runtime

foundrylocal.ai

Foundry Local runs AI models on-device using ONNX Runtime (CPU/GPU/NPU) to keep data local, offering an OpenAI-compatible API, Python/JS/C#/Rust SDKs, a model hub, and CLI tools for edge and enterprise deployments.

LLM

Free

gpt-oss playground

gpt-oss playground provides open-weight demos of gpt-oss-120b and gpt-oss-20b for infrastructure testing, distributed and on-device inference, benchmarking, API integration, and reproducible research, with adjustable reasoning levels and visible reasoning for diagnostics. Demo-only; validate outputs.

AI Agents

Freemium

Onyx.app

Onyx.app is a conversational AI platform that combines chat-based search, configurable agents, and action orchestration for teams. It integrates with enterprise systems to automate tasks and manage knowledge, with deployment options for regulated industries.

AI Assistant

Free trial - $20/mo

OfoxAI

OfoxAI is a centralized AI gateway that streamlines access and management of AI models and inference endpoints. It enables multi-model orchestration, intelligent request routing, and built-in API management with security, observability, and MLOps integration for scalable, reliable deployments.

Developer tools

Freemium

ComfyOnline

ComfyOnline lets users run ComfyUI workflows online, automatically installing dependencies and models. It auto‑generates APIs for image, video, audio, and text generation, supports advanced services, LLMs, custom nodes, and scales with traffic.

Developer tools

Subscription - $70/mo

Openrouter.ai

OpenRouter gives one API key to access 300+ models from 60+ providers, SDK‑compatible, with visual routing, automated fall‑back, edge hosting, data‑policy controls, and agentic tools for building efficient autonomous workflows.

Developer tools

Freemium

Orq.ai

Orq.ai is a generative AI collaboration platform for building, evaluating, and deploying LLM applications. It provides an agent runtime for multi-agent workflows, secure model gateway, RAG-enabled knowledge base, monitoring, evaluation tools, APIs, and governance controls.

LLM

- $35/mo

Openmed

Openmed is an on-device clinical AI platform for PHI/PII detection, de-identification, and healthcare NER, offering 1,000+ model variants, multilingual support, curated biomedical datasets, configurable privacy controls, and air-gapped macOS/iOS/server runtimes.

Security and Privacy

GPUX.AI

GPUX is a serverless inference platform that delivers 1‑second cold starts and GPU‑accelerated execution for models like Stable Diffusion XL, ESRGAN, and Whisper. It supports P2P and read‑write volume access for rapid, scalable deployment on NVIDIA RTX 4090 GPUs.

Development

Freemium

onit.com

Onit Unity is an enterprise legal operations platform combining matter and spend management, contract lifecycle management, AI-assisted contract and invoice review, and workflow automation to centralize intake, approvals, reporting and real-time spend and matter visibility.

Legal

Free

RunningHub

RunningHub is a cloud IDE for ComfyUI workflows, enabling in‑browser design, editing, and GPU‑accelerated execution. It offers pre‑installed nodes, access to major diffusion and video models, training tools, API integration, and real‑time collaboration.

Image editing

Free

Can I run AI

canirun.ai is a searchable database mapping AI models to compatible hardware, listing CPUs/GPUs (including Apple M-series and NVIDIA cards), model requirements, VRAM/memory needs, filters and comparisons to plan local inference, fine-tuning, or deployment.

LLM

Free

Atomic Chat

Atomic Chat is a fully offline, on-device AI chat app for macOS, Windows, Linux, iOS, and Android that runs 1,000+ LLMs locally with built-in agent support, persistent memory, and privacy-first design. It features TurboQuant optimizations for up to 8x faster attention and lower memory use, with one-

Chat

Free

Operator browser base

Open Operator is a user-friendly AI tool that allows users to view, run, and browse AI models directly in their web browser. Powered by Stagehand and BrowserBase, it offers a seamless experience for exploring AI predictions effortlessly.

Developer tools

Openfang.sh

OpenFang.sh is an open-source agent operating system that orchestrates autonomous AI agents and capability packages across macOS, Linux, and Windows. It provides a secure, sandboxed runtime with built-in tools for tasks like research, monitoring, and automation, all managed through a native desktop

AI Agents

Freemium

Opper.ai

Opper is a unified AI gateway and agent control plane that routes requests across 200+ models and modalities, offering centralized model routing, automated fallbacks, budget caps, LLM observability, a multi-provider testing playground, OpenAI-compatible SDK, and enterprise privacy/compliance control

LLM

Usage Based

OnDemand

OnDemand AI Agents is a decentralized OS that lets users build, deploy, and scale AI agents without a dev team. It offers a no‑code workflow builder, an agent marketplace, secure model integration, an AI playground for testing, and enterprise‑grade security.

Automation

Freemium

Oxlo AI

Oxlo.ai provides a privacy-first inference API for running frontier and open-source models, supporting streaming, function calling, embeddings, speech and vision tasks, long-context and agentic workflows, and OpenAI-compatible SDKs for scalable chatbots, RAG, document Q&A, and batch processing.

API

Free trial - $80/mo

Onnix

Onnix AI creates slide decks from bank templates, learns from past decks for rapid iteration, and performs Excel‑style data analysis via prompt commands, integrating with FactSet, CapIQ, or other feeds, delivering traceable, shareable insights for senior and junior teams.

Finance

Freemium

ONVY

ONVY consolidates biometric, lab, wearable, and environmental data via a single API, delivering AI‑driven coaching, personalized nudges, and adaptive nutrition or training plans. It offers GDPR/HIPAA‑compliant security, enterprise dashboards, modular integration, and continuous learning for preventi

Health

Subscription

ZETIC.MLange

ZETIC deploys TorchScript, TensorFlow, and ONNX models to mobile and embedded devices, quantizing for CPU, GPU, or NPU to reach up to a 60× speedup and a 50% size reduction. It supplies benchmarks and a 3‑line offline code snippet for privacy‑preserving AI.

Model generation

Free

OCMaker AI

OC Maker AI generates anime-style characters from text or image inputs, offering inpainting/outpainting, pose control, animation and video-to-video conversion, 3D/Live2D export, and fine-grained style editing for illustration, animation, and game asset workflows.

AI Characters

Free - $8

OpenHuman

OpenHuman is an open-source personal AI framework for private, on‑premises deployments and local model execution, providing an agent framework, prompt management, local speech (Whisper/Piper), integrations, Docker/one‑click deployment, and developer tooling.

Personal assistant

Free

local.ai

local.ai runs language models locally without GPUs. Its Rust backend keeps the binary under 10 MB and performs CPU inference with GGML quantization. A single‑click interface streams responses to a UI, while a model manager tracks, verifies, and resumes downloads.

Developer tools

Freemium

OmniRoute

OmniRoute is an open-source AI gateway that routes requests to 236 LLM providers via a single /v1 endpoint, offering multi-provider routing with auto-fallback, token compression, persistent memory, resilience controls, MCP/A2A support, and self-hosted analytics.

Infrastructure tools

Freemium

LLMWare.ai

LLMWare AI installs a lightweight client on PCs, providing instant access to 100+ AI models optimized for Intel and Qualcomm hardware. It supports RAG, auto‑tunes weights, runs locally without Wi‑Fi, and offers an admin console for monitoring, scaling, and audit logs.

LLM

Freemium

OpenCraft AI

OpenCraft AI is a secure, multi‑model copilot that unifies GPT‑4, Claude, and Gemini. It preserves context across model switches, keeps uploaded files accessible, auto‑formats chats into reports or decks, and generates images with consistent voice tone for streamlined workflows.

Code assistant

Paid

Openfabric AI

Openfabric is a decentralized layer‑one blockchain that lets AI developers, data providers, and infrastructure partners build, train, and deploy algorithms on a permissionless network. Its marketplace offers ready‑made tools, and token holders can stake for governance and liquidity.

AI Assistant

Freemium

Msgmate

Open‑Chat is a self‑hostable, decentralized chat platform that supports both proprietary and open‑source AI models. It provides a full chat API, runs LLMs on local GPUs, lowers latency, enhances privacy, and deploys easily on personal or cloud servers.

Chat

Freemium

OminiGate.ai

OminiGate.ai is a unified AI API gateway that provides an OpenAI-compatible endpoint for text, image, and video models, enabling seamless switching between providers like OpenAI and Anthropic with minimal code changes. It features intelligent routing, automatic failover, cost optimization, and enter

API

Subscription

Unsloth Studio

Unsloth Studio is a no-code web UI enabling local training, running, and exporting of open AI models like Qwen3.5 and NVIDIA Nemotron 3, simplifying experimentation for users without extensive technical expertise.

Infrastructure tools

Free

InfinityFlow

Infinity is an AI‑native database offering hybrid search across dense/sparse embeddings, tensors, and full‑text with optional RRF, weighted‑sum, or ColBERT reranking. It delivers 0.1 ms latency, 15 k qps, supports strings, numerics, and vectors for LLM developers, data scientists, and AI engineers.

LLM

Freemium

Nexa.ai

Nexa AI offers an on‑device platform that lets developers deploy vision, audio, and text models to NPUs, GPUs, and CPUs with one line of code. The SDK supports day‑zero deployment, multimodal inference, and optimizations for mobile, automotive, and IoT devices.

AI Assistant

Free

Onyxium AI

Onyxium consolidates image, language, and speech AI models for developers, designers, and teams. Customizable parameters and usage logs support tailored output, workflow tracking, and seamless embedding into applications, boosting efficiency throughout the development cycle.

Development

Freemium

Osmantic ODS

ODS is an on-premises AI server for PC, Mac, and Linux offering local LLM inference, model management, agents, chat/voice UI, RAG with local document search, image pipelines, integrations (Ollama, ComfyUI, n8n), and operational tooling for private deployments.

LLM

Free

OpenCode.ai

OpenCode.ai is an open-source AI coding agent that runs directly in your terminal, IDE, or desktop. It connects to 75+ LLM providers, supports offline use, and enables multi-session collaboration for code review and debugging.

Code assistant

Free

Xturing

xTuring is an open‑source framework that lets developers and researchers build, fine‑tune, and deploy LLMs efficiently. It supports LoRA adapters, INT8 quantization, custom datasets, offers CLI and notebooks, and provides a unified API for multiple backends.

Development

Freemium

Supertonic

Supertonic is a lightning-fast, on-device text-to-speech (TTS) system built for local inference using ONNX Runtime, supporting 31 languages and offering a compact, open-weight model for edge deployment.

Text-to-speech

Free

Open Notebook

Open Notebook is a self-hosted, open-source notebook for private LLM workflows, supporting over 16 AI providers. It enables multi-modal content management, vector search, and contextual chat with full data sovereignty for research and development teams.

LLM

Freemium

OpenComputer

OpenComputer is a scalable, on-demand compute platform for LLM agents and AI workloads, combining VM-level isolation with sandboxed execution. It supports type-1 ephemeral sandboxes for fast cold starts (~100 ms) and type-2 persistent sandboxes for long-running agent sessions with state preservation

Infrastructure tools

Freemium

Llama.cpp

Llama.cpp is an open-source tool for efficient inference of large language models. It runs open-source LLMs locally on a wide range of hardware.

Infrastructure tools

Free

OpenInterpreter

OpenInterpreter is an open-source AI coding agent that runs in the terminal or as a desktop app, enabling code editing, command execution, and test automation within a sandboxed OS environment. It supports multiple model providers (OpenAI, Anthropic, local models) and features configurable permissio

Code assistant

Freemium

Exllama

ExLlama is a memory-efficient rewrite of the Hugging Face Transformers implementation of LLaMA for use with quantized weights, enabling high-performance NLP tasks on modern GPUs while minimizing memory usage and supporting various hardware configurations.

LLM

Free

Onvo AI

Onvo AI generates tailored charts and dashboards from AI prompts, so users do not need to write intricate queries. It offers secure sharing, integrates with multiple data sources, and provides SDKs for embedding visualizations in products.

Data analysis

Free trial

Nebius AI Studio

Nebius AI Studio offers efficient model deployment with hosted open-source models, ultra-low latency, and scalable processing options. It simplifies AI model exploration through an intuitive interface while ensuring verified quality and performance for diverse applications.

Model generation

Free trial

KoboldCPP

KoboldCpp is a versatile AI text-generation tool that supports various GGML and GGUF models with an intuitive UI, native image generation, and enhanced performance via CUDA and CLBlast acceleration.

Infrastructure tools

Free

Odysseus

Odysseus is a privacy-first, self-hosted AI workspace for running and serving local LLMs, autonomous agents, and multi-turn chat, offering model management, hardware-aware serving, built-in tools, persistent memory, research workflows, and integrations.

Infrastructure tools

Free
