Local LLM Tools
Explore the highest-rated open-source tools for running Large Language Models locally. Discover inference engines, GUI clients, and API wrappers for GGUF and safetensors models. Sorted by GitHub authority and active contributions.
Kubernetes operator for local LLM inference with llama.cpp, vLLM, TGI, and mlx-server — multi-GPU
This repository provides a ready-to-use Google Colab notebook that turns Colab into a temporary
Run local LLMs from Hugging Face in React Native or Expo using onnxruntime.
Run open-source and open-weight LLMs locally with OpenAI-compatible APIs
Chat, RAG search, multi-step Plans workflows, MCP tools, and Agents integration. Supports OpenAI,
Project Jarvis is a versatile AI assistant that integrates various functionalities.
SmarterRouter: An intelligent LLM gateway and VRAM-aware router for Ollama, llama.cpp, and OpenAI.
M-Courtyard: Local AI Model Fine-tuning Assistant for Apple Silicon. Zero-code, zero-cloud,
Native LLM inference server for Apple Silicon. OpenAI + Anthropic API compatible. No Python.
Vibecode Editor is a fullstack, web-based IDE built with Next.js and Monaco Editor. It features
OpenAI-style, fast and lightweight local language model inference with documents
A Python package for developing AI applications with local LLMs.
Give a query, get a dataroom. Pi + self-hosted Qwen3.6 research harness on a single L4.
🤖 Visual AI agent workflow automation platform with local LLM integration - build intelligent
Private on-device AI chat for Android — runs any GGUF model locally via llama.cpp with
Orchestrate a swarm of Claude Code agents with a local brain that learns from you.
Local LLM Testing & Benchmarking for Apple Silicon
macOS menu bar app that exposes Apple's on-device Foundation Models as an OpenAI-compatible local
Custom TTS component for Home Assistant. Utilizes the OpenAI speech engine or any compatible
Open Source Local Data Analysis Assistant.
One-click Qwen3.6-27B inference on Windows. 158 tok/s on RTX 5090, 72 tok/s on RTX 3090. Native, no
Find the best models and how to run them locally.
Your fully proficient, AI-powered, local chatbot assistant 🤖
LLM story writer with a focus on high-quality long output based on a user-provided prompt.