
Defilantech / LLMKube

Description

Kubernetes operator for local LLM inference with llama.cpp, vLLM, TGI, and mlx-server. It supports multi-GPU NVIDIA and Apple Silicon Metal hardware, autoscaling, and air-gapped deployments, and is described as production-ready.

Technical Specifications

Core Language: Go
GitHub Stars: 127
Last Code Push: 2026-06-10
Open Issues / Bugs: 43
License Type: Open-Source (free to use)

Get Source Code

This project is open-source and hosted on GitHub, where you can explore the repository, read the deployment guides, or fork the code.
