Defilantech / LLMKube

Tags:

open source, ai, go, local llm, apple silicon, autoscaling

Description

Kubernetes operator for local LLM inference with llama.cpp, vLLM, TGI, and mlx-server. It supports multi-GPU NVIDIA and Apple Silicon (Metal) hardware, autoscaling, and air-gapped deployments, and is billed as production-ready.

Technical Specifications

Core Language	Go
GitHub Authority	⭐ 127 stars
Last Code Push	2026-06-10
Open Issues / Bugs	🛠️ 43 bugs listed
License Type	Open-Source (Free to use)

Get Source Code

This project is open-source and hosted on GitHub, where you can explore the repository, read the deployment guides, or fork the code.
