SoftPicks.net
autoscaling
Defilantech / LLMKube
Kubernetes operator for local LLM inference with llama.cpp, vLLM, TGI, and mlx-server — multi-GPU