Heterogeneous GPU Sharing on Kubernetes
Updated Jun 18, 2026 - Go
KVM Backend for VirtualBox
Here are my personal paper reading notes (covering machine learning systems, AI infrastructure, and other interesting topics).
Tensor Fusion is a state-of-the-art GPU virtualization and pooling solution designed to optimize GPU cluster utilization to its fullest potential.
KPilot: Unified control plane for multi-cluster Kubernetes management, GPU compute scheduling, and model serving.
A GPU virtualisation benchmarking tool for LLM-like workloads, covering overhead evaluation and resource isolation metrics.
TensorFusion landing page and product docs
Research: vGPU unlock on consumer NVIDIA RTX 5090 (Blackwell/GB202). 19 binary patches, full CPU-side pipeline working, GSP firmware blocked by fused-off VF PRIV registers.
Transparent library for CUDA/NVML that virtualizes GPU access, enforces memory limits, and enables container-friendly sharing with minimal performance overhead.
🎮 Enable GPU passthrough for KVM on all-AMD systems, simplifying setup for Arch Linux and Windows 11 virtual machines.
HAMI (Heterogeneous AI Computing Virtualization Middleware) is an open-source project that enables GPU virtualization and sharing for Kubernetes workloads, allowing multiple AI containers to share GPU resources efficiently.