My setup for running Kubernetes on the NVIDIA DGX Spark with the GPU Operator alongside Ollama, OpenWebUI, and Jupyter Notebooks.
I recently got an NVIDIA DGX Spark—an AI supercomputer on the Grace Blackwell GB10 platform. As a Field Software Engineer at Canonical, I deploy infrastructure like OpenStack, Kubernetes, and Ceph at scale. Instead of running just NVIDIA’s AI stack, I wanted to set it up as a true infrastructure node.
This post covers how I set up MicroK8s alongside Docker, got the NVIDIA GPU Operator running, and configured GPU time-slicing to run multiple workloads concurrently on a single GB10 GPU. Along the way, I’ll explain why I chose MicroK8s specifically — and why the newer Canonical Kubernetes k8s snap didn’t work out of the box on this hardware.
sudo snap install microk8s --classic --channel=1.32/stable
sudo microk8s status --wait-ready

Add your user to the microk8s group:
sudo usermod -aG microk8s $USER
newgrp microk8s
microk8s kubectl get nodes

Enable the core addons you’ll need:
microk8s enable dns
microk8s enable hostpath-storage
microk8s enable helm3

Alias kubectl and helm so you’re not typing microk8s constantly:
echo "alias kubectl='microk8s kubectl'" >> ~/.bashrc
echo "alias helm='microk8s helm3'" >> ~/.bashrc
source ~/.bashrc

Note: Images pulled with **docker pull** are not visible to Kubernetes pods, and vice versa, since they use different container runtimes. To use a local Docker image in a pod, import it with:
docker save myimage:latest | microk8s ctr images import -

Usually, you reference images by their registry path, so both runtimes can pull as needed. This is rarely a concern unless you build images locally.
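As an example, a minimal build-and-import round trip might look like this (the image name myapp:dev is a placeholder, assuming a Dockerfile in the current directory):

```shell
# Build the image against Docker's local daemon
docker build -t myapp:dev .

# Export it and import it into MicroK8s's containerd image store
docker save myapp:dev | microk8s ctr images import -

# Confirm containerd now sees the image
microk8s ctr images ls | grep myapp
```

Once imported, a pod can reference myapp:dev directly, provided its imagePullPolicy doesn’t force a registry pull.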
The GPU Operator automates everything needed to expose NVIDIA GPUs to Kubernetes workloads: driver management, the container toolkit, device plugin, DCGM exporter for metrics, and MIG configuration. On the DGX Spark, the NVIDIA driver is already installed and managed by DGX OS, so we skip the operator’s driver component and let it handle everything else.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update nvidia
helm repo list | grep nvidia

The key flag here is --set driver.enabled=false — we’re telling the operator not to manage the driver, since DGX OS handles it itself. We do want the toolkit (which configures containerd to use the NVIDIA runtime) and all other components.
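Since the operator will be trusting the DGX OS driver, it’s worth a quick sanity check that the driver is loaded and healthy before installing:

```shell
# The driver ships with DGX OS; confirm it loads and sees the GB10
nvidia-smi

# Confirm the kernel modules are present
lsmod | grep nvidia
```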
Because MicroK8s runs in a snap confinement, its containerd socket and config files live under /var/snap/microk8s/ rather than at the system defaults. The GPU Operator needs to know exactly where to find them, which is why we pass three environment variables explicitly:
helm install gpu-operator -n gpu-operator --create-namespace \
nvidia/gpu-operator \
--version=v25.10.1 \
--set driver.enabled=false \
--set toolkit.enabled=true \
--set toolkit.env[0].name=CONTAINERD_CONFIG \
--set toolkit.env[0].value=/var/snap/microk8s/current/args/containerd-template.toml \
--set toolkit.env[1].name=CONTAINERD_SOCKET \
--set toolkit.env[1].value=/var/snap/microk8s/common/run/containerd.sock \
--set toolkit.env[2].name=RUNTIME_CONFIG_SOURCE \
--set-string toolkit.env[2].value=file=/var/snap/microk8s/current/args/containerd.toml

Getting these paths wrong is the most common failure point when deploying the GPU Operator on any snap-based Kubernetes. The operator pods will crash-loop with an unhelpful socket error if they can’t reach containerd. If you’re adapting this for Canonical Kubernetes (k8s snap), the equivalent paths are under /ck8s/k8s-containerd/, but again, on the DGX Spark, I’d steer you toward MicroK8s for the iptables reason above.
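Before running a test workload, confirm the operator stack came up cleanly. Every pod in the gpu-operator namespace should reach Running or Completed, and the validator pod (labelled app=nvidia-operator-validator in recent operator releases) reports whether the whole chain works:

```shell
# All operator components should be Running or Completed
kubectl get pods -n gpu-operator

# The validator exercises the toolkit, runtime, and device plugin;
# its logs end with "all validations are successful" on a healthy node
kubectl logs -n gpu-operator -l app=nvidia-operator-validator
```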
Create a test pod that runs a CUDA vector addition:
# cuda-test.yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  runtimeClassName: nvidia
  containers:
  - name: cuda-vectoradd
    image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
    resources:
      limits:
        nvidia.com/gpu: 1

kubectl apply -f cuda-test.yaml
kubectl logs cuda-vectoradd

A successful output looks like:
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

# Clean up
kubectl delete -f cuda-test.yaml

By default, Kubernetes treats a GPU as a single exclusive resource: only one pod can hold nvidia.com/gpu: 1 at a time. On the DGX Spark’s GB10, with 128 GB of unified memory, this is needlessly restrictive; multiple workloads, such as Jupyter, inference, and training, can often coexist.
GPU time-slicing lets Kubernetes treat a single GPU as multiple slots. The NVIDIA device plugin multiplexes access among pods. Note: No memory isolation exists, so VRAM is shared. Exceeding it results in an OOM error—keep this in mind when designing workloads.
cat <<'EOF' | kubectl apply -n gpu-operator -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config
data:
  any: |-
    version: v1
    flags:
      migStrategy: none
    sharing:
      timeSlicing:
        resources:
        - name: nvidia.com/gpu
          replicas: 4
EOF

replicas: 4 means Kubernetes will see 4 GPU slots. Common values are 2–8, depending on your workload mix.
kubectl patch clusterpolicy/cluster-policy -n gpu-operator --type merge -p \
'{"spec":{"devicePlugin":{"config":{"name":"time-slicing-config","default":"any"}}}}'

kubectl delete pod -n gpu-operator -l app=nvidia-device-plugin-daemonset

Wait 30–60 seconds, then:
kubectl describe node | grep nvidia.com/gpu

Expected output:
nvidia.com/gpu.replicas=4
nvidia.com/gpu.sharing-strategy=time-slicing
nvidia.com/gpu.product=NVIDIA-GB10-SHARED
nvidia.com/gpu: 4
nvidia.com/gpu: 4

The SHARED suffix and the sharing-strategy=time-slicing label confirm time-slicing is active. Each pod still requests nvidia.com/gpu: 1; the device plugin handles the multiplexing.
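To see the slicing in action, launch more than one GPU pod at once, something that would leave the second pod Pending without time-slicing. A quick sketch, reusing the vectoradd sample (pod names are arbitrary):

```shell
# Two pods, each requesting one (time-sliced) GPU slot
for i in 1 2; do
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: vectoradd-$i
spec:
  restartPolicy: OnFailure
  runtimeClassName: nvidia
  containers:
  - name: vectoradd
    image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
done

# Both should schedule immediately rather than one sitting in Pending
kubectl get pods
```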
# Change the replica count
kubectl edit configmap time-slicing-config -n gpu-operator
kubectl delete pod -n gpu-operator -l app=nvidia-device-plugin-daemonset
# Revert to exclusive access
kubectl patch clusterpolicy/cluster-policy -n gpu-operator --type merge -p \
'{"spec":{"devicePlugin":{"config":{"name":"","default":""}}}}'
kubectl delete configmap time-slicing-config -n gpu-operator
kubectl delete pod -n gpu-operator -l app=nvidia-device-plugin-daemonset
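After reverting, the node should advertise a single exclusive GPU again:

```shell
# The SHARED suffix disappears and the count drops back to 1
kubectl describe node | grep nvidia.com/gpu
```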