Gpubox

Revision as of 20:54, 16 February 2026 by Judytuna (talk | contribs) (mediawiki formatting)

gpubox Setup Guide

Bare Metal Configuration

Debian 13 Installation

  • Use the provided ISO (`debian-13.3.0-amd64-netinst.iso`)
  • Ensure correct network interface configuration (check `ip a` after installation)

IPMI Setup

  • Access the BMC over the network with `ipmitool` (e.g., `ipmitool -I lanplus -H 10.0.0.234 -U ADMIN -P ADMIN user list` to list accounts, then `user set password <user id>` to change a password)

Proxmox VE Installation

  • Install Proxmox VE on bare metal using the official installer
  • Configure storage (e.g., local LVM for VMs)

GPU Passthrough Configuration

BIOS/UEFI

  • Enable VT-d (Intel Virtualization Technology for Directed I/O); on AMD boards the equivalent setting is usually labelled AMD-Vi or IOMMU
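After changing the firmware setting and rebooting, one way to confirm that the kernel actually activated the IOMMU is to count the entries under `/sys/kernel/iommu_groups`. The following is a minimal sketch, assuming a Linux host; the helper name `iommu_group_count` is illustrative and not part of any tool.

```shell
# Print the number of IOMMU groups the kernel has created.
# $1 is optional and exists only so the function can be pointed at another directory.
iommu_group_count() {
  dir=${1:-/sys/kernel/iommu_groups}
  if [ ! -d "$dir" ]; then
    echo 0
    return 0
  fi
  # Each group is a numbered subdirectory; count them.
  find "$dir" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}
```

Calling `iommu_group_count` with no arguments on the host prints the group count. A result of `0` usually means the IOMMU is still disabled in firmware or not enabled on the kernel command line (on Intel hosts, often via `intel_iommu=on`).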

Identify GPUs

```
lspci -nn | grep -i nvidia   # List all NVIDIA GPUs
```

Example output:

```
08:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev ff)
```

Use the vendor ID (`10de`) and device ID (`1b06`) from the `[10de:1b06]` pair to identify each GPU.
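When several GPUs are installed, pulling the address and ID pair out of each line by hand gets tedious. The sketch below assumes input in the `lspci -nn` format shown in the example output; the function name `gpu_ids` is made up for illustration.

```shell
# Read `lspci -nn`-style lines on stdin and print "<PCI address> <vendor:device>" per line.
# The class code in brackets (e.g. [0300]) has no colon, so only the ID pair matches.
gpu_ids() {
  sed -n 's/^\([0-9a-f:.]*\) .*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)].*$/\1 \2/p'
}
```

A typical call would be `lspci -nn | grep -i nvidia | gpu_ids`.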

VFIO Modules

Create `/etc/modules-load.d/vfio.conf` with one module per line:

```
vfio
vfio_iommu_type1
vfio_pci
# vfio_virqfd is only needed on kernels older than 6.2; newer kernels build it into vfio
```
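Loading the VFIO modules does not by itself claim a GPU. A common follow-up, assumed here rather than taken from this guide, is to give `vfio-pci` the vendor:device IDs so it binds the card at boot. The sketch below only generates the `options` line; the helper name `vfio_options_line` and the second ID in the usage note are illustrative.

```shell
# Print a modprobe options line that tells vfio-pci which device IDs to claim.
# Arguments are vendor:device pairs; they are joined with commas.
# The body runs in a subshell so the IFS change does not leak to the caller.
vfio_options_line() (
  IFS=,
  printf 'options vfio-pci ids=%s\n' "$*"
)
```

For example, `vfio_options_line 10de:1b06 10de:10ef` prints a line covering a GPU and its audio function (check the real IDs with `lspci -nn`). The output would typically go into a file under `/etc/modprobe.d/`, followed by `update-initramfs -u` and a reboot. A GPU bound this way is no longer available to the host's own NVIDIA driver.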

NVIDIA Drivers on Host

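Enable the `contrib` and `non-free` components in `/etc/apt/sources.list`, then install the driver:

```
# Run once: repeating this sed appends the components again
sed -i 's/main/main non-free contrib/g' /etc/apt/sources.list
apt update
apt install -y nvidia-driver nvidia-kernel-dkms
```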
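If the sources file may already list some of these components, a filter that appends only the missing ones is safer to re-run. This is a minimal sketch, assuming the classic one-line `deb ...` format; the function name `add_components` is illustrative.

```shell
# Read a one-line-style APT sources file on stdin and write it to stdout with
# "contrib" and "non-free" appended to every deb/deb-src line that lacks them.
# "non-free-firmware" does not count as "non-free" because the match requires
# a following space or end of line.
add_components() {
  awk '
    /^deb(-src)? / {
      if ($0 !~ / contrib( |$)/)  $0 = $0 " contrib"
      if ($0 !~ / non-free( |$)/) $0 = $0 " non-free"
    }
    { print }
  '
}
```

One cautious way to use it is `add_components < /etc/apt/sources.list > /tmp/sources.list.new`, then review the result before copying it into place.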

VM Templates & Cloning

Template VM

  • Use `debian-13.3.0-amd64-netinst.iso` to create a minimal Debian 13 template
  • Convert to template (Proxmox: VM > Convert to Template)

Clone VMs

  • Clone the template for `ollama-2080` and `dockerhost`
  • **Pass GPU**: In the VM's "Hardware" tab, choose "Add" > "PCI Device" > "Raw Device" and select the GPU (match it against the address and IDs from `lspci`)
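The same clone and passthrough steps can be done from the Proxmox shell with `qm clone` and `qm set`. The sketch below only prints the commands so they can be reviewed first; the helper name `plan_clone` and the VM IDs in the usage note are hypothetical.

```shell
# Print (do not run) the qm commands for cloning a template and attaching a GPU.
# $1 = template VM ID, $2 = new VM ID, $3 = VM name,
# $4 = optional PCI address from lspci (e.g. 08:00.0)
plan_clone() {
  printf 'qm clone %s %s --name %s\n' "$1" "$2" "$3"
  if [ -n "${4:-}" ]; then
    printf 'qm set %s --hostpci0 %s\n' "$2" "$4"
  fi
}
```

For instance, `plan_clone 9000 101 ollama-2080 08:00.0` prints a clone command followed by a `--hostpci0` assignment, while `plan_clone 9000 102 dockerhost` prints only the clone command.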

Specific VM Configurations

ollama-2080

  • Install `ollama` (e.g., `curl -fsSL https://ollama.com/install.sh | sh`)
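  • Confirm GPU acceleration: run `nvidia-smi` inside the VM to check that the NVIDIA driver is loaded, then use `ollama ps` while a model is loaded to see whether it is running on the GPU (`ollama --version` only reports the installed version)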
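A quick scripted check that the passed-through GPU is visible inside the VM is to count the devices reported by `nvidia-smi -L`. This is a sketch under the assumption that the NVIDIA driver is installed in the guest; `count_gpus` is an illustrative name.

```shell
# Count GPUs in `nvidia-smi -L` output read from stdin.
# Each GPU appears as a line of the form "GPU <index>: <name> (UUID: ...)".
count_gpus() {
  # grep -c exits non-zero when the count is 0; "|| true" keeps the function's status clean.
  grep -c '^GPU [0-9][0-9]*:' || true
}
```

Run it as `nvidia-smi -L | count_gpus`; a result of `0` suggests the passthrough or the guest driver needs another look.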

dockerhost

  • Install Docker:

```
apt install -y docker.io
systemctl enable --now docker
```

  • Add your user to the `docker` group (`usermod -aG docker $USER`), then log out and back in so the new membership takes effect
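To check whether the group change has taken effect in the current session, the group list from `id -nG` can be tested for an exact match. The helper below is a small sketch; `has_group` is an illustrative name.

```shell
# Succeed if group $1 appears in the space-separated group list $2.
# Padding both sides with spaces makes the match exact, so "docker"
# does not match "dockerx".
has_group() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A typical check is `has_group docker "$(id -nG)" && echo "docker group active"`.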

ai-conductor

  • Install required tools (e.g., `kubectl` for Kubernetes orchestration)

Key Commands

```
# Check GPU visibility in host
lspci -k | grep -A 2 "VGA"

# Verify VFIO modules loaded
lsmod | grep vfio

# Test NVIDIA driver
nvidia-smi  # Should show GPU details

# Clone a template in Proxmox
qm clone <source_VM_ID> <new_VM_ID> --name "ollama-2080"
```
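The `lspci -k` output also names the driver currently bound to each device, which shows whether a GPU is held by `vfio-pci` or by a host driver. The sketch below extracts that field; `driver_in_use` is an illustrative name, and the address in the usage note comes from the example output earlier in this guide.

```shell
# Print the kernel driver bound to a device, given `lspci -k` output on stdin.
# Prints nothing if no driver is bound.
driver_in_use() {
  sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}
```

For a single device, run `lspci -k -s 08:00.0 | driver_in_use`.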

Troubleshooting

  • **GPU Not Visible**: Ensure VT-d is enabled in BIOS and the GPU is listed in `lspci`
  • **Driver Issues**: Reinstall the driver (`apt install --reinstall nvidia-driver`) and reboot
  • **Permission Errors**: Add user to `docker` and `kvm` groups