= gpubox Setup Guide =
== Bare Metal Configuration ==
=== Debian 13 Installation ===
* Use the provided ISO (`debian-13.3.0-amd64-netinst.iso`)
* Ensure correct network interface configuration (check `ip a` after installation)
=== IPMI Setup ===
* Access the BMC over the network with `ipmitool` (e.g., `ipmitool -I lanplus -H 10.0.0.234 -U ADMIN -P ADMIN user list` to list accounts; the `user set password <user_id> <new_password>` subcommand changes a password)
=== Proxmox VE Installation ===
* Install Proxmox VE on bare metal using the official installer
* Configure storage (e.g., local LVM for VMs)
== GPU Passthrough Configuration ==
=== BIOS/UEFI ===
* Enable the IOMMU in firmware: VT-d (Virtualization Technology for Directed I/O) on Intel boards, AMD-Vi/IOMMU on AMD boards
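
Once the firmware option is on and the host has rebooted, the kernel exposes IOMMU groups under `/sys/kernel/iommu_groups`. A minimal sketch for checking this, assuming a Linux host; the helper name `iommu_group_count` is hypothetical, and a count of 0 suggests the IOMMU is still disabled:

```shell
# Count IOMMU groups. The optional directory argument exists only so the
# helper can be pointed at a test directory; it defaults to the sysfs path.
iommu_group_count() {
  igc_dir="${1:-/sys/kernel/iommu_groups}"
  if [ -d "$igc_dir" ]; then
    ls -A "$igc_dir" | wc -l | tr -d ' '
  else
    echo 0
  fi
}

echo "IOMMU groups: $(iommu_group_count)"
```
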
=== Identify GPUs ===
```
lspci -nn | grep -i nvidia  # List all NVIDIA GPUs with their [vendor:device] IDs
```
Example output:
```
08:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev ff)
```
The bracketed pair `[10de:1b06]` holds the vendor ID (`10de`) and device ID (`1b06`); use it together with the PCI address (`08:00.0`) to identify each GPU.
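
The `vendor:device` pair can be pulled out of `lspci -nn` output mechanically. A small sketch, assuming a `grep` that supports `-o` (GNU and BusyBox both do); the helper name `pci_ids` is hypothetical:

```shell
# Read lspci -nn lines on stdin and print each [vendor:device] pair
# as a bare vendor:device string, one per line.
pci_ids() {
  grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | sed 's/^\[//; s/\]$//'
}

# Usage on the host: lspci -nn | grep -i nvidia | pci_ids
# Here the example line from this guide is used as input.
line='08:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev ff)'
printf '%s\n' "$line" | pci_ids   # prints 10de:1b06
```

The class code `[0300]` is skipped because it has no colon. The same `vendor:device` string is the form the `vfio-pci` module's `ids=` option takes, if you choose to bind the card that way.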
=== VFIO Modules ===
Create `/etc/modules-load.d/vfio.conf` with:
```
vfio
vfio_iommu_type1
vfio_pci
# vfio_virqfd was merged into vfio in kernel 6.2; keep the next line only on older kernels
vfio_virqfd
```
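
After a reboot it is worth confirming that the modules actually loaded. A minimal sketch, assuming the modules are built as loadable modules (built-in ones do not appear in `lsmod`); the helper name `missing_vfio_modules` is hypothetical, and `vfio_virqfd` is left out on purpose because newer kernels no longer ship it separately:

```shell
# Print every expected VFIO module that is absent from an
# lsmod-style listing supplied on stdin (module name in column 1).
missing_vfio_modules() {
  mvm_loaded="$(awk '{print $1}')"
  for mvm_mod in vfio vfio_iommu_type1 vfio_pci; do
    printf '%s\n' "$mvm_loaded" | grep -qx "$mvm_mod" || echo "$mvm_mod"
  done
}

# Usage on the host (no output means all three are loaded):
#   lsmod | missing_vfio_modules
```
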
== NVIDIA Drivers on Host ==
=== Edit /etc/apt/sources.list ===
```
# Enable contrib and non-free; lines that already list contrib are skipped, so re-running is safe
sed -i '/contrib/!s/ main/ main contrib non-free/' /etc/apt/sources.list
apt update
apt install -y nvidia-driver nvidia-kernel-dkms
```
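
Because a blanket substitution on `main` would append the components again each time it runs, it can help to preview an idempotent form on sample input before editing the file in place. A sketch, assuming GNU `sed`; the helper name `enable_nonfree` and the sample line are illustrative:

```shell
# Idempotent form of the substitution: lines that already list
# contrib are left untouched, so a second pass changes nothing.
enable_nonfree() {
  sed '/contrib/!s/ main/ main contrib non-free/'
}

# Preview on a sample line instead of editing sources.list directly.
printf 'deb http://deb.debian.org/debian trixie main\n' | enable_nonfree
# prints: deb http://deb.debian.org/debian trixie main contrib non-free
```

Running `enable_nonfree < /etc/apt/sources.list` shows the result for the real file without modifying it.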
== VM Templates & Cloning ==
=== Template VM ===
* Use `debian-13.3.0-amd64-netinst.iso` to create a minimal Debian 13 template
* Convert to template (Proxmox: VM > Convert to Template)
=== Clone VMs ===
* Clone the template for `ollama-2080` and `dockerhost`
* **Pass GPU**: In the VM's "Hardware" tab, choose "Add" > "PCI Device" > "Raw Device" and select the GPU (match the PCI address and IDs shown by `lspci`)
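
The same steps can be driven from the Proxmox CLI with `qm clone` and `qm set`. The sketch below only prints the commands (a dry run) so they can be reviewed first. The template ID `9000`, the new VM IDs `201` and `202`, the helper name `plan_vm`, and the choice to attach the GPU at `08:00.0` to `ollama-2080` are all assumptions for illustration:

```shell
# Dry run: print the qm commands instead of executing them.
# $1 = template ID, $2 = new VM ID, $3 = VM name, $4 = optional PCI address
plan_vm() {
  printf 'qm clone %s %s --name %s\n' "$1" "$2" "$3"
  if [ -n "${4:-}" ]; then
    printf 'qm set %s -hostpci0 %s\n' "$2" "$4"
  fi
}

plan_vm 9000 201 ollama-2080 08:00.0
plan_vm 9000 202 dockerhost
```

Piping the output to `sh` on the Proxmox host would execute the plan once the IDs have been checked.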
== Specific VM Configurations ==
=== ollama-2080 ===
* Install `ollama` (e.g., `curl -fsSL https://ollama.com/install.sh | sh`)
* Verify GPU acceleration: `ollama --version` confirms the install, and `nvidia-smi` confirms the NVIDIA driver is loaded inside the VM
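
A quick check inside the VM can confirm that both tools are present before debugging further. A minimal sketch; the helper name `have_tool` is hypothetical, and `nvidia-smi -L` is used only to list the GPUs the driver can see:

```shell
# Report whether a command is available on PATH.
have_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

have_tool ollama
have_tool nvidia-smi

# List the GPUs visible to the driver, if the driver tools are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi -L || echo "nvidia-smi could not reach a GPU"
fi
```
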
=== dockerhost ===
* Install Docker:
```
apt install -y docker.io
systemctl enable --now docker
```
* Add user to `docker` group (`usermod -aG docker $USER`), then log out and back in so the new membership takes effect
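
Group membership can be verified from the shell. A small sketch; the helper name `in_group` is hypothetical:

```shell
# Succeed when the given user belongs to the given group.
in_group() {
  id -nG "$1" | tr ' ' '\n' | grep -qx "$2"
}

if in_group "$(id -un)" docker; then
  echo "docker group: ok"
else
  echo "docker group: missing (run usermod, then log in again)"
fi
```

Because `id -nG <user>` reads the group database, it reflects `usermod` immediately, while an already running shell keeps its old groups until the next login.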
=== ai-conductor ===
* Install required tools (e.g., `kubectl` for Kubernetes orchestration)
== Key Commands ==
```
# Check GPU visibility in host
lspci -k | grep -A 2 "VGA"

# Clone a template in Proxmox
qm clone <source_VM_ID> <new_VM_ID> --name "ollama-2080"
```
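
The `lspci -k` output also shows which kernel driver has claimed each device, which is often the first thing to check when passthrough misbehaves. A sketch that extracts just that field for VGA devices; the helper name `vga_drivers` and the sample input are illustrative:

```shell
# Read lspci -k output on stdin and print the kernel driver bound
# to each VGA device, one per line.
vga_drivers() {
  awk '
    /^[0-9a-f]/ { in_vga = ($0 ~ /VGA/) }   # a device header line starts a new record
    in_vga && /Kernel driver in use:/ { print $NF }
  '
}

# Usage on the host: lspci -k | vga_drivers
printf '08:00.0 VGA compatible controller: NVIDIA Corporation GP102\n\tKernel driver in use: vfio-pci\n\tKernel modules: nouveau\n' | vga_drivers
# prints: vfio-pci
```

A result of `vfio-pci` indicates the card is reserved for passthrough, while `nvidia` or `nouveau` means a host driver has claimed it.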
== Troubleshooting ==
* **GPU Not Visible**: Ensure VT-d is enabled in BIOS and the GPU is listed in `lspci`
* **Driver Issues**: Reinstall `nvidia-driver` and reboot
* **Permission Errors**: Add user to `docker` and `kvm` groups