Local Courseware Deployment

This project builds a student-friendly local lab environment for the courseware with a small control surface:

  • ./deploy-courseware.sh installs and configures the environment, then starts every managed service.
  • ./destroy-courseware.sh stops the managed services, uninstalls courseware-managed Ollama, and removes the project-owned lab state.
  • ./labctl provides day-two controls such as assets lab2, start, stop, status, urls, logs, and open kiln.
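
Taken together, a first session might look like this (the subcommands are the ones listed above; exact argument forms may differ on your build):

    ./deploy-courseware.sh    # install, configure, and start every managed service
    ./labctl status           # confirm what is running
    ./labctl urls             # print the lab endpoints
    ./labctl stop             # stop the managed services without uninstalling
    ./destroy-courseware.sh   # full teardown when the course is over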

What It Installs

  • Ollama
  • llama.cpp
  • TransformerLab, pinned to the classic single-user v0.28.2 release
  • Open WebUI
  • ChunkViz
  • Embedding Atlas
  • Promptfoo
  • Unsloth Studio
  • Kiln Desktop
  • Course-specific support assets for lab 2 and lab 4

Supported Host Profiles

This build intentionally avoids the reference VM's hardware workarounds.

  • macOS: Apple Silicon only, with at least 16 GB unified memory.
  • Native Debian/Ubuntu: Debian-family Linux with an NVIDIA GPU visible to nvidia-smi and at least 8 GB VRAM.
  • WSL: Debian/Ubuntu-family Linux running under WSL, with the NVIDIA GPU exposed into the distro.

The launcher and Ansible preflight now classify the host dynamically and apply different setup behavior for:

  • macos
  • native-debian-ubuntu
  • wsl
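
The classification itself lives in the launcher and preflight, but a minimal sketch of how such a check can work (the function name and exact probes here are illustrative, not the project's actual code) is:

    # Hypothetical host-profile classifier; not the launcher's real implementation.
    classify_host() {
      if [ "$(uname -s)" = "Darwin" ]; then
        echo "macos"
      elif grep -qi microsoft /proc/version 2>/dev/null; then
        echo "wsl"                        # WSL kernels report "microsoft"
      elif [ -f /etc/debian_version ]; then
        echo "native-debian-ubuntu"
      else
        echo "unsupported"
      fi
    }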

WSL Check

If you run this inside WSL, the launcher checks GPU readiness before Ansible starts.

If that check fails, fix WSL first:

  • Install or update the NVIDIA Windows driver with WSL/CUDA support
  • Run wsl --update in Windows PowerShell
  • Run wsl --shutdown
  • Reopen WSL and confirm nvidia-smi works
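
Collected as a single sequence (the first two commands run in Windows PowerShell, the last inside the reopened distro):

    # Windows PowerShell
    wsl --update
    wsl --shutdown

    # back inside the WSL distro
    nvidia-smi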

Important: nvidia-smi only verifies the driver. Building CUDA-enabled llama.cpp also requires the Linux-side CUDA toolkit inside the distro.

On Linux and WSL, the first ./labctl up or ./labctl preflight run may prompt once for your sudo password so Ansible can install system packages.

On Ubuntu WSL x86_64, preflight now installs the Linux-side CUDA toolkit automatically if it is missing.

It first tries the distro package:

  • sudo apt install -y nvidia-cuda-toolkit

If that package is unavailable or still does not expose nvcc, the installer falls back to NVIDIA's WSL-Ubuntu repository bootstrap, which installs only the toolkit, not a Linux GPU driver.
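
In outline, the fallback behaves like this sketch (illustrative, not the installer's literal code):

    # Illustrative outline of the WSL CUDA-toolkit fallback.
    if ! command -v nvcc >/dev/null 2>&1; then
      sudo apt install -y nvidia-cuda-toolkit || true
    fi
    if ! command -v nvcc >/dev/null 2>&1; then
      # Still no nvcc: bootstrap NVIDIA's WSL-Ubuntu package repository and
      # install the toolkit packages from there. No Linux GPU driver is
      # installed inside the distro; the Windows driver remains the only driver.
      echo "distro toolkit unavailable; using NVIDIA's WSL-Ubuntu repo" >&2
    fi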

If the automatic bootstrap still fails, verify:

  • nvcc --version
  • ls /usr/local/cuda/include/cuda_runtime.h

For non-Ubuntu WSL distros, install the CUDA toolkit manually before running the deploy script.

Native Debian/Ubuntu CUDA Behavior

On native Debian/Ubuntu hosts, the installer now handles three CUDA-toolkit cases:

  • If the toolkit is already usable, it reuses the existing install instead of forcing a reinstall.
  • If the distro exposes nvidia-cuda-toolkit, it installs that package.
  • If the distro package is unavailable, it bootstraps NVIDIA's official CUDA network repository for supported native Debian/Ubuntu releases and installs the toolkit from there.

If apt starts in a broken dependency state, the installer now attempts dpkg --configure -a and apt-get --fix-broken install before retrying package installation.
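
Run by hand, those two repair steps are:

    sudo dpkg --configure -a            # finish interrupted package configuration
    sudo apt-get --fix-broken install   # resolve broken dependencies, then retry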

If CUDA is already mounted or preinstalled outside PATH, the installer now detects standard locations such as /usr/local/cuda/bin/nvcc and /usr/local/cuda-*/bin/nvcc.
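
A minimal detection sketch along those lines (illustrative; the real installer may probe more locations):

    # Prefer nvcc on PATH, then fall back to standard CUDA install prefixes.
    find_nvcc() {
      command -v nvcc && return 0
      for candidate in /usr/local/cuda/bin/nvcc /usr/local/cuda-*/bin/nvcc; do
        [ -x "$candidate" ] && { echo "$candidate"; return 0; }
      done
      return 1
    }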

Standard Assumptions

  • The host-side install path assumes modern local tooling, but TransformerLab itself is provisioned from a pinned classic single-user layout.
  • TransformerLab is intentionally pinned to the older single-user v0.28.2 release because newer upstream releases changed the project structure and behavior in ways that break this courseware.
  • This project does not rely on TransformerLab's upstream install.sh; the Ansible role provisions the pinned release directly so web assets, env layout, and runtime behavior stay reproducible.
  • The courseware repairs the installed TransformerLab Fastchat plugin manifests so that Fastchat-gated features such as Model Architecture and Visualize Logprobs stay available on pinned installs.
  • No Ollama models are pulled during ./labctl up; students pull models manually as part of the courseware.
  • WhiteRabbitNeo assets are handled separately from ./labctl up and ./labctl preflight.
  • Run ./labctl assets lab2 when you want to populate repo-local lab 2 assets in assets/lab2/ from Hugging Face.
  • After base setup, run state/lab2/download_whiterabbitneo-gguf.sh to fetch only the Q4_K_M, Q8_0, and IQ2_M files from bartowski/WhiteRabbitNeo_WhiteRabbitNeo-V3-7B-GGUF and register local Ollama models WhiteRabbitNeo, WhiteRabbitNeo-Q4, WhiteRabbitNeo-Q8, and WhiteRabbitNeo-IQ2.
  • TransformerLab and Unsloth homes are redirected into this project's state/ tree via symlinks.
  • Managed web services bind so they are reachable from both Linux and the Windows side of WSL, while labctl urls still reports localhost-friendly URLs.
  • The local Ansible bootstrap in .venv-ansible/ is machine-specific and will be recreated automatically if the folder is copied between hosts.
  • llama.cpp now uses a conservative, memory-aware build parallelism setting instead of an unbounded -j build, which avoids OOM failures on smaller Linux and WSL hosts.
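
For that last point, the idea is to derive the -j value from available memory rather than core count alone. A hedged sketch (the 2 GiB-per-job ratio is illustrative, not the project's exact tuning, and /proc/meminfo makes this Linux/WSL-specific):

    # Cap build parallelism by available memory, assuming ~2 GiB per compile job.
    mem_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
    mem_jobs=$(( mem_kb / (2 * 1024 * 1024) ))
    cpu_jobs=$(nproc)
    jobs=$(( mem_jobs < cpu_jobs ? mem_jobs : cpu_jobs ))
    [ "$jobs" -lt 1 ] && jobs=1
    cmake --build build -j "$jobs"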

Lab URLs

After ./deploy-courseware.sh finishes, run ./labctl urls.

Default endpoints:

  • Ollama API: http://127.0.0.1:11434
  • Open WebUI: http://127.0.0.1:8080
  • TransformerLab: http://127.0.0.1:8338
  • ChunkViz: http://127.0.0.1:3001
  • Embedding Atlas: http://127.0.0.1:5055
  • Unsloth Studio: http://127.0.0.1:8888
  • Promptfoo UI: http://127.0.0.1:15500
  • Wiki: http://127.0.0.1:80
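
A quick way to confirm the core endpoints respond once services are up (Ollama's /api/tags route is part of its public API; the others are plain HTTP reachability checks):

    curl -s  http://127.0.0.1:11434/api/tags   # Ollama: lists locally pulled models
    curl -sI http://127.0.0.1:8080 | head -1   # Open WebUI: expect an HTTP status line
    curl -sI http://127.0.0.1:8338 | head -1   # TransformerLab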

Notes

  • ./labctl up installs the environment and then starts every managed service.
  • ./labctl versions shows the pinned TransformerLab and Ansible runtime versions used by this workspace.
  • ./labctl assets lab2 is a separate manual step that clones the base WhiteRabbitNeo repo into assets/lab2/WhiteRabbitNeo-V3-7B and downloads the supported Q4_K_M, Q8_0, and IQ2_M GGUFs into assets/lab2/WhiteRabbitNeo_WhiteRabbitNeo-V3-7B-GGUF.
  • TransformerLab is installed as a pinned single-user app, and no default courseware-managed TransformerLab user is created automatically.
  • ./labctl start core starts only ollama and open-webui.
  • ./labctl start all starts every managed web service.
  • ./labctl open kiln launches the Kiln desktop app installed into the project state.
  • The scripted Promptfoo install drops a starter config at state/lab6/promptfoo.yaml.
  • ./labctl start all now also launches Promptfoo (via promptfoo view) and the cloned wiki app.
  • Lab 2 includes state/lab2/download_whiterabbitneo-gguf.sh, which uses git + git lfs to pull only the supported WhiteRabbitNeo quants. Add --download-only if you want the files without Ollama registration.
  • The wiki is cloned from https://git.zuccaro.me/bzuccaro/LLM-Labs.git into state/repos/LLM-Labs and started with npm.
  • ./labctl down now uninstalls Ollama entirely when this project installed it, instead of only stopping the service.
  • Unsloth Studio currently supports chat and data workflows on macOS; Linux/WSL remains the standard path for NVIDIA-backed training.
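
For the lab 2 quants specifically, the two supported invocations described above are:

    # Fetch the supported GGUFs and register the local Ollama models:
    state/lab2/download_whiterabbitneo-gguf.sh

    # Fetch the files only, skipping Ollama registration:
    state/lab2/download_whiterabbitneo-gguf.sh --download-only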