Local Courseware Deployment
This project builds a student-friendly local lab environment for the courseware with a small control surface:
- `./deploy-courseware.sh` installs and configures the environment, then starts every managed service.
- `./destroy-courseware.sh` stops the managed services, uninstalls courseware-managed Ollama, and removes the project-owned lab state.
- `./labctl` provides day-two controls such as `assets lab2`, `ollama_models`, `update_wiki`, `start`, `stop`, `status`, `urls`, `logs`, and `open kiln`.
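A typical first session, using only the entry points listed above, might look like this:

```bash
# Install, configure, and start all managed services
./deploy-courseware.sh

# Day-two checks and controls
./labctl status
./labctl urls
./labctl logs

# Tear everything down when the lab machine is no longer needed
./destroy-courseware.sh
```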
What It Installs
- Ollama
- llama.cpp
- Netron, served locally on port `8338`
- Open WebUI
- ChunkViz
- Embedding Atlas
- Promptfoo
- Unsloth Studio
- Kiln Desktop
- Course-specific support assets for lab 1, lab 2, and lab 4
Lab 1 Defaults
Lab 1 is now provisioned directly by the installer:
- The `Llama-3.2-1B.Q4_K_M.gguf` file is mirrored into `state/models/lab1/`.
- The Lab 1 confidence widget uses the pre-pulled Gemma 4 E2B Q4 Ollama model, `batiai/gemma4-e2b:q4`.
- The wiki serves a same-host download link for the Llama GGUF through `/api/lab1/models/...`.
- Lab 1 confidence visualization requires Ollama `0.12.11` or newer because it depends on logprobs.
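To confirm the installed Ollama meets that minimum, either the CLI or Ollama's standard version endpoint works:

```bash
# CLI check
ollama --version

# API check against the local server (should report 0.12.11 or newer)
curl -s http://127.0.0.1:11434/api/version
```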
Lab 2 Defaults
`./labctl up` now pre-pulls the Gemma 4 E2B Ollama variants used by the wiki widgets:
- `cajina/gemma4_e2b-q2_k_xl:v01`
- `batiai/gemma4-e2b:q4`
- `batiai/gemma4-e2b:q6`
If you want to re-pull just those managed Ollama models later, run `./labctl ollama_models`.
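To verify the pre-pulled tags are present, or to pull one manually, standard Ollama commands are enough:

```bash
# List locally available models and filter for the Lab 2 tags
ollama list | grep -i gemma4

# Re-pull a single tag manually if needed
ollama pull batiai/gemma4-e2b:q4

# Or re-pull the whole managed set through labctl
./labctl ollama_models
```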
Supported Host Profiles
This build intentionally avoids the reference VM's hardware workarounds.
- macOS: Apple Silicon only, with at least 16 GB unified memory.
- Native Debian/Ubuntu: Debian-family Linux with an NVIDIA GPU visible to `nvidia-smi` and at least 8 GB VRAM.
- WSL: Debian/Ubuntu-family Linux running under WSL, with the NVIDIA GPU exposed into the distro.
The launcher and Ansible preflight classify the host dynamically and apply different setup behavior for:
- `macos`
- `native-debian-ubuntu`
- `wsl`
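The actual detection lives in the launcher and the Ansible preflight. As a rough, hypothetical illustration of how such a classification can be made (the `classify_host` helper name is invented for this sketch), it typically keys off `uname`, `/proc/version`, and `nvidia-smi`:

```bash
#!/usr/bin/env bash
# Hypothetical sketch: classify the host into macos / native-debian-ubuntu / wsl.
classify_host() {
  if [ "$(uname -s)" = "Darwin" ]; then
    echo "macos"
  elif grep -qi microsoft /proc/version 2>/dev/null; then
    echo "wsl"                      # WSL kernels advertise "microsoft"
  elif command -v nvidia-smi >/dev/null 2>&1; then
    echo "native-debian-ubuntu"     # Debian-family host with a visible NVIDIA GPU
  else
    echo "unsupported"
  fi
}

classify_host
```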
WSL Check
If you run this inside WSL, the launcher checks GPU readiness before Ansible starts.
If that check fails, fix WSL first:
- Install or update the NVIDIA Windows driver with WSL/CUDA support
- Run `wsl --update` in Windows PowerShell
- Run `wsl --shutdown`
- Reopen WSL and confirm `nvidia-smi` works
Important: `nvidia-smi` only verifies the driver. Building CUDA-enabled llama.cpp also requires the Linux-side CUDA toolkit inside the distro.
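In other words, two separate checks have to pass inside the distro: one for the driver passthrough and one for the toolkit. For example:

```bash
# Driver passthrough check (Windows NVIDIA driver + WSL integration)
nvidia-smi

# Linux-side CUDA toolkit check (required to build CUDA-enabled llama.cpp)
nvcc --version
ls /usr/local/cuda/include/cuda_runtime.h
```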
On Linux and WSL, the first `./labctl up` or `./labctl preflight` run may prompt once for your sudo password so Ansible can install system packages.
On Ubuntu WSL x86_64, preflight now installs the Linux-side CUDA toolkit automatically if it is missing.
It first tries the distro package:
`sudo apt install -y nvidia-cuda-toolkit`
If that package is unavailable or still does not expose `nvcc`, the installer falls back to NVIDIA's WSL-Ubuntu repository bootstrap for the toolkit only, not a Linux GPU driver.
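For reference, that fallback is roughly equivalent to NVIDIA's documented WSL-Ubuntu keyring setup. The exact keyring filename and toolkit package name can change between CUDA releases, so treat this as an illustration rather than the installer's literal commands:

```bash
# Approximate manual equivalent of the WSL-Ubuntu toolkit bootstrap (no Linux GPU driver)
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install -y cuda-toolkit
```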
If the automatic bootstrap still fails, verify:
- `nvcc --version`
- `ls /usr/local/cuda/include/cuda_runtime.h`
For non-Ubuntu WSL distros, install the CUDA toolkit manually before running the deploy script.
Native Debian/Ubuntu CUDA Behavior
On native Debian/Ubuntu hosts, the installer handles three CUDA-toolkit cases:
- If the toolkit is already usable, it reuses the existing install instead of forcing a reinstall.
- If the distro exposes
nvidia-cuda-toolkit, it installs that package. - If the distro package is unavailable, it bootstraps NVIDIA's official CUDA network repository for supported native Debian/Ubuntu releases and installs the toolkit from there.
If `apt` starts in a broken dependency state, the installer attempts `dpkg --configure -a` and `apt-get --fix-broken install` before retrying package installation.
If CUDA is already mounted or preinstalled outside `PATH`, the installer detects standard locations such as `/usr/local/cuda/bin/nvcc` and `/usr/local/cuda-*/bin/nvcc`.
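Put together, the native-host behavior amounts to a three-way decision that can be sketched like this (simplified; the real installer also handles the broken-apt repair described above):

```bash
#!/usr/bin/env bash
# Simplified sketch of the native Debian/Ubuntu CUDA-toolkit decision.
set -euo pipefail

# Case 1: an existing toolkit is usable, possibly outside PATH.
for nvcc in "$(command -v nvcc || true)" /usr/local/cuda/bin/nvcc /usr/local/cuda-*/bin/nvcc; do
  if [ -x "$nvcc" ]; then
    echo "Reusing existing CUDA toolkit: $nvcc"
    exit 0
  fi
done

# Case 2: the distro ships nvidia-cuda-toolkit.
if apt-cache show nvidia-cuda-toolkit >/dev/null 2>&1; then
  sudo apt-get install -y nvidia-cuda-toolkit
else
  # Case 3: fall back to NVIDIA's CUDA network repository for this release.
  echo "Distro package unavailable; bootstrap NVIDIA's CUDA network repo here."
fi
```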
Standard Assumptions
- The default deployment is centered on Ollama-backed local inference and browser-based tools such as Netron and the wiki.
- Netron is installed into a managed Python virtual environment and served locally instead of being provisioned as a desktop package.
- Lab 1's Llama GGUF download is mirrored locally during `./labctl up`, so students do not have to fetch it manually from the original source.
- WhiteRabbitNeo assets remain a separate Lab 2 flow and are still handled outside the default `./labctl up` run.
- Run `./labctl assets lab2` when you want to populate repo-local Lab 2 assets in `assets/lab2/` from Hugging Face.
- After base setup, run `state/lab2/download_whiterabbitneo-gguf.sh` to fetch only the `Q4_K_M`, `Q8_0`, and `IQ2_M` files from `bartowski/WhiteRabbitNeo_WhiteRabbitNeo-V3-7B-GGUF` and register the local Ollama models `WhiteRabbitNeo`, `WhiteRabbitNeo-Q4`, `WhiteRabbitNeo-Q8`, and `WhiteRabbitNeo-IQ2`.
- Unsloth homes are redirected into this project's `state/` tree via symlinks.
- Managed web services bind for access from both Linux and the Windows side of WSL, while `labctl urls` still reports localhost-friendly URLs.
- The local Ansible bootstrap in `.venv-ansible/` is machine-specific and will be recreated automatically if the folder is copied between hosts.
- llama.cpp uses a conservative, memory-aware build parallelism setting instead of an unbounded `-j` build, which avoids OOM failures on smaller Linux and WSL hosts.
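The last point is easiest to see with a small example: the build job count is derived from both CPU count and available memory rather than an unbounded `-j`. A simplified, hypothetical version of that heuristic (the ~2 GiB-per-job figure is an assumption for illustration, not the installer's exact value):

```bash
# Hypothetical sketch: cap build parallelism by available memory, not just CPU count.
cores=$(nproc)
avail_gb=$(free -g | awk '/^Mem:/ {print $7}')   # "available" column, in GiB
mem_jobs=$(( avail_gb / 2 ))                     # assume ~2 GiB per compile job
jobs=$(( mem_jobs < cores ? mem_jobs : cores ))
jobs=$(( jobs < 1 ? 1 : jobs ))

cmake --build build -j "${jobs}"
```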
Lab URLs
After `./deploy-courseware.sh`, run `./labctl urls`.
Default endpoints:
- Ollama API: `http://127.0.0.1:11434`
- Open WebUI: `http://127.0.0.1:8080`
- Netron: `http://127.0.0.1:8338`
- ChunkViz: `http://127.0.0.1:3001`
- Embedding Atlas: `http://127.0.0.1:5055`
- Unsloth Studio: `http://127.0.0.1:8888`
- Promptfoo UI: `http://127.0.0.1:15500`
- Wiki: `http://127.0.0.1:80`
- Lab 3 Terminal: `http://127.0.0.1:7681/wetty`
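A quick way to confirm the endpoints above are reachable after startup is a curl loop over the HTTP ports:

```bash
# Print an HTTP status (or 000 if unreachable) for each managed endpoint
for url in \
  http://127.0.0.1:11434 \
  http://127.0.0.1:8080 \
  http://127.0.0.1:8338 \
  http://127.0.0.1:3001 \
  http://127.0.0.1:5055 \
  http://127.0.0.1:8888 \
  http://127.0.0.1:15500 \
  http://127.0.0.1:80 \
  http://127.0.0.1:7681/wetty; do
  printf '%-40s %s\n' "$url" "$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 "$url")"
done
```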
Lab 3 Browser Terminal
The deployment will:
- bind `sshd` to `127.0.0.1:22` only
- install WeTTY and expose it at `http://127.0.0.1:7681/wetty`
- leave login identity management to the host, so any existing local account with password-based SSH access can sign in through the browser terminal
Notes
- `./labctl up` installs the environment and then starts every managed service.
- `./labctl versions` shows the pinned Netron version, minimum Ollama version, and Ansible runtime version used by this workspace.
- `./labctl assets lab2` is a separate manual step that clones the base WhiteRabbitNeo repo into `assets/lab2/WhiteRabbitNeo-V3-7B` and downloads the supported `Q4_K_M`, `Q8_0`, and `IQ2_M` GGUFs into `assets/lab2/WhiteRabbitNeo_WhiteRabbitNeo-V3-7B-GGUF`.
- `./labctl ollama_models` re-pulls the managed Lab 2 Gemma 4 E2B Ollama model set without rerunning the full installer.
- `./labctl update_wiki` hard-resets the managed wiki checkout to the remote latest, rebuilds it, and restarts only the managed wiki service on port `80`.
- `./labctl start core` starts only `ollama` and `open-webui`.
- `./labctl start all` starts every managed web service.
- `./labctl open kiln` launches the Kiln desktop app installed into the project state.
- The scripted Promptfoo install drops a starter config at `state/lab6/promptfoo.yaml`. `labctl start all` includes Promptfoo via `promptfoo view` and the cloned wiki app.
- Lab 2 includes `state/lab2/download_whiterabbitneo-gguf.sh`, which uses `git` + `git lfs` to pull only the supported WhiteRabbitNeo quants. Add `--download-only` if you want the files without Ollama registration.
- The wiki is cloned from `https://git.zuccaro.me/bzuccaro/LLM-Labs.git` into `state/repos/LLM-Labs` and started with `npm`.
- `./labctl down` uninstalls Ollama entirely when this project installed it, instead of only stopping the service.
- Unsloth Studio currently supports chat and data workflows on macOS; Linux/WSL remains the standard path for NVIDIA-backed training.