Fixed Lab 2

```bash
git clone https://huggingface.co/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B
```

**LLaMa.cpp** makes it easy for us to convert models downloaded in SafeTensors format to GGUF. We can convert the model with the following script from the official project:

```bash
convert_hf_to_gguf.py /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B/WhiteRabbitNeo-V3-7B --outfile /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B.gguf
```
### 4 Execute: Review Model Metadata
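
Before quantizing, it's worth confirming what the conversion produced. A minimal sketch, assuming the `gguf` Python package that ships with llama.cpp is available (it provides a `gguf-dump` CLI):

```bash
# Hypothetical check: dump the GGUF header and metadata (architecture,
# context length, tensor count) with the gguf package's bundled CLI.
pip install gguf
gguf-dump /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B.gguf | head -n 40
```
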
Quantization reduces memory footprints and speeds inference, but it typically raises perplexity, costing some output quality. As a rough rule of thumb for the weights alone, a 7B-parameter model needs about 14 GB at FP16, roughly 7 GB at 8-bit, and roughly 3.5 GB at 4-bit.

To generate 8-bit, 4-bit, and 1-bit quantizations, run the following commands:
<div class="lab-callout lab-callout--warning">
<strong>Warning:</strong> Although these quantization steps are provided for replication, pre-quantized support files are already available in <code>/home/student/lab2/WhiteRabbitNeo/</code> for faster lab progress. <br><br>You can skip these commands when participating in a live teaching session.
</div>
```bash
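# The exact lab commands are pre-staged under /home/student/lab2/WhiteRabbitNeo/;
# the lines below are a representative sketch using llama.cpp's llama-quantize
# tool (arguments: input GGUF, output GGUF, quantization type). Names illustrative.
llama-quantize WhiteRabbitNeo-V3-7B.gguf WhiteRabbitNeo-V3-7B-Q8_0.gguf Q8_0       # 8-bit
llama-quantize WhiteRabbitNeo-V3-7B.gguf WhiteRabbitNeo-V3-7B-Q4_K_M.gguf Q4_K_M   # 4-bit
# 1-bit variants (IQ1_S / IQ1_M) generally also require an importance matrix,
# supplied with --imatrix, to remain usable.
llama-quantize --imatrix imatrix.dat WhiteRabbitNeo-V3-7B.gguf WhiteRabbitNeo-V3-7B-IQ1_S.gguf IQ1_S
```
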
When finished, you will be presented with a prompt, similar to the `llama-cli` command.

We can do the same by pulling a model directly from **HuggingFace**. As long as the source file is a `.gguf` at any quantization level that fits within our system memory, Ollama can fetch it directly.

1. **Select the Quantized Model from Objective 1** – visit [CodeIsAbstract](https://huggingface.co/CodeIsAbstract/Llama-3.2-1B-Q8_0-GGUF) in your browser.
2. **Use this model** – Click **Use this model**, then choose the **Ollama** tab. The page displays a ready‑to‑run command:
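The command should resemble the sketch below, based on Ollama's `hf.co/{username}/{repository}` pull syntax; copy the exact command from the page:

```bash
# Illustrative: Ollama can pull a GGUF repo directly from Hugging Face.
ollama run hf.co/CodeIsAbstract/Llama-3.2-1B-Q8_0-GGUF
```
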
```bash
ollama run WhiteRabbitNeo
```

| `ollama list` | Shows all models currently registered with Ollama. |
| `ollama rm <tag>` | Deletes the specified model (freeing disk space). |
| `ollama show <tag>` | Prints model metadata (architecture, context length, quantization). |
| `ollama show <tag> --modelfile` | Prints an existing model's modelfile. Often useful for templating our own. |
| `ollama serve` | Starts the OpenAI-compatible API server (runs automatically when you first use `ollama run`). |
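
Once the server is up, any HTTP client can reach it. A quick smoke test against the local API, assuming Ollama's default port 11434 and the `WhiteRabbitNeo` tag from above:

```bash
# Send one non-streaming generation request to the local Ollama server.
curl http://localhost:11434/api/generate -d '{
  "model": "WhiteRabbitNeo",
  "prompt": "In one sentence, what is GGUF?",
  "stream": false
}'
```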
<br>
---
## Conclusion

---

Password: student

Once you've successfully connected to Open WebUI, follow the registration instructions. Feel free to register with any information, as the Kaggle instance will tear itself down after four hours (or sooner if it's stopped manually or left inactive). Once registered, move on to the next objective.

<figure style="text-align: center;">
<a href="https://i.imgur.com/btkT9IH.png" target="_blank">
<img
src="https://i.imgur.com/btkT9IH.png"
style="width: 50%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
</a>
</figure>

Locate, pull, and run **Gemma 3 4B‑IT‑QAT** (a quant‑aware‑trained model).

* Locate the search box at the top of the page.
<figure style="text-align:center;">
<a href="https://i.imgur.com/TuUbK7O.png" target="_blank">
<img src="https://i.imgur.com/TuUbK7O.png" width="600"
style="display:block; margin-left:auto; margin-right:auto; border:5px solid black;">
</a>
<figcaption>Ollama homepage – use the search bar to look for “Gemma 3”.</figcaption>
</figure>

* Click the **`Tags`** link beneath the model description.
<figure style="text-align:center;">
<a href="https://i.imgur.com/eaRaqnq.png" target="_blank">
<img src="https://i.imgur.com/eaRaqnq.png" width="600"
style="display:block; margin-left:auto; margin-right:auto; border:5px solid black;">
</a>
<figcaption>Tag view – each entry shows the model size and a short description.</figcaption>
</figure>

* The size column reads **`3.4 GB`**, indicating the approximate VRAM required for inference.
<figure style="text-align:center;">
<a href="https://i.imgur.com/Sf8sSs3.png" target="_blank">
<img src="https://i.imgur.com/Sf8sSs3.png" width="600"
style="display:block; margin-left:auto; margin-right:auto; border:5px solid black;">
</a>
<figcaption>Model size for `gemma3:4b-it-qat` (≈ 3.4 GB VRAM).</figcaption>
</figure>

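With the tag identified, pulling and running the model takes two commands; the tag name `gemma3:4b-it-qat` is assumed here from the Tags page shown above:

```bash
# Download the quant-aware-trained 4B Gemma 3 model (~3.4 GB), then chat with it.
ollama pull gemma3:4b-it-qat
ollama run gemma3:4b-it-qat
```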