Fixed Lab 2

This commit is contained in:
c4ch3c4d3
2026-03-26 20:09:49 -06:00
parent 3bafa35460
commit a663cdbd42
8 changed files with 70 additions and 20 deletions
+4 -5
@@ -135,7 +135,7 @@ git clone https://huggingface.co/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B
**LLaMa.cpp** makes it easy to convert models downloaded in SafeTensors format to GGUF. We can convert the model with the following official project script:
```bash
-python3 convert_hf_to_gguf.py /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B/WhiteRabbitNeo-V3-7B --outfile /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B.gguf
+convert_hf_to_gguf.py /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B/WhiteRabbitNeo-V3-7B --outfile /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B.gguf
```
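After conversion, it can help to sanity-check the output before moving on. The sketch below (not part of the lab) reads the file header: per the GGUF format, a valid file begins with the 4-byte magic `b"GGUF"` followed by a little-endian `uint32` version.

```python
import struct

def read_gguf_version(path):
    """Return the GGUF format version of a file, or raise if it is not GGUF.

    A GGUF file starts with the 4-byte magic b"GGUF" followed by a
    little-endian uint32 version number.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
    return version
```

If the magic check fails, the conversion did not complete or the wrong file was passed.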
### 4 Execute: Review Model Metadata
@@ -212,7 +212,7 @@ Quantization reduces memory footprints and speeds inference, but it typically ra
To generate 8-bit, 4-bit, and 1-bit quantizations, run the following commands:
<div class="lab-callout lab-callout--warning">
-<strong>Warning:</strong> Although these quantization steps are provided for replication, pre-quantized support files are already available in <code>/home/student/lab2/WhiteRabbitNeo/</code> for faster lab progress.
+<strong>Warning:</strong> Although these quantization steps are provided for replication, pre-quantized support files are already available in <code>/home/student/lab2/WhiteRabbitNeo/</code> for faster lab progress. <br><br>You can skip these commands when participating in a live teaching session.
</div>
```bash
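# Sketch only (not the lab's exact commands, which are truncated in this
# diff view): typical llama.cpp llama-quantize invocations for 8-bit,
# 4-bit, and 1-bit outputs. Binary path and supported quant-type names
# may differ across llama.cpp versions.
./llama-quantize WhiteRabbitNeo-V3-7B.gguf WhiteRabbitNeo-V3-7B-Q8_0.gguf Q8_0
./llama-quantize WhiteRabbitNeo-V3-7B.gguf WhiteRabbitNeo-V3-7B-Q4_K_M.gguf Q4_K_M
./llama-quantize WhiteRabbitNeo-V3-7B.gguf WhiteRabbitNeo-V3-7B-IQ1_S.gguf IQ1_S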
@@ -459,7 +459,7 @@ When finished, you will be presented with a prompt, similar to the `llama-cli` c
Similarly, we can do the same by pulling a model directly from **HuggingFace**. As long as the source file is a .gguf of any quantization level that fits within our system memory, Ollama can fetch it directly.
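The steps below walk through the browser flow; recent Ollama releases can also fetch a GGUF repository directly from HuggingFace by prefixing the repo path with `hf.co/`. A sketch, using the repo linked below:

```shell
# Pull and run a GGUF model straight from Hugging Face
ollama run hf.co/CodeIsAbstract/Llama-3.2-1B-Q8_0-GGUF
```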
-1. **Select a Quantized Model from Objective 1**: visit [CodeIsAbstract](https://huggingface.co/CodeIsAbstract/Llama-3.2-1B-Q8_0-GGUF) in your browser.
+1. **Select the Quantized Model from Objective 1**: visit [CodeIsAbstract](https://huggingface.co/CodeIsAbstract/Llama-3.2-1B-Q8_0-GGUF) in your browser.
2. **Use this model** - Click **Use this model** → choose the **Ollama** tab. The page displays a ready-to-run command:
<figure style="text-align: center;">
@@ -528,10 +528,9 @@ ollama run WhiteRabbitNeo
| `ollama list` | Shows all models currently registered with Ollama. |
| `ollama rm <tag>` | Deletes the specified model (freeing disk space). |
| `ollama show <tag>` | Prints model metadata (architecture, context length, quantization). |
| `ollama show <tag> --modelfile` | Prints an existing model's modelfile. Often useful for templating our own. |
| `ollama serve` | Starts the OpenAI-compatible API server (runs automatically when you first use `ollama run`). |
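Since `ollama serve` exposes an OpenAI-compatible HTTP API (by default on `localhost:11434`), any HTTP client can talk to a registered model. A minimal sketch of the request body for `POST /v1/chat/completions`, assuming the `WhiteRabbitNeo` tag created above:

```python
import json

def chat_request_body(model, prompt):
    """Build the JSON body for Ollama's OpenAI-compatible
    /v1/chat/completions endpoint."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one complete response instead of streamed chunks
    })

body = chat_request_body("WhiteRabbitNeo", "Summarize GGUF in one sentence.")
```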
<br>
---
## Conclusion
+8 -8
@@ -27,9 +27,9 @@ Password: student
Once you've successfully connected to Open WebUI, follow the registration instructions. Feel free to register with any information, as the Kaggle instance will tear itself down after four hours (or sooner, given manual intervention or inactivity). Once successful, move on to the next objective.
<figure style="text-align: center;">
-<a href="https://i.imgur.com/QrQwWuD.jpeg" target="_blank">
+<a href="https://i.imgur.com/btkT9IH.png" target="_blank">
<img
-src="https://i.imgur.com/QrQwWuD.jpeg"
+src="https://i.imgur.com/btkT9IH.png"
style="width: 50%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
</a>
<figcaption style="margin-top: 8px; font-size: 1.1em;">
@@ -49,8 +49,8 @@ Locate, pull, and run **Gemma3 4B-IT-QAT** (a quant-aware-trained m
* Locate the search box at the top of the page.
<figure style="text-align:center;">
-<a href="https://i.imgur.com/yQ9KMsa.png" target="_blank">
-<img src="https://i.imgur.com/yQ9KMsa.png" width="600"
+<a href="https://i.imgur.com/TuUbK7O.png" target="_blank">
+<img src="https://i.imgur.com/TuUbK7O.png" width="600"
style="display:block; margin-left:auto; margin-right:auto; border:5px solid black;">
</a>
<figcaption>Ollama homepage: use the search bar to look for “Gemma3”.</figcaption>
@@ -64,8 +64,8 @@ Locate, pull, and run **Gemma3 4B-IT-QAT** (a quant-aware-trained m
* Click the **`Tags`** link beneath the model description.
<figure style="text-align:center;">
-<a href="https://i.imgur.com/NgcM7qx.png" target="_blank">
-<img src="https://i.imgur.com/NgcM7qx.png" width="600"
+<a href="https://i.imgur.com/eaRaqnq.png" target="_blank">
+<img src="https://i.imgur.com/eaRaqnq.png" width="600"
style="display:block; margin-left:auto; margin-right:auto; border:5px solid black;">
</a>
<figcaption>Tag view: each entry shows the model size and a short description.</figcaption>
@@ -76,8 +76,8 @@ Locate, pull, and run **Gemma3 4B-IT-QAT** (a quant-aware-trained m
* The size column reads **`3.4GB`**, indicating the VRAM required for inference.
<figure style="text-align:center;">
-<a href="https://i.imgur.com/nDPlOdd.png" target="_blank">
-<img src="https://i.imgur.com/nDPlOdd.png" width="600"
+<a href="https://i.imgur.com/Sf8sSs3.png" target="_blank">
+<img src="https://i.imgur.com/Sf8sSs3.png" width="600"
style="display:block; margin-left:auto; margin-right:auto; border:5px solid black;">
</a>
<figcaption>Model size for `Qwen3.5:4b` (≈ 3.3GB VRAM).</figcaption>
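The size figures quoted above can be reasoned about from first principles: quantized weight storage is roughly parameter count times bits per weight. The sketch below is illustrative only; published GGUF sizes run higher because of per-block scale metadata, and runtime VRAM additionally includes the KV cache and activations.

```python
def weight_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of quantized model weights in GiB:
    parameters * bits per weight, converted from bits to GiB."""
    return n_params * bits_per_weight / 8 / 2**30

# A 4B-parameter model at 4 bits per weight is roughly 1.9 GiB of raw
# weights, so a ~3.4GB download for a 4-bit 4B model is plausible once
# metadata and higher-precision layers are included.
```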
Binary file not shown (new image, 81 KiB).
Binary file not shown (new image, 118 KiB).
Binary file not shown (new image, 157 KiB).
Binary file not shown (new image, 89 KiB).