Update lab model defaults and assets

2026-04-24 20:08:56 -06:00
parent fcb2dcb36d
commit 562be3fd1f
18 changed files with 8971 additions and 916856 deletions
Use the launch panel below to open the local Netron service on port `8338`.
<div data-lab1-netron-panel></div>
### Execute: Download the GGUF File
You will work with a small GGUF model in this objective. Download the provided file:
- [Llama-3.2-1B.Q4_K_M.gguf](/api/lab1/models/llama-3.2-1b-q4_k_m.gguf) for Llama 3.2 1B
This file is intentionally small enough to make architecture exploration practical in a classroom lab. Save it to a convenient location such as your `Downloads` folder. Once you've downloaded the file, you can open it using the "Open Model" button on the Netron home page.
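Before opening the file in Netron, you can sanity-check the download with a few lines of Python. Per the GGUF specification, every file begins with a fixed preamble: the magic bytes `GGUF`, a `uint32` format version, a `uint64` tensor count, and a `uint64` metadata key/value count. This is a minimal sketch; the filename assumes you saved the file under its default name:

```python
import struct

def read_gguf_header(path):
    """Read the fixed GGUF preamble: version, tensor count, metadata count."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"{path} does not start with the GGUF magic bytes")
        version, = struct.unpack("<I", f.read(4))    # uint32 format version
        n_tensors, = struct.unpack("<Q", f.read(8))  # uint64 tensor count
        n_kv, = struct.unpack("<Q", f.read(8))       # uint64 metadata entries
    return version, n_tensors, n_kv

# version, n_tensors, n_kv = read_gguf_header("llama-3.2-1b-q4_k_m.gguf")
```

The tensor count reported here should match the number of named tensors Netron displays once the model is loaded.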
<figure style="text-align:center;">
<a href="https://i.imgur.com/Y7QpGpG.png" target="_blank">
Once Netron is open:
1. Select **Open Model** or drag a GGUF file directly into the browser window.
2. Open `Llama 3.2 1B`.
Netron will display the model as a graph of tensors, operators, and named blocks. This is a more literal view than the simplified lecture diagrams, but it is still showing the same fundamental idea: the model is a large stack of numeric values, each serving a different purpose to model language.
As you move around the graph, focus on these three recurring structures.
</li>
</ul>
Notably, even small local models are composed of many repeated blocks. The exact count varies by model family, size, and export format, but the important pattern is the repeated attention and feed-forward structure.
Lastly, you may see labels such as **MatMul**, **Mul**, or **mulmat**, depending on how the graph was exported and named. In practice, these are often part of the feed-forward path that expands and reshapes the model's internal representation before passing it onward.
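That expand-then-project pattern is easy to sketch in plain Python. The sizes below are toy values chosen for illustration, not read from the lab model; real models use hidden widths in the thousands:

```python
# Toy feed-forward block: widen the hidden state, apply a nonlinearity,
# then project back to the model width. All sizes are made up.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

d_model, d_ff = 4, 16
x = [[0.5] * d_model]                            # one token's hidden state (1 x d_model)
W_up = [[0.1] * d_ff for _ in range(d_model)]    # expand: d_model x d_ff
W_down = [[0.1] * d_model for _ in range(d_ff)]  # contract: d_ff x d_model

h = [[max(v, 0.0) for v in row] for row in matmul(x, W_up)]  # ReLU stands in for the real activation
y = matmul(h, W_down)
print(len(y), len(y[0]))  # → 1 4: back to model width
```

Each pair of **MatMul** nodes you spot in the graph is doing exactly this widen-then-narrow round trip, just with learned weights instead of constants.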
**Inspect the Small Model**
This model is small compared to modern production systems, but it is still large enough to reveal repeating architectural patterns.
As you inspect it, ask:
- Where do the repeating blocks begin to stand out?
- Which names remain stable across repeated blocks?
- How many *Attention Heads* does the model have? How might this affect the transformations the model can represent?
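To make the attention-head question concrete: the hidden width is split evenly across heads, so more heads means each head attends over a narrower slice of the representation. The numbers below are illustrative, not taken from the lab model:

```python
# Illustrative sizes only (not read from the lab model).
d_model, n_heads = 2048, 32
head_dim = d_model // n_heads  # slice of the hidden state each head sees
print(head_dim)  # → 64
```

When you find the attention block in Netron, compare the tensor shapes you see there against this kind of arithmetic.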
<figure style="display:flex; flex-direction:column; align-items:center; text-align:center;">
<a href="https://i.imgur.com/WhnFZss.png" target="_blank" style="display:block; max-width:100%;">
<img src="https://i.imgur.com/WhnFZss.png" width="600" style="display:block; max-width:100%; height:auto; border:5px solid black;">
</a>
<figcaption>Netron Qwen 3 0.6B Layers 1 & 2</figcaption>
</figure>
---
### Execute: Run the Local Confidence Widget
The widget below talks to the preloaded Gemma 4 E2B Q4 model through Ollama. Enter any prompt you like, generate a response, and then hover over the output tokens.
<div data-lab1-confidence></div>
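If you want to reproduce what the widget does outside the browser, Ollama exposes a local REST endpoint at `/api/generate` (default port 11434). This sketch only builds the request; the model tag is a placeholder for whatever the lab preloaded, so check `ollama list` for the exact name:

```python
import json
import urllib.request

def build_generate_request(model, prompt, host="http://localhost:11434"):
    """Build a one-shot (non-streaming) request for Ollama's /api/generate endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

# "lab1-model" is a hypothetical tag; substitute the name shown by `ollama list`.
# req = build_generate_request("lab1-model", "Why is the sky blue?")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

Setting `"stream": False` asks Ollama for a single JSON response instead of a stream of chunks, which is easier to inspect by hand.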