Refactor lab 1 for Netron and local confidence views
@@ -1,19 +1,20 @@
|
|||||||
---
|
---
|
||||||
order: 1
|
order: 1
|
||||||
title: Lab 1 - Visualizing LLMs in TransformerLab
|
title: Lab 1 - Model Structure, Tokenization, and Confidence Visualization
|
||||||
description: Explore model structure, tokenization, and next-token prediction inside TransformerLab.
|
description: Explore GGUF model structure in Netron, inspect tokenization interactively, and visualize token confidence with a local Ollama model.
|
||||||
---
|
---
|
||||||
|
|
||||||
<!-- breakout-style: instruction-rails -->
|
<!-- breakout-style: instruction-rails -->
|
||||||
<!-- step-style: underline -->
|
<!-- step-style: underline -->
|
||||||
<!-- objective-style: divider -->
|
<!-- objective-style: divider -->
|
||||||
|
|
||||||
# Lab 1 - Visualizing LLMs in TransformerLab
|
# Lab 1 - Model Structure, Tokenization, and Confidence Visualization
|
||||||
|
|
||||||
In this lab, we will:
|
In this lab, we will:
|
||||||
|
|
||||||
- Download and Visualize LLama-3.2-1B-Instruct
|
- Visualize two small GGUF models in Netron
|
||||||
- Visualize Tokenization & Prediction with LLama-3.2-1B-Instruct
|
- Observe how text is split into tokens and token IDs
|
||||||
|
- Inspect the confidence of a local model one token at a time
|
||||||
|
|
||||||
<div class="lab-callout lab-callout--info">
|
<div class="lab-callout lab-callout--info">
|
||||||
<strong>Lab Flow Guide</strong><br />
|
<strong>Lab Flow Guide</strong><br />
|
||||||
@@ -21,258 +22,181 @@ In this lab, we will:
|
|||||||
<strong>Execute</strong> steps require performing actions in the lab environment.
|
<strong>Execute</strong> steps require performing actions in the lab environment.
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
## Objective 1: Starting TransformerLab
|
## Objective 1: Visualize Tokenization and Token IDs
|
||||||
|
|
||||||
### Execute: Access the Lab Environment
|
### Execute: Use the Tokenizer Playground
|
||||||
|
|
||||||
To start Lab 1, ensure you've received a WireGuard configuration and system IP from your instructor. If you're unfamiliar with WireGuard, assistance will be provided to ensure you can access the lab environment for the duration of class.
|
The embedded tool below lets you enter raw text and observe how it's converted into model tokens. Tokenization is the critical first step that enables a Large Language Model to process and understand user input: it transforms words into numerical values.
|
||||||
|
|
||||||
All systems use the default username and password of `student`. All labs are located in the student home folder. To start Lab 1, run
|
<div data-tokenizer-playground></div>
|
||||||
|
|
||||||
```bash
|
### Explore: Try Multiple Inputs
|
||||||
~/lab1/lab1_start.sh
|
|
||||||
```
|
|
||||||
|
|
||||||
using the `lab1_start.sh` script in the `lab1` folder.
|
Enter several different inputs and compare how the tokenization changes. Use at least these three examples:
|
||||||
|
|
||||||
Lastly, if necessary, you can `su -` to root at any time. No password will be required.
|
1. `The quick brown fox jumps over the lazy dog`
|
||||||
|
2. `cybersecurity analyst`
|
||||||
|
3. `printf("hello");`
|
||||||
|
|
||||||
Once started, you can reach TransformerLab on port 8338 of your Lab VM (http://<IP>:8338).
|
Then try a few of your own. Short English phrases, punctuation, code, and unusual spacing are all good choices.
|
||||||
|
|
||||||
## Objective 2: Visualizing a LLM
|
### Explore: Compare the Two Tokenization Views
|
||||||
|
|
||||||
### Explore: Understand the Model and Runtime
|
This tool is especially useful because it shows both:
|
||||||
|
|
||||||
The next steps will guide us through the process of deploying and interacting with a pre-trained LLM, `LLama-3.2-1B-Instruct`. To do this, we’ll be utilizing an inference engine – software designed to execute LLM models and generate token predictions. You'll encounter models packaged in the **GGUF** format, a file format designed for efficient storage and loading of quantized LLMs, enabling them to run on a wider range of hardware. Don't worry if these terms are new to you – the specifics of inference engines and the details of **GGUF** quantized LLMs will be thoroughly explained in the following section of this course.
|
- The **visual split** of the text into tokens
|
||||||
|
- The underlying **token ID values**
|
||||||
|
|
||||||
Normally to start, we'll need to install an **inference engine** capable of running **GGUF** files.
|
Those are two views of the same process.
|
||||||
|
|
||||||
### Execute: Verify the FastChat Plugin
|
The visual split helps us see where the tokenizer grouped characters or subwords together. The token ID view reminds us that the model never consumes English directly. It consumes numeric identifiers that point into the tokenizer vocabulary.
|
||||||
|
|
||||||
Navigate to **Plugins**, and in the search bar type `Fastchat`. Note that it has already been installed for you!
|
As you work through your examples, ask:
|
||||||
|
|
||||||
|
- Which full words remain intact?
|
||||||
|
- Which words get split into subwords or punctuation chunks?
|
||||||
|
- When spacing changes, do the token IDs change too?
|
||||||
|
|
||||||
|
Lastly, experiment with how different tokenizers can change how inputs are split into different numerical values. How might this affect the next steps in the transformation process?
|
||||||
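Under the hood, the playground is doing something conceptually similar to this toy greedy longest-match tokenizer. The vocabularies and ID values here are invented for illustration; real tokenizers use learned BPE or SentencePiece merges, but the output shape is the same: string pieces paired with numeric IDs.

```typescript
// Toy tokenizer: repeatedly take the longest vocabulary entry that
// matches at the current position. Vocabulary and IDs are made up.
type Token = { piece: string; id: number };

function tokenize(text: string, vocab: Record<string, number>): Token[] {
  const tokens: Token[] = [];
  let i = 0;
  while (i < text.length) {
    let best = "";
    for (const piece of Object.keys(vocab)) {
      if (text.startsWith(piece, i) && piece.length > best.length) best = piece;
    }
    if (best === "") {
      // Real tokenizers fall back to byte-level pieces for unknown input;
      // here we emit single characters with a sentinel ID of -1.
      tokens.push({ piece: text[i], id: -1 });
      i += 1;
      continue;
    }
    tokens.push({ piece: best, id: vocab[best] });
    i += best.length;
  }
  return tokens;
}

// Two different vocabularies split the same text differently:
const subwordVocab = { cyber: 5, security: 6, " analyst": 7 };
const wordVocab = { cybersecurity: 0, " analyst": 1 };

console.log(tokenize("cybersecurity analyst", subwordVocab)); // 3 tokens
console.log(tokenize("cybersecurity analyst", wordVocab)); // 2 tokens
```

Note that the same input produced different splits and entirely different ID sequences, which is exactly the behavior to look for in the playground.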
|
|
||||||
<figure style="text-align:center;">
|
<figure style="text-align:center;">
|
||||||
<a href="https://imgur.com/9Waj8VG.png" target="_blank">
|
<a href="https://i.imgur.com/kc8W4gU.png" target="_blank">
|
||||||
<img
|
<img src="https://i.imgur.com/kc8W4gU.png" width="800" style="border:5px solid black;">
|
||||||
src="https://imgur.com/9Waj8VG.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
</a>
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
<figcaption>Tokenization - GPT3</figcaption>
|
||||||
Plugins
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
</figure>
|
||||||
<br>
|
<br>
|
||||||
|
|
||||||
### Execute: Find and Load `LLama-3.2-1B-Instruct`
|
|
||||||
|
|
||||||
Next, navigate to **Model Registry**. You should see `LLama-3.2-1B-Instruct` right away on your screen, but if not, please start searching for this model using the search bar.
|
|
||||||
|
|
||||||
<figure style="text-align:center;">
|
<figure style="text-align:center;">
|
||||||
<a href="https://i.imgur.com/UyWdnMR.png" target="_blank">
|
<a href="https://i.imgur.com/xMKEBwB.png" target="_blank">
|
||||||
<img
|
<img src="https://i.imgur.com/xMKEBwB.png" width="800" style="border:5px solid black;">
|
||||||
src="https://i.imgur.com/UyWdnMR.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
</a>
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
<figcaption>Tokenization - GPT4</figcaption>
|
||||||
Model Registry Selection.
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
</figure>
|
||||||
<br>
|
|
||||||
|
|
||||||
Once downloaded, Select **Foundation** & our newly downloaded `LLama-3.2-1B-Instruct` model.
|
|
||||||
|
|
||||||
<figure style="text-align: center;">
|
|
||||||
<a href="https://i.imgur.com/Aez94RU.png" target="_blank">
|
|
||||||
<img
|
|
||||||
src="https://i.imgur.com/Aez94RU.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
|
||||||
Model Selection
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
|
||||||
<br>
|
|
||||||
|
|
||||||
Once selected, click **Run**. Give TransformerLab a moment to successfully load the model.
|
|
||||||
|
|
||||||
<figure style="text-align: center;">
|
|
||||||
<a href="https://i.imgur.com/f4YcA8P.png" target="_blank">
|
|
||||||
<img
|
|
||||||
src="https://i.imgur.com/f4YcA8P.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
|
||||||
Starting a Model
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
|
||||||
<br>
|
|
||||||
|
|
||||||
### Explore: Inspect the Architecture View
|
|
||||||
|
|
||||||
To start, lets navigate to the **Interact** page, and then select **Model Architecture** from the Chat drop down.
|
|
||||||
|
|
||||||
<figure style="text-align: center;">
|
|
||||||
<a href="https://i.imgur.com/X0CM31h.png" target="_blank">
|
|
||||||
<img
|
|
||||||
src="https://i.imgur.com/X0CM31h.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
|
||||||
Model Architecture Dropdown
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
|
||||||
<br>
|
|
||||||
|
|
||||||
This page allows us to visualize the actively loaded model, in this case our downloaded `LLama-3.2-1B-Instruct-`. This interactive view is equivalent to the greatly simplified version shown on the slide “Transformation: Multylayer Perceptron” from our lecture. We can explore this view by:
|
|
||||||
|
|
||||||
- Holding down both right and left mouse buttons and dragging will move the entire model.
|
|
||||||
- Holding down just the left mouse button will allow you to rotate the view.
|
|
||||||
|
|
||||||
<figure style="text-align: center;">
|
|
||||||
<a href="https://i.imgur.com/8hXTGlt.png" target="_blank">
|
|
||||||
<img
|
|
||||||
src="https://i.imgur.com/8hXTGlt.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
|
||||||
Model Visualization
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
|
||||||
<br>
|
|
||||||
|
|
||||||
### Explore: Interpret Layers, Blocks, and Parameters
|
|
||||||
|
|
||||||
Each layer of the model performs a specific task, taking the input provided, and transforming it into the statistically most likely completion of text, token by token. This format of Llama 3.1 1B is made up of 372 **layers**. Each layer will transform the input of the layer above it, until eventually, we end up with the statically likely completion.
|
|
||||||
You have likely also noticed that the colors repeat. Each set of repeating **layers** is organized into **blocks**. Each **block** is a grouping of **layers** that perform the same functions, but with a slightly different focus. For example, one **block** may focus on nouns, and another may focus on adjectives, and so on.
|
|
||||||
|
|
||||||
The **layers** within Llama 3.1 1B are as follows:
|
|
||||||
|
|
||||||
<ul class="concept-pill-list">
|
|
||||||
<li>
|
|
||||||
<span class="concept-pill-label">Attention:</span>
|
|
||||||
<span>Focuses the model on specific parts of an input sequence to more accurately predict the next token.</span>
|
|
||||||
</li>
|
|
||||||
<li>
|
|
||||||
<span class="concept-pill-label">Weights:</span>
|
|
||||||
<span>The core learnable parameters of the network.</span>
|
|
||||||
</li>
|
|
||||||
<li>
|
|
||||||
<span class="concept-pill-label">Biases:</span>
|
|
||||||
<span>Additional parameters added after the weighted sum to shift (transform) the output.</span>
|
|
||||||
</li>
|
|
||||||
<li>
|
|
||||||
<span class="concept-pill-label">Scale:</span>
|
|
||||||
<span>Normalizes the output of previous <strong>layers</strong> to prepare the next round of transformation.</span>
|
|
||||||
</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
Each of these **layers** also has a different type, corresponding to Q, K, V, and much more. 5. The **layers** between the small “Attention” **layers** are all considered to make up a single “block.”
|
|
||||||
To the side, we can see the actual number values of each weight within each layer.
|
|
||||||
|
|
||||||
Fundamentally, the LLM itself is this stack of numbers. Those numbers allow us to transform tokenized input (such as English), and transform that into a useful output. The more **layers** & **blocks**, the bigger the model, the more accurate and “intelligent” the model will behave. This 1B parameter model is incredibly small however, so the “truthfulness” of generated predictions is likely to be suspect (aka Hallucinated). The model will at least sound very confident however!
|
|
||||||
|
|
||||||
<br>
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Objective 3: Tokenization & Prediction with LLama-3.2-1B-Instruct
|
|
||||||
|
|
||||||
### Execute: Interactive Chat
|
## Objective 2: Open Netron and Download the Lab Models
|
||||||
|
|
||||||
Lets next move on to active conversation with the model. Navigate to the **Chat** tab from the dropdown menu.
|
### Execute: Launch Netron
|
||||||
|
|
||||||
|
For this lab, model visualization now happens in **Netron**, a lightweight browser tool for inspecting model structure.
|
||||||
|
|
||||||
|
Use the launch panel below to open the local Netron service on port `8338`.
|
||||||
|
|
||||||
|
<div data-lab1-netron-panel></div>
|
||||||
|
|
||||||
|
### Execute: Download the Two GGUF Files
|
||||||
|
|
||||||
|
You will work with two small GGUF models in this objective:
|
||||||
|
|
||||||
|
- [Qwen 3 0.6B](/api/lab1/models/qwen3-0.6b-q8_0.gguf)
|
||||||
|
- [Llama 3.2 1B](/api/lab1/models/llama-3.2-1b-q4_k_m.gguf)
|
||||||
|
|
||||||
|
These files are intentionally small enough to make architecture exploration practical in a classroom lab. Download both files to a convenient location such as your `Downloads` folder. Once downloaded, open each file using the **Open Model** button on the Netron home page.
|
||||||
|
|
||||||
<figure style="text-align:center;">
|
<figure style="text-align:center;">
|
||||||
<a href="https://i.imgur.com/e40Jrku.png" target="_blank">
|
<a href="https://i.imgur.com/Y7QpGpG.png" target="_blank">
|
||||||
<img
|
<img src="https://i.imgur.com/Y7QpGpG.png" width="800" style="border:5px solid black;">
|
||||||
src="https://i.imgur.com/e40Jrku.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
</a>
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
<figcaption>Netron Start Page</figcaption>
|
||||||
Select Chat
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
</figure>
|
||||||
<br>
|
|
||||||
|
|
||||||
Once loaded, feel free to type any message and interact with the model in any way. To speed up the pace of our lab, I recommend setting your maximum output length to 64 tokens.
|
Once Netron is open:
|
||||||
|
|
||||||
|
1. Select **Open Model** or drag a GGUF file directly into the browser window.
|
||||||
|
2. Start with `Qwen 3 0.6B`.
|
||||||
|
|
||||||
|
Netron will display the model as a graph of tensors, operators, and named blocks. This is a more literal view than the simplified lecture diagrams, but it is still showing the same fundamental idea: the model is a large stack of numeric values, each serving a different purpose to model language.
|
||||||
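If you are curious what Netron is actually parsing, the GGUF container begins with a small fixed header. This sketch reads just that header, following the published GGUF layout (4-byte magic `GGUF`, then little-endian `uint32` version, `uint64` tensor count, and `uint64` metadata key/value count). The sample values in the usage note are illustrative, not taken from the lab files.

```typescript
// Minimal GGUF header reader. Only the fixed-size prefix is parsed;
// the metadata key/value entries and tensor info that follow are
// what Netron walks to build its graph view.
function parseGgufHeader(bytes: Uint8Array) {
  const magic = String.fromCharCode(...bytes.slice(0, 4));
  if (magic !== "GGUF") throw new Error("Not a GGUF file");
  const view = new DataView(bytes.buffer, bytes.byteOffset);
  return {
    version: view.getUint32(4, true), // little-endian
    tensorCount: view.getBigUint64(8, true),
    metadataKvCount: view.getBigUint64(16, true),
  };
}
```

In Node you could apply it to one of the downloaded models with `parseGgufHeader(new Uint8Array(fs.readFileSync("qwen3-0.6b-q8_0.gguf")))` (path assumed); the tensor count it reports corresponds to the tensor list you see in Netron.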
|
|
||||||
|
### Explore: What to Look For
|
||||||
|
|
||||||
|
As you move around the graph, focus on these recurring structures. Each grouping of these individual *layers* is what defines a *block*:
|
||||||
|
|
||||||
|
<ul class="concept-pill-list">
|
||||||
|
<li>
|
||||||
|
<span class="concept-pill-label">Tokenization:</span>
|
||||||
|
<span>Converts textual input into numeric values, which is required before the model can process a user's input.</span>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<span class="concept-pill-label">Embedding:</span>
|
||||||
|
<span>Takes tokenized ID values and converts them into vectors the model can perform transformations against.</span>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<span class="concept-pill-label">Multi-head attention:</span>
|
||||||
|
<span>"Attends" to each token's Query (What am I looking for?), Key (What do I contain?), and Value (What do I pass on?).</span>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<span class="concept-pill-label">Feed-forward / mulmat:</span>
|
||||||
|
<span>Applies learned "transformations" after attention to further refine each token representation.</span>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
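The attention layers listed above implement one concrete formula: softmax(Q·Kᵀ/√d)·V, computed per query token. Here is a minimal single-head sketch on plain number arrays, using hand-picked toy vectors rather than real model weights:

```typescript
// Softmax turns raw similarity scores into weights that sum to 1.
function softmax(xs: number[]): number[] {
  const m = Math.max(...xs);
  const exps = xs.map((x) => Math.exp(x - m));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

function dot(a: number[], b: number[]): number {
  return a.reduce((acc, x, i) => acc + x * b[i], 0);
}

// Scaled dot-product attention for one query against a set of keys/values.
function attention(q: number[], keys: number[][], values: number[][]): number[] {
  const d = q.length;
  const scores = keys.map((k) => dot(q, k) / Math.sqrt(d)); // Q · K^T / sqrt(d)
  const weights = softmax(scores);
  // Output is the weighted sum of the value vectors.
  return values[0].map((_, j) =>
    values.reduce((acc, v, i) => acc + weights[i] * v[j], 0),
  );
}

// The query "lines up" with the first key, so the first value dominates.
console.log(attention([1, 0], [[1, 0], [0, 1]], [[1, 0], [0, 1]]));
```

A multi-head layer simply runs several independent copies of this computation on different learned projections of the same input, then concatenates the results.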
|
|
||||||
|
Notably, Qwen 3 0.6B is composed of 28 of these blocks! That is significantly more than the original GPT-2 (12 blocks), even though both models are tiny by today's standards.
|
||||||
|
|
||||||
|
Lastly, you may see labels such as **MatMul**, **Mul**, or **mulmat**, depending on how the graph was exported and named. In practice, these are often part of the feed-forward path that expands and reshapes the model's internal representation before passing it onward.
|
||||||
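As a sketch of what those feed-forward MatMul nodes compute, here is the common up-project / activate / down-project pattern. SiLU is the activation used by many Llama-style feed-forward layers; the tiny matrices below are stand-ins, not values from the lab models.

```typescript
// Multiply a matrix (rows of weights) by a vector: one MatMul node.
function matVec(W: number[][], x: number[]): number[] {
  return W.map((row) => row.reduce((acc, w, i) => acc + w * x[i], 0));
}

// SiLU activation: x * sigmoid(x).
const silu = (x: number) => x / (1 + Math.exp(-x));

// Feed-forward block: expand to a wider hidden size, apply the
// nonlinearity, then project back down to the model dimension.
function feedForward(x: number[], wUp: number[][], wDown: number[][]): number[] {
  const hidden = matVec(wUp, x).map(silu);
  return matVec(wDown, hidden);
}
```

In the real models the "up" matrix typically expands the representation several times over before the "down" matrix compresses it again, which is why these MatMul nodes account for so many of the parameters you see in the graph.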
|
|
||||||
|
**Compare the Two Small Models**
|
||||||
|
|
||||||
|
Both models are small compared to modern production systems, but they are still large enough to reveal repeating architectural patterns.
|
||||||
|
|
||||||
|
As you compare them, ask:
|
||||||
|
|
||||||
|
- Where do the repeating blocks begin to stand out?
|
||||||
|
- Which names remain stable between the two models?
|
||||||
|
- How many *attention heads* does each model have? How might this affect the transformations the model can perform?
|
||||||
|
|
||||||
<figure style="text-align:center;">
|
<figure style="text-align:center;">
|
||||||
<a href="https://i.imgur.com/MdAIKLn.png" target="_blank">
|
<a href="https://i.imgur.com/WhnFZss.png" target="_blank">
|
||||||
<img
|
<img src="https://i.imgur.com/WhnFZss.png" width="600" style="border:5px solid black;">
|
||||||
src="https://i.imgur.com/MdAIKLn.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
</a>
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
<figcaption>Netron Qwen 3 0.6B Layers 1 & 2</figcaption>
|
||||||
Maximum Length - 64
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
</figure>
|
||||||
<br>
|
|
||||||
|
|
||||||
If text generation fails, or acts weird (such as merely repeating your input back to you), unload and reload the model using the previous Foundation screen from the last Objective.
|
---
|
||||||
|
|
||||||
### Execute: View Tokenization
|
|
||||||
|
|
||||||
If everything is in working order, review the **Tokenize** view. This allows us to visually see how Llama 3.2 will convert our input text into “tokens,” or numbers that represent the input English. Feel free to input any sentence into the box to review what the final tokenized version will be.
|
## Objective 3: Visualize Prediction Confidence
|
||||||
|
|
||||||
<figure style="text-align: center;">
|
### Execute: Run the Local Confidence Widget
|
||||||
<a href="https://i.imgur.com/I9tU8jK.png" target="_blank">
|
|
||||||
<img
|
|
||||||
src="https://i.imgur.com/I9tU8jK.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
|
||||||
Tokenize View
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
|
||||||
<br>
|
|
||||||
|
|
||||||
### Execute: Visualize Next-Token Activations
|
The widget below talks to the preloaded local Lab 1 model through Ollama. Enter any prompt you like, generate a response, and then hover over the output tokens.
|
||||||
|
|
||||||
Next, select Model Activations. By entering “The quick brown fox” and selecting visualize, we can see how the model selects the next word, and the models level of confidence. Also feel free to redo this process with alternative sentences.
|
<div data-lab1-confidence></div>
|
||||||
|
|
||||||
<figure style="text-align: center;">
|
### Explore: Interpret the Color Coding
|
||||||
<a href="https://i.imgur.com/JeWpoqV.png" target="_blank">
|
|
||||||
<img
|
|
||||||
src="https://i.imgur.com/JeWpoqV.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
|
||||||
Next Word Prediction
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
|
||||||
<br>
|
|
||||||
|
|
||||||
### Execute: Compare Confidence Views
|
Each token in the output is colored by the model's confidence in that selected token.
|
||||||
|
|
||||||
Note how confident the model is about the word jumps in this famous phrase. For an alternative view of the same output, you can also select the **Visualize Logprobes** option from the menu, which will show the same information but by color.
|
In general:
|
||||||
|
|
||||||
<figure style="text-align: center;">
|
- Greener tokens indicate the model was more confident in that choice
|
||||||
<a href="https://i.imgur.com/PvkgQUr.png" target="_blank">
|
- Warmer yellow or orange tokens indicate a weaker preference
|
||||||
<img
|
- Hovering over a token reveals the selected token's percentage and the strongest alternate predictions
|
||||||
src="https://i.imgur.com/PvkgQUr.png"
|
|
||||||
style="width: 90%; display: block; margin-left: auto; margin-right: auto; border: 5px solid black;">
|
|
||||||
</a>
|
|
||||||
<figcaption style="margin-top: 8px; font-size: 1.1em;">
|
|
||||||
Green is Confident. Red is less confident.
|
|
||||||
</figcaption>
|
|
||||||
</figure>
|
|
||||||
<br>
|
|
||||||
|
|
||||||
### Explore: Continue Exploring TransformerLab Features
|
This is useful because it shows us that model output is not magic or certainty. Each generated token is chosen from a probability distribution over many possible next tokens.
|
||||||
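The percentages behind the color coding come from a softmax over the model's raw scores for every candidate token, and API logprobs are just the natural log of those same probabilities. A small sketch, with invented logit values:

```typescript
// Convert raw logits into a probability distribution over candidates.
function softmax(logits: number[]): number[] {
  const m = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - m));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Hypothetical scores for four candidate next tokens.
const logits = [4.1, 2.0, 1.2, 0.5];
const probs = softmax(logits);

// probs[0] is the confidence in the top choice; a hover tooltip just
// rounds it to a percentage.
const percent = Math.round(100 * probs[0]);

// If an API returns a logprob instead, exp() recovers the probability:
const logprob = Math.log(probs[0]);
const recovered = Math.exp(logprob); // equals probs[0]
```

Even a very "green" token is only the most likely option in a distribution, never a certainty; the remaining probability mass is spread across the alternate predictions shown in the tooltip.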
|
|
||||||
Please continue to explore Transformers Lab until you’re ready to move on. While we will utilize many different tools other than Transformers Lab throughout this course due to its beta nature, this software is improving all the time and is worth watching! Transformers lab supports many advanced features, in various stages of development, such as:
|
### Explore: Try Different Prompt Styles
|
||||||
|
|
||||||
- Batch Text Generation
|
To make the confidence view more interesting, compare:
|
||||||
- LLM Fine Tuning
|
|
||||||
- LLM Evaluation
|
|
||||||
- Retrieval Augmented Generation (RAG)
|
|
||||||
We will discuss these topics and more throughout the course.
|
|
||||||
|
|
||||||
<br>
|
1. A common phrase such as `The quick brown fox`
|
||||||
|
2. A factual question
|
||||||
|
3. A short cybersecurity prompt
|
||||||
|
|
||||||
|
Notice where the model appears highly certain and where it becomes less stable. Small local models often produce text that sounds very confident even when the underlying prediction distribution is more fragile than it first appears.
|
||||||
|
|
||||||
|
<div class="lab-screenshot-placeholder">
|
||||||
|
<strong>Screenshot Placeholder</strong>
|
||||||
|
Confidence heatmap and hover tooltip view.
|
||||||
|
</div>
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Conclusion
|
## Conclusion
|
||||||
|
|
||||||
In this lab, we observed the foundational concepts of all LLMs in action using TransformerLab. Through hands-on exploration, we observed the process of tokenization – how text is converted into numerical representations for the model – and visualized the model's prediction process, including its confidence levels for different token selections. By navigating the model’s layers and blocks, we gained an appreciation for the sheer scale and complexity inherent in modern LLMs.
|
In this lab, we explored three foundational views of an LLM.
|
||||||
|
|
||||||
This initial experience provides a crucial stepping stone for further exploration of LLMs, laying the groundwork for future labs focused on fine-tuning, evaluation, and advanced techniques like Retrieval Augmented Generation.
|
First, we used a tokenizer playground to see how plain text becomes tokens and token IDs. Then we opened two GGUF model files in Netron and inspected the architecture directly. Finally, we used a local confidence visualizer to watch a small model generate output token by token while exposing how certain it was about each choice.
|
||||||
|
|
||||||
|
Together, these three perspectives give us a much more grounded picture of what an LLM actually is: a structured file of learned weights, a tokenizer that converts text into IDs, and a prediction engine that selects the next token from a probability distribution.
|
||||||
|
|||||||
@@ -179,7 +179,7 @@ We should then see:
|
|||||||
|
|
||||||
A text listing of all of the model's tensors, and the precision of each. Because we have merely converted the model's format, and not performed quantization, the model is still in **FP16**.
|
A text listing of all of the model's tensors, and the precision of each. Because we have merely converted the model's format, and not performed quantization, the model is still in **FP16**.
|
||||||
|
|
||||||
- This is a text view of the previous graphical view we saw in **Lab 1, Objective 2: Visualizing a LLM**. While **TransformerLab** calls tensors **layers**, terms such as **tensors**, **layers**, and **blocks** can all be used semi-interchangeably, depending on the tool in question. We will further confuse these topics when we get to the Ollama objective below.
|
- This is a text view of the previous graphical view we saw in **Lab 1, Objective 2: Visualizing a LLM**. While tools such as **Netron** may expose tensors, operators, and repeating blocks with different labels, terms such as **tensors**, **layers**, and **blocks** can still be used semi-interchangeably at this level of discussion. We will further confuse these topics when we get to the Ollama objective below.
|
||||||
- Pedantically, the proper definitions are:
|
- Pedantically, the proper definitions are:
|
||||||
- Tensor - A multi-dimensional array of vectors to store data
|
- Tensor - A multi-dimensional array of vectors to store data
|
||||||
- Layer - A base computational unit in a neural network
|
- Layer - A base computational unit in a neural network
|
||||||
|
|||||||
Generated
-30
@@ -1307,9 +1307,6 @@
|
|||||||
"arm64"
|
"arm64"
|
||||||
],
|
],
|
||||||
"dev": true,
|
"dev": true,
|
||||||
"libc": [
|
|
||||||
"glibc"
|
|
||||||
],
|
|
||||||
"license": "MIT",
|
"license": "MIT",
|
||||||
"optional": true,
|
"optional": true,
|
||||||
"os": [
|
"os": [
|
||||||
@@ -1327,9 +1324,6 @@
|
|||||||
"arm64"
|
"arm64"
|
||||||
],
|
],
|
||||||
"dev": true,
|
"dev": true,
|
||||||
"libc": [
|
|
||||||
"musl"
|
|
||||||
],
|
|
||||||
"license": "MIT",
|
"license": "MIT",
|
||||||
"optional": true,
|
"optional": true,
|
||||||
"os": [
|
"os": [
|
||||||
@@ -1347,9 +1341,6 @@
|
|||||||
"ppc64"
|
"ppc64"
|
||||||
],
|
],
|
||||||
"dev": true,
|
"dev": true,
|
||||||
"libc": [
|
|
||||||
"glibc"
|
|
||||||
],
|
|
||||||
"license": "MIT",
|
"license": "MIT",
|
||||||
"optional": true,
|
"optional": true,
|
||||||
"os": [
|
"os": [
|
||||||
@@ -1367,9 +1358,6 @@
|
|||||||
"s390x"
|
"s390x"
|
||||||
],
|
],
|
||||||
"dev": true,
|
"dev": true,
|
||||||
"libc": [
|
|
||||||
"glibc"
|
|
||||||
],
|
|
||||||
"license": "MIT",
|
"license": "MIT",
|
||||||
"optional": true,
|
"optional": true,
|
||||||
"os": [
|
"os": [
|
||||||
@@ -1387,9 +1375,6 @@
|
|||||||
"x64"
|
"x64"
|
||||||
],
|
],
|
||||||
"dev": true,
|
"dev": true,
|
||||||
"libc": [
|
|
||||||
"glibc"
|
|
||||||
],
|
|
||||||
"license": "MIT",
|
"license": "MIT",
|
||||||
"optional": true,
|
"optional": true,
|
||||||
"os": [
|
"os": [
|
||||||
@@ -1407,9 +1392,6 @@
|
|||||||
"x64"
|
"x64"
|
||||||
],
|
],
|
||||||
"dev": true,
|
"dev": true,
|
||||||
"libc": [
|
|
||||||
"musl"
|
|
||||||
],
|
|
||||||
"license": "MIT",
|
"license": "MIT",
|
||||||
"optional": true,
|
"optional": true,
|
||||||
"os": [
|
"os": [
|
||||||
@@ -5520,9 +5502,6 @@
|
|||||||
"arm64"
|
"arm64"
|
||||||
],
|
],
|
||||||
"dev": true,
|
"dev": true,
|
||||||
"libc": [
|
|
||||||
"glibc"
|
|
||||||
],
|
|
||||||
"license": "MPL-2.0",
|
"license": "MPL-2.0",
|
||||||
"optional": true,
|
"optional": true,
|
||||||
"os": [
|
"os": [
|
||||||
@@ -5544,9 +5523,6 @@
|
|||||||
"arm64"
|
"arm64"
|
||||||
],
|
],
|
||||||
"dev": true,
|
"dev": true,
|
||||||
"libc": [
|
|
||||||
"musl"
|
|
||||||
],
|
|
||||||
"license": "MPL-2.0",
|
"license": "MPL-2.0",
|
||||||
"optional": true,
|
"optional": true,
|
||||||
"os": [
|
"os": [
|
||||||
@@ -5568,9 +5544,6 @@
|
|||||||
"x64"
|
"x64"
|
||||||
],
|
],
|
||||||
"dev": true,
|
"dev": true,
|
||||||
"libc": [
|
|
||||||
"glibc"
|
|
||||||
],
|
|
||||||
"license": "MPL-2.0",
|
"license": "MPL-2.0",
|
||||||
"optional": true,
|
"optional": true,
|
||||||
"os": [
|
"os": [
|
||||||
@@ -5592,9 +5565,6 @@
|
|||||||
"x64"
|
"x64"
|
||||||
],
|
],
|
||||||
"dev": true,
|
"dev": true,
|
||||||
"libc": [
|
|
||||||
"musl"
|
|
||||||
],
|
|
||||||
"license": "MPL-2.0",
|
"license": "MPL-2.0",
|
||||||
"optional": true,
|
"optional": true,
|
||||||
"os": [
|
"os": [
|
||||||
|
|||||||
@@ -0,0 +1,163 @@
|
|||||||
|
import { NextResponse } from "next/server";
|
||||||
|
|
||||||
|
import { normalizeUpstreamChatEndpoint } from "~/lib/lab2-chat";
|
||||||
|
import {
|
||||||
|
clampLab1Messages,
|
||||||
|
extractLab1AssistantContent,
|
||||||
|
extractLab1ResponseTokens,
|
||||||
|
getLab1SystemPrompt,
|
||||||
|
LAB1_CONFIDENCE_MODEL_ALIAS,
|
||||||
|
LAB1_DEFAULT_MAX_TOKENS,
|
||||||
|
LAB1_DEFAULT_TEMPERATURE,
|
||||||
|
type Lab1ConfidenceMessage,
|
||||||
|
} from "~/lib/lab1-confidence";
|
||||||
|
|
||||||
|
type ChatRouteRequestBody = {
|
||||||
|
messages?: Lab1ConfidenceMessage[];
|
||||||
|
};
|
||||||
|
|
||||||
|
const LOCAL_OLLAMA_TIMEOUT_MS = 90000;
|
||||||
|
|
||||||
|
function getLocalOllamaEndpoint() {
  const configuredBaseUrl =
    process.env.COURSEWARE_OLLAMA_BASE_URL?.trim() || "http://127.0.0.1:11434";
  return normalizeUpstreamChatEndpoint(configuredBaseUrl);
}

function getLab1ModelAlias() {
  return (
    process.env.COURSEWARE_LAB1_OLLAMA_MODEL_ALIAS?.trim() ||
    LAB1_CONFIDENCE_MODEL_ALIAS
  );
}

export async function POST(request: Request) {
  let body: ChatRouteRequestBody;

  try {
    body = (await request.json()) as ChatRouteRequestBody;
  } catch {
    return NextResponse.json(
      {
        error: "The request body must be valid JSON.",
      },
      { status: 400 },
    );
  }

  if (!Array.isArray(body.messages) || body.messages.length === 0) {
    return NextResponse.json(
      {
        error: "At least one chat message is required.",
      },
      { status: 400 },
    );
  }

  const controller = new AbortController();
  const timeoutId = setTimeout(
    () => controller.abort(),
    LOCAL_OLLAMA_TIMEOUT_MS,
  );

  try {
    const upstreamResponse = await fetch(getLocalOllamaEndpoint(), {
      body: JSON.stringify({
        logprobs: true,
        max_tokens: LAB1_DEFAULT_MAX_TOKENS,
        messages: [
          {
            content: getLab1SystemPrompt(),
            role: "system",
          },
          ...clampLab1Messages(body.messages),
        ],
        model: getLab1ModelAlias(),
        stream: false,
        temperature: LAB1_DEFAULT_TEMPERATURE,
        top_logprobs: 5,
      }),
      headers: {
        "Content-Type": "application/json",
      },
      method: "POST",
      signal: controller.signal,
    });

    const responseText = await upstreamResponse.text();
    const parsedBody = JSON.parse(responseText) as unknown;

    if (!upstreamResponse.ok) {
      const message =
        typeof parsedBody === "object" &&
        parsedBody !== null &&
        "error" in parsedBody &&
        typeof parsedBody.error === "object" &&
        parsedBody.error !== null &&
        "message" in parsedBody.error &&
        typeof parsedBody.error.message === "string"
          ? parsedBody.error.message
          : `The local Ollama endpoint returned ${upstreamResponse.status}.`;

      return NextResponse.json(
        {
          error: message,
        },
        { status: upstreamResponse.status },
      );
    }

    if (!parsedBody || typeof parsedBody !== "object") {
      return NextResponse.json(
        {
          error: "The local Ollama endpoint returned an unreadable response.",
        },
        { status: 502 },
      );
    }

    const tokens = extractLab1ResponseTokens(parsedBody);
    if (tokens.length === 0) {
      return NextResponse.json(
        {
          error:
            "The local Ollama response did not include token logprobs. Confirm the installed Ollama version supports logprobs.",
        },
        { status: 502 },
      );
    }

    const content =
      extractLab1AssistantContent(parsedBody) ||
      tokens.map((token) => token.token).join("");

    return NextResponse.json({
      content,
      model:
        ("model" in parsedBody && typeof parsedBody.model === "string"
          ? parsedBody.model
          : getLab1ModelAlias()),
      role: "assistant",
      tokens,
    });
  } catch (caughtError) {
    if (caughtError instanceof Error && caughtError.name === "AbortError") {
      return NextResponse.json(
        {
          error: `The local Ollama endpoint timed out after ${Math.floor(LOCAL_OLLAMA_TIMEOUT_MS / 1000)} seconds.`,
        },
        { status: 504 },
      );
    }

    return NextResponse.json(
      {
        error: "The Lab 1 confidence route could not reach the local Ollama endpoint.",
      },
      { status: 502 },
    );
  } finally {
    clearTimeout(timeoutId);
  }
}
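The route above requests `logprobs: true` with `top_logprobs: 5`, and the confidence view later renders each token's probability as a percentage. The conversion is just exponentiation of the natural-log probability; a minimal sketch (the helper name `probabilityFromLogprob` is illustrative, not part of this codebase):

```typescript
// A logprob is the natural log of the token's probability, so the percentage
// shown in the confidence view is exp(logprob) * 100.
function probabilityFromLogprob(logprob: number): number {
  return Math.exp(logprob) * 100;
}

// exp(log(0.4)) * 100 === 40: a token the model picked with 40% confidence.
const pct = probabilityFromLogprob(Math.log(0.4));
console.log(pct.toFixed(1)); // "40.0"
```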
@@ -0,0 +1,75 @@
import { createReadStream, statSync } from "fs";
import path from "path";
import { Readable } from "stream";

import { NextResponse } from "next/server";

const modelFileMap = {
  "llama-3.2-1b-q4_k_m.gguf": {
    envKey: "COURSEWARE_LAB1_LLAMA_MODEL_PATH",
    fileName: "Llama-3.2-1B.Q4_K_M.gguf",
  },
  "qwen3-0.6b-q8_0.gguf": {
    envKey: "COURSEWARE_LAB1_QWEN_MODEL_PATH",
    fileName: "Qwen3-0.6B-Q8_0.gguf",
  },
} as const;

type ModelSlug = keyof typeof modelFileMap;

function resolveModelPath(slug: string) {
  const config = modelFileMap[slug as ModelSlug];
  if (!config) {
    return null;
  }

  const configuredPath = process.env[config.envKey]?.trim();
  if (!configuredPath) {
    return null;
  }

  return {
    absolutePath: path.resolve(configuredPath),
    fileName: config.fileName,
  };
}

export async function GET(
  _request: Request,
  context: { params: Promise<{ filename: string }> },
) {
  const { filename } = await context.params;
  const resolvedFile = resolveModelPath(filename.toLowerCase());

  if (!resolvedFile) {
    return NextResponse.json(
      {
        error: "The requested Lab 1 model file was not found.",
      },
      { status: 404 },
    );
  }

  try {
    const fileStats = statSync(resolvedFile.absolutePath);
    const stream = Readable.toWeb(
      createReadStream(resolvedFile.absolutePath),
    ) as ReadableStream;

    return new NextResponse(stream, {
      headers: {
        "Cache-Control": "private, max-age=0, must-revalidate",
        "Content-Disposition": `attachment; filename="${resolvedFile.fileName}"`,
        "Content-Length": String(fileStats.size),
        "Content-Type": "application/octet-stream",
      },
    });
  } catch {
    return NextResponse.json(
      {
        error: "The requested Lab 1 model file could not be opened.",
      },
      { status: 404 },
    );
  }
}
@@ -0,0 +1,84 @@
import { fireEvent, render, screen } from "@testing-library/react";
import { beforeEach, describe, expect, it, vi } from "vitest";

import { Lab1ConfidenceChat } from "~/components/labs/Lab1ConfidenceChat";

describe("Lab1ConfidenceChat", () => {
  beforeEach(() => {
    vi.restoreAllMocks();
  });

  it("renders colorized tokens and tooltip data from the Lab 1 chat route", async () => {
    vi.stubGlobal(
      "fetch",
      vi.fn(async () => {
        return {
          json: async () => ({
            content: "often works",
            model: "lab1-qwen3-0.6b-q8_0",
            role: "assistant",
            tokens: [
              {
                logprob: Math.log(0.4),
                probability: 40,
                token: "often",
                topAlternatives: [
                  { probability: 14, token: "commonly" },
                  { probability: 10, token: "also" },
                ],
              },
              {
                logprob: Math.log(0.8),
                probability: 80,
                token: " works",
                topAlternatives: [],
              },
            ],
          }),
          ok: true,
        };
      }),
    );

    render(<Lab1ConfidenceChat />);

    fireEvent.change(screen.getByLabelText("Prompt"), {
      target: { value: "Explain how often phishing succeeds." },
    });
    fireEvent.submit(
      screen.getByRole("button", { name: "Generate Output" }).closest("form")!,
    );

    expect(await screen.findByLabelText("often 40.0%")).toBeInTheDocument();
    expect(screen.getByText("14.0%:")).toBeInTheDocument();
    expect(screen.getByText("commonly")).toBeInTheDocument();
    expect(screen.getByText("lab1-qwen3-0.6b-q8_0")).toBeInTheDocument();
  });

  it("shows an inline error when the local route fails", async () => {
    vi.stubGlobal(
      "fetch",
      vi.fn(async () => {
        return {
          json: async () => ({
            error: "The local Ollama request failed.",
          }),
          ok: false,
        };
      }),
    );

    render(<Lab1ConfidenceChat />);

    fireEvent.change(screen.getByLabelText("Prompt"), {
      target: { value: "Trigger an error." },
    });
    fireEvent.submit(
      screen.getByRole("button", { name: "Generate Output" }).closest("form")!,
    );

    expect(
      await screen.findByText("The local Ollama request failed."),
    ).toBeInTheDocument();
  });
});
@@ -0,0 +1,244 @@
"use client";

import { FormEvent, useState } from "react";

import {
  formatProbabilityPercent,
  getConfidenceBand,
  type Lab1ConfidenceMessage,
  type Lab1ConfidenceResponse,
  type Lab1ResponseToken,
} from "~/lib/lab1-confidence";

type UserTurn = {
  content: string;
  id: string;
  role: "user";
};

type AssistantTurn = Lab1ConfidenceResponse & {
  error?: string;
  id: string;
};

type ChatTurn = AssistantTurn | UserTurn;

const starterPrompts = [
  "The quick brown fox",
  "Write one sentence explaining what a firewall does.",
  "List three words that describe a phishing email.",
] as const;

function buildTurnId() {
  return `lab1-turn-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
}

function toConversation(messages: ChatTurn[]) {
  return messages.map(({ content, role }) => ({ content, role }));
}

function renderTooltip(token: Lab1ResponseToken) {
  return (
    <span className="lab1-confidence__tooltip">
      <strong>{formatProbabilityPercent(token.probability)}</strong>
      {token.topAlternatives.length > 0 ? (
        <span className="lab1-confidence__tooltip-list">
          {token.topAlternatives.map((candidate) => (
            <span key={`${token.token}-${candidate.token}`}>
              {formatProbabilityPercent(candidate.probability)}:{" "}
              <code>{candidate.token}</code>
            </span>
          ))}
        </span>
      ) : (
        <span className="lab1-confidence__tooltip-list">
          <span>No alternate tokens returned for this position.</span>
        </span>
      )}
    </span>
  );
}

export function Lab1ConfidenceChat() {
  const [draft, setDraft] = useState<string>(starterPrompts[0]);
  const [messages, setMessages] = useState<ChatTurn[]>([]);
  const [error, setError] = useState<string | null>(null);
  const [isSubmitting, setIsSubmitting] = useState(false);

  async function handleSubmit(event: FormEvent<HTMLFormElement>) {
    event.preventDefault();

    const prompt = draft.trim();
    if (!prompt) {
      setError("Enter a prompt to inspect the model output.");
      return;
    }

    const nextUserTurn: UserTurn = {
      content: prompt,
      id: buildTurnId(),
      role: "user",
    };
    const nextConversation = [...messages, nextUserTurn];

    setMessages(nextConversation);
    setDraft("");
    setError(null);
    setIsSubmitting(true);

    try {
      const response = await fetch("/api/lab1/chat", {
        body: JSON.stringify({
          messages: toConversation(nextConversation),
        }),
        headers: {
          "Content-Type": "application/json",
        },
        method: "POST",
      });

      const payload = (await response.json()) as Lab1ConfidenceResponse & {
        error?: string;
      };

      if (!response.ok) {
        throw new Error(payload.error || "The local Ollama request failed.");
      }

      setMessages((currentMessages) => [
        ...currentMessages,
        {
          ...payload,
          id: buildTurnId(),
        },
      ]);
    } catch (caughtError) {
      setError(
        caughtError instanceof Error
          ? caughtError.message
          : "The local Ollama request failed.",
      );
    } finally {
      setIsSubmitting(false);
    }
  }

  return (
    <section className="lab1-confidence" data-widget-enhanced="true">
      <div className="lab1-confidence__header">
        <p className="lab1-confidence__eyebrow">Lab 1 Confidence View</p>
        <h3>Visualize token confidence locally</h3>
        <p className="lab1-confidence__lede">
          This widget uses the preloaded local Lab 1 Qwen model. Hover over any
          output token to inspect its probability and the strongest alternate
          predictions returned for that position.
        </p>
      </div>

      <div className="lab1-confidence__prompt-row">
        {starterPrompts.map((prompt) => (
          <button
            className="lab1-confidence__prompt-chip"
            key={prompt}
            onClick={() => setDraft(prompt)}
            type="button"
          >
            {prompt}
          </button>
        ))}
      </div>

      <div className="lab1-confidence__transcript" aria-live="polite">
        {messages.length === 0 ? (
          <div className="lab1-confidence__empty">
            <strong>Try a short prompt first.</strong>
            <p>
              Start with one of the suggested prompts, then hover across the
              model output to compare high-confidence and low-confidence tokens.
            </p>
          </div>
        ) : (
          messages.map((message) => {
            if (message.role === "user") {
              return (
                <article
                  className="lab1-confidence__message lab1-confidence__message--user"
                  key={message.id}
                >
                  <div className="lab1-confidence__message-meta">
                    <span>You</span>
                  </div>
                  <pre className="lab1-confidence__message-body">
                    <code>{message.content}</code>
                  </pre>
                </article>
              );
            }

            return (
              <article className="lab1-confidence__message" key={message.id}>
                <div className="lab1-confidence__message-meta">
                  <span>Assistant</span>
                  <code>{message.model}</code>
                </div>

                <div className="lab1-confidence__token-stream" role="list">
                  {message.tokens.map((token, index) => (
                    <span
                      aria-label={`${token.token} ${formatProbabilityPercent(
                        token.probability,
                      )}`}
                      className={`lab1-confidence__token lab1-confidence__token--${getConfidenceBand(
                        token.probability,
                      )}`}
                      key={`${message.id}-${index}-${token.token}`}
                      role="listitem"
                    >
                      {token.token}
                      {renderTooltip(token)}
                    </span>
                  ))}
                </div>

                {message.error ? (
                  <p className="lab1-confidence__message-warning">
                    {message.error}
                  </p>
                ) : null}
              </article>
            );
          })
        )}
      </div>

      <form className="lab1-confidence__composer" onSubmit={handleSubmit}>
        <label
          className="lab1-confidence__composer-label"
          htmlFor="lab1-confidence-draft"
        >
          Prompt
        </label>
        <textarea
          id="lab1-confidence-draft"
          name="draft"
          onChange={(event) => setDraft(event.target.value)}
          placeholder="Ask a question or start a phrase and inspect the output."
          rows={5}
          value={draft}
        />

        <div className="lab1-confidence__composer-actions">
          <div className="lab1-confidence__composer-state">
            <span>Inference target</span>
            <strong>Local Lab 1 Qwen model</strong>
          </div>
          <button disabled={isSubmitting} type="submit">
            {isSubmitting ? "Generating..." : "Generate Output"}
          </button>
        </div>

        {error ? <p className="lab1-confidence__error">{error}</p> : null}
      </form>
    </section>
  );
}
@@ -0,0 +1,66 @@
"use client";

import { useEffect, useState } from "react";

import {
  fetchCoursewareRuntimeConfig,
  normalizeCoursewareRuntimeConfig,
} from "~/lib/courseware-runtime";

export function Lab1NetronPanel() {
  const [runtimeConfig, setRuntimeConfig] = useState(() =>
    normalizeCoursewareRuntimeConfig(),
  );
  const [isResolved, setIsResolved] = useState(false);

  useEffect(() => {
    let isCancelled = false;

    void fetchCoursewareRuntimeConfig()
      .then((nextConfig) => {
        if (isCancelled) return;
        setRuntimeConfig(nextConfig);
      })
      .catch(() => {
        if (isCancelled) return;
        setRuntimeConfig(normalizeCoursewareRuntimeConfig());
      })
      .finally(() => {
        if (isCancelled) return;
        setIsResolved(true);
      });

    return () => {
      isCancelled = true;
    };
  }, []);

  return (
    <section className="lab1-netron-panel" data-widget-enhanced="true">
      <div>
        <p className="lab1-netron-panel__eyebrow">Lab 1 Model Structure</p>
        <h3>Open Netron on port 8338</h3>
        <p className="lab1-netron-panel__lede">
          Netron runs as a local browser tool for this lab. Open it, then use
          the download links in this section to load each GGUF file manually.
        </p>
      </div>

      <div className="lab1-netron-panel__actions">
        <a
          className="lab1-netron-panel__primary"
          href={runtimeConfig.lab1NetronUrl}
          rel="noreferrer"
          target="_blank"
        >
          Open Netron
        </a>
        <span className="lab1-netron-panel__note" aria-live="polite">
          {isResolved
            ? `Netron URL: ${runtimeConfig.lab1NetronUrl}`
            : "Resolving the Netron URL..."}
        </span>
      </div>
    </section>
  );
}
@@ -51,7 +51,7 @@ describe("fetchLab3RuntimeConfig", () => {
     );

     await expect(fetchLab3RuntimeConfig()).resolves.toEqual({
-      terminalPath: "http://127.0.0.1:7681/wetty",
+      terminalPath: "http://localhost:7681/wetty",
     });

     expect(fetchMock).toHaveBeenCalledWith(LAB3_RUNTIME_CONFIG_PATH, {
@@ -95,7 +95,7 @@ describe("Lab3TerminalFrame", () => {
     await waitFor(() => {
       expect(screen.getByTitle("Lab 3 terminal session")).toHaveAttribute(
         "src",
-        "http://127.0.0.1:7681/wetty",
+        "http://localhost:7681/wetty",
       );
     });
     expect(
@@ -2,7 +2,10 @@
 import { useEffect, useRef, useState } from "react";

-import { fetchLab3RuntimeConfig, normalizeLab3RuntimeConfig } from "~/lib/lab3-runtime";
+import {
+  fetchCoursewareRuntimeConfig,
+  normalizeCoursewareRuntimeConfig,
+} from "~/lib/courseware-runtime";

 type Lab3TerminalFrameProps = {
   srcPath?: string;
@@ -14,18 +17,20 @@ export function Lab3TerminalFrame({ srcPath }: Lab3TerminalFrameProps) {
   const [status, setStatus] = useState<"loading" | "ready" | "error">("loading");
   const [isConfigResolved, setIsConfigResolved] = useState(Boolean(srcPath));
   const [runtimeConfig, setRuntimeConfig] = useState(() => {
-    return normalizeLab3RuntimeConfig(
+    return normalizeCoursewareRuntimeConfig(
       srcPath ? { lab3TerminalUrl: srcPath } : undefined,
     );
   });

-  const terminalPath = runtimeConfig.terminalPath;
+  const terminalPath = runtimeConfig.lab3TerminalUrl;

   useEffect(() => {
     let isCancelled = false;

     if (srcPath) {
-      setRuntimeConfig(normalizeLab3RuntimeConfig({ lab3TerminalUrl: srcPath }));
+      setRuntimeConfig(
+        normalizeCoursewareRuntimeConfig({ lab3TerminalUrl: srcPath }),
+      );
       setIsConfigResolved(true);
       return;
     }
@@ -33,14 +38,14 @@ export function Lab3TerminalFrame({ srcPath }: Lab3TerminalFrameProps) {
     setIsConfigResolved(false);
     setStatus("loading");

-    void fetchLab3RuntimeConfig()
+    void fetchCoursewareRuntimeConfig()
       .then((nextConfig) => {
         if (isCancelled) return;
         setRuntimeConfig(nextConfig);
       })
       .catch(() => {
         if (isCancelled) return;
-        setRuntimeConfig(normalizeLab3RuntimeConfig());
+        setRuntimeConfig(normalizeCoursewareRuntimeConfig());
       })
       .finally(() => {
         if (isCancelled) return;
@@ -0,0 +1,39 @@
import { render, screen, waitFor } from "@testing-library/react";
import { afterEach, describe, expect, it, vi } from "vitest";

import { LabContent } from "~/components/labs/LabContent";

describe("LabContent", () => {
  afterEach(() => {
    vi.restoreAllMocks();
  });

  it("renders the Lab 1 widget tokens into interactive components", async () => {
    vi.spyOn(globalThis, "fetch").mockResolvedValue(
      new Response(
        JSON.stringify({
          lab1NetronUrl: "http://127.0.0.1:8338",
          lab3TerminalUrl: "http://127.0.0.1:7681/wetty",
        }),
        { status: 200 },
      ),
    );

    render(
      <LabContent
        className="lab-content"
        html={[
          "<div data-lab1-netron-panel></div>",
          "<div data-tokenizer-playground></div>",
          "<div data-lab1-confidence></div>",
        ].join("")}
      />,
    );

    await waitFor(() => {
      expect(screen.getByRole("link", { name: "Open Netron" })).toBeInTheDocument();
    });
    expect(screen.getByText("Tokenizer Playground")).toBeInTheDocument();
    expect(screen.getByText("Visualize token confidence locally")).toBeInTheDocument();
  });
});
@@ -1,10 +1,13 @@
 "use client";

 import { Fragment, useEffect, useRef, useState } from "react";
+import { Lab1ConfidenceChat } from "~/components/labs/Lab1ConfidenceChat";
+import { Lab1NetronPanel } from "~/components/labs/Lab1NetronPanel";
 import { Lab3TerminalFrame } from "~/components/labs/Lab3TerminalFrame";
 import { Objective5Chat } from "~/components/labs/Objective5Chat";
 import { QuantizationGridExplorer } from "~/components/labs/QuantizationGridExplorer";
 import { QuantizationExplorer } from "~/components/labs/QuantizationExplorer";
+import { TokenizerPlaygroundEmbed } from "~/components/labs/TokenizerPlaygroundEmbed";

 type LabContentProps = {
   className: string;
@@ -35,6 +38,9 @@ const quantizationGridExplorerToken =
   "<div data-quantization-grid-explorer></div>";
 const objective5ChatToken = "<div data-objective5-chat></div>";
 const lab3TerminalToken = "<div data-lab3-terminal></div>";
+const lab1ConfidenceToken = "<div data-lab1-confidence></div>";
+const lab1NetronToken = "<div data-lab1-netron-panel></div>";
+const tokenizerPlaygroundToken = "<div data-tokenizer-playground></div>";

 function looksLikeCliCommand(commandText: string, className: string) {
   if (cliLanguagePattern.test(className)) return true;
@@ -199,7 +205,7 @@ export function LabContent({ className, html }: LabContentProps) {
   const renderedContent = html
     .split(
       new RegExp(
-        `(${escapeRegex(quantizationExplorerToken)}|${escapeRegex(quantizationGridExplorerToken)}|${escapeRegex(objective5ChatToken)}|${escapeRegex(lab3TerminalToken)})`,
+        `(${escapeRegex(quantizationExplorerToken)}|${escapeRegex(quantizationGridExplorerToken)}|${escapeRegex(objective5ChatToken)}|${escapeRegex(lab3TerminalToken)}|${escapeRegex(lab1ConfidenceToken)}|${escapeRegex(lab1NetronToken)}|${escapeRegex(tokenizerPlaygroundToken)})`,
         "g",
       ),
     )
@@ -225,6 +231,18 @@ export function LabContent({ className, html }: LabContentProps) {
         return <Lab3TerminalFrame key={`lab3-terminal-${index}`} />;
       }

+      if (part === lab1ConfidenceToken) {
+        return <Lab1ConfidenceChat key={`lab1-confidence-${index}`} />;
+      }
+
+      if (part === lab1NetronToken) {
+        return <Lab1NetronPanel key={`lab1-netron-${index}`} />;
+      }
+
+      if (part === tokenizerPlaygroundToken) {
+        return <TokenizerPlaygroundEmbed key={`tokenizer-playground-${index}`} />;
+      }
+
       return (
         <Fragment key={`html-segment-${index}`}>
           <div dangerouslySetInnerHTML={{ __html: part }} />
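The hunk above extends LabContent's sentinel-splitting approach: the rendered lab HTML is split on literal widget `<div>` markers, and each captured marker is replaced with a live React component while the surrounding segments remain plain HTML. A self-contained sketch of that mechanism (this `escapeRegex` is a standard re-implementation for illustration; the real helper lives elsewhere in LabContent):

```typescript
// Escape regex metacharacters so a literal HTML token can be used in a RegExp.
function escapeRegex(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const token = "<div data-lab1-confidence></div>";
const html = `<p>intro</p>${token}<p>outro</p>`;

// Splitting on a capturing group keeps the sentinel itself in the output
// array, so it can be matched and swapped for a component during rendering.
const parts = html.split(new RegExp(`(${escapeRegex(token)})`, "g"));
console.log(parts);
// → ["<p>intro</p>", "<div data-lab1-confidence></div>", "<p>outro</p>"]
```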
@@ -0,0 +1,67 @@
"use client";

import { useState } from "react";

type LabIFrameEmbedProps = {
  eyebrow: string;
  heading: string;
  lede: string;
  src: string;
  title: string;
};

export function LabIFrameEmbed({
  eyebrow,
  heading,
  lede,
  src,
  title,
}: LabIFrameEmbedProps) {
  const [hasLoaded, setHasLoaded] = useState(false);
  const [hasErrored, setHasErrored] = useState(false);

  return (
    <section className="lab-iframe-embed" data-widget-enhanced="true">
      <div className="lab-iframe-embed__header">
        <p className="lab-iframe-embed__eyebrow">{eyebrow}</p>
        <h3>{heading}</h3>
        <p className="lab-iframe-embed__lede">{lede}</p>
      </div>

      <div className="lab-iframe-embed__actions">
        <a
          className="lab-iframe-embed__link"
          href={src}
          rel="noreferrer"
          target="_blank"
        >
          Open in New Tab
        </a>
        <span className="lab-iframe-embed__status" aria-live="polite">
          {hasErrored
            ? "The embedded view is unavailable right now."
            : hasLoaded
              ? "The embedded tool is ready below."
              : "Loading the embedded tool..."}
        </span>
      </div>

      <div className="lab-iframe-embed__frame-shell">
        <iframe
          className="lab-iframe-embed__frame"
          loading="lazy"
          onError={() => setHasErrored(true)}
          onLoad={() => setHasLoaded(true)}
          referrerPolicy="no-referrer"
          src={src}
          title={title}
        />
      </div>

      <p className="lab-iframe-embed__fallback">
        If the embedded tool does not appear, open it in a new tab and continue
        from there.
      </p>
    </section>
  );
}
@@ -0,0 +1,19 @@
import { render, screen } from "@testing-library/react";
import { describe, expect, it } from "vitest";

import { TokenizerPlaygroundEmbed } from "~/components/labs/TokenizerPlaygroundEmbed";

describe("TokenizerPlaygroundEmbed", () => {
  it("renders the iframe wrapper and fallback link", () => {
    render(<TokenizerPlaygroundEmbed />);

    expect(screen.getByTitle("Tokenizer playground")).toHaveAttribute(
      "src",
      "https://xenova-the-tokenizer-playground.static.hf.space",
    );
    expect(screen.getByRole("link", { name: "Open in New Tab" })).toHaveAttribute(
      "href",
      "https://xenova-the-tokenizer-playground.static.hf.space",
    );
  });
});
@@ -0,0 +1,18 @@
"use client";

import { LabIFrameEmbed } from "~/components/labs/LabIFrameEmbed";

const TOKENIZER_PLAYGROUND_URL =
  "https://xenova-the-tokenizer-playground.static.hf.space";

export function TokenizerPlaygroundEmbed() {
  return (
    <LabIFrameEmbed
      eyebrow="Lab 1 Tokenization"
      heading="Tokenizer Playground"
      lede="Try several prompts, compare how the text is segmented, and then inspect the token IDs that the model will actually consume."
      src={TOKENIZER_PLAYGROUND_URL}
      title="Tokenizer playground"
    />
  );
}
@@ -0,0 +1,94 @@
export const COURSEWARE_RUNTIME_CONFIG_PATH = "/courseware-runtime.json";
export const LAB1_DEFAULT_NETRON_URL = "http://127.0.0.1:8338";
export const LAB3_DEFAULT_TERMINAL_PATH = "/wetty";

export type CoursewareRuntimeConfig = {
  lab1NetronUrl?: string;
  lab3TerminalUrl?: string;
};

export type ResolvedCoursewareRuntimeConfig = {
  lab1NetronUrl: string;
  lab3TerminalUrl: string;
};

const loopbackHosts = new Set(["127.0.0.1", "localhost", "::1"]);

function rewriteLoopbackHost(urlValue: string, currentHostname?: string) {
  try {
    const url = new URL(urlValue);
    if (!currentHostname || !loopbackHosts.has(url.hostname)) {
      return url.toString();
    }

    url.hostname = currentHostname;
    return url.toString();
  } catch {
    return urlValue;
  }
}

function getCurrentHostname() {
  if (typeof window === "undefined") {
    return undefined;
  }

  const hostname = window.location.hostname?.trim();
  return hostname || undefined;
}

export function getLab1NetronUrl(
  envValue?: string,
  currentHostname = getCurrentHostname(),
) {
  const trimmedValue = envValue?.trim();

  if (!trimmedValue) {
    return rewriteLoopbackHost(LAB1_DEFAULT_NETRON_URL, currentHostname);
  }

  return rewriteLoopbackHost(trimmedValue, currentHostname);
}

export function getLab3TerminalPath(
  envValue?: string,
  currentHostname = getCurrentHostname(),
) {
  const trimmedValue = envValue?.trim();

  if (!trimmedValue) {
    return LAB3_DEFAULT_TERMINAL_PATH;
  }

  if (/^https?:\/\//i.test(trimmedValue)) {
    return rewriteLoopbackHost(trimmedValue, currentHostname);
  }

  return trimmedValue.startsWith("/") ? trimmedValue : `/${trimmedValue}`;
}

export function normalizeCoursewareRuntimeConfig(
  config?: CoursewareRuntimeConfig,
  currentHostname = getCurrentHostname(),
): ResolvedCoursewareRuntimeConfig {
  return {
    lab1NetronUrl: getLab1NetronUrl(config?.lab1NetronUrl, currentHostname),
    lab3TerminalUrl: getLab3TerminalPath(
      config?.lab3TerminalUrl,
      currentHostname,
    ),
  };
}

export async function fetchCoursewareRuntimeConfig() {
  const response = await fetch(COURSEWARE_RUNTIME_CONFIG_PATH, {
    cache: "no-store",
  });

  if (!response.ok) {
    throw new Error(`Runtime config request failed: ${response.status}`);
  }

  const config = (await response.json()) as CoursewareRuntimeConfig;
  return normalizeCoursewareRuntimeConfig(config);
}
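The key trick in this module is rewriting loopback hosts: a default like `http://127.0.0.1:8338` only works when the browser runs on the lab VM itself, so the helper re-points it at whatever hostname the courseware page was actually served from. A standalone sketch of that behavior (the hostname `lab.example.com` is a made-up example, and the function is re-declared here rather than imported from the repo):

```typescript
// Sketch of the loopback-rewrite behavior; not the repo module itself.
const loopbackHosts = new Set(["127.0.0.1", "localhost", "::1"]);

function rewriteLoopbackHost(urlValue: string, currentHostname?: string): string {
  try {
    const url = new URL(urlValue);
    // Non-loopback URLs (and calls without a hostname) pass through untouched.
    if (!currentHostname || !loopbackHosts.has(url.hostname)) {
      return url.toString();
    }
    // Loopback URLs are re-pointed at the host serving the courseware page.
    url.hostname = currentHostname;
    return url.toString();
  } catch {
    // Unparseable values are returned unchanged rather than throwing.
    return urlValue;
  }
}

console.log(rewriteLoopbackHost("http://127.0.0.1:8338", "lab.example.com"));
// → http://lab.example.com:8338/
console.log(rewriteLoopbackHost("https://netron.app/", "lab.example.com"));
// → https://netron.app/
```

Note that `URL.toString()` normalizes the value (adding a trailing slash), so the rewritten URL is canonical even when the input was not.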
@@ -0,0 +1,85 @@
import { describe, expect, it } from "vitest";

import {
  extractLab1AssistantContent,
  extractLab1ResponseTokens,
  formatProbabilityPercent,
  getConfidenceBand,
  logprobToProbabilityPercent,
} from "~/lib/lab1-confidence";

describe("logprobToProbabilityPercent", () => {
  it("converts a logprob into a rounded percentage", () => {
    expect(logprobToProbabilityPercent(Math.log(0.4))).toBe(40);
  });
});

describe("extractLab1AssistantContent", () => {
  it("reads assistant content from an OpenAI-compatible response", () => {
    expect(
      extractLab1AssistantContent({
        choices: [
          {
            message: {
              content: "hello from the local model",
            },
          },
        ],
      }),
    ).toBe("hello from the local model");
  });
});

describe("extractLab1ResponseTokens", () => {
  it("maps token logprobs and alternate candidates into display data", () => {
    expect(
      extractLab1ResponseTokens({
        choices: [
          {
            logprobs: {
              content: [
                {
                  logprob: Math.log(0.4),
                  token: "often",
                  top_logprobs: [
                    { logprob: Math.log(0.4), token: "often" },
                    { logprob: Math.log(0.14), token: "commonly" },
                    { logprob: Math.log(0.1), token: "also" },
                  ],
                },
              ],
            },
          },
        ],
      }),
    ).toEqual([
      {
        logprob: Math.log(0.4),
        probability: 40,
        token: "often",
        topAlternatives: [
          { probability: 14, token: "commonly" },
          { probability: 10, token: "also" },
        ],
      },
    ]);
  });
});

describe("getConfidenceBand", () => {
  it("assigns a stable band for each probability range", () => {
    expect(getConfidenceBand(75)).toBe("very-high");
    expect(getConfidenceBand(45)).toBe("high");
    expect(getConfidenceBand(20)).toBe("medium");
    expect(getConfidenceBand(7)).toBe("low");
    expect(getConfidenceBand(1)).toBe("very-low");
  });
});

describe("formatProbabilityPercent", () => {
  it("formats probability values for tooltip display", () => {
    expect(formatProbabilityPercent(40)).toBe("40.0%");
    expect(formatProbabilityPercent(4.2)).toBe("4.20%");
    expect(formatProbabilityPercent(0.456)).toBe("0.456%");
  });
});
@@ -0,0 +1,172 @@
export const LAB1_CONFIDENCE_MODEL_ALIAS = "lab1-qwen3-0.6b-q8_0";
export const LAB1_DEFAULT_MAX_TOKENS = 64;
export const LAB1_DEFAULT_TEMPERATURE = 0.7;
export const LAB1_MAX_CONTEXT_MESSAGES = 10;
export const LAB1_MAX_MESSAGE_LENGTH = 4000;

export type Lab1ConfidenceRole = "assistant" | "user";

export type Lab1ConfidenceMessage = {
  content: string;
  role: Lab1ConfidenceRole;
};

export type Lab1TopAlternative = {
  probability: number;
  token: string;
};

export type Lab1ResponseToken = {
  logprob: number;
  probability: number;
  token: string;
  topAlternatives: Lab1TopAlternative[];
};

export type Lab1ConfidenceResponse = {
  content: string;
  model: string;
  role: "assistant";
  tokens: Lab1ResponseToken[];
};

type OpenAiLogprobAlternative = {
  logprob?: number;
  token?: string;
};

type OpenAiLogprobToken = {
  logprob?: number;
  token?: string;
  top_logprobs?: OpenAiLogprobAlternative[];
};

type OpenAiCompatibilityPayload = {
  choices?: Array<{
    logprobs?: {
      content?: OpenAiLogprobToken[];
    };
    message?: {
      content?: string;
    };
  }>;
  model?: string;
};

export function getLab1SystemPrompt() {
  return [
    "You are helping students inspect token-level confidence in a local language model.",
    "Reply clearly and concisely.",
    "Prefer one compact paragraph unless the user explicitly asks for a list.",
  ].join(" ");
}

export function clampLab1Messages(messages: Lab1ConfidenceMessage[]) {
  return messages
    .filter((message) => {
      return (
        (message.role === "assistant" || message.role === "user") &&
        typeof message.content === "string"
      );
    })
    .map((message) => {
      return {
        content: message.content.slice(0, LAB1_MAX_MESSAGE_LENGTH),
        role: message.role,
      } satisfies Lab1ConfidenceMessage;
    })
    .slice(-LAB1_MAX_CONTEXT_MESSAGES);
}

export function logprobToProbabilityPercent(logprob: number) {
  if (!Number.isFinite(logprob)) {
    return 0;
  }

  return roundProbability(Math.exp(logprob) * 100);
}

export function formatProbabilityPercent(probability: number) {
  if (probability >= 10) {
    return `${probability.toFixed(1)}%`;
  }

  if (probability >= 1) {
    return `${probability.toFixed(2)}%`;
  }

  if (probability > 0) {
    return `${probability.toFixed(3)}%`;
  }

  return "0%";
}

export function getConfidenceBand(probability: number) {
  if (probability >= 70) return "very-high";
  if (probability >= 40) return "high";
  if (probability >= 15) return "medium";
  if (probability >= 5) return "low";
  return "very-low";
}

export function extractLab1AssistantContent(payload: OpenAiCompatibilityPayload) {
  const content = payload.choices?.[0]?.message?.content?.trim();
  return content || null;
}

export function extractLab1ResponseTokens(
  payload: OpenAiCompatibilityPayload,
): Lab1ResponseToken[] {
  const rawTokens = payload.choices?.[0]?.logprobs?.content;
  if (!Array.isArray(rawTokens)) {
    return [];
  }

  return rawTokens.flatMap((rawToken) => {
    const token = rawToken.token ?? "";
    const logprob = rawToken.logprob;

    if (!token || typeof logprob !== "number" || !Number.isFinite(logprob)) {
      return [];
    }

    const topAlternatives = Array.isArray(rawToken.top_logprobs)
      ? rawToken.top_logprobs
          .flatMap((candidate) => {
            const candidateToken = candidate.token ?? "";
            const candidateLogprob = candidate.logprob;

            if (
              !candidateToken ||
              candidateToken === token ||
              typeof candidateLogprob !== "number" ||
              !Number.isFinite(candidateLogprob)
            ) {
              return [];
            }

            return [
              {
                probability: logprobToProbabilityPercent(candidateLogprob),
                token: candidateToken,
              } satisfies Lab1TopAlternative,
            ];
          })
          .slice(0, 5)
      : [];

    return [
      {
        logprob,
        probability: logprobToProbabilityPercent(logprob),
        token,
        topAlternatives,
      } satisfies Lab1ResponseToken,
    ];
  });
}

function roundProbability(value: number) {
  return Math.max(0, Math.min(100, Math.round(value * 100) / 100));
}
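The confidence view rests on one conversion: an OpenAI-style logprob is exponentiated into a probability, rounded to two decimals, and then bucketed into a display band. A self-contained sketch of that pipeline (the helpers are re-declared here, not imported from `~/lib/lab1-confidence`):

```typescript
// Sketch of the logprob -> percentage -> band mapping used for token coloring.
function roundProbability(value: number): number {
  // Clamp to [0, 100] and round to two decimal places.
  return Math.max(0, Math.min(100, Math.round(value * 100) / 100));
}

function logprobToProbabilityPercent(logprob: number): number {
  // Non-finite logprobs (e.g. -Infinity for impossible tokens) map to 0%.
  if (!Number.isFinite(logprob)) return 0;
  return roundProbability(Math.exp(logprob) * 100);
}

function getConfidenceBand(probability: number): string {
  if (probability >= 70) return "very-high";
  if (probability >= 40) return "high";
  if (probability >= 15) return "medium";
  if (probability >= 5) return "low";
  return "very-low";
}

// A token emitted with logprob ln(0.4) carries 40% probability, so it lands
// in the "high" band and gets the light-green highlight.
const p = logprobToProbabilityPercent(Math.log(0.4));
console.log(p, getConfidenceBand(p)); // prints: 40 high
```

The two-decimal rounding also absorbs floating-point noise from the `exp(log(x))` round trip, which is why the unit test above can assert an exact `40`.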
@@ -1,43 +1,39 @@
-export const LAB3_DEFAULT_TERMINAL_PATH = "/wetty";
-export const LAB3_RUNTIME_CONFIG_PATH = "/courseware-runtime.json";
+import {
+  COURSEWARE_RUNTIME_CONFIG_PATH,
+  LAB3_DEFAULT_TERMINAL_PATH,
+  fetchCoursewareRuntimeConfig,
+  getLab3TerminalPath as getSharedLab3TerminalPath,
+  normalizeCoursewareRuntimeConfig,
+  type CoursewareRuntimeConfig,
+} from "~/lib/courseware-runtime";
 
-export type Lab3RuntimeConfig = {
-  lab3TerminalUrl?: string;
-};
+export const LAB3_RUNTIME_CONFIG_PATH = COURSEWARE_RUNTIME_CONFIG_PATH;
+export { LAB3_DEFAULT_TERMINAL_PATH };
+
+export type Lab3RuntimeConfig = Pick<CoursewareRuntimeConfig, "lab3TerminalUrl">;
 
 export type ResolvedLab3RuntimeConfig = {
   terminalPath: string;
 };
 
 export function getLab3TerminalPath(envValue?: string) {
-  const trimmedValue = envValue?.trim();
-
-  if (!trimmedValue) {
-    return LAB3_DEFAULT_TERMINAL_PATH;
-  }
-
-  if (/^https?:\/\//i.test(trimmedValue)) {
-    return trimmedValue;
-  }
-
-  return trimmedValue.startsWith("/") ? trimmedValue : `/${trimmedValue}`;
+  return getSharedLab3TerminalPath(envValue);
 }
 
 export function normalizeLab3RuntimeConfig(
   config?: Lab3RuntimeConfig,
 ): ResolvedLab3RuntimeConfig {
+  const runtimeConfig = normalizeCoursewareRuntimeConfig(config);
+
   return {
-    terminalPath: getLab3TerminalPath(config?.lab3TerminalUrl),
+    terminalPath: runtimeConfig.lab3TerminalUrl,
   };
 }
 
 export async function fetchLab3RuntimeConfig() {
-  const response = await fetch(LAB3_RUNTIME_CONFIG_PATH, { cache: "no-store" });
-
-  if (!response.ok) {
-    throw new Error(`Runtime config request failed: ${response.status}`);
-  }
-
-  const config = (await response.json()) as Lab3RuntimeConfig;
-  return normalizeLab3RuntimeConfig(config);
+  const runtimeConfig = await fetchCoursewareRuntimeConfig();
+
+  return {
+    terminalPath: runtimeConfig.lab3TerminalUrl,
+  };
 }
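The lab3 module now delegates to the shared helper, but the path normalization it exposes is worth seeing on its own: a missing value falls back to `/wetty`, absolute `http(s)` URLs pass through (the shared helper additionally rewrites loopback hosts, elided here), and bare names gain a leading slash. A minimal standalone sketch:

```typescript
// Sketch of the terminal-path normalization; the real module delegates this
// to getLab3TerminalPath in ~/lib/courseware-runtime. The loopback-host
// rewrite branch is intentionally elided in this sketch.
function getLab3TerminalPath(envValue?: string): string {
  const trimmed = envValue?.trim();
  if (!trimmed) return "/wetty";                    // default terminal path
  if (/^https?:\/\//i.test(trimmed)) return trimmed; // full URLs pass through
  return trimmed.startsWith("/") ? trimmed : `/${trimmed}`;
}

console.log(getLab3TerminalPath());            // → /wetty
console.log(getLab3TerminalPath("terminal"));  // → /terminal
console.log(getLab3TerminalPath(" /custom ")); // → /custom
```

Returning a path (rather than a full URL) by default lets the terminal iframe resolve against whatever origin serves the courseware.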
@@ -1747,3 +1747,318 @@ ol {
     height: 18rem;
   }
 }
 
+.lab-content [data-lab1-confidence],
+.lab-content [data-lab1-netron-panel],
+.lab-content [data-tokenizer-playground] {
+  display: block;
+  margin: 1.75rem 0;
+}
+
+.lab-content .lab-screenshot-placeholder {
+  margin: 1.25rem 0;
+  border: 1px dashed #b8c6d6;
+  border-radius: 1rem;
+  background:
+    linear-gradient(135deg, rgba(0, 78, 120, 0.04), rgba(248, 156, 39, 0.08));
+  padding: 1rem 1.1rem;
+  color: #345;
+}
+
+.lab-content .lab-screenshot-placeholder strong {
+  display: block;
+  margin-bottom: 0.35rem;
+  color: #004e78;
+}
+
+.lab1-netron-panel,
+.lab-iframe-embed,
+.lab1-confidence {
+  border: 1px solid #d7e2ee;
+  border-radius: 1.25rem;
+  background: linear-gradient(180deg, #fff 0%, #f7fafc 100%);
+  box-shadow: 0 18px 40px -34px rgba(0, 78, 120, 0.55);
+  padding: 1.35rem;
+}
+
+.lab1-netron-panel h3,
+.lab-iframe-embed h3,
+.lab1-confidence h3 {
+  margin: 0;
+  color: #12344d;
+}
+
+.lab1-netron-panel__eyebrow,
+.lab-iframe-embed__eyebrow,
+.lab1-confidence__eyebrow {
+  margin: 0 0 0.35rem;
+  color: #0f5c8b;
+  font-size: 0.8rem;
+  font-weight: 700;
+  letter-spacing: 0.08em;
+  text-transform: uppercase;
+}
+
+.lab1-netron-panel__lede,
+.lab-iframe-embed__lede,
+.lab1-confidence__lede {
+  margin: 0.55rem 0 0;
+  color: #53687b;
+}
+
+.lab1-netron-panel__actions,
+.lab-iframe-embed__actions {
+  display: flex;
+  align-items: center;
+  justify-content: space-between;
+  gap: 0.9rem;
+  margin-top: 1rem;
+  flex-wrap: wrap;
+}
+
+.lab1-netron-panel__primary,
+.lab-iframe-embed__link,
+.lab1-confidence__composer button,
+.lab1-confidence__prompt-chip {
+  border-radius: 999px;
+  border: 1px solid #0f5c8b;
+  background: #0f5c8b;
+  color: #fff;
+  cursor: pointer;
+  font: inherit;
+  font-weight: 600;
+  padding: 0.68rem 1.05rem;
+  text-decoration: none;
+  transition:
+    transform 120ms ease,
+    box-shadow 120ms ease,
+    background-color 120ms ease;
+}
+
+.lab1-netron-panel__primary:hover,
+.lab-iframe-embed__link:hover,
+.lab1-confidence__composer button:hover,
+.lab1-confidence__prompt-chip:hover {
+  transform: translateY(-1px);
+  box-shadow: 0 12px 28px -22px rgba(15, 92, 139, 0.85);
+}
+
+.lab1-netron-panel__note,
+.lab-iframe-embed__status,
+.lab-iframe-embed__fallback {
+  color: #63788d;
+  font-size: 0.92rem;
+}
+
+.lab-iframe-embed__frame-shell {
+  margin-top: 1rem;
+  overflow: hidden;
+  border: 1px solid #d7e2ee;
+  border-radius: 1rem;
+  background: #fff;
+}
+
+.lab-iframe-embed__frame {
+  display: block;
+  width: 100%;
+  min-height: 560px;
+  border: 0;
+  background: #f8fafc;
+}
+
+.lab1-confidence__header {
+  display: grid;
+  gap: 0.2rem;
+}
+
+.lab1-confidence__prompt-row {
+  display: flex;
+  flex-wrap: wrap;
+  gap: 0.65rem;
+  margin-top: 1rem;
+}
+
+.lab1-confidence__prompt-chip {
+  background: #f3f8fc;
+  border-color: #d1dfeb;
+  color: #0f5c8b;
+}
+
+.lab1-confidence__transcript {
+  display: grid;
+  gap: 1rem;
+  margin-top: 1rem;
+}
+
+.lab1-confidence__empty,
+.lab1-confidence__message {
+  border: 1px solid #d7e2ee;
+  border-radius: 1rem;
+  background: #fff;
+  padding: 1rem;
+}
+
+.lab1-confidence__message--user {
+  background: linear-gradient(180deg, #f8fbff 0%, #fff 100%);
+}
+
+.lab1-confidence__message-meta {
+  display: flex;
+  align-items: center;
+  justify-content: space-between;
+  gap: 0.75rem;
+  color: #456;
+  font-size: 0.92rem;
+  font-weight: 600;
+}
+
+.lab1-confidence__message-meta code {
+  color: #0f5c8b;
+}
+
+.lab1-confidence__message-body {
+  margin: 0.75rem 0 0;
+  white-space: pre-wrap;
+}
+
+.lab1-confidence__token-stream {
+  margin-top: 0.85rem;
+  padding: 0.9rem;
+  border-radius: 0.95rem;
+  background: #f8fbfd;
+  border: 1px solid #e0e8f0;
+  line-height: 2;
+  white-space: pre-wrap;
+}
+
+.lab1-confidence__token {
+  position: relative;
+  border-radius: 0.42rem;
+  padding: 0.12rem 0.08rem;
+  transition: filter 120ms ease;
+}
+
+.lab1-confidence__token:hover {
+  filter: saturate(1.05);
+}
+
+.lab1-confidence__token--very-high {
+  background: rgba(88, 185, 102, 0.3);
+}
+
+.lab1-confidence__token--high {
+  background: rgba(149, 209, 102, 0.26);
+}
+
+.lab1-confidence__token--medium {
+  background: rgba(242, 220, 96, 0.26);
+}
+
+.lab1-confidence__token--low {
+  background: rgba(246, 171, 82, 0.24);
+}
+
+.lab1-confidence__token--very-low {
+  background: rgba(233, 117, 89, 0.24);
+}
+
+.lab1-confidence__tooltip {
+  position: absolute;
+  left: 0;
+  top: calc(100% + 0.45rem);
+  z-index: 5;
+  display: none;
+  min-width: 180px;
+  max-width: 260px;
+  border: 1px solid #d7e2ee;
+  border-radius: 0.85rem;
+  background: rgba(255, 255, 255, 0.98);
+  box-shadow: 0 18px 38px -26px rgba(17, 44, 73, 0.7);
+  color: #24384c;
+  padding: 0.7rem 0.8rem;
+  white-space: normal;
+}
+
+.lab1-confidence__token:hover .lab1-confidence__tooltip,
+.lab1-confidence__token:focus-visible .lab1-confidence__tooltip {
+  display: block;
+}
+
+.lab1-confidence__tooltip strong {
+  display: block;
+  color: #0f5c8b;
+}
+
+.lab1-confidence__tooltip-list {
+  display: grid;
+  gap: 0.22rem;
+  margin-top: 0.35rem;
+  font-size: 0.88rem;
+}
+
+.lab1-confidence__composer {
+  display: grid;
+  gap: 0.7rem;
+  margin-top: 1rem;
+}
+
+.lab1-confidence__composer-label {
+  color: #12344d;
+  font-weight: 700;
+}
+
+.lab1-confidence__composer textarea {
+  width: 100%;
+  min-height: 110px;
+  border: 1px solid #cedbe8;
+  border-radius: 1rem;
+  padding: 0.95rem 1rem;
+  font: inherit;
+  resize: vertical;
+}
+
+.lab1-confidence__composer textarea:focus {
+  outline: 2px solid rgba(15, 92, 139, 0.18);
+  outline-offset: 1px;
+}
+
+.lab1-confidence__composer-actions {
+  display: flex;
+  align-items: center;
+  justify-content: space-between;
+  gap: 1rem;
+  flex-wrap: wrap;
+}
+
+.lab1-confidence__composer-state {
+  display: grid;
+  gap: 0.18rem;
+}
+
+.lab1-confidence__composer-state span {
+  color: #63788d;
+  font-size: 0.88rem;
+}
+
+.lab1-confidence__composer-state strong {
+  color: #12344d;
+}
+
+.lab1-confidence__message-warning,
+.lab1-confidence__error {
+  color: #b54731;
+}
+
+@media (max-width: 768px) {
+  .lab-iframe-embed__actions,
+  .lab1-confidence__composer-actions,
+  .lab1-netron-panel__actions,
+  .lab1-confidence__message-meta {
+    align-items: stretch;
+    flex-direction: column;
+  }
+
+  .lab-iframe-embed__frame {
+    min-height: 440px;
+  }
+}