New Lab 2

2026-04-07 16:02:48 -06:00
parent 6bcebd55ee
commit 9f3af49845
65 changed files with 6650 additions and 1553 deletions
@@ -1,11 +1,19 @@
+---
+order: 1
+title: Lab 1 - Visualizing LLMs in TransformerLab
+description: Explore model structure, tokenization, and next-token prediction inside TransformerLab.
+---
+
 <!-- breakout-style: instruction-rails -->
 <!-- step-style: underline -->
 <!-- objective-style: divider -->

 # Lab 1 - Visualizing LLMs in TransformerLab
+
 In this lab, we will:
-* Download and Visualize LLama-3.2-1B-Instruct
-* Visualize Tokenization & Prediction with LLama-3.2-1B-Instruct
+
+- Download and Visualize LLama-3.2-1B-Instruct
+- Visualize Tokenization & Prediction with LLama-3.2-1B-Instruct

 <div class="lab-callout lab-callout--info">
  <strong>Lab Flow Guide</strong><br />
@@ -13,14 +21,13 @@ In this lab, we will:
  <strong>Execute</strong> steps require performing actions in the lab environment.
 </div>

-
 ## Objective 1: Starting TransformerLab

 ### Execute: Access the Lab Environment

 To start Lab 1, ensure you've received a WireGuard configuration and system IP from your instructor. If you're unfamiliar with WireGuard, assistance will be provided to ensure you can access the lab environment for the duration of class.

-All systems use the default username and password of `student`. All labs are located in the student home folder. To start Lab 1, run 
+All systems use the default username and password of `student`. All labs are located in the student home folder. To start Lab 1, run

 ```bash
 ~/lab1/lab1_start.sh
@@ -28,7 +35,7 @@ All systems use the default username and password of `student`. All labs are loc

 using the `lab1_start.sh` script in the `lab1` folder.

-Lastly, if necessary, you can `su -` to root at any time. No password will be required.  
+Lastly, if necessary, you can `su -` to root at any time. No password will be required.

 Once started, you can reach TransformerLab on port 8338 of your Lab VM (`http://<IP>:8338`).

@@ -54,11 +61,11 @@ Navigate to **Plugins**, and in the search bar type `Fastchat`. Note that it has
    Plugins
  </figcaption>
 </figure>
-<br> 
+<br>

 ### Execute: Find and Load `LLama-3.2-1B-Instruct`

-Next, navigate to **Model Registry**.  You should see `LLama-3.2-1B-Instruct` right away on your screen, but if not, please start searching for this model using the search bar.
+Next, navigate to **Model Registry**. You should see `LLama-3.2-1B-Instruct` right away on your screen. If not, search for it using the search bar.

 <figure style="text-align: center;">
  <a href="https://i.imgur.com/UyWdnMR.png" target="_blank">
@@ -86,7 +93,7 @@ Once downloaded, Select **Foundation** & our newly downloaded `LLama-3.2-1B-Inst
 </figure>
 <br>

-Once selected, click **Run**.  Give TransformerLab a moment to successfully load the model.
+Once selected, click **Run**. Give TransformerLab a moment to successfully load the model.

 <figure style="text-align: center;">
  <a href="https://i.imgur.com/f4YcA8P.png" target="_blank">
@@ -101,8 +108,8 @@ Once selected, click **Run**.  Give TransformerLab a moment to successfully load
 <br>

 ### Explore: Inspect the Architecture View
-To start, lets navigate to the **Interact** page, and then select **Model Architecture** from the Chat drop down.

+To start, let's navigate to the **Interact** page, and then select **Model Architecture** from the Chat drop-down.

 <figure style="text-align: center;">
  <a href="https://i.imgur.com/X0CM31h.png" target="_blank">
@@ -117,8 +124,9 @@ To start, lets navigate to the **Interact** page, and then select **Model Archit
 <br>

 This page allows us to visualize the actively loaded model, in this case our downloaded `LLama-3.2-1B-Instruct`. This interactive view is equivalent to the greatly simplified version shown on the slide “Transformation: Multilayer Perceptron” from our lecture. We can explore this view as follows:
-* Holding down both right and left mouse buttons and dragging will move the entire model.
-* Holding down just the left mouse button will allow you to rotate the view.
+
+- Holding down both right and left mouse buttons and dragging will move the entire model.
+- Holding down just the left mouse button will allow you to rotate the view.

 <figure style="text-align: center;">
  <a href="https://i.imgur.com/8hXTGlt.png" target="_blank">
@@ -134,10 +142,11 @@ This page allows us to visualize the actively loaded model, in this case our dow

 ### Explore: Interpret Layers, Blocks, and Parameters

-Each layer of the model performs a specific task, taking the input provided, and transforming it into the statistically most likely completion of text, token by token.  This format of Llama 3.1 1B is made up of 372 **layers**.  Each layer will transform the input of the layer above it, until eventually, we end up with the statically likely completion.
-You have likely also noticed that the colors repeat.  Each set of repeating **layers** is organized into **blocks**.  Each **block** is a grouping of **layers** that perform the same functions, but with a slightly different focus.  For example, one **block** may focus on nouns, and another may focus on adjectives, and so on.  
+Each layer of the model performs a specific task, taking the input provided and transforming it into the statistically most likely completion of text, token by token. This format of Llama 3.2 1B is made up of 372 **layers**. Each layer transforms the output of the layer above it until, eventually, we end up with the statistically likely completion.
+You have likely also noticed that the colors repeat. Each set of repeating **layers** is organized into a **block**. Each **block** is a grouping of **layers** that perform the same functions, but with a slightly different focus. For example, one **block** may focus on nouns, another may focus on adjectives, and so on.

 The **layers** within Llama 3.2 1B are as follows:
+
 <ul class="concept-pill-list">
  <li>
    <span class="concept-pill-label">Attention:</span>
@@ -157,11 +166,10 @@ The **layers** within Llama 3.1 1B are as follows:
  </li>
 </ul>

-Each of these **layers** also has a different type, corresponding to Q, K, V, and much more.
-5. The **layers** between the small “Attention” **layers** are all considered to make up a single “block.”
-   To the side, we can see the actual number values of each weight within each layer.
+Each of these **layers** also has a different type, corresponding to Q, K, V, and more. The **layers** between the small “Attention” **layers** together make up a single “block.”
+To the side, we can see the actual numeric values of each weight within each layer.

-Fundamentally, the LLM itself is this stack of numbers.  Those numbers allow us to transform tokenized input (such as English), and transform that into a useful output.  The more **layers** & **blocks**, the bigger the model, the more accurate and “intelligent” the model will behave.  This 1B parameter model is incredibly small however, so the “truthfulness” of generated predictions is likely to be suspect (aka Hallucinated).  The model will at least sound very confident however!
+Fundamentally, the LLM itself is this stack of numbers. Those numbers allow us to take tokenized input (such as English text) and transform it into a useful output. In general, the more **layers** and **blocks** a model has, the larger it is, and the more accurate and “intelligent” it tends to behave. This 1B-parameter model is very small, however, so the “truthfulness” of its generated predictions is likely to be suspect (that is, hallucinated). The model will at least sound very confident, though!
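
To make the "stack of numbers" idea concrete, here is a minimal sketch that counts the weights in a simplified block. Everything in it is an assumption for illustration: the function names, the toy sizes, and the simplified block layout (four attention projections plus two feed-forward matrices, ignoring biases and normalization) are hypothetical and do not reproduce the real Llama 3.2 1B configuration.

```python
def block_parameter_count(hidden_size: int, mlp_size: int) -> int:
    """Count the weights in one simplified block (biases and norms ignored)."""
    # Four square projection matrices: Q, K, V, and the attention output.
    attention = 4 * hidden_size * hidden_size
    # Two feed-forward matrices: one projecting up, one projecting back down.
    mlp = 2 * hidden_size * mlp_size
    return attention + mlp


def model_parameter_count(num_blocks: int, hidden_size: int, mlp_size: int) -> int:
    """Total weights across identical blocks (embeddings ignored)."""
    return num_blocks * block_parameter_count(hidden_size, mlp_size)


# Toy sizes chosen only so the arithmetic is easy to follow.
per_block = block_parameter_count(hidden_size=8, mlp_size=32)
total = model_parameter_count(num_blocks=2, hidden_size=8, mlp_size=32)
print(per_block, total)
```

Doubling `num_blocks` doubles the total, while doubling `hidden_size` roughly quadruples the attention portion, which is one way to see why larger models grow so quickly.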

 <br>

@@ -185,7 +193,7 @@ Lets next move on to active conversation with the model. Navigate to the **Chat*
 </figure>
 <br>

-Once loaded, feel free to type any message and interact with the model in any way.  To speed up the pace of our lab, I recommend setting your maximum output length to 64 tokens.
+Once loaded, feel free to type any message and interact with the model in any way. To speed up the pace of our lab, I recommend setting your maximum output length to 64 tokens.

 <figure style="text-align: center;">
  <a href="https://i.imgur.com/MdAIKLn.png" target="_blank">
@@ -203,7 +211,7 @@ If text generation fails, or acts weird (such as merely repeating your input bac

 ### Execute: View Tokenization

-If everything is in working order, review the **Tokenize** view.  This allows us to visually see how Llama 3.2 will convert our input text into “tokens,” or numbers that represent the input English.  Feel free to input any sentence into the box to review what the final tokenized version will be.
+If everything is in working order, review the **Tokenize** view. This view shows how Llama 3.2 converts our input text into “tokens,” the numbers that represent the input text. Enter any sentence into the box to see its tokenized form.
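
The idea of mapping text to numbers can be sketched with a toy word-level tokenizer. This is only an illustration: real tokenizers, including the one used by Llama 3.2, split text into subword pieces using a learned vocabulary, and the `build_vocab` and `tokenize` helpers below are hypothetical names, not part of TransformerLab.

```python
def build_vocab(corpus: str) -> dict[str, int]:
    """Assign an integer ID to each distinct lowercase word, in order seen."""
    vocab = {"<unk>": 0}  # ID 0 is reserved for words not in the vocabulary
    for word in corpus.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Convert text to a list of token IDs, using <unk> for unknown words."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]


vocab = build_vocab("the quick brown fox jumps over the lazy dog")
ids = tokenize("The quick red fox", vocab)
print(ids)  # "red" is not in the toy vocabulary, so it maps to ID 0
```

The model never sees the letters themselves, only lists of IDs like this one.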

 <figure style="text-align: center;">
  <a href="https://i.imgur.com/I9tU8jK.png" target="_blank">
@@ -219,7 +227,7 @@ If everything is in working order, review the **Tokenize** view.  This allows us

 ### Execute: Visualize Next-Token Activations

-Next, select Model Activations.  By entering “The quick brown fox” and selecting visualize, we can see how the model selects the next word, and the models level of confidence.  Also feel free to redo this process with alternative sentences.
+Next, select **Model Activations**. By entering “The quick brown fox” and selecting Visualize, we can see how the model selects the next word and the model's level of confidence. Feel free to repeat this process with alternative sentences.
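
The confidence shown in this view comes from turning the model's raw scores (logits) into probabilities. A minimal sketch of that step, using the standard softmax function, is shown here. The candidate words and their scores are made up for illustration and are not output from the lab model.

```python
import math


def softmax(logits: dict[str, float]) -> dict[str, float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {token: math.exp(score - peak) for token, score in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


# Hypothetical scores for the word following "The quick brown fox".
logits = {"jumps": 6.0, "runs": 3.0, "is": 2.0, "sat": 1.0}
probs = softmax(logits)
best = max(probs, key=probs.get)

for token, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{token:>6}: {p:.3f}")
```

A small gap between logits turns into a large gap between probabilities, which is why one candidate can dominate the visualization.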

 <figure style="text-align: center;">
  <a href="https://i.imgur.com/JeWpoqV.png" target="_blank">
@@ -235,7 +243,7 @@ Next, select Model Activations.  By entering “The quick brown fox” and selec

 ### Execute: Compare Confidence Views

-Note how confident the model is about the word jumps in this famous phrase.  For an alternative view of the same output, you can also select the **Visualize Logprobes** option from the menu, which will show the same information but by color.
+Note how confident the model is about the word “jumps” in this famous phrase. For an alternative view of the same output, you can also select the **Visualize Logprobs** option from the menu, which shows the same information using color.
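
A log probability ("logprob") is simply the natural logarithm of a token's probability, so values near 0 mean high confidence and large negative values mean low confidence. The sketch below maps logprobs to coarse confidence buckets in the spirit of a color-coded view. The thresholds, bucket names, and sample values are assumptions for illustration and are not TransformerLab's actual color scale.

```python
import math


def logprob_to_bucket(logprob: float) -> str:
    """Map a token's log probability to a coarse confidence bucket."""
    probability = math.exp(logprob)  # undo the logarithm
    if probability >= 0.8:
        return "high"
    if probability >= 0.3:
        return "medium"
    return "low"


# Hypothetical (token, logprob) pairs for a generated continuation.
tokens = [
    (" jumps", math.log(0.93)),
    (" over", math.log(0.50)),
    (" a", math.log(0.05)),
]
buckets = [logprob_to_bucket(logprob) for _, logprob in tokens]

for (token, logprob), bucket in zip(tokens, buckets):
    print(f"{token!r}: logprob={logprob:.2f} -> {bucket}")
```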

 <figure style="text-align: center;">
  <a href="https://i.imgur.com/PvkgQUr.png" target="_blank">
@@ -251,12 +259,13 @@ Note how confident the model is about the word jumps in this famous phrase.  For

 ### Explore: Continue Exploring TransformerLab Features

-Please continue to explore Transformers Lab until you’re ready to move on.  While we will utilize many different tools other than Transformers Lab throughout this course due to its beta nature, this software is improving all the time and is worth watching!  Transformers lab supports many advanced features, in various stages of development, such as:
-* Batch Text Generation
-* LLM Fine Tuning
-* LLM Evaluation
-* Retrieval Augmented Generation (RAG)
-We will discuss these topics and more throughout the course.
+Please continue to explore TransformerLab until you’re ready to move on. Because TransformerLab is still in beta, we will use many other tools throughout this course, but the software is improving all the time and is worth watching! TransformerLab supports many advanced features, in various stages of development, such as:
+
+- Batch Text Generation
+- LLM Fine Tuning
+- LLM Evaluation
+- Retrieval Augmented Generation (RAG)
+
+We will discuss these topics and more throughout the course.

 <br>

@@ -264,6 +273,6 @@ We will discuss these topics and more throughout the course.

 ## Conclusion

-In this lab, we observed the foundational concepts of all LLMs in action using TransformerLab. Through hands-on exploration, we observed the process of tokenization – how text is converted into numerical representations for the model – and visualized the model's prediction process, including its confidence levels for different token selections. By navigating the model’s layers and blocks, we gained an appreciation for the sheer scale and complexity inherent in modern LLMs. 
+In this lab, we saw the foundational concepts of all LLMs in action using TransformerLab. Through hands-on exploration, we observed the process of tokenization – how text is converted into numerical representations for the model – and visualized the model's prediction process, including its confidence levels for different token selections. By navigating the model’s layers and blocks, we gained an appreciation for the sheer scale and complexity inherent in modern LLMs.

 This initial experience provides a crucial stepping stone for further exploration of LLMs, laying the groundwork for future labs focused on fine-tuning, evaluation, and advanced techniques like Retrieval Augmented Generation.