Lab 5 Moved to Unsloth

This commit is contained in:
c4ch3c4d3
2026-03-27 21:09:58 -06:00
parent d8882384f7
commit 882abccb65
3 changed files with 314 additions and 129 deletions
@@ -112,6 +112,15 @@ Locate, pull, and run **Qwen3.5 4B** using the **OpenWebUI**. By default, Op
<figcaption>Successful inference: the model returns a coherent answer.</figcaption>
</figure>
9. **Download Gemma3n e2B**
    * While we're at it, let's grab one more model. Either repeat the previous steps to find and download **Gemma3n e2B**, or pull it directly with the following model tag in the Open WebUI search bar:
```bash
ollama pull gemma3n:e2b
```
    Google designed the Gemma 3n models for efficient execution on resource-constrained devices such as laptops, tablets, phones, or NVIDIA 2080 Super GPUs.
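Before moving on, you can confirm both models landed locally. `ollama list` shows them from a shell, and the same information is exposed over Ollama's REST API at `/api/tags`. Below is a minimal sketch of checking for a model tag over that API; the helper name and default host are illustrative assumptions, and it returns `None` rather than crashing if the Ollama server isn't reachable.

```python
import json
import urllib.error
import urllib.request


def has_model(tag, host="http://localhost:11434"):
    """Return True if `tag` appears in Ollama's local model list.

    Returns None if the Ollama server cannot be reached.
    """
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=2) as resp:
            models = json.load(resp).get("models", [])
    except (urllib.error.URLError, OSError):
        # Server not running or unreachable -- signal "unknown" rather than fail.
        return None
    return any(m.get("name") == tag for m in models)
```

With the Ollama server from this lab running, `has_model("gemma3n:e2b")` should return `True` once the pull completes.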
---
@@ -203,7 +212,11 @@ Feel free to continue to explore with other topics or images. Note how each tim
### Explore: Prompt Engineering & System Prompting
<div class="lab-callout lab-callout--warning">
<strong>Warning:</strong> As you explore chat via Open WebUI, ensure you turn <code>think (Ollama)</code> to OFF. <strong>Qwen3.5 4B</strong> is likely to enter an infinite thinking loop for these tasks otherwise, which will require a VM reboot.
<br><br>
  Alternatively, perform these steps with <strong>Gemma3n e2B</strong>, which handles resource-constrained environments more gracefully.
</div>
Next, let's review different ways we can coax a model into performing better without resorting to fine-tuning or parameter customization. We can do this by "priming" the model with our first prompt in a number of ways:
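One way to see what "priming" looks like on the wire: Open WebUI ultimately sends Ollama a chat request in which a `system` message precedes the first user prompt, and the `think (Ollama)` toggle from the warning above corresponds to a `think` field in the request body. The sketch below builds such a payload; the helper name and persona text are illustrative, and the exact `think` field follows Ollama's chat API, so treat it as an assumption if your Ollama version differs.

```python
def build_primed_chat(model, system_prompt, user_prompt, think=False):
    """Assemble an Ollama /api/chat-style request body.

    The "system" message primes the model's behavior before the first
    user turn; think=False mirrors turning `think (Ollama)` OFF.
    """
    return {
        "model": model,
        "think": think,  # keep False here to avoid runaway thinking loops
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }


payload = build_primed_chat(
    "gemma3n:e2b",
    "You are a terse lab assistant. Answer in two sentences or fewer.",
    "What does quantization do to a model?",
)
```

POSTing this payload to `http://localhost:11434/api/chat` (with `curl` or any HTTP client) produces the same primed behavior you get by typing a system-style first prompt into the Open WebUI chat box.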