diff --git a/content/labs/lab-5-api-and-harnesses.md b/content/labs/lab-5-api-and-harnesses.md new file mode 100644 index 0000000..209f02c --- /dev/null +++ b/content/labs/lab-5-api-and-harnesses.md @@ -0,0 +1,360 @@ +--- +order: 5 +title: Lab 5 - API & Harnesses +description: Generate an Open WebUI API key, connect one of three coding harnesses, and build a small Zork-style game. +--- + + + + + +# Lab 5 - API & Harnesses + +In this lab, we will: + +- Generate a personal API key inside Open WebUI +- Install one of three coding harnesses +- Configure that harness to talk to Open WebUI as the backend +- Use the harness to build a small Zork-style game + +
+ Lab Flow Guide
+ This lab stays on a single high-level track, but Objectives 2 and 3 branch into three harness paths.
+ Pick one harness, complete its branch, then rejoin the common Zork objective at the end. +
+ +To start this lab, one web service has been preconfigured: + +- Open WebUI - http://:8080 + +## Objective 1 Execute: Generate an Open WebUI API Key + +Before we install any harness, we need a key that lets the harness call the same model backend exposed through Open WebUI. + +### Execute: Sign in to Open WebUI + +1. Navigate to `http://:8080`. +2. Sign in with the same account you used in Lab 4, or the credentials supplied by your instructor. +3. Confirm that you can reach the normal chat screen before continuing. + +### Execute: Create a personal access token + +According to the Open WebUI reference docs, API keys are created from **Settings -> Account** and authenticate with the same permissions as the user who created them. + +1. Click your avatar in the lower-left corner. + +
+ + + +
+ User Settings +
+
+
+ +2. Open **Settings**. +3. Open **Account**. +4. Locate the **API Keys** section. + +
+ + + +
+ API Key +
+
+
+ +6. Copy the key immediately and store it somewhere safe for the duration of the lab. + +
+ If you do not see API Keys: Open WebUI requires the feature to be enabled globally, and your user account needs permission to generate keys. Ask your instructor for help before continuing. +
+ +### Execute: Sanity-check the key from the terminal + +Run a quick authenticated request against the Open WebUI model list endpoint. You should receive JSON back instead of an authentication error. + +```bash +curl http://:8080/api/models \ + -H "Authorization: Bearer YOUR_OPENWEBUI_API_KEY" +``` + +If this request works, your harness will use the same key for later steps. + +--- + +## Objective 2 Execute: Choose and Install a Harness + +All three branches ultimately talk to the same Open WebUI backend. The difference is the user interface and configuration style for each harness. + +
+ + + +
+ +

Select a path to reveal that harness's instructions throughout the rest of the lab. Select the same card again if you want to hide the harness-specific instructions and return to the shared overview.

+ +### Execute: Install the harness you want to use + +
+

Path A

+

Install OpenCode

+

OpenCode is a terminal-native coding agent. Its official docs recommend either the install script or the npm package.

+
curl -fsSL https://opencode.ai/install | bash
+opencode --version
+

If you prefer npm and already have Node.js installed:

+
npm install -g opencode-ai
+opencode --version
+

Once installed, stay in the terminal. We will configure OpenCode in Objective 3.

+
+ +
+

Path B

+

Install Kilo Code for VS Code

+

Kilo Code is primarily used through the editor UI. For this Linux-first lab flow, use VS Code on the student workstation and install the extension from the marketplace.

+
    +
  1. Open VS Code.
  2. +
  3. Open the Extensions view.
  4. +
  5. Search for Kilo Code.
  6. +
  7. Click Install.
  8. +
  9. Reload VS Code if prompted.
  10. +
  11. Open the project folder you want to work in before moving to Objective 3.
  12. +
+
+ Tip: Kilo Code supports several providers and local-model options. In this lab, we will use its OpenAI Compatible provider flow so it can target Open WebUI. +
+
+ +
+

Path C

+

Install Factory Droid

+

Factory's Droid harness runs in the terminal and supports BYOK custom models through Factory configuration files.

+
curl -fsSL https://app.factory.ai/cli | sh
+droid --version
+

If the shell needs to be reloaded after install, open a fresh terminal and rerun droid --version.

+
+ +--- + +## Objective 3 Execute: Configure Your Harness for Open WebUI + +For all three harnesses, the common backend values are: + +- `Base URL` - `http://:8080/api` +- `API Key` - `YOUR_OPENWEBUI_API_KEY` +- `Model ID` - Any model ID returned by Open WebUI, such as `qwen3.5:4b` + +The shared idea is simple: your harness sends requests to Open WebUI's authenticated API endpoints instead of directly to a cloud provider. + +### Execute: Apply the configuration for your chosen harness + +
+

Path A

+

Configure OpenCode

+

OpenCode supports OpenAI-compatible providers through its JSON config. Create either a project-local opencode.json file or a global config under ~/.config/opencode/opencode.json.

+ +

It can also be easier to start opencode once, and exit with /exit. Use the following example to help structure your opencode.json file. +

{
+  "$schema": "https://opencode.ai/config.json",
+  "provider": {
+    "openwebui": {
+      "name": "Open WebUI",
+      "options": {
+        "baseURL": "http://<YOUR STUDENT IP>:8080/api",
+      },
+      "models": {
+        "qwen3.5:4b": {
+          "name": "Qwen 3.5 4B"
+        }
+      }
+    }
+  },
+  "model": "openwebui/qwen3.5:4b"
+}
+

After saving the config, you can login with opencode auth login:

+
+ + + +
+ opencode auth login +
+
+

After logging in, start OpenCode from your project directory:

+
cd /path/to/your/project
+opencode
+
+ +
+

Path B

+

Configure Kilo Code in VS Code

+

Kilo Code's documented workflow is provider-driven through the extension settings UI. Use the following values when creating or editing your provider profile.

+
    +
  • API Provider - OpenAI Compatible
  • +
  • OpenAI Base URL - http://<YOUR STUDENT IP>:8080/api
  • +
  • API Key - YOUR_OPENWEBUI_API_KEY
  • +
  • Model ID - qwen3.5:4b or another model exposed by Open WebUI
  • +
  • Approval Mode - Leave the safer default enabled for your first run
  • +
+
+
    +
  1. Open the Kilo Code panel in VS Code.
  2. +
  3. Open its provider or API settings.
  4. +
  5. Select OpenAI Compatible as the provider.
  6. +
  7. Paste in the base URL and API key values above.
  8. +
  9. Pick a model ID that exists in Open WebUI.
  10. +
  11. Start a new task to verify Kilo Code can connect successfully.
  12. +
+ +
+ + + +
+ Kilo Code Settings +
+
+
+ +
+ + + +
+ Provider Settings +
+
+
+ +
+ Tip: If model discovery fails, go back to your terminal and rerun the curl /api/models check from Objective 1. The harness and the curl command use the same authentication path. +
+
+ +
+

Path C

+

Configure Factory Droid

+

Factory's BYOK documentation supports custom model entries in ~/.factory/config.json. Because Open WebUI exposes a chat-completions-compatible API, use the generic-chat-completion-api provider type.

+
{
+  "custom_models": [
+    {
+      "model_display_name": "Open WebUI - Qwen 3.5 4B",
+      "model": "qwen3.5:4b",
+      "base_url": "http://<YOUR STUDENT IP>:8080/api",
+      "api_key": "YOUR_OPENWEBUI_API_KEY",
+      "provider": "generic-chat-completion-api",
+      "max_tokens": 4096
+    }
+  ]
+}
+

After saving the config:

+
    +
  1. Launch droid.
  2. +
  3. Open the model selector with /model.
  4. +
  5. Choose your new custom Open WebUI model entry.
  6. +
  7. Start a new session in the target project directory.
  8. +
+
+ +--- + +## Objective 4 Execute: Build a Tiny Zork Clone + +At this point, all three branches reconnect. The rest of the lab is the same no matter which harness you chose. + +### Execute: Start your harness session + +
+
+ OpenCode + Terminal Session + Run opencode inside the project directory. +
+
+ Kilo Code + VS Code Task + Open the repo folder and start a new Kilo Code task from the side panel. +
+
+ Factory Droid + CLI Session + Run droid, type "/mission", and ensure you've selected your custom model for each phase. +
+
+ +### Execute: Give the harness a shared prompt + +Use the following prompt as your starting task. Ensure you are in **Plan** mode (or a Droid Mission): + +```text +You are helping me build a tiny terminal adventure game in Python. + +Create a Zork-style prototype with: +- at least 5 connected rooms +- movement commands like north, south, east, and west +- a simple inventory system +- one collectible key +- one locked room or door +- a short win condition + +Use clean, readable Python and keep everything runnable from the terminal. +After writing the code, explain how to launch the game and what commands the player can use. +``` + +### Explore: Execute the result + +Once your harness produces the first version, keep pushing it with follow-up prompts: + +1. Ask it to add a help command. +2. Ask it to improve room descriptions. +3. Ask it to prevent impossible movement. +4. Ask it to add one extra puzzle or hidden interaction. + +Alternatively, reflect if you'd instead focused on using a Spec Driven development flow. How might the AI model perform more accurately as the requirements become more complicated? + +### Checkpoint: What success can look like + +Before finishing the lab, confirm that your game can: + +1. Start from the terminal without errors. +2. Accept basic movement commands. +3. Let the player pick up at least one item. +4. Use that item to unlock progress. +5. Reach a clear win state. + +## Conclusion + +In this lab, we: + +1. Generated an Open WebUI API key. +2. Installed a harness of our choice. +3. Connected that harness back to Open WebUI. +4. Used the harness to build a small but complete coding exercise. + +You should now have a repeatable pattern for testing other harnesses against the same Open WebUI deployment. We've also shown how a full local stack can work, from model selection, inference, harness installation, to real coding work. diff --git a/content/labs/lab-5-embedding-and-chunking.md b/content/labs/lab-6-embedding-and-chunking.md similarity index 99% rename from content/labs/lab-5-embedding-and-chunking.md rename to content/labs/lab-6-embedding-and-chunking.md index a185aa6..7f27d91 100644 --- a/content/labs/lab-5-embedding-and-chunking.md +++ b/content/labs/lab-6-embedding-and-chunking.md @@ -1,6 +1,6 @@ --- -order: 5 -title: Lab 5 - Embedding and Chunking +order: 6 +title: Lab 6 - Embedding and Chunking description: Explore chunking strategies and embeddings, then connect them to retrieval workflows. --- @@ -8,7 +8,7 @@ description: Explore chunking strategies and embeddings, then connect them to re -# Lab 5 - Embedding and Chunking +# Lab 6 - Embedding and Chunking In this lab, we will: diff --git a/content/labs/lab-5-embedding-and-chunking_files/1-ChunkViz-Home.png b/content/labs/lab-6-embedding-and-chunking_files/1-ChunkViz-Home.png similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/1-ChunkViz-Home.png rename to content/labs/lab-6-embedding-and-chunking_files/1-ChunkViz-Home.png diff --git a/content/labs/lab-5-embedding-and-chunking_files/2-ChunkViz-SizeandOverlap.png b/content/labs/lab-6-embedding-and-chunking_files/2-ChunkViz-SizeandOverlap.png similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/2-ChunkViz-SizeandOverlap.png rename to content/labs/lab-6-embedding-and-chunking_files/2-ChunkViz-SizeandOverlap.png diff --git a/content/labs/lab-5-embedding-and-chunking_files/3 - Chunking Strategies.png b/content/labs/lab-6-embedding-and-chunking_files/3 - Chunking Strategies.png similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/3 - Chunking Strategies.png rename to content/labs/lab-6-embedding-and-chunking_files/3 - Chunking Strategies.png diff --git a/content/labs/lab-5-embedding-and-chunking_files/4 - Chunk Colors Paragraphs.png b/content/labs/lab-6-embedding-and-chunking_files/4 - Chunk Colors Paragraphs.png similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/4 - Chunk Colors Paragraphs.png rename to content/labs/lab-6-embedding-and-chunking_files/4 - Chunk Colors Paragraphs.png diff --git a/content/labs/lab-5-embedding-and-chunking_files/5 - Embedding Atlas CLI.png b/content/labs/lab-6-embedding-and-chunking_files/5 - Embedding Atlas CLI.png similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/5 - Embedding Atlas CLI.png rename to content/labs/lab-6-embedding-and-chunking_files/5 - Embedding Atlas CLI.png diff --git a/content/labs/lab-5-embedding-and-chunking_files/6 - Colorize.png b/content/labs/lab-6-embedding-and-chunking_files/6 - Colorize.png similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/6 - Colorize.png rename to content/labs/lab-6-embedding-and-chunking_files/6 - Colorize.png diff --git a/content/labs/lab-5-embedding-and-chunking_files/7 - Semantic Similarity.png b/content/labs/lab-6-embedding-and-chunking_files/7 - Semantic Similarity.png similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/7 - Semantic Similarity.png rename to content/labs/lab-6-embedding-and-chunking_files/7 - Semantic Similarity.png diff --git a/content/labs/lab-5-embedding-and-chunking_files/8 - Surprising Similarity.png b/content/labs/lab-6-embedding-and-chunking_files/8 - Surprising Similarity.png similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/8 - Surprising Similarity.png rename to content/labs/lab-6-embedding-and-chunking_files/8 - Surprising Similarity.png diff --git a/content/labs/lab-5-embedding-and-chunking_files/Mermaid.png b/content/labs/lab-6-embedding-and-chunking_files/Mermaid.png similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/Mermaid.png rename to content/labs/lab-6-embedding-and-chunking_files/Mermaid.png diff --git a/content/labs/lab-5-embedding-and-chunking_files/combined_big_mitre.json b/content/labs/lab-6-embedding-and-chunking_files/combined_big_mitre.json similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/combined_big_mitre.json rename to content/labs/lab-6-embedding-and-chunking_files/combined_big_mitre.json diff --git a/content/labs/lab-5-embedding-and-chunking_files/eval_big_mitre.json b/content/labs/lab-6-embedding-and-chunking_files/eval_big_mitre.json similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/eval_big_mitre.json rename to content/labs/lab-6-embedding-and-chunking_files/eval_big_mitre.json diff --git a/content/labs/lab-5-embedding-and-chunking_files/ttps_dataset.parquet b/content/labs/lab-6-embedding-and-chunking_files/ttps_dataset.parquet similarity index 100% rename from content/labs/lab-5-embedding-and-chunking_files/ttps_dataset.parquet rename to content/labs/lab-6-embedding-and-chunking_files/ttps_dataset.parquet diff --git a/content/labs/lab-6-dataset-generation-and-fine-tuning.md b/content/labs/lab-7-dataset-generation-and-fine-tuning.md similarity index 99% rename from content/labs/lab-6-dataset-generation-and-fine-tuning.md rename to content/labs/lab-7-dataset-generation-and-fine-tuning.md index b1de756..915db1f 100644 --- a/content/labs/lab-6-dataset-generation-and-fine-tuning.md +++ b/content/labs/lab-7-dataset-generation-and-fine-tuning.md @@ -1,6 +1,6 @@ --- -order: 6 -title: Lab 6 - Dataset Generation and Fine Tuning +order: 7 +title: Lab 7 - Dataset Generation and Fine Tuning description: Review dataset options, generate examples with Kiln.ai, and fine-tune a model in Unsloth. --- @@ -8,7 +8,7 @@ description: Review dataset options, generate examples with Kiln.ai, and fine-tu -# Lab 6 - Dataset Generation and Fine Tuning +# Lab 7 - Dataset Generation and Fine Tuning In this lab, we will: diff --git a/content/labs/lab-6-dataset-generation-and-fine-tuning_files/05e780b2-c8b3-4947-8596-165e37b6fa00.png b/content/labs/lab-7-dataset-generation-and-fine-tuning_files/05e780b2-c8b3-4947-8596-165e37b6fa00.png similarity index 100% rename from content/labs/lab-6-dataset-generation-and-fine-tuning_files/05e780b2-c8b3-4947-8596-165e37b6fa00.png rename to content/labs/lab-7-dataset-generation-and-fine-tuning_files/05e780b2-c8b3-4947-8596-165e37b6fa00.png diff --git a/content/labs/lab-6-dataset-generation-and-fine-tuning_files/28819e1c-7806-413d-a563-26532000ca7a.png b/content/labs/lab-7-dataset-generation-and-fine-tuning_files/28819e1c-7806-413d-a563-26532000ca7a.png similarity index 100% rename from content/labs/lab-6-dataset-generation-and-fine-tuning_files/28819e1c-7806-413d-a563-26532000ca7a.png rename to content/labs/lab-7-dataset-generation-and-fine-tuning_files/28819e1c-7806-413d-a563-26532000ca7a.png diff --git a/content/labs/lab-6-dataset-generation-and-fine-tuning_files/64da1f9c-a120-4819-b005-df1bc8ef7b1d.png b/content/labs/lab-7-dataset-generation-and-fine-tuning_files/64da1f9c-a120-4819-b005-df1bc8ef7b1d.png similarity index 100% rename from content/labs/lab-6-dataset-generation-and-fine-tuning_files/64da1f9c-a120-4819-b005-df1bc8ef7b1d.png rename to content/labs/lab-7-dataset-generation-and-fine-tuning_files/64da1f9c-a120-4819-b005-df1bc8ef7b1d.png diff --git a/content/labs/lab-6-dataset-generation-and-fine-tuning_files/combined_big_mitre.json b/content/labs/lab-7-dataset-generation-and-fine-tuning_files/combined_big_mitre.json similarity index 100% rename from content/labs/lab-6-dataset-generation-and-fine-tuning_files/combined_big_mitre.json rename to content/labs/lab-7-dataset-generation-and-fine-tuning_files/combined_big_mitre.json diff --git a/content/labs/lab-6-dataset-generation-and-fine-tuning_files/eval_big_mitre.json b/content/labs/lab-7-dataset-generation-and-fine-tuning_files/eval_big_mitre.json similarity index 100% rename from content/labs/lab-6-dataset-generation-and-fine-tuning_files/eval_big_mitre.json rename to content/labs/lab-7-dataset-generation-and-fine-tuning_files/eval_big_mitre.json diff --git a/content/labs/lab-6-dataset-generation-and-fine-tuning_files/fe155203-6e88-49a3-a687-e1ff817b6988.png b/content/labs/lab-7-dataset-generation-and-fine-tuning_files/fe155203-6e88-49a3-a687-e1ff817b6988.png similarity index 100% rename from content/labs/lab-6-dataset-generation-and-fine-tuning_files/fe155203-6e88-49a3-a687-e1ff817b6988.png rename to content/labs/lab-7-dataset-generation-and-fine-tuning_files/fe155203-6e88-49a3-a687-e1ff817b6988.png diff --git a/content/labs/lab-7-evaluation-and-red-teaming.md b/content/labs/lab-8-evaluation-and-red-teaming.md similarity index 92% rename from content/labs/lab-7-evaluation-and-red-teaming.md rename to content/labs/lab-8-evaluation-and-red-teaming.md index f19a716..a1a438c 100644 --- a/content/labs/lab-7-evaluation-and-red-teaming.md +++ b/content/labs/lab-8-evaluation-and-red-teaming.md @@ -1,6 +1,6 @@ --- -order: 7 -title: Lab 7 - Evaluation and Red Teaming +order: 8 +title: Lab 8 - Evaluation and Red Teaming description: Probe model defenses manually and with Promptfoo to evaluate security controls. --- @@ -8,7 +8,7 @@ description: Probe model defenses manually and with Promptfoo to evaluate securi -# Lab 7 - Evaluation and Red Teaming +# Lab 8 - Evaluation and Red Teaming In this lab, we will: @@ -97,7 +97,7 @@ Promptfoo is available on our lab machine at http://:15500. We Promptfoo is designed to be approachable for both beginners and practitioners. Its wizard guides you through configuring the target, selecting datasets and mutation strategies, and tracking execution.
- Tip: Although the Promptfoo WebUI is convenient, it hides a critical configuration option for this lab inside the YAML file. Please use the provided configuration file: [lab-7-evaluation-and-red-teaming/promptfoo.yaml](content/labs/lab-7-evaluation-and-red-teaming/promptfoo.yaml). Upload it with the Load Config button in the lower-left corner, then proceed with the following screenshot steps. + Tip: Although the Promptfoo WebUI is convenient, it hides a critical configuration option for this lab inside the YAML file. Please use the provided configuration file: [lab-8-evaluation-and-red-teaming/promptfoo.yaml](/labs/lab-8-evaluation-and-red-teaming/promptfoo.yaml). Upload it with the Load Config button in the lower-left corner, then proceed with the following screenshot steps.
@@ -167,7 +167,7 @@ Promptfoo is highly flexible. Anything that involves mass evaluation of prompts ### Explore: Promptfoo evaluation workflow
- Tip: Please use the provided evaluation configuration file: [lab-7-evaluation-and-red-teaming/mmlu-promptfoo-config.yaml](content/labs/lab-7-evaluation-and-red-teaming/mmlu-promptfoo-config.yaml). Upload it with the Load Config button in the lower-left corner, then proceed with the following screenshot steps. + Tip: Please use the provided evaluation configuration file: [lab-8-evaluation-and-red-teaming/mmlu-promptfoo-config.yaml](/labs/lab-8-evaluation-and-red-teaming/mmlu-promptfoo-config.yaml). Upload it with the Load Config button in the lower-left corner, then proceed with the following screenshot steps.
diff --git a/content/labs/lab-7-evaluation-and-red-teaming/1 - Direct Injection.png b/public/labs/lab-8-evaluation-and-red-teaming/1 - Direct Injection.png similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/1 - Direct Injection.png rename to public/labs/lab-8-evaluation-and-red-teaming/1 - Direct Injection.png diff --git a/content/labs/lab-7-evaluation-and-red-teaming/2 - Red Team.png b/public/labs/lab-8-evaluation-and-red-teaming/2 - Red Team.png similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/2 - Red Team.png rename to public/labs/lab-8-evaluation-and-red-teaming/2 - Red Team.png diff --git a/content/labs/lab-7-evaluation-and-red-teaming/3 - Model Selection.png b/public/labs/lab-8-evaluation-and-red-teaming/3 - Model Selection.png similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/3 - Model Selection.png rename to public/labs/lab-8-evaluation-and-red-teaming/3 - Model Selection.png diff --git a/content/labs/lab-7-evaluation-and-red-teaming/4 - Model Testing Target.png b/public/labs/lab-8-evaluation-and-red-teaming/4 - Model Testing Target.png similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/4 - Model Testing Target.png rename to public/labs/lab-8-evaluation-and-red-teaming/4 - Model Testing Target.png diff --git a/content/labs/lab-7-evaluation-and-red-teaming/5 - Dataset Selection.png b/public/labs/lab-8-evaluation-and-red-teaming/5 - Dataset Selection.png similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/5 - Dataset Selection.png rename to public/labs/lab-8-evaluation-and-red-teaming/5 - Dataset Selection.png diff --git a/content/labs/lab-7-evaluation-and-red-teaming/6 - Mutation Strategies.png b/public/labs/lab-8-evaluation-and-red-teaming/6 - Mutation Strategies.png similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/6 - Mutation Strategies.png rename to public/labs/lab-8-evaluation-and-red-teaming/6 - Mutation Strategies.png diff --git a/content/labs/lab-7-evaluation-and-red-teaming/7 - Final Report.png b/public/labs/lab-8-evaluation-and-red-teaming/7 - Final Report.png similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/7 - Final Report.png rename to public/labs/lab-8-evaluation-and-red-teaming/7 - Final Report.png diff --git a/content/labs/lab-7-evaluation-and-red-teaming/Eval.png b/public/labs/lab-8-evaluation-and-red-teaming/Eval.png similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/Eval.png rename to public/labs/lab-8-evaluation-and-red-teaming/Eval.png diff --git a/content/labs/lab-7-evaluation-and-red-teaming/Image 1.png b/public/labs/lab-8-evaluation-and-red-teaming/Image 1.png similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/Image 1.png rename to public/labs/lab-8-evaluation-and-red-teaming/Image 1.png diff --git a/content/labs/lab-7-evaluation-and-red-teaming/mmlu-promptfoo-config.yaml b/public/labs/lab-8-evaluation-and-red-teaming/mmlu-promptfoo-config.yaml similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/mmlu-promptfoo-config.yaml rename to public/labs/lab-8-evaluation-and-red-teaming/mmlu-promptfoo-config.yaml diff --git a/content/labs/lab-7-evaluation-and-red-teaming/promptfoo.yaml b/public/labs/lab-8-evaluation-and-red-teaming/promptfoo.yaml similarity index 100% rename from content/labs/lab-7-evaluation-and-red-teaming/promptfoo.yaml rename to public/labs/lab-8-evaluation-and-red-teaming/promptfoo.yaml diff --git a/src/components/labs/LabContent.test.tsx b/src/components/labs/LabContent.test.tsx index 680933d..70f884f 100644 --- a/src/components/labs/LabContent.test.tsx +++ b/src/components/labs/LabContent.test.tsx @@ -1,4 +1,4 @@ -import { render, screen, waitFor } from "@testing-library/react"; +import { fireEvent, render, screen, waitFor } from "@testing-library/react"; import { afterEach, describe, expect, it, vi } from "vitest"; import { LabContent } from "~/components/labs/LabContent"; @@ -36,4 +36,55 @@ describe("LabContent", () => { expect(screen.getByText("Tokenizer Playground")).toBeInTheDocument(); expect(screen.getByText("Visualize token confidence locally")).toBeInTheDocument(); }); + + it("filters harness branches from a single Objective 2 selector", () => { + render( + ', + '', + '', + '', + "", + '

OpenCode Install

', + '

Kilo Code Install

', + '

Factory Droid Install

', + '

OpenCode Config

', + '

Kilo Code Config

', + ].join("")} + />, + ); + + const openCodeInstall = screen.getByText("OpenCode Install").closest("section"); + const kiloCodeInstall = screen.getByText("Kilo Code Install").closest("section"); + const droidInstall = screen.getByText("Factory Droid Install").closest("section"); + const kiloCodeConfig = screen.getByText("Kilo Code Config").closest("section"); + + expect(openCodeInstall?.hidden).toBe(true); + expect(kiloCodeInstall?.hidden).toBe(true); + expect(droidInstall?.hidden).toBe(true); + expect(kiloCodeConfig?.hidden).toBe(true); + + fireEvent.click(screen.getByRole("button", { name: "Kilo Code" })); + + expect(screen.getByRole("button", { name: "Kilo Code" })).toHaveAttribute( + "aria-pressed", + "true", + ); + expect(openCodeInstall?.hidden).toBe(true); + expect(kiloCodeInstall?.hidden).toBe(false); + expect(droidInstall?.hidden).toBe(true); + expect(kiloCodeConfig?.hidden).toBe(false); + + fireEvent.click(screen.getByRole("button", { name: "Kilo Code" })); + + expect(screen.getByRole("button", { name: "Kilo Code" })).toHaveAttribute( + "aria-pressed", + "false", + ); + expect(openCodeInstall?.hidden).toBe(true); + expect(kiloCodeInstall?.hidden).toBe(true); + expect(droidInstall?.hidden).toBe(true); + }); }); diff --git a/src/components/labs/LabContent.tsx b/src/components/labs/LabContent.tsx index 620d540..d54aa10 100644 --- a/src/components/labs/LabContent.tsx +++ b/src/components/labs/LabContent.tsx @@ -198,6 +198,64 @@ async function copyTextToClipboard(text: string) { } } +function enhanceHarnessSelectors(root: HTMLElement) { + const harnessButtons = Array.from( + root.querySelectorAll("button[data-harness-choice]"), + ); + const harnessBranches = Array.from( + root.querySelectorAll("[data-harness-branch]"), + ); + + if (harnessButtons.length === 0 || harnessBranches.length === 0) { + return () => {}; + } + + const supportedHarnesses = new Set( + harnessButtons + .map((button) => button.dataset.harnessChoice?.trim()) + .filter((value): value is string => Boolean(value)), + ); + + let selectedHarness: string | null = null; + + const syncHarnessSelection = () => { + for (const button of harnessButtons) { + const harnessId = button.dataset.harnessChoice?.trim() ?? ""; + const isSelected = selectedHarness !== null && harnessId === selectedHarness; + button.setAttribute("aria-pressed", isSelected ? "true" : "false"); + button.dataset.selected = isSelected ? "true" : "false"; + } + + for (const branch of harnessBranches) { + const harnessId = branch.dataset.harnessBranch?.trim() ?? ""; + const shouldHide = + selectedHarness === null || harnessId !== selectedHarness; + branch.hidden = shouldHide; + branch.setAttribute("aria-hidden", shouldHide ? "true" : "false"); + } + }; + + syncHarnessSelection(); + + const handleHarnessClick = (event: Event) => { + const target = event.target as HTMLElement; + const button = target.closest("button[data-harness-choice]"); + if (!button || !root.contains(button)) return; + + const harnessId = button.dataset.harnessChoice?.trim() ?? ""; + if (!supportedHarnesses.has(harnessId)) return; + + event.preventDefault(); + selectedHarness = selectedHarness === harnessId ? null : harnessId; + syncHarnessSelection(); + }; + + root.addEventListener("click", handleHarnessClick); + return () => { + root.removeEventListener("click", handleHarnessClick); + }; +} + export function LabContent({ className, html }: LabContentProps) { const containerRef = useRef(null); const [zoomedImage, setZoomedImage] = useState(null); @@ -273,6 +331,7 @@ export function LabContent({ className, html }: LabContentProps) { } enhanceSettingsLists(root); + const cleanupHarnessSelectors = enhanceHarnessSelectors(root); const handleRootClick = (event: Event) => { const target = event.target as HTMLElement; @@ -320,6 +379,7 @@ export function LabContent({ className, html }: LabContentProps) { root.addEventListener("click", handleRootClick); return () => { + cleanupHarnessSelectors(); root.removeEventListener("click", handleRootClick); }; }, [html]); diff --git a/src/lib/labs.test.ts b/src/lib/labs.test.ts new file mode 100644 index 0000000..0c3a5b0 --- /dev/null +++ b/src/lib/labs.test.ts @@ -0,0 +1,28 @@ +import { describe, expect, it } from "vitest"; + +import { getLabDocument, getLabSummaries } from "~/lib/labs"; + +describe("labs discovery", () => { + it("returns the renumbered labs in order including the new Lab 5", () => { + const labs = getLabSummaries(); + + expect(labs.map((lab) => lab.slug)).toEqual([ + "lab-1-visualization-in-transformerlab", + "lab-2-quantization-tradeoffs", + "lab-3-llama-cpp-and-ollama", + "lab-4-oi-prompting", + "lab-5-api-and-harnesses", + "lab-6-embedding-and-chunking", + "lab-7-dataset-generation-and-fine-tuning", + "lab-8-evaluation-and-red-teaming", + ]); + }); + + it("resolves the new Lab 5 document", () => { + const lab = getLabDocument("lab-5-api-and-harnesses"); + + expect(lab).not.toBeNull(); + expect(lab?.title).toBe("Lab 5 - API & Harnesses"); + expect(lab?.fileName).toBe("lab-5-api-and-harnesses.md"); + }); +}); diff --git a/src/styles/globals.css b/src/styles/globals.css index 14ec845..e4400e1 100644 --- a/src/styles/globals.css +++ b/src/styles/globals.css @@ -417,6 +417,105 @@ ol { white-space: nowrap; } +.lab-content .lab-harness-chooser { + display: grid; + grid-template-columns: repeat(3, minmax(0, 1fr)); + gap: 0.9rem; + margin: 1rem 0 1.35rem; +} + +.lab-content .lab-harness-card { + display: flex; + flex-direction: column; + gap: 0.38rem; + min-height: 100%; + padding: 0.95rem 1rem; + border: 1px solid #d9e4ed; + border-radius: 16px; + background: + linear-gradient(180deg, rgba(255, 255, 255, 0.98), rgba(244, 249, 253, 0.96)), + radial-gradient(circle at top right, rgba(248, 156, 39, 0.12), transparent 32%); + box-shadow: 0 12px 28px -28px rgba(15, 23, 42, 0.35); + color: #18384f; + text-decoration: none; +} + +.lab-content button.lab-harness-card { + width: 100%; + cursor: pointer; + text-align: left; + font: inherit; +} + +.lab-content a.lab-harness-card:hover, +.lab-content button.lab-harness-card:hover { + transform: translateY(-1px); + border-color: #f1bd70; + box-shadow: 0 20px 36px -30px rgba(15, 23, 42, 0.45); +} + +.lab-content .lab-harness-card[aria-pressed="true"] { + border-color: #cf7a08; + background: + linear-gradient(180deg, rgba(255, 251, 243, 0.98), rgba(255, 244, 227, 0.96)), + radial-gradient(circle at top right, rgba(248, 156, 39, 0.18), transparent 32%); + box-shadow: 0 20px 36px -30px rgba(114, 63, 8, 0.5); +} + +.lab-content .lab-harness-card strong { + color: #0f3d58; + font-size: 1rem; + line-height: 1.35; +} + +.lab-content .lab-harness-card span { + color: #486073; + font-size: 0.92rem; +} + +.lab-content .lab-harness-card__tag { + display: inline-flex; + align-items: center; + width: fit-content; + border-radius: 999px; + padding: 0.22rem 0.58rem; + background: #e7f2fb; + color: #0f5c8b; + font-size: 0.72rem; + font-weight: 800; + letter-spacing: 0.06em; + text-transform: uppercase; +} + +.lab-content .lab-harness-branch { + margin: 1rem 0 1.4rem; + padding: 1rem 1.1rem 1.1rem; + border: 1px solid #dce7f0; + border-left: 6px solid #0f5c8b; + border-radius: 16px; + background: + linear-gradient(180deg, rgba(250, 252, 254, 0.98), rgba(244, 249, 252, 0.95)), + radial-gradient(circle at top right, rgba(248, 156, 39, 0.08), transparent 26%); + scroll-margin-top: 1.5rem; +} + +.lab-content .lab-harness-branch__eyebrow { + margin: 0 0 0.4rem; + color: #9a5f00; + font-size: 0.74rem; + font-weight: 800; + letter-spacing: 0.08em; + text-transform: uppercase; +} + +.lab-content .lab-harness-branch > h3 { + margin: 0 0 0.45rem; +} + +.lab-content .lab-harness-branch > p:last-child { + margin-bottom: 0; +} + .lab-content hr { margin: 2rem 0 1.4rem; border-color: #d7dee6; @@ -1686,6 +1785,15 @@ ol { padding: 0.52rem 0.75rem; } + .lab-content .lab-harness-chooser { + grid-template-columns: 1fr; + } + + .lab-content .lab-harness-card, + .lab-content .lab-harness-branch { + padding: 0.85rem 0.9rem; + } + .lab-content ul.lab-settings-list .lab-setting-value { justify-self: start; }