**LLaMa.cpp** makes it easy to convert models downloaded in SafeTensors format to GGUF. We can convert the model with the following command, using the official project script:
```bash
python3 convert_hf_to_gguf.py /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B/WhiteRabbitNeo-V3-7B --outfile /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B.gguf
```
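
If a different tensor precision is wanted at conversion time, the script also exposes an `--outtype` option (values such as `f16`, `bf16`, or `q8_0`). The variant below is only a sketch; the `-f16` output filename is our own choice, not a lab requirement:

```bash
# Optional variant (not a required lab step): select the output
# precision explicitly with --outtype.
python3 convert_hf_to_gguf.py /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B/WhiteRabbitNeo-V3-7B \
  --outfile /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B-f16.gguf \
  --outtype f16
```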
### 4 Execute: Review Model Metadata
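
One convenient way to inspect a GGUF header is the `gguf-dump` utility from the `gguf` Python package; this tool choice is an assumption on our part, as the lab environment may provide its own tooling for this step:

```bash
# Print the GGUF key/value metadata and tensor listing.
# gguf-dump is installed with the `gguf` pip package (pip install gguf).
gguf-dump /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B.gguf
```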
To generate 8-bit, 4-bit, and 1-bit quantizations, run the following commands:
<div class="lab-callout lab-callout--warning">
<strong>Warning:</strong> Although these quantization steps are provided for replication, pre-quantized support files are already available in <code>/home/student/lab2/WhiteRabbitNeo/</code> for faster lab progress. <br><br>You can skip these commands when participating in a live teaching session.
</div>
```bash
# NOTE: representative llama.cpp llama-quantize invocations, reconstructed
# as a sketch; the exact paths and quantization type names used by the lab
# (Q8_0, Q4_K_M, IQ1_S here) are assumptions.
./llama-quantize /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B.gguf \
  /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B-Q8_0.gguf Q8_0
./llama-quantize /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B.gguf \
  /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B-Q4_K_M.gguf Q4_K_M
# 1-bit IQ types generally require an importance matrix (--imatrix) to
# quantize well; the pre-quantized lab files sidestep this step.
./llama-quantize /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B.gguf \
  /home/student/lab2/WhiteRabbitNeo/WhiteRabbitNeo-V3-7B-IQ1_S.gguf IQ1_S
```
Similarly, we can pull a model directly from **HuggingFace**. As long as the source file is a `.gguf` at any quantization level that fits within our system memory, Ollama can fetch it directly.
1. **Select the Quantized Model from Objective 1** – visit [CodeIsAbstract](https://huggingface.co/CodeIsAbstract/Llama-3.2-1B-Q8_0-GGUF) in your browser.
2. **Use this model** – Click **Use this model** → choose the **Ollama** tab. The page displays a ready‑to‑run command:
*Figure: the model page's Ollama tab, showing the ready-to-run command.*
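
The displayed command generally follows HuggingFace's `hf.co/<user>/<repo>` scheme for Ollama; the exact string below is inferred from that scheme rather than copied from the page:

```bash
ollama run hf.co/CodeIsAbstract/Llama-3.2-1B-Q8_0-GGUF
```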
| Command | Description |
| --- | --- |
| `ollama list` | Shows all models currently registered with Ollama. |
| `ollama rm <tag>` | Deletes the specified model (freeing disk space). |
| `ollama show <tag>` | Prints model metadata (architecture, context length, quantization). |
| `ollama show <tag> --modelfile` | Prints an existing model's modelfile; often useful for templating our own (see the sketch after this table). |
| `ollama serve` | Starts the OpenAI-compatible API server (runs automatically when you first use `ollama run`). |
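
As a sketch of that templating workflow (the `whiterabbitneo-custom` tag is purely illustrative), we can dump an existing modelfile, edit it, and register the result under a new name:

```bash
# Dump the modelfile of an installed model as a starting template.
ollama show WhiteRabbitNeo --modelfile > Modelfile
# After editing Modelfile, register the customized variant under a new tag.
ollama create whiterabbitneo-custom -f Modelfile
```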
<br>
---
## Conclusion