README updates, textbook polynomial cell, self-contained notebook

Same set of changes as che-computing-dev/LLMs:
- 03/04/05 READMEs: uv add workflow, required model caching
- 05-tool-use: add Setup section, requirements.txt
- 06-neural-networks: textbook cubic polynomial comparison cell
- 06-neural-networks: add nn_workshop_colab.ipynb (self-contained, inline data)
- vocab.md: catch up with terms from 02-05

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Eric Furst 2026-05-04 10:18:10 -04:00
commit f7d2b48f5a
7 changed files with 534 additions and 23 deletions


@@ -59,20 +59,35 @@ source .venv/bin/activate
### Install the required packages
Each section has its own `requirements.txt` listing the libraries it needs.
**If you are using `uv` for the workshop** (recommended):
```bash
cd /path/to/llm-workshop
uv add $(cat 03-rag/requirements.txt)
```
`uv add` adds the packages to `pyproject.toml`, updates `uv.lock`, and installs them into `.venv/`.
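After it runs, the dependencies appear in `pyproject.toml` roughly as follows (an illustrative sketch; `uv` will also record version bounds, which are omitted here):
```
[project]
dependencies = [
    "llama-index-core",
    "llama-index-readers-file",
    "llama-index-llms-ollama",
    "llama-index-embeddings-huggingface",
    "python-dateutil",
]
```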
**If you have a plain venv activated:**
```bash
pip install -r requirements.txt
```
Either way, the relevant packages are:
```
llama-index-core
llama-index-readers-file
llama-index-llms-ollama
llama-index-embeddings-huggingface
python-dateutil
```
The `llama-index-*` packages are components of the [LlamaIndex](https://docs.llamaindex.ai/en/stable/) framework, which provides the plumbing for building RAG systems. `python-dateutil` is used by `clean_eml.py` for parsing email dates.
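To see how these packages fit together, here is a minimal sketch of a RAG pipeline (not the workshop's `build.py`; it assumes a `data/` directory of text files and a running Ollama server):
```python
# Minimal sketch: each installed package supplies one piece of the pipeline.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

Settings.llm = Ollama(model="command-r7b")  # local LLM served by Ollama
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-large-en-v1.5", cache_folder="./models"
)

docs = SimpleDirectoryReader("data").load_data()  # llama-index-readers-file
index = VectorStoreIndex.from_documents(docs)     # embed and index the corpus
print(index.as_query_engine().query("What are these documents about?"))
```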
### Pull the LLM
We will use the `command-r7b` model, which was fine-tuned for RAG tasks:
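Assuming the standard Ollama CLI, pulling it is a single command:
```bash
ollama pull command-r7b
```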
@@ -85,22 +100,18 @@ Other models work too — `llama3.1:8B`, `deepseek-r1:8B`, `gemma3:1b` — but `
### Cache the embedding model
**This step is required, not optional.** The embedding model `BAAI/bge-large-en-v1.5` (~1.3 GB) is downloaded from Hugging Face on first use. The `build.py` and `query.py` scripts run in *offline mode* (`HF_HUB_OFFLINE=1`) so that subsequent runs are fast and deterministic — but that means they cannot download the model on demand. If you skip this step, the scripts will fail with a `LocalEntryNotFoundError`.
```python
# Contents of cache_model.py: instantiating the embedding model
# downloads it into ./models/ if it is not already cached there.
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(
    cache_folder="./models",
    model_name="BAAI/bge-large-en-v1.5"
)
```
The snippet above is the included `cache_model.py` script. Run it first:
```bash
cd 03-rag
python cache_model.py
```
This populates `./models/` with the embedding model. Once it succeeds, `build.py` and `query.py` will run offline as intended.
If you ever need to refresh the model or switch to a different one, edit `cache_model.py` (or temporarily set `HF_HUB_OFFLINE=0` in your shell) and re-run.
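For example, to refresh the cached model in one shot (a sketch; this temporarily disables offline mode for a single run):
```bash
HF_HUB_OFFLINE=0 python cache_model.py
```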
## 2. The libraries we use
@@ -283,7 +294,7 @@ Our custom prompt in `query.py` is more detailed — it asks for structured outp
> **Exercise 7:** Bring your own documents. Find a collection of text files — research paper abstracts, class notes, or a downloaded text from Project Gutenberg — and build a RAG system over them. What questions can you answer that a plain LLM cannot?
> **Exercise 8 (optional, sets up Part IV):** Build a larger corpus. Ten emails is small enough that retrieval is barely selective. The system returns most of the corpus on every query. The script `fetch_arxiv.py` pulls 100 recent abstracts from a chosen arXiv category and writes one text file per abstract:
>
> ```bash
> python fetch_arxiv.py --category cs.LG --max 100 --output data_arxiv