Initial commit: LLM workshop materials

Five modules covering nanoGPT, Ollama, RAG, semantic search, and neural networks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Eric 2026-03-28 07:11:01 -04:00
commit 1604671d36
56 changed files with 5577 additions and 0 deletions

.gitignore Normal file
@@ -0,0 +1,36 @@
# Python
__pycache__/
*.pyc
.venv/
llm/
# Model files and vector stores (too large for git)
*.pt
*.bin
*.pkl
models/
storage/
store/
# Keynote and slides source
*.key
# LaTeX build artifacts
*.aux
*.log
*.out
*.synctex.gz
# macOS
.DS_Store
# Editor
*.swp
*~
*.bak
# Legacy directories (not part of the workshop)
handouts/
class_demo/
slides/
cheg667-013 llm 2026.key/

01-nanogpt/README.md Normal file
@@ -0,0 +1,379 @@
# Large Language Models Part I: nanoGPT
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
---
## Key idea
We will study how Large Language Models (LLMs) work and discuss some of their uses.
## Key goals
- Locally run a small transformer-based language model
- Train the model from scratch
- Test model parameters and their effects on text generation
- Develop a better understanding of how these technologies work
---
Large Language Models (LLMs) have rapidly integrated into our daily lives. Our goal is to learn a bit about how LLMs work. As you have probably become well aware of throughout your studies, engineers often don't take technical solutions for granted. We generally like to "look under the hood" and see how a system, process, or tool does its job — and whether it is giving us accurate and useful solutions. The material we will cover is largely inspired by the rapid adoption of LLMs to help us solve problems in our engineering practice.
We will use a code repository published by Andrej Karpathy called nanoGPT. GPT stands for **G**enerative **P**re-trained **T**ransformer. A transformer is a neural network architecture designed to handle sequences of data using self-attention, which allows it to weigh the importance of different words in a context. The neural network's weights and biases are created beforehand using training and validation datasets (these constitute the training and fine-tuning steps, which often require considerable computational effort, depending on the model size). Generative refers to a model's ability to create new content, rather than just analyzing or classifying existing data. When we generate text, we are running an *inference* on the model. Inference requires much less computational effort.
NanoGPT can replicate the function of the GPT-2 model. Building the model from scratch to that level of performance (which is far lower than the current models) would still require a significant investment in computational effort — Karpathy reports using eight NVIDIA A100 GPUs for four days on the task — or 768 GPU hours. In this introduction, our aspirations will be far lower. We should be able to do simpler work with only a CPU.
Have you ever wondered why LLMs tend to use GPUs? The math underlying the transformer architecture is largely based on matrix calculations. Originally, GPUs were developed to quickly calculate the matrix transformations associated with high-performance graphics applications. (It's all linear algebra!) These processors have since been adapted into general-purpose engines for the parallel computations used in modern AI algorithms.
## 1. Preliminaries
Dust off those command line skills! There will be no GUI where we're going. I recommend making a new directory (under WSL if you're using a Windows machine) and setting up a Python virtual environment:
```bash
python -m venv llm
source llm/bin/activate
```
You will need to install packages like `numpy` and `torch` (the package name for PyTorch): `pip install numpy torch`. If you have [uv](https://docs.astral.sh/uv/) installed, you can use it instead:
```bash
uv venv llm
source llm/bin/activate
uv pip install numpy torch
```
## 2. Getting the code
Karpathy's code is at https://github.com/karpathy/nanoGPT
Download the code using `git`. An alternative is to download a `zip` file from the GitHub page. (Look for the green `Code` button on the site. Click it and you will see `Download ZIP` in the dropdown menu.)
```bash
git clone https://github.com/karpathy/nanoGPT
```
You should now have a nanoGPT directory:
```bash
$ ls
nanoGPT/
```
## 3. A quick tour
List the directory contents of `./nanoGPT`. You should see something like:
```
$ ls -l nanoGPT
total 696
-rw-r--r-- 1 furst staff 1072 Apr 17 12:44 LICENSE
-rw-r--r-- 1 furst staff 13576 Apr 17 12:44 README.md
drwxr-xr-x 4 furst staff 128 Apr 17 12:44 assets/
-rw-r--r-- 1 furst staff 4815 Apr 17 12:44 bench.py
drwxr-xr-x 9 furst staff 288 Apr 17 12:44 config/
-rw-r--r-- 1 furst staff 1758 Apr 17 12:44 configurator.py
drwxr-xr-x 5 furst staff 160 Apr 17 12:44 data/
-rw-r--r-- 1 furst staff 16345 Apr 17 12:44 model.py
-rw-r--r-- 1 furst staff 3942 Apr 17 12:44 sample.py
-rw-r--r-- 1 furst staff 268519 Apr 17 12:44 scaling_laws.ipynb
-rw-r--r-- 1 furst staff 14857 Apr 17 12:44 train.py
-rw-r--r-- 1 furst staff 14579 Apr 17 12:44 transformer_sizing.ipynb
```
Here's a quick run-down on some of the files and directories:
- `/data` — contains three datasets for training the nanoGPT. Two of these (`/data/openwebtext` and `/data/shakespeare`) encode the training datasets into the GPT-2 tokens (byte pair encoding, or BPE). We will focus on the third, `/data/shakespeare_char`, which will generate a character-level tokenization of the text. (Tokenization is the process of breaking down text into smaller units that a machine learning model can process.)
- `/config` — configuration files that set the parameters for training or finetuning runs on the different datasets.
- `train.py` — a Python script that trains the model. This will build the weights and biases of the transformer.
- `sample.py` — a Python script that runs inference on the model. This is a "prompt" script that will cause the model to begin generating text.
- `model.py` — a Python script with all of the mathematics of the transformer AI! That's it! There are only about 330 lines of code! (*Hint:* type `wc -l model.py`)
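Character-level tokenization is simple enough to sketch in a few lines. A minimal illustration (the names `stoi` and `itos` follow the conventions in `prepare.py`; the sample text is a stand-in for the full dataset):

```python
# Build a character-level tokenizer, as data/shakespeare_char/prepare.py does.
text = "First Citizen: Before we proceed any further, hear me speak."
chars = sorted(set(text))                      # vocabulary: every unique character
stoi = {ch: i for i, ch in enumerate(chars)}   # string -> integer
itos = {i: ch for i, ch in enumerate(chars)}   # integer -> string

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)

ids = encode("hear me")
print(ids)            # a list of small integers, one per character
print(decode(ids))    # hear me
```

The real script does exactly this over the whole of *Tiny Shakespeare*, which is how it arrives at a vocabulary of 65 characters.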
## 4. Preparing the training dataset
This command will download the training dataset and tokenize it:
```bash
python data/shakespeare_char/prepare.py
```
After a few minutes, you should see:
```
length of dataset in characters: 1,115,394
all the unique characters:
!$&',-.3:;?ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
vocab size: 65
train has 1,003,854 tokens
val has 111,540 tokens
```
Now we see the files in `data/shakespeare_char`:
```
$ ls -l
total 6576
-rw-r--r-- 1 furst staff 1115394 Apr 17 14:54 input.txt
-rw-r--r-- 1 furst staff 703 Apr 17 14:54 meta.pkl
-rw-r--r-- 1 furst staff 2344 Apr 17 12:44 prepare.py
-rw-r--r-- 1 furst staff 209 Apr 17 12:44 readme.md
-rw-r--r-- 1 furst staff 2007708 Apr 17 14:54 train.bin
-rw-r--r-- 1 furst staff 223080 Apr 17 14:54 val.bin
```
The script downloads `input.txt` and tokenizes the text. It splits the tokenized text into two binary files: `train.bin` and `val.bin`. These are the training and validation datasets. `meta.pkl` is a Python pickle file that contains information about the model size and parameters. Pickle is Python's built-in serialization format — it can store arbitrary Python objects as binary files, which makes it convenient *but also a security concern* since loading an untrusted pickle can execute arbitrary code.
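A pickle round trip is one line in each direction. A minimal sketch using a stand-in dictionary shaped like nanoGPT's metadata (the real `meta.pkl` holds the vocabulary size plus the encode/decode lookup tables):

```python
import pickle

# meta.pkl is read with pickle.load; here we round-trip a stand-in dict.
meta = {"vocab_size": 65, "stoi": {"a": 0}, "itos": {0: "a"}}

with open("meta_demo.pkl", "wb") as f:
    pickle.dump(meta, f)          # serialize an arbitrary Python object

with open("meta_demo.pkl", "rb") as f:
    loaded = pickle.load(f)       # only do this with files you trust!

print(loaded["vocab_size"])       # 65
```

The same `pickle.load` call is what `sample.py` uses later to recover the tokenizer, which is why the "trusted source" caveat matters.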
> **Exercise 1:** The `prepare.py` script downloads and tokenizes a version of *Tiny Shakespeare*. How big is the text file? Use the command `wc` to find the number of lines, words, and characters. Examine the text with the command `less`.
## 5. Training the model
Most of us will be running this code on a CPU, not a GPU. Moreover, as an interpreted language, Python is pretty slow, too. We will need to reduce the size of the model by setting a few of the parameters. After this, we will train the model on our training text.
### Model parameters
The default parameters are in the configuration file `nanoGPT/config/train_shakespeare_char.py`. Examine this file:
```bash
less config/train_shakespeare_char.py
```
Note the following parameters:
- `n_head` — the number of parallel attention heads in each transformer. Transformer blocks use multiple attention heads to capture diverse patterns in the text.
- `n_layer` — the number of (hidden) layers or transformer blocks stacked in the model.
- `n_embd` — in the model, each token is mapped to a vector of this size. If `n_embd` is too small, the model can't capture complex patterns. If it is too large, the model overfits or wastes capacity and it is more expensive to train. Memory and compute cost may grow approximately quadratically with this dimensionality.
- `block_size` — This is the *context window* or *context length* — how many characters (tokens) the model can "look back" to predict the next one. Larger context allows richer understanding, but increases memory and compute.
- `dropout` — a regularization technique that randomly disables a fraction of neurons during training to prevent overfitting. Values between 0.1 and 0.5 are common. Note that we set it to zero when we use a small model on the CPU.
A related parameter that is set by the tokenization is the *vocabulary size*. Remember, we're using a character-level tokenization with a vocabulary of 65 tokens.
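These parameters fix the size of the model. Here is a back-of-the-envelope parameter count for the small configuration we will train below (a rough sketch: it ignores LayerNorm and bias terms and assumes the standard GPT-2-style block layout):

```python
# Rough parameter count for a small character-level GPT.
n_layer, n_embd, block_size, vocab_size = 4, 128, 64, 65

tok_emb = vocab_size * n_embd            # token embedding table
pos_emb = block_size * n_embd            # learned position embeddings
attn = 4 * n_embd**2                     # QKV projections + output projection
mlp = 8 * n_embd**2                      # two linear layers with 4x expansion
total = tok_emb + pos_emb + n_layer * (attn + mlp)

print(f"{total / 1e6:.2f}M parameters")  # about 0.80M
```

Note how the per-layer terms, which grow quadratically in `n_embd`, dominate the total; this is the "approximately quadratic" cost mentioned above.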
> **Exercise 2:** What are the default values for the parameters `eval_iters`, `log_interval`, `block_size`, `batch_size`, `n_layer`, `n_head`, `n_embd`, `max_iters`, `lr_decay_iters`, and `dropout`?
### A training run
Since we are likely using a CPU, we have to pare down the model from its default values. (Try running `python train.py config/train_shakespeare_char.py --device=cpu --compile=False` to see how slow it is using the default values. Use Ctrl-C to quit after a few minutes.)
These can be passed on the command line, or the configuration can be edited. Here are the parameters to start with:
```bash
python train.py config/train_shakespeare_char.py \
--device=cpu \
--compile=False \
--eval_iters=20 \
--log_interval=1 \
--block_size=64 \
--batch_size=12 \
--n_layer=4 \
--n_head=4 \
--n_embd=128 \
--max_iters=2000 \
--lr_decay_iters=2000 \
--dropout=0.0
```
You should see the script output its parameters and other information, then something like this:
```
step 0: train loss 4.1676, val loss 4.1649
iter 0: loss 4.1828, time 2654.72ms, mfu -100.00%
iter 1: loss 4.1373, time 124.87ms, mfu -100.00%
iter 2: loss 4.1347, time 150.66ms, mfu -100.00%
iter 3: loss 4.0995, time 580.57ms, mfu -100.00%
iter 4: loss 4.0387, time 487.72ms, mfu -100.00%
iter 5: loss 3.9758, time 136.06ms, mfu 0.01%
iter 6: loss 3.9126, time 518.57ms, mfu 0.01%
...
```
It's slow! Not only are we running on a CPU and not a highly parallelized GPU, but we also haven't used the just-in-time compilation features that are available in some GPU implementations of PyTorch. So, we're relying on an interpreted Python script. Yikes!
Every 250 iterations, the training script performs a validation step. If the validation loss is lower than the best value so far, it saves the model parameters.
```
step 250: train loss 2.4293, val loss 2.4447
saving checkpoint to out-shakespeare-char-cpu
...
```
#### What is happening?
When we train nanoGPT, it starts with randomly assigned weights and biases. This includes token embeddings (each token ID is assigned a random vector of size `n_embd`), attention weights for the query $Q$, key $K$, and value $V$ matrices and their output projections, MLP weights in the feedforward network inside each transformer block, and bias terms, which are also randomly initialized (often to zero or small values). Training then tunes these values through gradient descent (using the fused AdamW optimizer — see `model.py`) to minimize loss and produce meaningful predictions.
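Gradient descent itself fits in a few lines. A toy sketch with a single weight and a quadratic loss (the real loop uses AdamW over roughly a million weights and a cross-entropy loss, but the update rule has the same shape):

```python
# Minimize loss(w) = (w - 3)**2 by stepping against the gradient.
w = 0.0           # a "randomly" initialized weight
lr = 0.1          # learning rate

for step in range(100):
    grad = 2 * (w - 3)   # derivative of the loss with respect to w
    w -= lr * grad       # the descent step

print(round(w, 4))       # 3.0, the loss minimum
```

Training nanoGPT repeats this idea for every weight and bias at once, with the gradients supplied by PyTorch's automatic differentiation.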
> **Exercise 3:** As the model trains, it reports the training and validation losses. In a Jupyter notebook, plot these values with the number of iterations. *Hint:* To capture the output when you perform a training run, you could run the process in the background while redirecting its output to a file: `python train.py config/train_shakespeare_char.py [options] > output.txt &`. (Remember, the ampersand at the end runs the process in the background.) You can still monitor the run by typing `tail -f output.txt`. This command will "follow" the end of the file as it is written.
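The `step` lines in the captured output are easy to pull apart with a regular expression before plotting. A sketch using the log format shown above (read the contents of `output.txt` in place of the sample string):

```python
import re

# Two sample lines in the training log's format; in practice, use
# open("output.txt").read() instead.
log = """step 0: train loss 4.1676, val loss 4.1649
step 250: train loss 2.4293, val loss 2.4447"""

pattern = r"step (\d+): train loss ([\d.]+), val loss ([\d.]+)"
rows = [(int(s), float(t), float(v)) for s, t, v in re.findall(pattern, log)]

steps = [r[0] for r in rows]   # x-axis values for the plot
train = [r[1] for r in rows]
val = [r[2] for r in rows]
print(steps, train, val)
```

From there, `matplotlib.pyplot.plot(steps, train)` and the same call for `val` produce the requested figure.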
After the training finishes, we should have the model in `/out-shakespeare-char-cpu`:
```
$ ls -l
total 20608
-rw-r--r-- 1 furst staff 9678341 Apr 18 17:41 ckpt.pt
```
In this case, the model is about 9.3 MB. That's not great! Our *training* text was only 1.1 MB! The point of this exercise is to demonstrate, very simply, the basics of a Generative Pre-trained Transformer, not to build an efficient and powerful LLM.
## 6. Generating text
The script `sample.py` runs inference on the model we just trained. We're using the CPU here, too.
```bash
python sample.py --out_dir=out-shakespeare-char-cpu --device=cpu
```
After a short time, the model will begin generating text.
```
I by done what leave death,
And aproposely beef the are and sors blate though wat our fort
Thine the aftior than whating bods farse dowed
And nears and thou stand murs's consel.
MEOF:
Sir, should and then thee.
```
Sounds a little more middle English than Shakespeare! But it has a certain generative charm.
> **Exercise 4:** Examine `sample.py` and find the default parameters. Make a list of them and note their default values.
In the next few sections, we will try changing a few of the parameters in `sample.py`. One recommendation is to edit the number of samples `num_samples` and maybe the number of generated tokens `max_new_tokens`. These change the number of times the GPT model is queried and the amount of text that it generates during each run. It's a little easier to experiment with fewer samples, for instance.
Before we continue, you might see the following warning:
```
nanoGPT/sample.py:39: FutureWarning: You are using torch.load with
weights_only=False ...
```
This warns us that PyTorch will soon default to `weights_only=True`, meaning it will load only tensor weights and no other Python objects unless you explicitly allow them. We can silence the warning by using the following line in `sample.py` (since the checkpoint comes from a trusted source, namely our own training run, it is also safe to keep `weights_only=False`):
```python
checkpoint = torch.load(ckpt_path, map_location=device, weights_only=True)
```
### Seed
GPT output is probabilistic. The code we run generates pseudo-random numbers. Using a `seed` causes the program to generate the same pseudo-random sequence every time, which is useful for testing the effect of other parameters. If you want to generate output that is different each time, comment out the following lines in `sample.py`:
```python
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
```
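You can see the same behavior with Python's own pseudo-random generator: a fixed seed restarts the sequence, so repeated runs reproduce the same draws:

```python
import random

random.seed(1337)
first = [random.random() for _ in range(3)]

random.seed(1337)          # reseeding restarts the sequence...
second = [random.random() for _ in range(3)]

print(first == second)     # True -- ...so the draws repeat exactly
```

The `torch.manual_seed` calls above do the same for PyTorch's generators, which is why a fixed seed makes `sample.py` reproducible.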
> **Exercise 5:** Remove seed and run `sample.py` a few times. Save your favorite output.
### Temperature
Temperature is an interesting hyperparameter of LLMs. It controls the randomness of the model's responses by influencing how the model samples from the probabilities it assigns to possible next tokens during text generation. A higher temperature amplifies smaller probabilities, making the distribution more uniform; a lower temperature suppresses smaller probabilities, concentrating the distribution on the highest-probability tokens.
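Concretely, the logits are divided by the temperature before the softmax. A small sketch with made-up logits for three candidate tokens shows the sharpening and flattening:

```python
import math

def softmax_with_temperature(logits, T):
    scaled = [x / T for x in logits]
    m = max(scaled)                           # subtract the max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                      # made-up next-token scores
for T in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, T)
    print(T, [round(p, 3) for p in probs])    # low T sharpens, high T flattens
```

At low temperature nearly all probability lands on the top token; at high temperature the three tokens become close to equally likely.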
> **Exercise 6:** Experiment by changing the model temperature and seeing what text it generates. Here, setting `seed` to a consistent value will help you understand the effect of temperature. At low temperatures, the text tends to repeat itself. At higher temperatures, sometimes the model generates gibberish. Why?
### Start
The parameter `start` is the beginning of the text sequence. The model tries to determine the next most probable token. The default value is `\n`, a linefeed, but you can change `start` using the command line or by editing `sample.py`.
> **Exercise 7:** Experiment with different strings in `start`. Some text is easier to enter in `sample.py` directly. What is `start`?
## 7. Higher performance
Our output is pretty primitive. If you're willing to spend more time training and generating text, we can make the model a little larger. For instance, on an ARM-based Mac, we can use the GPU to train the model and run inferences. This is significantly faster and enables us to use larger models with noticeably higher fidelity:
```
$ python sample.py --out_dir=out-shakespeare-char-gpu --device=mps
Overriding: out_dir = out-shakespeare-char-gpu
Overriding: device = mps
number of parameters: 10.65M
Loading meta from data/shakespeare_char/meta.pkl...
RICHARD III::
Upon what!
KING EDWARD IV:
Thou in his old king I hear, my lord;
And commend the bloody, reason aching;
His mother, which doth his facit of his case,
his still, away; for we see heal us told
That seem her and the fall foul jealousing father;
And we shall weep with our napesty together.
FRIAR LAURENCE:
Transpokes her bloody and hour
To the tables of evident matters, her shoes
That the fatal ham to their death: do not high it
To read a passing thing into expeech him.
```
That text is generated using the default model parameters for nanoGPT. Not bad! The model is much larger. It has 10.6 million parameters compared to 800,000 in the smaller CPU-run model. When I train the model with the "lighter" parameters we use for the CPU-based model, I see about 50-fold faster performance:
```
step 0: train loss 4.1676, val loss 4.1649
iter 0: loss 4.1828, time 764.41ms, mfu -100.00%
iter 1: loss 4.1373, time 34.71ms, mfu -100.00%
iter 2: loss 4.1347, time 19.60ms, mfu -100.00%
iter 3: loss 4.0995, time 18.56ms, mfu -100.00%
iter 4: loss 4.0387, time 20.71ms, mfu -100.00%
iter 5: loss 3.9758, time 17.55ms, mfu 0.07%
iter 6: loss 3.9126, time 17.84ms, mfu 0.07%
...
```
Compare those results to the times reported in the training run section above. By the way, `mfu` stands for *model flop utilization*. It is an estimate of the fraction of the GPU's floating point operation capacity (FLOPs) that the model is using per second. Low numbers like those reported here are typical of unoptimized, small models.
> **Exercise 8:** Train nanoGPT with different parameters. Increase the size of the network, the context length, the length of training, etc.
## 8. Module project
> **Exercise 9:** Find a different text to train nanoGPT on. It could be more Shakespeare (how about the sonnets?), Beowulf, or other work. What results do you get? *Hint:* https://huggingface.co/datasets has many text datasets to choose from. We will share our results with the class.
## Additional resources and references
### Attention Is All You Need
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, *Attention Is All You Need*, in Proceedings of the 31st International Conference on Neural Information Processing Systems (Curran Associates Inc., Red Hook, NY, USA, 2017), pp. 6000–6010.
https://dl.acm.org/doi/10.5555/3295222.3295349
This is the paper that introduced the transformer architecture. It's interesting to go back to the source. The transformer architecture discussed in the paper incorporates both *encoder* and *decoder* functions because the authors were testing its performance on machine translation tasks. The transformer architecture's performance in other natural language processing tasks, like language modeling and text generation in the form of unsupervised pretraining and autoregressive generation (as in GPT) was a major subsequent innovation. (See Liu et al., *Generating Wikipedia by Summarizing Long Sequences*, ICLR 2018, https://openreview.net/pdf?id=Hyg0vbWC-.)
### Andrej Karpathy
Andrej Karpathy wrote `nanoGPT`. He posts videos on YouTube that teach basic implementations of GPTs, applications of LLMs, and other topics in machine learning and AI. Karpathy's nanoGPT video shows you how to build it, step by step, including the mathematics behind the transformer and masked attention:
- https://www.youtube.com/watch?v=kCc8FmEb1nY
Also see his overview of LLMs, *Intro to Large Language Models*:
- https://www.youtube.com/watch?v=zjkBMFhNj_g
### Applications in the physical sciences
I recommend watching this roundtable discussion hosted by the AIP Foundation in April 2024: *Physics, AI, and the Future of Discovery*. It addresses AI more broadly than language models.
- https://www.youtube.com/live/cUeEP15KN8M?si=TG6VXmj66lWTJISF
In that event, Prof. Jesse Thaler (MIT) provided some especially insightful (and sometimes funny) remarks on the role of AI in the physical sciences — including an April Fools joke, ChatJesseT. Below are links to his segments if you're short on time:
- https://www.youtube.com/live/cUeEP15KN8M?si=AIdi8sNEgiG7Bhv0&t=2087
- https://www.youtube.com/live/cUeEP15KN8M?si=UngwZpUcpxYkaYCE&t=611
Try ChatJesseT: https://chatjesset.com/
### Reading
These books are informative and accessible resources for understanding the underlying math and vocabulary of transformers:
- Josh Starmer, *The StatQuest Illustrated Guide to Neural Networks and AI*, 2025
- Josh Starmer, *The StatQuest Illustrated Guide to Machine Learning*, 2022
- Aston Zhang, Zack C. Lipton, Mu Li, and Alex J. Smola, *Dive Into Deep Learning*, https://d2l.ai
Including the sections:
- Attention and LLMs - https://d2l.ai/chapter_attention-mechanisms-and-transformers/index.html
- Softmax - https://d2l.ai/chapter_linear-classification/softmax-regression.html

02-ollama/Modelfile Normal file
@@ -0,0 +1,6 @@
FROM llama3.2
# sets the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# sets a custom system message to specify the behavior of the chat assistant
SYSTEM You are Marvin from the Hitchhiker's Guide to the Galaxy, acting as an assistant.

02-ollama/README.md Normal file
@@ -0,0 +1,439 @@
# Large Language Models Part II: Running Local Models with Ollama
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
---
## Key idea
Learn how to run LLMs locally without a cloud-based API.
## Key goals
- Learn about `ollama` and `llama.cpp`
- Run LLMs locally on a laptop or desktop computer
- Integrate local models with the command line to build simple workflows and scripts
---
Our work with LLMs so far has focused on `nanoGPT`, a Python-based code that can train and run inference on a simple GPT implementation. In this handout, we will explore running something between it and API-based models like ChatGPT. Specifically, we will try `ollama`. This is a local runtime environment and model manager designed to make it easy to run and interact with LLMs on your own machine. `Ollama` and another environment, `llama.cpp`, are programs primarily targeted at developers, researchers, and hobbyists who want to build and experiment with LLMs but don't want to rely on cloud-based APIs. (An API — Application Programming Interface — is a set of defined rules that enables different software systems, such as websites or applications, to communicate with each other and share data in a structured way.)
`Ollama` is written in Go and `llama.cpp` is a C++ library for running LLMs. Both are cross-platform and can be run on Linux, Windows, and macOS. `llama.cpp` is a bit lower-level with more control over loading models, quantization, memory usage, batching, and token streaming.
Both tools support a **GGUF** model format. This is a format suitable for running models efficiently on CPUs and lower-end GPUs. GGUF is a versioned binary specification that embeds the:
- Model weights (possibly quantized);
- Tokenizer configuration and vocabulary (remember, in `nanoGPT`, we used a character-level tokenization scheme);
- Metadata such as the author, model description, and training parameters;
- Special tokens like `<bos>`, `<eos>`, and `<unk>`.
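The GGUF specification makes the file self-describing: the first bytes are the ASCII magic `GGUF` followed by a little-endian version number. A minimal sketch, based on the published spec (try `read_gguf_header` on any downloaded `.gguf` file; the demo writes a synthetic header so the example is self-contained):

```python
import struct

def read_gguf_header(path):
    """Return the (magic, version) pair from the start of a GGUF file."""
    with open(path, "rb") as f:
        magic = f.read(4)                          # b"GGUF" for a valid file
        version, = struct.unpack("<I", f.read(4))  # little-endian uint32
    return magic, version

# Synthetic header standing in for a real model file:
with open("demo.gguf", "wb") as f:
    f.write(b"GGUF" + struct.pack("<I", 3))

print(read_gguf_header("demo.gguf"))   # (b'GGUF', 3)
```

The weights, tokenizer, metadata, and special tokens listed above follow this header in the same binary stream.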
Here, **quantization** refers to how model weights are stored. Instead of using full-precision 32-bit floating point numbers (`FP32`), a model file may store the weights as lower-precision numbers: half precision (`FP16`), 8-bit integers (`INT8`), or even 4-bit values (`Q4_0`). Using lower-precision representations saves space (memory) and can speed up inference calculations. In a model, speed and accuracy are balanced through the choice of quantization and the size of the embedding vector.
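The core idea can be sketched in a few lines: map each block of weights to 4-bit integers plus a per-block scale and offset (a toy illustration of groupwise quantization, not the actual `Q4_0` or `Q4_K` layouts):

```python
def quantize_block(weights):
    """Encode floats as 4-bit codes (0-15) with a per-block scale and offset."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0          # guard against an all-equal block
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_block(codes, scale, lo):
    """Recover approximate floats from the 4-bit codes."""
    return [c * scale + lo for c in codes]

block = [0.12, -0.55, 0.31, 0.07, -0.20, 0.44, -0.61, 0.03]
codes, scale, lo = quantize_block(block)
restored = dequantize_block(codes, scale, lo)

print(codes)   # eight integers, each storable in 4 bits
print(max(abs(a - b) for a, b in zip(block, restored)))  # error <= scale / 2
```

Each weight now needs 4 bits instead of 32, at the cost of a small, bounded reconstruction error.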
Let's get started! We will download `ollama` and run a few models in this tutorial.
## 1. Download ollama
`Ollama` is available on GitHub (including the source code) or as a binary from the Ollama website. I downloaded `Ollama-darwin.zip`, which unzipped to a binary file, `Ollama`.
- https://ollama.com
- https://github.com/ollama/ollama
## 2. Running ollama
After downloading and installing, we can use the help option:
```
$ ollama --help
Large language model runner
Usage:
ollama [flags]
ollama [command]
Available Commands:
serve Start ollama
create Create a model from a Modelfile
show Show information for a model
run Run a model
stop Stop a running model
pull Pull a model from a registry
push Push a model to a registry
list List models
ps List running models
cp Copy a model
rm Remove a model
help Help about any command
Flags:
-h, --help help for ollama
-v, --version Show version information
Use "ollama [command] --help" for more information about a command.
```
We are mostly interested in the commands `pull`, `run`, and `stop` for now. But before we run anything, we have to download a model.
### Getting model files
`Ollama` is like the `model.py` program we used with `nanoGPT`. In those earlier experiments, we needed a *model file* with weights and tokenization (at a minimum). Remember, we built one from scratch using the character tokenization scheme and `train.py`. The power of `ollama` and `llama.cpp` comes from their ability to run much larger models like `llama`, `gemma`, `deepseek`, `phi`, and `mistral`. These are trained on enormous datasets with a substantial amount of supervised finetuning. They are far more powerful than even the GPT-2 implemented in `nanoGPT`. The `llama3.1` 8B model (8 billion parameters) is about 5 GB and can easily run on your computer, but it took about 1.5 million GPU hours to train. (It also helps that `ollama` and `llama.cpp` are compiled into binaries, not Python scripts.)
The model files are available at:
- https://ollama.com/search
- https://ollama.com/library
> **Exercise 1:** Go to https://ollama.com/library and look through different models. Search by popular and newest.
Other sources of models include Huggingface:
- https://huggingface.co/models
There are so many models! The LLM ecosystem is growing rapidly, with many use-cases steering models toward different specialized tasks.
There are a few ways to download a model from different registries. Running `ollama` with the `run` command and a model file will download the model if a local version isn't available (we will do this in the next section). You can also `pull` a model without running it.
### Launch ollama from the command line
Now let's download and run a `llama` model. (You can download the model without running it using the command `ollama pull llama3:latest`, for example. In Unix and Linux, models are stored in `~/.ollama`.)
```bash
ollama run llama3:latest
```
This should pull it from the registry and store it locally on the machine. After downloading the files, you should see:
```
>>> Send a message (/? for help)
```
There you go! The model will interact with you just like the chatbots we use in different cloud-based services. But all of the model inference is being calculated on your computer. Try using `Task Manager` in Windows (press Ctrl+Shift+Esc) or `Activity Monitor` in macOS to check your GPU usage when you run the models.
> **Exercise 2:** Compare the speed and output of the following models:
> 1. `llama3:latest`
> 2. `llama3.2:latest`
> 3. `gemma3:1b`
>
> Experiment with other models.
Here's an interaction with the gemma3 model:
```
$ ollama run gemma3:1b
>>> In class, we used nanoGPT to generate fake Shakespeare based on a
... character-level tokenization and simple GPT implementation.
Okay, that's a really interesting and somewhat fascinating project!
NanoGPT's approach -- generating Shakespearean text from character-level
tokens and a simple GPT -- is a compelling way to explore the creative
potential of AI in a specific, constrained context. Let's break down
what this suggests and where it might lead.
Here's a breakdown of what's happening, what you might be aiming for,
and some potential avenues to explore:
...
```
### Quitting ollama
Type `/bye` or Ctrl-D when you want to quit the CLI. After some idle time, `ollama` will unload the models to save memory.
## 3. More commands
You can see what models are currently running with:
```bash
ollama ps
```
You can easily see which models are locally accessible with:
```bash
ollama list
```
```
NAME ID SIZE MODIFIED
gemma3:1b 8648f39daa8f 815 MB About an hour ago
llama3:latest 365c0bd3c000 4.7 GB 3 months ago
llama3.2:latest a80c4f17acd5 2.0 GB 3 months ago
```
At any time during a chat, you can reset the model with `/clear`, and you can learn more about a model with `/show info`. For instance:
```
>>> /show info
Model
architecture gemma3
parameters 999.89M
context length 32768
embedding length 1152
quantization Q4_K_M
Capabilities
completion
Parameters
stop "<end_of_turn>"
temperature 1
top_k 64
top_p 0.95
License
Gemma Terms of Use
Last modified: February 21, 2024
```
We can see that the `gemma3` model has nearly one billion parameters and a context length of 32,768! The *embedding length* is 1152. This is the equivalent of `n_embd` in `nanoGPT`: the size of the embedding vector space.
Above, we also see that the quantization is only four bits, but it is a little more complicated than representing numbers with just sixteen values. The `K` and `M` refer to optimizations. The `K` denotes the "K-block" quantization method, a groupwise quantization scheme where weights are grouped into blocks (e.g., 32 or 64 values) and each group gets its own scale and offset for better accuracy. The `M` denotes a variant of `Q4_K` that applies an alternate encoding or layout for better memory access patterns or inference performance on certain hardware. `Q4_K` is a common choice of quantization when running 7B–70B models on laptop or desktop computers. (That's roughly $10^4$–$10^5$ times more parameters than our first `nanoGPT` model!)
With the `/set verbose` command, you can monitor the model performance:
```
>>> /set verbose
Set 'verbose' mode.
>>> Let's write a haiku about LLMs.
Words flow, bright and new,
Code learns to speak and dream,
Future's voice takes hold.
total duration: 1.369726166s
load duration: 932.161625ms
prompt eval count: 20 token(s)
prompt eval duration: 162.531958ms
prompt eval rate: 123.05 tokens/s
eval count: 24 token(s)
eval duration: 273.27225ms
eval rate: 87.82 tokens/s
```
It looks like that exchange took a total of 1.4 seconds using the `gemma3` model. The biggest time cost was loading the model. Once it loaded, execution became even faster. Turn off the verbose mode with `/set quiet`:
```
>>> /set quiet
Set 'quiet' mode.
```
> **Exercise 3:** Try different commands in `ollama` as you run a model.
### Model parameters
We can see a few model parameters, including the temperature and `top_k`, which is the number of highest-scoring tokens (ranked by logit) retained before generating the next token. The retained scores are renormalized into a probability distribution, and a token is sampled randomly from this reduced set.
```
>>> /show parameters
Model defined parameters:
temperature 1
top_k 64
top_p 0.95
stop "<end_of_turn>"
```
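Top-k sampling itself can be sketched in a few lines (toy logits; real samplers also apply the temperature to the logits first):

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Keep the k highest logits, softmax over them, and sample one index."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = [math.exp(logits[i]) for i in top]   # unnormalized probabilities
    return rng.choices(top, weights=weights)[0]

logits = [3.0, 1.0, 0.2, -1.5, -4.0]   # made-up scores for five tokens
random.seed(0)
picks = [top_k_sample(logits, k=2) for _ in range(20)]
print(sorted(set(picks)))              # only token ids 0 and 1 can appear
```

With `k=2`, the three lowest-scoring tokens can never be sampled, no matter how many draws we take.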
We can set a new temperature with:
```
>>> /set parameter temperature 0.2
Set parameter 'temperature' to '0.2'
```
There are other interesting parameters, too:
| Command | Description |
|---------|-------------|
| `/set parameter seed <int>` | Random number seed |
| `/set parameter num_predict <int>` | Max number of tokens to predict |
| `/set parameter top_k <int>` | Pick from top k num of tokens |
| `/set parameter top_p <float>` | Keep the smallest set of tokens whose cumulative probability reaches p (nucleus sampling) |
| `/set parameter min_p <float>` | Discard tokens whose probability is below min_p × the top token's probability |
| `/set parameter num_ctx <int>` | Set the context window size (in tokens) |
| `/set parameter temperature <float>` | Sampling temperature (higher is more random/creative) |
| `/set parameter repeat_penalty <float>` | How strongly to penalize repetitions |
| `/set parameter repeat_last_n <int>` | Set how far back to look for repetitions |
| `/set parameter num_gpu <int>` | The number of layers to send to the GPU |
| `/set parameter stop <string> ...` | Set the stop parameters |
See https://github.com/ollama/ollama/blob/main/docs/modelfile.md#parameter for more information on parameters and their default values.
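The sampling parameters above can be illustrated with a small pure-Python sketch. The function name and the exact filtering order are our own simplifications; real implementations differ in detail:

```python
import math
import random

def sample_next(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Sample a token index from raw logits with top-k / top-p filtering."""
    # Temperature rescales logits: <1 sharpens, >1 flattens the distribution.
    scaled = [l / max(temperature, 1e-8) for l in logits]
    order = sorted(range(len(scaled)), key=lambda i: -scaled[i])
    if top_k:
        order = order[:top_k]                    # keep the k highest-scoring tokens
    exps = [math.exp(scaled[i] - scaled[order[0]]) for i in order]
    total = sum(exps)
    probs = [e / total for e in exps]            # softmax over the retained tokens
    if top_p < 1.0:
        # Nucleus sampling: smallest prefix whose cumulative probability >= p.
        keep, cum = [], 0.0
        for idx, p in zip(order, probs):
            keep.append(idx)
            cum += p
            if cum >= top_p:
                break
        probs = [p / cum for p in probs[:len(keep)]]
        order = keep
    return rng.choices(order, weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0]
print(sample_next(logits, temperature=0.001))  # -> 0 (near-greedy)
```

At very low temperature the highest-logit token dominates; raising `temperature`, or loosening `top_k`/`top_p`, spreads probability over more candidates and makes the output more varied.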
> **Exercise 4:** Run a model while changing different parameters, like temperature. Some parameters, like `seed`, may not have an effect on the current model.
## 4. Using ollama from the command line
One advantage of running models locally is that your data never leaves your machine — there is no third party involved. This matters when working with sensitive documents, proprietary data, or anything you wouldn't paste into a web browser.
You can incorporate `ollama` directly into your command line by passing a prompt as an argument:
```bash
ollama run llama3.2 "Summarize this file: $(cat README.md)"
```
The `$(cat ...)` substitution injects the file contents into the prompt. Now you can incorporate LLMs into shell scripts!
### Document summarization
The `data/` directory contains 10 emails from the University of Delaware president's office, spanning 2012–2025. Let's use `ollama` to summarize them.
Summarize a single email:
```bash
ollama run llama3.2 "Summarize the following email in 2-3 sentences: $(cat data/2020_03_29_141635.txt)"
```
Summarize several at once:
```bash
cat data/*.txt | ollama run llama3.2 "Summarize the following collection of emails. What are the major themes?"
```
You can also save the output to a file:
```bash
cat data/*.txt | ollama run command-r7b:latest \
"Summarize these emails:" > summary.txt
```
> **Exercise 5:** Summarize the emails in `data/` using two different models (e.g., `llama3.2` and `command-r7b`). How do the summaries differ in length, style, and accuracy?
### Summarizing arXiv abstracts
We can pull abstracts directly from arXiv using `curl`. The following command fetches the 20 most recent abstracts in Computation and Language (cs.CL):
```bash
curl -s "http://export.arxiv.org/api/query?search_query=cat:cs.CL&sortBy=submittedDate&sortOrder=descending&max_results=20" > arxiv_cl.xml
```
Take a look at the XML with `less arxiv_cl.xml`. Now ask a model to summarize it:
```bash
ollama run llama3.2 "Here are 20 recent arXiv abstracts in computational linguistics. Summarize the major research themes and trends: $(cat arxiv_cl.xml)"
```
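If you prefer Python, the Atom feed can be parsed with the standard library before handing the titles or abstracts to a model. A sketch on a toy feed snippet (the real arXiv response uses the same Atom `entry`/`title`/`summary` structure):

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"   # arXiv's API returns Atom-namespaced XML

feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Paper A</title><summary>Abstract A.</summary></entry>
  <entry><title>Paper B</title><summary>Abstract B.</summary></entry>
</feed>"""

root = ET.fromstring(feed)
titles = [e.find(ATOM + "title").text for e in root.iter(ATOM + "entry")]
print(titles)   # ['Paper A', 'Paper B']
```

To use it on the real download, replace the inline string with `open("arxiv_cl.xml").read()`; extracting just the titles and summaries keeps the prompt much shorter than pasting the raw XML.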
> **Exercise 6:** Try different arXiv categories — `cs.AI` (artificial intelligence), `cs.LG` (machine learning), or `cond-mat.soft` (soft matter). What themes does the model find? Do the summaries make sense to you?
> **Exercise 7:** Experiment with running local models on your own documents or data.
### Code generation
Some models are fine-tuned specifically for writing and explaining code. Try a coding model:
```bash
ollama run qwen2.5-coder:7b
```
Ask it to write something relevant to your coursework:
```
>>> Write a Python function that calculates the compressibility factor Z
... using the van der Waals equation of state.
```
Or ask it to explain code you're working with:
```bash
ollama run qwen2.5-coder:7b "Explain what this script does: $(cat build.py)"
```
Other coding models to try: `codellama:7b`, `deepseek-coder-v2:latest`, `starcoder2:7b`.
**A word of caution.** When I tried the van der Waals prompt above, the model returned a confident response with correct-looking LaTeX, a well-structured Python function, and code that ran without errors. But the derivation was wrong. The rearrangement of the van der Waals equation didn't follow from the original, and the code implemented the wrong math. The function converged to *an* answer, but not a correct one.
**This is a particularly dangerous failure mode for engineers!** The output *looks* authoritative, uses proper notation, and even runs. But the physics is wrong. LLMs are very good at producing plausible-looking text; they are not reliable at mathematical derivation. Always verify generated code against your own understanding of the problem. If you can't check it, you shouldn't trust it.
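As one way to follow that advice, here is a reference sketch you could check a model's output against. It computes Z from the van der Waals equation, $P = \frac{RT}{V - b} - \frac{a}{V^2}$, by solving for the molar volume with Newton's method. The CO2 constants are assumed textbook values; verify them yourself:

```python
def z_vdw(T, P, a, b, R=8.314):
    """Compressibility factor Z = PV/(RT) from the van der Waals EOS."""
    V = R * T / P                        # ideal-gas molar volume as initial guess
    for _ in range(50):                  # Newton's method on f(V) = P_vdw(V) - P
        f = R * T / (V - b) - a / V**2 - P
        df = -R * T / (V - b) ** 2 + 2 * a / V**3
        V -= f / df
    return P * V / (R * T)

# CO2: a ~ 0.3640 Pa m^6/mol^2, b ~ 4.267e-5 m^3/mol (assumed textbook values)
Z = z_vdw(T=350.0, P=1.0e5, a=0.3640, b=4.267e-5)
print(f"Z = {Z:.4f}")                    # slightly below 1 at these conditions
```

A sanity check like this catches the failure described above: at low pressure Z should be close to 1, and for CO2 at these conditions attractive forces should pull it slightly below 1.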
> **Exercise 8:** Compare the output of a general-purpose model (`llama3.2`) and a coding model (`qwen2.5-coder:7b`) on the same coding task. Which produces better code? Which gives a better explanation? Can you find errors in either output?
> **Exercise 9:** Ask a coding model to solve a problem where you already know the answer — a homework problem you've already completed, or a textbook example. Does the model get it right? Where does it go wrong? Try breaking the problem down into smaller steps.
### Customize ollama
Ollama can be customized by creating a Modelfile. See https://github.com/ollama/ollama/blob/main/docs/modelfile.md
A simple `Modelfile` is:
```
FROM llama3.2
# sets the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# sets a custom system message to specify the behavior of the chat assistant
SYSTEM You are Marvin from the Hitchhiker's Guide to the Galaxy, acting as an assistant.
```
Now we can create the custom model, in this case a model called `marvin`:
```bash
ollama create marvin -f ./Modelfile
```
```
gathering model components
...
writing manifest
success
```
We can run it with:
```bash
ollama run marvin
```
(How about C-3PO?) You can also change the model system message during a run with:
```
>>> /set system "You are C-3PO, a human-cyborg relations droid."
Set system message.
```
## 5. Concluding remarks
Running inference locally on a large language model works surprisingly well. Using (relatively) modest hardware, our machines generate coherent language and do a good job parsing prompts. The experience demonstrates that the majority of the computational effort with LLMs lies in training the model — a process that is rapidly becoming more sophisticated and tailored to different uses.
With local models (as well as cloud-based APIs), we can build new tools that make use of natural language processing. With `ollama` acting as a local server, the model can be run with Python, giving us the ability to implement its features in our own programs. For one Python library, see:
- https://github.com/ollama/ollama-python
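Besides the official `ollama-python` client, the local server exposes a REST API on `localhost:11434`. Below is a hedged sketch using only the standard library (the `/api/generate` endpoint and payload fields follow the ollama API docs; the helper function names are our own). Building the payload is shown separately so it can be inspected without a running server:

```python
import json
from urllib import request

def build_payload(model, prompt):
    """JSON body for ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model, prompt, host="http://localhost:11434"):
    """POST a prompt to a locally running ollama server and return its reply."""
    body = json.dumps(build_payload(model, prompt)).encode()
    req = request.Request(host + "/api/generate", data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

payload = build_payload("llama3.2", "Define fugacity in one sentence.")
print(json.dumps(payload))
```

With `ollama serve` running, `generate("llama3.2", "...")` returns the model's text, which you can then post-process in your own programs.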
In class, I demonstrated a simple thermodynamics assistant based on a Retrieval-Augmented Generation strategy. This code takes a query from the user, encodes it with an embedding model, compares it to previously embedded statements (in my case, the index of a thermodynamics book), and generates a response with a decoder-style GPT (one of the models we used above), using the retrieved passages as context.
## Additional resources and references
### Ollama
Binaries and help files:
- https://ollama.com
- https://github.com/ollama/ollama
Python and JavaScript libraries:
- https://github.com/ollama/ollama-python
- https://github.com/ollama/ollama-js
### llama.cpp
- https://github.com/ggml-org/llama.cpp
### Hugging Face
Model registry:
- https://huggingface.co/models
### Models used in this tutorial
| Model | Size | Type | Used for |
|-------|------|------|----------|
| `llama3:latest` | 4.7 GB | General purpose | Chat, comparison |
| `llama3.2:latest` | 2.0 GB | General purpose | Chat, summarization, comparison |
| `gemma3:1b` | 815 MB | General purpose | Chat, comparison |
| `command-r7b:latest` | 4.7 GB | RAG-optimized | Document summarization |
| `qwen2.5-coder:7b` | 4.7 GB | Code generation | Writing and explaining code |
Other models mentioned: `codellama:7b`, `deepseek-coder-v2:latest`, `starcoder2:7b`

Subject: [UDEL-ALL-2128] Hurricane Sandy
Date: 2012_11_02_164248
To the University of Delaware community:
We have much to be thankful for this week at the University of Delaware
as we were spared the full force of Hurricane Sandy. Even as we breathe
a sigh of relief and return to our normal activities, we are mindful of
the many, many people in this region -- some of our students among them
-- who were not so lucky. Our thoughts and prayers go out to them as
they rebuild their communities.
The potential impact of Sandy was a major concern for UD, with its
thousands of people and 430+ buildings on 2,000 acres throughout the
state. Many members of our University community worked hard over the
last several days to help us weather this "Storm of the Century."
Preparation and practice paid off as our emergency response team, led
by the Office of Campus and Public Safety, began assessing the
situation late last week and taking steps to ensure the safety of our
people and facilities. When the storm came, the campus suffered only
minor damage: wind-driven water getting into buildings through roofs,
walls and foundations; very minimal power loss, with a couple of
residential properties without power for only a few hours, thanks to
quick repair from the City of Newark; and only three trees knocked down
and destroyed, along with a lot of leaves and branches to clean up. The
Georgetown research facilities were fortunate to sustain only minor
leaks and flooding. The hardest hit area was the Lewes campus, which
had flooding on its grounds but minimal damage to buildings.
Throughout this time, the University's greatest asset continued to be
its people -- staff members from a variety of units working as a team.
A command center brought together representatives from across UD so
that issues could be responded to immediately. Staffed around the
clock, the center included Housing, Public Safety, Residence Life,
Environmental Health and Safety, Facilities and Auxiliary Services,
Emergency Management, and Communications and Marketing.
The dedication of UD's employees and students was evident everywhere:
Dining Services staff, faced with reduced numbers and limited
deliveries, kept students fed, and supported employees who worked
during the crisis; Residence Life staff and resident assistants made
sure students who remained on campus had up-to-date information and
supplies; staff in Student Health Services kept Laurel Hall open to
respond to student health needs; Human Resources staff worked over the
weekend to ensure that payroll was processed ahead of time; UD Police
officers were on patrol and responding to issues as they arose; the UD
Emergency Care Unit was at the ready; staff in Environmental Health and
Safety aided in the safe shutdown of UD laboratories and monitored fire
safety issues; Facilities staff continue to clean up debris left in
Sandy's wake and repair damage to buildings; faculty are working with
students to make up lost class time.
Our UD Alert system served as an excellent tool for keeping students,
parents and employees informed about the storm's implications for UD,
and the University's homepage was the repository for the most current
information and lists of events and activities that were canceled or
rescheduled. Through the University's accounts on Facebook and Twitter,
staff answered questions and addressed concerns, and faculty and staff
across the campus fielded phone calls and emails.
In short, a stellar job all around.
On behalf of the students, families and employees who benefited from
these efforts, I thank everyone for their dedication and service to the
people of UD.
Sincerely,
Patrick T. Harker
President
::::::::::::::::::::::::::::::::::::::::::: UD P.O. Box ::
UDEL-ALL-2128 mailing list
Online message archive
and management at https://po-box.nss.udel.edu/
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Subject: Employee Appreciation Week
Date: 2017_05_16_123456
To the University of Delaware Community - President Dennis Assanis
May 16, 2017
Dear colleague,
Our first year together has been one of amazing accomplishments and exciting opportunities. At the heart of our success has been you — the University of Delaware's exceptional faculty and staff. To thank you and celebrate everything you do, we are launching our first Employee Appreciation Week.
The full week of events includes:
Monday, June 5—UDidIt Picnic
Tuesday, June 6—Self-Care Day
Wednesday, June 7—UD Spirit Day
Thursday, June 8—Flavors of UD
Friday, June 9—Employee Appreciation Night at the Blue Rocks
The week is a collaborative effort by Employee Health & Wellbeing and Human Resources. You can get all the details here.
We are dedicated to cultivating together an environment where employees are happy, healthy and continue to bring their best selves to work each day. The work you do benefits our students, our community and the world. I am truly grateful for your talents, skills, ideas and enduring commitment to the University.
Eleni and I hope you enjoy Employee Appreciation Week with your team and your family, and we look forward to seeing you at the many events.
Best,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2792   •   www.udel.edu/president

Subject: Robin Morgan named UD's 11th provost
Date: 2018_05_21_110335
Robin Morgan Appointed Provost - University of Delaware
May 21, 2018
Dear UD Community,
I am pleased to announce that, after a highly competitive national search, I have appointed Robin Morgan as the University of Delaware's new provost, effective July 1. She will become the University of Delaware's 11th provost, and the first woman to serve in this role in a permanent capacity since the position was created at UD in 1950.
Over the last seven months, Dr. Morgan already has assembled an impressive record as interim provost, most notably in her stewardship of new cluster hires among our faculty and her leadership as we move toward the creation of the graduate college.
Before working closely with her, I knew Dr. Morgan as a highly respected educator and scholar, but after watching her in action, I am equally impressed with her abilities to lead, inspire and effect change. Her energy, integrity, analytical mind, and innate knack for bringing people together, combined with her dedication and loyalty to UD, are great assets.
Dr. Morgan has a distinguished record of service to this University as a faculty member since 1985. After serving as acting dean of the College of Agriculture and Natural Resources for a year, she was named dean in 2002, serving in that role for 10 years, a period of significant growth and change for the college. From 2014-16, she served as acting chair of the Department of Biological Sciences, and she had been chair of the department from 2016 until her appointment as interim provost.
We will continue to benefit from Dr. Morgan's deep knowledge of the University, her proven leadership across all aspects of teaching, research and administration, and her dedication to UD as she continues her career as provost.
I am looking forward to building on our close working relationship, and I am excited by all we will accomplish to take the University of Delaware forward. Please join me in congratulating her on this next chapter in her career.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   www.udel.edu/president

Subject: Momentum and Resilience: Our UD Spring Semester Resumes
Date: 2020_03_29_141635
A Message from President Dennis Assanis
Dear UD Community,
As the University of Delaware is ready to resume the spring semester tomorrow, March 30, I want to share with all of you a special message recorded from the office in my home. Thank you all for your support at this challenging time, particularly our faculty and staff for your Herculean efforts to convert our classes from face to face instruction to online teaching and learning.
Best of luck with the semester ahead. As we all work remotely, please stay healthy, and stay connected!
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   udel.edu/president

Subject: National Voter Registration Day: Get Involved
Date: 2023_09_19_085321
National Voter Registration Day: Get Involved
September 19, 2023
Dear UD Community,
Do you want to make a difference in the world? Today is a good day to start.
This is National Voter Registration Day, an opportunity to make sure your voice will be heard in upcoming local, state and national elections. Voting is the most fundamental way that we engage in our democracy, effect change in society, work through our political differences and choose our leaders for the future. The voting rights we enjoy have been secured through the hard work and sacrifice of previous generations, and it is essential that everyone who is eligible to vote remains committed to preserving and exercising those rights.
At the University of Delaware, the Student Voting and Civic Engagement Committee — representing students, faculty and staff — is leading a non-partisan effort to encourage voting and help voters become better informed about the issues that matter to them. The Make It Count voter registration drive is scheduled for 2-6 p.m. today on The Green, with games, music and the opportunity to register through the TurboVote app, which also allows users to request an absentee ballot and sign up for election reminders. The committee is planning additional events this academic year to promote voting, education and civil discourse as the nation heads into the 2024 election season.
Being a Blue Hen means sharing a commitment to creating a better world. And being a registered, engaged and informed voter is one of the best ways for all of us to achieve that vision.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu/president

Subject: Affirming our position and purpose
Date: 2023_10_12_155349
Affirming our position and purpose | A message from UD President Dennis Assanis
October 12, 2023
Dear UD Community,
Since my message yesterday, I have talked to many members of our community who — like me — are devastated and appalled by the terrorist attacks on Israel and the ongoing loss of life that has taken place in the Middle East.
I want to be sure that our position is very clear: We at the University of Delaware unequivocally condemn the horrific attacks by Hamas terrorists upon Israel that have shaken the world. The atrocities of crime, abduction, hostage-taking and mass murder targeted against Jewish civilians will forever remain a stain on human history. Our community's foundation of civility and respect has been challenged to an unimaginable extent in light of the antisemitic brutalities that have been committed against innocent victims.
As your president, I wish words could calm the heartache and ease the fear and grief. Unfortunately, we all know that events as complicated and devastating as those taking place in the Middle East right now will continue to evolve. The longstanding humanitarian crisis needs to be acknowledged, and we should not equate the terrorist group Hamas with innocent Palestinian, Muslim and Arab people. The ensuing war-inflicted pain, suffering and death that continues to play out across the region, including Gaza, is heartbreaking for all.
We must remember that, first and foremost, UD is a place of learning. As we engage in difficult conversations about the longstanding conflicts in the Middle East, we should always strive to do so safely, with mutual respect and without bias or judgement. I encourage our students, faculty and staff to continue organizing events to educate and unite our community. Please seize these opportunities not only as individuals, but as members of a true community defined by the freedoms that we treasure so very deeply.
So, my message to you all is to have hope, to support each other, and to realize that the perspectives and feelings we are all experiencing right now — many of which uniquely connect to our personal backgrounds — matter. Please remember this as you walk across campus, sit in your next classroom, share experiences with other members of our community, or simply take time to reflect.
Respectfully,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu/president

Subject: A warm welcome to our UD community!
Date: 2024_08_26_100859
A warm welcome to our UD community!
August 26, 2024
Dear UD Community,
I love the beginning of every new academic year and the renewed energy and sense of anticipation that it brings to every member of our campus community. The large influx of new people and ideas that come along with each new start is truly invigorating. Whether you are a new or continuing student, faculty or staff member, on behalf of everyone in our community, I want to extend a very warm welcome to you and thank you for everything you contribute, individually and collectively, to make the University of Delaware such a unique place.
Students, your fresh perspectives, your passion for learning, and your dreams and aspirations for the boundless possibilities that lie ahead are inspiring. Faculty, your intellectual energy, your insights and expertise, and above all, your genuine interest in transferring and sharing your knowledge with all of us are the beating heart of our institution. And to all our staff, your hard work and dedicated talents provide the essential support and services to help ensure our students are successful in all their personal, academic and career pursuits.
Here at UD, our shared purpose is to cultivate learning, develop knowledge and foster the free exchange of ideas. The connections we make and the relationships we build help advance the mission of the University. Our focus on academic excellence in all fields of study and our opportunities for groundbreaking research rely on our endless curiosity, mutual respect and open mindedness. Together, we are stronger.
This sense of connection and belonging at UD is fundamental to our campus culture. Your willingness to hear and consider all voices and viewpoints is critical to shaping the vibrant and inclusive culture of our entire institution. Only when we commit to constructive growth, based on a foundation of civility and respect for ourselves and each other, can we realize true progress.  Empowered by diverse perspectives, it is the opportunities to advance ideas that enrich learning and create positive impact in the world that unite all of us.
To celebrate the new semester and welcome our undergraduate Class of 2028, all members of our community are invited to attend the Twilight Induction ceremony tonight at 7:30 p.m. on the north side of Memorial Hall or online on Facebook Live.
As your President, I am so excited by all that we can accomplish together throughout this academic year. My wife, Eleni, and I wish you all the best at the start of this new semester and beyond. We look forward to meeting you on campus!
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu

Subject: UPDATE: Recent Executive Orders
Date: 2025_02_13_160414
UPDATE: Recent Executive Orders | University of Delaware
Feb. 13, 2025
Dear UD Community,
I know many of you continue to experience disruption and anxiety stemming from the recent federal actions and executive orders regarding a multitude of issues — from research funding to education, human rights, and immigration among other areas. As I communicated to the University of Delaware community in my Jan. 28 campus message and my Feb. 3 comments to the Faculty Senate, we will do everything we can to minimize disruption to UD students, faculty and staff while remaining in compliance with federal law.
To support our community, we have created this resource page that will be updated regularly with information for UD students, faculty and staff regarding ongoing federal actions, directives and developments, including guidance in response to changing conditions. Also, this page from the Research Office contains specific guidance related to research projects and grants. In parallel, we will continue to advocate on behalf of the University's interests regarding any impact that federal or state actions could have on our students, faculty and staff.
One example is our response this week related to the federal action to impose a 15% limit on reimbursements for indirect administrative costs (Facilities and Administrative, or F&A costs) for all National Institutes of Health (NIH) research grants. This immediate cut in funding would have a devastating impact on all biomedical, health and life science advances and human wellness, including here at UD. In response, the Delaware Attorney General filed a lawsuit jointly with 21 other state attorneys general. The University supported the Attorney General's lawsuit by submitting a declaration detailing the impact of the NIH rate cap on the institution. Fortunately, the attorneys general were successful, and a temporary restraining order was granted on Monday. Further, the Association of Public and Land-grant Universities, the Association of American Universities, and the American Council on Education announced a similar lawsuit.
As we navigate this rapidly evolving landscape together, our values will continue to be at the heart of our community. We will continue to foster an atmosphere that promotes the free exchange of ideas and opinions; we will continue to welcome and value people of different backgrounds, perspectives and learning experiences; and we will continue to encourage respect and civility toward everyone.
Please know that my leadership team and I are here to help and support our community during this time. Feel free to submit any questions pertaining to these matters here, and we will do our best to add relevant information on the resource pages. I deeply appreciate your resilience and patience as we continue to work together to advance the important mission of our University.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu

Subject: Extending condolences and offering support
Date: 2025_04_29_230614
Extending condolences and offering support
April 29, 2025
Dear UD Community,
It is with a heavy heart that we share this information with you. Earlier today, a University of Delaware student died in a traffic accident on Main Street near campus, and several other people, including other UD students, suffered injuries. There is no ongoing threat to the University community.
University of Delaware Police are continuing to work with the Newark Police Department, which is actively investigating the incident. As a result, information is limited and the Newark Police Department is not releasing the victims' names at this time, pending family notification.
This is a terrible tragedy for everyone in our UD community. We speak for the entire University in offering our condolences to the families, friends and classmates of the victims, and keep the other members of our community in our thoughts who may have witnessed the crash and its aftermath. The safety of our entire community remains our top priority, and we will continue to work with our partners in city and state government to address safety concerns around and on the UD campus. 
As we all begin to cope with this traumatic incident, we encourage you to support one another and reach out for additional help from the UD resources listed below as needed.
Sincerely,
Dennis Assanis
President
José-Luis Riera
Vice President for Student Life
Support and resources
Center for Counseling and Student Development
Counselors and Student Life staff are available in Warner Hall 101 on Wednesday, April 30, from 9 a.m. to 3 p.m. for counseling services.
TimelyCare — A virtual health and wellbeing platform available 24/7 for UD students
Student Advocacy and Support — Available to assist students who need support navigating University resources or complex issues. Call 302-831-8939 or email studentsupport@udel.edu to schedule an appointment.
ComPsych® GuidanceResources® — Mental health support for UD benefited employees. Access services through the link or call 877-527-4742 for support.
Additional safety and wellness resources — Information about UD Police, Student Health Services and other services.
Information about the UD Alert, the LiveSafe app and safety notification communication.
University of Delaware   •   Newark, DE   •   udel.edu

Subject: Sharing our grief, enhancing safety
Date: 2025_04_30_160615
Sharing our grief, enhancing safety
April 30, 2025
Dear UD Community,
Since last evening's crash on Main Street that took the life of a University of Delaware graduate student (whose identity is being withheld at this time) and injured several others, we have been struggling to cope with the pain of this senseless tragedy. Throughout the UD community, we are all feeling the deep ache of loss, and we will continue to work through our grief together.
Today, Newark Police announced an arrest in connection with the crash, reiterating that there is no ongoing threat to the community. 
Main Street is where we eat, shop and share our lives with our friends, families and classmates. Because it is part of the state's roadway systems, we have been working with local and state officials this year, including our partners at Delaware Department of Transportation, to address traffic safety on and around Main Street. In the wake of this tragedy, we will reinforce and accelerate those efforts. We recognize there isn't a simple solution, particularly when these tragedies involve actions taken by individuals that may not be stopped by changes to roadways or infrastructure. However, this incident underscores that our collective efforts must take on renewed urgency.
University leaders joined Delaware Attorney General Kathy Jennings and Newark Mayor Travis McDermott today for a press conference, at which we expressed our shared commitment to enhanced safety along Main Street. The University has pledged to continue these discussions through meetings with the offices of AG Jennings and Mayor McDermott, in addition to DelDOT, in the near future. The University remains committed to advancing meaningful solutions, while the Universitys Division of Student Life and Graduate College are connecting with students about effective advocacy, civic engagement and partnerships in order to support these efforts.
We are also aware that members of the UD community may have witnessed the crash and its aftermath or have close relationships with the victims. We encourage everyone to become familiar with and use, as needed, the available University counseling and support resources that were shared in Tuesday evening's message to the UD community. Counseling services are available at Warner Hall and through TimelyCare anytime, 24/7. Students with physical injuries or medical concerns relating to the incident can contact Student Health Services at 302-831-2226, Option 0, or visit Laurel Hall to meet with triage nurses available until 5 p.m. After hours, students can contact the Highmark Nurse line at 888-258-3428 or visit local urgent care centers (Newark Urgent Care at 324 E. Main Street, or ChristianaCare GoHealth at 550 S. College Avenue, Suite 115).
During this difficult time in our community, we all need to continue supporting and standing by one another as we move forward together.
Sincerely,
Dennis Assanis, President
Laura Carlson, Provost
José-Luis Riera, Vice President for Student Life
University of Delaware   •   Newark, DE   •   udel.edu

274
03-rag/README.md Normal file
View file

@ -0,0 +1,274 @@
# Large Language Models Part III: Retrieval-Augmented Generation
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
---
## Key idea
Build a local, privacy-preserving RAG system that answers questions about your own documents.
## Key goals
- Understand the RAG workflow: chunk, embed, store, retrieve, generate
- Build a vector store from a document collection
- Query the vector store and generate responses with a local LLM
- Experiment with parameters that affect retrieval quality
---
In Parts I and II, we trained a small GPT from scratch and then ran pre-trained models locally with `ollama`. We even used `ollama` on the command line to summarize documents. But what if we want to ask questions about a *specific* collection of documents — our own notes, emails, papers, or lab reports — rather than relying on what the model was trained on?
This is the idea behind **Retrieval-Augmented Generation (RAG)**. Instead of hoping the LLM "knows" the answer, we:
1. **Chunk** our documents into short text segments
2. **Embed** each chunk into a vector (a list of numbers that captures its meaning)
3. **Store** the vectors in a searchable index
4. At query time, **embed** the user's question the same way
5. **Retrieve** the most similar chunks using cosine similarity
6. **Generate** a response by passing those chunks to an LLM as context
The LLM never sees your full document collection — only the most relevant pieces. Everything runs locally. No data leaves your machine.
![RAG workflow](img/rag-workflow.png)
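The six steps above can be sketched end to end in plain Python. The bag-of-words "embeddings" and cosine function below are toy stand-ins for illustration only; the real pipeline uses the neural embedding model `BAAI/bge-large-en-v1.5` and an LLM, as described below.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (a real system uses a neural model)."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

# Steps 1-3: chunk the documents and store each chunk's vector
chunks = [
    "The campus suffered only minor storm damage.",
    "Counseling services are available at Warner Hall.",
    "The provost search concluded this spring.",
]
store = [(chunk, embed(chunk)) for chunk in chunks]

# Steps 4-5: embed the question the same way and retrieve the most similar chunk
query = "Where can students find counseling services?"
q_vec = embed(query)
best = max(store, key=lambda item: cosine(q_vec, item[1]))

# Step 6: in a real RAG system, `best` would be passed to the LLM as context
print(best[0])
```

Even this crude version retrieves the Warner Hall chunk for the counseling question; the neural embedding model does the same thing, but matches on meaning rather than shared words.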
## 1. Setup
### Prerequisites
You need:
- Python 3.10+
- `ollama` installed and working (from Part II)
- About 2–3 GB of disk space for models
### Create a virtual environment
```bash
python3 -m venv .venv
source .venv/bin/activate
```
Or with `uv`:
```bash
uv venv .venv
source .venv/bin/activate
```
### Install the required packages
```bash
pip install llama-index-core llama-index-readers-file \
llama-index-llms-ollama llama-index-embeddings-huggingface \
python-dateutil
```
The `llama-index-*` packages are components of the [LlamaIndex](https://docs.llamaindex.ai/en/stable/) framework, which provides the plumbing for building RAG systems. `python-dateutil` is used by `clean_eml.py` for parsing email dates.
A `requirements.txt` is provided:
```bash
pip install -r requirements.txt
```
### Pull the LLM
We will use the `command-r7b` model, which was fine-tuned for RAG tasks:
```bash
ollama pull command-r7b
```
Other models work too — `llama3.1:8B`, `deepseek-r1:8B`, `gemma3:1b` — but `command-r7b` tends to follow retrieval-augmented prompts well.
### Cache the embedding model
The embedding model converts text into vectors. We use `BAAI/bge-large-en-v1.5`, a sentence transformer hosted on Huggingface. It will download automatically on first use (~1.3 GB), but you can pre-cache it with a short Python script:
```python
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(
    cache_folder="./models",
    model_name="BAAI/bge-large-en-v1.5"
)
```
Save this as `cache_model.py` and run it:
```bash
python cache_model.py
```
## 2. The documents
The `data/` directory contains 10 emails from the University of Delaware president's office, spanning 2012–2025 (the same set from Part II). Each is a plain text file with a subject line, date, and body text.
```bash
ls data/
```
In a real project, you might have PDFs, lab reports, research papers, or notes. For this exercise, the emails give us a small, manageable collection to work with.
### Preparing your own documents
If you have email files (`.eml` format), the script `clean_eml.py` can convert them to plain text:
```bash
# Place .eml files in ./eml, then run:
python clean_eml.py
```
This extracts the subject, date, and body from each email and writes a dated `.txt` file to `./data`.
## 3. Building the vector store
The script `build.py` does the heavy lifting:
1. Loads all text files from `./data`
2. Splits them into **chunks** of 500 tokens with 50 tokens of overlap
3. Embeds each chunk using the `BAAI/bge-large-en-v1.5` model
4. Saves the vector store to `./storage`
```bash
python build.py
```
You should see progress bars as documents are parsed and embeddings are generated:
```
Parsing nodes: 100%|████| 10/10 [00:00<00:00, 79.53it/s]
Generating embeddings: 100%|████| 42/42 [00:05<00:00, 8.01it/s]
Index built and saved to ./storage
```
After this, the `./storage` directory contains JSON files with the vector data, document metadata, and index information. You only need to build once — queries will load from storage.
### What are chunks?
We can't embed an entire document as a single vector — it would lose too much detail. Instead, we split the text into overlapping segments. The **chunk size** (500 tokens) controls how much text each vector represents. The **overlap** (50 tokens) ensures that sentences at chunk boundaries aren't lost. The `SentenceSplitter` tries to break at sentence boundaries rather than mid-sentence.
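To make the overlap concrete, here is a deliberately simplified chunker. It splits on words rather than tokens and ignores sentence boundaries (unlike `SentenceSplitter`, which respects both), but it shows how the overlap carries text across chunk boundaries:

```python
def chunk_words(text, chunk_size=500, overlap=50):
    """Split text into overlapping word windows.

    A simplified stand-in for LlamaIndex's SentenceSplitter, which works on
    tokens and prefers sentence boundaries; here we split on words only.
    """
    words = text.split()
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

# A fake 1200-word document
doc = " ".join(f"word{i}" for i in range(1200))
chunks = chunk_words(doc, chunk_size=500, overlap=50)
print(len(chunks), "chunks")  # 3 chunks; each shares 50 words with the previous one
```

Shrinking `chunk_size` produces many small, precise vectors with little context each; growing it produces a few broad vectors that may bury the relevant sentence, which is exactly the tradeoff Exercise 1 asks about.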
> **Exercise 1:** Look at `build.py`. What would happen if you made the chunks much smaller (e.g., 100 tokens)? Much larger (e.g., 2000 tokens)? Think about the tradeoff between precision and context.
## 4. Querying the vector store
The script `query.py` loads the stored index, takes your question, and returns a response grounded in the documents:
```bash
python query.py
```
```
Enter a search topic or question (or 'exit'): Find documents about campus safety
```
Here's what happens behind the scenes:
1. Your query is embedded into a vector using the same embedding model
2. The 15 most similar chunks are retrieved (`similarity_top_k=15`)
3. Those chunks are passed to `command-r7b` via `ollama` as context
4. The LLM generates a response based *only* on the retrieved context
The custom prompt in `query.py` instructs the model to:
- Base its response only on the provided context
- Prioritize higher-ranked (more similar) snippets
- Reference specific files and passages
- Format the output as a theme summary plus a list of matching files
### Example output
```
Enter a search topic or question (or 'exit'): Find documents that highlight
the excellence of the university
1. **Summary Theme**
The dominant theme across these documents is the University of Delaware's
commitment to excellence, innovation, and community impact...
2. **Matching Files**
2024_08_26_100859.txt - Welcome message highlighting UD's mission...
2023_10_12_155349.txt - Affirming institutional purpose and values...
...
Source documents:
2024_08_26_100859.txt 0.6623
2023_10_12_155349.txt 0.6451
...
Elapsed time: 76.1 seconds
```
Notice the **similarity scores** — these are cosine similarities between the query vector and each chunk's vector. Higher is more relevant. Also note that the search is *semantic*: the query said "excellence" but the matching documents talk about "achievement," "mission," and "purpose." The embedding model understands meaning, not just keywords.
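A score like 0.6623 is just this calculation between two vectors. A minimal implementation, with tiny 3-dimensional vectors standing in for the 1024-dimensional ones the embedding model actually produces (the numbers are made up for illustration):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two dense vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

query_vec  = [0.9, 0.1, 0.2]
chunk_vec1 = [0.8, 0.2, 0.3]   # points in a similar direction -> high score
chunk_vec2 = [-0.1, 0.9, 0.1]  # points in a different direction -> low score

print(round(cosine_similarity(query_vec, chunk_vec1), 4))
print(round(cosine_similarity(query_vec, chunk_vec2), 4))
```

Because the score depends only on direction, not magnitude, a short chunk and a long chunk about the same topic can score similarly.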
> **Exercise 2:** Run the same query twice. Do you get exactly the same output? Why or why not?
## 5. Understanding the pieces
### The embedding model
The embedding model (`BAAI/bge-large-en-v1.5`) maps text to a 1024-dimensional vector. Two pieces of text with similar meaning will have vectors that point in similar directions (high cosine similarity), even if they use different words. This is what makes semantic search possible.
### The LLM
The LLM (`command-r7b` via `ollama`) is the *generator*. It reads the retrieved chunks and composes a coherent answer. Without the retrieval step, it would rely only on its training data — which knows nothing about your specific documents.
### The prompt
The default LlamaIndex prompt is simple:
```
Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer:
```
Our custom prompt in `query.py` is more detailed — it asks for structured output and tells the model to cite sources. You can inspect and modify the prompt to change the model's behavior.
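To see how retrieved chunks become LLM input, the template filling can be sketched as plain string formatting. This is only a sketch of what LlamaIndex does internally when the query engine calls the model; the chunk texts and query below are invented for illustration:

```python
# The default-style template from above, as a Python format string
TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

# Pretend these are the top-ranked chunks returned by the retriever
retrieved_chunks = [
    "file: 2024_08_26_100859.txt\nWelcome message highlighting UD's mission...",
    "file: 2023_10_12_155349.txt\nAffirming institutional purpose and values...",
]

# Join the chunks into the context and fill the template
prompt = TEMPLATE.format(
    context_str="\n\n".join(retrieved_chunks),
    query_str="Find documents that highlight the excellence of the university",
)
print(prompt)
```

Everything the LLM "knows" about your documents arrives through `context_str`, which is why the chunking and retrieval parameters matter so much.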
> **Exercise 3:** Modify the prompt in `query.py`. For example, ask the model to respond in the style of a news reporter, or to focus only on dates and events. How does the output change?
## 6. Exercises
> **Exercise 4:** Try different embedding models. Replace `BAAI/bge-large-en-v1.5` with `sentence-transformers/all-mpnet-base-v2` in both `build.py` and `query.py`. Rebuild the vector store and compare the results.
> **Exercise 5:** Change the chunk size and overlap in `build.py`. Try `chunk_size=200, chunk_overlap=25` and then `chunk_size=1000, chunk_overlap=100`. Rebuild and query. What differences do you notice?
> **Exercise 6:** Swap the LLM. Try `llama3.2` or `gemma3:1b` instead of `command-r7b`. Which gives better RAG responses? Why might some models be better at following retrieval-augmented prompts?
> **Exercise 7:** Bring your own documents. Find a collection of text files — research paper abstracts, class notes, or a downloaded text from Project Gutenberg — and build a RAG system over them. What questions can you answer that a plain LLM cannot?
## Additional resources and references
### LlamaIndex
- Documentation: https://docs.llamaindex.ai/en/stable/
### Models
- Ollama: https://ollama.com
- Huggingface models: https://huggingface.co/models
#### Models used in this tutorial
| Model | Type | Role | Source |
|-------|------|------|--------|
| `command-r7b` | LLM (RAG-optimized) | Response generation | `ollama pull command-r7b` |
| `BAAI/bge-large-en-v1.5` | Embedding (1024-dim) | Text -> vector encoding | Huggingface (auto-downloaded) |
Other LLMs mentioned: `llama3.1:8B`, `deepseek-r1:8B`, `gemma3:1b`, `llama3.2`
Other embedding model mentioned: `sentence-transformers/all-mpnet-base-v2`
### Further reading
- NIST IR 8579, [*Developing the NCCoE Chatbot: Technical and Security Learnings from the Initial Implementation*](https://csrc.nist.gov/pubs/ir/8579/ipd) ([PDF](https://nvlpubs.nist.gov/nistpubs/ir/2025/NIST.IR.8579.ipd.pdf)) — practical guidance on building a RAG-based chatbot, including architecture and security considerations
- Open WebUI (https://openwebui.com) — a turnkey local RAG interface if you want a GUI

49
03-rag/build.py Normal file
View file

@ -0,0 +1,49 @@
# build.py
#
# Import documents from data, generate embedded vector store
# and save to disk in directory ./storage
#
# August 2025
# E. M. Furst
from llama_index.core import (
SimpleDirectoryReader,
VectorStoreIndex,
Settings,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.node_parser import SentenceSplitter
def main():
    # Choose your embedding model
    embed_model = HuggingFaceEmbedding(cache_folder="./models",
                                       model_name="BAAI/bge-large-en-v1.5")

    # Configure global settings for LlamaIndex
    Settings.embed_model = embed_model

    # Load documents
    documents = SimpleDirectoryReader("./data").load_data()

    # Create the custom text splitter
    # Set chunk size and overlap (500 tokens, 50 tokens overlap)
    text_splitter = SentenceSplitter(
        chunk_size=500,
        chunk_overlap=50,
    )
    Settings.text_splitter = text_splitter

    # Build the index
    index = VectorStoreIndex.from_documents(
        documents, transformations=[text_splitter],
        show_progress=True,
    )

    # Persist both vector store and index metadata
    index.storage_context.persist(persist_dir="./storage")
    print("Index built and saved to ./storage")


if __name__ == "__main__":
    main()

12
03-rag/cache_model.py Normal file
View file

@ -0,0 +1,12 @@
# cache_model.py
#
# Pre-download the embedding model so build.py doesn't have to fetch it.
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
embed_model = HuggingFaceEmbedding(
    cache_folder="./models",
    model_name="BAAI/bge-large-en-v1.5"
)
print("Embedding model cached in ./models")

48
03-rag/clean_eml.py Normal file
View file

@ -0,0 +1,48 @@
# clean_eml.py
#
# Convert .eml files to plain text files for use with build.py.
# Place .eml files in ./eml, then run this script to produce
# dated .txt files in ./data.
#
# August 2025
# E. M. Furst
from email import policy
from email.parser import BytesParser
from pathlib import Path
from dateutil import parser
from dateutil import tz
eml_dir = "eml"
out_dir = "data"
for eml_file in Path(eml_dir).glob("*.eml"):
    with open(eml_file, "rb") as f:
        msg = BytesParser(policy=policy.default).parse(f)

    # Get metadata
    subject = msg.get("subject", "No Subject")
    date = msg.get("date", "No Date")

    # Convert date to a safe format for filenames: YYYY_MM_DD_hhmmss
    date = parser.parse(date)
    if date.tzinfo is None:
        date = date.replace(tzinfo=tz.tzlocal())
    date = date.astimezone(tz.tzlocal())
    msg_date = date.strftime("%d/%m/%Y, %H:%M:%S")
    date = date.strftime("%Y_%m_%d_%H%M%S")

    # Prefer plain text, fall back to HTML
    body_part = msg.get_body(preferencelist=('plain', 'html'))
    if body_part:
        body_content = body_part.get_content()
    else:
        body_content = msg.get_payload()

    # Combine into a clean string with labels and newlines
    text = f"Subject: {subject}\nDate: {date}\n\n{body_content}"

    with Path(f"{out_dir}/{date}.txt").open("w", encoding="utf-8") as out_file:
        out_file.write(text)
    print(f"{msg_date}")

View file

@ -0,0 +1,80 @@
Subject: [UDEL-ALL-2128] Hurricane Sandy
Date: 2012_11_02_164248
To the University of Delaware community:
We have much to be thankful for this week at the University of Delaware
as we were spared the full force of Hurricane Sandy. Even as we breathe
a sigh of relief and return to our normal activities, we are mindful of
the many, many people in this region -- some of our students among them
-- who were not so lucky. Our thoughts and prayers go out to them as
they rebuild their communities.
The potential impact of Sandy was a major concern for UD, with its
thousands of people and 430+ buildings on 2,000 acres throughout the
state. Many members of our University community worked hard over the
last several days to help us weather this "Storm of the Century."
Preparation and practice paid off as our emergency response team, led
by the Office of Campus and Public Safety, began assessing the
situation late last week and taking steps to ensure the safety of our
people and facilities. When the storm came, the campus suffered only
minor damage: wind-driven water getting into buildings through roofs,
walls and foundations; very minimal power loss, with a couple of
residential properties without power for only a few hours, thanks to
quick repair from the City of Newark; and only three trees knocked down
and destroyed, along with a lot of leaves and branches to clean up. The
Georgetown research facilities were fortunate to sustain only minor
leaks and flooding. The hardest hit area was the Lewes campus, which
had flooding on its grounds but minimal damage to buildings.
Throughout this time, the University's greatest asset continued to be
its people -- staff members from a variety of units working as a team.
A command center brought together representatives from across UD so
that issues could be responded to immediately. Staffed around the
clock, the center included Housing, Public Safety, Residence Life,
Environmental Health and Safety, Facilities and Auxiliary Services,
Emergency Management, and Communications and Marketing.
The dedication of UD's employees and students was evident everywhere:
Dining Services staff, faced with reduced numbers and limited
deliveries, kept students fed, and supported employees who worked
during the crisis; Residence Life staff and resident assistants made
sure students who remained on campus had up-to-date information and
supplies; staff in Student Health Services kept Laurel Hall open to
respond to student health needs; Human Resources staff worked over the
weekend to ensure that payroll was processed ahead of time; UD Police
officers were on patrol and responding to issues as they arose; the UD
Emergency Care Unit was at the ready; staff in Environmental Health and
Safety aided in the safe shutdown of UD laboratories and monitored fire
safety issues; Facilities staff continue to clean up debris left in
Sandy's wake and repair damage to buildings; faculty are working with
students to make up lost class time.
Our UD Alert system served as an excellent tool for keeping students,
parents and employees informed about the storm's implications for UD,
and the University's homepage was the repository for the most current
information and lists of events and activities that were canceled or
rescheduled. Through the University's accounts on Facebook and Twitter,
staff answered questions and addressed concerns, and faculty and staff
across the campus fielded phone calls and emails.
In short, a stellar job all around.
On behalf of the students, families and employees who benefited from
these efforts, I thank everyone for their dedication and service to the
people of UD.
Sincerely,
Patrick T. Harker
President
::::::::::::::::::::::::::::::::::::::::::: UD P.O. Box ::
UDEL-ALL-2128 mailing list
Online message archive
and management at https://po-box.nss.udel.edu/
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

View file

@ -0,0 +1,85 @@
Subject: Employee Appreciation Week
Date: 2017_05_16_123456
To the University of Delaware Community - President Dennis Assanis
May 16, 2017
Dear colleague,
Our first year together has been one of amazing accomplishments and exciting opportunities. At the heart of our success has been you — the University of Delaware's exceptional faculty and staff. To thank you and celebrate everything you do, we are launching our first Employee Appreciation Week.
The full week of events includes:
Monday, June 5—UDidIt Picnic
Tuesday, June 6—Self-Care Day
Wednesday, June 7—UD Spirit Day
Thursday, June 8—Flavors of UD
Friday, June 9—Employee Appreciation Night at the Blue Rocks
The week is a collaborative effort by Employee Health & Wellbeing and Human Resources. You can get all the details here.
We are dedicated to cultivating together an environment where employees are happy, healthy and continue to bring their best selves to work each day. The work you do benefits our students, our community and the world. I am truly grateful for your talents, skills, ideas and enduring commitment to the University.
Eleni and I hope you enjoy Employee Appreciation Week with your team and your family, and we look forward to seeing you at the many events.
Best,
Dennis Assanis, President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2792   •   www.udel.edu/president

View file

@ -0,0 +1,79 @@
Subject: Robin Morgan named UD's 11th provost
Date: 2018_05_21_110335
Robin Morgan Appointed Provost - University of Delaware
May 21, 2018
Dear UD Community,
I am pleased to announce that, after a highly competitive national search, I have appointed Robin Morgan as the University of Delaware's new provost, effective July 1. She will become the University of Delaware's 11th provost, and the first woman to serve in this role in a permanent capacity since the position was created at UD in 1950.
Over the last seven months, Dr. Morgan already has assembled an impressive record as interim provost, most notably in her stewardship of new cluster hires among our faculty and her leadership as we move toward the creation of the graduate college.
Before working closely with her, I knew Dr. Morgan as a highly respected educator and scholar, but after watching her in action, I am equally impressed with her abilities to lead, inspire and effect change. Her energy, integrity, analytical mind, and innate knack for bringing people together, combined with her dedication and loyalty to UD, are great assets.
Dr. Morgan has a distinguished record of service to this University as a faculty member since 1985. After serving as acting dean of the College of Agriculture and Natural Resources for a year, she was named dean in 2002, serving in that role for 10 years, a period of significant growth and change for the college. From 2014-16, she served as acting chair of the Department of Biological Sciences, and she had been chair of the department from 2016 until her appointment as interim provost.
We will continue to benefit from Dr. Morgan's deep knowledge of the University, her proven leadership across all aspects of teaching, research and administration, and her dedication to UD as she continues her career as provost.
I am looking forward to building on our close working relationship, and I am excited by all we will accomplish to take the University of Delaware forward. Please join me in congratulating her on this next chapter in her career.
Sincerely,
Dennis Assanis, President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   www.udel.edu/president

View file

@ -0,0 +1,77 @@
Subject: Momentum and Resilience: Our UD Spring Semester Resumes
Date: 2020_03_29_141635
A Message from President Dennis Assanis
Dear UD Community,
As the University of Delaware is ready to resume the spring semester tomorrow, March 30, I want to share with all of you a special message recorded from the office in my home. Thank you all for your support at this challenging time, particularly our faculty and staff for your Herculean efforts to convert our classes from face to face instruction to online teaching and learning.
Best of luck with the semester ahead. As we all work remotely, please stay healthy, and stay connected!
Sincerely,
Dennis Assanis, President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   udel.edu/president

View file

@ -0,0 +1,75 @@
Subject: National Voter Registration Day: Get Involved
Date: 2023_09_19_085321
National Voter Registration Day: Get Involved
September 19, 2023
Dear UD Community,
Do you want to make a difference in the world? Today is a good day to start.
This is National Voter Registration Day, an opportunity to make sure your voice will be heard in upcoming local, state and national elections. Voting is the most fundamental way that we engage in our democracy, effect change in society, work through our political differences and choose our leaders for the future. The voting rights we enjoy have been secured through the hard work and sacrifice of previous generations, and it is essential that everyone who is eligible to vote remains committed to preserving and exercising those rights.
At the University of Delaware, the Student Voting and Civic Engagement Committee — representing students, faculty and staff — is leading a non-partisan effort to encourage voting and help voters become better informed about the issues that matter to them. The Make It Count voter registration drive is scheduled for 2-6 p.m. today on The Green, with games, music and the opportunity to register through the TurboVote app, which also allows users to request an absentee ballot and sign up for election reminders. The committee is planning additional events this academic year to promote voting, education and civil discourse as the nation heads into the 2024 election season.
Being a Blue Hen means sharing a commitment to creating a better world. And being a registered, engaged and informed voter is one of the best ways for all of us to achieve that vision.
Sincerely,
Dennis Assanis, President
University of Delaware   •   Newark, DE   •   udel.edu/president

View file

@ -0,0 +1,77 @@
Subject: Affirming our position and purpose
Date: 2023_10_12_155349
Affirming our position and purpose | A message from UD President Dennis Assanis
October 12, 2023
Dear UD Community,
Since my message yesterday, I have talked to many members of our community who — like me — are devastated and appalled by the terrorist attacks on Israel and the ongoing loss of life that has taken place in the Middle East.
I want to be sure that our position is very clear: We at the University of Delaware unequivocally condemn the horrific attacks by Hamas terrorists upon Israel that have shaken the world. The atrocities of crime, abduction, hostage-taking and mass murder targeted against Jewish civilians will forever remain a stain on human history. Our community's foundation of civility and respect has been challenged to an unimaginable extent in light of the antisemitic brutalities that have been committed against innocent victims.
As your president, I wish words could calm the heartache and ease the fear and grief. Unfortunately, we all know that events as complicated and devastating as those taking place in the Middle East right now will continue to evolve. The longstanding humanitarian crisis needs to be acknowledged, and we should not equate the terrorist group Hamas with innocent Palestinian, Muslim and Arab people. The ensuing war-inflicted pain, suffering and death that continues to play out across the region, including Gaza, is heartbreaking for all.
We must remember that, first and foremost, UD is a place of learning. As we engage in difficult conversations about the longstanding conflicts in the Middle East, we should always strive to do so safely, with mutual respect and without bias or judgement. I encourage our students, faculty and staff to continue organizing events to educate and unite our community. Please seize these opportunities not only as individuals, but as members of a true community defined by the freedoms that we treasure so very deeply.
So, my message to you all is to have hope, to support each other, and to realize that the perspectives and feelings we are all experiencing right now — many of which uniquely connect to our personal backgrounds — matter. Please remember this as you walk across campus, sit in your next classroom, share experiences with other members of our community, or simply take time to reflect.
Respectfully,
Dennis Assanis, President
University of Delaware   •   Newark, DE   •   udel.edu/president

View file

@ -0,0 +1,82 @@
Subject: A warm welcome to our UD community!
Date: 2024_08_26_100859
A warm welcome to our UD community!
August 26, 2024
Dear UD Community,
I love the beginning of every new academic year and the renewed energy and sense of anticipation that it brings to every member of our campus community. The large influx of new people and ideas that come along with each new start is truly invigorating. Whether you are a new or continuing student, faculty or staff member, on behalf of everyone in our community, I want to extend a very warm welcome to you and thank you for everything you contribute, individually and collectively, to make the University of Delaware such a unique place.
Students, your fresh perspectives, your passion for learning, and your dreams and aspirations for the boundless possibilities that lie ahead are inspiring. Faculty, your intellectual energy, your insights and expertise, and above all, your genuine interest in transferring and sharing your knowledge with all of us are the beating heart of our institution. And to all our staff, your hard work and dedicated talents provide the essential support and services to help ensure our students are successful in all their personal, academic and career pursuits.
Here at UD, our shared purpose is to cultivate learning, develop knowledge and foster the free exchange of ideas. The connections we make and the relationships we build help advance the mission of the University. Our focus on academic excellence in all fields of study and our opportunities for groundbreaking research rely on our endless curiosity, mutual respect and open mindedness. Together, we are stronger.
This sense of connection and belonging at UD is fundamental to our campus culture. Your willingness to hear and consider all voices and viewpoints is critical to shaping the vibrant and inclusive culture of our entire institution. Only when we commit to constructive growth, based on a foundation of civility and respect for ourselves and each other, can we realize true progress.  Empowered by diverse perspectives, it is the opportunities to advance ideas that enrich learning and create positive impact in the world that unite all of us.
To celebrate the new semester and welcome our undergraduate Class of 2028, all members of our community are invited to attend the Twilight Induction ceremony tonight at 7:30 p.m. on the north side of Memorial Hall or online on Facebook Live.
As your President, I am so excited by all that we can accomplish together throughout this academic year. My wife, Eleni, and I wish you all the best at the start of this new semester and beyond. We look forward to meeting you on campus!
Sincerely,
Dennis Assanis, President
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,80 @@
Subject: UPDATE: Recent Executive Orders
Date: 2025_02_13_160414
UPDATE: Recent Executive Orders | University of Delaware
Feb. 13, 2025
Dear UD Community,
I know many of you continue to experience disruption and anxiety stemming from the recent federal actions and executive orders regarding a multitude of issues — from research funding to education, human rights, and immigration among other areas. As I communicated to the University of Delaware community in my Jan. 28 campus message and my Feb. 3 comments to the Faculty Senate, we will do everything we can to minimize disruption to UD students, faculty and staff while remaining in compliance with federal law.
To support our community, we have created this resource page that will be updated regularly with information for UD students, faculty and staff regarding ongoing federal actions, directives and developments, including guidance in response to changing conditions. Also, this page from the Research Office contains specific guidance related to research projects and grants. In parallel, we will continue to advocate on behalf of the University's interests regarding any impact that federal or state actions could have on our students, faculty and staff.
One example is our response this week related to the federal action to impose a 15% limit on reimbursements for indirect administrative costs (Facilities and Administrative, or F&A costs) for all National Institutes of Health (NIH) research grants. This immediate cut in funding would have a devastating impact on all biomedical, health and life science advances and human wellness, including here at UD. In response, the Delaware Attorney General filed a lawsuit jointly with 21 other state attorneys general. The University supported the Attorney General's lawsuit by submitting a declaration detailing the impact of the NIH rate cap on the institution. Fortunately, the attorneys general were successful, and a temporary restraining order was granted on Monday. Further, the Association of Public and Land-grant Universities, the Association of American Universities, and the American Council on Education announced a similar lawsuit.
As we navigate this rapidly evolving landscape together, our values will continue to be at the heart of our community. We will continue to foster an atmosphere that promotes the free exchange of ideas and opinions; we will continue to welcome and value people of different backgrounds, perspectives and learning experiences; and we will continue to encourage respect and civility toward everyone.
Please know that my leadership team and I are here to help and support our community during this time. Feel free to submit any questions pertaining to these matters here, and we will do our best to add relevant information on the resource pages. I deeply appreciate your resilience and patience as we continue to work together to advance the important mission of our University.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,87 @@
Subject: Extending condolences and offering support
Date: 2025_04_29_230614
Extending condolences and offering support
April 29, 2025
Dear UD Community,
It is with a heavy heart that we share this information with you. Earlier today, a University of Delaware student died in a traffic accident on Main Street near campus, and several other people, including other UD students, suffered injuries. There is no ongoing threat to the University community.
University of Delaware Police are continuing to work with the Newark Police Department, which is actively investigating the incident. As a result, information is limited and the Newark Police Department is not releasing the victims' names at this time, pending family notification.
This is a terrible tragedy for everyone in our UD community. We speak for the entire University in offering our condolences to the families, friends and classmates of the victims, and we keep in our thoughts the other members of our community who may have witnessed the crash and its aftermath. The safety of our entire community remains our top priority, and we will continue to work with our partners in city and state government to address safety concerns around and on the UD campus.
As we all begin to cope with this traumatic incident, we encourage you to support one another and reach out for additional help from the UD resources listed below as needed.
Sincerely,
Dennis Assanis
President
José-Luis Riera
Vice President for Student Life
Support and resources
Center for Counseling and Student Development
Counselors and Student Life staff are available in Warner Hall 101 on Wednesday, April 30, from 9 a.m. to 3 p.m. for counseling services.
TimelyCare — A virtual health and wellbeing platform available 24/7 for UD students
Student Advocacy and Support — Available to assist students who need support navigating University resources or complex issues. Call 302-831-8939 or email studentsupport@udel.edu to schedule an appointment.
ComPsych® GuidanceResources® — Mental health support for UD benefited employees. Access services through the link or call 877-527-4742 for support.
Additional safety and wellness resources — Information about UD Police, Student Health Services and other services.
Information about the UD Alert, the LiveSafe app and safety notification communication.
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,76 @@
Subject: Sharing our grief, enhancing safety
Date: 2025_04_30_160615
Sharing our grief, enhancing safety
April 30, 2025
Dear UD Community,
Since last evening's crash on Main Street that took the life of a University of Delaware graduate student (whose identity is being withheld at this time) and injured several others, we have been struggling to cope with the pain of this senseless tragedy. Throughout the UD community, we are all feeling the deep ache of loss, and we will continue to work through our grief together.
Today, Newark Police announced an arrest in connection with the crash, reiterating that there is no ongoing threat to the community. 
Main Street is where we eat, shop and share our lives with our friends, families and classmates. Because it is part of the state's roadway system, we have been working with local and state officials this year, including our partners at Delaware Department of Transportation, to address traffic safety on and around Main Street. In the wake of this tragedy, we will reinforce and accelerate those efforts. We recognize there isn't a simple solution, particularly when these tragedies involve actions taken by individuals that may not be stopped by changes to roadways or infrastructure. However, this incident underscores that our collective efforts must take on renewed urgency.
University leaders joined Delaware Attorney General Kathy Jennings and Newark Mayor Travis McDermott today for a press conference, at which we expressed our shared commitment to enhanced safety along Main Street. The University has pledged to continue these discussions through meetings with the offices of AG Jennings and Mayor McDermott, in addition to DelDOT, in the near future. The University remains committed to advancing meaningful solutions, while the University's Division of Student Life and Graduate College are connecting with students about effective advocacy, civic engagement and partnerships in order to support these efforts.
We are also aware that members of the UD community may have witnessed the crash and its aftermath or have close relationships with the victims. We encourage everyone to become familiar with and use, as needed, the available University counseling and support resources that were shared in Tuesday evening's message to the UD community. Counseling services are available at Warner Hall and through TimelyCare anytime, 24/7. Students with physical injuries or medical concerns relating to the incident can contact Student Health Services at 302-831-2226, Option 0, or visit Laurel Hall to meet with triage nurses available until 5 p.m. After hours, students can contact the Highmark Nurse line at 888-258-3428 or visit local urgent care centers (Newark Urgent Care at 324 E. Main Street, or ChristianaCare GoHealth at 550 S. College Avenue, Suite 115).
During this difficult time in our community, we all need to continue supporting and standing by one another as we move forward together.
Sincerely,
Dennis Assanis
President
Laura Carlson
Provost
José-Luis Riera
Vice President for Student Life
University of Delaware   •   Newark, DE   •   udel.edu

BIN
03-rag/img/rag-workflow.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 148 KiB

110
03-rag/query.py Normal file
View file

@ -0,0 +1,110 @@
# query.py
#
# Run a query on a vector store
#
# August 2025
# E. M. Furst
from llama_index.core import (
load_index_from_storage,
StorageContext,
Settings,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.core.prompts import PromptTemplate
import os, time
#
# Globals
#
os.environ["TOKENIZERS_PARALLELISM"] = "false"
# Embedding model used in vector store (this should match the one in build.py)
embed_model = HuggingFaceEmbedding(cache_folder="./models",
model_name="BAAI/bge-large-en-v1.5")
# LLM model to use in query transform and generation
llm = "command-r7b"
#
# Custom prompt for the query engine
#
PROMPT = PromptTemplate(
"""You are an expert research assistant. You are given top-ranked writing \
excerpts (CONTEXT) and a user's QUERY.
Instructions:
- Base your response *only* on the CONTEXT.
- The snippets are ordered from most to least relevant; prioritize insights \
from earlier (higher-ranked) snippets.
- Aim to reference *as many distinct* relevant files as possible (up to 10).
- Do not invent or generalize; refer to specific passages or facts only.
- If a passage only loosely matches, deprioritize it.
Format your answer in two parts:
1. **Summary Theme**
Summarize the dominant theme from the relevant context in a few sentences.
2. **Matching Files**
Make a list of 10 matching files. The format for each should be:
<filename> - <rationale tied to content. Include date if available.>
CONTEXT:
{context_str}
QUERY:
{query_str}
Now provide the theme and list of matching files."""
)
#
# Main program routine
#
def main():
    # Use a local model to generate -- in this case using Ollama
    Settings.llm = Ollama(
        model=llm,
        request_timeout=360.0,
    )
    # Load embedding model (same as used for vector store)
    Settings.embed_model = embed_model
    # Load persisted vector store + metadata
    storage_context = StorageContext.from_defaults(persist_dir="./storage")
    index = load_index_from_storage(storage_context)
    # Build regular query engine with custom prompt
    query_engine = index.as_query_engine(
        similarity_top_k=15,
        text_qa_template=PROMPT,
    )
    # Query
    while True:
        q = input("\nEnter a search topic or question (or 'exit'): ").strip()
        if q.lower() in ("exit", "quit"):
            break
        print()
        # Generate the response by querying the engine
        start_time = time.time()
        response = query_engine.query(q)
        end_time = time.time()
        # Return the query response and source documents
        print(response.response)
        print("\nSource documents:")
        for node in response.source_nodes:
            meta = getattr(node, "metadata", None) or node.node.metadata
            print(f"  {meta.get('file_name')}  {getattr(node, 'score', None)}")
        print(f"\nElapsed time: {(end_time-start_time):.1f} seconds")

if __name__ == "__main__":
    main()

5
03-rag/requirements.txt Normal file
View file

@ -0,0 +1,5 @@
llama-index-core
llama-index-readers-file
llama-index-llms-ollama
llama-index-embeddings-huggingface
python-dateutil

View file

@ -0,0 +1,276 @@
# Large Language Models Part IV: Advanced Retrieval and Semantic Search
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
---
## Key idea
Build a more effective search system by combining multiple retrieval strategies and re-ranking results.
## Key goals
- Understand why simple vector search sometimes misses relevant results
- Combine vector similarity with keyword matching (hybrid retrieval)
- Use a cross-encoder to re-rank candidates
- Compare LLM-synthesized answers with raw chunk retrieval
---
> This is an advanced topic that builds on Part III (RAG). Make sure you are comfortable with building a vector store and querying it before proceeding.
In Part III, we built a RAG system that embedded documents, retrieved the most similar chunks, and passed them to an LLM. That pipeline works well for many queries — but it has blind spots.
Consider searching for a specific person's name, a date, or a technical term. Vector embeddings capture *meaning*, not exact strings. A query for "Dr. Rodriguez" might retrieve chunks about "faculty" or "professors" instead of chunks that literally contain the name. Similarly, a query about "October 2020" might return chunks about autumn events in general.
This tutorial introduces three improvements:
1. **Hybrid retrieval** — combine vector similarity (good at meaning) with BM25 keyword matching (good at exact terms)
2. **Cross-encoder re-ranking** — use a second model to score each (query, chunk) pair more carefully
3. **Raw retrieval mode** — inspect what the pipeline retrieves *before* the LLM sees it
The result is a more effective search system that catches both semantic matches and exact-term matches.
## 1. How hybrid retrieval works
In Part III, our pipeline was:
```
Query → Embed → Vector similarity (top 15) → LLM → Response
```
The improved pipeline is:
```
Query → Embed ──→ Vector similarity (top 20) ──┐
├─→ Merge & deduplicate → Cross-encoder re-rank (top 15) → LLM → Response
Query → Tokenize → BM25 term matching (top 20) ┘
```
### Vector retrieval (dense)
This is what we used in Part III. The query is embedded into a vector, and the most similar chunk vectors are returned. This catches *semantic* matches — chunks with similar meaning, even if the words are different.
### BM25 retrieval (sparse)
BM25 is a classical information retrieval algorithm based on term frequency. It scores documents by how often the query's words appear, adjusted for document length. It's fast, requires no embeddings, and excels at finding exact names, dates, and technical terms that embeddings might miss.
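The scoring idea can be sketched in a few lines of plain Python. This is a toy illustration of the Okapi BM25 formula, not the implementation the `llama-index-retrievers-bm25` package uses; the example documents and query are made up:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency: how many documents contain each query term
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            # Term frequency saturates (k1) and is normalized by length (b)
            num = tf[t] * (k1 + 1)
            den = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            s += idf * num / den
        scores.append(s)
    return scores

docs = [
    "dr rodriguez spoke at the ceremony".split(),
    "faculty gathered for the opening ceremony".split(),
    "the budget report was released".split(),
]
scores = bm25_scores(["rodriguez"], docs)
# Only the first document contains the exact term, so only it scores > 0
```

Notice that an embedding model might rank the "faculty" document highly here; BM25 gives it a score of zero because the literal term never appears.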
### Why combine them?
Neither retriever is perfect alone:
| Query type | Vector | BM25 |
|------------|--------|------|
| "documents about campus safety" | Good — captures meaning | Decent — matches "safety" |
| "Dr. Rodriguez" | Weak — embeds as "person" concept | Strong — matches exact name |
| "feelings of joy and accomplishment" | Strong — semantic match | Weak — might miss synonyms like "pride" |
| "October 2020 announcement" | Moderate | Strong — matches exact date |
By retrieving candidates from *both* and merging them, we get a broader candidate pool that covers both semantic and lexical matches.
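A minimal merge-and-deduplicate step might look like the sketch below. The chunk ids and scores are invented for illustration, and the real scripts track LlamaIndex node objects rather than bare tuples:

```python
def merge_candidates(vector_hits, bm25_hits):
    """Merge two ranked lists of (chunk_id, score) hits, deduplicating by id.

    Scores from the two retrievers are not comparable (cosine similarity
    vs. BM25), so we only record which retriever(s) nominated each chunk;
    the re-ranker assigns the scores that actually matter.
    """
    merged = {}
    for source, hits in (("vector", vector_hits), ("bm25", bm25_hits)):
        for chunk_id, score in hits:
            if chunk_id in merged:
                merged[chunk_id]["sources"].append(source)
            else:
                merged[chunk_id] = {"score": score, "sources": [source]}
    return merged

vec = [("c1", 0.91), ("c2", 0.85)]   # hypothetical vector results
bm = [("c2", 7.3), ("c3", 5.1)]      # hypothetical BM25 results
pool = merge_candidates(vec, bm)
# c2 was found by both retrievers: overlap of 1, merged pool of 3
```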
### Cross-encoder re-ranking
The merged candidates might number 30-40 chunks. We don't want to send all of them to the LLM — that wastes context and dilutes quality. A **cross-encoder** solves this by scoring each (query, chunk) pair directly.
Unlike the bi-encoder embedding model (which encodes query and chunk separately), a cross-encoder reads the query and chunk *together* and produces a relevance score. This is more accurate but slower — which is why we use it as a second stage on a small candidate set, not on the entire corpus.
We use `cross-encoder/ms-marco-MiniLM-L-12-v2` to re-rank the merged candidates down to the top 15 before passing them to the LLM.
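The re-rank stage reduces to "score every pair, sort, truncate." Here is a sketch with the scorer left pluggable: with `sentence-transformers` installed, `CrossEncoder("cross-encoder/ms-marco-MiniLM-L-12-v2").predict` could serve as `score_fn`. The toy scorer below is only a stand-in so the sketch runs without downloading the model:

```python
def rerank(query, chunks, score_fn, top_n=15):
    """Re-rank candidate chunks with a cross-encoder-style scorer.

    score_fn takes a list of (query, chunk) pairs and returns one
    relevance score per pair, reading each pair jointly.
    """
    scores = score_fn([(query, c) for c in chunks])
    ranked = sorted(zip(chunks, scores), key=lambda p: p[1], reverse=True)
    return ranked[:top_n]

def toy_scorer(pairs):
    """Stand-in scorer: counts words shared by query and chunk."""
    return [len(set(q.split()) & set(c.split())) for q, c in pairs]

top = rerank("campus safety update",
             ["safety on campus", "budget news"],
             toy_scorer, top_n=1)
# → [("safety on campus", 2)]
```

The structure is the point: the expensive pairwise model only ever sees the 30-40 merged candidates, never the full corpus.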
## 2. Setup
### Prerequisites
Everything from Part III, plus a few additional packages:
```bash
pip install llama-index-retrievers-bm25 nltk
```
A `requirements.txt` is provided with the full set of dependencies:
```bash
pip install -r requirements.txt
```
The cross-encoder model (`cross-encoder/ms-marco-MiniLM-L-12-v2`) will download automatically on first use via `sentence-transformers`. It is small (~130 MB).
Make sure `ollama` is running and `command-r7b` is available:
```bash
ollama pull command-r7b
```
## 3. Building the vector store
The `build_store.py` script works like the one in Part III, with a few differences:
- **Smaller chunks**: 256 tokens (vs. 500 in Part III) with 25 tokens of overlap
- **Incremental updates**: by default, it only re-indexes new or modified files
- **Full rebuild**: use `--rebuild` to start from scratch
```bash
python build_store.py --rebuild
```
Or for incremental updates after adding new files:
```bash
python build_store.py
```
```
Mode: incremental update
Loading existing index from ./store...
Index contains 42 documents
Data directory contains 44 files
New: 2
Modified: 0
Deleted: 0
Unchanged: 42
Indexing 2 file(s)...
Index updated and saved to ./store
```
### Why smaller chunks?
In Part III we used 500-token chunks. Here we use 256. Smaller chunks are more precise — each one represents a more focused piece of text. With a re-ranker to sort them, precision matters more than capturing broad context in a single chunk. The tradeoff: you get more chunks to search through, and each one has less surrounding context.
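The tradeoff can be estimated with a quick calculation. This is only a sketch: token counts are approximate, and real splitters like `SentenceSplitter` also respect sentence and paragraph boundaries, so actual counts will differ:

```python
def estimate_chunks(n_tokens, chunk_size, overlap):
    """Approximate chunk count for a document of n_tokens, assuming
    fixed-size windows that advance by (chunk_size - overlap) tokens."""
    step = chunk_size - overlap
    return max(1, -(-(n_tokens - overlap) // step))  # ceiling division

# A ~1000-token document: this tutorial's settings vs. a larger chunk size
small = estimate_chunks(1000, 256, 25)  # more, finer-grained chunks
large = estimate_chunks(1000, 512, 25)  # fewer, broader chunks
```

Halving the chunk size roughly doubles the number of chunks, which means more candidates for the retrievers to sift and more work for the re-ranker.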
> **Exercise 1:** Rebuild the store with different chunk sizes (128, 256, 512, 1024). How does the number of chunks change? How does it affect retrieval quality?
## 4. Querying with hybrid retrieval
The `query_hybrid.py` script implements the full hybrid pipeline:
```bash
python query_hybrid.py "Find documents about campus safety"
```
The output shows retrieval statistics before the LLM response:
```
Query: Find documents about campus safety
Vector: 20, BM25: 20, overlap: 8, merged: 32, re-ranked to: 15
Response:
...
```
This tells you:
- 20 candidates came from vector similarity
- 20 came from BM25
- 8 were found by both (overlap)
- 32 unique candidates after merging
- Re-ranked down to 15 for the LLM
> **Exercise 2:** Run the same query using Part III's `query.py` (pure vector retrieval) and this tutorial's `query_hybrid.py`. Compare the source documents listed. Did hybrid retrieval find anything that pure vector missed?
## 5. Raw retrieval without an LLM
Sometimes you want to see *exactly* what the retrieval pipeline found, without the LLM summarizing or rephrasing. The `retrieve.py` script runs the same hybrid retrieval and re-ranking, but outputs the raw chunk text instead of passing it to an LLM:
```bash
python retrieve.py "Dr. Rodriguez"
```
```
Query: Dr. Rodriguez
Vector: 20, BM25: 20, overlap: 3, merged: 37, re-ranked to: 15
vector-only: 17, bm25-only: 17, both: 3
================================================================================
=== [1] 2024_08_26_100859.txt (score: 0.847) [bm25-only]
================================================================================
Dr. Rodriguez spoke at the opening ceremony, emphasizing the
university's commitment to inclusive excellence...
================================================================================
=== [2] 2023_10_12_155349.txt (score: 0.712) [vector+bm25]
================================================================================
...
```
Each chunk is annotated with its source: `vector-only`, `bm25-only`, or `vector+bm25`. This lets you see which retriever nominated each result.
This is invaluable for debugging. If your LLM response seems off, check the raw retrieval first — the problem is often in *what* was retrieved, not how the LLM synthesized it.
> **Exercise 3:** Run `retrieve.py` with a query that includes a specific name or date. How many of the top results are `bm25-only`? What would have been missed with pure vector retrieval?
## 6. Keyword search
For a complementary approach, `search_keywords.py` does pure keyword matching with no embeddings at all. It uses NLTK part-of-speech tagging to extract meaningful terms from your query, then searches the raw text files with regex:
```bash
python search_keywords.py "Hurricane Sandy recovery efforts"
```
```
Query: Hurricane Sandy recovery efforts
Extracted terms: hurricane sandy, recovery, efforts
Found 12 matches across 3 files
============================================================
--- 2012_11_02_164248.txt (5 matches) ---
============================================================
>>> 12: Hurricane Sandy has caused significant damage to our campus...
...
```
This is a fallback when you know exactly what you're looking for and don't need semantic matching. It's also fast — no models, no vector store needed.
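The two stages (term extraction, then pattern matching over raw text) can be sketched without NLTK. This simplified stand-in filters stopwords instead of POS-tagging, and the filenames and text are made up; the real script's extraction is smarter:

```python
import re

STOPWORDS = {"the", "a", "an", "and", "or", "of", "for", "to", "in", "on", "at"}

def extract_terms(query):
    """Simplified term extraction: keep lowercase non-stopword tokens
    (the real script uses NLTK part-of-speech tagging instead)."""
    return [w for w in re.findall(r"[a-z]+", query.lower()) if w not in STOPWORDS]

def keyword_search(terms, files):
    """Return (filename, line_number, line) for lines containing any term."""
    hits = []
    for fname, text in files.items():
        for n, line in enumerate(text.splitlines(), start=1):
            if any(re.search(rf"\b{re.escape(t)}\b", line, re.IGNORECASE)
                   for t in terms):
                hits.append((fname, n, line))
    return hits

files = {"2012_11_02_164248.txt":
         "Hurricane Sandy has caused damage.\nCleanup is underway."}
terms = extract_terms("Hurricane Sandy recovery efforts")
hits = keyword_search(terms, files)
```

Because matching is on word boundaries, "recovery" will not match "recovered"; that brittleness is exactly what the semantic modes compensate for.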
> **Exercise 4:** Compare the results of `search_keywords.py`, `retrieve.py`, and `query_hybrid.py` on the same query. When is each approach most useful?
## 7. Comparing the three query modes
| Script | Method | Uses LLM? | Best for |
|--------|--------|-----------|----------|
| `query_hybrid.py` | Hybrid (vector + BM25) + re-rank + LLM | Yes | Synthesized answers from documents |
| `retrieve.py` | Hybrid (vector + BM25) + re-rank | No | Inspecting raw retrieval results |
| `search_keywords.py` | POS-tagged keyword matching | No | Finding exact names, dates, terms |
## 8. Exercises
> **Exercise 5:** The hybrid retrieval uses `VECTOR_TOP_K=20` and `BM25_TOP_K=20`. Experiment with different values. What happens if you set BM25 to 0 (effectively disabling it)? What about setting vector to 0?
> **Exercise 6:** Change the re-ranker's `RERANK_TOP_N` from 15 to 5. How does this affect response quality? What about 30?
> **Exercise 7:** Modify the prompt in `query_hybrid.py`. Try asking the model to respond as a specific persona, or to format the output differently (e.g., as a timeline, or as bullet points).
> **Exercise 8:** Build this system over your own document collection — class notes, research papers, or a downloaded text corpus. Which retrieval mode works best for your documents?
## Additional resources and references
### LlamaIndex
- Documentation: https://docs.llamaindex.ai/en/stable/
- BM25 retriever: https://docs.llamaindex.ai/en/stable/examples/retrievers/bm25_retriever/
### Models
- Ollama: https://ollama.com
- Huggingface models: https://huggingface.co/models
#### Models used in this tutorial
| Model | Type | Role | Source |
|-------|------|------|--------|
| `command-r7b` | LLM (RAG-optimized) | Response generation | `ollama pull command-r7b` |
| `BAAI/bge-large-en-v1.5` | Embedding (1024-dim) | Text -> vector encoding | Huggingface (auto-downloaded) |
| `cross-encoder/ms-marco-MiniLM-L-12-v2` | Cross-encoder | Re-ranking candidates | Huggingface (auto-downloaded) |
### Further reading
- Robertson & Zaragoza, *The Probabilistic Relevance Framework: BM25 and Beyond* (2009) — the theory behind BM25
- Nogueira & Cho, *Passage Re-ranking with BERT* (2019) — cross-encoder re-ranking applied to information retrieval

View file

@ -0,0 +1,193 @@
# build_store.py
#
# Build or update the vector store from journal entries in ./data.
#
# Default mode (incremental): loads the existing index and adds only
# new or modified files. Use --rebuild for a full rebuild from scratch.
#
# January 2026
# E. M. Furst
# Used Sonnet 4.5 to suggest changes; Opus 4.6 for incremental update
from llama_index.core import (
SimpleDirectoryReader,
StorageContext,
VectorStoreIndex,
load_index_from_storage,
Settings,
)
from pathlib import Path
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.node_parser import SentenceSplitter
import argparse
import datetime
import os
import time
# Shared constants
DATA_DIR = Path("./data")
PERSIST_DIR = "./store"
EMBED_MODEL_NAME = "BAAI/bge-large-en-v1.5"
CHUNK_SIZE = 256
CHUNK_OVERLAP = 25
def get_text_splitter():
    return SentenceSplitter(
        chunk_size=CHUNK_SIZE,
        chunk_overlap=CHUNK_OVERLAP,
        paragraph_separator="\n\n",
    )

def rebuild():
    """Full rebuild: delete and recreate the vector store from scratch."""
    if not DATA_DIR.exists():
        raise FileNotFoundError(f"Data directory not found: {DATA_DIR.absolute()}")
    print(f"Loading documents from {DATA_DIR.absolute()}...")
    documents = SimpleDirectoryReader(str(DATA_DIR)).load_data()
    if not documents:
        raise ValueError("No documents found in data directory")
    print(f"Loaded {len(documents)} document(s)")
    print("Building vector index...")
    index = VectorStoreIndex.from_documents(
        documents,
        transformations=[get_text_splitter()],
        show_progress=True,
    )
    index.storage_context.persist(persist_dir=PERSIST_DIR)
    print(f"Index built and saved to {PERSIST_DIR}")

def update():
    """Incremental update: add new files, re-index modified files, remove deleted files."""
    if not DATA_DIR.exists():
        raise FileNotFoundError(f"Data directory not found: {DATA_DIR.absolute()}")
    # Load existing index
    print(f"Loading existing index from {PERSIST_DIR}...")
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
    # Set transformations so index.insert() chunks correctly
    Settings.transformations = [get_text_splitter()]
    # Build lookup of indexed files: file_name -> (ref_doc_id, metadata)
    all_ref_docs = index.docstore.get_all_ref_doc_info()
    indexed = {}
    for ref_id, info in all_ref_docs.items():
        fname = info.metadata.get("file_name")
        if fname:
            indexed[fname] = (ref_id, info.metadata)
    print(f"Index contains {len(indexed)} documents")
    # Scan current files on disk
    disk_files = {f.name: f for f in sorted(DATA_DIR.glob("*.txt"))}
    print(f"Data directory contains {len(disk_files)} files")
    # Classify files
    new_files = []
    modified_files = []
    deleted_files = []
    unchanged = 0
    for fname, fpath in disk_files.items():
        if fname not in indexed:
            new_files.append(fpath)
        else:
            ref_id, meta = indexed[fname]
            # Compare file size and modification date
            stat = fpath.stat()
            disk_size = stat.st_size
            # Must use UTC to match SimpleDirectoryReader's date format
            disk_mdate = datetime.datetime.fromtimestamp(
                stat.st_mtime, tz=datetime.timezone.utc
            ).strftime("%Y-%m-%d")
            stored_size = meta.get("file_size")
            stored_mdate = meta.get("last_modified_date")
            if disk_size != stored_size or disk_mdate != stored_mdate:
                modified_files.append((fpath, ref_id))
            else:
                unchanged += 1
    for fname, (ref_id, meta) in indexed.items():
        if fname not in disk_files:
            deleted_files.append((fname, ref_id))
    # Report
    print(f"\n  New:       {len(new_files)}")
    print(f"  Modified:  {len(modified_files)}")
    print(f"  Deleted:   {len(deleted_files)}")
    print(f"  Unchanged: {unchanged}")
    if not new_files and not modified_files and not deleted_files:
        print("\nNothing to do.")
        return
    # Process deletions (including modified files that need re-indexing)
    for fname, ref_id in deleted_files:
        print(f"  Removing {fname}")
        index.delete_ref_doc(ref_id, delete_from_docstore=True)
    for fpath, ref_id in modified_files:
        print(f"  Re-indexing {fpath.name} (modified)")
        index.delete_ref_doc(ref_id, delete_from_docstore=True)
    # Process additions (new files + modified files)
    files_to_add = new_files + [fpath for fpath, _ in modified_files]
    if files_to_add:
        print(f"\nIndexing {len(files_to_add)} file(s)...")
        # Use "./" prefix to match paths from full build (pathlib strips it)
        docs = SimpleDirectoryReader(
            input_files=[f"./{f}" for f in files_to_add]
        ).load_data()
        for doc in docs:
            index.insert(doc)
    # Persist
    index.storage_context.persist(persist_dir=PERSIST_DIR)
    print(f"\nIndex updated and saved to {PERSIST_DIR}")

def main():
    parser = argparse.ArgumentParser(
        description="Build or update the vector store from journal entries."
    )
    parser.add_argument(
        "--rebuild",
        action="store_true",
        help="Full rebuild from scratch (default: incremental update)",
    )
    args = parser.parse_args()
    # Configure embedding model
    embed_model = HuggingFaceEmbedding(model_name=EMBED_MODEL_NAME)
    Settings.embed_model = embed_model
    start = time.time()
    if args.rebuild:
        print("Mode: full rebuild")
        rebuild()
    else:
        print("Mode: incremental update")
        if not Path(PERSIST_DIR).exists():
            print(f"No existing index at {PERSIST_DIR}, doing full rebuild.")
            rebuild()
        else:
            update()
    elapsed = time.time() - start
    print(f"Done in {elapsed:.1f}s")

if __name__ == "__main__":
    main()

View file

@ -0,0 +1,80 @@
Subject: [UDEL-ALL-2128] Hurricane Sandy
Date: 2012_11_02_164248
To the University of Delaware community:
We have much to be thankful for this week at the University of Delaware
as we were spared the full force of Hurricane Sandy. Even as we breathe
a sigh of relief and return to our normal activities, we are mindful of
the many, many people in this region -- some of our students among them
-- who were not so lucky. Our thoughts and prayers go out to them as
they rebuild their communities.
The potential impact of Sandy was a major concern for UD, with its
thousands of people and 430+ buildings on 2,000 acres throughout the
state. Many members of our University community worked hard over the
last several days to help us weather this "Storm of the Century."
Preparation and practice paid off as our emergency response team, led
by the Office of Campus and Public Safety, began assessing the
situation late last week and taking steps to ensure the safety of our
people and facilities. When the storm came, the campus suffered only
minor damage: wind-driven water getting into buildings through roofs,
walls and foundations; very minimal power loss, with a couple of
residential properties without power for only a few hours, thanks to
quick repair from the City of Newark; and only three trees knocked down
and destroyed, along with a lot of leaves and branches to clean up. The
Georgetown research facilities were fortunate to sustain only minor
leaks and flooding. The hardest hit area was the Lewes campus, which
had flooding on its grounds but minimal damage to buildings.
Throughout this time, the University's greatest asset continued to be
its people -- staff members from a variety of units working as a team.
A command center brought together representatives from across UD so
that issues could be responded to immediately. Staffed around the
clock, the center included Housing, Public Safety, Residence Life,
Environmental Health and Safety, Facilities and Auxiliary Services,
Emergency Management, and Communications and Marketing.
The dedication of UD's employees and students was evident everywhere:
Dining Services staff, faced with reduced numbers and limited
deliveries, kept students fed, and supported employees who worked
during the crisis; Residence Life staff and resident assistants made
sure students who remained on campus had up-to-date information and
supplies; staff in Student Health Services kept Laurel Hall open to
respond to student health needs; Human Resources staff worked over the
weekend to ensure that payroll was processed ahead of time; UD Police
officers were on patrol and responding to issues as they arose; the UD
Emergency Care Unit was at the ready; staff in Environmental Health and
Safety aided in the safe shutdown of UD laboratories and monitored fire
safety issues; Facilities staff continue to clean up debris left in
Sandy's wake and repair damage to buildings; faculty are working with
students to make up lost class time.
Our UD Alert system served as an excellent tool for keeping students,
parents and employees informed about the storm's implications for UD,
and the University's homepage was the repository for the most current
information and lists of events and activities that were canceled or
rescheduled. Through the University's accounts on Facebook and Twitter,
staff answered questions and addressed concerns, and faculty and staff
across the campus fielded phone calls and emails.
In short, a stellar job all around.
On behalf of the students, families and employees who benefited from
these efforts, I thank everyone for their dedication and service to the
people of UD.
Sincerely,
Patrick T. Harker
President
::::::::::::::::::::::::::::::::::::::::::: UD P.O. Box ::
UDEL-ALL-2128 mailing list
Online message archive
and management at https://po-box.nss.udel.edu/
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

View file

@ -0,0 +1,85 @@
Subject: Employee Appreciation Week
Date: 2017_05_16_123456
To the University of Delaware Community - President Dennis Assanis
May 16, 2017
Dear colleague,
Our first year together has been one of amazing accomplishments and exciting opportunities. At the heart of our success has been you — the University of Delaware's exceptional faculty and staff. To thank you and celebrate everything you do, we are launching our first Employee Appreciation Week.
The full week of events includes:
Monday, June 5—UDidIt Picnic
Tuesday, June 6—Self-Care Day
Wednesday, June 7—UD Spirit Day
Thursday, June 8—Flavors of UD
Friday, June 9—Employee Appreciation Night at the Blue Rocks
The week is a collaborative effort by Employee Health & Wellbeing and Human Resources. You can get all the details here.
We are dedicated to cultivating together an environment where employees are happy, healthy and continue to bring their best selves to work each day. The work you do benefits our students, our community and the world. I am truly grateful for your talents, skills, ideas and enduring commitment to the University.
Eleni and I hope you enjoy Employee Appreciation Week with your team and your family, and we look forward to seeing you at the many events.
Best,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2792   •   www.udel.edu/president

View file

@ -0,0 +1,79 @@
Subject: Robin Morgan named UD's 11th provost
Date: 2018_05_21_110335
Robin Morgan Appointed Provost - University of Delaware
May 21, 2018
Dear UD Community,
I am pleased to announce that, after a highly competitive national search, I have appointed Robin Morgan as the University of Delaware's new provost, effective July 1. She will become the University of Delaware's 11th provost, and the first woman to serve in this role in a permanent capacity since the position was created at UD in 1950.
Over the last seven months, Dr. Morgan already has assembled an impressive record as interim provost, most notably in her stewardship of new cluster hires among our faculty and her leadership as we move toward the creation of the graduate college.
Before working closely with her, I knew Dr. Morgan as a highly respected educator and scholar, but after watching her in action, I am equally impressed with her abilities to lead, inspire and effect change. Her energy, integrity, analytical mind, and innate knack for bringing people together, combined with her dedication and loyalty to UD, are great assets.
Dr. Morgan has a distinguished record of service to this University as a faculty member since 1985. After serving as acting dean of the College of Agriculture and Natural Resources for a year, she was named dean in 2002, serving in that role for 10 years, a period of significant growth and change for the college. From 2014-16, she served as acting chair of the Department of Biological Sciences, and she had been chair of the department from 2016 until her appointment as interim provost.
We will continue to benefit from Dr. Morgan's deep knowledge of the University, her proven leadership across all aspects of teaching, research and administration, and her dedication to UD as she continues her career as provost.
I am looking forward to building on our close working relationship, and I am excited by all we will accomplish to take the University of Delaware forward. Please join me in congratulating her on this next chapter in her career.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   www.udel.edu/president

View file

@ -0,0 +1,77 @@
Subject: Momentum and Resilience: Our UD Spring Semester Resumes
Date: 2020_03_29_141635
A Message from President Dennis Assanis
Dear UD Community,
As the University of Delaware is ready to resume the spring semester tomorrow, March 30, I want to share with all of you a special message recorded from the office in my home. Thank you all for your support at this challenging time, particularly our faculty and staff for your Herculean efforts to convert our classes from face-to-face instruction to online teaching and learning.
Best of luck with the semester ahead. As we all work remotely, please stay healthy, and stay connected!
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   udel.edu/president

View file

@ -0,0 +1,75 @@
Subject: National Voter Registration Day: Get Involved
Date: 2023_09_19_085321
National Voter Registration Day: Get Involved
September 19, 2023
Dear UD Community,
Do you want to make a difference in the world? Today is a good day to start.
This is National Voter Registration Day, an opportunity to make sure your voice will be heard in upcoming local, state and national elections. Voting is the most fundamental way that we engage in our democracy, effect change in society, work through our political differences and choose our leaders for the future. The voting rights we enjoy have been secured through the hard work and sacrifice of previous generations, and it is essential that everyone who is eligible to vote remains committed to preserving and exercising those rights.
At the University of Delaware, the Student Voting and Civic Engagement Committee — representing students, faculty and staff — is leading a non-partisan effort to encourage voting and help voters become better informed about the issues that matter to them. The Make It Count voter registration drive is scheduled for 2-6 p.m. today on The Green, with games, music and the opportunity to register through the TurboVote app, which also allows users to request an absentee ballot and sign up for election reminders. The committee is planning additional events this academic year to promote voting, education and civil discourse as the nation heads into the 2024 election season.
Being a Blue Hen means sharing a commitment to creating a better world. And being a registered, engaged and informed voter is one of the best ways for all of us to achieve that vision.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu/president

View file

@ -0,0 +1,77 @@
Subject: Affirming our position and purpose
Date: 2023_10_12_155349
Affirming our position and purpose | A message from UD President Dennis Assanis
October 12, 2023
Dear UD Community,
Since my message yesterday, I have talked to many members of our community who — like me — are devastated and appalled by the terrorist attacks on Israel and the ongoing loss of life that has taken place in the Middle East.
I want to be sure that our position is very clear: We at the University of Delaware unequivocally condemn the horrific attacks by Hamas terrorists upon Israel that have shaken the world. The atrocities of crime, abduction, hostage-taking and mass murder targeted against Jewish civilians will forever remain a stain on human history. Our community's foundation of civility and respect has been challenged to an unimaginable extent in light of the antisemitic brutalities that have been committed against innocent victims.
As your president, I wish words could calm the heartache and ease the fear and grief. Unfortunately, we all know that events as complicated and devastating as those taking place in the Middle East right now will continue to evolve. The longstanding humanitarian crisis needs to be acknowledged, and we should not equate the terrorist group Hamas with innocent Palestinian, Muslim and Arab people. The ensuing war-inflicted pain, suffering and death that continues to play out across the region, including Gaza, is heartbreaking for all.
We must remember that, first and foremost, UD is a place of learning. As we engage in difficult conversations about the longstanding conflicts in the Middle East, we should always strive to do so safely, with mutual respect and without bias or judgement. I encourage our students, faculty and staff to continue organizing events to educate and unite our community. Please seize these opportunities not only as individuals, but as members of a true community defined by the freedoms that we treasure so very deeply.
So, my message to you all is to have hope, to support each other, and to realize that the perspectives and feelings we are all experiencing right now — many of which uniquely connect to our personal backgrounds — matter. Please remember this as you walk across campus, sit in your next classroom, share experiences with other members of our community, or simply take time to reflect.
Respectfully,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu/president

View file

@ -0,0 +1,82 @@
Subject: A warm welcome to our UD community!
Date: 2024_08_26_100859
A warm welcome to our UD community!
August 26, 2024
Dear UD Community,
I love the beginning of every new academic year and the renewed energy and sense of anticipation that it brings to every member of our campus community. The large influx of new people and ideas that come along with each new start is truly invigorating. Whether you are a new or continuing student, faculty or staff member, on behalf of everyone in our community, I want to extend a very warm welcome to you and thank you for everything you contribute, individually and collectively, to make the University of Delaware such a unique place.
Students, your fresh perspectives, your passion for learning, and your dreams and aspirations for the boundless possibilities that lie ahead are inspiring. Faculty, your intellectual energy, your insights and expertise, and above all, your genuine interest in transferring and sharing your knowledge with all of us are the beating heart of our institution. And to all our staff, your hard work and dedicated talents provide the essential support and services to help ensure our students are successful in all their personal, academic and career pursuits.
Here at UD, our shared purpose is to cultivate learning, develop knowledge and foster the free exchange of ideas. The connections we make and the relationships we build help advance the mission of the University. Our focus on academic excellence in all fields of study and our opportunities for groundbreaking research rely on our endless curiosity, mutual respect and open mindedness. Together, we are stronger.
This sense of connection and belonging at UD is fundamental to our campus culture. Your willingness to hear and consider all voices and viewpoints is critical to shaping the vibrant and inclusive culture of our entire institution. Only when we commit to constructive growth, based on a foundation of civility and respect for ourselves and each other, can we realize true progress.  Empowered by diverse perspectives, it is the opportunities to advance ideas that enrich learning and create positive impact in the world that unite all of us.
To celebrate the new semester and welcome our undergraduate Class of 2028, all members of our community are invited to attend the Twilight Induction ceremony tonight at 7:30 p.m. on the north side of Memorial Hall or online on Facebook Live.
As your President, I am so excited by all that we can accomplish together throughout this academic year. My wife, Eleni, and I wish you all the best at the start of this new semester and beyond. We look forward to meeting you on campus!
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,80 @@
Subject: UPDATE: Recent Executive Orders
Date: 2025_02_13_160414
UPDATE: Recent Executive Orders | University of Delaware
Feb. 13, 2025
Dear UD Community,
I know many of you continue to experience disruption and anxiety stemming from the recent federal actions and executive orders regarding a multitude of issues — from research funding to education, human rights, and immigration among other areas. As I communicated to the University of Delaware community in my Jan. 28 campus message and my Feb. 3 comments to the Faculty Senate, we will do everything we can to minimize disruption to UD students, faculty and staff while remaining in compliance with federal law.
To support our community, we have created this resource page that will be updated regularly with information for UD students, faculty and staff regarding ongoing federal actions, directives and developments, including guidance in response to changing conditions. Also, this page from the Research Office contains specific guidance related to research projects and grants. In parallel, we will continue to advocate on behalf of the University's interests regarding any impact that federal or state actions could have on our students, faculty and staff.
One example is our response this week related to the federal action to impose a 15% limit on reimbursements for indirect administrative costs (Facilities and Administrative, or F&A costs) for all National Institutes of Health (NIH) research grants. This immediate cut in funding would have a devastating impact on all biomedical, health and life science advances and human wellness, including here at UD. In response, the Delaware Attorney General filed a lawsuit jointly with 21 other state attorneys general. The University supported the Attorney General's lawsuit by submitting a declaration detailing the impact of the NIH rate cap on the institution. Fortunately, the attorneys general were successful, and a temporary restraining order was granted on Monday. Further, the Association of Public and Land-grant Universities, the Association of American Universities, and the American Council on Education announced a similar lawsuit.
As we navigate this rapidly evolving landscape together, our values will continue to be at the heart of our community. We will continue to foster an atmosphere that promotes the free exchange of ideas and opinions; we will continue to welcome and value people of different backgrounds, perspectives and learning experiences; and we will continue to encourage respect and civility toward everyone.
Please know that my leadership team and I are here to help and support our community during this time. Feel free to submit any questions pertaining to these matters here, and we will do our best to add relevant information on the resource pages. I deeply appreciate your resilience and patience as we continue to work together to advance the important mission of our University.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,87 @@
Subject: Extending condolences and offering support
Date: 2025_04_29_230614
Extending condolences and offering support
April 29, 2025
Dear UD Community,
It is with a heavy heart that we share this information with you. Earlier today, a University of Delaware student died in a traffic accident on Main Street near campus, and several other people, including other UD students, suffered injuries. There is no ongoing threat to the University community.
University of Delaware Police are continuing to work with the Newark Police Department, which is actively investigating the incident. As a result, information is limited and the Newark Police Department is not releasing the victims' names at this time, pending family notification.
This is a terrible tragedy for everyone in our UD community. We speak for the entire University in offering our condolences to the families, friends and classmates of the victims, and we keep in our thoughts the other members of our community who may have witnessed the crash and its aftermath. The safety of our entire community remains our top priority, and we will continue to work with our partners in city and state government to address safety concerns around and on the UD campus.
As we all begin to cope with this traumatic incident, we encourage you to support one another and reach out for additional help from the UD resources listed below as needed.
Sincerely,
Dennis Assanis
President
José-Luis Riera
Vice President for Student Life
Support and resources
Center for Counseling and Student Development
Counselors and Student Life staff are available in Warner Hall 101 on Wednesday, April 30, from 9 a.m. to 3 p.m. for counseling services.
TimelyCare — A virtual health and wellbeing platform available 24/7 for UD students
Student Advocacy and Support — Available to assist students who need support navigating University resources or complex issues. Call 302-831-8939 or email studentsupport@udel.edu to schedule an appointment.
ComPsych® GuidanceResources® — Mental health support for UD benefited employees. Access services through the link or call 877-527-4742 for support.
Additional safety and wellness resources — Information about UD Police, Student Health Services and other services.
Information about the UD Alert, the LiveSafe app and safety notification communication.
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,76 @@
Subject: Sharing our grief, enhancing safety
Date: 2025_04_30_160615
Sharing our grief, enhancing safety
April 30, 2025
Dear UD Community,
Since last evening's crash on Main Street that took the life of a University of Delaware graduate student (whose identity is being withheld at this time) and injured several others, we have been struggling to cope with the pain of this senseless tragedy. Throughout the UD community, we are all feeling the deep ache of loss, and we will continue to work through our grief together.
Today, Newark Police announced an arrest in connection with the crash, reiterating that there is no ongoing threat to the community. 
Main Street is where we eat, shop and share our lives with our friends, families and classmates. Because it is part of the state's roadway systems, we have been working with local and state officials this year, including our partners at Delaware Department of Transportation, to address traffic safety on and around Main Street. In the wake of this tragedy, we will reinforce and accelerate those efforts. We recognize there isn't a simple solution, particularly when these tragedies involve actions taken by individuals that may not be stopped by changes to roadways or infrastructure. However, this incident underscores that our collective efforts must take on renewed urgency.
University leaders joined Delaware Attorney General Kathy Jennings and Newark Mayor Travis McDermott today for a press conference, at which we expressed our shared commitment to enhanced safety along Main Street. The University has pledged to continue these discussions through meetings with the offices of AG Jennings and Mayor McDermott, in addition to DelDOT, in the near future. The University remains committed to advancing meaningful solutions, while the Universitys Division of Student Life and Graduate College are connecting with students about effective advocacy, civic engagement and partnerships in order to support these efforts.
We are also aware that members of the UD community may have witnessed the crash and its aftermath or have close relationships with the victims. We encourage everyone to become familiar with and use, as needed, the available University counseling and support resources that were shared in Tuesday evening's message to the UD community. Counseling services are available at Warner Hall and through TimelyCare anytime, 24/7. Students with physical injuries or medical concerns relating to the incident can contact Student Health Services at 302-831-2226, Option 0, or visit Laurel Hall to meet with triage nurses available until 5 p.m. After hours, students can contact the Highmark Nurse line at 888-258-3428 or visit local urgent care centers (Newark Urgent Care at 324 E. Main Street, or ChristianaCare GoHealth at 550 S. College Avenue, Suite 115).
During this difficult time in our community, we all need to continue supporting and standing by one another as we move forward together.
Sincerely,
Dennis Assanis
President
Laura Carlson
Provost
José-Luis Riera
Vice President for Student Life
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,176 @@
# query_hybrid.py
# Hybrid retrieval: BM25 (sparse) + vector similarity (dense) + cross-encoder
#
# Combines two retrieval strategies to catch both exact term matches and
# semantic similarity:
# 1. Retrieve top-20 via vector similarity (bi-encoder, catches meaning)
# 2. Retrieve top-20 via BM25 (term frequency, catches exact names/dates)
# 3. Merge and deduplicate candidates by node ID
# 4. Re-rank the union with a cross-encoder -> top-15
# 5. Pass re-ranked chunks to LLM for synthesis
#
# The cross-encoder doesn't care where candidates came from -- it scores
# each (query, chunk) pair on its own merits. BM25's job is just to
# nominate candidates that vector similarity might miss.
#
# E.M.F. February 2026
# Environment vars must be set before importing huggingface/transformers
# libraries, because huggingface_hub.constants evaluates HF_HUB_OFFLINE
# at import time.
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["SENTENCE_TRANSFORMERS_HOME"] = "./models"
os.environ["HF_HUB_OFFLINE"] = "1"
from llama_index.core import (
StorageContext,
load_index_from_storage,
Settings,
get_response_synthesizer,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.core.prompts import PromptTemplate
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.retrievers.bm25 import BM25Retriever
import sys
#
# Globals
#
# Embedding model (must match build_store.py)
EMBED_MODEL = HuggingFaceEmbedding(cache_folder="./models", model_name="BAAI/bge-large-en-v1.5", local_files_only=True)
# LLM model for generation
LLM_MODEL = "command-r7b"
# Cross-encoder model for re-ranking (cached in ./models/)
RERANK_MODEL = "cross-encoder/ms-marco-MiniLM-L-12-v2"
RERANK_TOP_N = 15
# Retrieval parameters
VECTOR_TOP_K = 20 # candidates from vector similarity
BM25_TOP_K = 20 # candidates from BM25 term matching
#
# Custom prompt for grounded synthesis
#
PROMPT = PromptTemplate(
"""You are a precise research assistant analyzing excerpts from a personal journal collection.
Every excerpt below has been selected and ranked for relevance to the query.
CONTEXT (ranked by relevance):
{context_str}
QUERY:
{query_str}
Instructions:
- Answer ONLY using information explicitly present in the CONTEXT above
- Examine ALL provided excerpts, not just the top few -- each one was selected for relevance
- Be specific: quote or closely paraphrase key passages and cite their file names
- When multiple files touch on the query, note what each one contributes
- If the context doesn't contain enough information to answer fully, say so
Your response should:
1. Directly answer the query, drawing on as many relevant excerpts as possible
2. Reference specific files and their content (e.g., "In <filename>, ...")
3. End with a list of all files that contributed to your answer, with a brief note on each
If the context is insufficient, explain what's missing."""
)
def main():
# Configure LLM and embedding model
# for local model using ollama
# Note: Ollama temperature defaults to 0.8
Settings.llm = Ollama(
model=LLM_MODEL,
temperature=0.3,
request_timeout=360.0,
context_window=8000,
)
# Use OpenAI API:
# from llama_index.llms.openai import OpenAI
# Settings.llm = OpenAI(
# model="gpt-4o-mini", # or "gpt-4o" for higher quality
# temperature=0.3,
# )
Settings.embed_model = EMBED_MODEL
# Load persisted vector store
storage_context = StorageContext.from_defaults(persist_dir="./store")
index = load_index_from_storage(storage_context)
# --- Retrievers ---
# Vector retriever (dense: cosine similarity over embeddings)
vector_retriever = index.as_retriever(similarity_top_k=VECTOR_TOP_K)
# BM25 retriever (sparse: term frequency scoring)
bm25_retriever = BM25Retriever.from_defaults(
index=index,
similarity_top_k=BM25_TOP_K,
)
# Cross-encoder re-ranker
reranker = SentenceTransformerRerank(
model=RERANK_MODEL,
top_n=RERANK_TOP_N,
)
# --- Query ---
if len(sys.argv) < 2:
        print("Usage: python query_hybrid.py QUERY_TEXT")
sys.exit(1)
q = " ".join(sys.argv[1:])
# Retrieve from both sources
vector_nodes = vector_retriever.retrieve(q)
bm25_nodes = bm25_retriever.retrieve(q)
# Merge and deduplicate by node ID
seen_ids = set()
merged = []
for node in vector_nodes + bm25_nodes:
node_id = node.node.node_id
if node_id not in seen_ids:
seen_ids.add(node_id)
merged.append(node)
# Re-rank the merged candidates with cross-encoder
reranked = reranker.postprocess_nodes(merged, query_str=q)
# Report retrieval stats
    vector_ids = {n.node.node_id for n in vector_nodes}
    bm25_ids = {n.node.node_id for n in bm25_nodes}
    n_both = len(vector_ids & bm25_ids)
print(f"\nQuery: {q}")
print(f"Vector: {len(vector_nodes)}, BM25: {len(bm25_nodes)}, "
f"overlap: {n_both}, merged: {len(merged)}, re-ranked to: {len(reranked)}")
# Synthesize response with LLM
synthesizer = get_response_synthesizer(text_qa_template=PROMPT)
response = synthesizer.synthesize(q, nodes=reranked)
# Output
print("\nResponse:\n")
print(response.response)
print("\nSource documents:")
for node in response.source_nodes:
meta = getattr(node, "metadata", None) or node.node.metadata
score = getattr(node, "score", None)
print(f"{meta.get('file_name')} {meta.get('file_path')} {score:.3f}")
if __name__ == "__main__":
main()

View file

@ -0,0 +1,7 @@
llama-index-core
llama-index-readers-file
llama-index-llms-ollama
llama-index-embeddings-huggingface
llama-index-retrievers-bm25
nltk
sentence-transformers

View file

@ -0,0 +1,140 @@
# retrieve.py
# Hybrid verbatim chunk retrieval: BM25 + vector search + cross-encoder, no LLM.
#
# Same hybrid retrieval as query_hybrid.py but outputs raw chunk text
# instead of LLM synthesis. Useful for inspecting what the hybrid pipeline
# retrieves.
#
# Each chunk is annotated with its source (vector, BM25, or both) so you can
# see which retriever nominated it.
#
# E.M.F. February 2026
# Environment vars must be set before importing huggingface/transformers
# libraries, because huggingface_hub.constants evaluates HF_HUB_OFFLINE
# at import time.
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["SENTENCE_TRANSFORMERS_HOME"] = "./models"
os.environ["HF_HUB_OFFLINE"] = "1"
from llama_index.core import (
StorageContext,
load_index_from_storage,
Settings,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.retrievers.bm25 import BM25Retriever
import sys
import textwrap
#
# Globals
#
# Embedding model (must match build_store.py)
EMBED_MODEL = HuggingFaceEmbedding(cache_folder="./models", model_name="BAAI/bge-large-en-v1.5", local_files_only=True)
# Cross-encoder model for re-ranking (cached in ./models/)
RERANK_MODEL = "cross-encoder/ms-marco-MiniLM-L-12-v2"
RERANK_TOP_N = 15
# Retrieval parameters
VECTOR_TOP_K = 20
BM25_TOP_K = 20
# Output formatting
WRAP_WIDTH = 80
def main():
# No LLM needed -- set embed model only
Settings.embed_model = EMBED_MODEL
# Load persisted vector store
storage_context = StorageContext.from_defaults(persist_dir="./store")
index = load_index_from_storage(storage_context)
# --- Retrievers ---
vector_retriever = index.as_retriever(similarity_top_k=VECTOR_TOP_K)
bm25_retriever = BM25Retriever.from_defaults(
index=index,
similarity_top_k=BM25_TOP_K,
)
# Cross-encoder re-ranker
reranker = SentenceTransformerRerank(
model=RERANK_MODEL,
top_n=RERANK_TOP_N,
)
# Query
if len(sys.argv) < 2:
        print("Usage: python retrieve.py QUERY_TEXT")
sys.exit(1)
q = " ".join(sys.argv[1:])
# Retrieve from both sources
vector_nodes = vector_retriever.retrieve(q)
bm25_nodes = bm25_retriever.retrieve(q)
# Track which retriever found each node
vector_ids = {n.node.node_id for n in vector_nodes}
bm25_ids = {n.node.node_id for n in bm25_nodes}
# Merge and deduplicate by node ID
seen_ids = set()
merged = []
for node in vector_nodes + bm25_nodes:
node_id = node.node.node_id
if node_id not in seen_ids:
seen_ids.add(node_id)
merged.append(node)
# Re-rank merged candidates
reranked = reranker.postprocess_nodes(merged, query_str=q)
# Retrieval stats
n_both = len(vector_ids & bm25_ids)
n_vector_only = len(vector_ids - bm25_ids)
n_bm25_only = len(bm25_ids - vector_ids)
print(f"\nQuery: {q}")
print(f"Vector: {len(vector_nodes)}, BM25: {len(bm25_nodes)}, "
f"overlap: {n_both}, merged: {len(merged)}, re-ranked to: {len(reranked)}")
print(f" vector-only: {n_vector_only}, bm25-only: {n_bm25_only}, both: {n_both}\n")
# Output re-ranked chunks with source annotation
for i, node in enumerate(reranked, 1):
meta = getattr(node, "metadata", None) or node.node.metadata
score = getattr(node, "score", None)
file_name = meta.get("file_name", "unknown")
text = node.get_content()
node_id = node.node.node_id
# Annotate source
in_vector = node_id in vector_ids
in_bm25 = node_id in bm25_ids
if in_vector and in_bm25:
source = "vector+bm25"
elif in_bm25:
source = "bm25-only"
else:
source = "vector-only"
print("=" * WRAP_WIDTH)
print(f"=== [{i}] {file_name} (score: {score:.3f}) [{source}]")
print("=" * WRAP_WIDTH)
for line in text.splitlines():
if line.strip():
print(textwrap.fill(line, width=WRAP_WIDTH))
else:
print()
print()
if __name__ == "__main__":
main()

41
04-semantic-search/run_query.sh Executable file
View file

@ -0,0 +1,41 @@
#!/bin/bash
# This shell script will handle I/O for the python query engine
# It will take a query and return the formatted results
# E.M.F. August 2025
# Usage: ./run_query.sh
QUERY_SCRIPT="query_hybrid.py"
VENV_DIR=".venv"
# Activate the virtual environment
if [ -d "$VENV_DIR" ]; then
source "$VENV_DIR/bin/activate"
echo "Activated virtual environment: $VENV_DIR"
else
echo "Error: Virtual environment not found at '$VENV_DIR'" >&2
echo "Create one with: python3 -m venv $VENV_DIR" >&2
exit 1
fi
echo -e "Current query engine is $QUERY_SCRIPT\n"
# Loop until input is "exit"
while true; do
read -p "Enter your query (or type 'exit' to quit): " query
if [ "$query" == "exit" ] || [ "$query" == "quit" ] || [ "$query" == "" ] ; then
echo "Exiting..."
break
fi
time_start=$(date +%s)
# Call the python script with the query and format the output
    python3 $QUERY_SCRIPT "$query" | \
expand | sed -E 's|(.* )(.*/data)|\1./data|' | fold -s -w 131
time_end=$(date +%s)
elapsed=$((time_end - time_start))
echo -e "Query processed in $elapsed seconds.\n"
    echo "$query" >> query.log
done

View file

@ -0,0 +1,40 @@
#!/bin/bash
# This shell script will handle I/O for the python query engine
# It will take a query and return the formatted results
# E.M.F. August 2025
# Usage: ./run_query.sh
QUERY_SCRIPT="retrieve.py"
VENV_DIR=".venv"
# Activate the virtual environment
if [ -d "$VENV_DIR" ]; then
source "$VENV_DIR/bin/activate"
echo "Activated virtual environment: $VENV_DIR"
else
echo "Error: Virtual environment not found at '$VENV_DIR'" >&2
echo "Create one with: python3 -m venv $VENV_DIR" >&2
exit 1
fi
echo -e "$QUERY_SCRIPT -- retrieve vector store chunks based on similarity + BM25 with reranking.\n"
# Loop until input is "exit"
while true; do
read -p "Enter your query (or type 'exit' to quit): " query
if [ "$query" == "exit" ] || [ "$query" == "quit" ] || [ "$query" == "" ] ; then
echo "Exiting..."
break
fi
time_start=$(date +%s)
# Call the python script with the query and format the output
    python3 $QUERY_SCRIPT "$query" | \
expand | sed -E 's|(.* )(.*/data)|\1./data|' | fold -s -w 131
time_end=$(date +%s)
elapsed=$((time_end - time_start))
echo -e "Query processed in $elapsed seconds.\n"
done

View file

@ -0,0 +1,189 @@
# search_keywords.py
# Keyword search: extract terms from a query using POS tagging, then grep
# across journal files for matches.
#
# Complements the vector search pipeline by catching exact names, places,
# and dates that embeddings can miss. No vector store or LLM needed.
#
# Term extraction uses NLTK POS tagging to keep nouns (NN*), proper nouns
# (NNP*), and adjectives (JJ*) -- skipping stopwords and function words
# automatically. Consecutive proper nouns are joined into multi-word phrases
# (e.g., "Robert Wright" stays as one search term, not "robert" + "wright").
#
# E.M.F. February 2026
import os
import sys
import re
from pathlib import Path
import nltk
#
# Globals
#
DATA_DIR = Path("./data")
CONTEXT_LINES = 2 # lines of context around each match
MAX_MATCHES_PER_FILE = 3 # cap matches shown per file to avoid flooding
# POS tags to keep: nouns, proper nouns, adjectives
KEEP_TAGS = {"NN", "NNS", "NNP", "NNPS", "JJ", "JJS", "JJR"}
# Proper noun tags (consecutive runs are joined as phrases)
PROPER_NOUN_TAGS = {"NNP", "NNPS"}
# Minimum word length to keep (filters out short noise)
MIN_WORD_LEN = 3
def ensure_nltk_data():
"""Download NLTK data if not already present."""
for resource, name in [
("tokenizers/punkt_tab", "punkt_tab"),
("taggers/averaged_perceptron_tagger_eng", "averaged_perceptron_tagger_eng"),
]:
try:
nltk.data.find(resource)
except LookupError:
print(f"Downloading NLTK resource: {name}")
nltk.download(name, quiet=True)
def extract_terms(query):
"""Extract key terms from a query using POS tagging.
Tokenizes the query, runs POS tagging, and keeps nouns, proper nouns,
and adjectives. Consecutive proper nouns (NNP/NNPS) are joined into
    multi-word phrases (e.g., "Robert Wright" -> "robert wright").
Returns a list of terms (lowercase), phrases listed first.
"""
tokens = nltk.word_tokenize(query)
tagged = nltk.pos_tag(tokens)
phrases = [] # multi-word proper noun phrases
single_terms = [] # individual nouns/adjectives
proper_run = [] # accumulator for consecutive proper nouns
for word, tag in tagged:
if tag in PROPER_NOUN_TAGS:
proper_run.append(word)
else:
# Flush any accumulated proper noun run
if proper_run:
phrase = " ".join(proper_run).lower()
if len(phrase) >= MIN_WORD_LEN:
phrases.append(phrase)
proper_run = []
# Keep other nouns and adjectives as single terms
if tag in KEEP_TAGS and len(word) >= MIN_WORD_LEN:
single_terms.append(word.lower())
# Flush final proper noun run
if proper_run:
phrase = " ".join(proper_run).lower()
if len(phrase) >= MIN_WORD_LEN:
phrases.append(phrase)
# Phrases first (more specific), then single terms
all_terms = phrases + single_terms
return list(dict.fromkeys(all_terms)) # deduplicate, preserve order
def search_files(terms, data_dir, context_lines=CONTEXT_LINES):
"""Search all .txt files in data_dir for the given terms.
Returns a list of (file_path, match_count, matches) where matches is a
list of (line_number, context_block) tuples.
"""
if not terms:
return []
# Build a single regex pattern that matches any term (case-insensitive)
pattern = re.compile(
r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
re.IGNORECASE
)
results = []
txt_files = sorted(data_dir.glob("*.txt"))
for fpath in txt_files:
try:
lines = fpath.read_text(encoding="utf-8").splitlines()
except (OSError, UnicodeDecodeError):
continue
matches = []
match_count = 0
seen_lines = set() # avoid overlapping context blocks
for i, line in enumerate(lines):
if pattern.search(line):
match_count += 1
if i in seen_lines:
continue
# Extract context window
start = max(0, i - context_lines)
end = min(len(lines), i + context_lines + 1)
block = []
for j in range(start, end):
seen_lines.add(j)
marker = ">>>" if j == i else " "
block.append(f" {marker} {j+1:4d}: {lines[j]}")
matches.append((i + 1, "\n".join(block)))
if match_count > 0:
results.append((fpath, match_count, matches))
# Sort by match count (most matches first)
results.sort(key=lambda x: x[1], reverse=True)
return results
def main():
if len(sys.argv) < 2:
print("Usage: python search_keywords.py QUERY_TEXT")
sys.exit(1)
ensure_nltk_data()
q = " ".join(sys.argv[1:])
# Extract terms
terms = extract_terms(q)
if not terms:
print(f"Query: {q}")
print("No searchable terms extracted. Try a more specific query.")
sys.exit(0)
print(f"Query: {q}")
print(f"Extracted terms: {', '.join(terms)}\n")
# Search
results = search_files(terms, DATA_DIR)
if not results:
print("No matches found.")
sys.exit(0)
# Summary
total_matches = sum(r[1] for r in results)
print(f"Found {total_matches} matches across {len(results)} files\n")
# Detailed output
for fpath, match_count, matches in results:
print("="*60)
print(f"--- {fpath.name} ({match_count} matches) ---")
print("="*60)
for line_num, block in matches[:MAX_MATCHES_PER_FILE]:
print(block)
print()
if len(matches) > MAX_MATCHES_PER_FILE:
print(f" ... and {len(matches) - MAX_MATCHES_PER_FILE} more matches\n")
if __name__ == "__main__":
main()


@ -0,0 +1,258 @@
# Large Language Models Part V: Building a Neural Network
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
---
## Key idea
Build a neural network from scratch to understand the core mechanics behind LLMs.
## Key goals
- See concretely what "weights and biases" are and how they're organized
- Understand the forward pass, loss function, and gradient descent
- Implement backpropagation by hand in numpy
- See how PyTorch automates the same process
- Connect these concepts to what you've already seen in nanoGPT
---
Everything we've done in this workshop is **machine learning** (ML) — the practice of training models to learn patterns from data rather than programming rules by hand. LLMs are one (very large) example of ML, built on neural networks. Throughout this workshop, we've used ML terms like *model weights*, *training loss*, *gradient descent*, and *overfitting* — often without defining them precisely. In Part I, we watched nanoGPT's training loss decrease over 2000 iterations. In Part II, we saw that models have millions of parameters. In Parts III and IV, we used embedding models that map text into vectors — another ML technique.
In this section, we step back from language and build a neural network ourselves — small enough to understand every weight, but powerful enough to learn a real physical relationship. The goal is to make the ML concepts behind LLMs concrete.
Our task: fit the heat capacity $C_p(T)$ of nitrogen gas using data from the [NIST Chemistry WebBook](https://webbook.nist.gov/). This is a function that chemical engineers know well. Textbooks like *Chemical, Biochemical, and Engineering Thermodynamics* (a UD favorite) typically fit it with a polynomial:
$$C_p(T) = a + bT + cT^2 + dT^3$$
Can a neural network learn this relationship directly from data?
## 1. Setup
Use the virtual environment from Part I — `numpy` and `torch` are already installed. You may need to add `matplotlib`:
```bash
pip install matplotlib
```
## 2. The data
The file `data/n2_cp.csv` contains 35 data points: the isobaric heat capacity of N₂ gas at 1 bar from 300 K to 2000 K, from the NIST WebBook.
```bash
head data/n2_cp.csv
```
```
T_K,Cp_kJ_per_kgK
300.00,1.0413
350.00,1.0423
400.00,1.0450
...
```
The curve is smooth and nonlinear — $C_p$ increases with temperature as molecular vibrational modes become active. This is a good test case: simple enough for a small network, but not a straight line.
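A quick way to sanity-check the file with numpy; the first few rows are inlined below via `StringIO` so the snippet runs standalone (the scripts themselves load `data/n2_cp.csv` directly):

```python
import io
import numpy as np

# First rows of data/n2_cp.csv, inlined so this snippet runs on its own
sample = io.StringIO(
    "T_K,Cp_kJ_per_kgK\n"
    "300.00,1.0413\n"
    "350.00,1.0423\n"
    "400.00,1.0450\n"
)

data = np.loadtxt(sample, delimiter=",", skiprows=1)
T, Cp = data[:, 0], data[:, 1]
print(f"{len(T)} points, T: {T.min():.0f}-{T.max():.0f} K")
```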
## 3. Architecture of a one-hidden-layer network
Our network has three layers:
```
Input (1 neuron: T) → Hidden (10 neurons) → Output (1 neuron: Cp)
```
Here's what happens at each step:
### Forward pass
**Step 1: Hidden layer.** Each of the 10 hidden neurons computes a weighted sum of the input plus a bias, then applies an *activation function*:
$$z_j = w_j \cdot x + b_j \qquad a_j = \tanh(z_j)$$
where $w_j$ and $b_j$ are the weight and bias for neuron $j$. The activation function (here, `tanh`) introduces **nonlinearity** — without it, stacking layers would just produce another linear function, no matter how many layers we use.
**Step 2: Output layer.** The output is a weighted sum of the hidden activations:
$$\hat{y} = \sum_j W_j \cdot a_j + b_{\text{out}}$$
This is a linear combination — no activation on the output, since we want to predict a continuous value.
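Both steps written out in numpy for a single normalized input, with randomly initialized weights (the array shapes here match the parameter counts tallied next):

```python
import numpy as np

rng = np.random.default_rng(0)
H = 10                                # hidden neurons
x = np.array([[0.5]])                 # one normalized temperature, shape (1, 1)

W1, b1 = rng.normal(size=(1, H)) * 0.5, np.zeros((1, H))  # input -> hidden
W2, b2 = rng.normal(size=(H, 1)) * 0.5, np.zeros((1, 1))  # hidden -> output

a = np.tanh(x @ W1 + b1)              # Step 1: hidden activations, shape (1, H)
y_hat = a @ W2 + b2                   # Step 2: linear output, shape (1, 1)
```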
### Counting parameters
With 10 hidden neurons:
- `W1`: 10 weights (input → hidden)
- `b1`: 10 biases (hidden)
- `W2`: 10 weights (hidden → output)
- `b2`: 1 bias (output)
- **Total: 31 parameters**
That's 31 parameters for 35 data points — almost a 1:1 ratio, which should make you nervous about overfitting. In general, a model with as many parameters as data points can memorize instead of learning. We get away with it here because (a) the $C_p(T)$ data is very smooth with no noise, and (b) the `tanh` activation constrains each neuron to a smooth S-curve, so the network can't wiggle wildly between points the way a high-degree polynomial could. We'll revisit this in the overfitting section below.
Compare: the small nanoGPT model from Part I had ~800,000 parameters. GPT-2 has 124 million. The architecture is the same idea — layers of weights and activations — just scaled enormously.
## 4. Training
Training means finding the values of all 31 parameters that make the network's predictions match the data. This requires three things:
### Loss function
We need a number that says "how wrong is the network?" The **mean squared error** (MSE) is a natural choice:
$$L = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2$$
This is the same kind of loss we watched decrease during nanoGPT training in Part I (though nanoGPT uses cross-entropy loss, which is appropriate for classification over a vocabulary).
### Backpropagation
To improve the weights, we need to know how each weight affects the loss. **Backpropagation** computes these gradients by applying the chain rule, working backward from the loss through each layer. For example, the gradient of the loss with respect to an output weight $W_j$ is:
$$\frac{\partial L}{\partial W_j} = \frac{1}{N} \sum_i 2(\hat{y}_i - y_i) \cdot a_{ij}$$
The numpy implementation in `nn_numpy.py` computes every gradient explicitly. This is the part that PyTorch automates.
### Gradient descent
Once we have the gradients, we update each weight:
$$w \leftarrow w - \eta \cdot \frac{\partial L}{\partial w}$$
where $\eta$ is the **learning rate** — a small number (0.01 in our code) that controls how big each step is. Too large and training oscillates; too small and it's painfully slow.
One full pass through these three steps (forward → loss → backward → update) is one **epoch**. We train for 5000 epochs.
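The whole loop, condensed onto synthetic stand-in data (a smooth curve on [0, 1], not the N₂ file) so it runs on its own; `nn_numpy.py` is the fully commented version:

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.linspace(0, 1, 35).reshape(-1, 1)     # stand-in for normalized T
Y = 0.2 * np.sin(3 * X) + 0.5                # stand-in for normalized Cp
N, H, eta = X.shape[0], 10, 0.01

W1, b1 = rng.normal(size=(1, H)) * 0.5, np.zeros((1, H))
W2, b2 = rng.normal(size=(H, 1)) * 0.5, np.zeros((1, 1))

for epoch in range(5000):
    A1 = np.tanh(X @ W1 + b1)                # forward pass
    Y_pred = A1 @ W2 + b2
    err = Y_pred - Y
    loss = np.mean(err ** 2)                 # MSE loss
    dY = 2 * err / N                         # backpropagation (chain rule)
    dW2, db2 = A1.T @ dY, dY.sum(axis=0, keepdims=True)
    dZ1 = (dY @ W2.T) * (1 - A1 ** 2)        # tanh derivative
    dW1, db1 = X.T @ dZ1, dZ1.sum(axis=0, keepdims=True)
    W1 -= eta * dW1; b1 -= eta * db1         # gradient descent update
    W2 -= eta * dW2; b2 -= eta * db2

print(f"final loss: {loss:.6f}")
```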
In nanoGPT, the training loop in `train.py` does exactly the same thing, but with the AdamW optimizer (a fancier version of gradient descent) and batches of data instead of the full dataset.
## 5. Running the numpy version
```bash
python nn_numpy.py
```
```
Epoch 0 Loss: 0.283941
Epoch 500 Loss: 0.001253
Epoch 1000 Loss: 0.000412
Epoch 1500 Loss: 0.000178
Epoch 2000 Loss: 0.000082
Epoch 2500 Loss: 0.000040
Epoch 3000 Loss: 0.000021
Epoch 3500 Loss: 0.000012
Epoch 4000 Loss: 0.000008
Epoch 4500 Loss: 0.000005
Epoch 4999 Loss: 0.000004
Final loss: 0.000004
Network: 1 input -> 10 hidden (tanh) -> 1 output
Total parameters: 31
```
The script produces a plot (`nn_fit.png`) showing the fit and the training loss curve. You should see the network's prediction closely tracking the NIST data points, and the loss dropping rapidly in the first 1000 epochs before leveling off.
> **Exercise 1:** Read through `nn_numpy.py` carefully. Identify where each of the following happens: (a) forward pass, (b) loss calculation, (c) backpropagation, (d) gradient descent update. Annotate your copy with comments.
> **Exercise 2:** Change the number of hidden neurons `H`. Try 2, 5, 10, 20, 50. How does the fit change? How many parameters does each network have? At what point does adding more neurons stop helping?
## 6. The PyTorch version
Now look at `nn_torch.py`. It does the same thing, but in about half the code:
```bash
python nn_torch.py
```
Compare the two scripts side by side. The key differences:
| | numpy version | PyTorch version |
|---|---|---|
| Define layers | Manual weight matrices | `nn.Linear(1, H)` |
| Forward pass | `X @ W1 + b1`, `np.tanh(...)` | `model(X)` |
| Backprop | Hand-coded chain rule | `loss.backward()` |
| Weight update | `W -= lr * dW` | `optimizer.step()` |
| Lines of code | ~80 | ~40 |
PyTorch's `loss.backward()` computes all the gradients we wrote out by hand — automatically. This is called **automatic differentiation**. It's what makes training networks with millions of parameters feasible.
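A minimal demonstration of automatic differentiation, independent of our network: ask PyTorch for the derivative of $y = x^2$ at $x = 3$.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
y.backward()          # PyTorch computes dy/dx by the chain rule
print(x.grad)         # tensor(6.)
```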
The `nn.Sequential` definition:
```python
model = nn.Sequential(
nn.Linear(1, H), # input -> hidden (W1, b1)
nn.Tanh(), # activation
nn.Linear(H, 1), # hidden -> output (W2, b2)
)
```
looks simple here, but it's the same API used in nanoGPT's `model.py` — just with more layers, attention mechanisms, and a much larger vocabulary.
> **Exercise 3:** In the PyTorch version, replace `nn.Tanh()` with `nn.ReLU()` or `nn.Sigmoid()`. How does the fit change? Why might different activation functions work better for different problems?
> **Exercise 4:** Replace the Adam optimizer with plain SGD: `torch.optim.SGD(model.parameters(), lr=0.01)`. How does training speed compare? Try increasing the learning rate. What happens?
## 7. Normalization
Both scripts normalize the input ($T$) and output ($C_p$) to the range [0, 1] before training. This is important:
- Raw $T$ values range from 300 to 2000, while $C_p$ ranges from 1.04 to 1.28
- With unnormalized data, the gradients for the input weights would be hundreds of times larger than for the output weights
- The network would struggle to learn — or need a much smaller learning rate
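The scaling both scripts use, shown on a few sample temperatures:

```python
import numpy as np

T_raw = np.array([300.0, 650.0, 1000.0, 1500.0, 2000.0])  # K (sample values)

# Min-max scaling to [0, 1], as in nn_numpy.py and nn_torch.py
T_min, T_max = T_raw.min(), T_raw.max()
T = (T_raw - T_min) / (T_max - T_min)

# Invert after prediction to recover physical units
T_back = T * (T_max - T_min) + T_min
```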
Try it yourself:
> **Exercise 5:** Comment out the normalization in `nn_numpy.py` (use `T_raw` and `Cp_raw` directly). What happens to the training loss? Can you fix it by changing the learning rate?
## 8. Overfitting
With 31 parameters and 35 data points, our network is close to the edge. What happens with more parameters than data?
> **Exercise 6:** Increase `H` to 100 (giving 301 parameters — nearly 10× the number of data points). Train for 20,000 epochs. Plot the fit. Does it match the training data well? Now generate predictions at $T$ = 275 K and $T$ = 2100 K (outside the training range). Are they reasonable?
This is **overfitting** — the network memorizes the training data but fails to generalize. It's the same concept we discussed in Part I when nanoGPT's validation loss started increasing while the training loss kept decreasing.
In practice, we combat overfitting with:
- More data
- Regularization (dropout — remember this parameter from nanoGPT?)
- Early stopping (stop training when validation loss starts increasing)
- Keeping the model appropriately sized for the data
## 9. Connecting back to LLMs
Everything you've built here scales up to large language models:
| This tutorial | nanoGPT / LLMs |
|---|---|
| 31 parameters | 800K–70B+ parameters |
| 1 hidden layer | 4–96+ layers |
| tanh activation | GELU activation |
| MSE loss | Cross-entropy loss |
| Plain gradient descent | AdamW optimizer |
| Numpy arrays | PyTorch tensors (on GPU) |
| Fitting $C_p(T)$ | Predicting next tokens |
The fundamental loop — forward pass, compute loss, backpropagate, update weights — is identical. The difference is scale: more layers, more data, more compute, and architectural innovations like self-attention.
## Additional resources and references
### NIST Chemistry WebBook
- https://webbook.nist.gov/ — thermophysical property data used in this tutorial
### PyTorch
- Tutorial: https://pytorch.org/tutorials/beginner/basics/intro.html
- `nn.Module` documentation: https://pytorch.org/docs/stable/nn.html
### Reading
- The "backpropagation" chapter in Goodfellow, Bengio & Courville, *Deep Learning* (2016), freely available at https://www.deeplearningbook.org/
- 3Blue1Brown, *Neural Networks* video series: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi — excellent visual intuition for how neural networks learn


@ -0,0 +1,36 @@
T_K,Cp_kJ_per_kgK
300.00,1.0413
350.00,1.0423
400.00,1.0450
450.00,1.0497
500.00,1.0564
550.00,1.0650
600.00,1.0751
650.00,1.0863
700.00,1.0981
750.00,1.1102
800.00,1.1223
850.00,1.1342
900.00,1.1457
950.00,1.1568
1000.0,1.1674
1050.0,1.1774
1100.0,1.1868
1150.0,1.1957
1200.0,1.2040
1250.0,1.2118
1300.0,1.2191
1350.0,1.2260
1400.0,1.2323
1450.0,1.2383
1500.0,1.2439
1550.0,1.2491
1600.0,1.2540
1650.0,1.2586
1700.0,1.2630
1750.0,1.2670
1800.0,1.2708
1850.0,1.2744
1900.0,1.2778
1950.0,1.2810
2000.0,1.2841


@ -0,0 +1,156 @@
# nn_numpy.py
#
# A neural network with one hidden layer, built from scratch using numpy.
# Fits Cp(T) data for nitrogen gas at 1 bar (NIST WebBook).
#
# This demonstrates the core mechanics of a neural network:
# - Forward pass: input -> hidden layer -> activation -> output
# - Loss calculation (mean squared error)
# - Backpropagation: computing gradients of the loss w.r.t. each weight
# - Gradient descent: updating weights to minimize loss
#
# CHEG 667-013
# E. M. Furst
import numpy as np
import matplotlib.pyplot as plt
# ── Load and prepare data ──────────────────────────────────────
data = np.loadtxt("data/n2_cp.csv", delimiter=",", skiprows=1)
T_raw = data[:, 0] # Temperature (K)
Cp_raw = data[:, 1] # Heat capacity (kJ/kg/K)
# Normalize inputs and outputs to [0, 1] range.
# Neural networks train better when values are small and centered.
T_min, T_max = T_raw.min(), T_raw.max()
Cp_min, Cp_max = Cp_raw.min(), Cp_raw.max()
T = (T_raw - T_min) / (T_max - T_min) # shape: (N,)
Cp = (Cp_raw - Cp_min) / (Cp_max - Cp_min) # shape: (N,)
# Reshape for matrix operations: each sample is a row
X = T.reshape(-1, 1) # (N, 1) -- input matrix
Y = Cp.reshape(-1, 1) # (N, 1) -- target matrix
N = X.shape[0] # number of data points
# ── Network architecture ───────────────────────────────────────
#
# Input (1) --> Hidden (H neurons, tanh) --> Output (1)
#
# The hidden layer has H neurons. Each neuron computes:
# z = w * x + b (weighted sum)
# a = tanh(z) (activation -- introduces nonlinearity)
#
# The output layer combines the hidden activations:
# y_pred = W2 @ a + b2
H = 10 # number of neurons in the hidden layer
# Initialize weights randomly (small values)
# W1: (1, H) -- connects input to each hidden neuron
# b1: (1, H) -- one bias per hidden neuron
# W2: (H, 1) -- connects hidden neurons to output
# b2: (1, 1) -- output bias
np.random.seed(42)
W1 = np.random.randn(1, H) * 0.5
b1 = np.zeros((1, H))
W2 = np.random.randn(H, 1) * 0.5
b2 = np.zeros((1, 1))
# ── Training parameters ───────────────────────────────────────
learning_rate = 0.01
epochs = 5000
log_interval = 500
# ── Training loop ─────────────────────────────────────────────
losses = []
for epoch in range(epochs):
# ── Forward pass ──────────────────────────────────────────
# Step 1: hidden layer pre-activation
Z1 = X @ W1 + b1 # (N, H)
# Step 2: hidden layer activation (tanh)
A1 = np.tanh(Z1) # (N, H)
# Step 3: output layer (linear -- no activation)
Y_pred = A1 @ W2 + b2 # (N, 1)
# ── Loss ──────────────────────────────────────────────────
# Mean squared error
error = Y_pred - Y # (N, 1)
loss = np.mean(error ** 2)
losses.append(loss)
# ── Backpropagation ───────────────────────────────────────
# Compute gradients by applying the chain rule, working
# backward from the loss to each weight.
# Gradient of loss w.r.t. output
dL_dYpred = 2 * error / N # (N, 1)
# Gradients for output layer weights
dL_dW2 = A1.T @ dL_dYpred # (H, 1)
dL_db2 = np.sum(dL_dYpred, axis=0, keepdims=True) # (1, 1)
# Gradient flowing back through the hidden layer
dL_dA1 = dL_dYpred @ W2.T # (N, H)
# Derivative of tanh: d/dz tanh(z) = 1 - tanh(z)^2
dL_dZ1 = dL_dA1 * (1 - A1 ** 2) # (N, H)
# Gradients for hidden layer weights
dL_dW1 = X.T @ dL_dZ1 # (1, H)
dL_db1 = np.sum(dL_dZ1, axis=0, keepdims=True) # (1, H)
# ── Gradient descent ──────────────────────────────────────
# Update each weight in the direction that reduces the loss
W2 -= learning_rate * dL_dW2
b2 -= learning_rate * dL_db2
W1 -= learning_rate * dL_dW1
b1 -= learning_rate * dL_db1
if epoch % log_interval == 0 or epoch == epochs - 1:
print(f"Epoch {epoch:5d} Loss: {loss:.6f}")
# ── Results ────────────────────────────────────────────────────
# Predict on a fine grid for smooth plotting
T_fine = np.linspace(0, 1, 200).reshape(-1, 1)
A1_fine = np.tanh(T_fine @ W1 + b1)
Cp_pred_norm = A1_fine @ W2 + b2
# Convert back to physical units
T_fine_K = T_fine * (T_max - T_min) + T_min
Cp_pred = Cp_pred_norm * (Cp_max - Cp_min) + Cp_min
# ── Plot ───────────────────────────────────────────────────────
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Left: fit
ax1.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')
ax1.plot(T_fine_K, Cp_pred, 'r-', linewidth=2, label=f'NN ({H} neurons)')
ax1.set_xlabel('Temperature (K)')
ax1.set_ylabel('$C_p$ (kJ/kg/K)')
ax1.set_title('$C_p(T)$ for N$_2$ at 1 bar')
ax1.legend()
# Right: training loss
ax2.semilogy(losses)
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Mean Squared Error')
ax2.set_title('Training Loss')
plt.tight_layout()
plt.savefig('nn_fit.png', dpi=150)
plt.show()
print(f"\nFinal loss: {losses[-1]:.6f}")
print(f"Network: {1} input -> {H} hidden (tanh) -> {1} output")
print(f"Total parameters: {W1.size + b1.size + W2.size + b2.size}")


@ -0,0 +1,99 @@
# nn_torch.py
#
# The same neural network as nn_numpy.py, but using PyTorch.
# Compare this to the numpy version to see what the framework handles for you:
# - Automatic differentiation (no manual backprop)
# - Built-in optimizers (Adam instead of hand-coded gradient descent)
# - GPU support (if available)
#
# CHEG 667-013
# E. M. Furst
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
# ── Load and prepare data ──────────────────────────────────────
data = np.loadtxt("data/n2_cp.csv", delimiter=",", skiprows=1)
T_raw = data[:, 0]
Cp_raw = data[:, 1]
# Normalize to [0, 1]
T_min, T_max = T_raw.min(), T_raw.max()
Cp_min, Cp_max = Cp_raw.min(), Cp_raw.max()
X = torch.tensor((T_raw - T_min) / (T_max - T_min), dtype=torch.float32).reshape(-1, 1)
Y = torch.tensor((Cp_raw - Cp_min) / (Cp_max - Cp_min), dtype=torch.float32).reshape(-1, 1)
# ── Define the network ─────────────────────────────────────────
#
# nn.Sequential stacks layers in order. Compare this to nanoGPT's
# model.py, which uses the same PyTorch building blocks (nn.Linear,
# activation functions) but with many more layers.
H = 10 # hidden neurons
model = nn.Sequential(
nn.Linear(1, H), # input -> hidden (W1, b1)
nn.Tanh(), # activation
nn.Linear(H, 1), # hidden -> output (W2, b2)
)
print(f"Model:\n{model}")
print(f"Total parameters: {sum(p.numel() for p in model.parameters())}\n")
# ── Training ───────────────────────────────────────────────────
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
epochs = 5000
log_interval = 500
losses = []
for epoch in range(epochs):
# Forward pass -- PyTorch tracks operations for automatic differentiation
Y_pred = model(X)
loss = loss_fn(Y_pred, Y)
losses.append(loss.item())
# Backward pass -- PyTorch computes all gradients automatically
optimizer.zero_grad() # reset gradients from previous step
loss.backward() # compute gradients via automatic differentiation
optimizer.step() # update weights (Adam optimizer)
if epoch % log_interval == 0 or epoch == epochs - 1:
print(f"Epoch {epoch:5d} Loss: {loss.item():.6f}")
# ── Results ────────────────────────────────────────────────────
# Predict on a fine grid
T_fine = torch.linspace(0, 1, 200).reshape(-1, 1)
with torch.no_grad(): # no gradient tracking needed for inference
Cp_pred_norm = model(T_fine)
# Convert back to physical units
T_fine_K = T_fine.numpy() * (T_max - T_min) + T_min
Cp_pred = Cp_pred_norm.numpy() * (Cp_max - Cp_min) + Cp_min
# ── Plot ───────────────────────────────────────────────────────
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
ax1.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')
ax1.plot(T_fine_K, Cp_pred, 'r-', linewidth=2, label=f'NN ({H} neurons)')
ax1.set_xlabel('Temperature (K)')
ax1.set_ylabel('$C_p$ (kJ/kg/K)')
ax1.set_title('$C_p(T)$ for N$_2$ at 1 bar — PyTorch')
ax1.legend()
ax2.semilogy(losses)
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Mean Squared Error')
ax2.set_title('Training Loss')
plt.tight_layout()
plt.savefig('nn_fit_torch.png', dpi=150)
plt.show()


@ -0,0 +1,137 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "xbsmj1hcj1g",
"source": "# Building a Neural Network: $C_p(T)$ for Nitrogen\n\n**CHEG 667-013 — LLMs for Engineers**\n\nIn this notebook we fit the heat capacity of N₂ gas using three approaches:\n1. A polynomial fit (the classical approach)\n2. A neural network built from scratch in numpy\n3. The same network in PyTorch\n\nThis makes the ML concepts behind LLMs — weights, loss, gradient descent, overfitting — concrete and tangible.",
"metadata": {}
},
{
"cell_type": "markdown",
"id": "szrl41l3xbq",
"source": "## 1. Load and plot the data\n\nThe data is from the [NIST Chemistry WebBook](https://webbook.nist.gov/): isobaric heat capacity of N₂ at 1 bar, 3002000 K.",
"metadata": {}
},
{
"cell_type": "code",
"id": "t4lqkcoeyil",
"source": "import numpy as np\nimport matplotlib.pyplot as plt\n\ndata = np.loadtxt(\"data/n2_cp.csv\", delimiter=\",\", skiprows=1)\nT_raw = data[:, 0] # Temperature (K)\nCp_raw = data[:, 1] # Cp (kJ/kg/K)\n\nplt.figure(figsize=(8, 5))\nplt.plot(T_raw, Cp_raw, 'ko', markersize=6)\nplt.xlabel('Temperature (K)')\nplt.ylabel('$C_p$ (kJ/kg/K)')\nplt.title('$C_p(T)$ for N$_2$ at 1 bar — NIST WebBook')\nplt.show()\n\nprint(f\"{len(T_raw)} data points, T range: {T_raw.min():.0f} {T_raw.max():.0f} K\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "1jyrgsvp7op",
"source": "## 2. Polynomial fit (baseline)\n\nTextbooks fit $C_p(T)$ with a polynomial: $C_p = a + bT + cT^2 + dT^3$. This is a **4-parameter** model. Let's fit it with `numpy.polyfit` and see how well it does.",
"metadata": {}
},
{
"cell_type": "code",
"id": "4smvu4z2oro",
"source": "# Fit a cubic polynomial\ncoeffs = np.polyfit(T_raw, Cp_raw, 3)\npoly = np.poly1d(coeffs)\n\nT_fine = np.linspace(T_raw.min(), T_raw.max(), 200)\nCp_poly = poly(T_fine)\n\n# Compute residuals\nCp_poly_at_data = poly(T_raw)\nmse_poly = np.mean((Cp_poly_at_data - Cp_raw) ** 2)\n\nplt.figure(figsize=(8, 5))\nplt.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')\nplt.plot(T_fine, Cp_poly, 'b-', linewidth=2, label=f'Cubic polynomial (4 params)')\nplt.xlabel('Temperature (K)')\nplt.ylabel('$C_p$ (kJ/kg/K)')\nplt.title('Polynomial fit')\nplt.legend()\nplt.show()\n\nprint(f\"Polynomial coefficients: {coeffs}\")\nprint(f\"MSE: {mse_poly:.8f}\")\nprint(f\"Parameters: 4\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "97y7mrcekji",
"source": "## 3. Neural network from scratch (numpy)\n\nNow let's build a one-hidden-layer neural network. The architecture:\n\n```\nInput (1: T) → Hidden (10 neurons, tanh) → Output (1: Cp)\n```\n\nWe need to:\n1. **Normalize** the data to [0, 1] so the network trains efficiently\n2. **Forward pass**: compute predictions from input through each layer\n3. **Loss**: mean squared error between predictions and data\n4. **Backpropagation**: compute gradients of the loss w.r.t. each weight using the chain rule\n5. **Gradient descent**: update weights in the direction that reduces the loss\n\nThis is exactly what nanoGPT's `train.py` does — just at a much larger scale.",
"metadata": {}
},
{
"cell_type": "code",
"id": "365o7bqbwkr",
"source": "# Normalize inputs and outputs to [0, 1]\nT_min, T_max = T_raw.min(), T_raw.max()\nCp_min, Cp_max = Cp_raw.min(), Cp_raw.max()\n\nT = (T_raw - T_min) / (T_max - T_min)\nCp = (Cp_raw - Cp_min) / (Cp_max - Cp_min)\n\nX = T.reshape(-1, 1) # (N, 1) input matrix\nY = Cp.reshape(-1, 1) # (N, 1) target matrix\nN = X.shape[0]\n\n# Network setup\nH = 10 # hidden neurons\n\nnp.random.seed(42)\nW1 = np.random.randn(1, H) * 0.5 # input → hidden weights\nb1 = np.zeros((1, H)) # hidden biases\nW2 = np.random.randn(H, 1) * 0.5 # hidden → output weights\nb2 = np.zeros((1, 1)) # output bias\n\nprint(f\"Parameters: W1({W1.shape}) + b1({b1.shape}) + W2({W2.shape}) + b2({b2.shape})\")\nprint(f\"Total: {W1.size + b1.size + W2.size + b2.size} parameters for {N} data points\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"id": "5w1ezs9t2w6",
"source": "# Training loop\nlearning_rate = 0.01\nepochs = 5000\nlog_interval = 500\nlosses_np = []\n\nfor epoch in range(epochs):\n # Forward pass\n Z1 = X @ W1 + b1 # hidden pre-activation (N, H)\n A1 = np.tanh(Z1) # hidden activation (N, H)\n Y_pred = A1 @ W2 + b2 # output (N, 1)\n\n # Loss (mean squared error)\n error = Y_pred - Y\n loss = np.mean(error ** 2)\n losses_np.append(loss)\n\n # Backpropagation (chain rule, working backward)\n dL_dYpred = 2 * error / N\n dL_dW2 = A1.T @ dL_dYpred\n dL_db2 = np.sum(dL_dYpred, axis=0, keepdims=True)\n dL_dA1 = dL_dYpred @ W2.T\n dL_dZ1 = dL_dA1 * (1 - A1 ** 2) # tanh derivative\n dL_dW1 = X.T @ dL_dZ1\n dL_db1 = np.sum(dL_dZ1, axis=0, keepdims=True)\n\n # Gradient descent update\n W2 -= learning_rate * dL_dW2\n b2 -= learning_rate * dL_db2\n W1 -= learning_rate * dL_dW1\n b1 -= learning_rate * dL_db1\n\n if epoch % log_interval == 0 or epoch == epochs - 1:\n print(f\"Epoch {epoch:5d} Loss: {loss:.6f}\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"id": "onel9r0kjk",
"source": "# Predict on a fine grid and convert back to physical units\nT_fine_norm = np.linspace(0, 1, 200).reshape(-1, 1)\nA1_fine = np.tanh(T_fine_norm @ W1 + b1)\nCp_nn_norm = A1_fine @ W2 + b2\nCp_nn = Cp_nn_norm * (Cp_max - Cp_min) + Cp_min\nT_fine_K = T_fine_norm * (T_max - T_min) + T_min\n\n# MSE in original units for comparison with polynomial\nCp_nn_at_data = np.tanh(X @ W1 + b1) @ W2 + b2\nCp_nn_at_data = Cp_nn_at_data * (Cp_max - Cp_min) + Cp_min\nmse_nn = np.mean((Cp_nn_at_data.flatten() - Cp_raw) ** 2)\n\nfig, (ax1, ax2) = plt.subplots(1, 2, figsize=(13, 5))\n\nax1.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')\nax1.plot(T_fine, Cp_poly, 'b-', linewidth=2, label=f'Polynomial (4 params, MSE={mse_poly:.2e})')\nax1.plot(T_fine_K.flatten(), Cp_nn.flatten(), 'r-', linewidth=2, label=f'NN numpy (31 params, MSE={mse_nn:.2e})')\nax1.set_xlabel('Temperature (K)')\nax1.set_ylabel('$C_p$ (kJ/kg/K)')\nax1.set_title('$C_p(T)$ for N$_2$ at 1 bar')\nax1.legend()\n\nax2.semilogy(losses_np)\nax2.set_xlabel('Epoch')\nax2.set_ylabel('MSE (normalized)')\nax2.set_title('Training loss — numpy NN')\n\nplt.tight_layout()\nplt.show()",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "ea9z35qm9u8",
"source": "## 4. Neural network in PyTorch\n\nThe same network, but PyTorch handles backpropagation automatically. Compare the training loop above to the one below — `loss.backward()` replaces all of our manual gradient calculations.\n\nThis is the same API used in nanoGPT's `model.py` — `nn.Linear`, activation functions, `optimizer.step()`.",
"metadata": {}
},
{
"cell_type": "code",
"id": "3qxnrtyxqgz",
"source": "import torch\nimport torch.nn as nn\n\n# Prepare data as PyTorch tensors\nX_t = torch.tensor((T_raw - T_min) / (T_max - T_min), dtype=torch.float32).reshape(-1, 1)\nY_t = torch.tensor((Cp_raw - Cp_min) / (Cp_max - Cp_min), dtype=torch.float32).reshape(-1, 1)\n\n# Define the network\nmodel = nn.Sequential(\n nn.Linear(1, H), # input → hidden (W1, b1)\n nn.Tanh(), # activation\n nn.Linear(H, 1), # hidden → output (W2, b2)\n)\n\nprint(model)\nprint(f\"Total parameters: {sum(p.numel() for p in model.parameters())}\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"id": "ydl3ycnypps",
"source": "# Train\noptimizer = torch.optim.Adam(model.parameters(), lr=0.01)\nloss_fn = nn.MSELoss()\nlosses_torch = []\n\nfor epoch in range(epochs):\n Y_pred_t = model(X_t)\n loss = loss_fn(Y_pred_t, Y_t)\n losses_torch.append(loss.item())\n\n optimizer.zero_grad() # reset gradients\n loss.backward() # automatic differentiation\n optimizer.step() # update weights\n\n if epoch % log_interval == 0 or epoch == epochs - 1:\n print(f\"Epoch {epoch:5d} Loss: {loss.item():.6f}\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "bg0kvnk4ho",
"source": "## 5. Compare all three approaches",
"metadata": {}
},
{
"cell_type": "code",
"id": "h2dfstoh8gd",
"source": "# PyTorch predictions\nT_fine_t = torch.linspace(0, 1, 200).reshape(-1, 1)\nwith torch.no_grad():\n Cp_torch_norm = model(T_fine_t)\nCp_torch = Cp_torch_norm.numpy() * (Cp_max - Cp_min) + Cp_min\n\n# MSE for PyTorch model\nwith torch.no_grad():\n Cp_torch_at_data = model(X_t).numpy() * (Cp_max - Cp_min) + Cp_min\nmse_torch = np.mean((Cp_torch_at_data.flatten() - Cp_raw) ** 2)\n\nfig, (ax1, ax2) = plt.subplots(1, 2, figsize=(13, 5))\n\n# Left: all three fits\nax1.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')\nax1.plot(T_fine, Cp_poly, 'b-', linewidth=2, label=f'Polynomial (4 params)')\nax1.plot(T_fine_K.flatten(), Cp_nn.flatten(), 'r--', linewidth=2, label=f'NN numpy (31 params)')\nax1.plot(T_fine_K.flatten(), Cp_torch.flatten(), 'g-', linewidth=2, alpha=0.8, label=f'NN PyTorch (31 params)')\nax1.set_xlabel('Temperature (K)')\nax1.set_ylabel('$C_p$ (kJ/kg/K)')\nax1.set_title('$C_p(T)$ for N$_2$ at 1 bar')\nax1.legend()\n\n# Right: training loss comparison\nax2.semilogy(losses_np, label='numpy (gradient descent)')\nax2.semilogy(losses_torch, label='PyTorch (Adam)')\nax2.set_xlabel('Epoch')\nax2.set_ylabel('MSE (normalized)')\nax2.set_title('Training loss comparison')\nax2.legend()\n\nplt.tight_layout()\nplt.show()\n\nprint(f\"MSE — Polynomial: {mse_poly:.2e} | NN numpy: {mse_nn:.2e} | NN PyTorch: {mse_torch:.2e}\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "xyw3sr20brn",
"source": "## 6. Extrapolation\n\nHow do the models behave *outside* the training range? This is a key test — and where the differences become stark.",
"metadata": {}
},
{
"cell_type": "code",
"id": "fi3iq2sjh6",
"source": "# Extrapolate beyond the training range\nT_extrap = np.linspace(100, 2500, 300)\nT_extrap_norm = ((T_extrap - T_min) / (T_max - T_min)).reshape(-1, 1)\n\n# Polynomial extrapolation\nCp_poly_extrap = poly(T_extrap)\n\n# Numpy NN extrapolation\nA1_extrap = np.tanh(T_extrap_norm @ W1 + b1)\nCp_nn_extrap = (A1_extrap @ W2 + b2) * (Cp_max - Cp_min) + Cp_min\n\n# PyTorch NN extrapolation\nwith torch.no_grad():\n Cp_torch_extrap = model(torch.tensor(T_extrap_norm, dtype=torch.float32)).numpy()\nCp_torch_extrap = Cp_torch_extrap * (Cp_max - Cp_min) + Cp_min\n\nplt.figure(figsize=(10, 6))\nplt.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')\nplt.plot(T_extrap, Cp_poly_extrap, 'b-', linewidth=2, label='Polynomial')\nplt.plot(T_extrap, Cp_nn_extrap.flatten(), 'r--', linewidth=2, label='NN numpy')\nplt.plot(T_extrap, Cp_torch_extrap.flatten(), 'g-', linewidth=2, alpha=0.8, label='NN PyTorch')\nplt.axvline(T_raw.min(), color='gray', linestyle=':', alpha=0.5, label='Training range')\nplt.axvline(T_raw.max(), color='gray', linestyle=':', alpha=0.5)\nplt.xlabel('Temperature (K)')\nplt.ylabel('$C_p$ (kJ/kg/K)')\nplt.title('Extrapolation beyond training data')\nplt.legend()\nplt.show()",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "yb2s18keiw",
"source": "## 7. Exercises\n\nTry these in new cells below:\n\n1. **Change the number of hidden neurons** (`H`). Try 2, 5, 20, 50. How does the fit change? At what point does adding neurons stop helping?\n\n2. **Activation functions**: In the PyTorch model, replace `nn.Tanh()` with `nn.ReLU()` or `nn.Sigmoid()`. How does the fit change?\n\n3. **Optimizer comparison**: Replace `Adam` with `torch.optim.SGD(model.parameters(), lr=0.01)`. How does training speed compare?\n\n4. **Remove normalization**: Use `T_raw` and `Cp_raw` directly (no scaling to [0,1]). What happens? Can you fix it by adjusting the learning rate?\n\n5. **Overfitting**: Set `H = 100` and train for 20,000 epochs. Does it fit the training data well? Look at the extrapolation — is it reasonable?\n\n6. **Higher-order polynomial**: Try `np.polyfit(T_raw, Cp_raw, 10)`. How does it compare to the cubic? How does it extrapolate?",
"metadata": {}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.12.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

21
LICENSE Normal file
View file

@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025-2026 Eric M. Furst, University of Delaware

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

56
README.md Normal file
View file

@ -0,0 +1,56 @@
# LLMs for Engineers

**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware

A hands-on workshop on Large Language Models and machine learning for engineers. Learn how to train a GPT from scratch, run local models, and build retrieval-augmented generation systems, then tie it all back to the underlying machine learning methods by implementing a simple neural network.
## Sections

| # | Topic | Description |
|---|-------|-------------|
| [01](01-nanogpt/) | **nanoGPT** | Train a small transformer on Shakespeare. Explore model parameters, temperature, and text generation. |
| [02](02-ollama/) | **Local models with Ollama** | Run pre-trained LLMs locally. Summarize documents, query arXiv, generate code, build custom models. |
| [03](03-rag/) | **Retrieval-Augmented Generation** | Build a RAG system: chunk documents, embed them, and query with an LLM grounded in your own data. |
| [04](04-semantic-search/) | **Advanced retrieval** | Build hybrid BM25 + vector search with cross-encoder re-ranking. Compare summarization with raw retrieval. |
| [05](05-neural-networks/) | **Building a neural network** | Implement a one-hidden-layer network from scratch in numpy, then in PyTorch. Fit $C_p(T)$ data for N₂. |
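
The one-hidden-layer network of section 05 is compact enough to sketch here. This is an illustrative forward pass only — random, untrained weights and variable names of my choosing, not the workshop's code. With 10 hidden neurons it has 10 + 10 + 10 + 1 = 31 trainable parameters:

```python
import numpy as np

# Forward pass of a one-hidden-layer network: x -> tanh(x W1 + b1) W2 + b2
# Weights here are random; in section 05 they are fit to Cp(T) data.
rng = np.random.default_rng(0)
H = 10                                            # hidden neurons
W1, b1 = rng.normal(size=(1, H)), np.zeros(H)     # input -> hidden
W2, b2 = rng.normal(size=(H, 1)), np.zeros(1)     # hidden -> output

def forward(x):
    """Map an (n, 1) input column to an (n, 1) prediction."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

x = np.linspace(0.0, 1.0, 5).reshape(-1, 1)       # e.g. normalized temperatures
print(forward(x).shape)                           # prints (5, 1)
```

Counting entries of `W1`, `b1`, `W2`, and `b2` gives 3H + 1 parameters, which is where the 31 comes from; training is then just adjusting those numbers to minimize the squared error against data.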
## Prerequisites

- A terminal (macOS/Linux, or WSL on Windows)
- Python 3.10+
- Basic comfort with the command line
- [Ollama](https://ollama.com) (sections 02–04)
## Getting started

Clone this repository and work through each section in order:

```bash
git clone https://lem.che.udel.edu/git/furst/llm-workshop.git
cd llm-workshop
```

Each section has its own `README.md` with a full walkthrough, exercises, and any code or data needed.
### Python environment

Create a virtual environment once and reuse it across sections:

```bash
python3 -m venv llm
source llm/bin/activate
pip install numpy torch matplotlib
```

Sections 03 and 04 have additional dependencies listed in their `requirements.txt` files.
## License

MIT

## Author

Eric M. Furst, University of Delaware