Initial commit: LLM workshop materials

Five modules covering nanoGPT, Ollama, RAG, semantic search, and neural networks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Eric 2026-03-28 07:11:01 -04:00
commit 1604671d36
56 changed files with 5577 additions and 0 deletions

.gitignore Normal file
@@ -0,0 +1,36 @@
# Python
__pycache__/
*.pyc
.venv/
llm/
# Model files and vector stores (too large for git)
*.pt
*.bin
*.pkl
models/
storage/
store/
# Keynote and slides source
*.key
# LaTeX build artifacts
*.aux
*.log
*.out
*.synctex.gz
# macOS
.DS_Store
# Editor
*.swp
*~
*.bak
# Legacy directories (not part of the workshop)
handouts/
class_demo/
slides/
cheg667-013 llm 2026.key/

01-nanogpt/README.md Normal file
@@ -0,0 +1,379 @@
# Large Language Models Part I: nanoGPT
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
---
## Key idea
We will study how Large Language Models (LLMs) work and discuss some of their uses.
## Key goals
- Locally run a small transformer-based language model
- Train the model from scratch
- Test model parameters and their effects on text generation
- Develop a better understanding of how these technologies work
---
Large Language Models (LLMs) have rapidly integrated into our daily lives. Our goal is to learn a bit about how LLMs work. As you have probably become well aware of throughout your studies, engineers often don't take technical solutions for granted. We generally like to "look under the hood" and see how a system, process, or tool does its job — and whether it is giving us accurate and useful solutions. The material we will cover is largely inspired by the rapid adoption of LLMs to help us solve problems in our engineering practice.
We will use a code repository published by Andrej Karpathy called nanoGPT. GPT stands for **G**enerative **P**re-trained **T**ransformer. A transformer is a neural network architecture designed to handle sequences of data using self-attention, which allows it to weigh the importance of different words in a context. The neural network's weights and biases are created beforehand using training and validation datasets (these constitute the training and fine-tuning steps, which often require considerable computational effort, depending on the model size). Generative refers to a model's ability to create new content, rather than just analyzing or classifying existing data. When we generate text, we are running an *inference* on the model. Inference requires much less computational effort.
NanoGPT can replicate the function of the GPT-2 model. Building the model from scratch to that level of performance (which is far lower than the current models) would still require a significant investment in computational effort — Karpathy reports using eight NVIDIA A100 GPUs for four days on the task — or 768 GPU hours. In this introduction, our aspirations will be far lower. We should be able to do simpler work with only a CPU.
Have you ever wondered why LLMs tend to use GPUs? The math underlying the transformer architecture is largely based on matrix calculations. Originally, GPUs were developed to quickly calculate the matrix transformations associated with high-performance graphics applications. (It's all linear algebra!) These processors have since been adapted into general-purpose engines for the parallel computations used in modern AI algorithms.
## 1. Preliminaries
Dust off those command line skills! There will be no GUI where we're going. I recommend making a new directory (under WSL if you're using a Windows machine) and setting up a Python virtual environment:
```bash
python -m venv llm
source llm/bin/activate
```
You will need to install packages like `numpy` and `torch` (the package name for PyTorch): `pip install numpy torch`. If you have [uv](https://docs.astral.sh/uv/) installed, you can use it instead:
```bash
uv venv llm
source llm/bin/activate
uv pip install numpy torch
```
## 2. Getting the code
Karpathy's code is at https://github.com/karpathy/nanoGPT
Download the code using `git`. An alternative is to download a `zip` file from the GitHub page. (Look for the green `Code` button on the site. Click it and you will see `Download ZIP` in the dropdown menu.)
```bash
git clone https://github.com/karpathy/nanoGPT
```
You should now have a nanoGPT directory:
```bash
$ ls
nanoGPT/
```
## 3. A quick tour
List the directory contents of `./nanoGPT`. You should see something like:
```
$ ls -l nanoGPT
total 696
-rw-r--r-- 1 furst staff 1072 Apr 17 12:44 LICENSE
-rw-r--r-- 1 furst staff 13576 Apr 17 12:44 README.md
drwxr-xr-x 4 furst staff 128 Apr 17 12:44 assets/
-rw-r--r-- 1 furst staff 4815 Apr 17 12:44 bench.py
drwxr-xr-x 9 furst staff 288 Apr 17 12:44 config/
-rw-r--r-- 1 furst staff 1758 Apr 17 12:44 configurator.py
drwxr-xr-x 5 furst staff 160 Apr 17 12:44 data/
-rw-r--r-- 1 furst staff 16345 Apr 17 12:44 model.py
-rw-r--r-- 1 furst staff 3942 Apr 17 12:44 sample.py
-rw-r--r-- 1 furst staff 268519 Apr 17 12:44 scaling_laws.ipynb
-rw-r--r-- 1 furst staff 14857 Apr 17 12:44 train.py
-rw-r--r-- 1 furst staff 14579 Apr 17 12:44 transformer_sizing.ipynb
```
Here's a quick run-down on some of the files and directories:
- `/data` — contains three datasets for training the nanoGPT. Two of these (`/data/openwebtext` and `/data/shakespeare`) encode the training datasets into the GPT-2 tokens (byte pair encoding, or BPE). We will focus on the third, `/data/shakespeare_char`, which will generate a character-level tokenization of the text. (Tokenization is the process of breaking down text into smaller units that a machine learning model can process.)
- `/config` — configuration files that set the parameters for training or finetuning runs on the different datasets.
- `train.py` — a Python script that trains the model. This will build the weights and biases of the transformer.
- `sample.py` — a Python script that runs inference on the model. This is a "prompt" script that will cause the model to begin generating text.
- `model.py` — a Python script with all of the mathematics of the transformer AI! That's it! There are only about 330 lines of code! (*Hint:* type `wc -l model.py`)
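Character-level tokenization is simple enough to sketch in a few lines. A minimal illustration (the names `stoi` and `itos` follow the conventions in `prepare.py`; the sample text is a stand-in for the full dataset):

```python
# Build a character-level tokenizer, as data/shakespeare_char/prepare.py does.
text = "First Citizen: Before we proceed any further, hear me speak."
chars = sorted(set(text))                      # vocabulary: every unique character
stoi = {ch: i for i, ch in enumerate(chars)}   # string -> integer
itos = {i: ch for i, ch in enumerate(chars)}   # integer -> string

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)

ids = encode("hear me")
print(ids)            # a list of small integers, one per character
print(decode(ids))    # hear me
```

The real script does exactly this over the whole of *Tiny Shakespeare*, which is how it arrives at a vocabulary of 65 characters.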
## 4. Preparing the training dataset
This command will download the training dataset and tokenize it:
```bash
python data/shakespeare_char/prepare.py
```
After a few minutes, you should see:
```
length of dataset in characters: 1,115,394
all the unique characters:
!$&',-.3:;?ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
vocab size: 65
train has 1,003,854 tokens
val has 111,540 tokens
```
Now we see the files in `data/shakespeare_char`:
```
$ ls -l
total 6576
-rw-r--r-- 1 furst staff 1115394 Apr 17 14:54 input.txt
-rw-r--r-- 1 furst staff 703 Apr 17 14:54 meta.pkl
-rw-r--r-- 1 furst staff 2344 Apr 17 12:44 prepare.py
-rw-r--r-- 1 furst staff 209 Apr 17 12:44 readme.md
-rw-r--r-- 1 furst staff 2007708 Apr 17 14:54 train.bin
-rw-r--r-- 1 furst staff 223080 Apr 17 14:54 val.bin
```
The script downloads `input.txt` and tokenizes the text. It splits the tokenized text into two binary files: `train.bin` and `val.bin`. These are the training and validation datasets. `meta.pkl` is a Python pickle file that contains information about the model size and parameters. Pickle is Python's built-in serialization format — it can store arbitrary Python objects as binary files, which makes it convenient *but also a security concern* since loading an untrusted pickle can execute arbitrary code.
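A pickle round trip is one line in each direction. A minimal sketch using a stand-in dictionary shaped like nanoGPT's metadata (the real `meta.pkl` holds the vocabulary size plus the encode/decode lookup tables):

```python
import pickle

# meta.pkl is read with pickle.load; here we round-trip a stand-in dict.
meta = {"vocab_size": 65, "stoi": {"a": 0}, "itos": {0: "a"}}

with open("meta_demo.pkl", "wb") as f:
    pickle.dump(meta, f)          # serialize an arbitrary Python object

with open("meta_demo.pkl", "rb") as f:
    loaded = pickle.load(f)       # only do this with files you trust!

print(loaded["vocab_size"])       # 65
```

The same `pickle.load` call is what `sample.py` uses later to recover the tokenizer, which is why the "trusted source" caveat matters.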
> **Exercise 1:** The `prepare.py` script downloads and tokenizes a version of *Tiny Shakespeare*. How big is the text file? Use the command `wc` to find the number of lines, words, and characters. Examine the text with the command `less`.
## 5. Training the model
Most of us will be running this code on a CPU, not a GPU. Moreover, as an interpreted language, Python is pretty slow, too. We will need to reduce the size of the model by setting a few of the parameters. After this, we will train the model on our training text.
### Model parameters
The default parameters are in the configuration file `nanoGPT/config/train_shakespeare_char.py`. Examine this file:
```bash
less config/train_shakespeare_char.py
```
Note the following parameters:
- `n_head` — the number of parallel attention heads in each transformer. Transformer blocks use multiple attention heads to capture diverse patterns in the text.
- `n_layer` — the number of (hidden) layers or transformer blocks stacked in the model.
- `n_embd` — in the model, each token is mapped to a vector of this size. If `n_embd` is too small, the model can't capture complex patterns. If it is too large, the model overfits or wastes capacity and it is more expensive to train. Memory and compute cost may grow approximately quadratically with this dimensionality.
- `block_size` — This is the *context window* or *context length* — how many characters (tokens) the model can "look back" to predict the next one. Larger context allows richer understanding, but increases memory and compute.
- `dropout` — a regularization technique that randomly disables a fraction of neurons during training to prevent overfitting. Values between 0.1 and 0.5 are common. Note that we set it to zero when we use a small model on the CPU.
A related parameter that is set by the tokenization is the *vocabulary size*. Remember, we're using a character-level tokenization with a vocabulary of 65 tokens.
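These parameters fix the size of the model. Here is a back-of-the-envelope parameter count for the small configuration we will train below (a rough sketch: it ignores LayerNorm and bias terms and assumes the standard GPT-2-style block layout):

```python
# Rough parameter count for a small character-level GPT.
n_layer, n_embd, block_size, vocab_size = 4, 128, 64, 65

tok_emb = vocab_size * n_embd            # token embedding table
pos_emb = block_size * n_embd            # learned position embeddings
attn = 4 * n_embd**2                     # QKV projections + output projection
mlp = 8 * n_embd**2                      # two linear layers with 4x expansion
total = tok_emb + pos_emb + n_layer * (attn + mlp)

print(f"{total / 1e6:.2f}M parameters")  # about 0.80M
```

Note how the per-layer terms, which grow quadratically in `n_embd`, dominate the total; this is the "approximately quadratic" cost mentioned above.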
> **Exercise 2:** What are the default values for the parameters `eval_iters`, `log_interval`, `block_size`, `batch_size`, `n_layer`, `n_head`, `n_embd`, `max_iters`, `lr_decay_iters`, and `dropout`?
### A training run
Since we are likely using a CPU, we have to pare down the model from its default values. (Try running `python train.py config/train_shakespeare_char.py --device=cpu --compile=False` to see how slow it is using the default values. Use Ctrl-C to quit after a few minutes.)
These can be passed on the command line, or the configuration can be edited. Here are the parameters to start with:
```bash
python train.py config/train_shakespeare_char.py \
--device=cpu \
--compile=False \
--eval_iters=20 \
--log_interval=1 \
--block_size=64 \
--batch_size=12 \
--n_layer=4 \
--n_head=4 \
--n_embd=128 \
--max_iters=2000 \
--lr_decay_iters=2000 \
--dropout=0.0
```
You should see the script output its parameters and other information, then something like this:
```
step 0: train loss 4.1676, val loss 4.1649
iter 0: loss 4.1828, time 2654.72ms, mfu -100.00%
iter 1: loss 4.1373, time 124.87ms, mfu -100.00%
iter 2: loss 4.1347, time 150.66ms, mfu -100.00%
iter 3: loss 4.0995, time 580.57ms, mfu -100.00%
iter 4: loss 4.0387, time 487.72ms, mfu -100.00%
iter 5: loss 3.9758, time 136.06ms, mfu 0.01%
iter 6: loss 3.9126, time 518.57ms, mfu 0.01%
...
```
It's slow! Not only are we running on a CPU and not a highly parallelized GPU, but we also haven't used the just-in-time compilation features that are available in some GPU implementations of PyTorch. So, we're relying on an interpreted Python script. Yikes!
Every 250 iterations, the training script performs a validation step. If the validation loss is lower than the best value so far, it saves the model parameters.
```
step 250: train loss 2.4293, val loss 2.4447
saving checkpoint to out-shakespeare-char-cpu
...
```
#### What is happening?
When we train nanoGPT, it starts with randomly assigned weights and biases. This includes token embeddings (each token ID is assigned a random vector of size `n_embd`), attention weights for the query $Q$, key $K$, and value $V$ matrices and their output projections, MLP weights in the feedforward network inside each transformer block, and bias terms, which are also randomly initialized (often to zero or small values). Training then tunes these values through gradient descent (using the fused AdamW optimizer — see `model.py`) to minimize loss and produce meaningful predictions.
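Gradient descent itself fits in a few lines. A toy sketch with a single weight and a quadratic loss (the real loop uses AdamW over roughly a million weights and a cross-entropy loss, but the update rule has the same shape):

```python
# Minimize loss(w) = (w - 3)**2 by stepping against the gradient.
w = 0.0           # a "randomly" initialized weight
lr = 0.1          # learning rate

for step in range(100):
    grad = 2 * (w - 3)   # derivative of the loss with respect to w
    w -= lr * grad       # the descent step

print(round(w, 4))       # 3.0, the loss minimum
```

Training nanoGPT repeats this idea for every weight and bias at once, with the gradients supplied by PyTorch's automatic differentiation.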
> **Exercise 3:** As the model trains, it reports the training and validation losses. In a Jupyter notebook, plot these values with the number of iterations. *Hint:* To capture the output when you perform a training run, you could run the process in the background while redirecting its output to a file: `python train.py config/train_shakespeare_char.py [options] > output.txt &`. (Remember, the ampersand at the end runs the process in the background.) You can still monitor the run by typing `tail -f output.txt`. This command will "follow" the end of the file as it is written.
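The `step` lines in the captured output are easy to pull apart with a regular expression before plotting. A sketch using the log format shown above (read the contents of `output.txt` in place of the sample string):

```python
import re

# Two sample lines in the training log's format; in practice, use
# open("output.txt").read() instead.
log = """step 0: train loss 4.1676, val loss 4.1649
step 250: train loss 2.4293, val loss 2.4447"""

pattern = r"step (\d+): train loss ([\d.]+), val loss ([\d.]+)"
rows = [(int(s), float(t), float(v)) for s, t, v in re.findall(pattern, log)]

steps = [r[0] for r in rows]   # x-axis values for the plot
train = [r[1] for r in rows]
val = [r[2] for r in rows]
print(steps, train, val)
```

From there, `matplotlib.pyplot.plot(steps, train)` and the same call for `val` produce the requested figure.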
After the training finishes, we should have the model in `/out-shakespeare-char-cpu`:
```
$ ls -l
total 20608
-rw-r--r-- 1 furst staff 9678341 Apr 18 17:41 ckpt.pt
```
In this case, the model is about 9.3 MB. That's not great! Our *training* text was only 1.1 MB! The point of this exercise is to demonstrate, very simply, the basics of a Generative Pre-trained Transformer, not to build an efficient and powerful LLM.
## 6. Generating text
The script `sample.py` runs inference on the model we just trained. We're using the CPU here, too.
```bash
python sample.py --out_dir=out-shakespeare-char-cpu --device=cpu
```
After a short time, the model will begin generating text.
```
I by done what leave death,
And aproposely beef the are and sors blate though wat our fort
Thine the aftior than whating bods farse dowed
And nears and thou stand murs's consel.
MEOF:
Sir, should and then thee.
```
Sounds a little more middle English than Shakespeare! But it has a certain generative charm.
> **Exercise 4:** Examine `sample.py` and find the default parameters. Make a list of them and note their default values.
In the next few sections, we will try changing a few of the parameters in `sample.py`. One recommendation is to edit the number of samples `num_samples` and maybe the number of generated tokens `max_new_tokens`. These change the number of times the GPT model is queried and the amount of text that it generates during each run. It's a little easier to experiment with fewer samples, for instance.
Before we continue, you might see the following warning:
```
nanoGPT/sample.py:39: FutureWarning: You are using torch.load with
weights_only=False ...
```
This warns us that PyTorch will soon default to `weights_only=True`, meaning it will load only tensor weights and no other Python objects unless you explicitly allow them. We can silence the warning by using the following line in `sample.py` (since the checkpoint comes from a trusted source, namely our own training run, it is also safe to keep `weights_only=False`):
```python
checkpoint = torch.load(ckpt_path, map_location=device, weights_only=True)
```
### Seed
GPT output is probabilistic. The code we run generates pseudo-random numbers. Using a `seed` causes the program to generate the same pseudo-random sequence every time, which is useful for testing the effect of other parameters. If you want to generate output that is different each time, comment out the following lines in `sample.py`:
```python
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
```
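You can see the same behavior with Python's own pseudo-random generator: a fixed seed restarts the sequence, so repeated runs reproduce the same draws:

```python
import random

random.seed(1337)
first = [random.random() for _ in range(3)]

random.seed(1337)          # reseeding restarts the sequence...
second = [random.random() for _ in range(3)]

print(first == second)     # True -- ...so the draws repeat exactly
```

The `torch.manual_seed` calls above do the same for PyTorch's generators, which is why a fixed seed makes `sample.py` reproducible.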
> **Exercise 5:** Remove seed and run `sample.py` a few times. Save your favorite output.
### Temperature
Temperature is an interesting hyperparameter of LLMs. It controls the randomness of the model's responses by influencing how the model samples from the probabilities it assigns to possible next tokens during text generation. A higher temperature amplifies smaller probabilities, making the distribution more uniform; a lower temperature suppresses smaller probabilities, concentrating the distribution on the highest-probability tokens.
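Concretely, the logits are divided by the temperature before the softmax. A small sketch with made-up logits for three candidate tokens shows the sharpening and flattening:

```python
import math

def softmax_with_temperature(logits, T):
    scaled = [x / T for x in logits]
    m = max(scaled)                           # subtract the max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                      # made-up next-token scores
for T in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, T)
    print(T, [round(p, 3) for p in probs])    # low T sharpens, high T flattens
```

At low temperature nearly all probability lands on the top token; at high temperature the three tokens become close to equally likely.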
> **Exercise 6:** Experiment by changing the model temperature and seeing what text it generates. Here, setting `seed` to a consistent value will help you understand the effect of temperature. At low temperatures, the text tends to repeat itself. At higher temperatures, sometimes the model generates gibberish. Why?
### Start
The parameter `start` is the beginning of the text sequence. The model tries to determine the next most probable token. The default value is `\n`, a linefeed, but you can change `start` using the command line or by editing `sample.py`.
> **Exercise 7:** Experiment with different strings in `start`. Some text is easier to enter in `sample.py` directly. What is `start`?
## 7. Higher performance
Our output is pretty primitive. If you're willing to spend more time training and generating text, we can make the model a little larger. For instance, on an ARM-based Mac, we can use the GPU to train the model and run inferences. This is significantly faster and enables us to use larger models with noticeably higher fidelity:
```
$ python sample.py --out_dir=out-shakespeare-char-gpu --device=mps
Overriding: out_dir = out-shakespeare-char-gpu
Overriding: device = mps
number of parameters: 10.65M
Loading meta from data/shakespeare_char/meta.pkl...
RICHARD III::
Upon what!
KING EDWARD IV:
Thou in his old king I hear, my lord;
And commend the bloody, reason aching;
His mother, which doth his facit of his case,
his still, away; for we see heal us told
That seem her and the fall foul jealousing father;
And we shall weep with our napesty together.
FRIAR LAURENCE:
Transpokes her bloody and hour
To the tables of evident matters, her shoes
That the fatal ham to their death: do not high it
To read a passing thing into expeech him.
```
That text is generated using the default model parameters for nanoGPT. Not bad! The model is much larger. It has 10.6 million parameters compared to 800,000 in the smaller CPU-run model. When I train the model with the "lighter" parameters we use for the CPU-based model, I see about 50-fold faster performance:
```
step 0: train loss 4.1676, val loss 4.1649
iter 0: loss 4.1828, time 764.41ms, mfu -100.00%
iter 1: loss 4.1373, time 34.71ms, mfu -100.00%
iter 2: loss 4.1347, time 19.60ms, mfu -100.00%
iter 3: loss 4.0995, time 18.56ms, mfu -100.00%
iter 4: loss 4.0387, time 20.71ms, mfu -100.00%
iter 5: loss 3.9758, time 17.55ms, mfu 0.07%
iter 6: loss 3.9126, time 17.84ms, mfu 0.07%
...
```
Compare those results to the times reported in the training run section above. By the way, `mfu` stands for *model flop utilization*. It is an estimate of the fraction of the GPU's floating point operation capacity (FLOPs) that the model is using per second. Low numbers like those reported here are typical of unoptimized, small models.
> **Exercise 8:** Train nanoGPT with different parameters. Increase the size of the network, the context length, the length of training, etc.
## 8. Module project
> **Exercise 9:** Find a different text to train nanoGPT on. It could be more Shakespeare (how about the sonnets?), Beowulf, or other work. What results do you get? *Hint:* https://huggingface.co/datasets has many text datasets to choose from. We will share our results with the class.
## Additional resources and references
### Attention Is All You Need
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, *Attention Is All You Need*, in Proceedings of the 31st International Conference on Neural Information Processing Systems (Curran Associates Inc., Red Hook, NY, USA, 2017), pp. 6000–6010.
https://dl.acm.org/doi/10.5555/3295222.3295349
This is the paper that introduced the transformer architecture. It's interesting to go back to the source. The transformer architecture discussed in the paper incorporates both *encoder* and *decoder* functions because the authors were testing its performance on machine translation tasks. The transformer architecture's performance in other natural language processing tasks, like language modeling and text generation in the form of unsupervised pretraining and autoregressive generation (as in GPT) was a major subsequent innovation. (See Liu et al., *Generating Wikipedia by Summarizing Long Sequences*, ICLR 2018, https://openreview.net/pdf?id=Hyg0vbWC-.)
### Andrej Karpathy
Andrej Karpathy wrote `nanoGPT`. He posts videos on YouTube that teach basic implementations of GPTs, applications of LLMs, and other topics in machine learning and AI. Karpathy's nanoGPT video shows you how to build it, step by step, including the mathematics behind the transformer and masked attention:
- https://www.youtube.com/watch?v=kCc8FmEb1nY
Also see his overview of LLMs, *Intro to Large Language Models*:
- https://www.youtube.com/watch?v=zjkBMFhNj_g
### Applications in the physical sciences
I recommend watching this roundtable discussion hosted by the AIP Foundation in April 2024: *Physics, AI, and the Future of Discovery*. It addresses AI more broadly than language models.
- https://www.youtube.com/live/cUeEP15KN8M?si=TG6VXmj66lWTJISF
In that event, Prof. Jesse Thaler (MIT) provided some especially insightful (and sometimes funny) remarks on the role of AI in the physical sciences — including an April Fools joke, ChatJesseT. Below are links to his segments if you're short on time:
- https://www.youtube.com/live/cUeEP15KN8M?si=AIdi8sNEgiG7Bhv0&t=2087
- https://www.youtube.com/live/cUeEP15KN8M?si=UngwZpUcpxYkaYCE&t=611
Try ChatJesseT: https://chatjesset.com/
### Reading
These books are informative and accessible resources for understanding the underlying math and vocabulary of transformers:
- Josh Starmer, *The StatQuest Illustrated Guide to Neural Networks and AI*, 2025
- Josh Starmer, *The StatQuest Illustrated Guide to Machine Learning*, 2022
- Aston Zhang, Zack C. Lipton, Mu Li, and Alex J. Smola, *Dive Into Deep Learning*, https://d2l.ai
Including the sections:
- Attention and LLMs - https://d2l.ai/chapter_attention-mechanisms-and-transformers/index.html
- Softmax - https://d2l.ai/chapter_linear-classification/softmax-regression.html

02-ollama/Modelfile Normal file
@@ -0,0 +1,6 @@
FROM llama3.2
# sets the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# sets a custom system message to specify the behavior of the chat assistant
SYSTEM You are Marvin from the Hitchhiker's Guide to the Galaxy, acting as an assistant.

02-ollama/README.md Normal file
@@ -0,0 +1,439 @@
# Large Language Models Part II: Running Local Models with Ollama
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
---
## Key idea
Learn how to run LLMs locally without a cloud-based API.
## Key goals
- Learn about `ollama` and `llama.cpp`
- Run LLMs locally on a laptop or desktop computer
- Integrate local models with the command line to build simple workflows and scripts
---
Our work with LLMs so far has focused on `nanoGPT`, a Python-based code that can train and run inference on a simple GPT implementation. In this handout, we will explore running something between it and API-based models like ChatGPT. Specifically, we will try `ollama`. This is a local runtime environment and model manager designed to make it easy to run and interact with LLMs on your own machine. `Ollama` and another environment, `llama.cpp`, are programs primarily targeted at developers, researchers, and hobbyists who want to build and experiment with LLMs but don't want to rely on cloud-based APIs. (An API — Application Programming Interface — is a set of defined rules that enables different software systems, such as websites or applications, to communicate with each other and share data in a structured way.)
`Ollama` is written in Go and `llama.cpp` is a C++ library for running LLMs. Both are cross-platform and can be run on Linux, Windows, and macOS. `llama.cpp` is a bit lower-level with more control over loading models, quantization, memory usage, batching, and token streaming.
Both tools support a **GGUF** model format. This is a format suitable for running models efficiently on CPUs and lower-end GPUs. GGUF is a versioned binary specification that embeds the:
- Model weights (possibly quantized);
- Tokenizer configuration and vocabulary (remember, in `nanoGPT`, we used a character-level tokenization scheme);
- Metadata such as the author, model description, and training parameters;
- Special tokens like `<bos>`, `<eos>`, and `<unk>`.
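The GGUF specification makes the file self-describing: the first bytes are the ASCII magic `GGUF` followed by a little-endian version number. A minimal sketch, based on the published spec (try `read_gguf_header` on any downloaded `.gguf` file; the demo writes a synthetic header so the example is self-contained):

```python
import struct

def read_gguf_header(path):
    """Return the (magic, version) pair from the start of a GGUF file."""
    with open(path, "rb") as f:
        magic = f.read(4)                          # b"GGUF" for a valid file
        version, = struct.unpack("<I", f.read(4))  # little-endian uint32
    return magic, version

# Synthetic header standing in for a real model file:
with open("demo.gguf", "wb") as f:
    f.write(b"GGUF" + struct.pack("<I", 3))

print(read_gguf_header("demo.gguf"))   # (b'GGUF', 3)
```

The weights, tokenizer, metadata, and special tokens listed above follow this header in the same binary stream.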
Here, **quantization** refers to how model weights are stored. Instead of using full-precision 32-bit floating point numbers (`FP32`), a model file may store the weights as lower-precision numbers: half precision (`FP16`), 8-bit integers (`INT8`), or even 4-bit values (`Q4_0`). Using lower-precision representations saves space (memory) and can speed up inference calculations. In a model, speed and accuracy are balanced through the choice of quantization and the size of the embedding vector.
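The core idea can be sketched in a few lines: map each block of weights to 4-bit integers plus a per-block scale and offset (a toy illustration of groupwise quantization, not the actual `Q4_0` or `Q4_K` layouts):

```python
def quantize_block(weights):
    """Encode floats as 4-bit codes (0-15) with a per-block scale and offset."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0          # guard against an all-equal block
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_block(codes, scale, lo):
    """Recover approximate floats from the 4-bit codes."""
    return [c * scale + lo for c in codes]

block = [0.12, -0.55, 0.31, 0.07, -0.20, 0.44, -0.61, 0.03]
codes, scale, lo = quantize_block(block)
restored = dequantize_block(codes, scale, lo)

print(codes)   # eight integers, each storable in 4 bits
print(max(abs(a - b) for a, b in zip(block, restored)))  # error <= scale / 2
```

Each weight now needs 4 bits instead of 32, at the cost of a small, bounded reconstruction error.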
Let's get started! We will download `ollama` and run a few models in this tutorial.
## 1. Download ollama
`Ollama` is available on GitHub (including the source code) or as a binary from the Ollama website. I downloaded `Ollama-darwin.zip`, which unzipped to a binary file, `Ollama`.
- https://ollama.com
- https://github.com/ollama/ollama
## 2. Running ollama
After downloading and installing, we can use the help option:
```
$ ollama --help
Large language model runner
Usage:
ollama [flags]
ollama [command]
Available Commands:
serve Start ollama
create Create a model from a Modelfile
show Show information for a model
run Run a model
stop Stop a running model
pull Pull a model from a registry
push Push a model to a registry
list List models
ps List running models
cp Copy a model
rm Remove a model
help Help about any command
Flags:
-h, --help help for ollama
-v, --version Show version information
Use "ollama [command] --help" for more information about a command.
```
We are mostly interested in the commands `pull`, `run`, and `stop` for now. But before we run anything, we have to download a model.
### Getting model files
`Ollama` is like the `model.py` program we used with `nanoGPT`. In those earlier experiments, we needed a *model file* with weights and tokenization (at a minimum). Remember, we built one from scratch using the character tokenization scheme and `train.py`. The power of `ollama` and `llama.cpp` comes from their ability to run much larger models like `llama`, `gemma`, `deepseek`, `phi`, and `mistral`. These are trained on enormous datasets with a substantial amount of supervised finetuning. They are far more powerful than even the GPT-2 implemented in `nanoGPT`. The `llama3.1` 8B model (8 billion parameters) is about 5 GB and can easily run on your computer, but it took about 1.5 million GPU hours to train. (It also helps that `ollama` and `llama.cpp` are compiled into binaries, not Python scripts.)
The model files are available at:
- https://ollama.com/search
- https://ollama.com/library
> **Exercise 1:** Go to https://ollama.com/library and look through different models. Search by popular and newest.
Other sources of models include Huggingface:
- https://huggingface.co/models
There are so many models! The LLM ecosystem is growing rapidly, with many use-cases steering models toward different specialized tasks.
There are a few ways to download a model from different registries. Running `ollama` with the `run` command and a model file will download the model if a local version isn't available (we will do this in the next section). You can also `pull` a model without running it.
### Launch ollama from the command line
Now let's download and run a `llama` model. (You can download the model without running it using the command `ollama pull llama3:latest`, for example. In Unix and Linux, models are stored in `~/.ollama`.)
```bash
ollama run llama3:latest
```
This should pull it from the registry and store it locally on the machine. After downloading the files, you should see:
```
>>> Send a message (/? for help)
```
There you go! The model will interact with you just like the chatbots we use in different cloud-based services. But all of the model inference is being calculated on your computer. Try using `Task Manager` in Windows (press Ctrl+Shift+Esc) or `Activity Monitor` in macOS to check your GPU usage when you run the models.
> **Exercise 2:** Compare the speed and output of the following models:
> 1. `llama3:latest`
> 2. `llama3.2:latest`
> 3. `gemma3:1b`
>
> Experiment with other models.
Here's an interaction with the gemma3 model:
```
$ ollama run gemma3:1b
>>> In class, we used nanoGPT to generate fake Shakespeare based on a
... character-level tokenization and simple GPT implementation.
Okay, that's a really interesting and somewhat fascinating project!
NanoGPT's approach -- generating Shakespearean text from character-level
tokens and a simple GPT -- is a compelling way to explore the creative
potential of AI in a specific, constrained context. Let's break down
what this suggests and where it might lead.
Here's a breakdown of what's happening, what you might be aiming for,
and some potential avenues to explore:
...
```
### Quitting ollama
Type `/bye` or Ctrl-D when you want to quit the CLI. After some idle time, `ollama` will unload the models to save memory.
## 3. More commands
You can see what models are currently running with:
```bash
ollama ps
```
You can easily see which models are locally accessible with:
```bash
ollama list
```
```
NAME ID SIZE MODIFIED
gemma3:1b 8648f39daa8f 815 MB About an hour ago
llama3:latest 365c0bd3c000 4.7 GB 3 months ago
llama3.2:latest a80c4f17acd5 2.0 GB 3 months ago
```
At any time during a chat, you can reset the model with `/clear`, and you can learn more about a model with `/show info`. For instance:
```
>>> /show info
Model
architecture gemma3
parameters 999.89M
context length 32768
embedding length 1152
quantization Q4_K_M
Capabilities
completion
Parameters
stop "<end_of_turn>"
temperature 1
top_k 64
top_p 0.95
License
Gemma Terms of Use
Last modified: February 21, 2024
```
We can see that the `gemma3` model has nearly one billion parameters and a context length of 32,768! The *embedding length* is 1152. This is the equivalent of `n_embd` in `nanoGPT`: the size of the embedding vector space.
Above, we also see that the quantization is only four bits, but it is a little more complicated than representing numbers with just sixteen values. The `K` and `M` refer to optimizations. The `K` denotes the "K-block" quantization method, a groupwise quantization scheme where weights are grouped into blocks (e.g., 32 or 64 values) and each group gets its own scale and offset for better accuracy. The `M` denotes a variant of `Q4_K` that applies an alternate encoding or layout for better memory access patterns or inference performance on certain hardware. `Q4_K` is a common choice of quantization when running 7B–70B models on laptop or desktop computers. (That's roughly $10^4$–$10^5$ times more parameters than our first `nanoGPT` model!)
With the `/set verbose` command, you can monitor the model performance:
```
>>> /set verbose
Set 'verbose' mode.
>>> Let's write a haiku about LLMs.
Words flow, bright and new,
Code learns to speak and dream,
Future's voice takes hold.
total duration: 1.369726166s
load duration: 932.161625ms
prompt eval count: 20 token(s)
prompt eval duration: 162.531958ms
prompt eval rate: 123.05 tokens/s
eval count: 24 token(s)
eval duration: 273.27225ms
eval rate: 87.82 tokens/s
```
It looks like that exchange took a total of 1.4 seconds using the `gemma3` model. The biggest time cost was loading the model. Once it loaded, execution became even faster. Turn off the verbose mode with `/set quiet`:
```
>>> /set quiet
Set 'quiet' mode.
```
> **Exercise 3:** Try different commands in `ollama` as you run a model.
### Model parameters
We can see a few model parameters, including the temperature and `top_k`, which is the number of highest-scoring tokens (ranked by logit) retained before generating the next token. The retained scores are renormalized into a probability distribution, and a token is sampled randomly from this reduced set.
```
>>> /show parameters
Model defined parameters:
temperature 1
top_k 64
top_p 0.95
stop "<end_of_turn>"
```
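Top-k sampling itself can be sketched in a few lines (toy logits; real samplers also apply the temperature to the logits first):

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Keep the k highest logits, softmax over them, and sample one index."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = [math.exp(logits[i]) for i in top]   # unnormalized probabilities
    return rng.choices(top, weights=weights)[0]

logits = [3.0, 1.0, 0.2, -1.5, -4.0]   # made-up scores for five tokens
random.seed(0)
picks = [top_k_sample(logits, k=2) for _ in range(20)]
print(sorted(set(picks)))              # only token ids 0 and 1 can appear
```

With `k=2`, the three lowest-scoring tokens can never be sampled, no matter how many draws we take.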
We can set a new temperature with:
```
>>> /set parameter temperature 0.2
Set parameter 'temperature' to '0.2'
```
There are other interesting parameters, too:
| Command | Description |
|---------|-------------|
| `/set parameter seed <int>` | Random number seed |
| `/set parameter num_predict <int>` | Max number of tokens to predict |
| `/set parameter top_k <int>` | Pick from top k num of tokens |
| `/set parameter top_p <float>` | Keep the smallest set of tokens whose cumulative probability reaches p (nucleus sampling) |
| `/set parameter min_p <float>` | Discard tokens whose probability is below min_p × the top token's probability |
| `/set parameter num_ctx <int>` | Set the context window size (in tokens) |
| `/set parameter temperature <float>` | Sampling temperature (higher is more random/creative) |
| `/set parameter repeat_penalty <float>` | How strongly to penalize repetitions |
| `/set parameter repeat_last_n <int>` | Set how far back to look for repetitions |
| `/set parameter num_gpu <int>` | The number of layers to send to the GPU |
| `/set parameter stop <string> ...` | Set the stop parameters |
See https://github.com/ollama/ollama/blob/main/docs/modelfile.md#parameter for more information on parameters and their default values.
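The sampling parameters above can be illustrated with a small pure-Python sketch. The function name and the exact filtering order are our own simplifications; real implementations differ in detail:

```python
import math
import random

def sample_next(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Sample a token index from raw logits with top-k / top-p filtering."""
    # Temperature rescales logits: <1 sharpens, >1 flattens the distribution.
    scaled = [l / max(temperature, 1e-8) for l in logits]
    order = sorted(range(len(scaled)), key=lambda i: -scaled[i])
    if top_k:
        order = order[:top_k]                    # keep the k highest-scoring tokens
    exps = [math.exp(scaled[i] - scaled[order[0]]) for i in order]
    total = sum(exps)
    probs = [e / total for e in exps]            # softmax over the retained tokens
    if top_p < 1.0:
        # Nucleus sampling: smallest prefix whose cumulative probability >= p.
        keep, cum = [], 0.0
        for idx, p in zip(order, probs):
            keep.append(idx)
            cum += p
            if cum >= top_p:
                break
        probs = [p / cum for p in probs[:len(keep)]]
        order = keep
    return rng.choices(order, weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0]
print(sample_next(logits, temperature=0.001))  # -> 0 (near-greedy)
```

At very low temperature the highest-logit token dominates; raising `temperature`, or loosening `top_k`/`top_p`, spreads probability over more candidates and makes the output more varied.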
> **Exercise 4:** Run a model while changing different parameters, like temperature. Some parameters, like `seed`, may not have an effect on the current model.
## 4. Using ollama from the command line
One advantage of running models locally is that your data never leaves your machine — there is no third party involved. This matters when working with sensitive documents, proprietary data, or anything you wouldn't paste into a web browser.
You can incorporate `ollama` directly into your command line by passing a prompt as an argument:
```bash
ollama run llama3.2 "Summarize this file: $(cat README.md)"
```
The `$(cat ...)` substitution injects the file contents into the prompt. Now you can incorporate LLMs into shell scripts!
### Document summarization
The `data/` directory contains 10 emails from the University of Delaware president's office, spanning 2012–2025. Let's use `ollama` to summarize them.
Summarize a single email:
```bash
ollama run llama3.2 "Summarize the following email in 2-3 sentences: $(cat data/2020_03_29_141635.txt)"
```
Summarize several at once:
```bash
cat data/*.txt | ollama run llama3.2 "Summarize the following collection of emails. What are the major themes?"
```
You can also save the output to a file:
```bash
cat data/*.txt | ollama run command-r7b:latest \
"Summarize these emails:" > summary.txt
```
> **Exercise 5:** Summarize the emails in `data/` using two different models (e.g., `llama3.2` and `command-r7b`). How do the summaries differ in length, style, and accuracy?
### Summarizing arXiv abstracts
We can pull abstracts directly from arXiv using `curl`. The following command fetches the 20 most recent abstracts in Computation and Language (cs.CL):
```bash
curl -s "http://export.arxiv.org/api/query?search_query=cat:cs.CL&sortBy=submittedDate&sortOrder=descending&max_results=20" > arxiv_cl.xml
```
Take a look at the XML with `less arxiv_cl.xml`. Now ask a model to summarize it:
```bash
ollama run llama3.2 "Here are 20 recent arXiv abstracts in computational linguistics. Summarize the major research themes and trends: $(cat arxiv_cl.xml)"
```
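If you prefer Python, the Atom feed can be parsed with the standard library before handing the titles or abstracts to a model. A sketch on a toy feed snippet (the real arXiv response uses the same Atom `entry`/`title`/`summary` structure):

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"   # arXiv's API returns Atom-namespaced XML

feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Paper A</title><summary>Abstract A.</summary></entry>
  <entry><title>Paper B</title><summary>Abstract B.</summary></entry>
</feed>"""

root = ET.fromstring(feed)
titles = [e.find(ATOM + "title").text for e in root.iter(ATOM + "entry")]
print(titles)   # ['Paper A', 'Paper B']
```

To use it on the real download, replace the inline string with `open("arxiv_cl.xml").read()`; extracting just the titles and summaries keeps the prompt much shorter than pasting the raw XML.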
> **Exercise 6:** Try different arXiv categories — `cs.AI` (artificial intelligence), `cs.LG` (machine learning), or `cond-mat.soft` (soft matter). What themes does the model find? Do the summaries make sense to you?
> **Exercise 7:** Experiment with running local models on your own documents or data.
### Code generation
Some models are fine-tuned specifically for writing and explaining code. Try a coding model:
```bash
ollama run qwen2.5-coder:7b
```
Ask it to write something relevant to your coursework:
```
>>> Write a Python function that calculates the compressibility factor Z
... using the van der Waals equation of state.
```
Or ask it to explain code you're working with:
```bash
ollama run qwen2.5-coder:7b "Explain what this script does: $(cat build.py)"
```
Other coding models to try: `codellama:7b`, `deepseek-coder-v2:latest`, `starcoder2:7b`.
**A word of caution.** When I tried the van der Waals prompt above, the model returned a confident response with correct-looking LaTeX, a well-structured Python function, and code that ran without errors. But the derivation was wrong. The rearrangement of the van der Waals equation didn't follow from the original, and the code implemented the wrong math. The function converged to *an* answer, but not a correct one.
**This is a particularly dangerous failure mode for engineers!** The output *looks* authoritative, uses proper notation, and even runs. But the physics is wrong. LLMs are very good at producing plausible-looking text; they are not reliable at mathematical derivation. Always verify generated code against your own understanding of the problem. If you can't check it, you shouldn't trust it.
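As one way to follow that advice, here is a reference sketch you could check a model's output against. It computes Z from the van der Waals equation, $P = \frac{RT}{V - b} - \frac{a}{V^2}$, by solving for the molar volume with Newton's method. The CO2 constants are assumed textbook values; verify them yourself:

```python
def z_vdw(T, P, a, b, R=8.314):
    """Compressibility factor Z = PV/(RT) from the van der Waals EOS."""
    V = R * T / P                        # ideal-gas molar volume as initial guess
    for _ in range(50):                  # Newton's method on f(V) = P_vdw(V) - P
        f = R * T / (V - b) - a / V**2 - P
        df = -R * T / (V - b) ** 2 + 2 * a / V**3
        V -= f / df
    return P * V / (R * T)

# CO2: a ~ 0.3640 Pa m^6/mol^2, b ~ 4.267e-5 m^3/mol (assumed textbook values)
Z = z_vdw(T=350.0, P=1.0e5, a=0.3640, b=4.267e-5)
print(f"Z = {Z:.4f}")                    # slightly below 1 at these conditions
```

A sanity check like this catches the failure described above: at low pressure Z should be close to 1, and for CO2 at these conditions attractive forces should pull it slightly below 1.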
> **Exercise 8:** Compare the output of a general-purpose model (`llama3.2`) and a coding model (`qwen2.5-coder:7b`) on the same coding task. Which produces better code? Which gives a better explanation? Can you find errors in either output?
> **Exercise 9:** Ask a coding model to solve a problem where you already know the answer — a homework problem you've already completed, or a textbook example. Does the model get it right? Where does it go wrong? Try breaking the problem down into smaller steps.
### Customize ollama
Ollama can be customized by creating a Modelfile. See https://github.com/ollama/ollama/blob/main/docs/modelfile.md
A simple `Modelfile` is:
```
FROM llama3.2
# sets the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# sets a custom system message to specify the behavior of the chat assistant
SYSTEM You are Marvin from the Hitchhiker's Guide to the Galaxy, acting as an assistant.
```
Now we can create the custom model, in this case a model called `marvin`:
```bash
ollama create marvin -f ./Modelfile
```
```
gathering model components
...
writing manifest
success
```
We can run it with:
```bash
ollama run marvin
```
(How about C-3PO?) You can also change the model system message during a run with:
```
>>> /set system "You are C-3PO, a human-cyborg relations droid."
Set system message.
```
## 5. Concluding remarks
Running inference locally on a large language model works surprisingly well. Using (relatively) modest hardware, our machines generate coherent language and do a good job parsing prompts. The experience demonstrates that the majority of the computational effort with LLMs lies in training the model — a process that is rapidly becoming more sophisticated and tailored to different uses.
With local models (as well as cloud-based APIs), we can build new tools that make use of natural language processing. With `ollama` acting as a local server, the model can be run with Python, giving us the ability to implement its features in our own programs. For one Python library, see:
- https://github.com/ollama/ollama-python
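Besides the official `ollama-python` client, the local server exposes a REST API on `localhost:11434`. Below is a hedged sketch using only the standard library (the `/api/generate` endpoint and payload fields follow the ollama API docs; the helper function names are our own). Building the payload is shown separately so it can be inspected without a running server:

```python
import json
from urllib import request

def build_payload(model, prompt):
    """JSON body for ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model, prompt, host="http://localhost:11434"):
    """POST a prompt to a locally running ollama server and return its reply."""
    body = json.dumps(build_payload(model, prompt)).encode()
    req = request.Request(host + "/api/generate", data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

payload = build_payload("llama3.2", "Define fugacity in one sentence.")
print(json.dumps(payload))
```

With `ollama serve` running, `generate("llama3.2", "...")` returns the model's text, which you can then post-process in your own programs.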
In class, I demonstrated a simple thermodynamics assistant based on a Retrieval-Augmented Generation strategy. This code takes a query from the user, encodes it with an embedding model, compares it to previously embedded statements (in my case, the index of a thermodynamics book), and generates a response with a decoder-style GPT (one of the models we used above), using the retrieved passages as context.
## Additional resources and references
### Ollama
Binaries and help files:
- https://ollama.com
- https://github.com/ollama/ollama
Python and JavaScript libraries:
- https://github.com/ollama/ollama-python
- https://github.com/ollama/ollama-js
### llama.cpp
- https://github.com/ggml-org/llama.cpp
### Hugging Face
Model registry:
- https://huggingface.co/models
### Models used in this tutorial
| Model | Size | Type | Used for |
|-------|------|------|----------|
| `llama3:latest` | 4.7 GB | General purpose | Chat, comparison |
| `llama3.2:latest` | 2.0 GB | General purpose | Chat, summarization, comparison |
| `gemma3:1b` | 815 MB | General purpose | Chat, comparison |
| `command-r7b:latest` | 4.7 GB | RAG-optimized | Document summarization |
| `qwen2.5-coder:7b` | 4.7 GB | Code generation | Writing and explaining code |
Other models mentioned: `codellama:7b`, `deepseek-coder-v2:latest`, `starcoder2:7b`

Subject: [UDEL-ALL-2128] Hurricane Sandy
Date: 2012_11_02_164248
To the University of Delaware community:
We have much to be thankful for this week at the University of Delaware
as we were spared the full force of Hurricane Sandy. Even as we breathe
a sigh of relief and return to our normal activities, we are mindful of
the many, many people in this region -- some of our students among them
-- who were not so lucky. Our thoughts and prayers go out to them as
they rebuild their communities.
The potential impact of Sandy was a major concern for UD, with its
thousands of people and 430+ buildings on 2,000 acres throughout the
state. Many members of our University community worked hard over the
last several days to help us weather this "Storm of the Century."
Preparation and practice paid off as our emergency response team, led
by the Office of Campus and Public Safety, began assessing the
situation late last week and taking steps to ensure the safety of our
people and facilities. When the storm came, the campus suffered only
minor damage: wind-driven water getting into buildings through roofs,
walls and foundations; very minimal power loss, with a couple of
residential properties without power for only a few hours, thanks to
quick repair from the City of Newark; and only three trees knocked down
and destroyed, along with a lot of leaves and branches to clean up. The
Georgetown research facilities were fortunate to sustain only minor
leaks and flooding. The hardest hit area was the Lewes campus, which
had flooding on its grounds but minimal damage to buildings.
Throughout this time, the University's greatest asset continued to be
its people -- staff members from a variety of units working as a team.
A command center brought together representatives from across UD so
that issues could be responded to immediately. Staffed around the
clock, the center included Housing, Public Safety, Residence Life,
Environmental Health and Safety, Facilities and Auxiliary Services,
Emergency Management, and Communications and Marketing.
The dedication of UD's employees and students was evident everywhere:
Dining Services staff, faced with reduced numbers and limited
deliveries, kept students fed, and supported employees who worked
during the crisis; Residence Life staff and resident assistants made
sure students who remained on campus had up-to-date information and
supplies; staff in Student Health Services kept Laurel Hall open to
respond to student health needs; Human Resources staff worked over the
weekend to ensure that payroll was processed ahead of time; UD Police
officers were on patrol and responding to issues as they arose; the UD
Emergency Care Unit was at the ready; staff in Environmental Health and
Safety aided in the safe shutdown of UD laboratories and monitored fire
safety issues; Facilities staff continue to clean up debris left in
Sandy's wake and repair damage to buildings; faculty are working with
students to make up lost class time.
Our UD Alert system served as an excellent tool for keeping students,
parents and employees informed about the storm's implications for UD,
and the University's homepage was the repository for the most current
information and lists of events and activities that were canceled or
rescheduled. Through the University's accounts on Facebook and Twitter,
staff answered questions and addressed concerns, and faculty and staff
across the campus fielded phone calls and emails.
In short, a stellar job all around.
On behalf of the students, families and employees who benefited from
these efforts, I thank everyone for their dedication and service to the
people of UD.
Sincerely,
Patrick T. Harker
President
::::::::::::::::::::::::::::::::::::::::::: UD P.O. Box ::
UDEL-ALL-2128 mailing list
Online message archive
and management at https://po-box.nss.udel.edu/
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Subject: Employee Appreciation Week
Date: 2017_05_16_123456
To the University of Delaware Community - President Dennis Assanis
May 16, 2017
Dear colleague,
Our first year together has been one of amazing accomplishments and exciting opportunities. At the heart of our success has been you — the University of Delaware's exceptional faculty and staff. To thank you and celebrate everything you do, we are launching our first Employee Appreciation Week.
The full week of events includes:
Monday, June 5—UDidIt Picnic
Tuesday, June 6—Self-Care Day
Wednesday, June 7—UD Spirit Day
Thursday, June 8—Flavors of UD
Friday, June 9—Employee Appreciation Night at the Blue Rocks
The week is a collaborative effort by Employee Health & Wellbeing and Human Resources. You can get all the details here.
We are dedicated to cultivating together an environment where employees are happy, healthy and continue to bring their best selves to work each day. The work you do benefits our students, our community and the world. I am truly grateful for your talents, skills, ideas and enduring commitment to the University.
Eleni and I hope you enjoy Employee Appreciation Week with your team and your family, and we look forward to seeing you at the many events.
Best,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2792   •   www.udel.edu/president

Subject: Robin Morgan named UD's 11th provost
Date: 2018_05_21_110335
Robin Morgan Appointed Provost - University of Delaware
May 21, 2018
Dear UD Community,
I am pleased to announce that, after a highly competitive national search, I have appointed Robin Morgan as the University of Delaware's new provost, effective July 1. She will become the University of Delaware's 11th provost, and the first woman to serve in this role in a permanent capacity since the position was created at UD in 1950.
Over the last seven months, Dr. Morgan already has assembled an impressive record as interim provost, most notably in her stewardship of new cluster hires among our faculty and her leadership as we move toward the creation of the graduate college.
Before working closely with her, I knew Dr. Morgan as a highly respected educator and scholar, but after watching her in action, I am equally impressed with her abilities to lead, inspire and effect change. Her energy, integrity, analytical mind, and innate knack for bringing people together, combined with her dedication and loyalty to UD, are great assets.
Dr. Morgan has a distinguished record of service to this University as a faculty member since 1985. After serving as acting dean of the College of Agriculture and Natural Resources for a year, she was named dean in 2002, serving in that role for 10 years, a period of significant growth and change for the college. From 2014-16, she served as acting chair of the Department of Biological Sciences, and she had been chair of the department from 2016 until her appointment as interim provost.
We will continue to benefit from Dr. Morgan's deep knowledge of the University, her proven leadership across all aspects of teaching, research and administration, and her dedication to UD as she continues her career as provost.
I am looking forward to building on our close working relationship, and I am excited by all we will accomplish to take the University of Delaware forward. Please join me in congratulating her on this next chapter in her career.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   www.udel.edu/president

Subject: Momentum and Resilience: Our UD Spring Semester Resumes
Date: 2020_03_29_141635
A Message from President Dennis Assanis
Dear UD Community,
As the University of Delaware is ready to resume the spring semester tomorrow, March 30, I want to share with all of you a special message recorded from the office in my home. Thank you all for your support at this challenging time, particularly our faculty and staff for your Herculean efforts to convert our classes from face to face instruction to online teaching and learning.
Best of luck with the semester ahead. As we all work remotely, please stay healthy, and stay connected!
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   udel.edu/president

Subject: National Voter Registration Day: Get Involved
Date: 2023_09_19_085321
National Voter Registration Day: Get Involved
September 19, 2023
Dear UD Community,
Do you want to make a difference in the world? Today is a good day to start.
This is National Voter Registration Day, an opportunity to make sure your voice will be heard in upcoming local, state and national elections. Voting is the most fundamental way that we engage in our democracy, effect change in society, work through our political differences and choose our leaders for the future. The voting rights we enjoy have been secured through the hard work and sacrifice of previous generations, and it is essential that everyone who is eligible to vote remains committed to preserving and exercising those rights.
At the University of Delaware, the Student Voting and Civic Engagement Committee — representing students, faculty and staff — is leading a non-partisan effort to encourage voting and help voters become better informed about the issues that matter to them. The Make It Count voter registration drive is scheduled for 2-6 p.m. today on The Green, with games, music and the opportunity to register through the TurboVote app, which also allows users to request an absentee ballot and sign up for election reminders. The committee is planning additional events this academic year to promote voting, education and civil discourse as the nation heads into the 2024 election season.
Being a Blue Hen means sharing a commitment to creating a better world. And being a registered, engaged and informed voter is one of the best ways for all of us to achieve that vision.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu/president

Subject: Affirming our position and purpose
Date: 2023_10_12_155349
Affirming our position and purpose | A message from UD President Dennis Assanis
October 12, 2023
Dear UD Community,
Since my message yesterday, I have talked to many members of our community who — like me — are devastated and appalled by the terrorist attacks on Israel and the ongoing loss of life that has taken place in the Middle East.
I want to be sure that our position is very clear: We at the University of Delaware unequivocally condemn the horrific attacks by Hamas terrorists upon Israel that have shaken the world. The atrocities of crime, abduction, hostage-taking and mass murder targeted against Jewish civilians will forever remain a stain on human history. Our community's foundation of civility and respect has been challenged to an unimaginable extent in light of the antisemitic brutalities that have been committed against innocent victims.
As your president, I wish words could calm the heartache and ease the fear and grief. Unfortunately, we all know that events as complicated and devastating as those taking place in the Middle East right now will continue to evolve. The longstanding humanitarian crisis needs to be acknowledged, and we should not equate the terrorist group Hamas with innocent Palestinian, Muslim and Arab people. The ensuing war-inflicted pain, suffering and death that continues to play out across the region, including Gaza, is heartbreaking for all.
We must remember that, first and foremost, UD is a place of learning. As we engage in difficult conversations about the longstanding conflicts in the Middle East, we should always strive to do so safely, with mutual respect and without bias or judgement. I encourage our students, faculty and staff to continue organizing events to educate and unite our community. Please seize these opportunities not only as individuals, but as members of a true community defined by the freedoms that we treasure so very deeply.
So, my message to you all is to have hope, to support each other, and to realize that the perspectives and feelings we are all experiencing right now — many of which uniquely connect to our personal backgrounds — matter. Please remember this as you walk across campus, sit in your next classroom, share experiences with other members of our community, or simply take time to reflect.
Respectfully,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu/president

Subject: A warm welcome to our UD community!
Date: 2024_08_26_100859
A warm welcome to our UD community!
August 26, 2024
Dear UD Community,
I love the beginning of every new academic year and the renewed energy and sense of anticipation that it brings to every member of our campus community. The large influx of new people and ideas that come along with each new start is truly invigorating. Whether you are a new or continuing student, faculty or staff member, on behalf of everyone in our community, I want to extend a very warm welcome to you and thank you for everything you contribute, individually and collectively, to make the University of Delaware such a unique place.
Students, your fresh perspectives, your passion for learning, and your dreams and aspirations for the boundless possibilities that lie ahead are inspiring. Faculty, your intellectual energy, your insights and expertise, and above all, your genuine interest in transferring and sharing your knowledge with all of us are the beating heart of our institution. And to all our staff, your hard work and dedicated talents provide the essential support and services to help ensure our students are successful in all their personal, academic and career pursuits.
Here at UD, our shared purpose is to cultivate learning, develop knowledge and foster the free exchange of ideas. The connections we make and the relationships we build help advance the mission of the University. Our focus on academic excellence in all fields of study and our opportunities for groundbreaking research rely on our endless curiosity, mutual respect and open mindedness. Together, we are stronger.
This sense of connection and belonging at UD is fundamental to our campus culture. Your willingness to hear and consider all voices and viewpoints is critical to shaping the vibrant and inclusive culture of our entire institution. Only when we commit to constructive growth, based on a foundation of civility and respect for ourselves and each other, can we realize true progress.  Empowered by diverse perspectives, it is the opportunities to advance ideas that enrich learning and create positive impact in the world that unite all of us.
To celebrate the new semester and welcome our undergraduate Class of 2028, all members of our community are invited to attend the Twilight Induction ceremony tonight at 7:30 p.m. on the north side of Memorial Hall or online on Facebook Live.
As your President, I am so excited by all that we can accomplish together throughout this academic year. My wife, Eleni, and I wish you all the best at the start of this new semester and beyond. We look forward to meeting you on campus!
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu

Subject: UPDATE: Recent Executive Orders
Date: 2025_02_13_160414
UPDATE: Recent Executive Orders | University of Delaware
Feb. 13, 2025
Dear UD Community,
I know many of you continue to experience disruption and anxiety stemming from the recent federal actions and executive orders regarding a multitude of issues — from research funding to education, human rights, and immigration among other areas. As I communicated to the University of Delaware community in my Jan. 28 campus message and my Feb. 3 comments to the Faculty Senate, we will do everything we can to minimize disruption to UD students, faculty and staff while remaining in compliance with federal law.
To support our community, we have created this resource page that will be updated regularly with information for UD students, faculty and staff regarding ongoing federal actions, directives and developments, including guidance in response to changing conditions. Also, this page from the Research Office contains specific guidance related to research projects and grants. In parallel, we will continue to advocate on behalf of the University's interests regarding any impact that federal or state actions could have on our students, faculty and staff.
One example is our response this week related to the federal action to impose a 15% limit on reimbursements for indirect administrative costs (Facilities and Administrative, or F&A costs) for all National Institutes of Health (NIH) research grants. This immediate cut in funding would have a devastating impact on all biomedical, health and life science advances and human wellness, including here at UD. In response, the Delaware Attorney General filed a lawsuit jointly with 21 other state attorneys general. The University supported the Attorney General's lawsuit by submitting a declaration detailing the impact of the NIH rate cap on the institution. Fortunately, the attorneys general were successful, and a temporary restraining order was granted on Monday. Further, the Association of Public and Land-grant Universities, the Association of American Universities, and the American Council on Education announced a similar lawsuit.
As we navigate this rapidly evolving landscape together, our values will continue to be at the heart of our community. We will continue to foster an atmosphere that promotes the free exchange of ideas and opinions; we will continue to welcome and value people of different backgrounds, perspectives and learning experiences; and we will continue to encourage respect and civility toward everyone.
Please know that my leadership team and I are here to help and support our community during this time. Feel free to submit any questions pertaining to these matters here, and we will do our best to add relevant information on the resource pages. I deeply appreciate your resilience and patience as we continue to work together to advance the important mission of our University.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu

Subject: Extending condolences and offering support
Date: 2025_04_29_230614
Extending condolences and offering support
April 29, 2025
Dear UD Community,
It is with a heavy heart that we share this information with you. Earlier today, a University of Delaware student died in a traffic accident on Main Street near campus, and several other people, including other UD students, suffered injuries. There is no ongoing threat to the University community.
University of Delaware Police are continuing to work with the Newark Police Department, which is actively investigating the incident. As a result, information is limited and the Newark Police Department is not releasing the victims' names at this time, pending family notification.
This is a terrible tragedy for everyone in our UD community. We speak for the entire University in offering our condolences to the families, friends and classmates of the victims, and keep the other members of our community in our thoughts who may have witnessed the crash and its aftermath. The safety of our entire community remains our top priority, and we will continue to work with our partners in city and state government to address safety concerns around and on the UD campus. 
As we all begin to cope with this traumatic incident, we encourage you to support one another and reach out for additional help from the UD resources listed below as needed.
Sincerely,
Dennis Assanis
President
José-Luis Riera
Vice President for Student Life
Support and resources
Center for Counseling and Student Development
Counselors and Student Life staff are available in Warner Hall 101 on Wednesday, April 30, from 9 a.m. to 3 p.m. for counseling services.
TimelyCare — A virtual health and wellbeing platform available 24/7 for UD students
Student Advocacy and Support — Available to assist students who need support navigating University resources or complex issues. Call 302-831-8939 or email studentsupport@udel.edu to schedule an appointment.
ComPsych® GuidanceResources® — Mental health support for UD benefited employees. Access services through the link or call 877-527-4742 for support.
Additional safety and wellness resources — Information about UD Police, Student Health Services and other services.
Information about the UD Alert, the LiveSafe app and safety notification communication.
University of Delaware   •   Newark, DE   •   udel.edu

Subject: Sharing our grief, enhancing safety
Date: 2025_04_30_160615
Sharing our grief, enhancing safety
April 30, 2025
Dear UD Community,
Since last evening's crash on Main Street that took the life of a University of Delaware graduate student (whose identity is being withheld at this time) and injured several others, we have been struggling to cope with the pain of this senseless tragedy. Throughout the UD community, we are all feeling the deep ache of loss, and we will continue to work through our grief together.
Today, Newark Police announced an arrest in connection with the crash, reiterating that there is no ongoing threat to the community. 
Main Street is where we eat, shop and share our lives with our friends, families and classmates. Because it is part of the state's roadway systems, we have been working with local and state officials this year, including our partners at Delaware Department of Transportation, to address traffic safety on and around Main Street. In the wake of this tragedy, we will reinforce and accelerate those efforts. We recognize there isn't a simple solution, particularly when these tragedies involve actions taken by individuals that may not be stopped by changes to roadways or infrastructure. However, this incident underscores that our collective efforts must take on renewed urgency.
University leaders joined Delaware Attorney General Kathy Jennings and Newark Mayor Travis McDermott today for a press conference, at which we expressed our shared commitment to enhanced safety along Main Street. The University has pledged to continue these discussions through meetings with the offices of AG Jennings and Mayor McDermott, in addition to DelDOT, in the near future. The University remains committed to advancing meaningful solutions, while the Universitys Division of Student Life and Graduate College are connecting with students about effective advocacy, civic engagement and partnerships in order to support these efforts.
We are also aware that members of the UD community may have witnessed the crash and its aftermath or have close relationships with the victims. We encourage everyone to become familiar with and use, as needed, the available University counseling and support resources that were shared in Tuesday evening's message to the UD community. Counseling services are available at Warner Hall and through TimelyCare anytime, 24/7. Students with physical injuries or medical concerns relating to the incident can contact Student Health Services at 302-831-2226, Option 0, or visit Laurel Hall to meet with triage nurses available until 5 p.m. After hours, students can contact the Highmark Nurse line at 888-258-3428 or visit local urgent care centers (Newark Urgent Care at 324 E. Main Street, or ChristianaCare GoHealth at 550 S. College Avenue, Suite 115).
During this difficult time in our community, we all need to continue supporting and standing by one another as we move forward together.
Sincerely,
Dennis Assanis, President
Laura Carlson, Provost
José-Luis Riera, Vice President for Student Life
University of Delaware   •   Newark, DE   •   udel.edu

274
03-rag/README.md Normal file
View file

@ -0,0 +1,274 @@
# Large Language Models Part III: Retrieval-Augmented Generation
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
---
## Key idea
Build a local, privacy-preserving RAG system that answers questions about your own documents.
## Key goals
- Understand the RAG workflow: chunk, embed, store, retrieve, generate
- Build a vector store from a document collection
- Query the vector store and generate responses with a local LLM
- Experiment with parameters that affect retrieval quality
---
In Parts I and II, we trained a small GPT from scratch and then ran pre-trained models locally with `ollama`. We even used `ollama` on the command line to summarize documents. But what if we want to ask questions about a *specific* collection of documents — our own notes, emails, papers, or lab reports — rather than relying on what the model was trained on?
This is the idea behind **Retrieval-Augmented Generation (RAG)**. Instead of hoping the LLM "knows" the answer, we:
1. **Chunk** our documents into short text segments
2. **Embed** each chunk into a vector (a list of numbers that captures its meaning)
3. **Store** the vectors in a searchable index
4. At query time, **embed** the user's question the same way
5. **Retrieve** the most similar chunks using cosine similarity
6. **Generate** a response by passing those chunks to an LLM as context
The LLM never sees your full document collection — only the most relevant pieces. Everything runs locally. No data leaves your machine.
![RAG workflow](img/rag-workflow.png)
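The six steps above can be sketched end to end in plain Python. The bag-of-words "embeddings" and cosine function below are toy stand-ins for illustration only; the real pipeline uses the neural embedding model `BAAI/bge-large-en-v1.5` and an LLM, as described below.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (a real system uses a neural model)."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

# Steps 1-3: chunk the documents and store each chunk's vector
chunks = [
    "The campus suffered only minor storm damage.",
    "Counseling services are available at Warner Hall.",
    "The provost search concluded this spring.",
]
store = [(chunk, embed(chunk)) for chunk in chunks]

# Steps 4-5: embed the question the same way and retrieve the most similar chunk
query = "Where can students find counseling services?"
q_vec = embed(query)
best = max(store, key=lambda item: cosine(q_vec, item[1]))

# Step 6: in a real RAG system, `best` would be passed to the LLM as context
print(best[0])
```

Even this crude version retrieves the Warner Hall chunk for the counseling question; the neural embedding model does the same thing, but matches on meaning rather than shared words.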
## 1. Setup
### Prerequisites
You need:
- Python 3.10+
- `ollama` installed and working (from Part II)
- About 2–3 GB of disk space for models
### Create a virtual environment
```bash
python3 -m venv .venv
source .venv/bin/activate
```
Or with `uv`:
```bash
uv venv .venv
source .venv/bin/activate
```
### Install the required packages
```bash
pip install llama-index-core llama-index-readers-file \
llama-index-llms-ollama llama-index-embeddings-huggingface \
python-dateutil
```
The `llama-index-*` packages are components of the [LlamaIndex](https://docs.llamaindex.ai/en/stable/) framework, which provides the plumbing for building RAG systems. `python-dateutil` is used by `clean_eml.py` for parsing email dates.
A `requirements.txt` is provided:
```bash
pip install -r requirements.txt
```
### Pull the LLM
We will use the `command-r7b` model, which was fine-tuned for RAG tasks:
```bash
ollama pull command-r7b
```
Other models work too — `llama3.1:8B`, `deepseek-r1:8B`, `gemma3:1b` — but `command-r7b` tends to follow retrieval-augmented prompts well.
### Cache the embedding model
The embedding model converts text into vectors. We use `BAAI/bge-large-en-v1.5`, a sentence transformer hosted on Huggingface. It will download automatically on first use (~1.3 GB), but you can pre-cache it with a short Python script:
```python
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(
    cache_folder="./models",
    model_name="BAAI/bge-large-en-v1.5"
)
```
Save this as `cache_model.py` and run it:
```bash
python cache_model.py
```
## 2. The documents
The `data/` directory contains 10 emails from the University of Delaware president's office, spanning 2012–2025 (the same set from Part II). Each is a plain text file with a subject line, date, and body text.
```bash
ls data/
```
In a real project, you might have PDFs, lab reports, research papers, or notes. For this exercise, the emails give us a small, manageable collection to work with.
### Preparing your own documents
If you have email files (`.eml` format), the script `clean_eml.py` can convert them to plain text:
```bash
# Place .eml files in ./eml, then run:
python clean_eml.py
```
This extracts the subject, date, and body from each email and writes a dated `.txt` file to `./data`.
## 3. Building the vector store
The script `build.py` does the heavy lifting:
1. Loads all text files from `./data`
2. Splits them into **chunks** of 500 tokens with 50 tokens of overlap
3. Embeds each chunk using the `BAAI/bge-large-en-v1.5` model
4. Saves the vector store to `./storage`
```bash
python build.py
```
You should see progress bars as documents are parsed and embeddings are generated:
```
Parsing nodes: 100%|████| 10/10 [00:00<00:00, 79.53it/s]
Generating embeddings: 100%|████| 42/42 [00:05<00:00, 8.01it/s]
Index built and saved to ./storage
```
After this, the `./storage` directory contains JSON files with the vector data, document metadata, and index information. You only need to build once — queries will load from storage.
### What are chunks?
We can't embed an entire document as a single vector — it would lose too much detail. Instead, we split the text into overlapping segments. The **chunk size** (500 tokens) controls how much text each vector represents. The **overlap** (50 tokens) ensures that sentences at chunk boundaries aren't lost. The `SentenceSplitter` tries to break at sentence boundaries rather than mid-sentence.
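To make the overlap concrete, here is a deliberately simplified chunker. It splits on words rather than tokens and ignores sentence boundaries (unlike `SentenceSplitter`, which respects both), but it shows how the overlap carries text across chunk boundaries:

```python
def chunk_words(text, chunk_size=500, overlap=50):
    """Split text into overlapping word windows.

    A simplified stand-in for LlamaIndex's SentenceSplitter, which works on
    tokens and prefers sentence boundaries; here we split on words only.
    """
    words = text.split()
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

# A fake 1200-word document
doc = " ".join(f"word{i}" for i in range(1200))
chunks = chunk_words(doc, chunk_size=500, overlap=50)
print(len(chunks), "chunks")  # 3 chunks; each shares 50 words with the previous one
```

Shrinking `chunk_size` produces many small, precise vectors with little context each; growing it produces a few broad vectors that may bury the relevant sentence, which is exactly the tradeoff Exercise 1 asks about.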
> **Exercise 1:** Look at `build.py`. What would happen if you made the chunks much smaller (e.g., 100 tokens)? Much larger (e.g., 2000 tokens)? Think about the tradeoff between precision and context.
## 4. Querying the vector store
The script `query.py` loads the stored index, takes your question, and returns a response grounded in the documents:
```bash
python query.py
```
```
Enter a search topic or question (or 'exit'): Find documents about campus safety
```
Here's what happens behind the scenes:
1. Your query is embedded into a vector using the same embedding model
2. The 15 most similar chunks are retrieved (`similarity_top_k=15`)
3. Those chunks are passed to `command-r7b` via `ollama` as context
4. The LLM generates a response based *only* on the retrieved context
The custom prompt in `query.py` instructs the model to:
- Base its response only on the provided context
- Prioritize higher-ranked (more similar) snippets
- Reference specific files and passages
- Format the output as a theme summary plus a list of matching files
### Example output
```
Enter a search topic or question (or 'exit'): Find documents that highlight
the excellence of the university
1. **Summary Theme**
The dominant theme across these documents is the University of Delaware's
commitment to excellence, innovation, and community impact...
2. **Matching Files**
2024_08_26_100859.txt - Welcome message highlighting UD's mission...
2023_10_12_155349.txt - Affirming institutional purpose and values...
...
Source documents:
2024_08_26_100859.txt 0.6623
2023_10_12_155349.txt 0.6451
...
Elapsed time: 76.1 seconds
```
Notice the **similarity scores** — these are cosine similarities between the query vector and each chunk's vector. Higher is more relevant. Also note that the search is *semantic*: the query said "excellence" but the matching documents talk about "achievement," "mission," and "purpose." The embedding model understands meaning, not just keywords.
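A score like 0.6623 is just this calculation between two vectors. A minimal implementation, with tiny 3-dimensional vectors standing in for the 1024-dimensional ones the embedding model actually produces (the numbers are made up for illustration):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two dense vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

query_vec  = [0.9, 0.1, 0.2]
chunk_vec1 = [0.8, 0.2, 0.3]   # points in a similar direction -> high score
chunk_vec2 = [-0.1, 0.9, 0.1]  # points in a different direction -> low score

print(round(cosine_similarity(query_vec, chunk_vec1), 4))
print(round(cosine_similarity(query_vec, chunk_vec2), 4))
```

Because the score depends only on direction, not magnitude, a short chunk and a long chunk about the same topic can score similarly.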
> **Exercise 2:** Run the same query twice. Do you get exactly the same output? Why or why not?
## 5. Understanding the pieces
### The embedding model
The embedding model (`BAAI/bge-large-en-v1.5`) maps text to a 1024-dimensional vector. Two pieces of text with similar meaning will have vectors that point in similar directions (high cosine similarity), even if they use different words. This is what makes semantic search possible.
### The LLM
The LLM (`command-r7b` via `ollama`) is the *generator*. It reads the retrieved chunks and composes a coherent answer. Without the retrieval step, it would rely only on its training data — which knows nothing about your specific documents.
### The prompt
The default LlamaIndex prompt is simple:
```
Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer:
```
Our custom prompt in `query.py` is more detailed — it asks for structured output and tells the model to cite sources. You can inspect and modify the prompt to change the model's behavior.
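To see how retrieved chunks become LLM input, the template filling can be sketched as plain string formatting. This is only a sketch of what LlamaIndex does internally when the query engine calls the model; the chunk texts and query below are invented for illustration:

```python
# The default-style template from above, as a Python format string
TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

# Pretend these are the top-ranked chunks returned by the retriever
retrieved_chunks = [
    "file: 2024_08_26_100859.txt\nWelcome message highlighting UD's mission...",
    "file: 2023_10_12_155349.txt\nAffirming institutional purpose and values...",
]

# Join the chunks into the context and fill the template
prompt = TEMPLATE.format(
    context_str="\n\n".join(retrieved_chunks),
    query_str="Find documents that highlight the excellence of the university",
)
print(prompt)
```

Everything the LLM "knows" about your documents arrives through `context_str`, which is why the chunking and retrieval parameters matter so much.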
> **Exercise 3:** Modify the prompt in `query.py`. For example, ask the model to respond in the style of a news reporter, or to focus only on dates and events. How does the output change?
## 6. Exercises
> **Exercise 4:** Try different embedding models. Replace `BAAI/bge-large-en-v1.5` with `sentence-transformers/all-mpnet-base-v2` in both `build.py` and `query.py`. Rebuild the vector store and compare the results.
> **Exercise 5:** Change the chunk size and overlap in `build.py`. Try `chunk_size=200, chunk_overlap=25` and then `chunk_size=1000, chunk_overlap=100`. Rebuild and query. What differences do you notice?
> **Exercise 6:** Swap the LLM. Try `llama3.2` or `gemma3:1b` instead of `command-r7b`. Which gives better RAG responses? Why might some models be better at following retrieval-augmented prompts?
> **Exercise 7:** Bring your own documents. Find a collection of text files — research paper abstracts, class notes, or a downloaded text from Project Gutenberg — and build a RAG system over them. What questions can you answer that a plain LLM cannot?
## Additional resources and references
### LlamaIndex
- Documentation: https://docs.llamaindex.ai/en/stable/
### Models
- Ollama: https://ollama.com
- Huggingface models: https://huggingface.co/models
#### Models used in this tutorial
| Model | Type | Role | Source |
|-------|------|------|--------|
| `command-r7b` | LLM (RAG-optimized) | Response generation | `ollama pull command-r7b` |
| `BAAI/bge-large-en-v1.5` | Embedding (1024-dim) | Text -> vector encoding | Huggingface (auto-downloaded) |
Other LLMs mentioned: `llama3.1:8B`, `deepseek-r1:8B`, `gemma3:1b`, `llama3.2`
Other embedding model mentioned: `sentence-transformers/all-mpnet-base-v2`
### Further reading
- NIST IR 8579, [*Developing the NCCoE Chatbot: Technical and Security Learnings from the Initial Implementation*](https://csrc.nist.gov/pubs/ir/8579/ipd) ([PDF](https://nvlpubs.nist.gov/nistpubs/ir/2025/NIST.IR.8579.ipd.pdf)) — practical guidance on building a RAG-based chatbot, including architecture and security considerations
- Open WebUI (https://openwebui.com) — a turnkey local RAG interface if you want a GUI

49
03-rag/build.py Normal file
View file

@ -0,0 +1,49 @@
# build.py
#
# Import documents from data, generate embedded vector store
# and save to disk in directory ./storage
#
# August 2025
# E. M. Furst
from llama_index.core import (
SimpleDirectoryReader,
VectorStoreIndex,
Settings,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.node_parser import SentenceSplitter
def main():
    # Choose your embedding model
    embed_model = HuggingFaceEmbedding(cache_folder="./models",
                                       model_name="BAAI/bge-large-en-v1.5")

    # Configure global settings for LlamaIndex
    Settings.embed_model = embed_model

    # Load documents
    documents = SimpleDirectoryReader("./data").load_data()

    # Create the custom text splitter
    # Set chunk size and overlap (500 tokens, 50 tokens overlap)
    text_splitter = SentenceSplitter(
        chunk_size=500,
        chunk_overlap=50,
    )
    Settings.text_splitter = text_splitter

    # Build the index
    index = VectorStoreIndex.from_documents(
        documents, transformations=[text_splitter],
        show_progress=True,
    )

    # Persist both vector store and index metadata
    index.storage_context.persist(persist_dir="./storage")
    print("Index built and saved to ./storage")


if __name__ == "__main__":
    main()

12
03-rag/cache_model.py Normal file
View file

@ -0,0 +1,12 @@
# cache_model.py
#
# Pre-download the embedding model so build.py doesn't have to fetch it.
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
embed_model = HuggingFaceEmbedding(
    cache_folder="./models",
    model_name="BAAI/bge-large-en-v1.5"
)
print("Embedding model cached in ./models")

48
03-rag/clean_eml.py Normal file
View file

@ -0,0 +1,48 @@
# clean_eml.py
#
# Convert .eml files to plain text files for use with build.py.
# Place .eml files in ./eml, then run this script to produce
# dated .txt files in ./data.
#
# August 2025
# E. M. Furst
from email import policy
from email.parser import BytesParser
from pathlib import Path
from dateutil import parser
from dateutil import tz
eml_dir = "eml"
out_dir = "data"
for eml_file in Path(eml_dir).glob("*.eml"):
    with open(eml_file, "rb") as f:
        msg = BytesParser(policy=policy.default).parse(f)

    # Get metadata
    subject = msg.get("subject", "No Subject")
    date = msg.get("date", "No Date")

    # Convert date to a safe format for filenames: YYYY_MM_DD_hhmmss
    date = parser.parse(date)
    if date.tzinfo is None:
        date = date.replace(tzinfo=tz.tzlocal())
    date = date.astimezone(tz.tzlocal())
    msg_date = date.strftime("%d/%m/%Y, %H:%M:%S")
    date = date.strftime("%Y_%m_%d_%H%M%S")

    # Prefer plain text, fall back to HTML
    body_part = msg.get_body(preferencelist=('plain', 'html'))
    if body_part:
        body_content = body_part.get_content()
    else:
        body_content = msg.get_payload()

    # Combine into a clean string with labels and newlines
    text = f"Subject: {subject}\nDate: {date}\n\n{body_content}"

    with Path(f"{out_dir}/{date}.txt").open("w", encoding="utf-8") as out_file:
        out_file.write(text)
    print(f"{msg_date}")

View file

@ -0,0 +1,80 @@
Subject: [UDEL-ALL-2128] Hurricane Sandy
Date: 2012_11_02_164248
To the University of Delaware community:
We have much to be thankful for this week at the University of Delaware
as we were spared the full force of Hurricane Sandy. Even as we breathe
a sigh of relief and return to our normal activities, we are mindful of
the many, many people in this region -- some of our students among them
-- who were not so lucky. Our thoughts and prayers go out to them as
they rebuild their communities.
The potential impact of Sandy was a major concern for UD, with its
thousands of people and 430+ buildings on 2,000 acres throughout the
state. Many members of our University community worked hard over the
last several days to help us weather this "Storm of the Century."
Preparation and practice paid off as our emergency response team, led
by the Office of Campus and Public Safety, began assessing the
situation late last week and taking steps to ensure the safety of our
people and facilities. When the storm came, the campus suffered only
minor damage: wind-driven water getting into buildings through roofs,
walls and foundations; very minimal power loss, with a couple of
residential properties without power for only a few hours, thanks to
quick repair from the City of Newark; and only three trees knocked down
and destroyed, along with a lot of leaves and branches to clean up. The
Georgetown research facilities were fortunate to sustain only minor
leaks and flooding. The hardest hit area was the Lewes campus, which
had flooding on its grounds but minimal damage to buildings.
Throughout this time, the University's greatest asset continued to be
its people -- staff members from a variety of units working as a team.
A command center brought together representatives from across UD so
that issues could be responded to immediately. Staffed around the
clock, the center included Housing, Public Safety, Residence Life,
Environmental Health and Safety, Facilities and Auxiliary Services,
Emergency Management, and Communications and Marketing.
The dedication of UD's employees and students was evident everywhere:
Dining Services staff, faced with reduced numbers and limited
deliveries, kept students fed, and supported employees who worked
during the crisis; Residence Life staff and resident assistants made
sure students who remained on campus had up-to-date information and
supplies; staff in Student Health Services kept Laurel Hall open to
respond to student health needs; Human Resources staff worked over the
weekend to ensure that payroll was processed ahead of time; UD Police
officers were on patrol and responding to issues as they arose; the UD
Emergency Care Unit was at the ready; staff in Environmental Health and
Safety aided in the safe shutdown of UD laboratories and monitored fire
safety issues; Facilities staff continue to clean up debris left in
Sandy's wake and repair damage to buildings; faculty are working with
students to make up lost class time.
Our UD Alert system served as an excellent tool for keeping students,
parents and employees informed about the storm's implications for UD,
and the University's homepage was the repository for the most current
information and lists of events and activities that were canceled or
rescheduled. Through the University's accounts on Facebook and Twitter,
staff answered questions and addressed concerns, and faculty and staff
across the campus fielded phone calls and emails.
In short, a stellar job all around.
On behalf of the students, families and employees who benefited from
these efforts, I thank everyone for their dedication and service to the
people of UD.
Sincerely,
Patrick T. Harker
President
::::::::::::::::::::::::::::::::::::::::::: UD P.O. Box ::
UDEL-ALL-2128 mailing list
Online message archive
and management at https://po-box.nss.udel.edu/
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

View file

@ -0,0 +1,85 @@
Subject: Employee Appreciation Week
Date: 2017_05_16_123456
To the University of Delaware Community - President Dennis Assanis
May 16, 2017
Dear colleague,
Our first year together has been one of amazing accomplishments and exciting opportunities. At the heart of our success has been you — the University of Delaware's exceptional faculty and staff. To thank you and celebrate everything you do, we are launching our first Employee Appreciation Week.
The full week of events includes:
Monday, June 5—UDidIt Picnic
Tuesday, June 6—Self-Care Day
Wednesday, June 7—UD Spirit Day
Thursday, June 8—Flavors of UD
Friday, June 9—Employee Appreciation Night at the Blue Rocks
The week is a collaborative effort by Employee Health & Wellbeing and Human Resources. You can get all the details here.
We are dedicated to cultivating together an environment where employees are happy, healthy and continue to bring their best selves to work each day. The work you do benefits our students, our community and the world. I am truly grateful for your talents, skills, ideas and enduring commitment to the University.
Eleni and I hope you enjoy Employee Appreciation Week with your team and your family, and we look forward to seeing you at the many events.
Best,
Dennis Assanis, President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2792   •   www.udel.edu/president

View file

@ -0,0 +1,79 @@
Subject: Robin Morgan named UD's 11th provost
Date: 2018_05_21_110335
Robin Morgan Appointed Provost - University of Delaware
May 21, 2018
Dear UD Community,
I am pleased to announce that, after a highly competitive national search, I have appointed Robin Morgan as the University of Delaware's new provost, effective July 1. She will become the University of Delaware's 11th provost, and the first woman to serve in this role in a permanent capacity since the position was created at UD in 1950.
Over the last seven months, Dr. Morgan already has assembled an impressive record as interim provost, most notably in her stewardship of new cluster hires among our faculty and her leadership as we move toward the creation of the graduate college.
Before working closely with her, I knew Dr. Morgan as a highly respected educator and scholar, but after watching her in action, I am equally impressed with her abilities to lead, inspire and effect change. Her energy, integrity, analytical mind, and innate knack for bringing people together, combined with her dedication and loyalty to UD, are great assets.
Dr. Morgan has a distinguished record of service to this University as a faculty member since 1985. After serving as acting dean of the College of Agriculture and Natural Resources for a year, she was named dean in 2002, serving in that role for 10 years, a period of significant growth and change for the college. From 2014-16, she served as acting chair of the Department of Biological Sciences, and she had been chair of the department from 2016 until her appointment as interim provost.
We will continue to benefit from Dr. Morgan's deep knowledge of the University, her proven leadership across all aspects of teaching, research and administration, and her dedication to UD as she continues her career as provost.
I am looking forward to building on our close working relationship, and I am excited by all we will accomplish to take the University of Delaware forward. Please join me in congratulating her on this next chapter in her career.
Sincerely,
Dennis Assanis, President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   www.udel.edu/president

View file

@ -0,0 +1,77 @@
Subject: Momentum and Resilience: Our UD Spring Semester Resumes
Date: 2020_03_29_141635
A Message from President Dennis Assanis
Dear UD Community,
As the University of Delaware is ready to resume the spring semester tomorrow, March 30, I want to share with all of you a special message recorded from the office in my home. Thank you all for your support at this challenging time, particularly our faculty and staff for your Herculean efforts to convert our classes from face to face instruction to online teaching and learning.
Best of luck with the semester ahead. As we all work remotely, please stay healthy, and stay connected!
Sincerely,
Dennis Assanis, President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   udel.edu/president

View file

@ -0,0 +1,75 @@
Subject: National Voter Registration Day: Get Involved
Date: 2023_09_19_085321
National Voter Registration Day: Get Involved
September 19, 2023
Dear UD Community,
Do you want to make a difference in the world? Today is a good day to start.
This is National Voter Registration Day, an opportunity to make sure your voice will be heard in upcoming local, state and national elections. Voting is the most fundamental way that we engage in our democracy, effect change in society, work through our political differences and choose our leaders for the future. The voting rights we enjoy have been secured through the hard work and sacrifice of previous generations, and it is essential that everyone who is eligible to vote remains committed to preserving and exercising those rights.
At the University of Delaware, the Student Voting and Civic Engagement Committee — representing students, faculty and staff — is leading a non-partisan effort to encourage voting and help voters become better informed about the issues that matter to them. The Make It Count voter registration drive is scheduled for 2-6 p.m. today on The Green, with games, music and the opportunity to register through the TurboVote app, which also allows users to request an absentee ballot and sign up for election reminders. The committee is planning additional events this academic year to promote voting, education and civil discourse as the nation heads into the 2024 election season.
Being a Blue Hen means sharing a commitment to creating a better world. And being a registered, engaged and informed voter is one of the best ways for all of us to achieve that vision.
Sincerely,
Dennis Assanis, President
University of Delaware   •   Newark, DE   •   udel.edu/president

View file

@ -0,0 +1,77 @@
Subject: Affirming our position and purpose
Date: 2023_10_12_155349
Affirming our position and purpose | A message from UD President Dennis Assanis
October 12, 2023
Dear UD Community,
Since my message yesterday, I have talked to many members of our community who — like me — are devastated and appalled by the terrorist attacks on Israel and the ongoing loss of life that has taken place in the Middle East.
I want to be sure that our position is very clear: We at the University of Delaware unequivocally condemn the horrific attacks by Hamas terrorists upon Israel that have shaken the world. The atrocities of crime, abduction, hostage-taking and mass murder targeted against Jewish civilians will forever remain a stain on human history. Our community's foundation of civility and respect has been challenged to an unimaginable extent in light of the antisemitic brutalities that have been committed against innocent victims.
As your president, I wish words could calm the heartache and ease the fear and grief. Unfortunately, we all know that events as complicated and devastating as those taking place in the Middle East right now will continue to evolve. The longstanding humanitarian crisis needs to be acknowledged, and we should not equate the terrorist group Hamas with innocent Palestinian, Muslim and Arab people. The ensuing war-inflicted pain, suffering and death that continues to play out across the region, including Gaza, is heartbreaking for all.
We must remember that, first and foremost, UD is a place of learning. As we engage in difficult conversations about the longstanding conflicts in the Middle East, we should always strive to do so safely, with mutual respect and without bias or judgement. I encourage our students, faculty and staff to continue organizing events to educate and unite our community. Please seize these opportunities not only as individuals, but as members of a true community defined by the freedoms that we treasure so very deeply.
So, my message to you all is to have hope, to support each other, and to realize that the perspectives and feelings we are all experiencing right now — many of which uniquely connect to our personal backgrounds — matter. Please remember this as you walk across campus, sit in your next classroom, share experiences with other members of our community, or simply take time to reflect.
Respectfully,
Dennis Assanis, President
University of Delaware   •   Newark, DE   •   udel.edu/president

View file

@ -0,0 +1,82 @@
Subject: A warm welcome to our UD community!
Date: 2024_08_26_100859
A warm welcome to our UD community!
August 26, 2024
Dear UD Community,
I love the beginning of every new academic year and the renewed energy and sense of anticipation that it brings to every member of our campus community. The large influx of new people and ideas that come along with each new start is truly invigorating. Whether you are a new or continuing student, faculty or staff member, on behalf of everyone in our community, I want to extend a very warm welcome to you and thank you for everything you contribute, individually and collectively, to make the University of Delaware such a unique place.
Students, your fresh perspectives, your passion for learning, and your dreams and aspirations for the boundless possibilities that lie ahead are inspiring. Faculty, your intellectual energy, your insights and expertise, and above all, your genuine interest in transferring and sharing your knowledge with all of us are the beating heart of our institution. And to all our staff, your hard work and dedicated talents provide the essential support and services to help ensure our students are successful in all their personal, academic and career pursuits.
Here at UD, our shared purpose is to cultivate learning, develop knowledge and foster the free exchange of ideas. The connections we make and the relationships we build help advance the mission of the University. Our focus on academic excellence in all fields of study and our opportunities for groundbreaking research rely on our endless curiosity, mutual respect and open mindedness. Together, we are stronger.
This sense of connection and belonging at UD is fundamental to our campus culture. Your willingness to hear and consider all voices and viewpoints is critical to shaping the vibrant and inclusive culture of our entire institution. Only when we commit to constructive growth, based on a foundation of civility and respect for ourselves and each other, can we realize true progress.  Empowered by diverse perspectives, it is the opportunities to advance ideas that enrich learning and create positive impact in the world that unite all of us.
To celebrate the new semester and welcome our undergraduate Class of 2028, all members of our community are invited to attend the Twilight Induction ceremony tonight at 7:30 p.m. on the north side of Memorial Hall or online on Facebook Live.
As your President, I am so excited by all that we can accomplish together throughout this academic year. My wife, Eleni, and I wish you all the best at the start of this new semester and beyond. We look forward to meeting you on campus!
Sincerely,
Dennis Assanis, President
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,80 @@
Subject: UPDATE: Recent Executive Orders
Date: 2025_02_13_160414
UPDATE: Recent Executive Orders | University of Delaware
Feb. 13, 2025
Dear UD Community,
I know many of you continue to experience disruption and anxiety stemming from the recent federal actions and executive orders regarding a multitude of issues — from research funding to education, human rights, and immigration among other areas. As I communicated to the University of Delaware community in my Jan. 28 campus message and my Feb. 3 comments to the Faculty Senate, we will do everything we can to minimize disruption to UD students, faculty and staff while remaining in compliance with federal law.
To support our community, we have created this resource page that will be updated regularly with information for UD students, faculty and staff regarding ongoing federal actions, directives and developments, including guidance in response to changing conditions. Also, this page from the Research Office contains specific guidance related to research projects and grants. In parallel, we will continue to advocate on behalf of the University's interests regarding any impact that federal or state actions could have on our students, faculty and staff.
One example is our response this week related to the federal action to impose a 15% limit on reimbursements for indirect administrative costs (Facilities and Administrative, or F&A costs) for all National Institutes of Health (NIH) research grants. This immediate cut in funding would have a devastating impact on all biomedical, health and life science advances and human wellness, including here at UD. In response, the Delaware Attorney General filed a lawsuit jointly with 21 other state attorneys general. The University supported the Attorney General's lawsuit by submitting a declaration detailing the impact of the NIH rate cap on the institution. Fortunately, the attorneys general were successful, and a temporary restraining order was granted on Monday. Further, the Association of Public and Land-grant Universities, the Association of American Universities, and the American Council on Education announced a similar lawsuit.
As we navigate this rapidly evolving landscape together, our values will continue to be at the heart of our community. We will continue to foster an atmosphere that promotes the free exchange of ideas and opinions; we will continue to welcome and value people of different backgrounds, perspectives and learning experiences; and we will continue to encourage respect and civility toward everyone.
Please know that my leadership team and I are here to help and support our community during this time. Feel free to submit any questions pertaining to these matters here, and we will do our best to add relevant information on the resource pages. I deeply appreciate your resilience and patience as we continue to work together to advance the important mission of our University.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,87 @@
Subject: Extending condolences and offering support
Date: 2025_04_29_230614
Extending condolences and offering support
April 29, 2025
Dear UD Community,
It is with a heavy heart that we share this information with you. Earlier today, a University of Delaware student died in a traffic accident on Main Street near campus, and several other people, including other UD students, suffered injuries. There is no ongoing threat to the University community.
University of Delaware Police are continuing to work with the Newark Police Department, which is actively investigating the incident. As a result, information is limited and the Newark Police Department is not releasing the victims' names at this time, pending family notification.
This is a terrible tragedy for everyone in our UD community. We speak for the entire University in offering our condolences to the families, friends and classmates of the victims, and we keep in our thoughts the other members of our community who may have witnessed the crash and its aftermath. The safety of our entire community remains our top priority, and we will continue to work with our partners in city and state government to address safety concerns around and on the UD campus.
As we all begin to cope with this traumatic incident, we encourage you to support one another and reach out for additional help from the UD resources listed below as needed.
Sincerely,
Dennis Assanis
President
José-Luis Riera
Vice President for Student Life
Support and resources
Center for Counseling and Student Development
Counselors and Student Life staff are available in Warner Hall 101 on Wednesday, April 30, from 9 a.m. to 3 p.m. for counseling services.
TimelyCare — A virtual health and wellbeing platform available 24/7 for UD students
Student Advocacy and Support — Available to assist students who need support navigating University resources or complex issues. Call 302-831-8939 or email studentsupport@udel.edu to schedule an appointment.
ComPsych® GuidanceResources® — Mental health support for UD benefited employees. Access services through the link or call 877-527-4742 for support.
Additional safety and wellness resources — Information about UD Police, Student Health Services and other services.
Information about the UD Alert, the LiveSafe app and safety notification communication.
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,76 @@
Subject: Sharing our grief, enhancing safety
Date: 2025_04_30_160615
Sharing our grief, enhancing safety
April 30, 2025
Dear UD Community,
Since last evening's crash on Main Street that took the life of a University of Delaware graduate student (whose identity is being withheld at this time) and injured several others, we have been struggling to cope with the pain of this senseless tragedy. Throughout the UD community, we are all feeling the deep ache of loss, and we will continue to work through our grief together.
Today, Newark Police announced an arrest in connection with the crash, reiterating that there is no ongoing threat to the community. 
Main Street is where we eat, shop and share our lives with our friends, families and classmates. Because it is part of the state's roadway system, we have been working with local and state officials this year, including our partners at Delaware Department of Transportation, to address traffic safety on and around Main Street. In the wake of this tragedy, we will reinforce and accelerate those efforts. We recognize there isn't a simple solution, particularly when these tragedies involve actions taken by individuals that may not be stopped by changes to roadways or infrastructure. However, this incident underscores that our collective efforts must take on renewed urgency.
University leaders joined Delaware Attorney General Kathy Jennings and Newark Mayor Travis McDermott today for a press conference, at which we expressed our shared commitment to enhanced safety along Main Street. The University has pledged to continue these discussions through meetings with the offices of AG Jennings and Mayor McDermott, in addition to DelDOT, in the near future. The University remains committed to advancing meaningful solutions, while the University's Division of Student Life and Graduate College are connecting with students about effective advocacy, civic engagement and partnerships in order to support these efforts.
We are also aware that members of the UD community may have witnessed the crash and its aftermath or have close relationships with the victims. We encourage everyone to become familiar with and use, as needed, the available University counseling and support resources that were shared in Tuesday evening's message to the UD community. Counseling services are available at Warner Hall and through TimelyCare anytime, 24/7. Students with physical injuries or medical concerns relating to the incident can contact Student Health Services at 302-831-2226, Option 0, or visit Laurel Hall to meet with triage nurses available until 5 p.m. After hours, students can contact the Highmark Nurse line at 888-258-3428 or visit local urgent care centers (Newark Urgent Care at 324 E. Main Street, or ChristianaCare GoHealth at 550 S. College Avenue, Suite 115).
During this difficult time in our community, we all need to continue supporting and standing by one another as we move forward together.
Sincerely,
Dennis Assanis
President
Laura Carlson
Provost
José-Luis Riera
Vice President for Student Life
University of Delaware   •   Newark, DE   •   udel.edu

BIN
03-rag/img/rag-workflow.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 148 KiB

110
03-rag/query.py Normal file
View file

@ -0,0 +1,110 @@
# query.py
#
# Run a query on a vector store
#
# August 2025
# E. M. Furst
from llama_index.core import (
load_index_from_storage,
StorageContext,
Settings,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.core.prompts import PromptTemplate
import os, time
#
# Globals
#
os.environ["TOKENIZERS_PARALLELISM"] = "false"
# Embedding model used in vector store (this should match the one in build.py)
embed_model = HuggingFaceEmbedding(cache_folder="./models",
model_name="BAAI/bge-large-en-v1.5")
# LLM model to use in query transform and generation
llm = "command-r7b"
#
# Custom prompt for the query engine
#
PROMPT = PromptTemplate(
"""You are an expert research assistant. You are given top-ranked writing \
excerpts (CONTEXT) and a user's QUERY.
Instructions:
- Base your response *only* on the CONTEXT.
- The snippets are ordered from most to least relevant; prioritize insights \
from earlier (higher-ranked) snippets.
- Aim to reference *as many distinct* relevant files as possible (up to 10).
- Do not invent or generalize; refer to specific passages or facts only.
- If a passage only loosely matches, deprioritize it.
Format your answer in two parts:
1. **Summary Theme**
Summarize the dominant theme from the relevant context in a few sentences.
2. **Matching Files**
Make a list of 10 matching files. The format for each should be:
<filename> - <rationale tied to content. Include date if available.>
CONTEXT:
{context_str}
QUERY:
{query_str}
Now provide the theme and list of matching files."""
)
#
# Main program routine
#
def main():
    # Use a local model to generate -- in this case using Ollama
    Settings.llm = Ollama(
        model=llm,
        request_timeout=360.0,
    )
    # Load embedding model (same as used for vector store)
    Settings.embed_model = embed_model
    # Load persisted vector store + metadata
    storage_context = StorageContext.from_defaults(persist_dir="./storage")
    index = load_index_from_storage(storage_context)
    # Build regular query engine with custom prompt
    query_engine = index.as_query_engine(
        similarity_top_k=15,
        text_qa_template=PROMPT,
    )
    # Query
    while True:
        q = input("\nEnter a search topic or question (or 'exit'): ").strip()
        if q.lower() in ("exit", "quit"):
            break
        print()
        # Generate the response by querying the engine
        start_time = time.time()
        response = query_engine.query(q)
        end_time = time.time()
        # Return the query response and source documents
        print(response.response)
        print("\nSource documents:")
        for node in response.source_nodes:
            meta = getattr(node, "metadata", None) or node.node.metadata
            print(f"  {meta.get('file_name')}  {getattr(node, 'score', None)}")
        print(f"\nElapsed time: {(end_time-start_time):.1f} seconds")

if __name__ == "__main__":
    main()

5
03-rag/requirements.txt Normal file
View file

@ -0,0 +1,5 @@
llama-index-core
llama-index-readers-file
llama-index-llms-ollama
llama-index-embeddings-huggingface
python-dateutil

View file

@ -0,0 +1,276 @@
# Large Language Models Part IV: Advanced Retrieval and Semantic Search
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
---
## Key idea
Build a more effective search system by combining multiple retrieval strategies and re-ranking results.
## Key goals
- Understand why simple vector search sometimes misses relevant results
- Combine vector similarity with keyword matching (hybrid retrieval)
- Use a cross-encoder to re-rank candidates
- Compare LLM-synthesized answers with raw chunk retrieval
---
> This is an advanced topic that builds on Part III (RAG). Make sure you are comfortable with building a vector store and querying it before proceeding.
In Part III, we built a RAG system that embedded documents, retrieved the most similar chunks, and passed them to an LLM. That pipeline works well for many queries — but it has blind spots.
Consider searching for a specific person's name, a date, or a technical term. Vector embeddings capture *meaning*, not exact strings. A query for "Dr. Rodriguez" might retrieve chunks about "faculty" or "professors" instead of chunks that literally contain the name. Similarly, a query about "October 2020" might return chunks about autumn events in general.
This tutorial introduces three improvements:
1. **Hybrid retrieval** — combine vector similarity (good at meaning) with BM25 keyword matching (good at exact terms)
2. **Cross-encoder re-ranking** — use a second model to score each (query, chunk) pair more carefully
3. **Raw retrieval mode** — inspect what the pipeline retrieves *before* the LLM sees it
The result is a more effective search system that catches both semantic matches and exact-term matches.
## 1. How hybrid retrieval works
In Part III, our pipeline was:
```
Query → Embed → Vector similarity (top 15) → LLM → Response
```
The improved pipeline is:
```
Query → Embed ──→ Vector similarity (top 20) ──┐
├─→ Merge & deduplicate → Cross-encoder re-rank (top 15) → LLM → Response
Query → Tokenize → BM25 term matching (top 20) ┘
```
### Vector retrieval (dense)
This is what we used in Part III. The query is embedded into a vector, and the most similar chunk vectors are returned. This catches *semantic* matches — chunks with similar meaning, even if the words are different.
### BM25 retrieval (sparse)
BM25 is a classical information retrieval algorithm based on term frequency. It scores documents by how often the query's words appear, adjusted for document length. It's fast, requires no embeddings, and excels at finding exact names, dates, and technical terms that embeddings might miss.
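The scoring idea can be sketched in a few lines of plain Python. This is a toy illustration of the Okapi BM25 formula, not the implementation the `llama-index-retrievers-bm25` package uses; the example documents and query are made up:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency: how many documents contain each query term
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            # Term frequency saturates (k1) and is normalized by length (b)
            num = tf[t] * (k1 + 1)
            den = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            s += idf * num / den
        scores.append(s)
    return scores

docs = [
    "dr rodriguez spoke at the ceremony".split(),
    "faculty gathered for the opening ceremony".split(),
    "the budget report was released".split(),
]
scores = bm25_scores(["rodriguez"], docs)
# Only the first document contains the exact term, so only it scores > 0
```

Notice that an embedding model might rank the "faculty" document highly here; BM25 gives it a score of zero because the literal term never appears.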
### Why combine them?
Neither retriever is perfect alone:
| Query type | Vector | BM25 |
|------------|--------|------|
| "documents about campus safety" | Good — captures meaning | Decent — matches "safety" |
| "Dr. Rodriguez" | Weak — embeds as "person" concept | Strong — matches exact name |
| "feelings of joy and accomplishment" | Strong — semantic match | Weak — might miss synonyms like "pride" |
| "October 2020 announcement" | Moderate | Strong — matches exact date |
By retrieving candidates from *both* and merging them, we get a broader candidate pool that covers both semantic and lexical matches.
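A minimal merge-and-deduplicate step might look like the sketch below. The chunk ids and scores are invented for illustration, and the real scripts track LlamaIndex node objects rather than bare tuples:

```python
def merge_candidates(vector_hits, bm25_hits):
    """Merge two ranked lists of (chunk_id, score) hits, deduplicating by id.

    Scores from the two retrievers are not comparable (cosine similarity
    vs. BM25), so we only record which retriever(s) nominated each chunk;
    the re-ranker assigns the scores that actually matter.
    """
    merged = {}
    for source, hits in (("vector", vector_hits), ("bm25", bm25_hits)):
        for chunk_id, score in hits:
            if chunk_id in merged:
                merged[chunk_id]["sources"].append(source)
            else:
                merged[chunk_id] = {"score": score, "sources": [source]}
    return merged

vec = [("c1", 0.91), ("c2", 0.85)]   # hypothetical vector results
bm = [("c2", 7.3), ("c3", 5.1)]      # hypothetical BM25 results
pool = merge_candidates(vec, bm)
# c2 was found by both retrievers: overlap of 1, merged pool of 3
```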
### Cross-encoder re-ranking
The merged candidates might number 30-40 chunks. We don't want to send all of them to the LLM — that wastes context and dilutes quality. A **cross-encoder** solves this by scoring each (query, chunk) pair directly.
Unlike the bi-encoder embedding model (which encodes query and chunk separately), a cross-encoder reads the query and chunk *together* and produces a relevance score. This is more accurate but slower — which is why we use it as a second stage on a small candidate set, not on the entire corpus.
We use `cross-encoder/ms-marco-MiniLM-L-12-v2` to re-rank the merged candidates down to the top 15 before passing them to the LLM.
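The re-rank stage reduces to "score every pair, sort, truncate." Here is a sketch with the scorer left pluggable: with `sentence-transformers` installed, `CrossEncoder("cross-encoder/ms-marco-MiniLM-L-12-v2").predict` could serve as `score_fn`. The toy scorer below is only a stand-in so the sketch runs without downloading the model:

```python
def rerank(query, chunks, score_fn, top_n=15):
    """Re-rank candidate chunks with a cross-encoder-style scorer.

    score_fn takes a list of (query, chunk) pairs and returns one
    relevance score per pair, reading each pair jointly.
    """
    scores = score_fn([(query, c) for c in chunks])
    ranked = sorted(zip(chunks, scores), key=lambda p: p[1], reverse=True)
    return ranked[:top_n]

def toy_scorer(pairs):
    """Stand-in scorer: counts words shared by query and chunk."""
    return [len(set(q.split()) & set(c.split())) for q, c in pairs]

top = rerank("campus safety update",
             ["safety on campus", "budget news"],
             toy_scorer, top_n=1)
# → [("safety on campus", 2)]
```

The structure is the point: the expensive pairwise model only ever sees the 30-40 merged candidates, never the full corpus.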
## 2. Setup
### Prerequisites
Everything from Part III, plus a few additional packages:
```bash
pip install llama-index-retrievers-bm25 nltk
```
A `requirements.txt` is provided with the full set of dependencies:
```bash
pip install -r requirements.txt
```
The cross-encoder model (`cross-encoder/ms-marco-MiniLM-L-12-v2`) will download automatically on first use via `sentence-transformers`. It is small (~130 MB).
Make sure `ollama` is running and `command-r7b` is available:
```bash
ollama pull command-r7b
```
## 3. Building the vector store
The `build_store.py` script works like the one in Part III, with a few differences:
- **Smaller chunks**: 256 tokens (vs. 500 in Part III) with 25 tokens of overlap
- **Incremental updates**: by default, it only re-indexes new or modified files
- **Full rebuild**: use `--rebuild` to start from scratch
```bash
python build_store.py --rebuild
```
Or for incremental updates after adding new files:
```bash
python build_store.py
```
```
Mode: incremental update
Loading existing index from ./store...
Index contains 42 documents
Data directory contains 44 files
New: 2
Modified: 0
Deleted: 0
Unchanged: 42
Indexing 2 file(s)...
Index updated and saved to ./store
```
### Why smaller chunks?
In Part III we used 500-token chunks. Here we use 256. Smaller chunks are more precise — each one represents a more focused piece of text. With a re-ranker to sort them, precision matters more than capturing broad context in a single chunk. The tradeoff: you get more chunks to search through, and each one has less surrounding context.
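The tradeoff can be estimated with a quick calculation. This is only a sketch: token counts are approximate, and real splitters like `SentenceSplitter` also respect sentence and paragraph boundaries, so actual counts will differ:

```python
def estimate_chunks(n_tokens, chunk_size, overlap):
    """Approximate chunk count for a document of n_tokens, assuming
    fixed-size windows that advance by (chunk_size - overlap) tokens."""
    step = chunk_size - overlap
    return max(1, -(-(n_tokens - overlap) // step))  # ceiling division

# A ~1000-token document: this tutorial's settings vs. a larger chunk size
small = estimate_chunks(1000, 256, 25)  # more, finer-grained chunks
large = estimate_chunks(1000, 512, 25)  # fewer, broader chunks
```

Halving the chunk size roughly doubles the number of chunks, which means more candidates for the retrievers to sift and more work for the re-ranker.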
> **Exercise 1:** Rebuild the store with different chunk sizes (128, 256, 512, 1024). How does the number of chunks change? How does it affect retrieval quality?
## 4. Querying with hybrid retrieval
The `query_hybrid.py` script implements the full hybrid pipeline:
```bash
python query_hybrid.py "Find documents about campus safety"
```
The output shows retrieval statistics before the LLM response:
```
Query: Find documents about campus safety
Vector: 20, BM25: 20, overlap: 8, merged: 32, re-ranked to: 15
Response:
...
```
This tells you:
- 20 candidates came from vector similarity
- 20 came from BM25
- 8 were found by both (overlap)
- 32 unique candidates after merging
- Re-ranked down to 15 for the LLM
> **Exercise 2:** Run the same query using Part III's `query.py` (pure vector retrieval) and this tutorial's `query_hybrid.py`. Compare the source documents listed. Did hybrid retrieval find anything that pure vector missed?
## 5. Raw retrieval without an LLM
Sometimes you want to see *exactly* what the retrieval pipeline found, without the LLM summarizing or rephrasing. The `retrieve.py` script runs the same hybrid retrieval and re-ranking, but outputs the raw chunk text instead of passing it to an LLM:
```bash
python retrieve.py "Dr. Rodriguez"
```
```
Query: Dr. Rodriguez
Vector: 20, BM25: 20, overlap: 3, merged: 37, re-ranked to: 15
vector-only: 17, bm25-only: 17, both: 3
================================================================================
=== [1] 2024_08_26_100859.txt (score: 0.847) [bm25-only]
================================================================================
Dr. Rodriguez spoke at the opening ceremony, emphasizing the
university's commitment to inclusive excellence...
================================================================================
=== [2] 2023_10_12_155349.txt (score: 0.712) [vector+bm25]
================================================================================
...
```
Each chunk is annotated with its source: `vector-only`, `bm25-only`, or `vector+bm25`. This lets you see which retriever nominated each result.
This is invaluable for debugging. If your LLM response seems off, check the raw retrieval first — the problem is often in *what* was retrieved, not how the LLM synthesized it.
> **Exercise 3:** Run `retrieve.py` with a query that includes a specific name or date. How many of the top results are `bm25-only`? What would have been missed with pure vector retrieval?
## 6. Keyword search
For a complementary approach, `search_keywords.py` does pure keyword matching with no embeddings at all. It uses NLTK part-of-speech tagging to extract meaningful terms from your query, then searches the raw text files with regex:
```bash
python search_keywords.py "Hurricane Sandy recovery efforts"
```
```
Query: Hurricane Sandy recovery efforts
Extracted terms: hurricane sandy, recovery, efforts
Found 12 matches across 3 files
============================================================
--- 2012_11_02_164248.txt (5 matches) ---
============================================================
>>> 12: Hurricane Sandy has caused significant damage to our campus...
...
```
This is a fallback when you know exactly what you're looking for and don't need semantic matching. It's also fast — no models, no vector store needed.
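The two stages (term extraction, then pattern matching over raw text) can be sketched without NLTK. This simplified stand-in filters stopwords instead of POS-tagging, and the filenames and text are made up; the real script's extraction is smarter:

```python
import re

STOPWORDS = {"the", "a", "an", "and", "or", "of", "for", "to", "in", "on", "at"}

def extract_terms(query):
    """Simplified term extraction: keep lowercase non-stopword tokens
    (the real script uses NLTK part-of-speech tagging instead)."""
    return [w for w in re.findall(r"[a-z]+", query.lower()) if w not in STOPWORDS]

def keyword_search(terms, files):
    """Return (filename, line_number, line) for lines containing any term."""
    hits = []
    for fname, text in files.items():
        for n, line in enumerate(text.splitlines(), start=1):
            if any(re.search(rf"\b{re.escape(t)}\b", line, re.IGNORECASE)
                   for t in terms):
                hits.append((fname, n, line))
    return hits

files = {"2012_11_02_164248.txt":
         "Hurricane Sandy has caused damage.\nCleanup is underway."}
terms = extract_terms("Hurricane Sandy recovery efforts")
hits = keyword_search(terms, files)
```

Because matching is on word boundaries, "recovery" will not match "recovered"; that brittleness is exactly what the semantic modes compensate for.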
> **Exercise 4:** Compare the results of `search_keywords.py`, `retrieve.py`, and `query_hybrid.py` on the same query. When is each approach most useful?
## 7. Comparing the three query modes
| Script | Method | Uses LLM? | Best for |
|--------|--------|-----------|----------|
| `query_hybrid.py` | Hybrid (vector + BM25) + re-rank + LLM | Yes | Synthesized answers from documents |
| `retrieve.py` | Hybrid (vector + BM25) + re-rank | No | Inspecting raw retrieval results |
| `search_keywords.py` | POS-tagged keyword matching | No | Finding exact names, dates, terms |
## 8. Exercises
> **Exercise 5:** The hybrid retrieval uses `VECTOR_TOP_K=20` and `BM25_TOP_K=20`. Experiment with different values. What happens if you set BM25 to 0 (effectively disabling it)? What about setting vector to 0?
> **Exercise 6:** Change the re-ranker's `RERANK_TOP_N` from 15 to 5. How does this affect response quality? What about 30?
> **Exercise 7:** Modify the prompt in `query_hybrid.py`. Try asking the model to respond as a specific persona, or to format the output differently (e.g., as a timeline, or as bullet points).
> **Exercise 8:** Build this system over your own document collection — class notes, research papers, or a downloaded text corpus. Which retrieval mode works best for your documents?
## Additional resources and references
### LlamaIndex
- Documentation: https://docs.llamaindex.ai/en/stable/
- BM25 retriever: https://docs.llamaindex.ai/en/stable/examples/retrievers/bm25_retriever/
### Models
- Ollama: https://ollama.com
- Huggingface models: https://huggingface.co/models
#### Models used in this tutorial
| Model | Type | Role | Source |
|-------|------|------|--------|
| `command-r7b` | LLM (RAG-optimized) | Response generation | `ollama pull command-r7b` |
| `BAAI/bge-large-en-v1.5` | Embedding (1024-dim) | Text -> vector encoding | Huggingface (auto-downloaded) |
| `cross-encoder/ms-marco-MiniLM-L-12-v2` | Cross-encoder | Re-ranking candidates | Huggingface (auto-downloaded) |
### Further reading
- Robertson & Zaragoza, *The Probabilistic Relevance Framework: BM25 and Beyond* (2009) — the theory behind BM25
- Nogueira & Cho, *Passage Re-ranking with BERT* (2019) — cross-encoder re-ranking applied to information retrieval

View file

@ -0,0 +1,193 @@
# build_store.py
#
# Build or update the vector store from journal entries in ./data.
#
# Default mode (incremental): loads the existing index and adds only
# new or modified files. Use --rebuild for a full rebuild from scratch.
#
# January 2026
# E. M. Furst
# Used Sonnet 4.5 to suggest changes; Opus 4.6 for incremental update
from llama_index.core import (
SimpleDirectoryReader,
StorageContext,
VectorStoreIndex,
load_index_from_storage,
Settings,
)
from pathlib import Path
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.node_parser import SentenceSplitter
import argparse
import datetime
import os
import time
# Shared constants
DATA_DIR = Path("./data")
PERSIST_DIR = "./store"
EMBED_MODEL_NAME = "BAAI/bge-large-en-v1.5"
CHUNK_SIZE = 256
CHUNK_OVERLAP = 25
def get_text_splitter():
    return SentenceSplitter(
        chunk_size=CHUNK_SIZE,
        chunk_overlap=CHUNK_OVERLAP,
        paragraph_separator="\n\n",
    )

def rebuild():
    """Full rebuild: delete and recreate the vector store from scratch."""
    if not DATA_DIR.exists():
        raise FileNotFoundError(f"Data directory not found: {DATA_DIR.absolute()}")
    print(f"Loading documents from {DATA_DIR.absolute()}...")
    documents = SimpleDirectoryReader(str(DATA_DIR)).load_data()
    if not documents:
        raise ValueError("No documents found in data directory")
    print(f"Loaded {len(documents)} document(s)")
    print("Building vector index...")
    index = VectorStoreIndex.from_documents(
        documents,
        transformations=[get_text_splitter()],
        show_progress=True,
    )
    index.storage_context.persist(persist_dir=PERSIST_DIR)
    print(f"Index built and saved to {PERSIST_DIR}")

def update():
    """Incremental update: add new files, re-index modified files, remove deleted files."""
    if not DATA_DIR.exists():
        raise FileNotFoundError(f"Data directory not found: {DATA_DIR.absolute()}")
    # Load existing index
    print(f"Loading existing index from {PERSIST_DIR}...")
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
    # Set transformations so index.insert() chunks correctly
    Settings.transformations = [get_text_splitter()]
    # Build lookup of indexed files: file_name -> (ref_doc_id, metadata)
    all_ref_docs = index.docstore.get_all_ref_doc_info()
    indexed = {}
    for ref_id, info in all_ref_docs.items():
        fname = info.metadata.get("file_name")
        if fname:
            indexed[fname] = (ref_id, info.metadata)
    print(f"Index contains {len(indexed)} documents")
    # Scan current files on disk
    disk_files = {f.name: f for f in sorted(DATA_DIR.glob("*.txt"))}
    print(f"Data directory contains {len(disk_files)} files")
    # Classify files
    new_files = []
    modified_files = []
    deleted_files = []
    unchanged = 0
    for fname, fpath in disk_files.items():
        if fname not in indexed:
            new_files.append(fpath)
        else:
            ref_id, meta = indexed[fname]
            # Compare file size and modification date
            stat = fpath.stat()
            disk_size = stat.st_size
            # Must use UTC to match SimpleDirectoryReader's date format
            disk_mdate = datetime.datetime.fromtimestamp(
                stat.st_mtime, tz=datetime.timezone.utc
            ).strftime("%Y-%m-%d")
            stored_size = meta.get("file_size")
            stored_mdate = meta.get("last_modified_date")
            if disk_size != stored_size or disk_mdate != stored_mdate:
                modified_files.append((fpath, ref_id))
            else:
                unchanged += 1
    for fname, (ref_id, meta) in indexed.items():
        if fname not in disk_files:
            deleted_files.append((fname, ref_id))
    # Report
    print(f"\n  New:       {len(new_files)}")
    print(f"  Modified:  {len(modified_files)}")
    print(f"  Deleted:   {len(deleted_files)}")
    print(f"  Unchanged: {unchanged}")
    if not new_files and not modified_files and not deleted_files:
        print("\nNothing to do.")
        return
    # Process deletions (including modified files that need re-indexing)
    for fname, ref_id in deleted_files:
        print(f"  Removing {fname}")
        index.delete_ref_doc(ref_id, delete_from_docstore=True)
    for fpath, ref_id in modified_files:
        print(f"  Re-indexing {fpath.name} (modified)")
        index.delete_ref_doc(ref_id, delete_from_docstore=True)
    # Process additions (new files + modified files)
    files_to_add = new_files + [fpath for fpath, _ in modified_files]
    if files_to_add:
        print(f"\nIndexing {len(files_to_add)} file(s)...")
        # Use "./" prefix to match paths from full build (pathlib strips it)
        docs = SimpleDirectoryReader(
            input_files=[f"./{f}" for f in files_to_add]
        ).load_data()
        for doc in docs:
            index.insert(doc)
    # Persist
    index.storage_context.persist(persist_dir=PERSIST_DIR)
    print(f"\nIndex updated and saved to {PERSIST_DIR}")

def main():
    parser = argparse.ArgumentParser(
        description="Build or update the vector store from journal entries."
    )
    parser.add_argument(
        "--rebuild",
        action="store_true",
        help="Full rebuild from scratch (default: incremental update)",
    )
    args = parser.parse_args()
    # Configure embedding model
    embed_model = HuggingFaceEmbedding(model_name=EMBED_MODEL_NAME)
    Settings.embed_model = embed_model
    start = time.time()
    if args.rebuild:
        print("Mode: full rebuild")
        rebuild()
    else:
        print("Mode: incremental update")
        if not Path(PERSIST_DIR).exists():
            print(f"No existing index at {PERSIST_DIR}, doing full rebuild.")
            rebuild()
        else:
            update()
    elapsed = time.time() - start
    print(f"Done in {elapsed:.1f}s")

if __name__ == "__main__":
    main()

View file

@ -0,0 +1,80 @@
Subject: [UDEL-ALL-2128] Hurricane Sandy
Date: 2012_11_02_164248
To the University of Delaware community:
We have much to be thankful for this week at the University of Delaware
as we were spared the full force of Hurricane Sandy. Even as we breathe
a sigh of relief and return to our normal activities, we are mindful of
the many, many people in this region -- some of our students among them
-- who were not so lucky. Our thoughts and prayers go out to them as
they rebuild their communities.
The potential impact of Sandy was a major concern for UD, with its
thousands of people and 430+ buildings on 2,000 acres throughout the
state. Many members of our University community worked hard over the
last several days to help us weather this "Storm of the Century."
Preparation and practice paid off as our emergency response team, led
by the Office of Campus and Public Safety, began assessing the
situation late last week and taking steps to ensure the safety of our
people and facilities. When the storm came, the campus suffered only
minor damage: wind-driven water getting into buildings through roofs,
walls and foundations; very minimal power loss, with a couple of
residential properties without power for only a few hours, thanks to
quick repair from the City of Newark; and only three trees knocked down
and destroyed, along with a lot of leaves and branches to clean up. The
Georgetown research facilities were fortunate to sustain only minor
leaks and flooding. The hardest hit area was the Lewes campus, which
had flooding on its grounds but minimal damage to buildings.
Throughout this time, the University's greatest asset continued to be
its people -- staff members from a variety of units working as a team.
A command center brought together representatives from across UD so
that issues could be responded to immediately. Staffed around the
clock, the center included Housing, Public Safety, Residence Life,
Environmental Health and Safety, Facilities and Auxiliary Services,
Emergency Management, and Communications and Marketing.
The dedication of UD's employees and students was evident everywhere:
Dining Services staff, faced with reduced numbers and limited
deliveries, kept students fed, and supported employees who worked
during the crisis; Residence Life staff and resident assistants made
sure students who remained on campus had up-to-date information and
supplies; staff in Student Health Services kept Laurel Hall open to
respond to student health needs; Human Resources staff worked over the
weekend to ensure that payroll was processed ahead of time; UD Police
officers were on patrol and responding to issues as they arose; the UD
Emergency Care Unit was at the ready; staff in Environmental Health and
Safety aided in the safe shutdown of UD laboratories and monitored fire
safety issues; Facilities staff continue to clean up debris left in
Sandy's wake and repair damage to buildings; faculty are working with
students to make up lost class time.
Our UD Alert system served as an excellent tool for keeping students,
parents and employees informed about the storm's implications for UD,
and the University's homepage was the repository for the most current
information and lists of events and activities that were canceled or
rescheduled. Through the University's accounts on Facebook and Twitter,
staff answered questions and addressed concerns, and faculty and staff
across the campus fielded phone calls and emails.
In short, a stellar job all around.
On behalf of the students, families and employees who benefited from
these efforts, I thank everyone for their dedication and service to the
people of UD.
Sincerely,
Patrick T. Harker
President
::::::::::::::::::::::::::::::::::::::::::: UD P.O. Box ::
UDEL-ALL-2128 mailing list
Online message archive
and management at https://po-box.nss.udel.edu/
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

View file

@ -0,0 +1,85 @@
Subject: Employee Appreciation Week
Date: 2017_05_16_123456
To the University of Delaware Community - President Dennis Assanis
May 16, 2017
Dear colleague,
Our first year together has been one of amazing accomplishments and exciting opportunities. At the heart of our success has been you — the University of Delaware's exceptional faculty and staff. To thank you and celebrate everything you do, we are launching our first Employee Appreciation Week.
The full week of events includes:
Monday, June 5—UDidIt Picnic
Tuesday, June 6—Self-Care Day
Wednesday, June 7—UD Spirit Day
Thursday, June 8—Flavors of UD
Friday, June 9—Employee Appreciation Night at the Blue Rocks
The week is a collaborative effort by Employee Health & Wellbeing and Human Resources. You can get all the details here.
We are dedicated to cultivating together an environment where employees are happy, healthy and continue to bring their best selves to work each day. The work you do benefits our students, our community and the world. I am truly grateful for your talents, skills, ideas and enduring commitment to the University.
Eleni and I hope you enjoy Employee Appreciation Week with your team and your family, and we look forward to seeing you at the many events.
Best,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2792   •   www.udel.edu/president

View file

@ -0,0 +1,79 @@
Subject: Robin Morgan named UD's 11th provost
Date: 2018_05_21_110335
Robin Morgan Appointed Provost - University of Delaware
May 21, 2018
Dear UD Community,
I am pleased to announce that, after a highly competitive national search, I have appointed Robin Morgan as the University of Delaware's new provost, effective July 1. She will become the University of Delaware's 11th provost, and the first woman to serve in this role in a permanent capacity since the position was created at UD in 1950.
Over the last seven months, Dr. Morgan already has assembled an impressive record as interim provost, most notably in her stewardship of new cluster hires among our faculty and her leadership as we move toward the creation of the graduate college.
Before working closely with her, I knew Dr. Morgan as a highly respected educator and scholar, but after watching her in action, I am equally impressed with her abilities to lead, inspire and effect change. Her energy, integrity, analytical mind, and innate knack for bringing people together, combined with her dedication and loyalty to UD, are great assets.
Dr. Morgan has a distinguished record of service to this University as a faculty member since 1985. After serving as acting dean of the College of Agriculture and Natural Resources for a year, she was named dean in 2002, serving in that role for 10 years, a period of significant growth and change for the college. From 2014-16, she served as acting chair of the Department of Biological Sciences, and she had been chair of the department from 2016 until her appointment as interim provost.
We will continue to benefit from Dr. Morgan's deep knowledge of the University, her proven leadership across all aspects of teaching, research and administration, and her dedication to UD as she continues her career as provost.
I am looking forward to building on our close working relationship, and I am excited by all we will accomplish to take the University of Delaware forward. Please join me in congratulating her on this next chapter in her career.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   www.udel.edu/president

View file

@ -0,0 +1,77 @@
Subject: Momentum and Resilience: Our UD Spring Semester Resumes
Date: 2020_03_29_141635
A Message from President Dennis Assanis
Dear UD Community,
As the University of Delaware is ready to resume the spring semester tomorrow, March 30, I want to share with all of you a special message recorded from the office in my home. Thank you all for your support at this challenging time, particularly our faculty and staff for your Herculean efforts to convert our classes from face-to-face instruction to online teaching and learning.
Best of luck with the semester ahead. As we all work remotely, please stay healthy, and stay connected!
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE 19716   •   USA     (302) 831-2111   •   udel.edu/president

View file

@ -0,0 +1,75 @@
Subject: National Voter Registration Day: Get Involved
Date: 2023_09_19_085321
National Voter Registration Day: Get Involved
September 19, 2023
Dear UD Community,
Do you want to make a difference in the world? Today is a good day to start.
This is National Voter Registration Day, an opportunity to make sure your voice will be heard in upcoming local, state and national elections. Voting is the most fundamental way that we engage in our democracy, effect change in society, work through our political differences and choose our leaders for the future. The voting rights we enjoy have been secured through the hard work and sacrifice of previous generations, and it is essential that everyone who is eligible to vote remains committed to preserving and exercising those rights.
At the University of Delaware, the Student Voting and Civic Engagement Committee — representing students, faculty and staff — is leading a non-partisan effort to encourage voting and help voters become better informed about the issues that matter to them. The Make It Count voter registration drive is scheduled for 2-6 p.m. today on The Green, with games, music and the opportunity to register through the TurboVote app, which also allows users to request an absentee ballot and sign up for election reminders. The committee is planning additional events this academic year to promote voting, education and civil discourse as the nation heads into the 2024 election season.
Being a Blue Hen means sharing a commitment to creating a better world. And being a registered, engaged and informed voter is one of the best ways for all of us to achieve that vision.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu/president

View file

@ -0,0 +1,77 @@
Subject: Affirming our position and purpose
Date: 2023_10_12_155349
Affirming our position and purpose | A message from UD President Dennis Assanis
October 12, 2023
Dear UD Community,
Since my message yesterday, I have talked to many members of our community who — like me — are devastated and appalled by the terrorist attacks on Israel and the ongoing loss of life that has taken place in the Middle East.
I want to be sure that our position is very clear: We at the University of Delaware unequivocally condemn the horrific attacks by Hamas terrorists upon Israel that have shaken the world. The atrocities of crime, abduction, hostage-taking and mass murder targeted against Jewish civilians will forever remain a stain on human history. Our community's foundation of civility and respect has been challenged to an unimaginable extent in light of the antisemitic brutalities that have been committed against innocent victims.
As your president, I wish words could calm the heartache and ease the fear and grief. Unfortunately, we all know that events as complicated and devastating as those taking place in the Middle East right now will continue to evolve. The longstanding humanitarian crisis needs to be acknowledged, and we should not equate the terrorist group Hamas with innocent Palestinian, Muslim and Arab people. The ensuing war-inflicted pain, suffering and death that continues to play out across the region, including Gaza, is heartbreaking for all.
We must remember that, first and foremost, UD is a place of learning. As we engage in difficult conversations about the longstanding conflicts in the Middle East, we should always strive to do so safely, with mutual respect and without bias or judgement. I encourage our students, faculty and staff to continue organizing events to educate and unite our community. Please seize these opportunities not only as individuals, but as members of a true community defined by the freedoms that we treasure so very deeply.
So, my message to you all is to have hope, to support each other, and to realize that the perspectives and feelings we are all experiencing right now — many of which uniquely connect to our personal backgrounds — matter. Please remember this as you walk across campus, sit in your next classroom, share experiences with other members of our community, or simply take time to reflect.
Respectfully,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu/president

View file

@ -0,0 +1,82 @@
Subject: A warm welcome to our UD community!
Date: 2024_08_26_100859
A warm welcome to our UD community!
August 26, 2024
Dear UD Community,
I love the beginning of every new academic year and the renewed energy and sense of anticipation that it brings to every member of our campus community. The large influx of new people and ideas that come along with each new start is truly invigorating. Whether you are a new or continuing student, faculty or staff member, on behalf of everyone in our community, I want to extend a very warm welcome to you and thank you for everything you contribute, individually and collectively, to make the University of Delaware such a unique place.
Students, your fresh perspectives, your passion for learning, and your dreams and aspirations for the boundless possibilities that lie ahead are inspiring. Faculty, your intellectual energy, your insights and expertise, and above all, your genuine interest in transferring and sharing your knowledge with all of us are the beating heart of our institution. And to all our staff, your hard work and dedicated talents provide the essential support and services to help ensure our students are successful in all their personal, academic and career pursuits.
Here at UD, our shared purpose is to cultivate learning, develop knowledge and foster the free exchange of ideas. The connections we make and the relationships we build help advance the mission of the University. Our focus on academic excellence in all fields of study and our opportunities for groundbreaking research rely on our endless curiosity, mutual respect and open mindedness. Together, we are stronger.
This sense of connection and belonging at UD is fundamental to our campus culture. Your willingness to hear and consider all voices and viewpoints is critical to shaping the vibrant and inclusive culture of our entire institution. Only when we commit to constructive growth, based on a foundation of civility and respect for ourselves and each other, can we realize true progress.  Empowered by diverse perspectives, it is the opportunities to advance ideas that enrich learning and create positive impact in the world that unite all of us.
To celebrate the new semester and welcome our undergraduate Class of 2028, all members of our community are invited to attend the Twilight Induction ceremony tonight at 7:30 p.m. on the north side of Memorial Hall or online on Facebook Live.
As your President, I am so excited by all that we can accomplish together throughout this academic year. My wife, Eleni, and I wish you all the best at the start of this new semester and beyond. We look forward to meeting you on campus!
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,80 @@
Subject: UPDATE: Recent Executive Orders
Date: 2025_02_13_160414
UPDATE: Recent Executive Orders | University of Delaware
Feb. 13, 2025
Dear UD Community,
I know many of you continue to experience disruption and anxiety stemming from the recent federal actions and executive orders regarding a multitude of issues — from research funding to education, human rights, and immigration among other areas. As I communicated to the University of Delaware community in my Jan. 28 campus message and my Feb. 3 comments to the Faculty Senate, we will do everything we can to minimize disruption to UD students, faculty and staff while remaining in compliance with federal law.
To support our community, we have created this resource page that will be updated regularly with information for UD students, faculty and staff regarding ongoing federal actions, directives and developments, including guidance in response to changing conditions. Also, this page from the Research Office contains specific guidance related to research projects and grants. In parallel, we will continue to advocate on behalf of the University's interests regarding any impact that federal or state actions could have on our students, faculty and staff.
One example is our response this week related to the federal action to impose a 15% limit on reimbursements for indirect administrative costs (Facilities and Administrative, or F&A costs) for all National Institutes of Health (NIH) research grants. This immediate cut in funding would have a devastating impact on all biomedical, health and life science advances and human wellness, including here at UD. In response, the Delaware Attorney General filed a lawsuit jointly with 21 other state attorneys general. The University supported the Attorney General's lawsuit by submitting a declaration detailing the impact of the NIH rate cap on the institution. Fortunately, the attorneys general were successful, and a temporary restraining order was granted on Monday. Further, the Association of Public and Land-grant Universities, the Association of American Universities, and the American Council on Education announced a similar lawsuit.
As we navigate this rapidly evolving landscape together, our values will continue to be at the heart of our community. We will continue to foster an atmosphere that promotes the free exchange of ideas and opinions; we will continue to welcome and value people of different backgrounds, perspectives and learning experiences; and we will continue to encourage respect and civility toward everyone.
Please know that my leadership team and I are here to help and support our community during this time. Feel free to submit any questions pertaining to these matters here, and we will do our best to add relevant information on the resource pages. I deeply appreciate your resilience and patience as we continue to work together to advance the important mission of our University.
Sincerely,
Dennis Assanis
President
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,87 @@
Subject: Extending condolences and offering support
Date: 2025_04_29_230614
Extending condolences and offering support
April 29, 2025
Dear UD Community,
It is with a heavy heart that we share this information with you. Earlier today, a University of Delaware student died in a traffic accident on Main Street near campus, and several other people, including other UD students, suffered injuries. There is no ongoing threat to the University community.
University of Delaware Police are continuing to work with the Newark Police Department, which is actively investigating the incident. As a result, information is limited and the Newark Police Department is not releasing the victims' names at this time, pending family notification.
This is a terrible tragedy for everyone in our UD community. We speak for the entire University in offering our condolences to the families, friends and classmates of the victims, and we keep in our thoughts the other members of our community who may have witnessed the crash and its aftermath. The safety of our entire community remains our top priority, and we will continue to work with our partners in city and state government to address safety concerns around and on the UD campus.
As we all begin to cope with this traumatic incident, we encourage you to support one another and reach out for additional help from the UD resources listed below as needed.
Sincerely,
Dennis Assanis
President
José-Luis Riera
Vice President for Student Life
Support and resources
Center for Counseling and Student Development
Counselors and Student Life staff are available in Warner Hall 101 on Wednesday, April 30, from 9 a.m. to 3 p.m. for counseling services.
TimelyCare — A virtual health and wellbeing platform available 24/7 for UD students
Student Advocacy and Support — Available to assist students who need support navigating University resources or complex issues. Call 302-831-8939 or email studentsupport@udel.edu to schedule an appointment.
ComPsych® GuidanceResources® — Mental health support for UD benefited employees. Access services through the link or call 877-527-4742 for support.
Additional safety and wellness resources — Information about UD Police, Student Health Services and other services.
Information about the UD Alert, the LiveSafe app and safety notification communication.
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,76 @@
Subject: Sharing our grief, enhancing safety
Date: 2025_04_30_160615
Sharing our grief, enhancing safety
April 30, 2025
Dear UD Community,
Since last evening's crash on Main Street that took the life of a University of Delaware graduate student (whose identity is being withheld at this time) and injured several others, we have been struggling to cope with the pain of this senseless tragedy. Throughout the UD community, we are all feeling the deep ache of loss, and we will continue to work through our grief together.
Today, Newark Police announced an arrest in connection with the crash, reiterating that there is no ongoing threat to the community. 
Main Street is where we eat, shop and share our lives with our friends, families and classmates. Because it is part of the state's roadway systems, we have been working with local and state officials this year, including our partners at Delaware Department of Transportation, to address traffic safety on and around Main Street. In the wake of this tragedy, we will reinforce and accelerate those efforts. We recognize there isn't a simple solution, particularly when these tragedies involve actions taken by individuals that may not be stopped by changes to roadways or infrastructure. However, this incident underscores that our collective efforts must take on renewed urgency.
University leaders joined Delaware Attorney General Kathy Jennings and Newark Mayor Travis McDermott today for a press conference, at which we expressed our shared commitment to enhanced safety along Main Street. The University has pledged to continue these discussions through meetings with the offices of AG Jennings and Mayor McDermott, in addition to DelDOT, in the near future. The University remains committed to advancing meaningful solutions, while the Universitys Division of Student Life and Graduate College are connecting with students about effective advocacy, civic engagement and partnerships in order to support these efforts.
We are also aware that members of the UD community may have witnessed the crash and its aftermath or have close relationships with the victims. We encourage everyone to become familiar with and use, as needed, the available University counseling and support resources that were shared in Tuesday evening's message to the UD community. Counseling services are available at Warner Hall and through TimelyCare anytime, 24/7. Students with physical injuries or medical concerns relating to the incident can contact Student Health Services at 302-831-2226, Option 0, or visit Laurel Hall to meet with triage nurses available until 5 p.m. After hours, students can contact the Highmark Nurse line at 888-258-3428 or visit local urgent care centers (Newark Urgent Care at 324 E. Main Street, or ChristianaCare GoHealth at 550 S. College Avenue, Suite 115).
During this difficult time in our community, we all need to continue supporting and standing by one another as we move forward together.
Sincerely,
Dennis Assanis
President
Laura Carlson
Provost
José-Luis Riera
Vice President for Student Life
University of Delaware   •   Newark, DE   •   udel.edu

View file

@ -0,0 +1,176 @@
# query_hybrid.py
# Hybrid retrieval: BM25 (sparse) + vector similarity (dense) + cross-encoder
#
# Combines two retrieval strategies to catch both exact term matches and
# semantic similarity:
# 1. Retrieve top-20 via vector similarity (bi-encoder, catches meaning)
# 2. Retrieve top-20 via BM25 (term frequency, catches exact names/dates)
# 3. Merge and deduplicate candidates by node ID
# 4. Re-rank the union with a cross-encoder -> top-15
# 5. Pass re-ranked chunks to LLM for synthesis
#
# The cross-encoder doesn't care where candidates came from -- it scores
# each (query, chunk) pair on its own merits. BM25's job is just to
# nominate candidates that vector similarity might miss.
#
# E.M.F. February 2026
# Environment vars must be set before importing huggingface/transformers
# libraries, because huggingface_hub.constants evaluates HF_HUB_OFFLINE
# at import time.
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["SENTENCE_TRANSFORMERS_HOME"] = "./models"
os.environ["HF_HUB_OFFLINE"] = "1"
from llama_index.core import (
StorageContext,
load_index_from_storage,
Settings,
get_response_synthesizer,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.core.prompts import PromptTemplate
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.retrievers.bm25 import BM25Retriever
import sys
#
# Globals
#
# Embedding model (must match build_store.py)
EMBED_MODEL = HuggingFaceEmbedding(cache_folder="./models", model_name="BAAI/bge-large-en-v1.5", local_files_only=True)
# LLM model for generation
LLM_MODEL = "command-r7b"
# Cross-encoder model for re-ranking (cached in ./models/)
RERANK_MODEL = "cross-encoder/ms-marco-MiniLM-L-12-v2"
RERANK_TOP_N = 15
# Retrieval parameters
VECTOR_TOP_K = 20 # candidates from vector similarity
BM25_TOP_K = 20 # candidates from BM25 term matching
#
# Custom prompt for grounded synthesis
#
PROMPT = PromptTemplate(
"""You are a precise research assistant analyzing excerpts from a personal journal collection.
Every excerpt below has been selected and ranked for relevance to the query.
CONTEXT (ranked by relevance):
{context_str}
QUERY:
{query_str}
Instructions:
- Answer ONLY using information explicitly present in the CONTEXT above
- Examine ALL provided excerpts, not just the top few -- each one was selected for relevance
- Be specific: quote or closely paraphrase key passages and cite their file names
- When multiple files touch on the query, note what each one contributes
- If the context doesn't contain enough information to answer fully, say so
Your response should:
1. Directly answer the query, drawing on as many relevant excerpts as possible
2. Reference specific files and their content (e.g., "In <filename>, ...")
3. End with a list of all files that contributed to your answer, with a brief note on each
If the context is insufficient, explain what's missing."""
)
def main():
# Configure LLM and embedding model
# for local model using ollama
# Note: Ollama temperature defaults to 0.8
Settings.llm = Ollama(
model=LLM_MODEL,
temperature=0.3,
request_timeout=360.0,
context_window=8000,
)
# Use OpenAI API:
# from llama_index.llms.openai import OpenAI
# Settings.llm = OpenAI(
# model="gpt-4o-mini", # or "gpt-4o" for higher quality
# temperature=0.3,
# )
Settings.embed_model = EMBED_MODEL
# Load persisted vector store
storage_context = StorageContext.from_defaults(persist_dir="./store")
index = load_index_from_storage(storage_context)
# --- Retrievers ---
# Vector retriever (dense: cosine similarity over embeddings)
vector_retriever = index.as_retriever(similarity_top_k=VECTOR_TOP_K)
# BM25 retriever (sparse: term frequency scoring)
bm25_retriever = BM25Retriever.from_defaults(
index=index,
similarity_top_k=BM25_TOP_K,
)
# Cross-encoder re-ranker
reranker = SentenceTransformerRerank(
model=RERANK_MODEL,
top_n=RERANK_TOP_N,
)
# --- Query ---
if len(sys.argv) < 2:
        print("Usage: python query_hybrid.py QUERY_TEXT")
sys.exit(1)
q = " ".join(sys.argv[1:])
# Retrieve from both sources
vector_nodes = vector_retriever.retrieve(q)
bm25_nodes = bm25_retriever.retrieve(q)
# Merge and deduplicate by node ID
seen_ids = set()
merged = []
for node in vector_nodes + bm25_nodes:
node_id = node.node.node_id
if node_id not in seen_ids:
seen_ids.add(node_id)
merged.append(node)
# Re-rank the merged candidates with cross-encoder
reranked = reranker.postprocess_nodes(merged, query_str=q)
# Report retrieval stats
    vector_ids = {n.node.node_id for n in vector_nodes}
    bm25_ids = {n.node.node_id for n in bm25_nodes}
    n_both = len(vector_ids & bm25_ids)
print(f"\nQuery: {q}")
print(f"Vector: {len(vector_nodes)}, BM25: {len(bm25_nodes)}, "
f"overlap: {n_both}, merged: {len(merged)}, re-ranked to: {len(reranked)}")
# Synthesize response with LLM
synthesizer = get_response_synthesizer(text_qa_template=PROMPT)
response = synthesizer.synthesize(q, nodes=reranked)
# Output
print("\nResponse:\n")
print(response.response)
print("\nSource documents:")
for node in response.source_nodes:
meta = getattr(node, "metadata", None) or node.node.metadata
score = getattr(node, "score", None)
print(f"{meta.get('file_name')} {meta.get('file_path')} {score:.3f}")
if __name__ == "__main__":
main()

View file

@ -0,0 +1,7 @@
llama-index-core
llama-index-readers-file
llama-index-llms-ollama
llama-index-embeddings-huggingface
llama-index-retrievers-bm25
nltk
sentence-transformers

View file

@ -0,0 +1,140 @@
# retrieve.py
# Hybrid verbatim chunk retrieval: BM25 + vector search + cross-encoder, no LLM.
#
# Same hybrid retrieval as query_hybrid.py but outputs raw chunk text
# instead of LLM synthesis. Useful for inspecting what the hybrid pipeline
# retrieves.
#
# Each chunk is annotated with its source (vector, BM25, or both) so you can
# see which retriever nominated it.
#
# E.M.F. February 2026
# Environment vars must be set before importing huggingface/transformers
# libraries, because huggingface_hub.constants evaluates HF_HUB_OFFLINE
# at import time.
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["SENTENCE_TRANSFORMERS_HOME"] = "./models"
os.environ["HF_HUB_OFFLINE"] = "1"
from llama_index.core import (
StorageContext,
load_index_from_storage,
Settings,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.retrievers.bm25 import BM25Retriever
import sys
import textwrap
#
# Globals
#
# Embedding model (must match build_store.py)
EMBED_MODEL = HuggingFaceEmbedding(cache_folder="./models", model_name="BAAI/bge-large-en-v1.5", local_files_only=True)
# Cross-encoder model for re-ranking (cached in ./models/)
RERANK_MODEL = "cross-encoder/ms-marco-MiniLM-L-12-v2"
RERANK_TOP_N = 15
# Retrieval parameters
VECTOR_TOP_K = 20
BM25_TOP_K = 20
# Output formatting
WRAP_WIDTH = 80
def main():
# No LLM needed -- set embed model only
Settings.embed_model = EMBED_MODEL
# Load persisted vector store
storage_context = StorageContext.from_defaults(persist_dir="./store")
index = load_index_from_storage(storage_context)
# --- Retrievers ---
vector_retriever = index.as_retriever(similarity_top_k=VECTOR_TOP_K)
bm25_retriever = BM25Retriever.from_defaults(
index=index,
similarity_top_k=BM25_TOP_K,
)
# Cross-encoder re-ranker
reranker = SentenceTransformerRerank(
model=RERANK_MODEL,
top_n=RERANK_TOP_N,
)
# Query
if len(sys.argv) < 2:
        print("Usage: python retrieve.py QUERY_TEXT")
sys.exit(1)
q = " ".join(sys.argv[1:])
# Retrieve from both sources
vector_nodes = vector_retriever.retrieve(q)
bm25_nodes = bm25_retriever.retrieve(q)
# Track which retriever found each node
vector_ids = {n.node.node_id for n in vector_nodes}
bm25_ids = {n.node.node_id for n in bm25_nodes}
# Merge and deduplicate by node ID
seen_ids = set()
merged = []
for node in vector_nodes + bm25_nodes:
node_id = node.node.node_id
if node_id not in seen_ids:
seen_ids.add(node_id)
merged.append(node)
# Re-rank merged candidates
reranked = reranker.postprocess_nodes(merged, query_str=q)
# Retrieval stats
n_both = len(vector_ids & bm25_ids)
n_vector_only = len(vector_ids - bm25_ids)
n_bm25_only = len(bm25_ids - vector_ids)
print(f"\nQuery: {q}")
print(f"Vector: {len(vector_nodes)}, BM25: {len(bm25_nodes)}, "
f"overlap: {n_both}, merged: {len(merged)}, re-ranked to: {len(reranked)}")
print(f" vector-only: {n_vector_only}, bm25-only: {n_bm25_only}, both: {n_both}\n")
# Output re-ranked chunks with source annotation
for i, node in enumerate(reranked, 1):
meta = getattr(node, "metadata", None) or node.node.metadata
score = getattr(node, "score", None)
file_name = meta.get("file_name", "unknown")
text = node.get_content()
node_id = node.node.node_id
# Annotate source
in_vector = node_id in vector_ids
in_bm25 = node_id in bm25_ids
if in_vector and in_bm25:
source = "vector+bm25"
elif in_bm25:
source = "bm25-only"
else:
source = "vector-only"
print("=" * WRAP_WIDTH)
print(f"=== [{i}] {file_name} (score: {score:.3f}) [{source}]")
print("=" * WRAP_WIDTH)
for line in text.splitlines():
if line.strip():
print(textwrap.fill(line, width=WRAP_WIDTH))
else:
print()
print()
if __name__ == "__main__":
main()

41
04-semantic-search/run_query.sh Executable file
View file

@ -0,0 +1,41 @@
#!/bin/bash
# This shell script will handle I/O for the python query engine
# It will take a query and return the formatted results
# E.M.F. August 2025
# Usage: ./run_query.sh
QUERY_SCRIPT="query_hybrid.py"
VENV_DIR=".venv"
# Activate the virtual environment
if [ -d "$VENV_DIR" ]; then
source "$VENV_DIR/bin/activate"
echo "Activated virtual environment: $VENV_DIR"
else
echo "Error: Virtual environment not found at '$VENV_DIR'" >&2
echo "Create one with: python3 -m venv $VENV_DIR" >&2
exit 1
fi
echo -e "Current query engine is $QUERY_SCRIPT\n"
# Loop until input is "exit"
while true; do
read -p "Enter your query (or type 'exit' to quit): " query
if [ "$query" == "exit" ] || [ "$query" == "quit" ] || [ "$query" == "" ] ; then
echo "Exiting..."
break
fi
time_start=$(date +%s)
# Call the python script with the query and format the output
    python3 $QUERY_SCRIPT "$query" | \
expand | sed -E 's|(.* )(.*/data)|\1./data|' | fold -s -w 131
time_end=$(date +%s)
elapsed=$((time_end - time_start))
echo -e "Query processed in $elapsed seconds.\n"
    echo "$query" >> query.log
done

View file

@ -0,0 +1,40 @@
#!/bin/bash
# This shell script will handle I/O for the python query engine
# It will take a query and return the formatted results
# E.M.F. August 2025
# Usage: ./run_query.sh
QUERY_SCRIPT="retrieve.py"
VENV_DIR=".venv"
# Activate the virtual environment
if [ -d "$VENV_DIR" ]; then
source "$VENV_DIR/bin/activate"
echo "Activated virtual environment: $VENV_DIR"
else
echo "Error: Virtual environment not found at '$VENV_DIR'" >&2
echo "Create one with: python3 -m venv $VENV_DIR" >&2
exit 1
fi
echo -e "$QUERY_SCRIPT -- retrieve vector store chunks based on similarity + BM25 with reranking.\n"
# Loop until input is "exit"
while true; do
read -p "Enter your query (or type 'exit' to quit): " query
if [ "$query" == "exit" ] || [ "$query" == "quit" ] || [ "$query" == "" ] ; then
echo "Exiting..."
break
fi
time_start=$(date +%s)
# Call the python script with the query and format the output
    python3 $QUERY_SCRIPT "$query" | \
expand | sed -E 's|(.* )(.*/data)|\1./data|' | fold -s -w 131
time_end=$(date +%s)
elapsed=$((time_end - time_start))
echo -e "Query processed in $elapsed seconds.\n"
done

View file

@ -0,0 +1,189 @@
# search_keywords.py
# Keyword search: extract terms from a query using POS tagging, then grep
# across journal files for matches.
#
# Complements the vector search pipeline by catching exact names, places,
# and dates that embeddings can miss. No vector store or LLM needed.
#
# Term extraction uses NLTK POS tagging to keep nouns (NN*), proper nouns
# (NNP*), and adjectives (JJ*) -- skipping stopwords and function words
# automatically. Consecutive proper nouns are joined into multi-word phrases
# (e.g., "Robert Wright" stays as one search term, not "robert" + "wright").
#
# E.M.F. February 2026
import os
import sys
import re
from pathlib import Path
import nltk
#
# Globals
#
DATA_DIR = Path("./data")
CONTEXT_LINES = 2 # lines of context around each match
MAX_MATCHES_PER_FILE = 3 # cap matches shown per file to avoid flooding
# POS tags to keep: nouns, proper nouns, adjectives
KEEP_TAGS = {"NN", "NNS", "NNP", "NNPS", "JJ", "JJS", "JJR"}
# Proper noun tags (consecutive runs are joined as phrases)
PROPER_NOUN_TAGS = {"NNP", "NNPS"}
# Minimum word length to keep (filters out short noise)
MIN_WORD_LEN = 3
def ensure_nltk_data():
"""Download NLTK data if not already present."""
for resource, name in [
("tokenizers/punkt_tab", "punkt_tab"),
("taggers/averaged_perceptron_tagger_eng", "averaged_perceptron_tagger_eng"),
]:
try:
nltk.data.find(resource)
except LookupError:
print(f"Downloading NLTK resource: {name}")
nltk.download(name, quiet=True)
def extract_terms(query):
"""Extract key terms from a query using POS tagging.
Tokenizes the query, runs POS tagging, and keeps nouns, proper nouns,
and adjectives. Consecutive proper nouns (NNP/NNPS) are joined into
    multi-word phrases (e.g., "Robert Wright" -> "robert wright").
Returns a list of terms (lowercase), phrases listed first.
"""
tokens = nltk.word_tokenize(query)
tagged = nltk.pos_tag(tokens)
phrases = [] # multi-word proper noun phrases
single_terms = [] # individual nouns/adjectives
proper_run = [] # accumulator for consecutive proper nouns
for word, tag in tagged:
if tag in PROPER_NOUN_TAGS:
proper_run.append(word)
else:
# Flush any accumulated proper noun run
if proper_run:
phrase = " ".join(proper_run).lower()
if len(phrase) >= MIN_WORD_LEN:
phrases.append(phrase)
proper_run = []
# Keep other nouns and adjectives as single terms
if tag in KEEP_TAGS and len(word) >= MIN_WORD_LEN:
single_terms.append(word.lower())
# Flush final proper noun run
if proper_run:
phrase = " ".join(proper_run).lower()
if len(phrase) >= MIN_WORD_LEN:
phrases.append(phrase)
# Phrases first (more specific), then single terms
all_terms = phrases + single_terms
return list(dict.fromkeys(all_terms)) # deduplicate, preserve order
def search_files(terms, data_dir, context_lines=CONTEXT_LINES):
"""Search all .txt files in data_dir for the given terms.
Returns a list of (file_path, match_count, matches) where matches is a
list of (line_number, context_block) tuples.
"""
if not terms:
return []
# Build a single regex pattern that matches any term (case-insensitive)
pattern = re.compile(
r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
re.IGNORECASE
)
results = []
txt_files = sorted(data_dir.glob("*.txt"))
for fpath in txt_files:
try:
lines = fpath.read_text(encoding="utf-8").splitlines()
except (OSError, UnicodeDecodeError):
continue
matches = []
match_count = 0
seen_lines = set() # avoid overlapping context blocks
for i, line in enumerate(lines):
if pattern.search(line):
match_count += 1
if i in seen_lines:
continue
# Extract context window
start = max(0, i - context_lines)
end = min(len(lines), i + context_lines + 1)
block = []
for j in range(start, end):
seen_lines.add(j)
marker = ">>>" if j == i else " "
block.append(f" {marker} {j+1:4d}: {lines[j]}")
matches.append((i + 1, "\n".join(block)))
if match_count > 0:
results.append((fpath, match_count, matches))
# Sort by match count (most matches first)
results.sort(key=lambda x: x[1], reverse=True)
return results
def main():
if len(sys.argv) < 2:
print("Usage: python search_keywords.py QUERY_TEXT")
sys.exit(1)
ensure_nltk_data()
q = " ".join(sys.argv[1:])
# Extract terms
terms = extract_terms(q)
if not terms:
print(f"Query: {q}")
print("No searchable terms extracted. Try a more specific query.")
sys.exit(0)
print(f"Query: {q}")
print(f"Extracted terms: {', '.join(terms)}\n")
# Search
results = search_files(terms, DATA_DIR)
if not results:
print("No matches found.")
sys.exit(0)
# Summary
total_matches = sum(r[1] for r in results)
print(f"Found {total_matches} matches across {len(results)} files\n")
# Detailed output
for fpath, match_count, matches in results:
print("="*60)
print(f"--- {fpath.name} ({match_count} matches) ---")
print("="*60)
for line_num, block in matches[:MAX_MATCHES_PER_FILE]:
print(block)
print()
if len(matches) > MAX_MATCHES_PER_FILE:
print(f" ... and {len(matches) - MAX_MATCHES_PER_FILE} more matches\n")
if __name__ == "__main__":
main()


@ -0,0 +1,258 @@
# Large Language Models Part V: Building a Neural Network
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
---
## Key idea
Build a neural network from scratch to understand the core mechanics behind LLMs.
## Key goals
- See concretely what "weights and biases" are and how they're organized
- Understand the forward pass, loss function, and gradient descent
- Implement backpropagation by hand in numpy
- See how PyTorch automates the same process
- Connect these concepts to what you've already seen in nanoGPT
---
Everything we've done in this workshop is **machine learning** (ML) — the practice of training models to learn patterns from data rather than programming rules by hand. LLMs are one (very large) example of ML, built on neural networks. Throughout this workshop, we've used ML terms like *model weights*, *training loss*, *gradient descent*, and *overfitting* — often without defining them precisely. In Part I, we watched nanoGPT's training loss decrease over 2000 iterations. In Part II, we saw that models have millions of parameters. In Parts III and IV, we used embedding models that map text into vectors — another ML technique.
In this section, we step back from language and build a neural network ourselves — small enough to understand every weight, but powerful enough to learn a real physical relationship. The goal is to make the ML concepts behind LLMs concrete.
Our task: fit the heat capacity $C_p(T)$ of nitrogen gas using data from the [NIST Chemistry WebBook](https://webbook.nist.gov/). This is a function that chemical engineers know well. Textbooks like *Chemical, Biochemical, and Engineering Thermodynamics* (a UD favorite) typically fit it with a polynomial:
$$C_p(T) = a + bT + cT^2 + dT^3$$
Can a neural network learn this relationship directly from data?
## 1. Setup
Use the virtual environment from Part I — `numpy` and `torch` are already installed. You may need to add `matplotlib`:
```bash
pip install matplotlib
```
## 2. The data
The file `data/n2_cp.csv` contains 35 data points: the isobaric heat capacity of N₂ gas at 1 bar from 300 K to 2000 K, from the NIST WebBook.
```bash
head data/n2_cp.csv
```
```
T_K,Cp_kJ_per_kgK
300.00,1.0413
350.00,1.0423
400.00,1.0450
...
```
The curve is smooth and nonlinear — $C_p$ increases with temperature as molecular vibrational modes become active. This is a good test case: simple enough for a small network, but not a straight line.
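A quick way to sanity-check the file with numpy; the first few rows are inlined below via `StringIO` so the snippet runs standalone (the scripts themselves load `data/n2_cp.csv` directly):

```python
import io
import numpy as np

# First rows of data/n2_cp.csv, inlined so this snippet runs on its own
sample = io.StringIO(
    "T_K,Cp_kJ_per_kgK\n"
    "300.00,1.0413\n"
    "350.00,1.0423\n"
    "400.00,1.0450\n"
)

data = np.loadtxt(sample, delimiter=",", skiprows=1)
T, Cp = data[:, 0], data[:, 1]
print(f"{len(T)} points, T: {T.min():.0f}-{T.max():.0f} K")
```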
## 3. Architecture of a one-hidden-layer network
Our network has three layers:
```
Input (1 neuron: T) → Hidden (10 neurons) → Output (1 neuron: Cp)
```
Here's what happens at each step:
### Forward pass
**Step 1: Hidden layer.** Each of the 10 hidden neurons computes a weighted sum of the input plus a bias, then applies an *activation function*:
$$z_j = w_j \cdot x + b_j \qquad a_j = \tanh(z_j)$$
where $w_j$ and $b_j$ are the weight and bias for neuron $j$. The activation function (here, `tanh`) introduces **nonlinearity** — without it, stacking layers would just produce another linear function, no matter how many layers we use.
**Step 2: Output layer.** The output is a weighted sum of the hidden activations:
$$\hat{y} = \sum_j W_j \cdot a_j + b_{\text{out}}$$
This is a linear combination — no activation on the output, since we want to predict a continuous value.
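Both steps written out in numpy for a single normalized input, with randomly initialized weights (the array shapes here match the parameter counts tallied next):

```python
import numpy as np

rng = np.random.default_rng(0)
H = 10                                # hidden neurons
x = np.array([[0.5]])                 # one normalized temperature, shape (1, 1)

W1, b1 = rng.normal(size=(1, H)) * 0.5, np.zeros((1, H))  # input -> hidden
W2, b2 = rng.normal(size=(H, 1)) * 0.5, np.zeros((1, 1))  # hidden -> output

a = np.tanh(x @ W1 + b1)              # Step 1: hidden activations, shape (1, H)
y_hat = a @ W2 + b2                   # Step 2: linear output, shape (1, 1)
```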
### Counting parameters
With 10 hidden neurons:
- `W1`: 10 weights (input → hidden)
- `b1`: 10 biases (hidden)
- `W2`: 10 weights (hidden → output)
- `b2`: 1 bias (output)
- **Total: 31 parameters**
That's 31 parameters for 35 data points — almost a 1:1 ratio, which should make you nervous about overfitting. In general, a model with as many parameters as data points can memorize instead of learning. We get away with it here because (a) the $C_p(T)$ data is very smooth with no noise, and (b) the `tanh` activation constrains each neuron to a smooth S-curve, so the network can't wiggle wildly between points the way a high-degree polynomial could. We'll revisit this in the overfitting section below.
Compare: the small nanoGPT model from Part I had ~800,000 parameters. GPT-2 has 124 million. The architecture is the same idea — layers of weights and activations — just scaled enormously.
## 4. Training
Training means finding the values of all 31 parameters that make the network's predictions match the data. This requires three things:
### Loss function
We need a number that says "how wrong is the network?" The **mean squared error** (MSE) is a natural choice:
$$L = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2$$
This is the same kind of loss we watched decrease during nanoGPT training in Part I (though nanoGPT uses cross-entropy loss, which is appropriate for classification over a vocabulary).
### Backpropagation
To improve the weights, we need to know how each weight affects the loss. **Backpropagation** computes these gradients by applying the chain rule, working backward from the loss through each layer. For example, the gradient of the loss with respect to an output weight $W_j$ is:
$$\frac{\partial L}{\partial W_j} = \frac{1}{N} \sum_i 2(\hat{y}_i - y_i) \cdot a_{ij}$$
The numpy implementation in `nn_numpy.py` computes every gradient explicitly. This is the part that PyTorch automates.
### Gradient descent
Once we have the gradients, we update each weight:
$$w \leftarrow w - \eta \cdot \frac{\partial L}{\partial w}$$
where $\eta$ is the **learning rate** — a small number (0.01 in our code) that controls how big each step is. Too large and training oscillates; too small and it's painfully slow.
One full pass through these three steps (forward → loss → backward → update) is one **epoch**. We train for 5000 epochs.
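The whole loop, condensed onto synthetic stand-in data (a smooth curve on [0, 1], not the N₂ file) so it runs on its own; `nn_numpy.py` is the fully commented version:

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.linspace(0, 1, 35).reshape(-1, 1)     # stand-in for normalized T
Y = 0.2 * np.sin(3 * X) + 0.5                # stand-in for normalized Cp
N, H, eta = X.shape[0], 10, 0.01

W1, b1 = rng.normal(size=(1, H)) * 0.5, np.zeros((1, H))
W2, b2 = rng.normal(size=(H, 1)) * 0.5, np.zeros((1, 1))

for epoch in range(5000):
    A1 = np.tanh(X @ W1 + b1)                # forward pass
    Y_pred = A1 @ W2 + b2
    err = Y_pred - Y
    loss = np.mean(err ** 2)                 # MSE loss
    dY = 2 * err / N                         # backpropagation (chain rule)
    dW2, db2 = A1.T @ dY, dY.sum(axis=0, keepdims=True)
    dZ1 = (dY @ W2.T) * (1 - A1 ** 2)        # tanh derivative
    dW1, db1 = X.T @ dZ1, dZ1.sum(axis=0, keepdims=True)
    W1 -= eta * dW1; b1 -= eta * db1         # gradient descent update
    W2 -= eta * dW2; b2 -= eta * db2

print(f"final loss: {loss:.6f}")
```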
In nanoGPT, the training loop in `train.py` does exactly the same thing, but with the AdamW optimizer (a fancier version of gradient descent) and batches of data instead of the full dataset.
## 5. Running the numpy version
```bash
python nn_numpy.py
```
```
Epoch 0 Loss: 0.283941
Epoch 500 Loss: 0.001253
Epoch 1000 Loss: 0.000412
Epoch 1500 Loss: 0.000178
Epoch 2000 Loss: 0.000082
Epoch 2500 Loss: 0.000040
Epoch 3000 Loss: 0.000021
Epoch 3500 Loss: 0.000012
Epoch 4000 Loss: 0.000008
Epoch 4500 Loss: 0.000005
Epoch 4999 Loss: 0.000004
Final loss: 0.000004
Network: 1 input -> 10 hidden (tanh) -> 1 output
Total parameters: 31
```
The script produces a plot (`nn_fit.png`) showing the fit and the training loss curve. You should see the network's prediction closely tracking the NIST data points, and the loss dropping rapidly in the first 1000 epochs before leveling off.
> **Exercise 1:** Read through `nn_numpy.py` carefully. Identify where each of the following happens: (a) forward pass, (b) loss calculation, (c) backpropagation, (d) gradient descent update. Annotate your copy with comments.
> **Exercise 2:** Change the number of hidden neurons `H`. Try 2, 5, 10, 20, 50. How does the fit change? How many parameters does each network have? At what point does adding more neurons stop helping?
## 6. The PyTorch version
Now look at `nn_torch.py`. It does the same thing, but in about half the code:
```bash
python nn_torch.py
```
Compare the two scripts side by side. The key differences:
| | numpy version | PyTorch version |
|---|---|---|
| Define layers | Manual weight matrices | `nn.Linear(1, H)` |
| Forward pass | `X @ W1 + b1`, `np.tanh(...)` | `model(X)` |
| Backprop | Hand-coded chain rule | `loss.backward()` |
| Weight update | `W -= lr * dW` | `optimizer.step()` |
| Lines of code | ~80 | ~40 |
PyTorch's `loss.backward()` computes all the gradients we wrote out by hand — automatically. This is called **automatic differentiation**. It's what makes training networks with millions of parameters feasible.
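A minimal demonstration of automatic differentiation, independent of our network: ask PyTorch for the derivative of $y = x^2$ at $x = 3$.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
y.backward()          # PyTorch computes dy/dx by the chain rule
print(x.grad)         # tensor(6.)
```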
The `nn.Sequential` definition:
```python
model = nn.Sequential(
nn.Linear(1, H), # input -> hidden (W1, b1)
nn.Tanh(), # activation
nn.Linear(H, 1), # hidden -> output (W2, b2)
)
```
looks simple here, but it's the same API used in nanoGPT's `model.py` — just with more layers, attention mechanisms, and a much larger vocabulary.
> **Exercise 3:** In the PyTorch version, replace `nn.Tanh()` with `nn.ReLU()` or `nn.Sigmoid()`. How does the fit change? Why might different activation functions work better for different problems?
> **Exercise 4:** Replace the Adam optimizer with plain SGD: `torch.optim.SGD(model.parameters(), lr=0.01)`. How does training speed compare? Try increasing the learning rate. What happens?
## 7. Normalization
Both scripts normalize the input ($T$) and output ($C_p$) to the range [0, 1] before training. This is important:
- Raw $T$ values range from 300 to 2000, while $C_p$ ranges from 1.04 to 1.28
- With unnormalized data, the gradients for the input weights would be hundreds of times larger than for the output weights
- The network would struggle to learn — or need a much smaller learning rate
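The scaling both scripts use, shown on a few sample temperatures:

```python
import numpy as np

T_raw = np.array([300.0, 650.0, 1000.0, 1500.0, 2000.0])  # K (sample values)

# Min-max scaling to [0, 1], as in nn_numpy.py and nn_torch.py
T_min, T_max = T_raw.min(), T_raw.max()
T = (T_raw - T_min) / (T_max - T_min)

# Invert after prediction to recover physical units
T_back = T * (T_max - T_min) + T_min
```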
Try it yourself:
> **Exercise 5:** Comment out the normalization in `nn_numpy.py` (use `T_raw` and `Cp_raw` directly). What happens to the training loss? Can you fix it by changing the learning rate?
## 8. Overfitting
With 31 parameters and 35 data points, our network is close to the edge. What happens with more parameters than data?
> **Exercise 6:** Increase `H` to 100 (giving 301 parameters — nearly 10× the number of data points). Train for 20,000 epochs. Plot the fit. Does it match the training data well? Now generate predictions at $T$ = 275 K and $T$ = 2100 K (outside the training range). Are they reasonable?
This is **overfitting** — the network memorizes the training data but fails to generalize. It's the same concept we discussed in Part I when nanoGPT's validation loss started increasing while the training loss kept decreasing.
In practice, we combat overfitting with:
- More data
- Regularization (dropout — remember this parameter from nanoGPT?)
- Early stopping (stop training when validation loss starts increasing)
- Keeping the model appropriately sized for the data
## 9. Connecting back to LLMs
Everything you've built here scales up to large language models:
| This tutorial | nanoGPT / LLMs |
|---|---|
| 31 parameters | 800K–70B+ parameters |
| 1 hidden layer | 4–96+ layers |
| tanh activation | GELU activation |
| MSE loss | Cross-entropy loss |
| Plain gradient descent | AdamW optimizer |
| Numpy arrays | PyTorch tensors (on GPU) |
| Fitting $C_p(T)$ | Predicting next tokens |
The fundamental loop — forward pass, compute loss, backpropagate, update weights — is identical. The difference is scale: more layers, more data, more compute, and architectural innovations like self-attention.
## Additional resources and references
### NIST Chemistry WebBook
- https://webbook.nist.gov/ — thermophysical property data used in this tutorial
### PyTorch
- Tutorial: https://pytorch.org/tutorials/beginner/basics/intro.html
- `nn.Module` documentation: https://pytorch.org/docs/stable/nn.html
### Reading
- The "backpropagation" chapter in Goodfellow, Bengio & Courville, *Deep Learning* (2016), freely available at https://www.deeplearningbook.org/
- 3Blue1Brown, *Neural Networks* video series: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi — excellent visual intuition for how neural networks learn


@ -0,0 +1,36 @@
T_K,Cp_kJ_per_kgK
300.00,1.0413
350.00,1.0423
400.00,1.0450
450.00,1.0497
500.00,1.0564
550.00,1.0650
600.00,1.0751
650.00,1.0863
700.00,1.0981
750.00,1.1102
800.00,1.1223
850.00,1.1342
900.00,1.1457
950.00,1.1568
1000.0,1.1674
1050.0,1.1774
1100.0,1.1868
1150.0,1.1957
1200.0,1.2040
1250.0,1.2118
1300.0,1.2191
1350.0,1.2260
1400.0,1.2323
1450.0,1.2383
1500.0,1.2439
1550.0,1.2491
1600.0,1.2540
1650.0,1.2586
1700.0,1.2630
1750.0,1.2670
1800.0,1.2708
1850.0,1.2744
1900.0,1.2778
1950.0,1.2810
2000.0,1.2841


@ -0,0 +1,156 @@
# nn_numpy.py
#
# A neural network with one hidden layer, built from scratch using numpy.
# Fits Cp(T) data for nitrogen gas at 1 bar (NIST WebBook).
#
# This demonstrates the core mechanics of a neural network:
# - Forward pass: input -> hidden layer -> activation -> output
# - Loss calculation (mean squared error)
# - Backpropagation: computing gradients of the loss w.r.t. each weight
# - Gradient descent: updating weights to minimize loss
#
# CHEG 667-013
# E. M. Furst
import numpy as np
import matplotlib.pyplot as plt
# ── Load and prepare data ──────────────────────────────────────
data = np.loadtxt("data/n2_cp.csv", delimiter=",", skiprows=1)
T_raw = data[:, 0] # Temperature (K)
Cp_raw = data[:, 1] # Heat capacity (kJ/kg/K)
# Normalize inputs and outputs to [0, 1] range.
# Neural networks train better when values are small and centered.
T_min, T_max = T_raw.min(), T_raw.max()
Cp_min, Cp_max = Cp_raw.min(), Cp_raw.max()
T = (T_raw - T_min) / (T_max - T_min) # shape: (N,)
Cp = (Cp_raw - Cp_min) / (Cp_max - Cp_min) # shape: (N,)
# Reshape for matrix operations: each sample is a row
X = T.reshape(-1, 1) # (N, 1) -- input matrix
Y = Cp.reshape(-1, 1) # (N, 1) -- target matrix
N = X.shape[0] # number of data points
# ── Network architecture ───────────────────────────────────────
#
# Input (1) --> Hidden (H neurons, tanh) --> Output (1)
#
# The hidden layer has H neurons. Each neuron computes:
# z = w * x + b (weighted sum)
# a = tanh(z) (activation -- introduces nonlinearity)
#
# The output layer combines the hidden activations:
# y_pred = W2 @ a + b2
H = 10 # number of neurons in the hidden layer
# Initialize weights randomly (small values)
# W1: (1, H) -- connects input to each hidden neuron
# b1: (1, H) -- one bias per hidden neuron
# W2: (H, 1) -- connects hidden neurons to output
# b2: (1, 1) -- output bias
np.random.seed(42)
W1 = np.random.randn(1, H) * 0.5
b1 = np.zeros((1, H))
W2 = np.random.randn(H, 1) * 0.5
b2 = np.zeros((1, 1))
# ── Training parameters ───────────────────────────────────────
learning_rate = 0.01
epochs = 5000
log_interval = 500
# ── Training loop ─────────────────────────────────────────────
losses = []
for epoch in range(epochs):
# ── Forward pass ──────────────────────────────────────────
# Step 1: hidden layer pre-activation
Z1 = X @ W1 + b1 # (N, H)
# Step 2: hidden layer activation (tanh)
A1 = np.tanh(Z1) # (N, H)
# Step 3: output layer (linear -- no activation)
Y_pred = A1 @ W2 + b2 # (N, 1)
# ── Loss ──────────────────────────────────────────────────
# Mean squared error
error = Y_pred - Y # (N, 1)
loss = np.mean(error ** 2)
losses.append(loss)
# ── Backpropagation ───────────────────────────────────────
# Compute gradients by applying the chain rule, working
# backward from the loss to each weight.
# Gradient of loss w.r.t. output
dL_dYpred = 2 * error / N # (N, 1)
# Gradients for output layer weights
dL_dW2 = A1.T @ dL_dYpred # (H, 1)
dL_db2 = np.sum(dL_dYpred, axis=0, keepdims=True) # (1, 1)
# Gradient flowing back through the hidden layer
dL_dA1 = dL_dYpred @ W2.T # (N, H)
# Derivative of tanh: d/dz tanh(z) = 1 - tanh(z)^2
dL_dZ1 = dL_dA1 * (1 - A1 ** 2) # (N, H)
# Gradients for hidden layer weights
dL_dW1 = X.T @ dL_dZ1 # (1, H)
dL_db1 = np.sum(dL_dZ1, axis=0, keepdims=True) # (1, H)
# ── Gradient descent ──────────────────────────────────────
# Update each weight in the direction that reduces the loss
W2 -= learning_rate * dL_dW2
b2 -= learning_rate * dL_db2
W1 -= learning_rate * dL_dW1
b1 -= learning_rate * dL_db1
if epoch % log_interval == 0 or epoch == epochs - 1:
print(f"Epoch {epoch:5d} Loss: {loss:.6f}")
# ── Results ────────────────────────────────────────────────────
# Predict on a fine grid for smooth plotting
T_fine = np.linspace(0, 1, 200).reshape(-1, 1)
A1_fine = np.tanh(T_fine @ W1 + b1)
Cp_pred_norm = A1_fine @ W2 + b2
# Convert back to physical units
T_fine_K = T_fine * (T_max - T_min) + T_min
Cp_pred = Cp_pred_norm * (Cp_max - Cp_min) + Cp_min
# ── Plot ───────────────────────────────────────────────────────
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Left: fit
ax1.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')
ax1.plot(T_fine_K, Cp_pred, 'r-', linewidth=2, label=f'NN ({H} neurons)')
ax1.set_xlabel('Temperature (K)')
ax1.set_ylabel('$C_p$ (kJ/kg/K)')
ax1.set_title('$C_p(T)$ for N$_2$ at 1 bar')
ax1.legend()
# Right: training loss
ax2.semilogy(losses)
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Mean Squared Error')
ax2.set_title('Training Loss')
plt.tight_layout()
plt.savefig('nn_fit.png', dpi=150)
plt.show()
print(f"\nFinal loss: {losses[-1]:.6f}")
print(f"Network: {1} input -> {H} hidden (tanh) -> {1} output")
print(f"Total parameters: {W1.size + b1.size + W2.size + b2.size}")


@ -0,0 +1,99 @@
# nn_torch.py
#
# The same neural network as nn_numpy.py, but using PyTorch.
# Compare this to the numpy version to see what the framework handles for you:
# - Automatic differentiation (no manual backprop)
# - Built-in optimizers (Adam instead of hand-coded gradient descent)
# - GPU support (if available)
#
# CHEG 667-013
# E. M. Furst
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
# ── Load and prepare data ──────────────────────────────────────
data = np.loadtxt("data/n2_cp.csv", delimiter=",", skiprows=1)
T_raw = data[:, 0]
Cp_raw = data[:, 1]
# Normalize to [0, 1]
T_min, T_max = T_raw.min(), T_raw.max()
Cp_min, Cp_max = Cp_raw.min(), Cp_raw.max()
X = torch.tensor((T_raw - T_min) / (T_max - T_min), dtype=torch.float32).reshape(-1, 1)
Y = torch.tensor((Cp_raw - Cp_min) / (Cp_max - Cp_min), dtype=torch.float32).reshape(-1, 1)
# ── Define the network ─────────────────────────────────────────
#
# nn.Sequential stacks layers in order. Compare this to nanoGPT's
# model.py, which uses the same PyTorch building blocks (nn.Linear,
# activation functions) but with many more layers.
H = 10 # hidden neurons
model = nn.Sequential(
nn.Linear(1, H), # input -> hidden (W1, b1)
nn.Tanh(), # activation
nn.Linear(H, 1), # hidden -> output (W2, b2)
)
print(f"Model:\n{model}")
print(f"Total parameters: {sum(p.numel() for p in model.parameters())}\n")
# ── Training ───────────────────────────────────────────────────
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
epochs = 5000
log_interval = 500
losses = []
for epoch in range(epochs):
# Forward pass -- PyTorch tracks operations for automatic differentiation
Y_pred = model(X)
loss = loss_fn(Y_pred, Y)
losses.append(loss.item())
# Backward pass -- PyTorch computes all gradients automatically
optimizer.zero_grad() # reset gradients from previous step
loss.backward() # compute gradients via automatic differentiation
optimizer.step() # update weights (Adam optimizer)
if epoch % log_interval == 0 or epoch == epochs - 1:
print(f"Epoch {epoch:5d} Loss: {loss.item():.6f}")
# ── Results ────────────────────────────────────────────────────
# Predict on a fine grid
T_fine = torch.linspace(0, 1, 200).reshape(-1, 1)
with torch.no_grad(): # no gradient tracking needed for inference
Cp_pred_norm = model(T_fine)
# Convert back to physical units
T_fine_K = T_fine.numpy() * (T_max - T_min) + T_min
Cp_pred = Cp_pred_norm.numpy() * (Cp_max - Cp_min) + Cp_min
# ── Plot ───────────────────────────────────────────────────────
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
ax1.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')
ax1.plot(T_fine_K, Cp_pred, 'r-', linewidth=2, label=f'NN ({H} neurons)')
ax1.set_xlabel('Temperature (K)')
ax1.set_ylabel('$C_p$ (kJ/kg/K)')
ax1.set_title('$C_p(T)$ for N$_2$ at 1 bar — PyTorch')
ax1.legend()
ax2.semilogy(losses)
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Mean Squared Error')
ax2.set_title('Training Loss')
plt.tight_layout()
plt.savefig('nn_fit_torch.png', dpi=150)
plt.show()


@ -0,0 +1,137 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "xbsmj1hcj1g",
"source": "# Building a Neural Network: $C_p(T)$ for Nitrogen\n\n**CHEG 667-013 — LLMs for Engineers**\n\nIn this notebook we fit the heat capacity of N₂ gas using three approaches:\n1. A polynomial fit (the classical approach)\n2. A neural network built from scratch in numpy\n3. The same network in PyTorch\n\nThis makes the ML concepts behind LLMs — weights, loss, gradient descent, overfitting — concrete and tangible.",
"metadata": {}
},
{
"cell_type": "markdown",
"id": "szrl41l3xbq",
"source": "## 1. Load and plot the data\n\nThe data is from the [NIST Chemistry WebBook](https://webbook.nist.gov/): isobaric heat capacity of N₂ at 1 bar, 3002000 K.",
"metadata": {}
},
{
"cell_type": "code",
"id": "t4lqkcoeyil",
"source": "import numpy as np\nimport matplotlib.pyplot as plt\n\ndata = np.loadtxt(\"data/n2_cp.csv\", delimiter=\",\", skiprows=1)\nT_raw = data[:, 0] # Temperature (K)\nCp_raw = data[:, 1] # Cp (kJ/kg/K)\n\nplt.figure(figsize=(8, 5))\nplt.plot(T_raw, Cp_raw, 'ko', markersize=6)\nplt.xlabel('Temperature (K)')\nplt.ylabel('$C_p$ (kJ/kg/K)')\nplt.title('$C_p(T)$ for N$_2$ at 1 bar — NIST WebBook')\nplt.show()\n\nprint(f\"{len(T_raw)} data points, T range: {T_raw.min():.0f} {T_raw.max():.0f} K\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "1jyrgsvp7op",
"source": "## 2. Polynomial fit (baseline)\n\nTextbooks fit $C_p(T)$ with a polynomial: $C_p = a + bT + cT^2 + dT^3$. This is a **4-parameter** model. Let's fit it with `numpy.polyfit` and see how well it does.",
"metadata": {}
},
{
"cell_type": "code",
"id": "4smvu4z2oro",
"source": "# Fit a cubic polynomial\ncoeffs = np.polyfit(T_raw, Cp_raw, 3)\npoly = np.poly1d(coeffs)\n\nT_fine = np.linspace(T_raw.min(), T_raw.max(), 200)\nCp_poly = poly(T_fine)\n\n# Compute residuals\nCp_poly_at_data = poly(T_raw)\nmse_poly = np.mean((Cp_poly_at_data - Cp_raw) ** 2)\n\nplt.figure(figsize=(8, 5))\nplt.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')\nplt.plot(T_fine, Cp_poly, 'b-', linewidth=2, label=f'Cubic polynomial (4 params)')\nplt.xlabel('Temperature (K)')\nplt.ylabel('$C_p$ (kJ/kg/K)')\nplt.title('Polynomial fit')\nplt.legend()\nplt.show()\n\nprint(f\"Polynomial coefficients: {coeffs}\")\nprint(f\"MSE: {mse_poly:.8f}\")\nprint(f\"Parameters: 4\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "97y7mrcekji",
"source": "## 3. Neural network from scratch (numpy)\n\nNow let's build a one-hidden-layer neural network. The architecture:\n\n```\nInput (1: T) → Hidden (10 neurons, tanh) → Output (1: Cp)\n```\n\nWe need to:\n1. **Normalize** the data to [0, 1] so the network trains efficiently\n2. **Forward pass**: compute predictions from input through each layer\n3. **Loss**: mean squared error between predictions and data\n4. **Backpropagation**: compute gradients of the loss w.r.t. each weight using the chain rule\n5. **Gradient descent**: update weights in the direction that reduces the loss\n\nThis is exactly what nanoGPT's `train.py` does — just at a much larger scale.",
"metadata": {}
},
{
"cell_type": "code",
"id": "365o7bqbwkr",
"source": "# Normalize inputs and outputs to [0, 1]\nT_min, T_max = T_raw.min(), T_raw.max()\nCp_min, Cp_max = Cp_raw.min(), Cp_raw.max()\n\nT = (T_raw - T_min) / (T_max - T_min)\nCp = (Cp_raw - Cp_min) / (Cp_max - Cp_min)\n\nX = T.reshape(-1, 1) # (N, 1) input matrix\nY = Cp.reshape(-1, 1) # (N, 1) target matrix\nN = X.shape[0]\n\n# Network setup\nH = 10 # hidden neurons\n\nnp.random.seed(42)\nW1 = np.random.randn(1, H) * 0.5 # input → hidden weights\nb1 = np.zeros((1, H)) # hidden biases\nW2 = np.random.randn(H, 1) * 0.5 # hidden → output weights\nb2 = np.zeros((1, 1)) # output bias\n\nprint(f\"Parameters: W1({W1.shape}) + b1({b1.shape}) + W2({W2.shape}) + b2({b2.shape})\")\nprint(f\"Total: {W1.size + b1.size + W2.size + b2.size} parameters for {N} data points\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"id": "5w1ezs9t2w6",
"source": "# Training loop\nlearning_rate = 0.01\nepochs = 5000\nlog_interval = 500\nlosses_np = []\n\nfor epoch in range(epochs):\n # Forward pass\n Z1 = X @ W1 + b1 # hidden pre-activation (N, H)\n A1 = np.tanh(Z1) # hidden activation (N, H)\n Y_pred = A1 @ W2 + b2 # output (N, 1)\n\n # Loss (mean squared error)\n error = Y_pred - Y\n loss = np.mean(error ** 2)\n losses_np.append(loss)\n\n # Backpropagation (chain rule, working backward)\n dL_dYpred = 2 * error / N\n dL_dW2 = A1.T @ dL_dYpred\n dL_db2 = np.sum(dL_dYpred, axis=0, keepdims=True)\n dL_dA1 = dL_dYpred @ W2.T\n dL_dZ1 = dL_dA1 * (1 - A1 ** 2) # tanh derivative\n dL_dW1 = X.T @ dL_dZ1\n dL_db1 = np.sum(dL_dZ1, axis=0, keepdims=True)\n\n # Gradient descent update\n W2 -= learning_rate * dL_dW2\n b2 -= learning_rate * dL_db2\n W1 -= learning_rate * dL_dW1\n b1 -= learning_rate * dL_db1\n\n if epoch % log_interval == 0 or epoch == epochs - 1:\n print(f\"Epoch {epoch:5d} Loss: {loss:.6f}\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"id": "onel9r0kjk",
"source": "# Predict on a fine grid and convert back to physical units\nT_fine_norm = np.linspace(0, 1, 200).reshape(-1, 1)\nA1_fine = np.tanh(T_fine_norm @ W1 + b1)\nCp_nn_norm = A1_fine @ W2 + b2\nCp_nn = Cp_nn_norm * (Cp_max - Cp_min) + Cp_min\nT_fine_K = T_fine_norm * (T_max - T_min) + T_min\n\n# MSE in original units for comparison with polynomial\nCp_nn_at_data = np.tanh(X @ W1 + b1) @ W2 + b2\nCp_nn_at_data = Cp_nn_at_data * (Cp_max - Cp_min) + Cp_min\nmse_nn = np.mean((Cp_nn_at_data.flatten() - Cp_raw) ** 2)\n\nfig, (ax1, ax2) = plt.subplots(1, 2, figsize=(13, 5))\n\nax1.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')\nax1.plot(T_fine, Cp_poly, 'b-', linewidth=2, label=f'Polynomial (4 params, MSE={mse_poly:.2e})')\nax1.plot(T_fine_K.flatten(), Cp_nn.flatten(), 'r-', linewidth=2, label=f'NN numpy (31 params, MSE={mse_nn:.2e})')\nax1.set_xlabel('Temperature (K)')\nax1.set_ylabel('$C_p$ (kJ/kg/K)')\nax1.set_title('$C_p(T)$ for N$_2$ at 1 bar')\nax1.legend()\n\nax2.semilogy(losses_np)\nax2.set_xlabel('Epoch')\nax2.set_ylabel('MSE (normalized)')\nax2.set_title('Training loss — numpy NN')\n\nplt.tight_layout()\nplt.show()",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "ea9z35qm9u8",
"source": "## 4. Neural network in PyTorch\n\nThe same network, but PyTorch handles backpropagation automatically. Compare the training loop above to the one below — `loss.backward()` replaces all of our manual gradient calculations.\n\nThis is the same API used in nanoGPT's `model.py` — `nn.Linear`, activation functions, `optimizer.step()`.",
"metadata": {}
},
{
"cell_type": "code",
"id": "3qxnrtyxqgz",
"source": "import torch\nimport torch.nn as nn\n\n# Prepare data as PyTorch tensors\nX_t = torch.tensor((T_raw - T_min) / (T_max - T_min), dtype=torch.float32).reshape(-1, 1)\nY_t = torch.tensor((Cp_raw - Cp_min) / (Cp_max - Cp_min), dtype=torch.float32).reshape(-1, 1)\n\n# Define the network\nmodel = nn.Sequential(\n nn.Linear(1, H), # input → hidden (W1, b1)\n nn.Tanh(), # activation\n nn.Linear(H, 1), # hidden → output (W2, b2)\n)\n\nprint(model)\nprint(f\"Total parameters: {sum(p.numel() for p in model.parameters())}\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"id": "ydl3ycnypps",
"source": "# Train\noptimizer = torch.optim.Adam(model.parameters(), lr=0.01)\nloss_fn = nn.MSELoss()\nlosses_torch = []\n\nfor epoch in range(epochs):\n Y_pred_t = model(X_t)\n loss = loss_fn(Y_pred_t, Y_t)\n losses_torch.append(loss.item())\n\n optimizer.zero_grad() # reset gradients\n loss.backward() # automatic differentiation\n optimizer.step() # update weights\n\n if epoch % log_interval == 0 or epoch == epochs - 1:\n print(f\"Epoch {epoch:5d} Loss: {loss.item():.6f}\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "bg0kvnk4ho",
"source": "## 5. Compare all three approaches",
"metadata": {}
},
{
"cell_type": "code",
"id": "h2dfstoh8gd",
"source": "# PyTorch predictions\nT_fine_t = torch.linspace(0, 1, 200).reshape(-1, 1)\nwith torch.no_grad():\n Cp_torch_norm = model(T_fine_t)\nCp_torch = Cp_torch_norm.numpy() * (Cp_max - Cp_min) + Cp_min\n\n# MSE for PyTorch model\nwith torch.no_grad():\n Cp_torch_at_data = model(X_t).numpy() * (Cp_max - Cp_min) + Cp_min\nmse_torch = np.mean((Cp_torch_at_data.flatten() - Cp_raw) ** 2)\n\nfig, (ax1, ax2) = plt.subplots(1, 2, figsize=(13, 5))\n\n# Left: all three fits\nax1.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')\nax1.plot(T_fine, Cp_poly, 'b-', linewidth=2, label=f'Polynomial (4 params)')\nax1.plot(T_fine_K.flatten(), Cp_nn.flatten(), 'r--', linewidth=2, label=f'NN numpy (31 params)')\nax1.plot(T_fine_K.flatten(), Cp_torch.flatten(), 'g-', linewidth=2, alpha=0.8, label=f'NN PyTorch (31 params)')\nax1.set_xlabel('Temperature (K)')\nax1.set_ylabel('$C_p$ (kJ/kg/K)')\nax1.set_title('$C_p(T)$ for N$_2$ at 1 bar')\nax1.legend()\n\n# Right: training loss comparison\nax2.semilogy(losses_np, label='numpy (gradient descent)')\nax2.semilogy(losses_torch, label='PyTorch (Adam)')\nax2.set_xlabel('Epoch')\nax2.set_ylabel('MSE (normalized)')\nax2.set_title('Training loss comparison')\nax2.legend()\n\nplt.tight_layout()\nplt.show()\n\nprint(f\"MSE — Polynomial: {mse_poly:.2e} | NN numpy: {mse_nn:.2e} | NN PyTorch: {mse_torch:.2e}\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "xyw3sr20brn",
"source": "## 6. Extrapolation\n\nHow do the models behave *outside* the training range? This is a key test — and where the differences become stark.",
"metadata": {}
},
{
"cell_type": "code",
"id": "fi3iq2sjh6",
"source": "# Extrapolate beyond the training range\nT_extrap = np.linspace(100, 2500, 300)\nT_extrap_norm = ((T_extrap - T_min) / (T_max - T_min)).reshape(-1, 1)\n\n# Polynomial extrapolation\nCp_poly_extrap = poly(T_extrap)\n\n# Numpy NN extrapolation\nA1_extrap = np.tanh(T_extrap_norm @ W1 + b1)\nCp_nn_extrap = (A1_extrap @ W2 + b2) * (Cp_max - Cp_min) + Cp_min\n\n# PyTorch NN extrapolation\nwith torch.no_grad():\n Cp_torch_extrap = model(torch.tensor(T_extrap_norm, dtype=torch.float32)).numpy()\nCp_torch_extrap = Cp_torch_extrap * (Cp_max - Cp_min) + Cp_min\n\nplt.figure(figsize=(10, 6))\nplt.plot(T_raw, Cp_raw, 'ko', markersize=6, label='NIST data')\nplt.plot(T_extrap, Cp_poly_extrap, 'b-', linewidth=2, label='Polynomial')\nplt.plot(T_extrap, Cp_nn_extrap.flatten(), 'r--', linewidth=2, label='NN numpy')\nplt.plot(T_extrap, Cp_torch_extrap.flatten(), 'g-', linewidth=2, alpha=0.8, label='NN PyTorch')\nplt.axvline(T_raw.min(), color='gray', linestyle=':', alpha=0.5, label='Training range')\nplt.axvline(T_raw.max(), color='gray', linestyle=':', alpha=0.5)\nplt.xlabel('Temperature (K)')\nplt.ylabel('$C_p$ (kJ/kg/K)')\nplt.title('Extrapolation beyond training data')\nplt.legend()\nplt.show()",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "yb2s18keiw",
"source": "## 7. Exercises\n\nTry these in new cells below:\n\n1. **Change the number of hidden neurons** (`H`). Try 2, 5, 20, 50. How does the fit change? At what point does adding neurons stop helping?\n\n2. **Activation functions**: In the PyTorch model, replace `nn.Tanh()` with `nn.ReLU()` or `nn.Sigmoid()`. How does the fit change?\n\n3. **Optimizer comparison**: Replace `Adam` with `torch.optim.SGD(model.parameters(), lr=0.01)`. How does training speed compare?\n\n4. **Remove normalization**: Use `T_raw` and `Cp_raw` directly (no scaling to [0,1]). What happens? Can you fix it by adjusting the learning rate?\n\n5. **Overfitting**: Set `H = 100` and train for 20,000 epochs. Does it fit the training data well? Look at the extrapolation — is it reasonable?\n\n6. **Higher-order polynomial**: Try `np.polyfit(T_raw, Cp_raw, 10)`. How does it compare to the cubic? How does it extrapolate?",
"metadata": {}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.12.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

21
LICENSE Normal file
View file

@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025-2026 Eric M. Furst, University of Delaware

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

56
README.md Normal file
View file

@ -0,0 +1,56 @@
# LLMs for Engineers

**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware

A hands-on workshop on Large Language Models and machine learning for engineers. Learn how to train a GPT from scratch, run local models, and build retrieval-augmented generation systems, then tie it all back to the underlying machine learning methods by implementing a simple neural network.
## Sections

| # | Topic | Description |
|---|-------|-------------|
| [01](01-nanogpt/) | **nanoGPT** | Train a small transformer on Shakespeare. Explore model parameters, temperature, and text generation. |
| [02](02-ollama/) | **Local models with Ollama** | Run pre-trained LLMs locally. Summarize documents, query arXiv, generate code, build custom models. |
| [03](03-rag/) | **Retrieval-Augmented Generation** | Build a RAG system: chunk documents, embed them, and query with an LLM grounded in your own data. |
| [04](04-semantic-search/) | **Advanced retrieval** | Build hybrid BM25 + vector search with cross-encoder re-ranking. Compare summarization with raw retrieval. |
| [05](05-neural-networks/) | **Building a neural network** | Implement a one-hidden-layer network from scratch in numpy, then in PyTorch. Fit $C_p(T)$ data for N₂. |
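
The one-hidden-layer network of section 05 is compact enough to sketch here. This is an illustrative forward pass only — random, untrained weights and variable names of my choosing, not the workshop's code. With 10 hidden neurons it has 10 + 10 + 10 + 1 = 31 trainable parameters:

```python
import numpy as np

# Forward pass of a one-hidden-layer network: x -> tanh(x W1 + b1) W2 + b2
# Weights here are random; in section 05 they are fit to Cp(T) data.
rng = np.random.default_rng(0)
H = 10                                            # hidden neurons
W1, b1 = rng.normal(size=(1, H)), np.zeros(H)     # input -> hidden
W2, b2 = rng.normal(size=(H, 1)), np.zeros(1)     # hidden -> output

def forward(x):
    """Map an (n, 1) input column to an (n, 1) prediction."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

x = np.linspace(0.0, 1.0, 5).reshape(-1, 1)       # e.g. normalized temperatures
print(forward(x).shape)                           # prints (5, 1)
```

Counting entries of `W1`, `b1`, `W2`, and `b2` gives 3H + 1 parameters, which is where the 31 comes from; training is then just adjusting those numbers to minimize the squared error against data.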
## Prerequisites

- A terminal (macOS/Linux, or WSL on Windows)
- Python 3.10+
- Basic comfort with the command line
- [Ollama](https://ollama.com) (sections 02–04)
## Getting started

Clone this repository and work through each section in order:

```bash
git clone https://lem.che.udel.edu/git/furst/llm-workshop.git
cd llm-workshop
```

Each section has its own `README.md` with a full walkthrough, exercises, and any code or data needed.
### Python environment

Create a virtual environment once and reuse it across sections:

```bash
python3 -m venv llm
source llm/bin/activate
pip install numpy torch matplotlib
```

Sections 03 and 04 have additional dependencies listed in their `requirements.txt` files.
## License

MIT

## Author

Eric M. Furst, University of Delaware