# LLMs for Engineers
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware
A hands-on workshop on Large Language Models and machine learning for engineers. Learn how to train a GPT from scratch, run local models, and build retrieval-augmented generation systems, then tie it all back to the underlying machine learning methods by implementing a simple neural network.
## Sections
| # | Topic | Description |
|---|-------|-------------|
| [01](01-nanogpt/) | **nanoGPT** | Train a small transformer on Shakespeare. Explore model parameters, temperature, and text generation. |
| [02](02-ollama/) | **Local models with Ollama** | Run pre-trained LLMs locally. Summarize documents, query arXiv, generate code, build custom models. |
| [03](03-rag/) | **Retrieval-Augmented Generation** | Build a RAG system: chunk documents, embed them, and query with an LLM grounded in your own data. |
| [04](04-semantic-search/) | **Advanced retrieval** | Combine hybrid BM25 + vector search with cross-encoder re-ranking. Compare summarization against raw retrieval. |
| [05](05-neural-networks/) | **Building a neural network** | Implement a one-hidden-layer network from scratch in numpy, then in PyTorch. Fits $C_p(T)$ data for N₂. |
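To preview the idea behind section 05, here is a minimal sketch (not the workshop's own code) of a one-hidden-layer network trained by plain gradient descent in numpy; it fits $\sin(x)$ as a stand-in for the $C_p(T)$ data, and all sizes and learning rates are illustrative:

```python
import numpy as np

# Illustrative one-hidden-layer network: tanh hidden layer, linear output,
# trained by full-batch gradient descent on a mean-squared-error loss.
rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x)  # stand-in for heat-capacity data

n_hidden = 16
W1 = rng.normal(0.0, 0.5, (1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
b2 = np.zeros(1)
lr = 0.05

for step in range(3000):
    # Forward pass
    h = np.tanh(x @ W1 + b1)
    y_hat = h @ W2 + b2
    err = y_hat - y
    loss = np.mean(err**2)

    # Backward pass: gradients of the MSE loss
    grad_out = 2 * err / len(x)              # dL/dy_hat
    gW2 = h.T @ grad_out
    gb2 = grad_out.sum(axis=0)
    grad_h = (grad_out @ W2.T) * (1 - h**2)  # tanh' = 1 - tanh^2
    gW1 = x.T @ grad_h
    gb1 = grad_h.sum(axis=0)

    # Gradient-descent update
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

print(f"final MSE: {loss:.4f}")
```

Section 05 then repeats the same exercise in PyTorch, where autograd replaces the hand-derived backward pass.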
## Prerequisites
- A terminal (macOS/Linux, or WSL on Windows)
- Python 3.10+
- Basic comfort with the command line
- [Ollama](https://ollama.com) (sections 02–04)
## Getting started
Clone this repository and work through each section in order:
```bash
git clone https://lem.che.udel.edu/git/furst/llm-workshop.git
cd llm-workshop
```
Each section has its own `README.md` with a full walkthrough, exercises, and any code or data needed.
### Python environment
Install [uv](https://docs.astral.sh/uv/getting-started/installation/) (a fast Python package manager), then:
```bash
uv sync
```
This creates a `.venv/` virtual environment and installs all dependencies from the lock file.
**Note:** On Apple Silicon Macs, PyTorch GPU acceleration (MPS) works out of the box. On NVIDIA GPU machines, the default `uv sync` install may be CPU-only and you need to reinstall with CUDA support. See [PYTORCH.md](01-nanogpt/PYTORCH.md) for troubleshooting and device-specific instructions.
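To see which device your PyTorch install will actually use, a quick check along these lines works on any machine (run it with `uv run python` from the repo root):

```python
import torch

# Pick the best available device: CUDA (NVIDIA), MPS (Apple Silicon), else CPU.
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"

print(f"PyTorch {torch.__version__} using device: {device}")
```

If this prints `cpu` on an NVIDIA machine, you likely have the CPU-only wheel and should follow the reinstall steps in PYTORCH.md.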
`cd` into the section directory before running scripts or notebooks, since they reference local data files:
```bash
cd 05-neural-networks
uv run python nn_torch.py
```
Or activate the environment and run directly:
```bash
source .venv/bin/activate
cd 05-neural-networks
python nn_torch.py
```
## License
MIT
## Author
Eric M. Furst, University of Delaware