Add uv for dependency management and update workshop materials

This commit is contained in:
Eric 2026-03-31 12:03:34 -04:00
commit 7e4f0fb80b
6 changed files with 4122 additions and 53 deletions


@@ -62,7 +62,7 @@ The curve is smooth and nonlinear — $C_p$ increases with temperature as molecu
Our network has three layers:
```
-Input (1 neuron: T) → Hidden (10 neurons) → Output (1 neuron: Cp)
+Input (1 neuron: T) -> Hidden (10 neurons) -> Output (1 neuron: Cp)
```
Here's what happens at each step:
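The per-step walkthrough is elided in this hunk, but the forward pass it describes can be sketched directly from the architecture above. This is a minimal sketch, not the workshop's actual code: the names `W1`, `b1`, `W2`, `b2` match the parameter list later in the document, while the `tanh` hidden activation is an assumption.

```python
import numpy as np

# Hypothetical forward pass for the 1 -> 10 -> 1 network above.
# The tanh hidden activation is an assumption; the output is linear,
# as the document states.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(1, 10))   # input -> hidden weights
b1 = np.zeros(10)               # hidden biases
W2 = rng.normal(size=(10, 1))   # hidden -> output weights
b2 = np.zeros(1)                # output bias

def forward(T):
    h = np.tanh(T @ W1 + b1)    # hidden layer: affine transform + nonlinearity
    return h @ W2 + b2          # linear output: predicted Cp, no activation

T = np.array([[300.0]])         # one temperature sample
print(forward(T).shape)         # (1, 1): one Cp prediction
```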
@@ -84,9 +84,9 @@ This is a linear combination — no activation on the output, since we want to p
### Counting parameters
With 10 hidden neurons:
-- `W1`: 10 weights (input → hidden)
+- `W1`: 10 weights (input -> hidden)
- `b1`: 10 biases (hidden)
-- `W2`: 10 weights (hidden → output)
+- `W2`: 10 weights (hidden -> output)
- `b2`: 1 bias (output)
- **Total: 31 parameters**
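The arithmetic behind that total is worth making explicit: each parameter tensor's size is the product of its dimensions, and the total is their sum.

```python
# Parameter count for the 1 -> 10 -> 1 network, tensor by tensor.
# Shapes follow the list above.
n_params = (1 * 10   # W1: input -> hidden weights
            + 10     # b1: hidden biases
            + 10 * 1 # W2: hidden -> output weights
            + 1)     # b2: output bias
print(n_params)      # 31
```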
@@ -123,7 +123,7 @@ $$w \leftarrow w - \eta \cdot \frac{\partial L}{\partial w}$$
where $\eta$ is the **learning rate** — a small number (0.01 in our code) that controls how big each step is. Too large and training oscillates; too small and it's painfully slow.
-One full pass through these three steps (forward → loss → backward → update) is one **epoch**. We train for 5000 epochs.
+One full pass through these three steps (forward -> loss -> backward -> update) is one **epoch**. We train for 5000 epochs.
In nanoGPT, the training loop in `train.py` does exactly the same thing, but with the AdamW optimizer (a fancier version of gradient descent) and batches of data instead of the full dataset.
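The full loop described above (forward pass, MSE loss, hand-derived backward pass, gradient-descent update with $\eta = 0.01$, 5000 epochs) can be sketched end to end. This is an illustrative sketch only: the variable names, the `tanh` hidden activation, and the toy synthetic $C_p(T)$ data are assumptions, not the workshop's code.

```python
import numpy as np

# Toy data standing in for the Cp(T) curve: smooth and nonlinear.
rng = np.random.default_rng(0)
T = np.linspace(0.0, 1.0, 50).reshape(-1, 1)   # normalized temperatures
Cp = 1.0 + 0.5 * T**2                          # assumed synthetic target

W1 = rng.normal(scale=0.5, size=(1, 10)); b1 = np.zeros(10)
W2 = rng.normal(scale=0.5, size=(10, 1)); b2 = np.zeros(1)
eta = 0.01                                     # learning rate from the text

for epoch in range(5000):
    # forward pass
    z1 = T @ W1 + b1
    h = np.tanh(z1)                            # assumed hidden activation
    pred = h @ W2 + b2                         # linear output
    # loss: mean squared error over the full dataset
    err = pred - Cp
    loss = np.mean(err**2)
    # backward pass: chain rule, by hand
    dpred = 2 * err / len(T)
    dW2 = h.T @ dpred; db2 = dpred.sum(axis=0)
    dh = dpred @ W2.T
    dz1 = dh * (1 - h**2)                      # derivative of tanh
    dW1 = T.T @ dz1; db1 = dz1.sum(axis=0)
    # update: w <- w - eta * dL/dw
    W1 -= eta * dW1; b1 -= eta * db1
    W2 -= eta * dW2; b2 -= eta * db2

print(f"final loss: {loss:.6f}")
```

nanoGPT's `train.py` swaps plain gradient descent for AdamW and the full dataset for mini-batches, but the four steps per iteration are the same.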