Add uv for dependency management and update workshop materials
Our network has three layers:

```
Input (1 neuron: T) → Hidden (10 neurons) → Output (1 neuron: Cp)
```

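A minimal NumPy sketch of this architecture (the `tanh` hidden activation and the initialization scale are illustrative assumptions, not necessarily what the workshop code uses):

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameters for the 1 -> 10 -> 1 network
W1 = 0.1 * rng.standard_normal((1, 10))  # input -> hidden weights
b1 = np.zeros(10)                        # hidden biases
W2 = 0.1 * rng.standard_normal((10, 1))  # hidden -> output weights
b2 = np.zeros(1)                         # output bias

def forward(T):
    """T: temperatures, shape (n, 1). Returns predicted Cp, shape (n, 1)."""
    h = np.tanh(T @ W1 + b1)  # hidden layer (tanh is an assumed activation)
    return h @ W2 + b2        # linear output: no activation on the final layer

Cp_pred = forward(np.array([[300.0]]))
```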
Here's what happens at each step:
### Counting parameters
With 10 hidden neurons:

- `W1`: 10 weights (input → hidden)
- `b1`: 10 biases (hidden)
- `W2`: 10 weights (hidden → output)
- `b2`: 1 bias (output)
- **Total: 31 parameters**
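The total can be checked with one line of arithmetic:

```python
n_in, n_hidden, n_out = 1, 10, 1

# weights + biases for each layer: W1, b1, W2, b2
n_params = n_in * n_hidden + n_hidden + n_hidden * n_out + n_out
print(n_params)  # 31
```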
$$w \leftarrow w - \eta \cdot \frac{\partial L}{\partial w}$$
where $\eta$ is the **learning rate** — a small number (0.01 in our code) that controls how big each step is. Too large and training oscillates; too small and it's painfully slow.
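A toy scalar version of the update rule; the quadratic loss and the starting point are invented for illustration, only the learning rate comes from the text:

```python
eta = 0.01          # learning rate, as in the text

def dL_dw(w):
    # derivative of the toy loss L(w) = (w - 2)^2, minimized at w = 2
    return 2.0 * (w - 2.0)

w = 5.0             # arbitrary starting point
for _ in range(200):
    w = w - eta * dL_dw(w)   # w <- w - eta * dL/dw
# w has moved most of the way from 5 toward the minimum at 2
```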
One full pass through these steps (forward → loss → backward → update) is one **epoch**. We train for 5000 epochs.

In nanoGPT, the training loop in `train.py` does exactly the same thing, but with the AdamW optimizer (a fancier version of gradient descent) and batches of data instead of the full dataset.
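A full-batch version of that loop (forward, loss, backward, update, repeated for 5000 epochs) might be sketched as follows; the synthetic data, `tanh` activation, and initialization are illustrative assumptions rather than the workshop's actual dataset or code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: a smooth, increasing Cp(T) on a normalized T axis
T = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
Cp = 1.0 + 2.0 * T**2

W1 = 0.5 * rng.standard_normal((1, 10))
b1 = np.zeros(10)
W2 = 0.5 * rng.standard_normal((10, 1))
b2 = np.zeros(1)
eta = 0.01  # learning rate from the text

for epoch in range(5000):
    # forward pass
    z1 = T @ W1 + b1
    h = np.tanh(z1)
    pred = h @ W2 + b2
    # loss: mean squared error
    err = pred - Cp
    loss = np.mean(err**2)
    # backward pass: hand-derived gradients of the MSE
    dpred = 2.0 * err / len(T)
    dW2 = h.T @ dpred
    db2 = dpred.sum(axis=0)
    dh = dpred @ W2.T
    dz1 = dh * (1.0 - h**2)      # tanh'(z) = 1 - tanh(z)^2
    dW1 = T.T @ dz1
    db1 = dz1.sum(axis=0)
    # update: gradient descent step on every parameter
    W1 -= eta * dW1; b1 -= eta * db1
    W2 -= eta * dW2; b2 -= eta * db2
```

Unlike nanoGPT's loop, every epoch here sees the entire dataset at once, which is only practical because the dataset is tiny.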