Eric Furst 0c6e919bdd Initial commit: computing-setup

A two-module standalone guide for setting up a new machine for
scientific computing work:

- 01-know-your-machine: hardware and OS inspection. Reads the
  physical machine first via macOS/Linux terminals or Windows
  PowerShell; a separate section walks through the WSL VM and
  how its allocations differ from the host hardware.
- 02-git-basics: pull-focused git workflow. Install, configure
  identity, clone a public repo, pull updates. Authentication
  and pushing are deferred to a future collaboration module.

Includes top-level WSL.md (copied from cli-walkthrough) for
Windows users who need the Linux environment.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-28 10:09:13 -04:00


Know Your Machine

Key idea

Understand the basic hardware and software of the computer you are working on.

Key goals

Identify your operating system, CPU, RAM, storage, and GPU
Understand what these components do and why they matter for computing tasks
Learn commands to query your system on macOS, Linux, and Windows

Read your physical machine first. Sections 1–6 walk through inspecting the actual hardware you own. On macOS and Linux, the terminal reports directly from the hardware. On Windows, use PowerShell (or the Settings GUI) — those readings come straight from the real machine.

Then visit the WSL VM separately (Section 7). If you are on Windows and have WSL installed, your Linux environment is a virtual machine that sees only what it has been allocated. Section 7 covers how to inspect that and why it differs from the physical machine. If you do not yet have WSL installed, see ../WSL.md.

As engineers, we should know our tools. You would not run a reactor without knowing its volume, pressure rating, and materials of construction. The same principle applies to computing: before we write code, train models, or analyze data, we should understand the machine we are working on.

This module is a hands-on survey. Run the commands below on your own machine and record what you find. By the end, you should be able to answer: What is my computer, and what can it do?

1. Operating system

Your operating system (OS) manages the hardware and provides the environment where all your programs run. The three major OS families are:

macOS -- Apple's OS, based on Unix (Darwin kernel). Runs on Intel and Apple Silicon (M1/M2/M3/M4) hardware. Closely related to iOS, watchOS, and other Apple systems. (These are all, in fact, computers!)
Linux -- Open-source Unix-like OS. Many distributions exist (Ubuntu, Fedora, etc.). Common on servers, clusters, and in WSL.
Windows -- Microsoft's OS. For terminal-based work, we recommend the Windows Subsystem for Linux (WSL) to access a Unix environment.

Find your OS version

macOS: In the terminal, use the command

sw_vers

macOS (GUI): Apple menu > About This Mac. Shows the macOS version, chip (e.g., Apple M3), and memory.

Linux:

cat /etc/os-release
uname -a

The uname -a command shows the kernel version and architecture. You will see something like x86_64 (Intel/AMD) or aarch64/arm64 (ARM).

Windows (PowerShell):

Get-ComputerInfo | Select-Object OsName, OsVersion, OsArchitecture

This tells you the Windows version and architecture of the physical machine.

Windows (GUI): Settings > System > About. Shows the edition (Home, Pro), version, and processor.
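The same readings are also available from Python's standard library, which can be convenient once you work across several machines. The sketch below uses the `platform` module. The `normalize_arch` helper and its alias table are illustrative assumptions, not part of any tool referenced in this module; they map the different spellings you may see (`AMD64` on Windows, `aarch64` on Linux) onto the two family names used here.

```python
import platform

# Hypothetical alias table: different OSes spell the same architecture differently.
ARCH_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def normalize_arch(machine: str) -> str:
    """Map a raw machine string onto x86_64 or arm64; return it unchanged if unknown."""
    return ARCH_ALIASES.get(machine.lower(), machine)


def os_summary() -> dict:
    """Collect the OS name, release, and architecture as seen by the interpreter."""
    return {
        "system": platform.system(),    # e.g. Darwin, Linux, Windows
        "release": platform.release(),  # kernel or OS release string
        "arch": normalize_arch(platform.machine()),
    }


if __name__ == "__main__":
    print(os_summary())
```

Note that this reports the interpreter's view: an x86_64 Python build running under emulation on ARM hardware will likely report x86_64, so treat the terminal commands above as the primary reading.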

Exercise 1: Run the commands above. What OS and version are you running? What architecture?

2. CPU (processor)

The CPU (Central Processing Unit) executes your code. Key properties:

Architecture: x86_64 (Intel/AMD) or arm64 (Apple Silicon, some Windows laptops). This affects which software binaries you can run.
Cores: Modern CPUs have multiple cores that can work in parallel. More cores help with parallel tasks (compiling, running simulations, some ML training).
Clock speed: Measured in GHz. Higher is faster for single-threaded tasks, but clock speed alone does not tell the whole story.

Find your CPU

macOS:

sysctl -n machdep.cpu.brand_string
sysctl -n hw.ncpu

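The first command shows the CPU model. The second shows the number of logical CPUs. On Apple Silicon that is the total core count (performance plus efficiency cores); on Intel Macs with Hyper-Threading it is higher than the physical core count, which sysctl -n hw.physicalcpu reports.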

macOS (GUI): Apple menu > About This Mac shows the chip (e.g., "Apple M3 Pro"). For core count, open Activity Monitor > CPU tab or run the command above.

Linux:

lscpu

This shows the CPU model, architecture, number of cores, and clock speed.

Windows (PowerShell):

Get-CimInstance Win32_Processor | Select-Object Name, NumberOfCores, NumberOfLogicalProcessors, MaxClockSpeed

Windows (GUI): Settings > System > About shows the processor name. For more detail, open Task Manager (Ctrl+Shift+Esc) > Performance > CPU. This shows cores, logical processors, and clock speed.
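The PowerShell command above reports both cores and logical processors, and the two can differ. As a rough cross-platform check, Python's `os` module reports the logical count. The sketch below is one way to read it; the function names are made up for illustration, and the affinity call is only available on some platforms (Linux, but not macOS or Windows), so the code guards for that.

```python
import os


def logical_cpu_count() -> int:
    """Return the number of logical CPUs the OS reports (at least 1)."""
    # os.cpu_count() can return None when the count cannot be determined.
    return os.cpu_count() or 1


def usable_cpu_count() -> int:
    """Return the CPUs this process may run on, where the OS exposes that."""
    # os.sched_getaffinity exists on Linux but not on macOS or Windows.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return logical_cpu_count()


if __name__ == "__main__":
    print("logical CPUs:", logical_cpu_count())
    print("usable by this process:", usable_cpu_count())
```

Neither number is the physical core count; for that, rely on lscpu, sysctl, or the NumberOfCores column from PowerShell.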

Why it matters

Heavy numerical work (simulations, data processing, training machine-learning models) runs faster with more cores and higher clock speed. Even so, for highly parallel tasks like neural network training a CPU is typically much slower than a GPU, often by an order of magnitude or more, which is why we also look at GPUs below.

Exercise 2: What CPU does your machine have? How many cores? What architecture?

3. RAM (memory)

RAM (Random Access Memory) is your computer's short-term working space. When you open a program, load a dataset, or run a model, the data lives in RAM. Key things to know:

RAM is volatile: it is erased when you shut down.
RAM is fast: much faster than reading from disk.
RAM is limited: if you run out, the OS will start using disk as overflow ("swap"), which is extremely slow.

Find your RAM

macOS:

sysctl -n hw.memsize | awk '{printf "%.0f GB\n", $1/1024/1024/1024}'

macOS (GUI): Apple menu > About This Mac shows memory (e.g., "18 GB"). For current usage, open Activity Monitor > Memory tab.

Linux:

free -h

This shows total, used, and available memory. The -h flag makes the output human-readable (units such as Gi instead of the default kibibytes).

Windows (PowerShell):

[math]::Round((Get-CimInstance Win32_ComputerSystem).TotalPhysicalMemory / 1GB, 1)

This returns the physical RAM installed on the machine. WSL users: this is the number you want — running free -h inside WSL would only show the VM's allocation. See Section 7.

Windows (GUI): Settings > System > About shows "Installed RAM". For current usage, open Task Manager (Ctrl+Shift+Esc) > Performance > Memory.

Why it matters

Loading a large dataset or model weights means everything in active use has to fit in RAM. A modern large language model can be 4–8 GB or more; if you load one on an 8 GB machine alongside your OS, editor, and a browser, you may run out. When that happens the system swaps to disk and everything slows down dramatically. Knowing your RAM ceiling helps you plan what is realistic to run.
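The reasoning above is simple arithmetic, and it can help to write it down once. The sketch below converts a raw byte count to GiB (the same division by 1024 three times used in the macOS command) and does a rough fit check. The `reserved_gib` allowance of 4 GiB for the OS, editor, and browser is an assumed placeholder, not a measured value; replace it with what your own system monitor shows.

```python
GIB = 1024 ** 3  # bytes per GiB


def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count (e.g. from hw.memsize) to GiB."""
    return n_bytes / GIB


def fits_in_ram(model_gib: float, total_ram_gib: float, reserved_gib: float = 4.0) -> bool:
    """Rough check: does the model fit in what is left after other software?

    reserved_gib is an assumed allowance for the OS and open applications.
    """
    return model_gib <= total_ram_gib - reserved_gib


if __name__ == "__main__":
    print(bytes_to_gib(17_179_869_184))  # 16.0
    print(fits_in_ram(6.0, 8.0))         # False: only about 4 GiB is left
    print(fits_in_ram(6.0, 16.0))        # True: about 12 GiB is left
```

This ignores working memory beyond the weights themselves, so treat a narrow pass as a warning rather than a guarantee.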

Exercise 3: How much physical RAM does your machine have? Use the appropriate command for your OS above. How much is currently in use?

4. Storage (disk)

Storage is where your files, programs, and OS live permanently. Unlike RAM, it persists when you shut down. The two main types:

SSD (Solid State Drive): Fast, no moving parts. Standard on modern laptops.
HDD (Hard Disk Drive): Slower, mechanical. Sometimes used for bulk storage.

Find your storage

macOS:

df -h /

macOS (GUI): Apple menu > About This Mac > More Info > Storage. Shows total capacity, used space, and a breakdown by category.

Linux:

df -h /

Windows (PowerShell):

Get-Volume | Where-Object DriveLetter -eq 'C' | Select-Object DriveLetter, @{N='Size(GB)';E={[math]::Round($_.Size/1GB)}}, @{N='Free(GB)';E={[math]::Round($_.SizeRemaining/1GB)}}

This shows the physical C: drive's total size and free space; used space is the difference between the two. WSL users: this is your real disk; the WSL VM has its own virtual disk that we look at in Section 7.

Windows (GUI): Settings > System > Storage. Shows total capacity and usage per drive. You can also open File Explorer, right-click the C: drive, and select Properties.
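For a reading that works the same way on all three operating systems, Python's `shutil.disk_usage` returns total, used, and free bytes for the volume that contains a given path. The sketch below wraps it in a hypothetical `disk_report` helper; by default it inspects the root of the current drive, which is an assumption you may want to change (for example, pass another drive letter on Windows).

```python
import os
import shutil

GIB = 1024 ** 3  # bytes per GiB


def disk_report(path: str = os.path.abspath(os.sep)) -> dict:
    """Return total, used, and free space in GiB for the volume containing path."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free in bytes
    return {
        "total_gib": round(usage.total / GIB, 1),
        "used_gib": round(usage.used / GIB, 1),
        "free_gib": round(usage.free / GIB, 1),
    }


if __name__ == "__main__":
    print(disk_report())
```

Inside WSL this reports the virtual disk, not the Windows drive, for the reasons covered in Section 7.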

Why it matters

Software adds up fast. A rough sense of common items:

Item	Approximate size
A Python environment with scientific libraries	1–3 GB
A local large language model	1–20 GB each
A course or project repository	50–500 MB
Datasets	varies widely (MB to TB)

If you are low on storage, be selective about what you install, and clean up environments and downloaded models you no longer need.

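Exercise 4: How much total storage does your machine have? How much is free? Is it an SSD or HDD? (On macOS, check Apple menu > About This Mac > More Info. On Linux, lsblk -d -o NAME,ROTA lists disk devices; ROTA is 1 for a spinning HDD and 0 for an SSD. On Windows, Get-PhysicalDisk | Select-Object FriendlyName, MediaType reports the media type in PowerShell.)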

5. GPU (graphics processor)

A GPU (Graphics Processing Unit) was originally designed for rendering graphics, but its architecture (thousands of small cores optimized for parallel math) makes it excellent for machine learning. There are three common situations:

NVIDIA GPU (discrete): Found in gaming laptops and workstations. Supports CUDA, which PyTorch uses for fast training. This is the best case for ML work.
Apple Silicon GPU (integrated): The M1/M2/M3/M4 chips include a GPU that PyTorch can use via MPS (Metal Performance Shaders). Faster than CPU, slower than a dedicated NVIDIA GPU.
Intel/AMD integrated GPU: Built into the CPU. Generally not usable by standard PyTorch builds, so plan to run on the CPU. Use --device=cpu where a script offers that option.

Find your GPU

macOS (Apple Silicon):

system_profiler SPDisplaysDataType

If you see "Apple M1" (or M2, M3, M4), you have an integrated GPU that supports MPS.

macOS (GUI): Apple menu > About This Mac shows the chip. Apple Silicon chips (M1/M2/M3/M4) all include a GPU.

Linux (NVIDIA):

nvidia-smi

If this command works, you have an NVIDIA GPU and the drivers are installed. It shows the GPU model, driver version, and memory. If the command is not found, you either do not have an NVIDIA GPU or the drivers are not installed.

Windows (PowerShell):

Get-CimInstance Win32_VideoController | Select-Object Name, AdapterRAM, DriverVersion

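This lists every GPU Windows sees on the physical machine, which is useful on laptops that have both an integrated GPU (Intel/AMD) and a discrete one (NVIDIA). AdapterRAM is reported in bytes as a 32-bit value, so it tops out near 4 GB; for GPUs with more memory, trust Task Manager or nvidia-smi instead.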

Windows (GUI): Task Manager (Ctrl+Shift+Esc) > Performance > GPU. This shows the GPU name (e.g., "NVIDIA GeForce RTX 4060" or "Intel UHD Graphics"), memory, and utilization.

No GPU or unsure:

If you have PyTorch installed, you can ask it directly:

python -c "import torch; print('CUDA:', torch.cuda.is_available()); print('MPS:', torch.backends.mps.is_available())"

This tells you what PyTorch can use on your machine.
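The one-liner above can be turned into a small helper that returns the device name to record in your spec table. This is a minimal sketch, assuming the preference order CUDA, then MPS, then CPU; the `pick_device` name is hypothetical, and the function falls back to "cpu" when PyTorch is not installed at all.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is nothing to accelerate.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds may lack the mps backend module, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("PyTorch device:", pick_device())
```

The returned string is the value for the "PyTorch device" row in Section 6.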

Why it matters

Training a small neural network on CPU takes minutes; on a GPU, seconds. The difference grows dramatically with model size — this is why large language models are trained on clusters of thousands of GPUs. For most introductory computing work, a CPU is sufficient. GPU acceleration is a bonus, not a requirement.

Exercise 5: What GPU (if any) does your machine have? Can PyTorch use it? Run the Python check above (if you have PyTorch installed).

6. Putting it all together

Fill in this table for your machine:

Component	Your machine
Operating system
OS version
Architecture (x86_64 / arm64)
CPU model
CPU cores
RAM (total)
Storage (total / free)
GPU
PyTorch device (cpu / mps / cuda)

One-line system summary

macOS:

echo "$(sw_vers -productName) $(sw_vers -productVersion), $(sysctl -n machdep.cpu.brand_string), $(sysctl -n hw.ncpu) cores, $(sysctl -n hw.memsize | awk '{printf "%.0f GB", $1/1024/1024/1024}') RAM"

Linux:

echo "$(uname -o) $(uname -r), $(lscpu | grep 'Model name' | sed 's/.*: *//' ), $(nproc) cores, $(free -h | awk '/Mem:/ {print $2}') RAM"

Exercise 6: Fill in the table above. If you are working alongside others, compare with a classmate or colleague. How are your machines different? How might those differences affect the kinds of work each of you can do comfortably?

7. Inspecting your WSL environment (Windows + WSL users)

If you are on Windows and use the Windows Subsystem for Linux (WSL), your Linux environment runs inside a lightweight virtual machine managed by Windows (this describes WSL 2, the default on current installs). The Linux commands from Sections 1–5 will all work inside WSL, but the answers they give are about the VM, not the physical machine you already inspected. Some readings match the host; others are very different. Understanding which is which is the goal of this section.

Reading	Inside WSL you see...	Notes
OS (`uname -a`, `cat /etc/os-release`)	The Linux distribution and kernel running in the VM	Has nothing to do with your Windows version
CPU (`lscpu`)	The host CPU model, architecture, and core count	Passed through from the physical machine — should match what PowerShell told you
RAM (`free -h`)	The RAM allocated to the VM	By default, about half your physical RAM; some WSL versions also cap it at 8 GB. Configurable in a `.wslconfig` file (see the Microsoft docs)
Disk (`df -h /`)	A virtual disk (`ext4.vhdx`) stored on your Windows drive	Not the same as the C: drive. The VM grows the file on demand up to a configured maximum
GPU (`nvidia-smi`)	An NVIDIA GPU, if the Windows-side driver supports WSL	Recent NVIDIA Windows drivers include WSL support. No separate Linux driver is installed inside WSL. See NVIDIA's CUDA on WSL guide

Why this matters

When you install Python, run a model, or train something inside WSL, you are constrained by the VM's allocation, not the machine's full capacity. If WSL is limited to 8 GB on a 16 GB machine, for example, a model can load fine on the Windows side but fail inside WSL. Knowing both numbers, physical and VM, lets you predict what will actually run where.
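When a script needs to know which side it is running on, a common heuristic is to look for "microsoft" in the Linux kernel release string, which WSL kernels typically include. The sketch below assumes that convention; the example release strings are illustrative, and `vm_share` is a hypothetical helper for comparing the two RAM readings.

```python
import platform
from typing import Optional


def is_wsl(release: Optional[str] = None) -> bool:
    """Heuristic: WSL kernels typically carry "microsoft" in their release string."""
    if release is None:
        release = platform.uname().release
    return "microsoft" in release.lower()


def vm_share(vm_ram_gib: float, host_ram_gib: float) -> float:
    """Fraction of the host's physical RAM that the VM has been allocated."""
    return vm_ram_gib / host_ram_gib


if __name__ == "__main__":
    print("running inside WSL:", is_wsl())
    print("VM share of host RAM:", vm_share(8.0, 16.0))  # 0.5
```

Feed `vm_share` the total from free -h inside WSL and the PowerShell reading from Section 3 to see how much of the machine your Linux environment actually gets.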

Exercise 7 (WSL users): Run free -h and df -h / inside WSL. Compare the results to the PowerShell readings you recorded in Section 6. How much physical RAM does your VM actually see? How much of your physical disk is the VM using right now?

8. Keeping a machine log

Engineers keep logs for lab equipment, process equipment, and instruments. Your computer deserves the same treatment. Create a document called machine_log in your personal files and start it with the spec table from section 6. It should be a simple format — a text, rich text, or markdown file.

While you are at it, give your machine a name if you have not already. (On macOS: System Settings > General > About > Name. On Linux: sudo hostnamectl set-hostname yourname. On Windows: Settings > System > About > Rename this PC.) A named machine is easier to reference in logs, SSH configs, and conversation, especially once you have more than one. Put the name at the top of your log.

After that, add a dated entry whenever you:

Install or upgrade the OS or major software
Change system configuration (environment variables, shell settings, drivers, WSL setup)
Encounter a problem and solve it (the error, what you tried, what worked)
Upgrade hardware (new RAM, new drive, etc.)

Keep entries short. Date, what changed, and the outcome. When something breaks months later, you will be glad you wrote down what you changed and when. This is especially valuable when troubleshooting: knowing what was different before the problem started is often the fastest path to a fix.
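If you prefer to add entries from the terminal, a few lines of Python are enough. This is a minimal sketch, assuming a plain-text or markdown log and a one-line entry format of date, change, and outcome; the `add_log_entry` name and the exact layout are illustrative choices, not a required format.

```python
from datetime import date
from pathlib import Path


def add_log_entry(log_path: Path, what_changed: str, outcome: str) -> str:
    """Append one dated entry to the log file and return the line that was written."""
    entry = f"{date.today().isoformat()}: {what_changed}. Outcome: {outcome}\n"
    # Mode "a" creates the file if needed and never overwrites earlier entries.
    with log_path.open("a", encoding="utf-8") as f:
        f.write(entry)
    return entry
```

For example, `add_log_entry(Path("machine_log.md"), "Installed htop", "works")` appends a single line stamped with today's date. Editing the file by hand works just as well; the point is the habit, not the tool.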

Exercise 8: Start your machine log. Put the spec table at the top and add an entry for today.

Additional resources

Crash Course Computer Science — episodes 1-10 cover hardware fundamentals (transistors, ALU, registers, RAM, CPU, instructions) at a reasonable pace
J. Clark Scott, But How Do It Know? — a short, readable book that builds a computer from logic gates up. Good for understanding what is actually happening inside the machine.
top and htop: interactive process viewers that show CPU, memory, and process usage in real time. top is the classic Unix tool and ships with macOS and Linux, so it is always available. htop is a more modern third-party alternative with colored CPU/memory bars, a scrollable process list, click-to-sort columns, F-keys (or mouse) to kill, renice, or filter processes, and an F5 tree view. It also behaves the same everywhere, whereas macOS top and Linux top differ in flags and output. Install with brew install htop (macOS with Homebrew) or sudo apt install htop (Debian/Ubuntu, including the default WSL distribution). Worth knowing both: top for "wherever I land," htop for daily use on your own machine.
