A two-module standalone guide for setting up a new machine for scientific computing work:
- 01-know-your-machine: hardware and OS inspection. Reads the physical machine first via macOS/Linux terminals or Windows PowerShell; a separate section walks through the WSL VM and how its allocations differ from the host hardware.
- 02-git-basics: pull-focused git workflow. Install, configure identity, clone a public repo, pull updates. Authentication and pushing are deferred to a future collaboration module.
Includes top-level WSL.md (copied from cli-walkthrough) for Windows users who need the Linux environment.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Know Your Machine
Key idea
Understand the basic hardware and software of the computer you are working on.
Key goals
- Identify your operating system, CPU, RAM, storage, and GPU
- Understand what these components do and why they matter for computing tasks
- Learn commands to query your system on macOS, Linux, and Windows
Read your physical machine first. Sections 1–6 walk through inspecting the actual hardware you own. On macOS and Linux, the terminal reports directly from the hardware. On Windows, use PowerShell (or the Settings GUI) — those readings come straight from the real machine.
Then visit the WSL VM separately (Section 7). If you are on Windows and have WSL installed, your Linux environment is a virtual machine that sees only what it has been allocated. Section 7 covers how to inspect that and why it differs from the physical machine. If you do not yet have WSL installed, see ../WSL.md.
As engineers, we should know our tools. You would not run a reactor without knowing its volume, pressure rating, and materials of construction. The same principle applies to computing: before we write code, train models, or analyze data, we should understand the machine we are working on.
This module is a hands-on survey. Run the commands below on your own machine and record what you find. By the end, you should be able to answer: What is my computer, and what can it do?
1. Operating system
Your operating system (OS) manages the hardware and provides the environment where all your programs run. The three major OS families are:
- macOS -- Apple's OS, based on Unix (Darwin kernel). Runs on Intel and Apple Silicon (M1/M2/M3/M4) hardware. Closely related to iOS, watchOS, and other Apple systems. (These are all, in fact, computers!)
- Linux -- Open-source Unix-like OS. Many distributions exist (Ubuntu, Fedora, etc.). Common on servers, clusters, and in WSL.
- Windows -- Microsoft's OS. For terminal-based work, we recommend the Windows Subsystem for Linux (WSL) to access a Unix environment.
Find your OS version
macOS: In the terminal, use the command
sw_vers
macOS (GUI): Apple menu > About This Mac. Shows the macOS version, chip (e.g., Apple M3), and memory.
Linux:
cat /etc/os-release
uname -a
The uname -a command shows the kernel version and architecture. You will see something like x86_64 (Intel/AMD) or aarch64/arm64 (ARM).
Windows (PowerShell):
Get-ComputerInfo | Select-Object OsName, OsVersion, OsArchitecture
This tells you the Windows version and architecture of the physical machine.
Windows (GUI): Settings > System > About. Shows the edition (Home, Pro), version, and processor.
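If Python 3 is already installed, its standard-library platform module gives a cross-platform way to double-check the readings above. This is a minimal sketch, not part of the original module; the function name describe_os is hypothetical.

```python
import platform


def describe_os() -> dict:
    """Collect basic OS facts using only the standard library."""
    return {
        "system": platform.system(),    # 'Darwin' (macOS), 'Linux', or 'Windows'
        "release": platform.release(),  # kernel or OS release string
        "machine": platform.machine(),  # e.g. 'x86_64', 'arm64', 'aarch64', 'AMD64'
    }


for key, value in describe_os().items():
    print(f"{key}: {value}")
```

Run inside WSL, this describes the Linux VM rather than Windows, so run it from a Windows Python install if you want the host's view.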
Exercise 1: Run the commands above. What OS and version are you running? What architecture?
2. CPU (processor)
The CPU (Central Processing Unit) executes your code. Key properties:
- Architecture: x86_64 (Intel/AMD) or arm64 (Apple Silicon, some Windows laptops). This affects which software binaries you can run.
- Cores: Modern CPUs have multiple cores that can work in parallel. More cores help with parallel tasks (compiling, running simulations, some ML training).
- Clock speed: Measured in GHz. Higher is faster for single-threaded tasks, but clock speed alone does not tell the whole story.
Find your CPU
macOS:
sysctl -n machdep.cpu.brand_string
sysctl -n hw.ncpu
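The first command shows the CPU model. The second shows the number of logical CPUs. On Apple Silicon this equals the total core count (efficiency plus performance cores); on Intel Macs with Hyper-Threading it is twice the number of physical cores.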
macOS (GUI): Apple menu > About This Mac shows the chip (e.g., "Apple M3 Pro"). For core count, open Activity Monitor > CPU tab or run the command above.
Linux:
lscpu
This shows the CPU model, architecture, number of cores, and clock speed.
Windows (PowerShell):
Get-CimInstance Win32_Processor | Select-Object Name, NumberOfCores, NumberOfLogicalProcessors, MaxClockSpeed
Windows (GUI): Settings > System > About shows the processor name. For more detail, open Task Manager (Ctrl+Shift+Esc) > Performance > CPU. This shows cores, logical processors, and clock speed.
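The difference between cores and logical processors is easy to miss. As a rough illustration, Python's os.cpu_count() reports logical processors, which can be larger than the physical core count on CPUs with simultaneous multithreading. The helper name cpu_summary is hypothetical.

```python
import os
import platform


def cpu_summary() -> dict:
    """Report architecture and logical CPU count from the standard library."""
    return {
        "architecture": platform.machine(),
        # os.cpu_count() counts logical CPUs and may return None if unknown.
        "logical_cpus": os.cpu_count(),
    }


summary = cpu_summary()
print(f"Architecture: {summary['architecture']}")
print(f"Logical CPUs: {summary['logical_cpus']}")
```

Compare the printed count with the core count from the OS-specific commands above. If the two differ, the larger number is the logical processor count.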
Why it matters
Heavy numerical work (simulations, data processing, training machine-learning models) runs faster with more cores and higher clock speed. Even so, for highly parallel tasks like neural network training, a CPU can be far slower than a GPU, often by an order of magnitude or more, which is why we also look at GPUs below.
Exercise 2: What CPU does your machine have? How many cores? What architecture?
3. RAM (memory)
RAM (Random Access Memory) is your computer's short-term working space. When you open a program, load a dataset, or run a model, the data lives in RAM. Key things to know:
- RAM is volatile: it is erased when you shut down.
- RAM is fast: much faster than reading from disk.
- RAM is limited: if you run out, the OS will start using disk as overflow ("swap"), which is extremely slow.
Find your RAM
macOS:
sysctl -n hw.memsize | awk '{printf "%.0f GB\n", $1/1024/1024/1024}'
macOS (GUI): Apple menu > About This Mac shows memory (e.g., "18 GB"). For current usage, open Activity Monitor > Memory tab.
Linux:
free -h
This shows total, used, and available memory. The -h flag scales the output to human-readable units (for example, Gi for gibibytes) instead of the default kibibytes.
Windows (PowerShell):
[math]::Round((Get-CimInstance Win32_ComputerSystem).TotalPhysicalMemory / 1GB, 1)
This returns the physical RAM installed on the machine. WSL users: this is the number you want — running free -h inside WSL would only show the VM's allocation. See Section 7.
Windows (GUI): Settings > System > About shows "Installed RAM". For current usage, open Task Manager (Ctrl+Shift+Esc) > Performance > Memory.
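The commands above all convert a raw byte count into gigabytes by dividing by 1024 three times. A hedged sketch of the same arithmetic in Python follows. The function names are hypothetical, and the os.sysconf lookup is assumed to be available only on Unix-like systems, so the code returns None elsewhere.

```python
import os


def bytes_to_gib(n_bytes: int) -> float:
    """Convert bytes to GiB (1 GiB = 1024**3 bytes)."""
    return n_bytes / 1024**3


def total_ram_bytes():
    """Return RAM in bytes, or None where os.sysconf lacks the needed keys (e.g. Windows)."""
    names = getattr(os, "sysconf_names", {})
    if "SC_PAGE_SIZE" in names and "SC_PHYS_PAGES" in names:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return None


ram = total_ram_bytes()
print("RAM:", "unknown" if ram is None else f"{bytes_to_gib(ram):.1f} GiB")
```

Inside WSL this reports the VM's allocation, not the physical RAM, for the same reason free -h does.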
Why it matters
Loading a large dataset or model weights means everything in active use has to fit in RAM. A modern large language model can be 4–8 GB or more; if you load one on an 8 GB machine alongside your OS, editor, and a browser, you may run out. When that happens the system swaps to disk and everything slows down dramatically. Knowing your RAM ceiling helps you plan what is realistic to run.
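A back-of-the-envelope check makes this concrete. The sketch below estimates a model's weights-only footprint as parameter count times bytes per parameter, then compares it with available RAM. The 7-billion-parameter model and the 4 GB reserved for the OS and other programs are illustrative assumptions, and real memory use is higher because of runtime overhead.

```python
def weights_gb(n_params: float, bytes_per_param: float) -> float:
    """Weights-only footprint in GB (10**9 bytes); ignores runtime overhead."""
    return n_params * bytes_per_param / 1e9


def fits_in_ram(model_gb: float, ram_gb: float, reserved_gb: float = 4.0) -> bool:
    """True if the model fits after reserving memory for the OS and apps (assumed 4 GB)."""
    return model_gb <= ram_gb - reserved_gb


# Hypothetical 7-billion-parameter model at three common precisions.
for label, bytes_per_param in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    size = weights_gb(7e9, bytes_per_param)
    print(f"7B params at {label}: {size:.1f} GB, fits in 8 GB RAM: {fits_in_ram(size, 8.0)}")
```

Under these assumptions only the 4-bit version leaves room on an 8 GB machine, which matches the warning above.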
Exercise 3: How much physical RAM does your machine have? Use the appropriate command for your OS above. How much is currently in use?
4. Storage (disk)
Storage is where your files, programs, and OS live permanently. Unlike RAM, it persists when you shut down. The two main types:
- SSD (Solid State Drive): Fast, no moving parts. Standard on modern laptops.
- HDD (Hard Disk Drive): Slower, mechanical. Sometimes used for bulk storage.
Find your storage
macOS:
df -h /
macOS (GUI): Apple menu > About This Mac > More Info > Storage. Shows total capacity, used space, and a breakdown by category.
Linux:
df -h /
Windows (PowerShell):
Get-Volume | Where-Object DriveLetter -eq 'C' | Select-Object DriveLetter, @{N='Size(GB)';E={[math]::Round($_.Size/1GB)}}, @{N='Free(GB)';E={[math]::Round($_.SizeRemaining/1GB)}}
This reports the physical C: drive's total size and free space; used space is the difference between the two. WSL users: this is your real disk. The WSL VM has its own virtual disk that we look at in Section 7.
Windows (GUI): Settings > System > Storage. Shows total capacity and usage per drive. You can also open File Explorer, right-click the C: drive, and select Properties.
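For a reading that works the same way on every OS, Python's shutil.disk_usage returns total, used, and free bytes for the filesystem containing a path. A minimal sketch, with a hypothetical helper name:

```python
import shutil


def disk_summary(path: str = "/") -> dict:
    """Return total, used, and free space in GiB for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    gib = 1024**3
    return {
        "total_gib": usage.total / gib,
        "used_gib": usage.used / gib,
        "free_gib": usage.free / gib,
    }


for name, value in disk_summary("/").items():
    print(f"{name}: {value:.1f}")
```

On Windows, passing an explicit drive such as "C:\\" avoids any ambiguity about which drive is measured.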
Why it matters
Software adds up fast. A rough sense of common items:
| Item | Approximate size |
|---|---|
| A Python environment with scientific libraries | 1–3 GB |
| A local large language model | 1–20 GB each |
| A course or project repository | 50–500 MB |
| Datasets | varies widely (MB to TB) |
If you are low on storage, be selective about what you install, and clean up environments and downloaded models you no longer need.
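When deciding what to clean up, it helps to know how large a given folder actually is. The sketch below sums file sizes under a directory using only the standard library. The function name is hypothetical, and it reports apparent file sizes, which can differ slightly from the space actually occupied on disk.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of regular files under `root`, skipping symlinks and unreadable files."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not count link targets twice
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or is not readable
    return total
```

For example, print(dir_size_bytes(os.path.expanduser("~")) / 1024**3) would report the size of your home directory in GiB, though it may take a while on a full disk.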
Exercise 4: How much total storage does your machine have? How much is free? Is it an SSD or HDD? (On macOS, check Apple menu > About This Mac > More Info. On Linux, lsblk shows disk devices.)
5. GPU (graphics processor)
A GPU (Graphics Processing Unit) was originally designed for rendering graphics, but its architecture (thousands of small cores optimized for parallel math) makes it excellent for machine learning. There are three common situations:
- NVIDIA GPU (discrete): Found in gaming laptops and workstations. Supports CUDA, which PyTorch uses for fast training. This is the best case for ML work.
- Apple Silicon GPU (integrated): The M1/M2/M3/M4 chips include a GPU that PyTorch can use via MPS (Metal Performance Shaders). Faster than CPU, slower than a dedicated NVIDIA GPU.
- Intel/AMD integrated GPU: Built into the CPU. Generally not usable by a standard PyTorch install. Use --device=cpu.
Find your GPU
macOS (Apple Silicon):
system_profiler SPDisplaysDataType
If you see "Apple M1" (or M2, M3, M4), you have an integrated GPU that supports MPS.
macOS (GUI): Apple menu > About This Mac shows the chip. Apple Silicon chips (M1/M2/M3/M4) all include a GPU.
Linux (NVIDIA):
nvidia-smi
If this command works, you have an NVIDIA GPU and the drivers are installed. It shows the GPU model, driver version, and memory. If the command is not found, you either do not have an NVIDIA GPU or the drivers are not installed.
Windows (PowerShell):
Get-CimInstance Win32_VideoController | Select-Object Name, AdapterRAM, DriverVersion
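This lists every GPU Windows sees on the physical machine, which is useful on laptops that have both an integrated GPU (Intel/AMD) and a discrete one (NVIDIA). AdapterRAM is reported in bytes as a 32-bit value, so it can under-report on GPUs with more than 4 GB of memory; Task Manager gives a more reliable figure.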
Windows (GUI): Task Manager (Ctrl+Shift+Esc) > Performance > GPU. This shows the GPU name (e.g., "NVIDIA GeForce RTX 4060" or "Intel UHD Graphics"), memory, and utilization.
No GPU or unsure:
If you have PyTorch installed, you can ask it directly:
python -c "import torch; print('CUDA:', torch.cuda.is_available()); print('MPS:', torch.backends.mps.is_available())"
This tells you what PyTorch can use on your machine.
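The same two checks can be turned into a device choice for the "PyTorch device" entry you will record later. This is a sketch with a hypothetical function name; it assumes the usual preference order of cuda, then mps, then cpu, and falls back to cpu when PyTorch is not installed.

```python
def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch reports as available."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed: nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print("PyTorch device:", pick_device())
```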
Why it matters
Training a small neural network on CPU takes minutes; on a GPU, seconds. The difference grows dramatically with model size — this is why large language models are trained on clusters of thousands of GPUs. For most introductory computing work, a CPU is sufficient. GPU acceleration is a bonus, not a requirement.
Exercise 5: What GPU (if any) does your machine have? Can PyTorch use it? Run the Python check above (if you have PyTorch installed).
6. Putting it all together
Fill in this table for your machine:
| Component | Your machine |
|---|---|
| Operating system | |
| OS version | |
| Architecture (x86_64 / arm64) | |
| CPU model | |
| CPU cores | |
| RAM (total) | |
| Storage (total / free) | |
| GPU | |
| PyTorch device (cpu / mps / cuda) | |
One-line system summary
macOS:
echo "$(sw_vers -productName) $(sw_vers -productVersion), $(sysctl -n machdep.cpu.brand_string), $(sysctl -n hw.ncpu) cores, $(sysctl -n hw.memsize | awk '{printf "%.0f GB", $1/1024/1024/1024}') RAM"
Linux:
echo "$(uname -o) $(uname -r), $(lscpu | grep -m1 'Model name' | sed 's/.*: *//'), $(nproc) cores, $(free -h | awk '/Mem:/ {print $2}') RAM"
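The one-liners above are specific to each OS. If Python 3 is available, a cross-platform version can print several rows in the same layout as the table. This is only a sketch with a hypothetical function name. It does not cover the CPU model, RAM, or GPU, and on macOS the release string is the Darwin kernel version rather than the macOS product version.

```python
import os
import platform
import shutil


def spec_rows() -> list:
    """Gather a few spec-table rows using only the standard library."""
    root = os.path.abspath(os.sep)  # '/' on macOS/Linux, current drive root on Windows
    usage = shutil.disk_usage(root)
    gib = 1024**3
    return [
        ("Operating system", platform.system()),
        ("OS version", platform.release()),
        ("Architecture", platform.machine()),
        ("CPU logical processors", str(os.cpu_count())),
        ("Storage (total / free)", f"{usage.total / gib:.0f} GiB / {usage.free / gib:.0f} GiB"),
    ]


print("| Component | Your machine |")
print("|---|---|")
for name, value in spec_rows():
    print(f"| {name} | {value} |")
```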
Exercise 6: Fill in the table above. If you are working alongside others, compare with a classmate or colleague. How are your machines different? How might those differences affect the kinds of work each of you can do comfortably?
7. Inspecting your WSL environment (Windows + WSL users)
If you are on Windows and use the Windows Subsystem for Linux (WSL), your Linux environment runs inside a virtual machine managed by Windows. The Linux commands from sections 1–5 will all work inside WSL, but the answers they give are about the VM, not the physical machine you already inspected. Some readings match the host; others are very different. Understanding which is which is the goal of this section.
| Reading | Inside WSL you see... | Notes |
|---|---|---|
| OS (uname -a, cat /etc/os-release) | The Linux distribution and kernel running in the VM | Has nothing to do with your Windows version |
| CPU (lscpu) | The host CPU model, architecture, and core count | Passed through from the physical machine, so it should match what PowerShell told you |
| RAM (free -h) | The RAM allocated to the VM | By default, about half your physical RAM (some builds also cap this at 8 GB). Configurable in a .wslconfig file; see the Microsoft docs |
| Disk (df -h /) | A virtual disk (ext4.vhdx) stored on your Windows drive | Not the same as the C: drive. The VM grows the file on demand up to a configured maximum |
| GPU (nvidia-smi) | An NVIDIA GPU, if the Windows-side driver supports WSL | Recent NVIDIA Windows drivers include WSL support. No separate Linux driver is installed inside WSL. See NVIDIA's CUDA on WSL guide |
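Because readings inside WSL describe the VM, it can be useful for a script to know where it is running. A common heuristic, shown here as a sketch rather than an official API, is that WSL kernels include "microsoft" in their release string. The function name and the sample release strings in the comments are illustrative assumptions.

```python
import platform


def looks_like_wsl(system=None, release=None) -> bool:
    """Heuristic: a Linux system whose kernel release mentions 'microsoft' is probably WSL."""
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release
    # A WSL 2 release string looks like '5.15.90.1-microsoft-standard-WSL2' (sample value).
    return system == "Linux" and "microsoft" in release.lower()


print("Running inside WSL:", looks_like_wsl())
```

If this prints True, treat RAM and disk readings as VM allocations rather than physical hardware.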
Why this matters
When you install Python, run a model, or train something inside WSL, you are constrained by the VM's allocation, not the machine's full capacity. A RAM limit inside WSL (for example, 8 GB on a 16 GB machine) can mean a model loads fine on the Windows side but fails inside WSL. Knowing both numbers, physical and VM, lets you predict what will actually run where.
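To put the two numbers side by side, the sketch below parses the MemTotal line that Linux exposes in /proc/meminfo and expresses it as a fraction of the physical RAM you recorded from PowerShell. The function names and the sample values are hypothetical. MemTotal is usually a little below the configured allocation because the kernel reserves some memory for itself.

```python
def parse_memtotal_kib(meminfo_text: str) -> int:
    """Extract the MemTotal value (in KiB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal line not found")


def vm_share(vm_kib: int, physical_gib: float) -> float:
    """Fraction of physical RAM (in GiB, as PowerShell's 1GB unit reports) visible to the VM."""
    return (vm_kib * 1024) / (physical_gib * 1024**3)


# Sample text standing in for the real file; inside WSL, read /proc/meminfo instead.
sample = "MemTotal:        8048576 kB\nMemFree:         1234567 kB\n"
vm_kib = parse_memtotal_kib(sample)
print(f"VM sees {vm_share(vm_kib, 16.0):.0%} of a 16 GiB machine")
```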
Exercise 7 (WSL users): Run free -h and df -h / inside WSL. Compare the results to the PowerShell readings you recorded in Section 6. How much physical RAM does your VM actually see? How much of your physical disk is the VM using right now?
8. Keeping a machine log
Engineers keep logs for lab equipment, process equipment, and instruments. Your computer deserves the same treatment. Create a document called machine_log in your personal files and start it with the spec table from section 6. It should be a simple format — a text, rich text, or markdown file.
While you are at it, give your machine a name if you have not already. (On macOS: System Settings > General > About > Name. On Linux: sudo hostnamectl set-hostname yourname. On Windows: Settings > System > About > Rename this PC.) A named machine is easier to reference in logs, SSH configs, and conversation, especially once you have more than one. Put the name at the top of your log.
After that, add a dated entry whenever you:
- Install or upgrade the OS or major software
- Change system configuration (environment variables, shell settings, drivers, WSL setup)
- Encounter a problem and solve it (the error, what you tried, what worked)
- Upgrade hardware (new RAM, new drive, etc.)
Keep entries short. Date, what changed, and the outcome. When something breaks months later, you will be glad you wrote down what you changed and when. This is especially valuable when troubleshooting: knowing what was different before the problem started is often the fastest path to a fix.
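If you prefer to add entries from the terminal, a few lines of Python can append a dated line in the "date, what changed, outcome" shape described above. This is one possible sketch; the function name, the entry layout, and the file name in the usage note are assumptions, not a required format.

```python
from datetime import date
from pathlib import Path


def add_log_entry(log_path, what_changed: str, outcome: str, today=None) -> str:
    """Append one dated entry to the log file and return the line that was written."""
    today = today or date.today()
    entry = f"{today.isoformat()}: {what_changed}. Outcome: {outcome}\n"
    # Mode "a" creates the file if needed and never overwrites earlier entries.
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```

For example, add_log_entry("machine_log.md", "Installed htop", "works") appends a line stamped with today's date.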
Exercise 8: Start your machine log. Put the spec table at the top and add an entry for today.
Additional resources
- Crash Course Computer Science — episodes 1-10 cover hardware fundamentals (transistors, ALU, registers, RAM, CPU, instructions) at a reasonable pace
- J. Clark Scott, But How Do It Know? — a short, readable book that builds a computer from logic gates up. Good for understanding what is actually happening inside the machine.
- top and htop: interactive process viewers that show CPU, memory, and process usage in real time. top is the classic Unix tool and ships built-in on macOS and Linux, so it's always available. htop is a more modern third-party rewrite: colored CPU/memory bars, a scrollable process list, click-to-sort columns, F-keys (or mouse) to kill/renice/filter processes, an F5 tree view, and the same behavior everywhere (macOS top and Linux top differ in flags and output; htop does not). Install with brew install htop (macOS) or sudo apt install htop (Linux/WSL). Worth knowing both: top for "wherever I land," htop for daily use on your own machine.