cli-walkthrough/06-advanced/README.md
Eric c57d7539d8 Initial commit: CLI walkthrough for CHEG 667-013
Six-module walkthrough covering navigation, files, reading/searching,
processes/editors, scripting, and advanced tools (ssh, regex, tar, etc.).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 21:54:48 -04:00


# CLI Part VI: Advanced Tools
**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware

---
## Key idea
Connect to remote machines, transfer files, and use additional command line tools for everyday tasks.
## Key goals
- Connect to remote machines with `ssh` and transfer files with `scp` and `sftp`
- Download files from the web with `curl` and `wget`
- Find files with `find`, compare them with `diff`, and archive them with `tar`
- Identify commands with `which`
- Understand globs and regular expressions for pattern matching
---
## 1. Remote access with ssh
`ssh` (secure shell) lets you log in to a remote machine over the network. This is how you'll connect to department servers, computing clusters, or cloud machines:
```
$ ssh username@hostname
```
For example, to connect to a UD server:
```
$ ssh ef1j@mahler.che.udel.edu
```
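You'll be prompted for your password on the remote machine. The first time you connect to a given host, `ssh` also asks you to confirm the host's key fingerprint; type `yes` to continue. Once connected, you'll see a new prompt, and everything you type now runs on the remote machine.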
To disconnect, type `exit` or press `Ctrl-D`.
### SSH keys
Typing your password every time gets old. You can set up *SSH keys* for password-free login:
```
$ ssh-keygen # generate a key pair (press Enter for defaults)
$ ssh-copy-id username@hostname # copy your public key to the remote machine
```
After this, `ssh` will authenticate using your key instead of a password.
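To see what `ssh-keygen` actually produces, here is a minimal sketch that writes a throwaway key pair into a temporary directory instead of `~/.ssh`. The file name `id_demo`, the comment text, and the choice of an `ed25519` key with an empty passphrase are assumptions made for the demo, not course requirements.

```shell
# Sketch only: keep the demo key out of ~/.ssh so your real keys are untouched.
keydir=$(mktemp -d)

if command -v ssh-keygen >/dev/null 2>&1; then
    # -q quiet, -t key type, -N "" empty passphrase, -C comment, -f output file
    ssh-keygen -q -t ed25519 -N "" -C "demo key" -f "$keydir/id_demo"

    # Two files appear: the private key and the matching public key (.pub).
    ls -l "$keydir"

    # The public key is a single line of text that is safe to share.
    cat "$keydir/id_demo.pub"
    keygen_ran=yes
else
    echo "ssh-keygen not found; skipping the demo"
    keygen_ran=no
fi
```

The `.pub` file is the half that `ssh-copy-id` appends to `~/.ssh/authorized_keys` on the remote machine. The file without the extension is the private key and should stay on your own computer.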
## 2. Transferring files with scp and sftp
Once you're working on remote machines, you'll need to move files back and forth.
### scp — secure copy
`scp` works like `cp`, but copies files over the network:
```
$ scp localfile.txt username@hostname:~/destination/
$ scp username@hostname:~/remotefile.txt ./
```
The first command copies a local file to the remote machine. The second copies a remote file to your current directory. Add `-r` to copy entire directories.
### sftp — secure file transfer
`sftp` gives you an interactive session for browsing and transferring files:
```
$ sftp username@hostname
sftp> ls
sftp> cd results
sftp> get output.csv
sftp> put input.dat
sftp> exit
```
Use `get` to download and `put` to upload. It's useful when you want to browse what's on the remote machine before transferring.
## 3. Downloading files with curl and wget
Both `curl` and `wget` download files from the web. You've already seen `curl` in the `weather.sh` script.
### curl
`curl` prints the downloaded content to the terminal by default. Use `-o` to save to a file:
```
$ curl -s https://example.com/data.csv -o data.csv
```
The `-s` flag is for "silent" mode (no progress bar).
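The difference between the default behavior and `-o` can be checked without a network connection, because `curl` also understands `file://` URLs. In this sketch the local file `source.csv` and its contents are made up; it is only a stand-in for a real web address.

```shell
# Create a small local file and point a file:// URL at it (hypothetical data).
demo=$(mktemp -d)
printf 'T_K,P_bar\n300,1.0\n350,1.2\n' > "$demo/source.csv"
url="file://$demo/source.csv"

if command -v curl >/dev/null 2>&1; then
    # Without -o, the content goes to standard output (the terminal).
    curl -s "$url"

    # With -o, the same content is written to the named file instead.
    curl -s "$url" -o "$demo/data.csv"
    wc -l "$demo/data.csv"
    curl_ran=yes
else
    echo "curl not found; skipping the demo"
    curl_ran=no
fi
```

Because the default output is standard output, `curl -s URL | head` is a quick way to peek at a file before deciding to save it.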
### wget
`wget` saves to a file by default, which makes it simpler for straightforward downloads:
```
$ wget https://example.com/data.csv
```
Use `wget -r` to download entire directories or websites recursively. On macOS, `wget` may need to be installed separately (e.g., `brew install wget`), while `curl` is available by default.
## 4. Finding files with find
The `find` command searches for files by name, type, size, modification time, and more. It searches recursively through directories:
```
$ find . -name "*.csv" # find all CSV files in current directory and below
$ find ~/cheg667 -name "hello.c" # find a specific file
$ find . -type d # find all directories
$ find . -name "*.tmp" -delete # find and delete all .tmp files (careful!)
```
The first argument is where to start searching. The remaining arguments are filters.
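A safe way to get comfortable with `find` is to try it on a throwaway directory tree. The sketch below builds one in a temporary directory (the directory and file names are invented for the demo), combines two filters, and previews a `-delete` before running it.

```shell
# Build a small tree so the searches have something to match.
demo=$(mktemp -d)
mkdir -p "$demo/run1" "$demo/run2/raw"
touch "$demo/run1/a.csv" "$demo/run1/scratch.tmp" \
      "$demo/run2/notes.txt" "$demo/run2/raw/b.csv"
cd "$demo" || exit 1

# Starting point first, then filters: regular files whose names end in .csv.
csv_files=$(find . -type f -name "*.csv" | sort)
echo "$csv_files"

# Run the same filter without -delete first to preview what would be removed.
find . -name "*.tmp"
find . -name "*.tmp" -delete

# Count the regular files that are left.
remaining=$(find . -type f | wc -l)
echo "files left: $remaining"
```

Quoting the pattern (`"*.csv"`) matters: it stops the shell from expanding the glob itself, so `find` receives the pattern and applies it in every subdirectory.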
## 5. Comparing files with diff
`diff` shows the differences between two files, line by line:
```
$ diff file1.txt file2.txt
3c3
< This is the original line.
---
> This is the modified line.
```
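The `3c3` line says that line 3 of the first file was changed (`c`) into line 3 of the second. Lines starting with `<` are from the first file, and lines starting with `>` are from the second. `diff` is useful for checking what changed between two versions of a file. You'll see it again if you use version control tools like `git`.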
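The output shown above can be reproduced with two small files. This sketch (file contents invented for the demo) also uses the fact that `diff` exits with status 0 when the files match and a non-zero status when they differ, which makes it usable in an `if`.

```shell
# Two three-line files that differ only on line 3.
demo=$(mktemp -d)
printf 'line one\nline two\nThis is the original line.\n' > "$demo/file1.txt"
printf 'line one\nline two\nThis is the modified line.\n' > "$demo/file2.txt"

# The exit status tells a script whether the files match.
if diff_out=$(diff "$demo/file1.txt" "$demo/file2.txt"); then
    verdict="identical"
else
    verdict="different"
fi
echo "$diff_out"
echo "The files are $verdict."

# -u prints the "unified" format, the style that git uses for its diffs.
diff -u "$demo/file1.txt" "$demo/file2.txt" || true
```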
## 6. Identifying commands with which
If you're not sure where a command lives or which version you're running, `which` tells you:
```
$ which python
/usr/bin/python
$ which ls
/usr/bin/ls
```
This is especially helpful when you have multiple versions of a program installed (e.g., different Python installations) and need to know which one the shell is using.
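`which` only looks for executable files in the directories listed in `PATH`. As a rough sketch of how that search works, and how it compares with the shell's own `command -v` and `type` (which also know about builtins such as `cd`):

```shell
# which prints the first matching executable found along PATH.
which ls || echo "which is not installed on this machine"

# command -v is the POSIX way to ask the shell the same question.
ls_path=$(command -v ls)
echo "ls resolves to: $ls_path"

# type also explains names that are not files, such as shell builtins.
type cd

# PATH is a colon-separated list; directories are searched left to right.
echo "$PATH" | tr ':' '\n' | head -n 5
```

If two Python installations are on your `PATH`, the one in the directory listed first is the one that runs.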
## 7. Archiving files with tar
`tar` bundles files and directories into a single archive. It's the standard way to package things on Unix systems.
Create an archive:
```
$ tar -czf archive.tar.gz my_directory/
```
This creates a compressed archive (`-c` create, `-z` gzip compression, `-f` filename). Extract it with:
```
$ tar -xzf archive.tar.gz
```
(`-x` for extract). To see what's inside without extracting, swap `-x` for `-t` (list):
```
$ tar -tzf archive.tar.gz
```
You'll encounter `.tar.gz` (or `.tgz`) files frequently when downloading source code or datasets.
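The three commands fit together as a round trip. This sketch works in a temporary directory with invented file names, extracts into a separate `restored` directory using `-C` (change to that directory before extracting), and then checks the copy with `diff -r`.

```shell
# Make a small directory to archive.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir my_directory
echo "T_K,P_bar" > my_directory/data.csv
echo "demo notes" > my_directory/README.txt

# Create the archive, then list its contents without extracting.
tar -czf archive.tar.gz my_directory/
listing=$(tar -tzf archive.tar.gz)
echo "$listing"

# Extract into a different directory so the original is not overwritten.
mkdir restored
tar -xzf archive.tar.gz -C restored

# diff -r compares directories recursively; no output means they match.
if diff -r my_directory restored/my_directory; then
    roundtrip="ok"
else
    roundtrip="mismatch"
fi
echo "round trip: $roundtrip"
```

Listing with `-t` before extracting is a good habit with downloaded archives, since it shows whether the files will land in their own directory or spill into the current one.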
## 8. Globs and regular expressions
You've already used wildcards like `*` and `?` in section 05. These patterns are called *globs*: the shell expands them into matching filenames before the command runs. Here are the most common ones:
| Pattern | Matches | Example |
|---------|---------|---------|
| `*` | Any string of characters | `*.txt` — all text files |
| `?` | Exactly one character | `file?.dat` matches `file1.dat`, `fileA.dat` |
| `[abc]` | One character from the set | `file[123].dat` matches `file1.dat`, `file2.dat`, `file3.dat` |
| `[a-z]` | One character in the range | `[A-Z]*.txt` — files starting with a capital letter |
| `[!abc]` | One character *not* in the set | `file[!0-9].dat` — files where the character isn't a digit |
Globs are expanded by the shell and work with any command (`ls`, `cp`, `rm`, etc.).
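Because the shell does the expansion, `echo` is a harmless way to preview what a pattern will match before handing it to `rm` or `cp`. The file names in this sketch are made up for the demo.

```shell
# Create some empty files to match against.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file1.dat file2.dat fileA.dat file10.dat notes.txt

# ? matches exactly one character, so file10.dat is not included.
echo file?.dat

# A set matches one character from the list.
digits=$(echo file[12].dat)
echo "$digits"

# A leading ! negates the set: one character that is not a digit.
non_digit=$(echo file[!0-9].dat)
echo "$non_digit"

# With no match, bash and sh pass the pattern through unchanged.
# (zsh, the default shell on recent macOS, reports an error instead.)
echo *.csv
```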
### Regular expressions
Regular expressions (or *regex*) are a more powerful pattern language used inside programs like `grep`, `sed`, and many programming languages. Unlike globs, they aren't expanded by the shell — they're interpreted by the program itself.
Here are the basics:
| Pattern | Meaning | Example |
|---------|---------|---------|
| `.` | Any single character | `h.t` matches `hat`, `hit`, `hot` |
| `*` | Zero or more of the preceding character | `ab*c` matches `ac`, `abc`, `abbc` |
| `+` | One or more of the preceding character (extended regex; with `grep`, use `grep -E`) | `ab+c` matches `abc`, `abbc`, but not `ac` |
| `^` | Start of line | `^From` matches lines starting with "From" |
| `$` | End of line | `\.csv$` matches lines ending with ".csv" |
| `[ ]` | Character class (as in globs, but negated with `^` instead of `!`) | `[0-9]+` matches one or more digits |
| `\` | Escape a special character | `\.` matches a literal period |
**Watch out:** `*` means different things in globs and regex! In a glob, `*.txt` means "anything ending in .txt". In a regex, `*` means "zero or more of the previous character". This is a common source of confusion.
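A small experiment makes the table concrete. The word list here is invented for the demo. Note that `+` needs `grep -E` (extended regex), while `.` and `*` work in plain `grep`.

```shell
# One candidate word per line.
demo=$(mktemp -d)
printf 'ac\nabc\nabbc\nabxc\n' > "$demo/words.txt"

# * means zero or more of the preceding character: ac, abc, abbc.
star=$(grep -c '^ab*c$' "$demo/words.txt")

# + means one or more, and needs -E: abc, abbc.
plus=$(grep -E -c '^ab+c$' "$demo/words.txt")

# . matches any single character: abbc, abxc.
dot=$(grep -c '^ab.c$' "$demo/words.txt")

echo "ab*c: $star  ab+c: $plus  ab.c: $dot"
# prints: ab*c: 3  ab+c: 2  ab.c: 2
```

The `^` and `$` anchors force the pattern to cover the whole line. Without them, `grep` reports a line as soon as the pattern matches anywhere inside it.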
Using regex with `grep`:
```
$ grep '^From:' emails.txt # lines starting with "From:"
$ grep '[0-9][0-9]*\.[0-9]' data.txt # lines containing a decimal number
$ grep -i 'error' logfile.txt # case-insensitive search
$ grep -c 'warning' logfile.txt # count matching lines
$ grep -v '^#' config.txt # lines that do NOT start with #
```
The `-i` flag ignores case, `-c` counts matching lines instead of printing them, and `-v` inverts the match (shows non-matching lines). These options combine well with pipes:
```
$ ps aux | grep python # find running Python processes
$ history | grep -c ssh # how many history lines mention ssh?
```
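These flags can be tried on a small made-up config file. The file name and its contents are assumptions for the demo; the sketch strips comment lines with `-v`, counts case-insensitive matches with `-i -c`, and chains two `grep` commands with a pipe.

```shell
# A short config file with comments and settings (hypothetical content).
demo=$(mktemp -d)
cat > "$demo/config.txt" <<'EOF'
# solver settings
tolerance = 1e-6
# Warning: experimental
max_iter = 100
WARNING_LEVEL = 2
EOF

# -v drops the lines that start with #, leaving only the settings.
settings=$(grep -v '^#' "$demo/config.txt")
echo "$settings"

# -i ignores case and -c counts the matching lines.
n_warn=$(grep -i -c 'warning' "$demo/config.txt")
echo "lines mentioning warning: $n_warn"

# Pipe: remove comments first, then count the lines that contain a digit.
n_numeric=$(grep -v '^#' "$demo/config.txt" | grep -c '[0-9]')
echo "settings with a number: $n_numeric"
```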
Regular expressions are a deep subject — you don't need to master them now, but recognizing the basics will help you read documentation and write better `grep` commands.
> **Exercise 1:** If you have access to a remote machine (a department server, for example), try connecting with `ssh`. Use `scp` to copy a file back and forth.

> **Exercise 2:** Use `find` to locate all `.c` files somewhere under your home directory.

> **Exercise 3:** Create a directory with a few files in it. Use `tar -czf` to archive it. Delete the original directory, then extract the archive to restore it.

> **Exercise 4:** Use `curl` or `wget` to download a file from the web. Try downloading a plain text file, like a Project Gutenberg book (e.g., `https://www.gutenberg.org/files/1342/1342-0.txt` for Pride and Prejudice). Use `wc -l` to count the lines, `head` to see the beginning, and `grep` to search for a word.