Six-module walkthrough covering navigation, files, reading/searching, processes/editors, scripting, and advanced tools (ssh, regex, tar, etc.). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# CLI Part VI: Advanced Tools

**CHEG 667-013 — Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware

---

## Key idea

Connect to remote machines, transfer files, and use additional command line tools for everyday tasks.

## Key goals

- Connect to remote machines with `ssh` and transfer files with `scp` and `sftp`
- Download files from the web with `curl` and `wget`
- Find files with `find`, compare them with `diff`, and archive them with `tar`
- Identify commands with `which`
- Understand globs and regular expressions for pattern matching

---

## 1. Remote access with ssh

`ssh` (secure shell) lets you log in to a remote machine over the network. This is how you'll connect to department servers, computing clusters, or cloud machines:

```
$ ssh username@hostname
```

For example, to connect to a UD server:

```
$ ssh ef1j@mahler.che.udel.edu
```

You'll be prompted for your password on the remote machine. Once connected, you'll see a new prompt; everything you type now runs on the remote machine.

To disconnect, type `exit` or press `Ctrl-D`.

### SSH keys

Typing your password every time gets old. You can set up *SSH keys* for password-free login:

```
$ ssh-keygen                         # generate a key pair (press Enter for defaults)
$ ssh-copy-id username@hostname      # copy your public key to the remote machine
```

After this, `ssh` will authenticate using your key instead of a password.

## 2. Transferring files with scp and sftp

Once you're working on remote machines, you'll need to move files back and forth.

### scp: secure copy

`scp` works like `cp`, but copies files over the network:

```
$ scp localfile.txt username@hostname:~/destination/
$ scp username@hostname:~/remotefile.txt ./
```

The first command copies a local file to the remote machine. The second copies a remote file to your current directory. Add `-r` to copy entire directories.

### sftp: secure file transfer

`sftp` gives you an interactive session for browsing and transferring files:

```
$ sftp username@hostname
sftp> ls
sftp> cd results
sftp> get output.csv
sftp> put input.dat
sftp> exit
```

Use `get` to download and `put` to upload. `sftp` is useful when you want to browse what's on the remote machine before transferring.

## 3. Downloading files with curl and wget

Both `curl` and `wget` download files from the web. You've already seen `curl` in the `weather.sh` script.

### curl

`curl` prints the downloaded content to the terminal by default. Use `-o` to save to a file:

```
$ curl -s https://example.com/data.csv -o data.csv
```

The `-s` flag selects "silent" mode, which hides the progress meter and error messages.

### wget

`wget` saves to a file by default, which makes it simpler for straightforward downloads:

```
$ wget https://example.com/data.csv
```

Use `wget -r` to download entire directories or websites recursively. On macOS, `wget` may need to be installed separately (e.g., `brew install wget`), while `curl` is available by default.

## 4. Finding files with find

The `find` command searches for files by name, type, size, modification time, and more. It searches recursively through directories:

```
$ find . -name "*.csv"              # find all CSV files in current directory and below
$ find ~/cheg667 -name "hello.c"    # find a specific file
$ find . -type d                    # find all directories
$ find . -name "*.tmp" -delete      # find and delete all .tmp files (careful!)
```

The first argument is where to start searching. The remaining arguments are tests that filter the results (such as `-name` and `-type`) or actions applied to each match (such as `-delete`).

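To see how the starting directory and the tests interact, here is a small self-contained sketch. It builds a throwaway directory tree and counts what each `find` command returns. The directory and file names (`run1`, `run2`, `data.csv`, and so on) are made up for illustration.

```shell
# Build a scratch directory tree (all names here are hypothetical)
demo=$(mktemp -d)
mkdir -p "$demo/run1" "$demo/run2/logs"
touch "$demo/run1/data.csv" "$demo/run2/data.csv" "$demo/run2/notes.txt"
touch "$demo/run2/logs/old.tmp"

# Test by name: quote the pattern so find, not the shell, interprets it
csv_count=$(find "$demo" -name "*.csv" | wc -l | tr -d ' ')

# Test by type: count directories, including the starting directory itself
dir_count=$(find "$demo" -type d | wc -l | tr -d ' ')

# Combine tests: regular files whose names end in .tmp
tmp_files=$(find "$demo" -type f -name "*.tmp")

echo "CSV files:   $csv_count"
echo "Directories: $dir_count"
echo "Temp files:  $tmp_files"

# Clean up the scratch tree
rm -r "$demo"
```

Running a `find` command without `-delete` first, as done here, is a reasonable way to check what would be removed before actually removing it.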
## 5. Comparing files with diff

`diff` shows the differences between two files, line by line:

```
$ diff file1.txt file2.txt
3c3
< This is the original line.
---
> This is the modified line.
```

Lines starting with `<` are from the first file, `>` from the second. The `3c3` header means that line 3 of the first file was changed (`c`) into line 3 of the second. `diff` is useful for checking what changed between two versions of a file. You'll see it again if you use version control tools like `git`.

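`diff` also reports its result through its exit status: 0 when the files are identical and 1 when they differ. That makes it usable in scripts. The sketch below illustrates this with two made-up files in a scratch directory; the file names and contents are assumptions chosen for the example.

```shell
# Create two small files that differ only on line 3 (contents are made up)
work=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$work/old.txt"
printf 'alpha\nbeta\ndelta\n' > "$work/new.txt"

# diff exits with status 1 when the files differ; record it without
# treating the difference as a script failure
status=0
changes=$(diff "$work/old.txt" "$work/new.txt") || status=$?

echo "$changes"
echo "exit status when files differ: $status"

# Comparing a file with itself prints nothing and exits with status 0
same_status=0
diff "$work/old.txt" "$work/old.txt" || same_status=$?
echo "exit status when files match: $same_status"

rm -r "$work"
```

In a script, this allows a test such as `if diff a.txt b.txt > /dev/null; then ...` to branch on whether two files match.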
## 6. Identifying commands with which

If you're not sure where a command lives or which version you're running, `which` tells you:

```
$ which python
/usr/bin/python
$ which ls
/usr/bin/ls
```

This is especially helpful when you have multiple versions of a program installed (e.g., different Python installations) and need to know which one the shell is using. The paths shown here are examples and will differ between systems.

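When several copies of a program are installed, the shell runs the first one it finds while searching the directories in `PATH` from left to right. The sketch below illustrates this with two scratch directories that each contain a script named `mytool` (a made-up name). It uses the shell built-in `command -v` in place of `which`; this is a substitution made for the example, since `command -v` reports the same location and is available in any POSIX shell.

```shell
# Two scratch directories, each holding a script named "mytool" (hypothetical)
bin_a=$(mktemp -d)
bin_b=$(mktemp -d)
printf '#!/bin/sh\necho "version A"\n' > "$bin_a/mytool"
printf '#!/bin/sh\necho "version B"\n' > "$bin_b/mytool"
chmod +x "$bin_a/mytool" "$bin_b/mytool"

# Save PATH, then put both directories in front, with bin_b first
old_path=$PATH
PATH="$bin_b:$bin_a:$PATH"

# The shell searches PATH left to right, so the copy in bin_b is found first
found=$(command -v mytool)
output=$(mytool)

echo "Shell will run: $found"
echo "It prints:      $output"

# Restore PATH and clean up
PATH=$old_path
rm -r "$bin_a" "$bin_b"
```

Swapping the order of the two directories in `PATH` would make the copy in `bin_a` the one that runs.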
## 7. Archiving files with tar

`tar` bundles files and directories into a single archive. It's the standard way to package things on Unix systems.

Create an archive:

```
$ tar -czf archive.tar.gz my_directory/
```

This creates a compressed archive (`-c` create, `-z` gzip compression, `-f` filename). Extract it with:

```
$ tar -xzf archive.tar.gz
```

Here `-x` means extract. To see what's inside without extracting, use `-t` (list) instead:

```
$ tar -tzf archive.tar.gz
```

You'll encounter `.tar.gz` (or `.tgz`) files frequently when downloading source code or datasets.

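The three operations (create, list, extract) can be tried together on a throwaway directory. The sketch below is one way to do that; the directory and file names are made up, and it uses the `-C` option (change to a directory before archiving or extracting), which is not covered above but is supported by both GNU and BSD `tar`.

```shell
# Make a small directory to archive (names are illustrative)
work=$(mktemp -d)
mkdir -p "$work/project/results"
echo "T,P" > "$work/project/results/run.csv"
echo "notes" > "$work/project/README.txt"

# Create the archive; -C switches into $work first so stored paths start at project/
tar -czf "$work/project.tar.gz" -C "$work" project

# List the contents without extracting
listing=$(tar -tzf "$work/project.tar.gz")
echo "$listing"

# Extract into a separate directory and read one file back
mkdir "$work/restored"
tar -xzf "$work/project.tar.gz" -C "$work/restored"
restored=$(cat "$work/restored/project/results/run.csv")
echo "Restored file contains: $restored"

rm -r "$work"
```

Listing an unfamiliar archive with `-t` before extracting it shows whether its contents will land in a single directory or be scattered into the current one.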
## 8. Globs and regular expressions

You've already used wildcards like `*` and `?` in section 05. These patterns are called *globs*: the shell expands them into matching filenames before the command runs. Here's the full set:

| Pattern | Matches | Example |
|---------|---------|---------|
| `*` | Any string of characters (including none) | `*.txt`: all files ending in `.txt` |
| `?` | Exactly one character | `file?.dat`: `file1.dat`, `fileA.dat` |
| `[abc]` | One character from the set | `file[123].dat`: `file1.dat`, `file2.dat`, `file3.dat` |
| `[a-z]` | One character in the range | `[A-Z]*.txt`: files starting with a capital letter |
| `[!abc]` | One character *not* in the set | `file[!0-9].dat`: files where the character isn't a digit |

Globs are expanded by the shell and work with any command (`ls`, `cp`, `rm`, etc.).

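Because the shell does the expansion, `echo` is a convenient way to preview what a glob will match before handing it to a command like `rm`. The sketch below tries the patterns from the table in a scratch directory; the file names are made up for the example.

```shell
# Work in a scratch directory with a known set of files (names are made up)
start_dir=$PWD
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file1.dat file2.dat file3.dat fileA.dat file10.dat notes.txt

# The shell expands each pattern before echo runs; echo just prints the result
any_dat=$(echo *.dat)
one_char=$(echo file?.dat)
from_set=$(echo file[123].dat)
not_digit=$(echo file[!0-9].dat)

# Quoting a pattern turns expansion off, so the pattern is passed through literally
quoted=$(echo "*.dat")

echo "*.dat          -> $any_dat"
echo "file?.dat      -> $one_char"
echo "file[123].dat  -> $from_set"
echo "file[!0-9].dat -> $not_digit"
echo "quoted         -> $quoted"

# Return to where we started and clean up
cd "$start_dir" || exit 1
rm -r "$demo"
```

Note that `file?.dat` does not match `file10.dat`, because `?` stands for exactly one character. The quoted case is the reason `find . -name "*.csv"` puts its pattern in quotes: the pattern has to reach `find` unexpanded.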
### Regular expressions

Regular expressions (or *regex*) are a more powerful pattern language used inside programs like `grep`, `sed`, and many programming languages. Unlike globs, they aren't expanded by the shell; they're interpreted by the program itself.

Here are the basics:

| Pattern | Meaning | Example |
|---------|---------|---------|
| `.` | Any single character | `h.t` matches `hat`, `hit`, `hot` |
| `*` | Zero or more of the preceding character | `ab*c` matches `ac`, `abc`, `abbc` |
| `+` | One or more of the preceding character (with `grep`, requires the `-E` option) | `ab+c` matches `abc`, `abbc`, but not `ac` |
| `^` | Start of line | `^From` matches lines starting with "From" |
| `$` | End of line | `\.csv$` matches lines ending with ".csv" |
| `[ ]` | Character class (like glob sets, but negated with `^` instead of `!`) | `[0-9]+` matches one or more digits |
| `\` | Escape a special character | `\.` matches a literal period |

**Watch out:** `*` means different things in globs and regex! In a glob, `*.txt` means "anything ending in .txt". In a regex, `*` means "zero or more of the previous character". This is a common source of confusion.

Using regex with `grep`:

```
$ grep '^From:' emails.txt              # lines starting with "From:"
$ grep '[0-9][0-9]*\.[0-9]' data.txt    # lines containing a decimal number
$ grep -i 'error' logfile.txt           # case-insensitive search
$ grep -c 'warning' logfile.txt         # count matching lines
$ grep -v '^#' config.txt               # lines that do NOT start with #
```

The `-i` flag ignores case, `-c` counts matching lines instead of printing them, and `-v` inverts the match (shows non-matching lines). These options combine well with pipes:

```
$ ps aux | grep python       # find running Python processes
$ history | grep -c ssh      # how many times have I used ssh?
```

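One detail is worth checking by hand: `grep` uses basic regular expressions by default, where `+` is not a repetition operator, and the `-E` option switches to extended syntax. The sketch below counts matches for a few of the patterns above against a small sample file whose contents are made up for illustration.

```shell
# A small sample file (contents are made up for illustration)
data=$(mktemp)
printf '%s\n' 'ac' 'abc' 'abbc' 'T = 298.15 K' 'n = 42' '# comment' > "$data"

# * works in grep's default (basic) syntax: "a", zero or more "b", then "c"
star=$(grep -c 'ab*c' "$data")

# + needs extended syntax (-E): one or more "b", so "ac" no longer matches
plus=$(grep -E -c 'ab+c' "$data")

# Digits, a literal period, then a digit: only the line with 298.15 matches
decimal=$(grep -c '[0-9][0-9]*\.[0-9]' "$data")

# -v inverts the match: count lines that do not start with #
not_comment=$(grep -v -c '^#' "$data")

echo "ab*c: $star   ab+c: $plus   decimal: $decimal   non-comment: $not_comment"

rm "$data"
```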
Regular expressions are a deep subject. You don't need to master them now, but recognizing the basics will help you read documentation and write better `grep` commands.

> **Exercise 1:** If you have access to a remote machine (a department server, for example), try connecting with `ssh`. Use `scp` to copy a file back and forth.

> **Exercise 2:** Use `find` to locate all `.c` files somewhere under your home directory.

> **Exercise 3:** Create a directory with a few files in it. Use `tar -czf` to archive it. Delete the original directory, then extract the archive to restore it.

> **Exercise 4:** Use `curl` or `wget` to download a file from the web. Try downloading a plain text file, like a Project Gutenberg book (e.g., `https://www.gutenberg.org/files/1342/1342-0.txt` for Pride and Prejudice). Use `wc -l` to count the lines, `head` to see the beginning, and `grep` to search for a word.