CLI Part VI: Advanced Tools
CHEG 667-013 — Chemical Engineering with Computers
Department of Chemical and Biomolecular Engineering, University of Delaware
Key idea
Connect to remote machines, transfer files, and use additional command line tools for everyday tasks.
Key goals
- Connect to remote machines with ssh and transfer files with scp and sftp
- Download files from the web with curl and wget
- Find files with find, compare them with diff, and archive them with tar
- Identify commands with which
- Understand globs and regular expressions for pattern matching
1. Remote access with ssh
ssh (secure shell) lets you log in to a remote machine over the network. This is how you'll connect to department servers, computing clusters, or cloud machines:
$ ssh username@hostname
For example, to connect to a UD server:
$ ssh ef1j@mahler.che.udel.edu
You'll be prompted for your password on the remote machine. Once connected, you'll see a new prompt — everything you type now runs on the remote machine.
To disconnect, type exit or press Ctrl-D.
SSH keys
Typing your password every time gets old. You can set up SSH keys for password-free login:
$ ssh-keygen # generate a key pair (press Enter for defaults)
$ ssh-copy-id username@hostname # copy your public key to the remote machine
After this, ssh will authenticate using your key instead of a password.
2. Transferring files with scp and sftp
Once you're working on remote machines, you'll need to move files back and forth.
scp — secure copy
scp works like cp, but copies files over the network:
$ scp localfile.txt username@hostname:~/destination/
$ scp username@hostname:~/remotefile.txt ./
The first command copies a local file to the remote machine. The second copies a remote file to your current directory. Add -r to copy entire directories.
sftp — secure file transfer
sftp gives you an interactive session for browsing and transferring files:
$ sftp username@hostname
sftp> ls
sftp> cd results
sftp> get output.csv
sftp> put input.dat
sftp> exit
Use get to download and put to upload. It's useful when you want to browse what's on the remote machine before transferring.
3. Downloading files with curl and wget
Both curl and wget download files from the web. You've already seen curl in the weather.sh script.
curl
curl prints the downloaded content to the terminal by default. Use -o to save to a file:
$ curl -s https://example.com/data.csv -o data.csv
The -s flag is for "silent" mode (no progress bar).
wget
wget saves to a file by default, which makes it simpler for straightforward downloads:
$ wget https://example.com/data.csv
Use wget -r to download entire directories or websites recursively. On macOS, wget may need to be installed separately (e.g., brew install wget), while curl is available by default.
4. Finding files with find
The find command searches for files by name, type, size, modification time, and more. It searches recursively through directories:
$ find . -name "*.csv" # find all CSV files in current directory and below
$ find ~/cheg667 -name "hello.c" # find a specific file
$ find . -type d # find all directories
$ find . -name "*.tmp" -delete # find and delete all .tmp files (careful!)
The first argument is where to start searching. The remaining arguments are filters.
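As a small sketch of how the starting directory and the filters combine, the script below builds a throwaway directory tree and runs two searches on it. The file and directory names (data, old, run1.csv, and so on) are made up for illustration and are not part of the course materials.

```shell
#!/bin/bash
# Build a scratch directory tree (all names here are hypothetical).
workdir=$(mktemp -d)
mkdir -p "$workdir/data/old"
touch "$workdir/data/run1.csv" "$workdir/data/old/run0.csv" "$workdir/notes.txt"

# Start at $workdir and keep only regular files whose names end in .csv.
# tr strips the padding that some versions of wc add around the number.
csv_count=$(find "$workdir" -type f -name "*.csv" | wc -l | tr -d ' ')
echo "CSV files found: $csv_count"

# Filters combine: here, directories only, and only those named "old".
old_dir=$(find "$workdir" -type d -name "old")
echo "Directory named old: $old_dir"

# Remove the scratch tree.
rm -rf "$workdir"
```

Quoting the pattern ("*.csv") matters: it keeps the shell from expanding the wildcard before find sees it.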
5. Comparing files with diff
diff shows the differences between two files, line by line:
$ diff file1.txt file2.txt
3c3
< This is the original line.
---
> This is the modified line.
Lines starting with < are from the first file, > from the second. diff is useful for checking what changed between two versions of a file. You'll see it again if you use version control tools like git.
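To try this on known input, the following sketch creates two three-line files that differ only on line 3 and compares them. The file contents are invented for the example. It also uses the exit status of diff, which is 0 when the files match and 1 when they differ.

```shell
#!/bin/bash
# Two scratch files that differ only on their third line.
workdir=$(mktemp -d)
printf 'line one\nline two\nThis is the original line.\n' > "$workdir/file1.txt"
printf 'line one\nline two\nThis is the modified line.\n' > "$workdir/file2.txt"

# diff succeeds (status 0) only when the files are identical,
# so the else branch runs here and records the nonzero status.
if diff_output=$(diff "$workdir/file1.txt" "$workdir/file2.txt"); then
    diff_status=0
    echo "files are identical"
else
    diff_status=$?
    echo "$diff_output"
fi
echo "diff exit status: $diff_status"

rm -rf "$workdir"
```

The header 3c3 reads as "line 3 of the first file was changed into line 3 of the second file".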
6. Identifying commands with which
If you're not sure where a command lives or which version you're running, which tells you:
$ which python
/usr/bin/python
$ which ls
/usr/bin/ls
This is especially helpful when you have multiple versions of a program installed (e.g., different Python installations) and need to know which one the shell is using.
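The answer which gives depends on the order of the directories in your PATH. A minimal sketch of that behavior, using a made-up command name (mytool) placed in a scratch directory:

```shell
#!/bin/bash
# Create a scratch directory holding a tiny executable script.
# The command name "mytool" is hypothetical.
bindir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from mytool"\n' > "$bindir/mytool"
chmod +x "$bindir/mytool"

# Put the scratch directory at the front of PATH (for this script only).
PATH="$bindir:$PATH"

# which reports the first match found while walking PATH from left to right.
found=$(which mytool)
echo "$found"

rm -rf "$bindir"
```

If two directories in PATH both contain a program with the same name, the one listed earlier wins, and that is the one which prints.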
7. Archiving files with tar
tar bundles files and directories into a single archive. It's the standard way to package things on Unix systems.
Create an archive:
$ tar -czf archive.tar.gz my_directory/
This creates a compressed archive (-c create, -z gzip compression, -f filename). Extract it with:
$ tar -xzf archive.tar.gz
(-x for extract). To see what's inside without extracting:
$ tar -tzf archive.tar.gz
You'll encounter .tar.gz (or .tgz) files frequently when downloading source code or datasets.
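The three commands above can be tried together in a scratch directory. This is only a sketch: the directory and file names are placeholders, and the archive is unpacked into a second, empty directory so the original stays untouched.

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Make a small directory to archive (names are hypothetical).
mkdir my_directory
echo "T,P" > my_directory/data.csv
echo "notes" > my_directory/readme.txt

# -c create, -z gzip, -f archive file name
tar -czf archive.tar.gz my_directory/

# -t lists the contents without extracting anything
listing=$(tar -tzf archive.tar.gz)
echo "$listing"

# -x extracts into the current directory, so move to an empty one first
mkdir unpacked
cd unpacked || exit 1
tar -xzf ../archive.tar.gz
restored=$(cat my_directory/data.csv)
echo "restored data.csv contains: $restored"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Listing with -t before extracting is a good habit: it shows whether the archive unpacks into its own directory or scatters files into the current one.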
8. Globs and regular expressions
You've already used wildcards like * and ? in section 05. These patterns are called globs — the shell expands them into matching filenames before the command runs. Here's the full set:
| Pattern | Matches | Example |
|---|---|---|
| * | Any string of characters | *.txt: all text files |
| ? | Exactly one character | file?.dat: file1.dat, fileA.dat |
| [abc] | One character from the set | file[123].dat: file1.dat, file2.dat, file3.dat |
| [a-z] | One character in the range | [A-Z]*.txt: files starting with a capital letter |
| [!abc] | One character not in the set | file[!0-9].dat: files where the character isn't a digit |
Globs are expanded by the shell and work with any command (ls, cp, rm, etc.).
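Because the shell does the expansion, echo is a safe way to preview what a glob will match before handing it to a command like rm. A short sketch, with made-up file names in a scratch directory:

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch file1.dat file2.dat fileA.dat file10.dat notes.txt

# echo only prints its arguments, so it shows what the shell expanded.
any_one=$(echo file?.dat)          # exactly one character after "file"
from_set=$(echo file[12].dat)      # one character from the set 1, 2
not_digit=$(echo file[!0-9].dat)   # one character that is not a digit
quoted=$(echo "file?.dat")         # quotes switch the expansion off

echo "file?.dat       -> $any_one"     # file1.dat file2.dat fileA.dat
echo "file[12].dat    -> $from_set"    # file1.dat file2.dat
echo "file[!0-9].dat  -> $not_digit"   # fileA.dat
echo "\"file?.dat\"     -> $quoted"      # file?.dat

cd / && rm -rf "$workdir"
```

Note that file10.dat never matches file?.dat, since ? stands for exactly one character.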
Regular expressions
Regular expressions (or regex) are a more powerful pattern language used inside programs like grep, sed, and many programming languages. Unlike globs, they aren't expanded by the shell — they're interpreted by the program itself.
Here are the basics:
| Pattern | Meaning | Example |
|---|---|---|
| . | Any single character | h.t matches hat, hit, hot |
| * | Zero or more of the preceding character | ab*c matches ac, abc, abbc |
| + | One or more of the preceding character (in grep, requires the -E option) | ab+c matches abc, abbc, but not ac |
| ^ | Start of line | ^From matches lines starting with "From" |
| $ | End of line | \.csv$ matches lines ending with ".csv" |
| [ ] | Character class (same as globs) | [0-9]+ matches one or more digits |
| \ | Escape a special character | \. matches a literal period |
Watch out: * means different things in globs and regex! In a glob, *.txt means "anything ending in .txt". In a regex, * means "zero or more of the previous character". This is a common source of confusion.
Using regex with grep:
$ grep '^From:' emails.txt # lines starting with "From:"
$ grep '[0-9][0-9]*\.[0-9]' data.txt # lines containing a decimal number
$ grep -i 'error' logfile.txt # case-insensitive search
$ grep -c 'warning' logfile.txt # count matching lines
$ grep -v '^#' config.txt # lines that do NOT start with #
The -i flag ignores case, -c counts matching lines instead of printing them, and -v inverts the match (shows non-matching lines). These options combine well with pipes:
$ ps aux | grep python # find running Python processes
$ history | grep -c ssh # how many times have I used ssh?
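To check these patterns against input where the answer is known, the sketch below writes a five-line sample file and counts the matches for each pattern. The file contents are invented for the example. The last command uses grep -E (extended regular expressions), which is what makes + mean "one or more".

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Sample input; the contents are made up for illustration.
cat > "$workdir/data.txt" <<'EOF'
# temperature log
T = 298.15 K
P = 1 atm
Error: sensor offline
error: retrying
EOF

# Lines containing a decimal number (only the 298.15 line).
decimals=$(grep -c '[0-9][0-9]*\.[0-9]' "$workdir/data.txt")

# Case-insensitive count: matches both "Error" and "error".
errors=$(grep -ci 'error' "$workdir/data.txt")

# Lines that do NOT start with #.
non_comments=$(grep -vc '^#' "$workdir/data.txt")

# With -E, + means one or more: a whole number followed by a space.
whole_numbers=$(grep -cE '^[A-Z] = [0-9]+ ' "$workdir/data.txt")

echo "decimal lines:     $decimals"
echo "error lines:       $errors"
echo "non-comment lines: $non_comments"
echo "whole-number rows: $whole_numbers"

rm -rf "$workdir"
```

Single quotes around each pattern keep the shell from treating characters such as * and $ as its own syntax, so grep receives the pattern unchanged.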
Regular expressions are a deep subject — you don't need to master them now, but recognizing the basics will help you read documentation and write better grep commands.
Exercise 1: If you have access to a remote machine (a department server, for example), try connecting with ssh. Use scp to copy a file back and forth.
Exercise 2: Use find to locate all .c files somewhere under your home directory.
Exercise 3: Create a directory with a few files in it. Use tar -czf to archive it. Delete the original directory, then extract the archive to restore it.
Exercise 4: Use curl or wget to download a file from the web. Try downloading a plain text file, like a Project Gutenberg book (e.g., https://www.gutenberg.org/files/1342/1342-0.txt for Pride and Prejudice). Use wc -l to count the lines, head to see the beginning, and grep to search for a word.