cli-walkthrough/06-advanced/README.md
Eric c57d7539d8 Initial commit: CLI walkthrough for CHEG 667-013
Six-module walkthrough covering navigation, files, reading/searching,
processes/editors, scripting, and advanced tools (ssh, regex, tar, etc.).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 21:54:48 -04:00


CLI Part VI: Advanced Tools

CHEG 667-013 — Chemical Engineering with Computers
Department of Chemical and Biomolecular Engineering, University of Delaware


Key idea

Connect to remote machines, transfer files, and use additional command line tools for everyday tasks.

Key goals

  • Connect to remote machines with ssh and transfer files with scp and sftp
  • Download files from the web with curl and wget
  • Find files with find, compare them with diff, and archive them with tar
  • Identify commands with which
  • Understand globs and regular expressions for pattern matching

1. Remote access with ssh

ssh (secure shell) lets you log in to a remote machine over the network. This is how you'll connect to department servers, computing clusters, or cloud machines:

$ ssh username@hostname

For example, to connect to a UD server:

$ ssh ef1j@mahler.che.udel.edu

You'll be prompted for your password on the remote machine. Once connected, you'll see a new prompt — everything you type now runs on the remote machine.

To disconnect, type exit or press Ctrl-D.

SSH keys

Typing your password every time gets old. You can set up SSH keys for password-free login:

$ ssh-keygen                       # generate a key pair (press Enter for defaults)
$ ssh-copy-id username@hostname    # copy your public key to the remote machine

After this, ssh will authenticate using your key instead of a password.

2. Transferring files with scp and sftp

Once you're working on remote machines, you'll need to move files back and forth.

scp — secure copy

scp works like cp, but copies files over the network:

$ scp localfile.txt username@hostname:~/destination/
$ scp username@hostname:~/remotefile.txt ./

The first command copies a local file to the remote machine. The second copies a remote file to your current directory. Add -r to copy entire directories.

sftp — secure file transfer

sftp gives you an interactive session for browsing and transferring files:

$ sftp username@hostname
sftp> ls
sftp> cd results
sftp> get output.csv
sftp> put input.dat
sftp> exit

Use get to download and put to upload. It's useful when you want to browse what's on the remote machine before transferring.

3. Downloading files with curl and wget

Both curl and wget download files from the web. You've already seen curl in the weather.sh script.

curl

curl prints the downloaded content to the terminal by default. Use -o to save to a file:

$ curl -s https://example.com/data.csv -o data.csv

The -s flag is for "silent" mode (no progress bar).
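
To experiment with -s and -o without relying on a network connection, one option is to point curl at a local file through a file:// URL. The sketch below is only an illustration: the file names and contents are made up, and it assumes curl and mktemp are installed.

```shell
# Create a small local file to stand in for something on the web (contents are made up).
workdir=$(mktemp -d)
printf 'T,P\n300,1.0\n350,1.2\n' > "$workdir/source.csv"

# Same flags as above; only the URL scheme differs (file:// instead of https://).
curl -s "file://$workdir/source.csv" -o "$workdir/data.csv"
curl_status=$?      # 0 means curl reported success

# Count the lines that arrived in the saved copy.
lines=$(wc -l < "$workdir/data.csv")
echo "curl exit status: $curl_status, lines saved: $lines"
rm -rf "$workdir"
```

Checking the exit status after a download is a cheap way to catch failures in scripts, since -s also hides error messages.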

wget

wget saves to a file by default, which makes it simpler for straightforward downloads:

$ wget https://example.com/data.csv

Use wget -r to download entire directories or websites recursively. On macOS, wget may need to be installed separately (e.g., brew install wget), while curl is available by default.

4. Finding files with find

The find command searches for files by name, type, size, modification time, and more. It searches recursively through directories:

$ find . -name "*.csv"               # find all CSV files in current directory and below
$ find ~/cheg667 -name "hello.c"     # find a specific file
$ find . -type d                     # find all directories
$ find . -name "*.tmp" -delete       # find and delete all .tmp files (careful!)

The first argument is where to start searching. The remaining arguments are filters.
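
A low-risk way to try these filters is inside a throwaway directory. The sketch below is an illustration rather than part of the course files: every directory and file name is made up, and it assumes mktemp is available and that your find supports -delete (GNU and BSD find both do).

```shell
# Build a scratch directory tree (all names here are hypothetical).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/data/run1" "$sandbox/notes"
touch "$sandbox/data/a.csv" "$sandbox/data/run1/b.csv" \
      "$sandbox/notes/todo.txt" "$sandbox/data/scratch.tmp"

# Count CSV files anywhere below the starting point.
csv_count=$(find "$sandbox" -name "*.csv" | wc -l)

# Preview what -delete would remove by running the same filters without it.
find "$sandbox" -name "*.tmp"

# Now delete, then confirm that nothing matches any more.
find "$sandbox" -name "*.tmp" -delete
tmp_left=$(find "$sandbox" -name "*.tmp" | wc -l)

echo "CSV files: $csv_count, .tmp files left: $tmp_left"
rm -rf "$sandbox"
```

Running the search once without -delete, as above, is a good habit: you see exactly which files the filters select before anything is removed.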

5. Comparing files with diff

diff shows the differences between two files, line by line:

$ diff file1.txt file2.txt
3c3
< This is the original line.
---
> This is the modified line.

Lines starting with < are from the first file, > from the second. diff is useful for checking what changed between two versions of a file. You'll see it again if you use version control tools like git.
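
diff also reports its result through the exit status, which is convenient in scripts. The sketch below uses two made-up files in a temporary directory and assumes your diff supports the common -u and -q options.

```shell
# Two small files that differ only on line 3 (contents are made up).
workdir=$(mktemp -d)
printf 'alpha\nbeta\nThis is the original line.\n' > "$workdir/file1.txt"
printf 'alpha\nbeta\nThis is the modified line.\n' > "$workdir/file2.txt"

# Plain diff, as in the example above.
diff "$workdir/file1.txt" "$workdir/file2.txt"
status=$?      # 0 = identical, 1 = different, 2 or more = trouble (e.g. missing file)

# -u prints the "unified" format, the same style git shows.
diff -u "$workdir/file1.txt" "$workdir/file2.txt"

# -q only reports whether the files differ, without listing the lines.
if diff -q "$workdir/file1.txt" "$workdir/file2.txt" > /dev/null; then
    verdict="identical"
else
    verdict="different"
fi
echo "exit status: $status, verdict: $verdict"
rm -rf "$workdir"
```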

6. Identifying commands with which

If you're not sure where a command lives or which version you're running, which tells you:

$ which python
/usr/bin/python
$ which ls
/usr/bin/ls

This is especially helpful when you have multiple versions of a program installed (e.g., different Python installations) and need to know which one the shell is using.
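
Shells also have built-in ways of asking the same question. A short sketch, assuming a bash-like shell: command -v is specified by POSIX, while type -a is a bash builtin that lists every match it finds. The name no-such-tool-667 is made up and should not exist on your system.

```shell
# command -v prints how the shell resolves a name (a path for external programs).
ls_path=$(command -v ls)
echo "ls resolves to: $ls_path"

# type -a (bash) lists every match, in the order the shell would try them.
type -a ls

# A missing command gives a non-zero exit status, which scripts can test for.
if command -v no-such-tool-667 > /dev/null 2>&1; then
    found="yes"
else
    found="no"
fi
echo "no-such-tool-667 found? $found"
```

If type -a shows more than one path for a program, the first one listed is the one that runs when you type the bare name.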

7. Archiving files with tar

tar bundles files and directories into a single archive. It's the standard way to package things on Unix systems.

Create an archive:

$ tar -czf archive.tar.gz my_directory/

This creates a compressed archive (-c create, -z gzip compression, -f filename). Extract it with:

$ tar -xzf archive.tar.gz

(-x for extract). To see what's inside without extracting:

$ tar -tzf archive.tar.gz

You'll encounter .tar.gz (or .tgz) files frequently when downloading source code or datasets.
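
The three commands above can be tried as a round trip in a scratch directory. This is a sketch with made-up file names; it adds one option not covered above, -C, which tells tar to change into a directory before creating or extracting.

```shell
# Make a small directory to archive (names and contents are hypothetical).
workdir=$(mktemp -d)
mkdir -p "$workdir/my_directory/results" "$workdir/restored"
echo "T,P"   > "$workdir/my_directory/results/output.csv"
echo "notes" > "$workdir/my_directory/notes.txt"

# -C DIR makes tar change into DIR first, so the archive stores the relative
# path my_directory/... rather than the full temporary path.
tar -czf "$workdir/archive.tar.gz" -C "$workdir" my_directory

# List the contents, then extract into a separate directory.
tar -tzf "$workdir/archive.tar.gz"
tar -xzf "$workdir/archive.tar.gz" -C "$workdir/restored"

# diff -r compares two directory trees; exit status 0 means they match.
if diff -r "$workdir/my_directory" "$workdir/restored/my_directory" > /dev/null; then
    roundtrip="ok"
else
    roundtrip="mismatch"
fi
echo "round trip: $roundtrip"
rm -rf "$workdir"
```

Extracting into a separate directory, rather than on top of the original, makes it easy to check the archive before you delete anything.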

8. Globs and regular expressions

You've already used wildcards like * and ? in section 05. These patterns are called globs: the shell expands them into matching filenames before the command runs. Here are the most common ones:

Pattern   Matches                         Example
*         Any string of characters        *.txt matches all text files
?         Exactly one character           file?.dat matches file1.dat, fileA.dat
[abc]     One character from the set      file[123].dat matches file1.dat, file2.dat, file3.dat
[a-z]     One character in the range      [A-Z]*.txt matches files starting with a capital letter
[!abc]    One character not in the set    file[!0-9].dat matches files where that character isn't a digit

Globs are expanded by the shell and work with any command (ls, cp, rm, etc.).
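
Because the shell does the expansion, echo is a convenient way to preview what a pattern will match before handing it to a command like rm. The sketch below uses made-up file names in a temporary directory.

```shell
# Scratch directory with a few made-up file names.
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch file1.dat file2.dat file3.dat fileA.dat file10.dat notes.txt

one_char=$(echo file?.dat)        # ? is exactly one character, so file10.dat is skipped
in_set=$(echo file[123].dat)      # one character from the set 1, 2, 3
not_digit=$(echo file[!0-9].dat)  # one character that is not a digit
quoted=$(echo "*.txt")            # quotes switch expansion off: echo sees the literal pattern

echo "file?.dat      -> $one_char"
echo "file[123].dat  -> $in_set"
echo "file[!0-9].dat -> $not_digit"
echo "\"*.txt\"        -> $quoted"

cd "$start_dir" && rm -rf "$workdir"
```

The quoted case matters later: patterns meant for find or grep are usually quoted so the shell passes them through untouched.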

Regular expressions

Regular expressions (or regex) are a more powerful pattern language used inside programs like grep, sed, and many programming languages. Unlike globs, they aren't expanded by the shell — they're interpreted by the program itself.

Here are the basics:

Pattern   Meaning                                    Example
.         Any single character                       h.t matches hat, hit, hot
*         Zero or more of the preceding character    ab*c matches ac, abc, abbc
+         One or more of the preceding character     ab+c matches abc, abbc, but not ac
^         Start of line                              ^From matches lines starting with "From"
$         End of line                                \.csv$ matches lines ending with ".csv"
[ ]       Character class                            [0-9]+ matches one or more digits
\         Escape a special character                 \. matches a literal period

Two details to keep in mind: + belongs to the extended regex syntax, so with grep use the -E option (grep -E 'ab+c'); and character classes work as in globs, except that negation is written [^abc] rather than [!abc].

Watch out: * means different things in globs and regex! In a glob, *.txt means "anything ending in .txt". In a regex, * means "zero or more of the previous character". This is a common source of confusion.
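
The difference is easy to see side by side. In this sketch the sample strings are made up; the shell's case statement is used for the glob because its patterns follow glob rules, and grep handles the regex.

```shell
# Sample lines (made up), fed to grep on standard input.
sample=$(printf 'ac\nabc\nabbc\nabxc\n')

# Regex: b* means "zero or more b", so ^ab*c$ matches ac, abc, and abbc, but not abxc.
star_matches=$(printf '%s\n' "$sample" | grep -c '^ab*c$')

# Regex: + needs extended syntax, which grep enables with -E.
plus_matches=$(printf '%s\n' "$sample" | grep -Ec '^ab+c$')

# Without -E, grep uses basic syntax, where a bare + is a literal plus sign.
basic_plus=$(printf '%s\n' "$sample" | grep -c '^ab+c$')

# Glob: * means "any string", so ab*c DOES match abxc.
case "abxc" in
    ab*c) glob_result="match" ;;
    *)    glob_result="no match" ;;
esac

echo "regex ab*c: $star_matches lines; regex ab+c with -E: $plus_matches; without -E: $basic_plus"
echo "glob ab*c against abxc: $glob_result"
```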

Using regex with grep:

$ grep '^From:' emails.txt              # lines starting with "From:"
$ grep '[0-9][0-9]*\.[0-9]' data.txt    # lines containing a decimal number
$ grep -i 'error' logfile.txt           # case-insensitive search
$ grep -c 'warning' logfile.txt         # count matching lines
$ grep -v '^#' config.txt               # lines that do NOT start with #

The -i flag ignores case, -c counts matching lines instead of printing them, and -v inverts the match (shows non-matching lines). These options combine well with pipes:

$ ps aux | grep python                  # find running Python processes
$ history | grep -c ssh                 # how many times have I used ssh?
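
The flags can also be combined in a single option group. The sketch below runs them against a made-up log file; it adds -n, which prefixes each matching line with its line number.

```shell
# A made-up log file to search.
workdir=$(mktemp -d)
cat > "$workdir/logfile.txt" <<'EOF'
# run started
step 1 ok
WARNING: low pressure
step 2 ok
Error: solver did not converge
warning: retrying
EOF

errors=$(grep -ci 'error' "$workdir/logfile.txt")      # count lines, ignoring case
warnings=$(grep -ci 'warning' "$workdir/logfile.txt")
non_comments=$(grep -vc '^#' "$workdir/logfile.txt")   # count lines NOT starting with #

# -n shows where each match is, which helps when you open the file in an editor.
grep -in 'warning' "$workdir/logfile.txt"

echo "errors: $errors, warnings: $warnings, non-comment lines: $non_comments"
rm -rf "$workdir"
```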

Regular expressions are a deep subject — you don't need to master them now, but recognizing the basics will help you read documentation and write better grep commands.

Exercise 1: If you have access to a remote machine (a department server, for example), try connecting with ssh. Use scp to copy a file back and forth.

Exercise 2: Use find to locate all .c files somewhere under your home directory.

Exercise 3: Create a directory with a few files in it. Use tar -czf to archive it. Delete the original directory, then extract the archive to restore it.

Exercise 4: Use curl or wget to download a file from the web. Try downloading a plain text file, like a Project Gutenberg book (e.g., https://www.gutenberg.org/files/1342/1342-0.txt for Pride and Prejudice). Use wc -l to count the lines, head to see the beginning, and grep to search for a word.