# CLI Part VI: Advanced Tools

**CHEG 667-013: Chemical Engineering with Computers**
Department of Chemical and Biomolecular Engineering, University of Delaware

---

## Key idea

Connect to remote machines, transfer files, and use additional command line tools for everyday tasks.

## Key goals

- Connect to remote machines with `ssh` and transfer files with `scp` and `sftp`
- Download files from the web with `curl` and `wget`
- Find files with `find`, compare them with `diff`, and archive them with `tar`
- Identify commands with `which`
- Understand globs and regular expressions for pattern matching

---

## 1. Remote access with ssh

`ssh` (secure shell) lets you log in to a remote machine over the network. This is how you'll connect to department servers, computing clusters, or cloud machines:

```
$ ssh username@hostname
```

For example, to connect to a UD server:

```
$ ssh ef1j@mahler.che.udel.edu
```

You'll be prompted for your password on the remote machine. Once connected, you'll see a new prompt, and everything you type now runs on the remote machine. To disconnect, type `exit` or press `Ctrl-D`.

### SSH keys

Typing your password every time gets old. You can set up *SSH keys* for password-free login:

```
$ ssh-keygen                       # generate a key pair (press Enter for defaults)
$ ssh-copy-id username@hostname    # copy your public key to the remote machine
```

After this, `ssh` will authenticate using your key instead of a password.

## 2. Transferring files with scp and sftp

Once you're working on remote machines, you'll need to move files back and forth.

### scp: secure copy

`scp` works like `cp`, but copies files over the network:

```
$ scp localfile.txt username@hostname:~/destination/
$ scp username@hostname:~/remotefile.txt ./
```

The first command copies a local file to the remote machine. The second copies a remote file to your current directory. Add `-r` to copy entire directories.

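
As a rough illustration of `-r` in both directions, the sketch below prints the `scp` commands rather than running them, so you can check the paths before anything crosses the network. The helper `show` and the directory names `my_project`, `backup`, and `results` are hypothetical, chosen only for this example; `username@hostname` is the same placeholder used above.

```shell
# Dry-run sketch: print each scp command instead of running it,
# so nothing is sent over the network while you check the paths.
remote="username@hostname"   # placeholder account and host (replace with yours)

# Hypothetical helper: echoes the command with a "+ " prefix.
# Replace the body with "$@" to execute the command for real.
show() { echo "+ $*"; }

show scp -r my_project "${remote}:~/backup/"   # upload a whole directory
show scp -r "${remote}:~/results" ./           # download a whole directory
```

The `~` after the colon is left for the remote machine to interpret, which is why it sits inside the quotes.
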
### sftp: secure file transfer

`sftp` gives you an interactive session for browsing and transferring files:

```
$ sftp username@hostname
sftp> ls
sftp> cd results
sftp> get output.csv
sftp> put input.dat
sftp> exit
```

Use `get` to download and `put` to upload. It's useful when you want to browse what's on the remote machine before transferring.

## 3. Downloading files with curl and wget

Both `curl` and `wget` download files from the web. You've already seen `curl` in the `weather.sh` script.

### curl

`curl` prints the downloaded content to the terminal by default. Use `-o` to save to a file:

```
$ curl -s https://example.com/data.csv -o data.csv
```

The `-s` flag is for "silent" mode (no progress bar).

### wget

`wget` saves to a file by default, which makes it simpler for straightforward downloads:

```
$ wget https://example.com/data.csv
```

Use `wget -r` to download entire directories or websites recursively.

On macOS, `wget` may need to be installed separately (e.g., `brew install wget`), while `curl` is available by default.

## 4. Finding files with find

The `find` command searches for files by name, type, size, modification time, and more. It searches recursively through directories:

```
$ find . -name "*.csv"              # find all CSV files in current directory and below
$ find ~/cheg667 -name "hello.c"    # find a specific file
$ find . -type d                    # find all directories
$ find . -name "*.tmp" -delete      # find and delete all .tmp files (careful!)
```

The first argument is where to start searching. The remaining arguments are filters.

## 5. Comparing files with diff

`diff` shows the differences between two files, line by line:

```
$ diff file1.txt file2.txt
3c3
< This is the original line.
---
> This is the modified line.
```

Lines starting with `<` are from the first file, `>` from the second. `diff` is useful for checking what changed between two versions of a file. You'll see it again if you use version control tools like `git`.

## 6. Identifying commands with which

If you're not sure where a command lives or which version you're running, `which` tells you:

```
$ which python
/usr/bin/python
$ which ls
/usr/bin/ls
```

This is especially helpful when you have multiple versions of a program installed (e.g., different Python installations) and need to know which one the shell is using.

## 7. Archiving files with tar

`tar` bundles files and directories into a single archive. It's the standard way to package things on Unix systems.

Create an archive:

```
$ tar -czf archive.tar.gz my_directory/
```

This creates a compressed archive (`-c` create, `-z` gzip compression, `-f` filename). Extract it with:

```
$ tar -xzf archive.tar.gz
```

(`-x` for extract). To see what's inside without extracting:

```
$ tar -tzf archive.tar.gz
```

You'll encounter `.tar.gz` (or `.tgz`) files frequently when downloading source code or datasets.

## 8. Globs and regular expressions

You've already used wildcards like `*` and `?` in section 05. These patterns are called *globs*: the shell expands them into matching filenames before the command runs. Here's the full set:

| Pattern | Matches | Example |
|---------|---------|---------|
| `*` | Any string of characters | `*.txt` matches all text files |
| `?` | Exactly one character | `file?.dat` matches `file1.dat`, `fileA.dat` |
| `[abc]` | One character from the set | `file[123].dat` matches `file1.dat`, `file2.dat`, `file3.dat` |
| `[a-z]` | One character in the range | `[A-Z]*.txt` matches files starting with a capital letter |
| `[!abc]` | One character *not* in the set | `file[!0-9].dat` matches files where the character isn't a digit |

Globs are expanded by the shell and work with any command (`ls`, `cp`, `rm`, etc.).

### Regular expressions

Regular expressions (or *regex*) are a more powerful pattern language used inside programs like `grep`, `sed`, and many programming languages. Unlike globs, they aren't expanded by the shell; they're interpreted by the program itself.

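
To make the difference concrete, here is a minimal sketch that matches the same files two ways, once with a glob and once with a regex passed to `grep`. The scratch directory and the file names are made up for this example, and the variable names `glob_matches` and `regex_matches` are only illustrative.

```shell
# Work in a scratch directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file1.dat file2.dat fileA.dat notes.txt

# Glob: the shell replaces the pattern with matching filenames
# before echo runs, so echo only ever sees the names.
glob_matches=$(echo file[0-9].dat)
echo "glob:  $glob_matches"

# Regex: the single quotes keep the shell from touching the pattern,
# so grep receives it unchanged and interprets it itself.
regex_matches=$(ls | grep '^file[0-9]\.dat$')
echo "regex: $regex_matches"
```

Both approaches pick out `file1.dat` and `file2.dat` and skip `fileA.dat`, but the matching is done by different programs: the shell in the first case, `grep` in the second.
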
Here are the basics:

| Pattern | Meaning | Example |
|---------|---------|---------|
| `.` | Any single character | `h.t` matches `hat`, `hit`, `hot` |
| `*` | Zero or more of the preceding character | `ab*c` matches `ac`, `abc`, `abbc` |
| `+` | One or more of the preceding character | `ab+c` matches `abc`, `abbc`, but not `ac` |
| `^` | Start of line | `^From` matches lines starting with "From" |
| `$` | End of line | `\.csv$` matches lines ending with ".csv" |
| `[ ]` | Character class (as in globs, except that negation is written `[^abc]`) | `[0-9]+` matches one or more digits |
| `\` | Escape a special character | `\.` matches a literal period |

In plain `grep`, `+` is only treated as special when you add the `-E` (extended regex) option. The examples below use `[0-9][0-9]*` instead, which works either way.

**Watch out:** `*` means different things in globs and regex! In a glob, `*.txt` means "anything ending in .txt". In a regex, `*` means "zero or more of the previous character". This is a common source of confusion.

Using regex with `grep`:

```
$ grep '^From:' emails.txt               # lines starting with "From:"
$ grep '[0-9][0-9]*\.[0-9]' data.txt     # lines containing a decimal number
$ grep -i 'error' logfile.txt            # case-insensitive search
$ grep -c 'warning' logfile.txt          # count matching lines
$ grep -v '^#' config.txt                # lines that do NOT start with #
```

The `-i` flag ignores case, `-c` counts matching lines instead of printing them, and `-v` inverts the match (shows non-matching lines). These options combine well with pipes:

```
$ ps aux | grep python       # find running Python processes
$ history | grep -c ssh      # how many times have I used ssh?
```

Regular expressions are a deep subject. You don't need to master them now, but recognizing the basics will help you read documentation and write better `grep` commands.

> **Exercise 1:** If you have access to a remote machine (a department server, for example), try connecting with `ssh`. Use `scp` to copy a file back and forth.

> **Exercise 2:** Use `find` to locate all `.c` files somewhere under your home directory.

> **Exercise 3:** Create a directory with a few files in it. Use `tar -czf` to archive it. Delete the original directory, then extract the archive to restore it.

> **Exercise 4:** Use `curl` or `wget` to download a file from the web. Try downloading a plain text file, like a Project Gutenberg book (e.g., `https://www.gutenberg.org/files/1342/1342-0.txt` for Pride and Prejudice). Use `wc -l` to count the lines, `head` to see the beginning, and `grep` to search for a word.
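
If you want a starting point for Exercise 3, one possible round trip is sketched below. It works in a scratch directory, and the file names and contents are invented for the example. It also uses `diff -r`, which is not covered above: the `-r` option compares two directories recursively, which gives a quick check that the restored copy matches the original.

```shell
# Scratch area; mktemp -d makes a fresh empty directory.
work=$(mktemp -d)
cd "$work" || exit 1

# A small directory to archive (names and contents are made up).
mkdir my_directory
echo "T = 298 K" > my_directory/conditions.txt
echo "run 1" > my_directory/log.txt

# Keep a copy so there is something to compare against after the restore.
cp -r my_directory reference

tar -czf archive.tar.gz my_directory/   # create the compressed archive
tar -tzf archive.tar.gz                 # list the contents without extracting
rm -r my_directory                      # delete the original
tar -xzf archive.tar.gz                 # restore it from the archive

# diff -r prints nothing and succeeds when the two trees are identical.
if diff -r reference my_directory; then
    restored=yes
else
    restored=no
fi
echo "restored intact: $restored"
```

Listing the archive with `-t` before deleting anything is a cheap safeguard: if the listing looks wrong, the original directory is still there.
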