Things I Learned About Bash Scripting Last Week

15 November 2024
software, blog, linux, bash

Learning a lot through unnecessary automation

This post recounts how I learned enough about Bash scripting to automate activating and deactivating Python virtual environments created by the venv module.

As I’ve become more comfortable with the command line over the years, I’ve learned a little about shell scripting here and there. I was always reluctant to really get into it because, being too young to remember a time before GUIs, it seemed archaic to me. My earliest memories of playing video games involve launching them from a command line though. Lately, while I’m still a little reluctant, I’ve been finding more and more reasons to learn about it.

When I was trying to debug my modifications to the deploy-pages Bash script for this website, I had an incentive to try Git Bash and learn a little bit more about Bash scripting. The first thing I needed was a way to save the directory the script was executed from. I just went with the first search result for “bash save current directory.”

My thanks to <user> on <well-known QA service>

I wanted to show what I found, link to where I found it, and credit the person who posted it, but I’m concerned that doing so could be outside of my “personal, non-commercial use.” I expect the risk of legal action is quite low, but I prefer to avoid it even so.

# imagine a tidy one-liner that saves the path to the current working directory
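For illustration only (this is not the snippet credited above), one common way to remember the starting directory is to copy the shell's own PWD variable at the top of the script. The name start_dir is made up for this sketch.

```shell
# Hypothetical sketch: remember where the script was launched from.
start_dir=$PWD        # PWD is maintained by the shell itself

cd /                  # ...do some work somewhere else...
cd "$start_dir"       # ...then return to the starting directory
```

Quoting "$start_dir" keeps the cd working when the path contains spaces.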

Then I ran into a persistent Codeberg bug that frequently forces you to re-run git push. If I had been writing a JavaScript or Python script, I would have known exactly how to catch the error and run git push again. So I went looking for the Bash equivalent of a try/catch block.

My error handling for the Codeberg authentication bug

git push "$remote" "$remote_branch"
if [ $? -ne 0 ]; then
    echo "Codeberg authentication failed. Retrying."
    git push "$remote" "$remote_branch"
    if [ $? -ne 0 ]; then
        echo "Error: Retry failed. Bother Codeberg about it."
        exit 1
    fi
fi

From reading the original deploy-pages script, I had picked up the structure of a Bash if statement and that the $var syntax accesses the value of a variable. What was new to me in the example I found was the [ $? -ne 0 ] part. I learned that ? is a special Bash parameter that holds the exit status of the last command, and that any value besides zero means an error occurred. -ne 0 meaning "not equal to zero" made sense, but the syntax still felt strange. (I have since learned that [ is another name for the test command, so -ne looking like a command line option rather than a typical operator makes sense.) That script forced me to learn more Bash syntax, but as usual I only learned enough to get the job done. It did get me to set Git Bash as my default terminal though, and that set me on the path to finally attaining a working knowledge of Bash scripting.
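The same retry can be written without inspecting $? at all, because if tests a command's exit status directly. The sketch below is an alternative phrasing, not the script from this post: retry_once is a made-up helper name, and flaky is a stand-in for git push so the example can run anywhere.

```shell
# Hypothetical helper: run a command, and retry it exactly once if it fails.
retry_once() {
    if ! "$@"; then
        echo "First attempt failed. Retrying." >&2
        if ! "$@"; then
            echo "Error: retry failed." >&2
            return 1
        fi
    fi
}

# Stand-in for a flaky command: fails on the first call, succeeds afterwards.
attempts=0
flaky() {
    attempts=$((attempts + 1))
    [ "$attempts" -ge 2 ]
}

retry_once flaky
result=$?
```

In a deploy script the call would look like retry_once git push "$remote" "$remote_branch".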

Last week I was setting up everything I needed to work on my Anytype Python module. (I’ll write a post specifically about that project once I have enough done to write about.) Since my goal with that project is to make something other people can use, I’m doing my best to learn and apply best practices for Python module development. Part of that is setting up a virtual environment for the project. The venv module bundled with the current version of Python makes doing so almost seamless, but I still had to type python -m pip every time I wanted to use pip. I also didn’t like that I had to run the venv activate script every time I wanted to work in the virtual environment. I was already using Git Bash, and my recent experience modifying a Bash script gave me the confidence that I could figure out how to automate away those annoyances.

Command aliasing was quick and easy. I just needed a .bash_profile file in my home directory with the line alias pip='python -m pip'. That worked as soon as I launched a new terminal, and it taught me that .bash_profile is where I should put things for setting up the shell. (At the time of editing this, I've improved my Unix knowledge substantially, and now know that .bashrc is the better place for aliases. That's a future post.)

Automatically managing Python virtual environments ended up taking the rest of the day, though I feel that particular diversion wasn’t a waste of time. At some point in the preceding couple of days, I had read that it was necessary to add export PROMPT_COMMAND='history -a' to a .bashrc file in your home directory so that VS Code could save the command history of Git Bash properly. In setting up my Python project, I then had cause to learn that PROMPT_COMMAND is a shell variable holding commands that Bash runs just before it prints each prompt, which is to say after the commands you entered have finished. Then the solution for activating a Python virtual environment seemed straightforward: add an activate_pyvenv script to PROMPT_COMMAND to check for the venv activate script and run it if it’s there.
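A rough sketch of how a hook can be chained onto PROMPT_COMMAND in .bashrc follows. The function name auto_venv_hook and the counter inside it are placeholders for illustration; the post does not show how the real script was wired in.

```shell
# Sketch of a ~/.bashrc fragment. The hook's name and body are placeholders.
auto_venv_hook() {
    # Whatever should happen before each prompt goes here.
    PROMPT_HOOK_RUNS=$(( ${PROMPT_HOOK_RUNS:-0} + 1 ))
}

# Keep 'history -a' and run the hook after it. Commands are separated by ';'.
PROMPT_COMMAND='history -a; auto_venv_hook'
```

Because the hook runs in the interactive shell itself rather than in a child process, anything it sources or exports stays in effect for the commands typed next.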

My first attempt at activate_pyvenv.sh

if [ -f ./Scripts/activate ]; then
    source ./Scripts/activate
fi

It seemed to work at first, but I'll get into the problems I encountered later in this post. After I tested it, some quirk of VS Code made it so that the nice colour-coded Git Bash prompt that updated automatically with the current directory stopped updating. So I decided it was time to learn how to customize a Bash prompt myself. I skimmed a tutorial or two, but I wanted to really understand my options, so I ended up on the Bash Reference Manual page for controlling the prompt. I patterned my custom prompt on the original Git Bash one, with some additions that appealed to me. My first pass looked something like this:

"-bash@localhost 2024-11-09 18:47 /home/heck/projects/website:_ fully described in the following figcaption" — A computer terminal prompt. White text on a black background across two lines. The text consists of, in order and separated by spaces, the shell name (bash) and host name (localhost) with an at symbol between them, the date 2024-11-09, the time 18:47, and the full directory path /home/heck/projects/website followed by a colon and an underline-style cursor. The text overflows onto a second line in the middle of the word projects.

The corresponding value of PS1

"\s@\h \D{%F} \D{%R} \w:"

I’m reconstructing these from memory in iSH as I didn’t have the foresight to document all my iterations.

That prompt had the information I wanted, but it was a little plain, and the full directory path felt like too much. Plus, the cursor being so close to the prompt bothered me. So over a few more iterations, I replaced the spaces with vertical lines, set PROMPT_DIRTRIM to 2 in .bash_profile, added a newline and the \$ special character, and went back to a tutorial I had skimmed earlier to figure out the colours. This is what my prompt looks like today:

"-bash@localhost|2024-11-09|19:14|.../projects/website/ $ _ fully described in the following figcaption" — A computer terminal prompt. Text in multiple colours on a black background across three lines. First, in order and separated by white vertical lines, the shell name (bash) and host name (localhost) with an at symbol between them in dark green, the date 2024-11-09 and the time 19:14 in purple, the partial directory path /projects/website abbreviated with an ellipsis in yellow. The yellow directory path overflows onto the second line part way through the word website. On the last line, a dollar sign and an underline-style cursor in white are separated by a space.

My current value of PS1

"\e[0;32;40m\s@\h\e[m|\e[7;30;45m\D{%F}\e[m|\e[7;30;45m\D{%R}\e[m|\e[7;30;43m\w\e[m\n\$ "

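One refinement worth noting: the Bash manual says non-printing sequences in a prompt should be enclosed in \[ and \] so that Bash can compute the visible prompt length correctly when editing long command lines. A sketch of the same prompt with that wrapping is below. The variable names are invented for readability and the colour codes are copied from the PS1 above.

```shell
# Colour codes copied from the PS1 above; the variable names are illustrative.
c_host='\[\e[0;32;40m\]'   # shell@host
c_time='\[\e[7;30;45m\]'   # date and time
c_path='\[\e[7;30;43m\]'   # working directory
c_off='\[\e[m\]'           # reset attributes

# Inside double quotes, \\\$ leaves a literal \$ for Bash to expand at prompt time.
PS1="${c_host}\s@\h${c_off}|${c_time}\D{%F}${c_off}|${c_time}\D{%R}${c_off}|${c_path}\w${c_off}\n\\\$ "
```
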
With a prompt I was happy with, I returned to figuring out how to manage Python virtual environments with a Bash script. I quickly realized that checking for the activate script itself was not the best way to detect a Python virtual environment directory. ./Scripts/activate seemed like a generic-sounding path that could easily exist outside the context of the venv module. Now pyvenv.cfg was nice and specific, and (I think) should exist in the root directory of every venv environment. I had found a solid way of detecting when I was in the right place to run the activate script, but after some testing I realized I needed to do a whole lot more.

At that point, my activate_pyvenv script ran the corresponding activate script every time I moved to a venv root directory. So if I navigated to a subdirectory and back, it would run the script again. I could have lived with that being the only problem. But then I started installing modules and noticed they weren’t always ending up where I expected. I realized that, in addition to automatically activating the virtual environment, I needed to automatically deactivate it when I left the venv root directory. The first step I took towards solving that problem was writing a script that simply detected whether, after the shell finished executing, it had gone up, down, or stayed in the same place in the directory hierarchy. I found a common Bash pattern for detecting if a string was a prefix of another using a case statement. Then I tried comparing the shell environment variables PWD and OLDPWD, but OLDPWD wasn’t changing the way I expected it to, so that was insufficient. A new variable, VENV_PREV_DIR, that I updated to $PWD at the end of the script had the behaviour I was looking for.

My first step towards solving the problem: detect_cd.sh

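case $PWD in
    "$VENV_PREV_DIR"*)
        # the previous directory is a prefix of the current one
        if [[ $VENV_PREV_DIR != "$PWD" ]]; then
            echo down a level
        else
            echo same level
        fi
    ;;
    *)
        echo up a level
    ;;
esac

# remember this directory for the next run
VENV_PREV_DIR=$PWD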
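One caveat with plain prefix matching: /home/me/proj is also a prefix of /home/me/project2, so a sibling directory can look like a subdirectory. A sketch of a stricter check, using a made-up function name is_inside and made-up paths, compares both sides with a trailing slash:

```shell
# is_inside CHILD PARENT: succeed if CHILD is PARENT itself or lies beneath it.
is_inside() {
    local child="$1" parent="${2%/}"   # drop one trailing slash from PARENT
    case $child/ in
        "$parent"/*) return 0 ;;
        *)           return 1 ;;
    esac
}

if is_inside /home/me/project2 /home/me/proj; then
    echo "inside"
else
    echo "not inside"
fi
```

Note that an empty PARENT matches every absolute path here, so callers would want to check that the parent is set first.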

I was sure that the only time I would want to run the venv activate script was when I navigated down into a directory containing a pyvenv.cfg file and a virtual environment wasn’t already active. I decided not to waste time figuring out the “correct” way to tell if a virtual environment was already active, so I simply initialized a flag called PYTHON_VENV_ACTIVE to zero in .bash_profile and used that.
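For what it's worth, the activate script generated by venv exports a VIRTUAL_ENV variable, and its deactivate function unsets it again. Checking that variable is therefore one way to ask whether an environment is active without a hand-rolled flag. A minimal sketch (the function name venv_is_active is made up):

```shell
# Succeeds when a venv activate script has set VIRTUAL_ENV in this shell.
venv_is_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if venv_is_active; then
    echo "venv active: $VIRTUAL_ENV"
else
    echo "no venv active"
fi
```

A separate flag still works; this just removes one piece of state that has to be kept in sync by hand.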

My first pass at activate_pyvenv.sh

#!/bin/bash

case $PWD in
    "$VENV_PREV_DIR"*)
        if [[ $VENV_PREV_DIR != "$PWD" ]]; then
            if [ -f ./pyvenv.cfg ] && ! (( "$PYTHON_VENV_ACTIVE" )); then
                PYTHON_VENV_ACTIVE=1
                echo activating python venv
                source ./Scripts/activate
            fi
        else
            # placeholder
            : # do nothing
        fi
    ;;
    *)
        PYTHON_VENV_ACTIVE=0
        echo deactivating python venv
        deactivate
    ;;
esac

VENV_PREV_DIR=$PWD

I was getting closer. The logic for determining when to activate worked well, but I was deactivating every time I went up in the directory hierarchy. I needed to detect when I left the actual venv directory. Saving the current directory to VENV_DIR, a new shell variable, when activating a virtual environment came to me right away. Yet I struggled to effectively determine when I had left that directory. I tried if [[ $VENV_PREV_DIR = $VENV_DIR ]] when going up, but that only worked when moving up out of exactly the venv root directory, and not when moving up out of that directory from a subdirectory. Then I came to an understanding: the core logic of the script was figuring out if a new directory was a subdirectory of the previous one, and I needed to know if a new directory wasn’t a subdirectory of the saved VENV_DIR. So I nested another case statement in the “moving up” condition and the script was done but for some minor polishing.

The current version of activate_pyvenv.sh

# manages automatically activating and deactivating Python virtual environments created by the venv module

# this case statement checks if the current directory is a subdirectory of
# the last one visited by the shell
case $PWD in
    "$VENV_PREV_DIR"*)
        if [[ $VENV_PREV_DIR != "$PWD" ]]; then
            # the shell just started or we've gone down in the directory hierarchy
            if [ -f ./pyvenv.cfg ] && ! (( "$PYTHON_VENV_ACTIVE" )); then
                # we're in a python virtual environment and it needs activating
                PYTHON_VENV_ACTIVE=1
                # save the venv directory so we can tell when we've left
                VENV_DIR=$PWD
                short_venv_dir=${VENV_DIR##*'/'}
                echo activating python venv $short_venv_dir
                source ./Scripts/activate
            fi
        else
            # placeholder in case I want to do something when I don't
            # change directories
            : # do nothing
        fi
    ;;
    *)
        # we've gone up in the directory hierarchy
        case $PWD in
            "$VENV_DIR"*)
                # we're still in a subdirectory of VENV_DIR
                : # so do nothing
            ;;
            *)
                # we've escaped the virtual environment
                PYTHON_VENV_ACTIVE=0
                short_venv_dir=${VENV_DIR##*'/'}
                unset VENV_DIR
                echo deactivating python venv $short_venv_dir
                deactivate
            ;;
        esac
    ;;
esac

unset working_dir
unset short_dir
unset short_venv_dir
VENV_PREV_DIR=$PWD

I tried to save a couple lines of code by omitting the variables for the short directory names, but I quickly gave up on figuring out the correct syntax.
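For reference, a parameter expansion can be used directly inside a quoted string, so the temporary variable is optional. A small sketch with a made-up path:

```shell
VENV_DIR=/home/heck/projects/anytype   # example value only

# ${VENV_DIR##*/} removes the longest prefix matching '*/',
# leaving only the last path component.
message="deactivating python venv ${VENV_DIR##*/}"
echo "$message"
```

In the deactivation branch the echo would have to come before unset VENV_DIR, since the expansion reads the variable at the moment the line runs.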

At the time I finished the script, I was sure there were some things I had done clumsily that could be improved by somebody with more Bash experience. I also acknowledged to myself that it wouldn’t work if I navigated straight into a subdirectory of a venv directory. But, as the saying goes: “perfect is the enemy of done.” I thought I would have to figure out how to check every directory above the current one for a pyvenv.cfg file, and that it would take some time. I added the script to my list of things to revisit if I run out of project ideas. It worked well enough, so I was happy to move on and get some “real work” done.
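Checking every directory above the current one turns out to need only a short loop. The following is a sketch under assumptions, not a tested replacement for the script above: find_venv_root is a made-up name, and it only locates the venv root without activating anything.

```shell
# find_venv_root [DIR]: print the nearest directory at or above DIR
# (default: $PWD) that contains pyvenv.cfg; return 1 if there is none.
find_venv_root() {
    local dir="${1:-$PWD}" parent
    while [ -n "$dir" ]; do
        if [ -f "$dir/pyvenv.cfg" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        parent=${dir%/*}                    # strip the last path component
        [ "$parent" = "$dir" ] && break     # no slash left (relative path): stop
        dir=$parent
    done
    # The loop never tests the filesystem root itself, so check it last.
    [ -f /pyvenv.cfg ] && printf '/\n'
}
```

A prompt hook could compare the result with the currently active environment and activate or deactivate accordingly, which would also cover navigating straight into a subdirectory.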
