Directory Comparison with Bash

01/09/21

Tags NeoCities

One reason I’d been doing batch uploads to NeoCities is that I didn’t have a good way to do incremental site updates. Hugo doesn’t do incremental builds; it rebuilds the entire site every time, and I didn’t like uploading 300+ pages whenever I updated.

I figured out I could implement a crude comparison by setting aside a copy of the Hugo public folder each time I updated, then comparing that copy to the newest one generated.

A while back I cobbled together a Python script that used filecmp and believed my problem was solved, but then I realized it didn’t always upload all the updated files. I figured out that filecmp’s directory comparison is shallow: it can judge two files equal from their os.stat() signatures (file type, size, and modification time) without reading their actual contents. So if a small update was made (say, a link added to a tag page), the page might not be considered ‘updated’ by my script. There were other problems with the script as well. Since I was running it from a Windows command line, I ran into issues I could only chalk up to “Windows weirdness.”
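The gap between a metadata check and a content check is easy to see from the shell. This is only a rough sketch, not filecmp itself: it assumes GNU coreutils (`stat -c`) and uses made-up file names in a throwaway directory. Two files with the same size and modification time look identical to a stat-style check, while `cmp -s` reads the bytes and notices the difference.

```shell
#!/bin/bash
# Hypothetical demo files in a throwaway directory.
tmp=$(mktemp -d)
printf 'link-a\n' > "$tmp/old.html"
printf 'link-b\n' > "$tmp/new.html"        # same length, different content
touch -r "$tmp/old.html" "$tmp/new.html"   # copy old.html's modification time

# Metadata-only "signature": size and modification time (GNU stat).
sig_old=$(stat -c '%s %Y' "$tmp/old.html")
sig_new=$(stat -c '%s %Y' "$tmp/new.html")
if [ "$sig_old" = "$sig_new" ]; then
    shallow_result=same
else
    shallow_result=different
fi

# Content comparison: cmp -s is silent and exits non-zero when bytes differ.
if cmp -s "$tmp/old.html" "$tmp/new.html"; then
    deep_result=same
else
    deep_result=different
fi

echo "shallow: $shallow_result, deep: $deep_result"
# prints: shallow: same, deep: different
rm -r "$tmp"
```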

After fiddling around with Python a bit more, I found a bash script that does what I need. It compares my current and previous folders, then fills a third folder with only the new and updated files. Windows 10 has a Linux subsystem but Windows 8 does not, so I have to be on the Linux side of my dual boot to use this.

#!/bin/bash

# Folders for the different stages
DIST=/home/neonaut/Documents/Hugo/public/
DIST_OLD=/home/neonaut/Documents/Hugo/public_previous/
DIST_UPGRADE=/home/neonaut/Documents/Hugo/public_updated/

# Work from inside the new build so find returns relative paths
cd "$DIST" || exit 1

find . -type f | while IFS= read -r filename
do
    if [ ! -f "$DIST_OLD$filename" ]; then
        # File is new, so copy it (--parents keeps the directory structure)
        cp --parents "$filename" "$DIST_UPGRADE"
        continue
    fi
    if ! diff -q "$filename" "$DIST_OLD$filename" > /dev/null; then
        # File exists but is different, so copy the changed file
        cp --parents "$filename" "$DIST_UPGRADE"
    fi
done
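Before pointing a loop like this at a real site, it can help to try the same logic on throwaway folders. Here is a minimal sketch of that idea, assuming GNU `cp` (for `--parents`). The `copy_changed` function, its arguments, and the demo file names are all hypothetical; it also swaps in `cmp -s` for the byte comparison and null-delimited `find` output so odd file names survive.

```shell
#!/bin/bash
# copy_changed NEW OLD OUT (hypothetical helper): copy every file that is
# new or changed in NEW, relative to OLD, into OUT with its relative path.
copy_changed() {
    local new old out
    new=$(cd "$1" && pwd) || return 1
    old=$(cd "$2" && pwd) || return 1
    mkdir -p "$3" || return 1
    out=$(cd "$3" && pwd) || return 1
    (
        cd "$new" || exit 1
        find . -type f -print0 | while IFS= read -r -d '' f; do
            # Copy when the old file is missing or its bytes differ.
            if [ ! -f "$old/$f" ] || ! cmp -s "$f" "$old/$f"; then
                cp --parents "$f" "$out/"
            fi
        done
    )
}

# Throwaway site: one unchanged page, one edited page, one new page.
demo=$(mktemp -d)
mkdir -p "$demo/public/tags" "$demo/public_previous/tags"
echo "unchanged" > "$demo/public/index.html"
echo "unchanged" > "$demo/public_previous/index.html"
echo "new link"  > "$demo/public/tags/bash.html"
echo "old link"  > "$demo/public_previous/tags/bash.html"
echo "brand new" > "$demo/public/about.html"

copy_changed "$demo/public" "$demo/public_previous" "$demo/public_updated"
(cd "$demo/public_updated" && find . -type f | sort)
# prints:
# ./about.html
# ./tags/bash.html
```

Only the edited and new pages land in the output folder; the unchanged index.html is skipped.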

Having done that, I upload the public_updated directory. I still use Python for this.

Finally, I clean up by clearing out the public_updated folder and turning the current public folder into the next public_previous folder for comparison.

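#!/bin/bash
# The relative paths below only work from the Hugo project folder
cd /home/neonaut/Documents/Hugo || exit 1
# -f keeps rm quiet when a folder is already empty or missing
rm -rf public_updated/*
rm -rf public_previous
mv public public_previous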
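Since this step deletes things, a slightly more defensive version might be worth having. The following is a sketch under assumptions, not a drop-in replacement: `rotate_site` and its base-folder argument are hypothetical names, and it refuses to touch anything unless a fresh public folder is actually there to rotate in.

```shell
#!/bin/bash
# rotate_site BASE (hypothetical helper): empty BASE/public_updated and
# turn BASE/public into the next BASE/public_previous.
rotate_site() {
    local base=$1
    # Bail out before deleting anything if there is no fresh build.
    if [ -z "$base" ] || [ ! -d "$base/public" ]; then
        echo "no public folder found in '$base'" >&2
        return 1
    fi
    rm -rf -- "$base/public_updated" "$base/public_previous"
    mkdir -- "$base/public_updated"
    mv -- "$base/public" "$base/public_previous"
}

# Try it on a throwaway layout first.
demo=$(mktemp -d)
mkdir "$demo/public" "$demo/public_previous" "$demo/public_updated"
echo "v2" > "$demo/public/index.html"
echo "v1" > "$demo/public_previous/index.html"
echo "v2" > "$demo/public_updated/index.html"

rotate_site "$demo"
cat "$demo/public_previous/index.html"
# prints: v2
```

After the call, public is gone until the next Hugo build recreates it, public_previous holds the build that was just uploaded, and public_updated is empty.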

One button/liner is the final goal. There are a few bash upload scripts to work from: