Today I revisited a topic I had researched before: how to keep a synced copy of a selected folder on your Mac with an online storage provider.
The solution I researched in the past involved the Amazon S3 service. The nice thing about S3 is that it is cheap and has a fairly rich API, against which many people have developed all manner of tools. In my previous solution I used the s3cmd tools, more specifically s3sync (these tools are built in Python and Ruby), together with launchd, which is a standard part of Mac OS X. With launchd you configure startup services using plist files (XML-based config files), which can be either system-wide or per-user, depending on the context in which you want them to start. If you want to start a service only for your own user, you put its plist in the LaunchAgents folder, which lives in the user's Library folder (now hidden by default). To unhide this folder, simply issue this command in the Terminal app:
chflags nohidden ~/Library
My previous solution relied on the WatchPaths parameter in the launchd plist file. This parameter watches the given folder for changes and, when a change is detected, launches the command in the ProgramArguments array (a list of strings, starting with the command itself, followed by its arguments, each as a separate item). However, this has a rather unfortunate limitation: it only watches the given folder itself and not any of the subfolders contained within it.
This weekend I discovered the solution to my predicament. To my joy I learned that a Googler had wrestled with a similar issue in Python scripting and, out of frustration with the lack of existing solutions, developed his own. He has generously shared his solution with us all; it can be found here.
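To get a feel for how much of a folder tree a top-level-only watch would miss, here is a minimal sketch that lists everything sitting below the first level of a folder. It assumes, as described above, that only the given folder itself is watched; the function name entries_below_top_level is a hypothetical helper of mine, not part of launchd.

```shell
#!/bin/bash
# Sketch only: entries_below_top_level is a hypothetical helper name.

# List every file and folder that lives inside a subfolder of $1,
# i.e. the entries a watch on $1 alone would not cover.
entries_below_top_level() {
  find "$1" -mindepth 2
}

# Example usage:
#   entries_below_top_level ~/backup
```

If this prints anything at all for your backup folder, a non-recursive watch is not enough for it.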
Since Macs come with Python already installed, it is really easy to install this package, which is called watchdog. Note that you do need to have Xcode (Apple's developer tool) installed, which is available for free in the Mac App Store. Also note that once you have installed Xcode, you will need to go to Preferences > Downloads > Components and install the Command Line Tools. Once you've met these prerequisites, you can install watchdog with the following command:
sudo easy_install watchdog
That's all there is to it. This installs watchdog as well as a Python script called watchmedo, which I use to improve upon my previous solution. To implement it, I simply had to modify my plist file. I no longer need the WatchPaths parameter, so I deleted it and instead added a parameter called RunAtLoad (set to true), which ensures that my script (the first item in the ProgramArguments array) is executed when launchd loads this plist: after you've logged in, in this case, or when you explicitly load it with launchctl load. My plist (use the Property List Editor program to edit it; use Spotlight to find this standard Mac OS X program) looks like this:
The script is a simple Bash script containing only one line, as follows:
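A minimal sketch of a plist with just these keys, written out from the shell, might look roughly like the following. The label com.example.s3watch, the script path in the usage example and the function name make_plist are all hypothetical placeholders, so substitute your own values.

```shell
#!/bin/bash
# Sketch only: the label and script path are hypothetical placeholders.

# Print a minimal launchd agent plist to stdout.
#   $1 = job label
#   $2 = absolute path of the script to run when the plist is loaded
make_plist() {
  local label="$1" script="$2"
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>${label}</string>
    <key>ProgramArguments</key>
    <array>
        <string>${script}</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
EOF
}

# Example usage (hypothetical names):
#   make_plist com.example.s3watch /Users/username/s3sync/watch-backup.sh \
#     > ~/Library/LaunchAgents/com.example.s3watch.plist
#   launchctl load ~/Library/LaunchAgents/com.example.s3watch.plist
```

Note that there is no WatchPaths key here: launchd only starts the script once, and the script itself keeps watching.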
/usr/local/bin/watchmedo shell-command --patterns="*.*" --recursive --command="/Users/username/s3sync/syncup-2 ~/backup mybackup:" -w ~/backup/
Specifying the full path to watchmedo ensures that it can be found without relying on the current value of the PATH environment variable, and the --recursive option ensures that all folders contained in the backup folder are also watched for changes. The --command parameter then names another script, called syncup-2, which is a wrapper around the s3sync command needed to trigger a full recursive sync of the backup folder to the given S3 bucket (mybackup in this case).
This is the listing for that script:
#!/bin/bash
# script to upload a local directory up to s3
# sync usage:
# sync [sourcepath] [s3bucketname:]
# Gives the directory in the s3 bucket the same name as your source
#
# This script does not apply a Public ACL
curdir=$(pwd)
if test -z "$1" || test -z "$2"
then
echo "Please provide a path to sync and a bucket to sync to. Like so:"
echo "sync [sourcepath] [s3bucketname:directory]"
exit 1
fi
cd ~/s3sync/
export AWS_ACCESS_KEY_ID=[your AWS KEY ID]
export AWS_SECRET_ACCESS_KEY=[your AWS Secret Access Key]
export SSL_CERT_DIR=~/certs
ruby s3sync.rb -r --no-md5 --delete "$1" "$2"
# watch out, no error checking!!
cd "$curdir"
exit 0
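The listing warns that it does no error checking. As a minimal sketch of how that could be added, the two helpers below check that both arguments are present and pass the exit status of the sync command on to the caller. The names check_args and run_and_report are hypothetical; they are not part of s3sync.

```shell
#!/bin/bash
# Sketch only: check_args and run_and_report are hypothetical helper names.

# Succeeds only when both a source path ($1) and a bucket ($2) were supplied.
check_args() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: sync [sourcepath] [s3bucketname:directory]" >&2
    return 1
  fi
}

# Runs the given command, reports a failure on stderr,
# and returns the command's own exit status.
run_and_report() {
  local rc
  "$@"
  rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "sync failed with exit status $rc" >&2
  fi
  return "$rc"
}

# Example usage inside the sync script:
#   check_args "$1" "$2" || exit 1
#   run_and_report ruby s3sync.rb -r --no-md5 --delete "$1" "$2"
#   exit $?
```

With something like this in place, the script would no longer end in an unconditional exit 0, so a failed sync can be told apart from a successful one.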
The end result is that everything in the backup folder is fully synced to my Amazon S3 bucket whenever I add, change or remove a file or folder. In other words, it keeps an exact replica of the backup folder on my Mac in the S3 bucket of my choice (mybackup). This can easily be achieved by anyone with only very basic scripting skills (Bash only; Python is not even required).