When you work on the same project on at least two computers every day, you have a problem: you need to get changed files from host A to host B and vice versa. The problem gets bigger when you also work on different operating systems or use more than two hosts. On UNIX/Linux the preferred tool for such a task is Rsync. Unfortunately, Rsync synchronizes in one direction only, doesn’t work very well when more than two hosts are involved, and isn’t really comfortable to set up on Windows.
Another approach is to check changed source files into a version control system like CVS: on host A you check them in, and on host B you check them out afterwards. But this means you always need a more or less stable variant of your code, so that other developers can at least compile it or, much better, use it. That is not always the case (especially when you leave the office at 11:00 p.m.), and it doesn’t cover files which aren’t handled by a version control system.
Luckily there is a solution for all of these problems, and it is called Unison. So here comes the second post in the ToolTips series, which covers an easy and portable way to synchronize files.
Installing Unison
Most modern Linux distributions include Unison in their package management system. On Mac OS X you can use MacPorts. Alternatively, binary versions for Mac OS X and Windows are available for download. To prevent surprises and unnecessary trouble, it is a good idea to make sure that every system involved uses the same version of Unison. At least on Linux and Mac OS X it is relatively easy to compile Unison from the sources.
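One way to check the version advice above is to compare what each host reports. The following is only a minimal sketch: it assumes Unison is on the PATH on both machines (which may not hold for a self-compiled binary in a non-interactive ssh session) and that the remote host is reachable over ssh. The helper names versions_match and check_unison_versions are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when two version strings are identical.
versions_match() {
    [ "$1" = "$2" ]
}

# Hypothetical helper: compare the local Unison version with the one
# reported on a remote host (host name passed as the first argument).
check_unison_versions() {
    remote_host=$1
    local_version=$(unison -version) || return 2
    remote_version=$(ssh "$remote_host" unison -version) || return 2
    if versions_match "$local_version" "$remote_version"; then
        echo "OK: both hosts report $local_version"
    else
        echo "MISMATCH: local '$local_version', remote '$remote_version'"
        return 1
    fi
}
```

With the functions loaded into a shell, check_unison_versions host-b would run the comparison against a host called host-b (a placeholder name).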
Setting up public/private key authentication for ssh
Unison can use different communication channels for the file transfers, and one of them is ssh. As I always prefer (in fact demand) encrypted communication, this is a big plus. In the default setup you can just use ssh with the normal password login. For a little more comfort, I suggest creating a public/private key pair for the authentication.
The following creates a public/private key pair without a password. Although this is much easier to use, it should only be used on hosts which are trusted. If you are in doubt, use the normal password approach or, even better, create a public/private key pair with a password. Create a new public/private key pair with the following command:
user@host-a ~ $ ssh-keygen -t rsa
When you are asked for a password, just hit Enter twice. The command creates the private key in ~/.ssh/id_rsa and the public key in ~/.ssh/id_rsa.pub. Now copy the public key to host B:
user@host-a ~ $ scp ~/.ssh/id_rsa.pub host-b:.ssh/authorized_keys
If you already have some public keys on host B, make sure you append the new key instead, because the above command overwrites the file. Make the file accessible to the user only with:
user@host-b ~ $ chmod 600 ~/.ssh/authorized_keys
Now you should be able to connect to host B without any interaction needed.
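If host B already has entries in authorized_keys, the append step mentioned above could look roughly like the following. This is a sketch, not the only way to do it: append_key_once is a hypothetical helper name, and it works on plain files so that it can be tried out safely on copies first. On systems that ship ssh-copy-id, running ssh-copy-id -i ~/.ssh/id_rsa.pub host-b from host A does a similar job over ssh.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to an authorized_keys file
# unless exactly the same line is already present.
# $1 = public key file, $2 = authorized_keys file
append_key_once() {
    key_file=$1
    auth_file=$2
    key=$(cat "$key_file") || return 1
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x: match whole lines only, -F: treat the key as a fixed string
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
    # Keep the file readable and writable by the owner only
    chmod 600 "$auth_file"
}
```

On host B, after copying the public key to a temporary location such as ~/id_rsa.pub (a path chosen for this example), append_key_once ~/id_rsa.pub ~/.ssh/authorized_keys adds the key without touching the existing entries.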
Configuring Unison
In the long UNIX tradition, Unison is configured using text files. The files are located in the ~/.unison directory. You can configure more than one synchronization target by giving each one a meaningful name. There is one default target, which is configured in the file default.prf. Because I have more than one target, I prefer to split the configuration into several files. You can include other project files with the include statement as shown here:
# directory on host a (this is where Unison will be executed)
root = /mnt/data/projects
# directory on host b (this is the remote host)
root = ssh://host-b//mnt/data/projects
# which directories to sync?
include projects_files.prf
# options
include options.prf
ignorecase = false
# unison executable on the server
servercmd = /usr/local/bin/unison
We set up the root directories on both machines and include the configuration file for the project target as well as a generic options file. We also override the default Unison location on the server, because this is a self-compiled version. The file options.prf looks like this:
# No stale NFS and Mac store files
ignore = Name .nfs*
ignore = Name .DS_Store
# options
log = true
rsrc = true
auto = true
#debug = verbose
#logfile = ~/.unison/unison.log
This just sets some generic options which are valid for all my targets. For the specific target projects, the file projects_files.prf contains mainly the directories and files which should be ignored:
# No ISOs
ignore = Path vms/ISO
# Ignore VBox branches
ignore = Path vbox-*
# No binary output from the other platforms
ignore = Path vbox*/out/*
# One exception:
ignorenot = Path vbox/out/linux.amd64.additions
# No wine stuff
ignore = Path vbox*/wine.*
# Tools (Path rather than Name, because the pattern contains directories)
ignore = Path vbox*/tools/{FetchDir,freebsd*,os2*}
So in general, you configure the directory to synchronize and then define the directories or files which should be ignored. As you can see, you can include or exclude paths as you like; even simple shell-style wildcards are possible. In this example I exclude all binary files of a VirtualBox build, because they are useless on another platform. Understanding how Unison decides which directories or files should be synchronized is sometimes difficult, so I suggest reading the documentation carefully and otherwise using the “trial and error” approach ;).
Another reason for splitting up the configuration files is that you can synchronize these files as well. I have another target which synchronizes several configuration files, e.g. .bashrc, .profile, .vim* and the sub-project files of Unison like projects_files.prf. You can’t synchronize e.g. default.prf, because the root directories differ from host to host, but the general configuration is always the same. My home target looks like this:
# Which directories/files to sync?
path = .bashrc
path = .ion3
path = .gdbinit
path = .cgdb
path = .valgrind-vbox.supp
path = .vim
path = .vimrc
path = .gvimrc
path = .Xdefaults
path = .gnupg
path = .unison/options.prf
path = .unison/home_files.prf
path = .unison/projects_files.prf
# Do not sync:
ignore = Path .vim/.netrwhist
ignore = Path .ion3/default-session*
ignore = Path .cgdb/readline_history.txt
One of Unison’s strengths over other synchronization tools is that you can do this for other hosts as well. So if you synchronize between host a and host b, you can also synchronize between host c and host b. However, a little bit of discipline is necessary: there should be one host against which all other hosts synchronize.
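In practice this discipline means that every host uses the same hub as its second root. As a rough sketch, assuming host-b is the hub and the project tree lives under /mnt/data/projects on every machine (as in the profile shown earlier), a small helper can build the remote root. The name hub_root is invented for this example. Note the double slash in the result: in Unison's ssh roots, a path after a single slash is relative to the remote home directory, while a double slash marks an absolute path.

```shell
#!/bin/sh
# Hypothetical helper: build the ssh root for a hub host and a path.
# An absolute path such as /mnt/data/projects yields the double slash.
# $1 = hub host name, $2 = path on the hub
hub_root() {
    printf 'ssh://%s/%s\n' "$1" "$2"
}

# The same call would be used on host a and on host c (commented out so
# that loading this sketch does not start a synchronization):
# unison /mnt/data/projects "$(hub_root host-b /mnt/data/projects)"
```

Because both roots are given on the command line here, this variant does not depend on the root lines of a profile.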
If you now execute unison, the project target will be used. If you execute unison home, the files of the home target will be synchronized.
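To run several targets in one go, a small wrapper can call Unison once per profile. The sketch below assumes profiles named default and home as described above. The function name sync_targets and the UNISON_CMD variable are inventions of this example, not Unison settings; the variable exists only so that the loop can be tried with echo instead of a real run. The -batch flag tells Unison to ask no questions, which means conflicting changes are skipped rather than resolved.

```shell
#!/bin/sh
# Hypothetical wrapper: run Unison once per profile name and stop at the
# first non-zero exit status. UNISON_CMD is an assumption of this sketch;
# it defaults to the real binary.
sync_targets() {
    runner=${UNISON_CMD:-unison}
    for profile in "$@"; do
        "$runner" "$profile" -batch || return 1
    done
}
```

With the function loaded, sync_targets default home synchronizes both targets one after the other, and UNISON_CMD=echo sync_targets default home only prints the arguments that would be passed.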
Conclusion
Unison is a very powerful tool. You can synchronize between more than two hosts (independent of the operating system), in a secure way and bi-directionally. Currently there is no better tool and I use it on a daily basis.