Dirvish enhancements

Dirvish is an rsync-based backup tool. It is intended for backups to disk drives, or to whatever looks like one under your OS. I came across it while reading an article in c't. It was mentioned together with rsnapshot, but was said to be the more elaborate of the two backup solutions.

I now use dirvish to back up computer1 to computer2 and vice versa. I mainly want to protect myself from head crashes, but I also keep several copies of the data, as they use little disk space thanks to hard links.
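
Why the extra copies are so cheap can be shown with a minimal sketch. It assumes a filesystem that supports hard links; the file names are made up for illustration, and this is not dirvish's own code. An unchanged file in a new backup image is just a second name for the same data:

```python
import os
import tempfile


def snapshot_with_hard_link(src_path, snapshot_path):
    """Make snapshot_path a second name for the data of src_path."""
    os.link(src_path, snapshot_path)  # creates a directory entry, copies no data
    return os.stat(snapshot_path)


with tempfile.TemporaryDirectory() as bank:
    # "image-1.txt" and "image-2.txt" stand in for the same file in two images.
    original = os.path.join(bank, "image-1.txt")
    with open(original, "w") as f:
        f.write("unchanged since the last backup\n")

    info = snapshot_with_hard_link(original, os.path.join(bank, "image-2.txt"))

    # Both names point to one inode, so the content is stored only once.
    same_inode = info.st_ino == os.stat(original).st_ino
    link_count = info.st_nlink
    print(same_inode, link_count)
```

Only files that changed between two backups need new disk space; everything else costs one directory entry per image.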

Unfortunately dirvish is a "pull" solution: it fetches the files from a backup client (or from itself) and stores them in a dedicated directory. This is fine for backing up computer2 to computer1, but it raises two problems in the opposite direction (computer1 gets saved on computer2 or, which is the same, computer2 pulls the files from computer1), concerning:

  1. time (computer1 does not run all day)
  2. access (computer2 cannot log in to computer1 automatically, as computer2 is more exposed and has fewer rights)

For a better understanding: computer1 is an ADSL-connected home computer and computer2 is a server on the Internet.

Consequently, it would be nice to have dirvish push backups, too. Dirvish can save files local-to-local and remote-to-local. My patch hides the local-to-remote case inside the first one (local-to-local). This requires two steps:

  1. have a local representation of the remote bank (e.g. mounted via sshfs), and
  2. hint rsync to do real local-to-remote transfers

The first step could be used alone, letting dirvish treat the backup as "local-to-local", but then the rsync phase of dirvish would basically copy whole files over the network instead of transferring only checksums and differences. I chose dirvish for the benefits of rsync, so I don't want to lose them.
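
The second step boils down to a path translation. The following is a rough sketch of that idea only, not the code of the patch: the function name, the mount point /mnt/bank and the remote location computer2:/srv/backup are all assumptions made up for illustration. A path below the sshfs mount point is rewritten into a host:path target that rsync understands as remote; any other path stays local.

```python
import posixpath


def remote_rsync_target(local_path, mount_point, remote_spec):
    """Translate a path below an sshfs mount into an rsync "host:path" target.

    mount_point -- local directory where the remote bank is mounted (assumed)
    remote_spec -- the "host:/remote/dir" that is mounted there (assumed)
    """
    mount_point = posixpath.normpath(mount_point)
    local_path = posixpath.normpath(local_path)

    inside = (local_path == mount_point
              or local_path.startswith(mount_point + "/"))
    if not inside:
        return local_path  # not on the mount: a plain local-to-local copy

    host, remote_root = remote_spec.split(":", 1)
    relative = posixpath.relpath(local_path, mount_point)
    remote_path = posixpath.normpath(posixpath.join(remote_root, relative))
    return f"{host}:{remote_path}"


target = remote_rsync_target(
    "/mnt/bank/home/2007-05-29/tree", "/mnt/bank", "computer2:/srv/backup")
print(target)
```

With such a translation, dirvish can keep doing its bookkeeping through the mounted directory, while rsync receives a destination it treats as remote and talks to the other side directly.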

My patch:

How to use local-to-remote copy?

Useful tips:

Download the patch here

What does it do?

Saving file ownership information as non-root

I came across this problem while setting up dirvish to make my backups. Dirvish uses rsync and works heavily with hardlinks to make backups on normal filesystems. I like this solution, but a non-privileged user can't chown() files.

As a result, dirvish (like many other backup solutions) needs to be run as root to make perfect-looking copies. This is not my preferred solution, as I don't want to have automated network root access open on my machines.

The idea behind any better solution is to wrap filesystem accesses. Normal operations pass through to the real filesystem, which forms the back end. Actions that need special rights get filtered out, and their results are stored apart in separate file(s). The special actions are changing ownership information and creating device nodes.
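
The filtering can be sketched in a few lines. This is a hypothetical illustration of the principle and not how any existing tool is implemented: the class FilteringFS and its methods are invented names, and the recorded information lives only in memory here.

```python
import os
import tempfile


class FilteringFS:
    """Hypothetical wrapper: ordinary calls reach the real filesystem,
    calls that would need root are only recorded."""

    def __init__(self):
        self.owners = {}   # absolute path -> (uid, gid)
        self.devices = {}  # absolute path -> (kind, major, minor)

    def write_file(self, path, data):
        # Normal operation: passed through to the back end unchanged.
        with open(path, "wb") as f:
            f.write(data)

    def chown(self, path, uid, gid):
        # Privileged operation: remembered instead of executed.
        self.owners[os.path.abspath(path)] = (uid, gid)

    def mknod(self, path, kind, major, minor):
        # Privileged operation: an empty placeholder file plus a record.
        self.write_file(path, b"")
        self.devices[os.path.abspath(path)] = (kind, major, minor)

    def owner(self, path):
        # Report the recorded owner, falling back to the real one.
        st = os.stat(path)
        return self.owners.get(os.path.abspath(path), (st.st_uid, st.st_gid))


fs = FilteringFS()
with tempfile.TemporaryDirectory() as d:
    passwd = os.path.join(d, "passwd")
    fs.write_file(passwd, b"root:x:0:0::/root:/bin/sh\n")
    fs.chown(passwd, 0, 0)
    node = os.path.join(d, "node")
    fs.mknod(node, "c", 1, 2)

    pretended_owner = fs.owner(passwd)   # what the wrapped program sees
    real_uid = os.stat(passwd).st_uid    # what the back end really holds
    recorded_device = fs.devices[os.path.abspath(node)]
```

Inside the wrapper the file appears to belong to root, while on the back end it still belongs to the unprivileged user who ran the backup.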

The only two implementations I know of are fakeroot and pretendroot. Both work as library wrappers: they use the glibc library preload mechanism (LD_PRELOAD) to intercept library calls.

I thought about implementing a FUSE filesystem (user space fs) to do similar stuff before I came across fakeroot.

The difference between fakeroot and pretendroot is that fakeroot tries harder to provide a root-like environment, but its permanent storage interface is broken. Fakeroot keeps the additional information in memory, in a daemon that can load it from a file on startup and save it to a file on termination. Unfortunately, unresolved race condition bugs make it easy to lose information, and so does a system crash during a lengthy backup. The bug is this easy to trigger:

$ fakeroot -s new_save_file mknod node c 1 2
$ ls -la new_save_file
-rw-r--r-- 1 siemer siemer 0 2006-08-17 14:06 new_save_file

As you can see, no information got stored at all. If you load "new_save_file" in another fakeroot environment, "node" will look like a normal file.

pretendroot, on the other hand, stores ownership information in files in a directory on the go, with no complicated load and save cycle. Its disadvantages are that it is less widespread and supports only ownership information: you cannot create device files.

Both programs associate files with the additional information by inode and filesystem device number. I would like to have a filename-based solution that uses an extra file in every directory where needed, like umsdos did, as far as I remember... (I used it for 10 minutes some years ago.)
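
A minimal sketch of what such a filename-based store could look like, under assumptions of my own: the sidecar name .ownership.json, the JSON format and the function names are all hypothetical, and neither fakeroot nor pretendroot works this way. Each directory gets one small file that maps file names to their pretended owner, and every change is written out immediately.

```python
import json
import os
import tempfile

SIDECAR = ".ownership.json"  # hypothetical name of the per-directory file


def _load(directory):
    """Read the directory's table; an absent sidecar means no entries."""
    try:
        with open(os.path.join(directory, SIDECAR)) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def record_owner(path, uid, gid):
    """Store the pretended owner of path, keyed by its file name."""
    directory, name = os.path.split(os.path.abspath(path))
    table = _load(directory)
    table[name] = [uid, gid]
    tmp = os.path.join(directory, SIDECAR + ".tmp")
    with open(tmp, "w") as f:
        json.dump(table, f)
    # Swap the new table in with a single rename, so a crash leaves either
    # the old or the new version on disk.
    os.replace(tmp, os.path.join(directory, SIDECAR))


def lookup_owner(path):
    """Return the recorded (uid, gid) for path, or None if nothing is stored."""
    directory, name = os.path.split(os.path.abspath(path))
    entry = _load(directory).get(name)
    return tuple(entry) if entry else None


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "passwd")
    open(target, "w").close()
    record_owner(target, 0, 0)

    # Replace the file: the record is tied to the name, not to the inode.
    os.remove(target)
    open(target, "w").close()

    found = lookup_owner(target)
    missing = lookup_owner(os.path.join(d, "unknown"))
```

Because the key is the file name, the information travels with the directory when it is copied to another filesystem, which a key made of inode and device number cannot offer. The price is that a rename done outside the wrapper loses the association.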

Tuesday, 29-May-2007 12:58:24 CEST