| 1 | .ie t .ds o \(bu |
| 2 | .el .ds o o |
| 3 | .de hP |
| 4 | .IP |
| 5 | \h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c |
| 6 | .. |
| 7 | .TH rsync-backup 8 "7 October 2012" rsync-backup |
| 8 | .SH NAME |
| 9 | rsync-backup \- back up files using rsync |
| 10 | .SH SYNOPSIS |
| 11 | .B rsync-backup |
| 12 | .RB [ \-nv ] |
| 13 | .RB [ \-c |
| 14 | .IR config-file ] |
| 15 | .SH DESCRIPTION |
| 16 | The |
| 17 | .B rsync-backup |
| 18 | script is a backup program of the currently popular |
| 19 | .RB ` rsync (1) |
| 20 | .BR \-\-link-dest ' |
| 21 | variety. It uses |
| 22 | .BR rsync 's |
| 23 | ability to create hardlinks from (apparently) similar existing local |
| 24 | trees to make incremental dumps efficient, even from remote sources. |
| 25 | Restoring files is easy because the backups created are just directories |
| 26 | full of files, exactly as they were on the source \(en and this is |
| 27 | verified using the |
| 28 | .BR fshash (1) |
| 29 | program. |
| 30 | .PP |
| 31 | The script does more than just run |
| 32 | .BR rsync . |
| 33 | It is also responsible for creating and removing snapshots of volumes to |
| 34 | be backed up, and expiring old dumps according to a user-specified |
| 35 | retention policy. |
| 36 | .SS Installation |
| 37 | The idea is that the |
| 38 | .B rsync-backup |
| 39 | script should be installed and run on a central backup server with local |
| 40 | access to the backup volumes. |
| 41 | .PP |
| 42 | The script should be run with full (root) privileges, so that it can |
| 43 | correctly record file ownership information. The server should also be |
| 44 | able to connect via |
| 45 | .BR ssh (1) |
| 46 | to the client machines, and run processes there as root. (This is not a |
| 47 | security disaster. Remember that the backup server is, in the end, |
| 48 | responsible for the integrity of the backup data. A dishonest backup |
| 49 | server can easily compromise a client which is being restored from |
| 50 | corrupt backup data.) |
| 51 | .SS Command-line options |
| 52 | Most of the behaviour of |
| 53 | .B rsync-backup |
| 54 | is controlled by a configuration file, described starting with the |
| 55 | section named |
| 56 | .B Configuration commands |
| 57 | below. |
| 58 | But a few features are controlled by command-line options. |
| 59 | .TP |
| 60 | .B \-h |
| 61 | Show a brief help message for the program, and exit successfully. |
| 62 | .TP |
| 63 | .B \-V |
| 64 | Show |
| 65 | .BR rsync-backup 's |
| 66 | version number and some choice pieces of build-time configuration, and |
| 67 | exit successfully. |
| 68 | .TP |
| 69 | .BI "\-c " conf |
| 70 | Read |
| 71 | .I conf |
| 72 | instead of the default configuration file (shown as |
| 73 | .B conf |
| 74 | in the |
| 75 | .B \-V |
| 76 | output). |
| 77 | .TP |
| 78 | .B \-n |
| 79 | Don't actually take a backup, or write proper logs: instead, write a |
| 80 | description of what would be done to standard error. |
| 81 | .TP |
| 82 | .B \-v |
| 83 | Produce verbose progress information on standard output while the backup |
| 84 | is running. This keeps one amused while running a backup |
| 85 | interactively. In any event, |
| 86 | .B rsync-backup |
| 87 | will report failures to standard error, and otherwise run silently, so |
| 88 | it doesn't annoy unnecessarily if run by |
| 89 | .BR cron (8). |
| 90 | .SS Backup process |
| 91 | Backing up a filesystem works as follows. |
| 92 | .hP \*o |
| 93 | Make a snapshot of the filesystem on the client, and ensure that the |
| 94 | snapshot is mounted. There are some `trivial' snapshot types which use |
| 95 | the existing mounted filesystem, and either prevent processes writing to |
| 96 | it during the backup, or just hope for the best. Other snapshot types |
| 97 | require the snapshot to be mounted somewhere distinct from the main |
| 98 | filesystem, so that the latter can continue being used. |
| 99 | .hP \*o |
| 100 | Run |
| 101 | .B rsync |
| 102 | to copy the snapshot to the backup volume \(en specifically, to |
| 103 | .IB host / fs / new \fR. |
| 104 | If this directory already exists, then it's presumed to be debris from a |
| 105 | previous attempt to dump this filesystem: |
| 106 | .B rsync |
| 107 | will update it appropriately, by adding, deleting or modifying the |
| 108 | files. This means that retrying a failed dump \(en after fixing whatever |
| 109 | caused it to go wrong, obviously! \(en is usually fairly quick. |
| 110 | .hP \*o |
| 111 | Run |
| 112 | .B fshash |
| 113 | on the client to generate a `digest' describing the contents of the |
| 114 | filesystem, and send this to the server as |
| 115 | .IB host / fs / new .fshash \fR. |
| 116 | .hP \*o |
| 117 | Release the snapshot: we don't need it any more. |
| 118 | .hP \*o |
| 119 | Run |
| 120 | .B fshash |
| 121 | over the new backup, writing its digest to |
| 122 | .BI tmp/fshash. host . fs . date \fR. |
| 123 | This gives us a digest for what the backup volume actually stored. |
| 124 | .hP \*o |
| 125 | Compare the two |
| 126 | .B fshash |
| 127 | digests. If they differ then dump the differences to the log file and |
| 128 | report a backup failure. (Backups aren't any good if they don't |
| 129 | actually back up the right thing. And you stand a better chance of |
| 130 | fixing them if you know that they're going wrong.) |
| 131 | .hP \*o |
| 132 | Commit the backup, by renaming the dump directory to |
| 133 | .IB host / fs / date |
| 134 | and the |
| 135 | .B fshash |
| 136 | digest file to |
| 137 | .IB host / fs / date .fshash \fR. |
| 138 | .PP |
| 139 | The backup is now complete. |
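The final two steps above (compare the digests, then commit by renaming) can be sketched in shell as follows. This is an illustration only, not code taken from rsync-backup: the function name verify_and_commit, the use of cmp and diff, and the scratch directory are assumptions made for the example. Only the file names new, new.fshash and tmp/fshash.host.fs.date follow the description above.

```shell
# Hypothetical sketch of the compare-and-commit steps; not the script's
# actual implementation.
set -e

# verify_and_commit DIR LOCAL-DIGEST DATE
# DIR holds `new' and `new.fshash' (the digest sent by the client);
# LOCAL-DIGEST is the digest computed over the backup volume.
verify_and_commit () {
  local dir="$1" local_digest="$2" date="$3"
  if ! cmp -s "$dir/new.fshash" "$local_digest"; then
    # Show the differences (here on stderr) and leave `new' for a retry.
    diff -u "$dir/new.fshash" "$local_digest" >&2 || :
    return 1
  fi
  mv "$dir/new" "$dir/$date"
  mv "$dir/new.fshash" "$dir/$date.fshash"
}

# Demonstration with made-up digest contents in a scratch directory.
store=$(mktemp -d)
mkdir -p "$store/host/fs/new" "$store/tmp"
echo "digest-line" >"$store/host/fs/new.fshash"
echo "digest-line" >"$store/tmp/fshash.host.fs.2012-10-07"
verify_and_commit "$store/host/fs" \
  "$store/tmp/fshash.host.fs.2012-10-07" 2012-10-07
ls "$store/host/fs"
```

Because a mismatch leaves the new directory in place, a later retry can reuse it, as described in the rsync step above.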
| 140 | .SS Configuration commands |
| 141 | The configuration file is simply a Bash shell fragment: configuration |
| 142 | commands are shell functions. |
| 143 | .TP |
| 144 | .BI "backup " "fs\fR[:\fIfsarg\fR] ..." |
| 145 | Back up the named filesystems. The corresponding |
| 146 | .IR fsarg s |
| 147 | may be required by the snapshot type. |
| 148 | .TP |
| 149 | .BI "host " host |
| 150 | Future |
| 151 | .B backup |
| 152 | commands will back up filesystems on the named |
| 153 | .IR host . |
| 154 | To back up filesystems on the backup server itself, use its hostname: |
| 155 | .B rsync-backup |
| 156 | will avoid inefficient and pointless messing about with |
| 157 | .BR ssh (1) |
| 158 | in this case. |
| 159 | This command clears the |
| 160 | .B like |
| 161 | list, and resets the retention policy to its default (i.e., the |
| 162 | policy defined prior to the first |
| 163 | .B host |
| 164 | command). |
| 165 | .TP |
| 166 | .BI "like " "host\fR ..." |
| 167 | Declare that subsequent filesystems are `similar' to like-named |
| 168 | filesystems on the named |
| 169 | .IR host s, |
| 170 | and that |
| 171 | .B rsync |
| 172 | should use those trees as potential sources of hardlinkable files. Be |
| 173 | careful when using this option without |
| 174 | .BR rsync 's |
| 175 | .B \-\-checksum |
| 176 | option: an erroneous hardlink will cause the backup to fail. (The |
| 177 | backup won't be left silently incorrect.) |
| 178 | .TP |
| 179 | .BI "retain " frequency " " duration |
| 180 | Define part of a backup retention policy: backup trees of the |
| 181 | .I frequency |
| 182 | should be kept for the |
| 183 | .IR duration . |
| 184 | The |
| 185 | .I frequency |
| 186 | can be |
| 187 | .BR daily , |
| 188 | .BR weekly , |
| 189 | .BR monthly , |
| 190 | or |
| 191 | .B annually |
| 192 | (or |
| 193 | .BR yearly , |
| 194 | which means the same); the |
| 195 | .I duration |
| 196 | may be any of |
| 197 | .BR week , |
| 198 | .BR month , |
| 199 | .BR year , |
| 200 | or |
| 201 | .BR forever . |
| 202 | Expiry considers each existing dump against the policy lines in order: |
| 203 | the last applicable line determines the dump's fate \(en so you should |
| 204 | probably write the lines in decreasing order of duration. |
| 205 | .PP |
| 206 | Groups of |
| 207 | .B retain |
| 208 | commands between |
| 209 | .B host |
| 210 | and/or |
| 211 | .B backup |
| 212 | commands collectively define a retention policy. Once a policy is |
| 213 | defined, subsequent |
| 214 | .B backup |
| 215 | operations use the policy. The first |
| 216 | .B retain |
| 217 | command after a |
| 218 | .B host |
| 219 | or |
| 220 | .B backup |
| 221 | command clears the policy and starts defining a new one. The policy |
| 222 | defined before the first |
| 223 | .B host |
| 224 | is the |
| 225 | .I default |
| 226 | policy: at the start of each |
| 227 | .B host |
| 228 | stanza, the policy is reset to the default. |
| 229 | .TP |
| 230 | .BI "retry " count |
| 231 | The |
| 232 | .B live |
| 233 | snapshot type (see below) doesn't prevent a filesystem from being |
| 234 | modified while it's being backed up. If this happens, the |
| 235 | .B fshash |
| 236 | pass will detect the difference and fail. If the filesystem in question |
| 237 | is relatively quiescent, then maybe retrying the backup will result in a |
| 238 | successful consistent copy. Following this command, a backup which |
| 239 | results in an |
| 240 | .B fshash |
| 241 | mismatch will be retried up to |
| 242 | .I count |
| 243 | times before being declared a failure. |
| 244 | .TP |
| 245 | .BI "snap " type " " \fR[\fIargs\fR...] |
| 246 | Use the snapshot |
| 247 | .I type |
| 248 | for subsequent backups. Some snapshot types require additional |
| 249 | arguments, which may be supplied here. This command clears the |
| 250 | .B retry |
| 251 | counter. |
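As a rough illustration of how these commands combine, a configuration fragment might look like the one below. The host names, the volume group name and the filesystem names are invented for the example. The stub definitions at the top exist only so that the fragment can be run on its own; rsync-backup supplies the real commands, so an actual configuration file would leave them out. Whether the lvm snapshot type needs an fsarg is not stated here, so the example assumes it does not.

```shell
# Stand-in definitions so that this fragment runs standalone.  They just
# echo their arguments; a real configuration file omits them.
for cmd in host like retain retry snap backup; do
  eval "$cmd () { echo \"$cmd \$*\"; }"
done

# Default retention policy (defined before the first `host' command),
# written in decreasing order of duration.
retain annually forever
retain monthly year
retain weekly month
retain daily week

# Hypothetical client with LVM volumes in volume group `vg-example'.
host client.example.org
snap lvm vg-example
backup root home

# A similar machine: offer like-named trees from the first client as
# hardlink sources.  `retry' follows `snap', which clears the counter.
host other.example.org
like client.example.org
snap live
retry 3
backup root:/ home:/home
```

The ordering matters: host clears the like list and resets the retention policy, and snap clears the retry counter, so like and retry are given after the commands that would otherwise undo them.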
| 252 | .SS Configuration variables |
| 253 | The following shell variables may be overridden by the configuration |
| 254 | file. |
| 255 | .TP |
| 256 | .B MAXLOG |
| 257 | The number of log files to be kept for each filesystem. Old log files |
| 258 | are deleted to keep the total number below this bound. The default |
| 259 | value is 14. |
| 260 | .TP |
| 261 | .B RSYNCOPTS |
| 262 | Command-line options to pass to |
| 263 | .BR rsync (1) |
| 264 | in addition to the basic set: |
| 265 | .B \-\-archive |
| 266 | .B \-\-hard-links |
| 267 | .B \-\-numeric-ids |
| 268 | .B \-\-del |
| 269 | .B \-\-sparse |
| 270 | .B \-\-compress |
| 271 | .B \-\-one-file-system |
| 272 | .B \-\-partial |
| 273 | .BR "\-\-filter=""dir-merge .rsync-backup""" . |
| 274 | The default is |
| 275 | .BR \-\-verbose . |
| 276 | .TP |
| 277 | .B SNAPDIR |
| 278 | LVM (and |
| 279 | .BR rfreezefs ) |
| 280 | snapshots are mounted on subdirectories below the |
| 281 | .B SNAPDIR |
| 282 | .IR "on backup clients" . |
| 283 | The default is |
| 284 | .IB mntbkpdir /snap |
| 285 | where |
| 286 | .I mntbkpdir |
| 287 | is the backup mount directory configured at build time. |
| 288 | .TP |
| 289 | .B SNAPSIZE |
| 290 | The volume size option to pass to |
| 291 | .BR lvcreate (8) |
| 292 | when creating a snapshot. The default is |
| 293 | .B \-l10%ORIGIN |
| 294 | which seems to work fairly well. |
| 295 | .TP |
| 296 | .B STOREDIR |
| 297 | Where the actual backup trees should be stored. See the section on |
| 298 | .B Archive structure |
| 299 | below. |
| 300 | The default is |
| 301 | .IB mntbkpdir /store |
| 302 | where |
| 303 | .I mntbkpdir |
| 304 | is the backup mount directory configured at build time. |
| 305 | .TP |
| 306 | .B HASH |
| 307 | The hash function to use for verifying archive integrity. This is |
| 308 | passed to the |
| 309 | .B \-H |
| 310 | option of |
| 311 | .BR fshash , |
| 312 | so it must name one of the hash functions supported by your Python's |
| 313 | .B hashlib |
| 314 | module. The default is |
| 315 | .BR sha256 . |
| 316 | .SS Hook functions |
| 317 | The configuration file may define shell functions to perform custom |
| 318 | actions at various points in the backup process. |
| 319 | .TP |
| 320 | .BI "backup_precommit_hook " host " " fs " " date |
| 321 | Called after a backup has been verified complete and is about to be |
| 322 | committed. The backup tree is in |
| 323 | .B new |
| 324 | in the current directory, and the |
| 325 | .B fshash |
| 326 | manifest is in |
| 327 | .BR new.fshash . |
| 328 | A typical action would be to create a digital signature on the |
| 329 | manifest. |
| 330 | .TP |
| 331 | .BI "backup_commit_hook " host " " fs " " date |
| 332 | Called during the commit procedure. The backup tree and manifest have |
| 333 | been renamed into their proper places. Typically one would use this |
| 334 | hook to rename files created by the |
| 335 | .B backup_precommit_hook |
| 336 | function. |
| 337 | .TP |
| 338 | .BR "whine " [ \-n ] " " \fItext\fR... |
| 339 | Called to report `interesting' events when the |
| 340 | .B \-v |
| 341 | option is in force. The default action is to echo the |
| 342 | .I text |
| 343 | to (what was initially) standard output, followed by a newline unless |
| 344 | .B \-n |
| 345 | is given. |
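A minimal sketch of the two commit hooks working together, under stated assumptions: sign_manifest is a hypothetical stand-in (it only writes a checksum) for whatever signing tool is really wanted, the .sig suffix is invented, and the commit hook is assumed to run in the same directory as the precommit hook. The demonstration at the end imitates the commit sequence by hand in a scratch directory.

```shell
# Hypothetical stand-in for a real signing command: writes a checksum of
# the manifest.  Replace with the signing tool you actually use.
sign_manifest () { cksum <"$1" >"$2"; }

# Called with the tree in `new' and the manifest in `new.fshash'.
backup_precommit_hook () {
  local host="$1" fs="$2" date="$3"
  sign_manifest new.fshash new.fshash.sig
}

# Called once the tree and manifest have been renamed; move the
# signature alongside them (assumes the same current directory).
backup_commit_hook () {
  local host="$1" fs="$2" date="$3"
  mv new.fshash.sig "$date.fshash.sig"
}

# Demonstration: imitate the commit sequence in a scratch directory.
cd "$(mktemp -d)"
mkdir new
echo "manifest" >new.fshash
backup_precommit_hook client.example.org root 2012-10-07
mv new 2012-10-07
mv new.fshash 2012-10-07.fshash
backup_commit_hook client.example.org root 2012-10-07
ls
```

Naming the signature after the date keeps it among the associated files stored next to each dump.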
| 346 | .SS Snapshot types |
| 347 | The following snapshot types are available. |
| 348 | .TP |
| 349 | .B live |
| 350 | A trivial snapshot type: attempts to back up a live filesystem. How |
| 351 | well this works depends on how active the filesystem is. If files |
| 352 | change while the dump is in progress then the |
| 353 | .B fshash |
| 354 | verification will likely fail. Backups using this snapshot type must |
| 355 | specify the filesystem mount point as the |
| 356 | .IR fsarg . |
| 357 | .TP |
| 358 | .B ro |
| 359 | A slightly less trivial snapshot type: make the filesystem read-only |
| 360 | while the dump is in progress. Backups using this snapshot type must |
| 361 | specify the filesystem mount point as the |
| 362 | .IR fsarg . |
| 363 | .TP |
| 364 | .BI "lvm " vg |
| 365 | Create snapshots using LVM. The snapshot argument is interpreted as the |
| 366 | relevant volume group. The filesystem name is interpreted as the origin |
| 367 | volume name; the snapshot will be called |
| 368 | .IB fs .bkp |
| 369 | and mounted on |
| 370 | .IB SNAPDIR / fs \fR; |
| 371 | space will be allocated to it according to the |
| 372 | .B SNAPSIZE |
| 373 | variable. |
| 374 | .TP |
| 375 | .BI "rfreezefs " client " " vg |
| 376 | This gets complicated. Suppose that a server has an LVM volume group, |
| 377 | and exports (somehow) a logical volume to a client. Examples are a host |
| 378 | providing a virtual disk to a guest, or a server providing |
| 379 | network-attached storage to a client. The server can create a snapshot |
| 380 | of the volume using LVM, but must synchronize with the client to ensure |
| 381 | that the filesystem image captured in the snapshot is clean. The |
| 382 | .BR rfreezefs (8) |
| 383 | program should be installed on the client to perform this rather |
| 384 | delicate synchronization. Declare the server using the |
| 385 | .B host |
| 386 | command as usual; pass the client's name as the |
| 387 | .I client |
| 388 | and the |
| 389 | server's volume group name as the |
| 390 | .I vg |
| 391 | snapshot arguments. Finally, backups using this snapshot type must |
| 392 | specify the filesystem mount point (or, actually, any file in the |
| 393 | filesystem) on the client, as the |
| 394 | .IR fsarg . |
| 395 | .PP |
| 396 | Additional snapshot types can be defined in the configuration file. A |
| 397 | snapshot type requires two shell functions. |
| 398 | .TP |
| 399 | .BI snap_ type " " snapargs " " fs " " fsarg |
| 400 | Create the snapshot, and write the mountpoint (on the client host) to |
| 401 | standard output, in a form suitable as an argument to |
| 402 | .BR rsync . |
| 403 | .TP |
| 404 | .BI unsnap_ type " " snapargs " " fs " " fsarg |
| 405 | Remove the snapshot. |
| 406 | .PP |
| 407 | There are a number of utility functions which can be used by snapshot |
| 408 | type handlers: please see the script for details. Please send the |
| 409 | author interesting snapshot handlers for inclusion in the main |
| 410 | distribution. |
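As a hedged sketch of the handler interface, here is a made-up snapshot type called plain which behaves much like the trivial types: it creates nothing and simply reports the existing mount point. The type name is invented, and the example assumes that a type with no snapshot arguments receives just fs and fsarg; for remote hosts the real handlers may need the script's utility functions, so check the script before relying on this.

```shell
# Hypothetical snapshot type `plain', taking no snapshot arguments, so
# the handlers are assumed to be called with just FS and FSARG.
snap_plain () {
  local fs="$1" fsarg="$2"
  # Nothing to create: report the existing mount point on stdout.
  echo "$fsarg"
}

unsnap_plain () {
  local fs="$1" fsarg="$2"
  # Nothing to clean up.
  :
}

# Demonstration of the calling convention.
mnt=$(snap_plain home /home)
echo "would copy from $mnt"
unsnap_plain home /home
```

With these functions in the configuration file, the type would be selected with snap plain before the relevant backup commands.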
| 411 | .SS Archive structure |
| 412 | Backup trees are stored in a fairly straightforward directory tree. |
| 413 | .PP |
| 414 | At the top level is one directory for each client host. There are also |
| 415 | some special entries: |
| 416 | .TP |
| 417 | .B \&.rsync-backup-store |
| 418 | This file must be present in order to indicate that a backup volume is |
| 419 | present (and not just an empty mount point). |
| 420 | .TP |
| 421 | .B fshash.cache |
| 422 | The cache database used for improving performance of local file |
| 423 | hashing. There may be other |
| 424 | .B fshash.cache-* |
| 425 | files used by SQLite for its own purposes. |
| 426 | .TP |
| 427 | .B lost+found |
| 428 | Part of the filesystem used on the backup volume. You don't want to |
| 429 | mess with this. |
| 430 | .TP |
| 431 | .B tmp |
| 432 | Used to store temporary files during the backup process. (Some of them |
| 433 | want to be on the same filesystem as the rest of the backup.) When |
| 434 | things go wrong, files are left behind in the hope that they might help |
| 435 | someone debug the mess. It's always safe to delete the files in here |
| 436 | when no backup is running. |
| 437 | .PP |
| 438 | So don't use those names for your hosts. |
| 439 | .PP |
| 440 | The next layer down contains a directory for each filesystem on the given host. |
| 441 | .PP |
| 442 | The bottom layer contains a directory for each dump of that filesystem, |
| 443 | named with the date at which the dump was started (in ISO8601 |
| 444 | .IB yyyy \- mm \- dd |
| 445 | format), together with associated files named |
| 446 | .IB date .* \fR. |
| 447 | .SH SEE ALSO |
| 448 | .BR fshash (1), |
| 449 | .BR lvm (8), |
| 450 | .BR rfreezefs (8), |
| 451 | .BR rsync (1), |
| 452 | .BR ssh (1). |
| 453 | .SH AUTHOR |
| 454 | Mark Wooding, <mdw@distorted.org.uk> |