.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH rsync-backup 8 "7 October 2012" rsync-backup
.SH NAME
rsync-backup \- back up files using rsync
.SH SYNOPSIS
.B rsync-backup
.RB [ \-nv ]
.RB [ \-c
.IR config-file ]
.SH DESCRIPTION
The
.B rsync-backup
script is a backup program of the currently popular
.RB ` rsync (1)
.BR \-\-link-dest '
variety. It uses
.BR rsync 's
ability to create hardlinks from (apparently) similar existing local
trees to make incremental dumps efficient, even from remote sources.
Restoring files is easy because the backups created are just directories
full of files, exactly as they were on the source \(en and this is
verified using the
.BR fshash (1)
program.
.PP
The script does more than just run
.BR rsync .
It is also responsible for creating and removing snapshots of volumes to
be backed up, and expiring old dumps according to a user-specified
retention policy.
.SS Installation
The idea is that the
.B rsync-backup
script should be installed and run on a central backup server with local
access to the backup volumes.
.PP
The script should be run with full (root) privileges, so that it can
correctly record file ownership information. The server should also be
able to connect via
.BR ssh (1)
to the client machines, and run processes there as root. (This is not a
security disaster. Remember that the backup server is, in the end,
responsible for the integrity of the backup data. A dishonest backup
server can easily compromise a client which is being restored from
corrupt backup data.)
.SS Command-line options
Most of the behaviour of
.B rsync-backup
is controlled by a configuration file, described starting with the
section named
.B Configuration commands
below.
But a few features are controlled by command-line options.
.TP
.B \-h
Show a brief help message for the program, and exit successfully.
.TP
.B \-V
Show
.BR rsync-backup 's
version number and some choice pieces of build-time configuration, and
exit successfully.
.TP
.BI "\-c " conf
Read
.I conf
instead of the default configuration file (shown as
.B conf
in the
.B \-V
output).
.TP
.B \-n
Don't actually take a backup, or write proper logs: instead, write a
description of what would be done to standard error.
.TP
.B \-v
Produce verbose progress information on standard output while the backup
is running. This keeps one amused while running a backup
interactively. In any event,
.B rsync-backup
will report failures to standard error, and otherwise run silently, so
it doesn't annoy unnecessarily if run by
.BR cron (8).
.SS Backup process
Backing up a filesystem works as follows.
.hP \*o
Make a snapshot of the filesystem on the client, and ensure that the
snapshot is mounted. There are some `trivial' snapshot types which use
the existing mounted filesystem, and either prevent processes writing to
it during the backup, or just hope for the best. Other snapshot types
require the snapshot to be mounted somewhere distinct from the main
filesystem, so that the latter can continue being used.
.hP \*o
Run
.B rsync
to copy the snapshot to the backup volume \(en specifically, to
.IB host / fs / new \fR.
If this directory already exists, then it's presumed to be debris from a
previous attempt to dump this filesystem:
.B rsync
will update it appropriately, by adding, deleting or modifying the
files. This means that retrying a failed dump \(en after fixing whatever
caused it to go wrong, obviously! \(en is usually fairly quick.
.hP \*o
Run
.B fshash
on the client to generate a `digest' describing the contents of the
filesystem, and send this to the server as
.IB host / fs / new .fshash \fR.
.hP \*o
Release the snapshot: we don't need it any more.
.hP \*o
Run
.B fshash
over the new backup, writing the digest to
.BI tmp/fshash. host . fs . date \fR.
This gives us a digest for what the backup volume actually stored.
.hP \*o
Compare the two
.B fshash
digests. If they differ then dump the differences to the log file and
report a backup failure. (Backups aren't any good if they don't
actually back up the right thing. And you stand a better chance of
fixing them if you know that they're going wrong.)
.hP \*o
Commit the backup, by renaming the dump directory to
.IB host / fs / date
and the
.B fshash
digest file to
.IB host / fs / date .fshash \fR.
.PP
The backup is now complete.
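.PP
The final compare-and-commit steps can be sketched in shell roughly as
follows. This is an illustration only, not the script's own code: the
function name verify_and_commit and its arguments are hypothetical, and
the sketch assumes that the current directory is the host/fs directory
and that both digests have already been written.

```shell
# Hypothetical sketch: compare the client's digest (new.fshash) with the
# digest computed over the backup volume, then commit by renaming.
# Usage: verify_and_commit VOLUME-DIGEST DATE LOGFILE
verify_and_commit () {
  local volume_digest="$1" date="$2" log="$3"

  # If the digests differ, dump the differences to the log and fail.
  if ! diff -u new.fshash "$volume_digest" >>"$log"; then
    echo "backup failed: fshash mismatch" >&2
    return 1
  fi

  # Commit: rename the dump directory and its digest into place.
  mv new "$date" &&
  mv new.fshash "$date.fshash"
}
```

On a mismatch the new directory is left in place, so a later retry can
let rsync update it rather than start again.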
.SS Configuration commands
The configuration file is simply a Bash shell fragment: configuration
commands are shell functions.
.TP
.BI "backup " "fs\fR[:\fIfsarg\fR] ..."
Back up the named filesystems. The corresponding
.IR fsarg s
may be required by the snapshot type.
.TP
.BI "host " host
Future
.B backup
commands will back up filesystems on the named
.IR host .
To back up filesystems on the backup server itself, use its hostname:
.B rsync-backup
will avoid inefficient and pointless messing about with
.BR ssh (1)
in this case.
This command clears the
.B like
list and the remote
.B user
name, and resets the retention policy to its default (i.e., the
policy defined prior to the first
.B host
command).
.TP
.BI "like " "host\fR ..."
Declare that subsequent filesystems are `similar' to like-named
filesystems on the named
.IR host s,
and that
.B rsync
should use those trees as potential sources of hardlinkable files. Be
careful when using this option without
.BR rsync 's
.B \-\-checksum
option: an erroneous hardlink will cause the backup to fail. (The
backup won't be left silently incorrect.)
.TP
.BI "retain " frequency " " duration
Define part of a backup retention policy: backup trees of the
.I frequency
should be kept for the
.IR duration .
The
.I frequency
can be
.BR daily ,
.BR weekly ,
.BR monthly ,
or
.B annually
(or
.BR yearly ,
which means the same); the
.I duration
may be any of
.BR week ,
.BR month ,
.BR year ,
or
.BR forever .
Expiry considers each existing dump against the policy lines in order:
the last applicable line determines the dump's fate \(en so you should
probably write the lines in decreasing order of duration.
.RS
.PP
Groups of
.B retain
commands between
.B host
and/or
.B backup
commands collectively define a retention policy. Once a policy is
defined, subsequent
.B backup
operations use the policy. The first
.B retain
command after a
.B host
or
.B backup
command clears the policy and starts defining a new one. The policy
defined before the first
.B host
is the
.I default
policy: at the start of each
.B host
stanza, the policy is reset to the default.
.RE
.TP
.BI "retry " count
The
.B live
snapshot type (see below) doesn't prevent a filesystem from being
modified while it's being backed up. If this happens, the
.B fshash
pass will detect the difference and fail. If the filesystem in question
is relatively quiescent, then maybe retrying the backup will result in a
successful consistent copy. Following this command, a backup which
results in an
.B fshash
mismatch will be retried up to
.I count
times before being declared a failure.
.TP
.BI "snap " type " " \fR[\fIargs\fR...]
Use the snapshot
.I type
for subsequent backups. Some snapshot types require additional
arguments, which may be supplied here. This command clears the
.B retry
counter.
.TP
.BI "user " name
Specify the user name on the remote host. Without this, calls to
.BR ssh (1)
and
.BR rsync (1)
won't specify any user name, so the default (probably from the
.BR ssh_config (5)
file) will apply.
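.PP
As an illustration of how these commands fit together, a configuration
file might look something like the following. The host names, the
volume group name and the filesystem names are all invented for the
example; only the commands themselves come from the descriptions above.

```shell
# Hypothetical configuration sketch: all names below are invented.

# Default retention policy (defined before the first `host' command),
# written in decreasing order of duration.
retain annually forever
retain monthly year
retain weekly month
retain daily week

# The backup server itself, named by its own hostname.
host backup-server
snap live
retry 3                  # after `snap', which clears the retry counter
backup root:/ home:/home

# A client with LVM volumes `root' and `home' in volume group `vg0'.
host client1
user root
like backup-server       # share unchanged files with the trees above
snap lvm vg0
backup root home

# A host with its own, shorter retention policy.
host scratch
retain daily week        # first `retain' after `host' starts a new policy
snap ro
backup data:/srv/data
```

Running rsync-backup with the \-n option is a cheap way to check what
such a file would cause to happen before relying on it.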
.SS Configuration variables
The following shell variables may be overridden by the configuration
file.
.TP
.B HASH
The hash function to use for verifying archive integrity. This is
passed to the
.B \-H
option of
.BR fshash ,
so it must name one of the hash functions supported by your Python's
.B hashlib
module.
The default is
.BR sha256 .
.TP
.B INDEXDB
The name of a SQLite database initialized by
.BR update-bkp-index (8)
in which an index is maintained of which dumps are on which backup
volumes. If the file doesn't exist, then no index is maintained. The
default is
.IB localstatedir /lib/bkp/index.db
where
.I localstatedir
is the state directory configured at build time.
.TP
.B MAXLOG
The number of log files to be kept for each filesystem. Old logfiles
are deleted to keep the total number below this bound. The default
value is 14.
.TP
.B METADIR
The metadata directory for the currently mounted backup volume.
The default is
.IB mntbkpdir /meta
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B RSYNCOPTS
Command-line options to pass to
.BR rsync (1)
in addition to the basic set:
.B \-\-archive
.B \-\-hard-links
.B \-\-numeric-ids
.B \-\-del
.B \-\-sparse
.B \-\-compress
.B \-\-one-file-system
.B \-\-partial
.BR "\-\-filter=""dir-merge .rsync-backup""" .
The default is
.BR \-\-verbose .
.TP
.B SNAPDIR
LVM (and
.BR rfreezefs )
snapshots are mounted on subdirectories below the
.B SNAPDIR
.IR "on backup clients" .
The default is
.IB mntbkpdir /snap
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B SNAPSIZE
The volume size option to pass to
.BR lvcreate (8)
when creating a snapshot. The default is
.B \-l10%ORIGIN
which seems to work fairly well.
.TP
.B STOREDIR
Where the actual backup trees should be stored. See the section on
.B Archive structure
below.
The default is
.IB mntbkpdir /store
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B VOLUME
The name of the current volume. If this is left unset, the volume name
is read from the file
.IB METADIR /volume
once at the start of the backup run.
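.PP
Overriding these variables is a matter of ordinary shell assignment in
the configuration file. The values below are purely illustrative, and
the sketch assumes that RSYNCOPTS is a plain string of options rather
than an array, which the text above does not state.

```shell
# Hypothetical overrides; the values are examples, not recommendations.
HASH=sha512                           # must be known to Python's hashlib
MAXLOG=30                             # keep more logs per filesystem
RSYNCOPTS="--verbose --bwlimit=5000"  # assumed to be a plain string
SNAPSIZE=-l20%ORIGIN                  # larger LVM snapshot allocation
```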
.SS Hook functions
The configuration file may define shell functions to perform custom
actions at various points in the backup process.
.TP
.BI "backup_precommit_hook " host " " fs " " date
Called after a backup has been verified complete and is about to be
committed. The backup tree is in
.B new
in the current directory, and the
.B fshash
manifest is in
.BR new.fshash .
A typical action would be to create a digital signature on the
manifest.
.TP
.BI "backup_commit_hook " host " " fs " " date
Called during the commit procedure. The backup tree and manifest have
been renamed into their proper places. Typically one would use this
hook to rename files created by the
.B backup_precommit_hook
function.
.TP
.BR "whine " [ \-n ] " " \fItext\fR...
Called to report `interesting' events when the
.B \-v
option is in force. The default action is to echo the
.I text
to (what was initially) standard output, followed by a newline unless
.B \-n
is given.
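.PP
A pair of hooks which sign the manifest might look roughly like this.
It is a sketch under assumptions: the .sig file names are an invented
convention, a usable GnuPG signing key is presumed to be available to
the user running the backup, and the commit hook is assumed to run in
the same directory as the precommit hook.

```shell
# Hypothetical hooks: sign the fshash manifest before commit, then move
# the signature alongside the committed manifest.

backup_precommit_hook () {
  # Arguments: host fs date.  The manifest is `new.fshash' in the
  # current directory.
  gpg --batch --yes --detach-sign --output new.fshash.sig new.fshash
}

backup_commit_hook () {
  local date="$3"
  # Assumes the current directory is unchanged since the precommit hook.
  if [ -f new.fshash.sig ]; then
    mv new.fshash.sig "$date.fshash.sig"
  fi
}
```

Naming the signature after the committed manifest matches the pattern
of associated files named after the dump date.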
.SS Snapshot types
The following snapshot types are available.
.TP
.B live
A trivial snapshot type: attempts to back up a live filesystem. How
well this works depends on how active the filesystem is. If files
change while the dump is in progress then the
.B fshash
verification will likely fail. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.B ro
A slightly less trivial snapshot type: make the filesystem read-only
while the dump is in progress. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.BI "lvm " vg
Create snapshots using LVM. The snapshot argument is interpreted as the
relevant volume group. The filesystem name is interpreted as the origin
volume name; the snapshot will be called
.IB fs .bkp
and mounted on
.IB SNAPDIR / fs \fR;
space will be allocated to it according to the
.I SNAPSIZE
variable.
.TP
.BI "rfreezefs " client " " vg
This gets complicated. Suppose that a server has an LVM volume group,
and exports (somehow) a logical volume to a client. Examples are a host
providing a virtual disk to a guest, or a server providing
network-attached storage to a client. The server can create a snapshot
of the volume using LVM, but must synchronize with the client to ensure
that the filesystem image captured in the snapshot is clean. The
.BR rfreezefs (8)
program should be installed on the client to perform this rather
delicate synchronization. Declare the server using the
.B host
command as usual; pass the client's name as the
.I client
and the
server's volume group name as the
.I vg
snapshot arguments. Finally, backups using this snapshot type must
specify the filesystem mount point (or, actually, any file in the
filesystem) on the client, as the
.IR fsarg .
.PP
Additional snapshot types can be defined in the configuration file. A
snapshot type requires two shell functions.
.TP
.BI snap_ type " " snapargs " " fs " " fsarg
Create the snapshot, and write the mountpoint (on the client host) to
standard output, in a form suitable as an argument to
.BR rsync .
.TP
.BI unsnap_ type " " snapargs " " fs " " fsarg
Remove the snapshot.
.PP
There are a number of utility functions which can be used by snapshot
type handlers: please see the script for details. Please send the
author interesting snapshot handlers for inclusion in the main
distribution.
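.PP
The smallest possible pair of handlers might look like the following.
The type name plain is hypothetical, and the sketch assumes only that
the fs and fsarg are the last two arguments, since the text above does
not say how the snapshot arguments are passed when there are none or
several.

```shell
# Hypothetical snapshot type `plain': takes no snapshot at all and
# simply reports the mount point given as the fsarg.

snap_plain () {
  shift "$(($# - 2))"    # discard any snapshot arguments
  local fsarg="$2"       # now $1 is the fs and $2 the fsarg
  echo "$fsarg"          # mount point on the client, for rsync
}

unsnap_plain () {
  :                      # nothing was created, so nothing to remove
}
```

With these defined, a configuration could select the type with a snap
plain command and then name each filesystem's mount point as its fsarg.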
.SS Archive structure
Backup trees are stored in a fairly straightforward directory tree.
.PP
At the top level is one directory for each client host. There are also
some special entries:
.TP
.B \&.rsync-backup-store
This file must be present in order to indicate that a backup volume is
present (and not just an empty mount point).
.TP
.B fshash.cache
The cache database used for improving performance of local file
hashing. There may be other
.B fshash.cache-*
files used by SQLite for its own purposes.
.TP
.B lost+found
Part of the filesystem used on the backup volume. You don't want to
mess with this.
.TP
.B tmp
Used to store temporary files during the backup process. (Some of them
want to be on the same filesystem as the rest of the backup.) When
things go wrong, files are left behind in the hope that they might help
someone debug the mess. It's always safe to delete the files in here
when no backup is running.
.PP
So don't use those names for your hosts.
.PP
The next layer down contains a directory for each filesystem on the given host.
.PP
The bottom layer contains a directory for each dump of that filesystem,
named with the date at which the dump was started (in ISO8601
.IB yyyy \(en mm \(en dd
format), together with associated files named
.IB date .* \fR.
There is also a symbolic link
.B last
referring to the most recent backup of the filesystem.
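.PP
Because the archive is just directories, ordinary shell tools are
enough to inspect it. The helper below is a hypothetical sketch, not
part of the distribution: the name list_dumps is invented, and it
assumes that dump directories are named with plain hyphens, as in
2012-10-07.

```shell
# Hypothetical helper: print the dump directories for HOST/FS under a
# store directory, oldest first, followed by the target of `last'.
# Usage: list_dumps STOREDIR HOST FS
list_dumps () {
  local dir="$1/$2/$3" d
  for d in "$dir"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
    # Skip the unexpanded pattern if there are no dumps at all.
    if [ -d "$d" ]; then echo "${d##*/}"; fi
  done
  if [ -L "$dir/last" ]; then
    echo "last -> $(readlink "$dir/last")"
  fi
}
```

The glob sorts names lexically, which for dates in this format is also
chronological order.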
.SH SEE ALSO
.BR fshash (1),
.BR lvm (8),
.BR rfreezefs (8),
.BR rsync (1),
.BR ssh (1),
.BR update-bkp-index (8).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>