.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH rsync-backup 8 "7 October 2012" rsync-backup
.SH SYNOPSIS
.B rsync-backup
.RB [ \-v ]
.RB [ \-c
.IR config-file ]
.SH DESCRIPTION
The
.B rsync-backup
script is a backup program of the currently popular
.RB ` rsync (1)
.BR \-\-link-dest '
variety. It uses
.BR rsync 's
ability to create hardlinks from (apparently) similar existing local
trees to make incremental dumps efficient, even from remote sources.
Restoring files is easy because the backups created are just directories
full of files, exactly as they were on the source \(en and this is
verified using the
.BR fshash (1)
program.
.PP
The script does more than just run
.BR rsync .
It is also responsible for creating and removing snapshots of volumes to
be backed up, and expiring old dumps according to a user-specified
retention policy.
.SS Installation
The idea is that the
.B rsync-backup
script should be installed and run on a central backup server with local
access to the backup volumes.
.PP
The script should be run with full (root) privileges, so that it can
correctly record file ownership information. The server should also be
able to connect via
.BR ssh (1)
to the client machines, and run processes there as root. (This is not a
security disaster. Remember that the backup server is, in the end,
responsible for the integrity of the backup data. A dishonest backup
server can easily compromise a client which is being restored from
corrupt backup data.)
.SS Command-line options
Most of the behaviour of
.B rsync-backup
is controlled by a configuration file, described starting with the
section named
.B Configuration commands
below.
But a few features are controlled by command-line options.
.TP
.B \-h
Show a brief help message for the program, and exit successfully.
.TP
.B \-V
Show
.BR rsync-backup 's
version number and some choice pieces of build-time configuration, and
exit successfully.
.TP
.BI "\-c " conf
Read
.I conf
instead of the default configuration file (shown as
.B conf
in the
.B \-V
output).
.TP
.B \-v
Produce verbose progress information on standard output while the backup
is running. This keeps one amused while running a backup
interactively. In any event,
.B rsync-backup
will report failures to standard error, and otherwise run silently, so
it doesn't annoy unnecessarily if run by
.BR cron (8).
.SS Backup process
Backing up a filesystem works as follows.
.hP \*o
Make a snapshot of the filesystem on the client, and ensure that the
snapshot is mounted. There are some `trivial' snapshot types which use
the existing mounted filesystem, and either prevent processes writing to
it during the backup, or just hope for the best. Other snapshot types
require the snapshot to be mounted somewhere distinct from the main
filesystem, so that the latter can continue being used.
.hP \*o
Run
.B rsync
to copy the snapshot to the backup volume \(en specifically, to
.IB host / fs / new \fR.
If this directory already exists, then it's presumed to be debris from a
previous attempt to dump this filesystem:
.B rsync
will update it appropriately, by adding, deleting or modifying the
files. This means that retrying a failed dump \(en after fixing whatever
caused it to go wrong, obviously! \(en is usually fairly quick.
.hP \*o
Run
.B fshash
on the client to generate a `digest' describing the contents of the
filesystem, and send this to the server as
.IB host / fs / new .fshash \fR.
.hP \*o
Release the snapshot: we don't need it any more.
.hP \*o
Run
.B fshash
over the new backup; specifically, to
.BI tmp/fshash. host . fs . date \fR.
This gives us a digest for what the backup volume actually stored.
.hP \*o
Compare the two
.B fshash
digests. If they differ then dump the differences to the log file and
report a backup failure. (Backups aren't any good if they don't
actually back up the right thing. And you stand a better chance of
fixing them if you know that they're going wrong.)
.hP \*o
Commit the backup, by renaming the dump directory to
.IB host / fs / date
and the
.B fshash
digest file to
.IB host / fs / date .fshash \fR.
.PP
The backup is now complete.
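.PP
The final compare-and-commit steps can be pictured with the following
Bash sketch. It is an illustration only, not the script's own code: the
function name commit_dump and its three arguments are hypothetical, and
both digest files are assumed to exist already.

```shell
# Hypothetical helper: compare the client's digest with the one computed
# over the backup volume, and commit the dump only if they agree.
commit_dump () {
  fsdir=$1 date=$2 check=$3
  if ! cmp -s "$fsdir/new.fshash" "$check"; then
    echo "fshash mismatch in $fsdir: dump not committed" >&2
    return 1
  fi
  # Rename the dump tree and its digest into their dated places.
  mv "$fsdir/new" "$fsdir/$date" &&
    mv "$fsdir/new.fshash" "$fsdir/$date.fshash"
}
```

On a mismatch the sketch leaves the new directory in place, so a later
retry can update it rather than start from scratch.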
.SS Configuration commands
The configuration file is simply a Bash shell fragment: configuration
commands are shell functions.
.TP
.BI "backup " "fs\fR[:\fIfsarg\fR] ..."
Back up the named filesystems. The corresponding
.IR fsarg s
may be required by the snapshot type.
.TP
.BI "host " host
Future
.B backup
commands will back up filesystems on the named
.IR host .
To back up filesystems on the backup server itself, use its hostname:
.B rsync-backup
will avoid inefficient and pointless messing about with
.BR ssh (1)
in this case.
This command clears the
.B like
list.
.TP
.BI "like " "host\fR ..."
Declare that subsequent filesystems are `similar' to like-named
filesystems on the named
.IR host s,
and that
.B rsync
should use those trees as potential sources of hardlinkable files. Be
careful when using this option without
.BR rsync 's
.B \-\-checksum
option: an erroneous hardlink will cause the backup to fail. (The
backup won't be left silently incorrect.)
.TP
.BI "retain " frequency " " duration
Define part of a backup retention policy: backup trees of the
.I frequency
should be kept for the
.IR duration .
The
.I frequency
can be
.BR daily ,
.BR weekly ,
.BR monthly ,
or
.B annually
(or
.BR yearly ,
which means the same); the
.I duration
may be any of
.BR week ,
.BR month ,
.BR year ,
or
.BR forever .
Expiry considers each existing dump against the policy lines in order:
the last applicable line determines the dump's fate \(en so you should
probably write the lines in decreasing order of duration.
.TP
.BI "retry " count
The
.B live
snapshot type (see below) doesn't prevent a filesystem from being
modified while it's being backed up. If this happens, the
.B fshash
pass will detect the difference and fail. If the filesystem in question
is relatively quiescent, then maybe retrying the backup will result in a
successful consistent copy. Following this command, a backup which
results in an
.B fshash
mismatch will be retried up to
.I count
times before being declared a failure.
.TP
.BI "snap " type " " \fR[\fIargs\fR...]
Use the snapshot
.I type
for subsequent backups. Some snapshot types require additional
arguments, which may be supplied here. This command clears the
.B retry
counter.
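.PP
As an illustration, a small configuration file using the commands above
might look like the following sketch. The host names, the volume group
vg0 and the filesystem names are all made up, and placing the retention
policy before the backup commands is an assumption rather than a stated
requirement.

```shell
# Illustrative only: every host, volume group and filesystem name here
# is hypothetical.

# Retention policy, written in decreasing order of duration.
retain annually forever
retain monthly year
retain weekly month
retain daily week

# A client whose volumes live in LVM: the filesystem names are the
# origin volume names in volume group vg0.
host client1.example.org
snap lvm vg0
backup root home

# A similar client backed up live; `host' cleared the like list and
# `snap' clears the retry counter, so both are set afterwards.
host client2.example.org
like client1.example.org
snap live
retry 2
backup root:/
```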
.SS Configuration variables
The following shell variables may be overridden by the configuration
file.
.TP
.B MAXLOG
The number of log files to be kept for each filesystem. Old log files
are deleted to keep the total number below this bound. The default
value is 14.
.TP
.B RSYNCOPTS
Command-line options to pass to
.BR rsync (1)
in addition to the basic set:
.B \-\-archive
.B \-\-hard-links
.B \-\-numeric-ids
.B \-\-del
.B \-\-sparse
.B \-\-compress
.B \-\-one-file-system
.B \-\-partial
.BR "\-\-filter=""dir-merge .rsync-backup""" .
The default is
.BR \-\-verbose .
.TP
.B SNAPDIR
LVM (and
.BR rfreezefs )
snapshots are mounted on subdirectories below the
.B SNAPDIR
.IR "on backup clients" .
The default is
.IB mntbkpdir /snap
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B SNAPSIZE
The volume size option to pass to
.BR lvcreate (8)
when creating a snapshot. The default is
.B \-l10%ORIGIN
which seems to work fairly well.
.TP
.B STOREDIR
Where the actual backup trees should be stored. See the section on
.B Archive structure
below.
The default is
.IB mntbkpdir /store
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B HASH
The hash function to use for verifying archive integrity. This is
passed to the
.B \-H
option of
.BR fshash ,
so it must name one of the hash functions supported by your Python's
.B hashlib
module. The default is
.BR sha256 .
.SS Hook functions
The configuration file may define shell functions to perform custom
actions at various points in the backup process.
.TP
.BI "backup_precommit_hook " host " " fs " " date
Called after a backup has been verified complete and is about to be
committed. The backup tree is in
.B new
in the current directory, and the
.B fshash
manifest is in
.BR new.fshash .
A typical action would be to create a digital signature on the
manifest.
.TP
.BI "backup_commit_hook " host " " fs " " date
Called during the commit procedure. The backup tree and manifest have
been renamed into their proper places. Typically one would use this
hook to rename files created by the
.B backup_precommit_hook
function.
.TP
.BR "whine " [ \-n ] " " \fItext\fR...
Called to report `interesting' events when the
.B \-v
option is in force. The default action is to echo the
.I text
to (what was initially) standard output, followed by a newline unless
.B \-n
is given.
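.PP
A pair of hooks which sign the manifest might be sketched as follows.
This is only an example under assumptions: it uses
.BR gpg (1)
with whatever default key the invoking user has, the signature file
names are invented here, and the commit hook is assumed to run in the
same directory as the precommit hook.

```shell
# Sketch: sign the manifest before commit.  The name new.fshash.sig is
# this example's own choice, not something rsync-backup defines.
backup_precommit_hook () {
  host=$1 fs=$2 date=$3
  gpg --output new.fshash.sig --detach-sign new.fshash
}

# Sketch: once the tree and manifest have been renamed, move the
# signature alongside the dated manifest.  Assumes the current directory
# is still the one holding the dump.
backup_commit_hook () {
  host=$1 fs=$2 date=$3
  mv new.fshash.sig "$date.fshash.sig"
}
```

Signing at precommit time means a failure to sign happens while the
dump is still uncommitted; whether such a failure aborts the backup
should be checked against the script itself.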
.SS Snapshot types
The following snapshot types are available.
.TP
.B live
A trivial snapshot type: attempts to back up a live filesystem. How
well this works depends on how active the filesystem is. If files
change while the dump is in progress then the
.B fshash
verification will likely fail. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.B ro
A slightly less trivial snapshot type: make the filesystem read-only
while the dump is in progress. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.BI "lvm " vg
Create snapshots using LVM. The snapshot argument is interpreted as the
relevant volume group. The filesystem name is interpreted as the origin
volume name; the snapshot will be called
.IB fs .bkp
and mounted on
.IB SNAPDIR / fs \fR;
space will be allocated to it according to the
.I SNAPSIZE
variable.
.TP
.BI "rfreezefs " client " " vg
This gets complicated. Suppose that a server has an LVM volume group,
and exports (somehow) a logical volume to a client. Examples are a host
providing a virtual disk to a guest, or a server providing
network-attached storage to a client. The server can create a snapshot
of the volume using LVM, but must synchronize with the client to ensure
that the filesystem image captured in the snapshot is clean. The
.BR rfreezefs (8)
program should be installed on the client to perform this rather
delicate synchronization. Declare the server using the
.B host
command as usual; pass the client's name as the
.I client
and the
server's volume group name as the
.I vg
snapshot arguments. Finally, backups using this snapshot type must
specify the filesystem mount point (or, actually, any file in the
filesystem) on the client, as the
.IR fsarg .
.PP
Additional snapshot types can be defined in the configuration file. A
snapshot type requires two shell functions.
.TP
.BI snap_ type " " snapargs " " fs " " fsarg
Create the snapshot, and write the mountpoint (on the client host) to
standard output, in a form suitable as an argument to
.BR rsync .
.TP
.BI unsnap_ type " " snapargs " " fs " " fsarg
Remove the snapshot.
.PP
There are a number of utility functions which can be used by snapshot
type handlers: please see the script for details. Please send the
author interesting snapshot handlers for inclusion in the main
distribution.
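.PP
The shape of a handler pair can be sketched with a deliberately trivial
type. The type name plain is hypothetical, and the sketch assumes the
snapshot arguments arrive as a single word; how the script really
passes them, and its utility functions, should be checked in the script
itself.

```shell
# Hypothetical snapshot type "plain": no real snapshot is taken.
snap_plain () {
  snapargs=$1 fs=$2 fsarg=$3
  # Report the existing mount point as the tree for rsync to copy.
  echo "$fsarg"
}

unsnap_plain () {
  snapargs=$1 fs=$2 fsarg=$3
  # Nothing was created, so there is nothing to remove.
  :
}
```

With these definitions in the configuration file, the type would be
selected with the snap command in the same way as a built-in one.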
.SS Archive structure
Backup trees are stored in a fairly straightforward directory tree.
.PP
At the top level is one directory for each client host. There are also
some special entries:
.TP
.B \&.rsync-backup-store
This file must be present in order to indicate that a backup volume is
present (and not just an empty mount point).
.TP
.B fshash.cache
The cache database used for improving performance of local file
hashing. There may be other
.B fshash.cache-*
files used by SQLite for its own purposes.
.TP
.B lost+found
Part of the filesystem used on the backup volume. You don't want to
mess with this.
.TP
.B tmp
Used to store temporary files during the backup process. (Some of them
want to be on the same filesystem as the rest of the backup.) When
things go wrong, files are left behind in the hope that they might help
someone debug the mess. It's always safe to delete the files in here
when no backup is running.
.PP
So don't use those names for your hosts.
.PP
The next layer down contains a directory for each filesystem on the given host.
.PP
The bottom layer contains a directory for each dump of that filesystem,
named with the date at which the dump was started (in ISO8601
.IB yyyy \(en mm \(en dd
format), together with associated files named
.IB date .* \fR.
.SH SEE ALSO
.BR fshash (1),
.BR lvm (8),
.BR rfreezefs (8),
.BR rsync (1),
.BR ssh (1).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>