.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH fshash 1 "8 October 2012" rsync-backup
.SH SYNOPSIS
.B fshash
.RB [ \-a ]
.RB [ \-c
.IR cache ]
.RB [ \-f
.IR format ]
.RB [ \-H
.IR hash ]
.RI [ file
\&...]
.SH DESCRIPTION
The
.B fshash
program generates digests of filesystems. It's similar in concept to
(but somewhat different from) Ian Jackson's
.BR summer (1)
tool.
.PP
The idea is to capture everything interesting about a filesystem in a
file with the following properties:
.TP
.I Completeness
The digest file describes everything `interesting' about the filesystem,
such that two filesystems which are interestingly different will have
distinct digests.
.TP
.I Canonicalness
If two filesystems aren't different in any interesting way, then their
digests should be identical.
.TP
.I Readability
Given two subtly different filesystems, it's easy for a human equipped
with digests for them and
.BR diff (1)
to work out what the differences actually are.
.SS Command-line processing
The following command-line arguments are accepted.
.TP
.B \-h, \-\-help
Show a summary of the command-line syntax, and exit successfully.
.TP
.B \-\-version
Show the program's version number, and exit successfully.
.TP
.B \-a, \-\-all
Clear the cache of information about all files except those processed in
this run.
.TP
.B \-c, \-\-cache=\fIfile
Keep a cache of file hashes in
.IR file .
The cache is keyed by inode and modification time: if a file already has
an entry in the cache then it won't be hashed again, which can
provide a valuable performance improvement on large filesystems. If
.I file
doesn't exist, then it will be created.
.TP
.B \-f, \-\-files=\fIformat
Read a list of filenames on standard input in the given
.I format
and write digest lines for them. The
.I format
may be:
.B find0
for simple null-terminated names, as produced by
.BR "find \-print0" ;
or
.B rsync
for file data as produced by
.BR rsync (1).
The latter is useful, since
.B rsync
has powerful file inclusion and exclusion capabilities \(en and a common
use case is generating a digest for a collection of files copied using
.BR rsync .
(The
.B find0
format doesn't work well: see
.B BUGS
below.)
.TP
.B \-H, \-\-hash=\fIhash
Use the
.I hash
function, which can be any hash function supported by Python's
.BR hashlib .
This option may be omitted: if it is, then the hash is read from the
cache file; if there is no cache file either, then an error is reported.
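.PP
The caching behaviour described for the cache option can be sketched
as follows. This is only an illustration, not the program's own code:
the function name, the in-memory dictionary standing in for the cache
file, and the exact key layout are all assumptions.

```python
import hashlib
import os


def cached_digest(path, hash_name, cache):
    """Return the hex digest of the file at PATH.

    CACHE is a plain dict standing in for the on-disk cache file
    (an assumption for this sketch).  The key is built from the inode
    number and modification time, as the text describes; the precise
    key layout used by fshash itself is not documented here.
    """
    st = os.stat(path)
    key = (st.st_ino, st.st_mtime_ns)
    if key in cache:
        # Cache hit: the file is not read or hashed again.
        return cache[key]
    h = hashlib.new(hash_name)  # any name hashlib supports, e.g. "sha256"
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    cache[key] = h.hexdigest()
    return cache[key]
```

A second call for an unchanged file returns the stored digest without
opening the file, which is where the saving on large filesystems would
come from.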
.PP
Positional arguments are interpreted as files and directories to be
processed, in order. A directory name which ends in
.RB ` / '
is treated specially:
.B fshash
writes filenames relative to the given directory.
.SS Output format
Information about each filesystem object is written on a separate line.
These lines can be quite long, and consist of a number of fields:
.hP 1.
For regular files, a cryptographic hash of the file's content, in
hexadecimal. For other kinds of filesystem object, a description of the
object type and any special information about it, in square brackets,
and padded with spaces so as to take the same width as a hash; see
below for details.
.hP 2.
A `virtual inode identifier': a string which will be the same in two
lines if and only if they represent hard links to the same underlying
inode. Some care is taken so that files are assigned the same
identifier even if other parts of the filesystem are different, so as to
avoid spurious differences.
.hP 3.
The object's permissions and mode bits, in octal.
.hP 4.
The file's owner and group, in decimal, separated by a colon.
.hP 5.
The file's last-modified time, in UTC, in ISO8601 format, i.e.,
.IB yyyy \- mm \- dd T hh : mm : ss Z \fR.
.hP 6.
The file's size in bytes, in decimal.
.hP 7.
The file's name (relative to some appropriate parent directory).
Characters which
would cause ambiguity are escaped: tab, linefeed and carriage return are
printed as
.RB ` \et ',
.RB ` \en ',
and
.RB ` \er ',
respectively;
.RB ` ' '
is printed as
.RB ` \e' ';
.RB ` \e '
is printed as
.RB ` \e\e ';
and other codes outside the range 32\(en127 are printed as hex escapes,
in the form
.RB ` \ex\fIxx '.
Finally, the sequence
.RB ` \~\->\~ '
is printed as
.RB ` \~\e\->\~ '
so that symlink targets are presented unambiguously (see below).
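.PP
The filename escaping rules for the name field can be sketched as
below. This is a reading of the rules as stated, not the program's own
code: the function name is hypothetical, the use of lowercase hex
digits is an assumption, and the range 32 to 127 is taken literally
from the text.

```python
def escape_name(name):
    """Escape a filename, given as bytes, following the stated rules."""
    # Tab, linefeed, carriage return, single quote, and backslash.
    simple = {
        0x09: "\\t",
        0x0A: "\\n",
        0x0D: "\\r",
        0x27: "\\'",
        0x5C: "\\\\",
    }
    out = []
    for b in name:
        if b in simple:
            out.append(simple[b])
        elif 32 <= b <= 127:
            # Range taken literally from the text.
            out.append(chr(b))
        else:
            # Hex escape; lowercase digits are an assumption.
            out.append("\\x%02x" % b)
    # Protect the arrow sequence so symlink targets stay unambiguous.
    return "".join(out).replace(" -> ", " \\-> ")
```

Because the backslash is itself escaped first, an arrow sequence in the
output that is not preceded by a backslash can only be the separator
between a symbolic link's name and its target.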
.PP
For non-regular file objects, the first field is an information field
enclosed in square brackets, and some of the other fields provide other
information or are suppressed, as follows.
.TP
.I Errors
If there was an error reading the object's metadata then the information
field shows
.BI E nn
.IR message ,
and the other fields, except the name, are printed as
.B error
rather than having any useful information.
.TP
.I Sockets
The information field shows
.BR socket .
.TP
.I Named pipes
The information field shows
.BR fifo .
.TP
.I Symbolic links
The information field shows
.BR symbolic-link .
The name is followed by
.RB ` \~\->\~ '
and the link target (or
.BI <E nn \~ message >
if there was an error reading the link destination).
.TP
.I Directories
The information field shows
.BR directory ,
and the size field shows
.B dir
(since directory sizes are not consistent across filesystem
implementations). The name is followed by
.RB ` / '.
.TP
.I Block and character devices
The information field shows
.B block-device
or
.BR character-device ,
as appropriate, followed by the major and minor device numbers in
decimal, and separated by a colon.
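.PP
The mapping from object type to information field can be sketched as
below, for Unix-like systems. The function name, the default width of
64 (the length of a SHA-256 hex digest), the space between a device
type and its numbers, and the placement of padding after the closing
bracket are all assumptions; the error case is left out.

```python
import os
import stat


def info_field(st, width=64):
    """Return the information field for a non-regular object.

    ST is the result of os.lstat().  Returns None for regular files,
    which carry a content hash instead.
    """
    mode = st.st_mode
    if stat.S_ISREG(mode):
        return None
    if stat.S_ISSOCK(mode):
        label = "socket"
    elif stat.S_ISFIFO(mode):
        label = "fifo"
    elif stat.S_ISLNK(mode):
        label = "symbolic-link"
    elif stat.S_ISDIR(mode):
        label = "directory"
    elif stat.S_ISBLK(mode) or stat.S_ISCHR(mode):
        kind = "block-device" if stat.S_ISBLK(mode) else "character-device"
        # Major and minor device numbers in decimal, colon-separated.
        label = "%s %d:%d" % (kind, os.major(st.st_rdev), os.minor(st.st_rdev))
    else:
        label = "unknown"  # hypothetical fallback, not from the text
    # Pad with spaces so the field is as wide as a hash would be.
    return ("[%s]" % label).ljust(width)
```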
.SH BUGS
No attempt is made to sort filenames read in
.B find0
format, so they're not very likely to match digests produced any other
way. Indeed, they're not very likely to match digests produced using the
.B find0
format on other machines either.
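.PP
One possible workaround, sketched here as an external filter and not
something the program does itself, is to sort the null-terminated list
before it is read. The function name is hypothetical, and a stable
order alone is not claimed to make digests match those produced in
other ways.

```python
def sorted_find0(data):
    """Sort a null-terminated name list given as bytes.

    Names are compared bytewise, so the order does not depend on the
    locale or on the order in which the filesystem returned them.
    """
    names = [n for n in data.split(b"\0") if n]
    names.sort()
    return b"".join(n + b"\0" for n in names)
```

Such a filter would sit between the program producing the list and the
standard input of the digest run.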
.SH SEE ALSO
.BR find (1),
.BR rsync (1),
.BR sha256sum (1)
etc.
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>