Commit | Line | Data |
---|---|---|
f6b4ffdc MW |
1 | .TH rfreezefs 8 "October 2011" "rsync-backup" |
2 | .SH NAME | |
3 | rfreezefs \- freeze a filesystem safely | |
4 | .SH SYNOPSIS | |
5 | .B rfreezefs | |
6 | .RB [ \-n ] | |
7 | .RB [ \-a | |
8 | .IR address ] | |
9 | .RB [ \-p | |
10 | .IR loport [\fB\- hiport ]] | |
11 | .I filesystem | |
12 | \&... | |
13 | .SH DESCRIPTION | |
14 | The | |
15 | .B rfreezefs | |
16 | program freezes one or more mounted filesystems for a period of time, | |
17 | and then thaws them. For more detail on what this means, why you'd want | |
18 | to, and how you might go about using | |
19 | .B rfreezefs | |
20 | to do it, see below. | |
21 | .PP | |
22 | The following command-line options are recognized. | |
23 | .TP | |
24 | .B "\-h, \-\-help" | |
25 | Writes a help message to standard output, and exits with status 0. | |
26 | .TP | |
27 | .B "\-v, \-\-version" | |
28 | Writes the version number to standard output, and exits with status 0. | |
29 | .TP | |
30 | .B "\-u, \-\-usage" | |
31 | Writes a command-line usage synopsis to standard output, and exits with | |
32 | status 0. | |
33 | .TP | |
34 | .BI "\-a, \-\-address=" address | |
35 | Listen only for incoming connections to the given | |
36 | .IR address . | |
37 | The default is to listen for connections to any local address. | |
38 | .TP | |
39 | .B "\-n, \-\-not-really" | |
40 | Don't actually freeze or thaw any filesystems; instead, write messages | |
41 | to standard error explaining what would be done. | |
42 | .TP | |
43 | .BI "\-p, \-\-port-range=" loport\fR[ \- hiport \fR] | |
44 | Listen for incoming connections on a port between | |
45 | .I loport | |
46 | and | |
47 | .IR hiport . | |
48 | If | |
49 | .I hiport | |
50 | is omitted, listen for connections only on | |
51 | .IR loport . | |
52 | The default is to allow the kernel a free choice of local port number. | |
53 | .PP | |
54 | The | |
55 | .I filesystem | |
56 | arguments name the filesystems to be frozen. There must be at least one | |
57 | such argument. It's conventional to name the filesystem mount points, | |
58 | though actually any file or directory in the filesystem will do. The | |
59 | files are opened read-only. | |
60 | .PP | |
61 | The | |
62 | .B rfreezefs | |
63 | program starts, parses its command line, opens the named files, and | |
64 | creates a listening TCP socket according to the command-line options. | |
65 | It then prints a sequence of lines to standard output, which may have | |
66 | one of the following forms. | |
67 | .TP | |
68 | .BI "PORT " port | |
69 | Announces the TCP | |
70 | .I port | |
71 | number on which | |
72 | .B rfreezefs | |
73 | is listening for incoming connections. | |
74 | .TP | |
75 | .BI "TOKEN " label " " token | |
76 | Declares a `token': a randomly chosen string which is to be used in the | |
77 | network connection. The token's value is | |
78 | .IR token : | |
79 | token values are a sequence of non-whitespace printable ASCII | |
80 | characters, but their precise structure is not specified. The token | |
81 | value will have the meaning given by the | |
82 | .IR label , | |
83 | which is one of the token labels described below. | |
84 | .TP | |
85 | .B READY | |
86 | Marks the end of the lines and announces that | |
87 | .B rfreezefs | |
88 | is ready to accept connections. | |
89 | .PP | |
90 | These lines may be sent in any order, except that | |
91 | .B READY | |
92 | is always last. There may be many | |
93 | .B TOKEN | |
94 | lines. | |
95 | .PP | |
96 | Network communications use a simple plain-text line-oriented protocol. | |
97 | Each line consists of a token, optionally followed by a carriage return | |
98 | (code 13), followed by a linefeed (code 10). No other whitespace is | |
99 | permitted. The tokens allowed are precisely those announced in the | |
100 | .B TOKEN | |
101 | lines written to | |
102 | .BR rfreezefs 's | |
103 | standard output. Furthermore, only certain tokens are valid at | |
104 | particular points in the protocol. For reference, the token labels, and | |
105 | the meanings of the corresponding tokens, are as follows. | |
106 | .TP | |
107 | .B FREEZE | |
108 | Sent by a client to freeze the filesystems. This must be the first | |
109 | token transmitted by the client. On receipt, | |
110 | .B rfreezefs | |
111 | will close its listening socket and any other client connections. It | |
112 | will then freeze the filesystems. | |
113 | .TP | |
114 | .B FROZEN | |
115 | Sent by | |
116 | .B rfreezefs | |
117 | to indicate successful freezing of the filesystems. | |
118 | .TP | |
119 | .B KEEPALIVE | |
120 | Sent periodically by the client to prevent filesystems being thawed due | |
121 | to a timeout. No explicit acknowledgement is sent. | |
122 | .TP | |
123 | .B THAW | |
124 | Sent by the client to request thawing of the filesystems. | |
125 | .TP | |
126 | .B THAWED | |
127 | Sent by | |
128 | \fBrfreezefs\fR to indicate successful thawing of the filesystems in response to | |
129 | .BR THAW . | |
130 | .PP | |
131 | The high-level structure of the protocol is then as follows: the client | |
132 | sends | |
133 | .BR FREEZE ; | |
134 | the server freezes and responds with | |
135 | .BR FROZEN ; | |
136 | the client optionally sends | |
137 | .B KEEPALIVE | |
138 | at intervals; the client finally sends | |
139 | .BR THAW ; | |
140 | and the server responds with | |
141 | .B THAWED | |
142 | and drops the connection. | |
143 | .PP | |
144 | If sufficient time passes without | |
145 | .B rfreezefs | |
146 | receiving either | |
147 | .B THAW | |
148 | or | |
149 | .B KEEPALIVE | |
150 | tokens, or an invalid token is received, or it receives one of a number | |
151 | of signals, currently | |
152 | .BR SIGINT , | |
153 | .BR SIGQUIT , | |
154 | .BR SIGTERM , | |
155 | .BR SIGHUP , | |
156 | .BR SIGALRM , | |
157 | .BR SIGILL , | |
158 | .BR SIGSEGV , | |
159 | .BR SIGBUS , | |
160 | .BR SIGFPE , | |
161 | or | |
162 | .BR SIGABRT , | |
163 | .B rfreezefs | |
164 | will thaw the filesystems and report a failure. | |
165 | .PP | |
166 | Diagnostics are reported to standard error. Exit statuses have specific | |
167 | meanings: | |
168 | .TP | |
169 | .B 0 | |
170 | Successful completion. Filesystems were frozen and thawed as required. | |
171 | .TP | |
172 | .B 1 | |
173 | Problem with command-line arguments. No filesystems were frozen. | |
174 | .TP | |
175 | .B 2 | |
176 | Environmental problem, typically a system call failure: e.g., a file | |
177 | failed to open, or there was a problem with the network communications. | |
178 | Either no filesystems were frozen, or all filesystems were successfully | |
179 | thawed again. | |
180 | .TP | |
181 | .B 3 | |
182 | Timeout or invalid data. Either no connection carrying the \fBFREEZE\fR token | |
183 | was made in time, or no data was received for a long enough period | |
184 | after the filesystems were frozen, or an invalid token was received. In | |
185 | the first case, no filesystems were frozen; in the other two cases, the | |
186 | filesystems were successfully thawed. | |
187 | .TP | |
188 | .B 4 | |
189 | Crash. The | |
190 | .B rfreezefs | |
191 | program received a fatal signal after it had started to freeze | |
192 | filesystems. Under these circumstances, it thaws the filesystems, | |
193 | removes the signal handler, and sends itself the signal again, but if | |
194 | that doesn't work then | |
195 | .B rfreezefs | |
196 | exits with this status code. All frozen filesystems were successfully | |
197 | thawed again. | |
198 | .TP | |
199 | .B 112 | |
200 | Failure during filesystem thaw (mnemonic: European emergency number). | |
201 | Some filesystems | |
202 | .I failed | |
203 | to thaw, and are still frozen. You might have some joy with | |
204 | .BR SysRq-j , | |
205 | though in the author's experience that doesn't work and you'll probably | |
206 | have to reboot. At least your filesystems are consistent... | |
207 | .SS Background | |
208 | When frozen, a filesystem's backing block device is put in a consistent | |
209 | state (as if unmounted), and write operations to it are delayed until | |
210 | the filesystem is thawed again. In the meantime, it's possible to take | |
211 | a consistent snapshot of the block device. When a filesystem is | |
212 | directly mounted on an LVM logical volume, the kernel detects this | |
213 | situation and automatically freezes the filesystem while the snapshot is | |
214 | being prepared. If the logical volume and filesystem are on separate | |
215 | hosts, though, the filesystem must be frozen manually, which is why | |
216 | .B rfreezefs | |
217 | is useful. | |
218 | .PP | |
219 | The idea is to run | |
220 | .B rfreezefs | |
221 | using | |
222 | .BR ssh (1) | |
223 | or | |
224 | .BR userv (1), | |
225 | or some other means of acquiring the necessary privilege level. You | |
226 | read the port number and tokens, connect to the socket, and send the | |
227 | .B FREEZE | |
228 | token followed by a newline. You now wait to receive the | |
229 | .B FROZEN | |
230 | token from | |
231 | .BR rfreezefs . | |
232 | Once you have received this, the filesystems are frozen: you can safely | |
233 | take snapshots. If this will take an extended amount of time, you | |
234 | should send | |
235 | .B KEEPALIVE | |
236 | tokens to the connection at intervals in order to prevent | |
237 | .B rfreezefs | |
238 | from timing out and thawing the filesystems (but see the | |
239 | .B "Security notes" | |
240 | below). When your snapshot is prepared, send the | |
241 | .B THAW | |
242 | token, and wait for the | |
243 | .B THAWED | |
244 | token in response. If this is received, the snapshot was completed | |
245 | successfully and the filesystems are properly thawed again. If you | |
246 | don't receive the | |
247 | .B THAWED | |
248 | token then something bad might have happened (e.g., the filesystem might | |
249 | have been prematurely thawed) and the snapshot is suspect. If the exit | |
250 | status is 112 then at least one filesystem is still frozen and some | |
251 | emergency action is needed. If you can't retrieve the exit status then | |
252 | it's possible that your transport is blocked trying to write to the | |
253 | frozen filesystem (this is especially likely if | |
254 | .B / | |
255 | or | |
256 | .B /var | |
257 | is frozen) and you should react as if the status was 112. | |
258 | .SS Security notes | |
259 | The | |
260 | .B rfreezefs | |
261 | program uses randomly chosen tokens to form a simple code which is | |
262 | revealed to the caller. It is assumed that this information is kept | |
263 | secret from adversaries, e.g., by ensuring that it is only transmitted | |
264 | over local pipes (as used by | |
265 | .BR userv (1)) | |
266 | and/or secure network transports such as SSH (see | |
267 | .BR ssh (1)). | |
268 | The author believes that the worst possible outcome is that the host | |
269 | wedges up because an important filesystem is frozen, and | |
270 | .B rfreezefs | |
271 | therefore strives to prevent that from happening. In particular, | |
272 | cryptographic transport implementations such as SSH may attempt to log | |
273 | messages to frozen filesystems or otherwise wedge themselves: | |
274 | .B rfreezefs | |
275 | deliberately uses only kernel-implemented transports for its | |
276 | communication needs once the filesystems are frozen. | |
277 | .PP | |
278 | Most of the tokens are used at most once in the protocol. In | |
279 | particular, the | |
280 | .B FROZEN | |
281 | token can't be sent by an adversary in advance of the filesystem being | |
282 | frozen, since (under the assumption that the tokens are kept secret) it | |
283 | is only revealed in the clear after a successful freeze. Similarly, the | |
284 | .B THAWED | |
285 | token is only transmitted if the filesystems are thawed as a result of a | |
286 | .B THAW | |
287 | request (rather than a dropped connection, timeout, or some other | |
288 | problem). If the client only sends the | |
289 | .B THAW | |
290 | request once its snapshot is complete, then a | |
291 | .B THAWED | |
292 | response indicates that the filesystems remained frozen until the | |
293 | snapshot was indeed completed and therefore the snapshot is consistent. | |
294 | .PP | |
295 | The exception is the | |
296 | .B KEEPALIVE | |
297 | token, which may be sent repeatedly. After it is first revealed, an | |
298 | adversary can hijack the connection and replay the | |
299 | .B KEEPALIVE | |
300 | token to keep the filesystems frozen indefinitely. You can recover from | |
301 | this by severing the connection somehow, or by sending | |
302 | .B rfreezefs | |
303 | a signal. It is therefore recommended that | |
304 | .B KEEPALIVE | |
305 | tokens not be sent unless necessary. The timeout is currently set to | |
306 | 60s, which ought to be adequate for most snapshot mechanisms. | |
307 | .SH BUGS | |
308 | There ought to be a better one-time-token protocol for keepalives. I | |
309 | want to keep cryptography out of this program, though. | |
310 | .SH SEE ALSO | |
311 | .BR fsfreeze (8), | |
312 | .BR random (4), | |
313 | .BR lvm (8), | |
314 | .BR ssh (1), | |
315 | .BR userv (1). | |
316 | .SH AUTHOR | |
317 | Mark Wooding, <mdw@distorted.org.uk> |
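
The client side of the protocol described in the manual page above can be sketched roughly as follows. This is only an illustration, not part of rfreezefs: the function names `parse_announcement` and `run_protocol`, and the `take_snapshot` callback, are hypothetical, and the sketch assumes the announcement lines have already been read from the standard output of an rfreezefs process (started over ssh(1), userv(1), or similar). It sends no `KEEPALIVE` tokens, in line with the advice in the security notes.

```python
import socket

# Labels this sketch needs; taken from the token labels in the manual page.
REQUIRED = ("FREEZE", "FROZEN", "THAW", "THAWED")


def parse_announcement(lines):
    """Parse the PORT / TOKEN / READY lines printed by rfreezefs.

    Returns (port, tokens), where tokens maps each label to its value.
    """
    port = None
    tokens = {}
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue
        if fields[0] == "PORT" and len(fields) == 2:
            port = int(fields[1])
        elif fields[0] == "TOKEN" and len(fields) == 3:
            tokens[fields[1]] = fields[2]
        elif fields[0] == "READY":
            break  # READY is always the last line
        else:
            raise ValueError("unexpected line: %r" % raw)
    else:
        raise ValueError("no READY line seen")
    missing = [label for label in REQUIRED if label not in tokens]
    if port is None or missing:
        raise ValueError("incomplete announcement")
    return port, tokens


def run_protocol(sock, tokens, take_snapshot):
    """Drive FREEZE -> FROZEN -> (snapshot) -> THAW -> THAWED on `sock`.

    `sock` is an already connected TCP socket; `take_snapshot` is a
    caller-supplied callable (hypothetical) that should finish well within
    the server's timeout, since no KEEPALIVE tokens are sent here.
    """
    with sock.makefile("rb") as reader:

        def send(label):
            sock.sendall(tokens[label].encode("ascii") + b"\n")

        def expect(label):
            line = reader.readline()
            if not line.endswith(b"\n"):
                raise ConnectionError("connection closed before " + label)
            # Lines end with LF, optionally preceded by CR.
            if line.rstrip(b"\r\n").decode("ascii") != tokens[label]:
                raise RuntimeError("expected " + label + " token")

        send("FREEZE")
        expect("FROZEN")   # filesystems are frozen from here on
        take_snapshot()
        send("THAW")
        expect("THAWED")   # only now can the snapshot be trusted
```

A caller would typically pass `socket.create_connection((host, port))` as `sock`. If `run_protocol` raises, the snapshot should be treated as suspect, and the exit status of rfreezefs should still be checked (in particular for 112).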