Commit | Line | Data |
---|---|---|
f6b4ffdc MW |
1 | .TH rfreezefs 8 "October 2011" "rsync-backup" |
2 | .SH NAME | |
3 | rfreezefs \- freeze a filesystem safely | |
4 | .SH SYNOPSIS | |
5 | .B rfreezefs | |
6 | .RB [ \-n ] | |
7 | .RB [ \-a | |
8 | .IR address ] | |
9 | .RB [ \-p | |
10 | .IR loport [\fB\- hiport ]] | |
11 | .I filesystem | |
12 | \&... | |
13 | .SH DESCRIPTION | |
14 | The | |
15 | .B rfreezefs | |
16 | program freezes one or more mounted filesystems for a period of time, | |
17 | and then thaws them. For more detail on what this means, why you'd want | |
18 | to, and how you might go about using | |
19 | .B rfreezefs | |
20 | to do it, see below. | |
21 | .PP | |
22 | The following command-line options are recognized. | |
23 | .TP | |
24 | .B "\-h, \-\-help" | |
25 | Writes a help message to standard output, and exits with status 0. | |
26 | .TP | |
27 | .B "\-v, \-\-version" | |
28 | Writes the version number to standard output, and exits with status 0. | |
29 | .TP | |
30 | .B "\-u, \-\-usage" | |
31 | Writes a command-line usage synopsis to standard output, and exits with | |
32 | status 0. | |
33 | .TP | |
34 | .BI "\-a, \-\-address=" address | |
35 | Listen only for incoming connections to the given | |
36 | .IR address . | |
37 | The default is to listen for connections to any local address. | |
38 | .TP | |
39 | .B "\-n, \-\-not-really" | |
40 | Don't actually freeze or thaw any filesystems; instead, write messages | |
41 | to standard error explaining what would be done. | |
42 | .TP | |
43 | .BI "\-p, \-\-port-range=" loport\fR[ \- hiport \fR] | |
44 | Listen for incoming connections on a port between | |
45 | .I loport | |
46 | and | |
47 | .IR hiport . | |
48 | If | |
49 | .I hiport | |
50 | is omitted, listen for connections only on | |
51 | .IR loport . | |
52 | The default is to allow the kernel a free choice of local port number. | |
53 | .PP | |
54 | The | |
55 | .I filesystem | |
56 | arguments name the filesystems to be frozen. There must be at least one | |
57 | such argument. It's conventional to name the filesystem mount points, | |
58 | though actually any file or directory in the filesystem will do. The | |
59 | files are opened read-only. | |
60 | .PP | |
61 | The | |
62 | .B rfreezefs | |
63 | program starts, parses its command line, opens the named files, and | |
64 | creates a listening TCP socket according to the command-line options. | |
65 | It then prints a sequence of lines to standard output, which may have | |
66 | one of the following forms. | |
67 | .TP | |
68 | .BI "PORT " port | |
69 | Announces the TCP | |
70 | .I port | |
71 | number on which | |
72 | .B rfreezefs | |
73 | is listening for incoming connections. | |
74 | .TP | |
75 | .BI "TOKEN " label " " token | |
76 | Declares a `token': a randomly chosen string which is to be used in the | |
77 | network connection. The token's value is | |
78 | .IR token : | |
79 | token values are a sequence of non-whitespace printable ASCII | |
80 | characters, but their precise structure is not specified. The token | |
81 | value will have the meaning given by the | |
82 | .IR label , | |
83 | which is one of the token labels described below. | |
84 | .TP | |
85 | .B READY | |
86 | Marks the end of the lines and announces that | |
87 | .B rfreezefs | |
88 | is ready to accept connections. | |
89 | .PP | |
90 | These lines may be sent in any order, except that | |
91 | .B READY | |
92 | is always last. There may be many | |
93 | .B TOKEN | |
94 | lines. | |
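The announcement lines described above could be read by a caller along the following lines. This is a minimal sketch in Python, not part of rfreezefs itself: the function name parse_announcement is hypothetical, and treating a missing PORT line or any unrecognized line as an error is an assumption made for the example.

```python
def parse_announcement(lines):
    """Parse PORT/TOKEN/READY lines; return (port, tokens) once READY is seen.

    `lines` is any iterable of text lines, e.g. the standard output of
    rfreezefs as relayed by the transport that started it.
    """
    port = None
    tokens = {}  # maps token label (e.g. "FREEZE") to its token value
    for raw in lines:
        fields = raw.split()
        if fields == ["READY"]:
            # READY is always the last line, so stop reading here.
            if port is None:
                raise ValueError("READY seen without a PORT line")
            return port, tokens
        elif len(fields) == 2 and fields[0] == "PORT":
            port = int(fields[1])
        elif len(fields) == 3 and fields[0] == "TOKEN":
            tokens[fields[1]] = fields[2]
        else:
            raise ValueError("unrecognized line: %r" % raw)
    raise ValueError("output ended before READY")
```

Because PORT and TOKEN lines may arrive in any order, the sketch simply accumulates them and only returns when READY appears.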
95 | .PP | |
96 | Network communications use a simple plain-text line-oriented protocol. | |
97 | Each line consists of a token, optionally followed by a carriage return | |
98 | (code 13), followed by a linefeed (code 10). No other whitespace is | |
99 | permitted. The tokens allowed are precisely those announced in the | |
100 | .B TOKEN | |
101 | lines written to | |
102 | .BR rfreezefs 's | |
103 | standard output. Furthermore, only certain tokens are valid at | |
104 | particular points in the protocol. For reference, the token labels, and | |
105 | the meanings of the corresponding tokens, are as follows. | |
106 | .TP | |
107 | .B FREEZE | |
108 | Sent by a client to freeze the filesystems. This must be the first | |
109 | token transmitted by the client. On receipt, | |
110 | .B rfreezefs | |
111 | will close its listening socket and any other client connections. It | |
112 | will then freeze the filesystems. | |
113 | .TP | |
114 | .B FROZEN | |
115 | Sent by | |
116 | .B rfreezefs | |
117 | to indicate successful freezing of the filesystems. | |
118 | .TP | |
119 | .B KEEPALIVE | |
120 | Sent periodically by the client to prevent filesystems being thawed due | |
121 | to a timeout. No explicit acknowledgement is sent. | |
122 | .TP | |
123 | .B THAW | |
124 | Sent by the client to request thawing of the filesystems. | |
125 | .TP | |
126 | .B THAWED | |
127 | Sent by | |
939429c0 MW |
128 | .B rfreezefs |
129 | to indicate successful thawing of the filesystems in response to | |
f6b4ffdc MW |
130 | .BR THAW . |
131 | .PP | |
132 | The high-level structure of the protocol is then as follows: the client | |
133 | sends | |
134 | .BR FREEZE ; | |
135 | the server freezes and responds with | |
136 | .BR FROZEN ; | |
137 | the client optionally sends | |
138 | .B KEEPALIVE | |
139 | at intervals; the client finally sends | |
140 | .BR THAW ; | |
141 | and the server responds with | |
142 | .B THAWED | |
143 | and drops the connection. | |
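The exchange just described can be sketched as a small client. This is an illustrative Python sketch under stated assumptions: the names read_token and frozen_session are hypothetical, the socket is assumed to be already connected to the announced port, tokens is assumed to map token labels to token values, and take_snapshot stands for whatever work must happen while the filesystems are frozen. No KEEPALIVE tokens are sent here.

```python
import socket


def read_token(stream):
    """Read one protocol line; return the token text, or None at end of file."""
    line = stream.readline()
    if not line.endswith(b"\n"):
        return None  # connection closed before a complete line arrived
    line = line[:-1]
    if line.endswith(b"\r"):
        line = line[:-1]  # the carriage return before the linefeed is optional
    return line.decode("ascii")


def frozen_session(sock, tokens, take_snapshot):
    """Send FREEZE, await FROZEN, run take_snapshot, send THAW, await THAWED.

    Returns True only if the THAWED token was received.
    """
    with sock, sock.makefile("rb") as stream:
        sock.sendall(tokens["FREEZE"].encode("ascii") + b"\n")
        if read_token(stream) != tokens["FROZEN"]:
            return False  # never frozen: do not take the snapshot
        take_snapshot()
        sock.sendall(tokens["THAW"].encode("ascii") + b"\n")
        return read_token(stream) == tokens["THAWED"]
```

The socket is closed on every exit path, including an exception from take_snapshot, so the connection is not left open longer than the sketch needs it.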
144 | .PP | |
145 | If sufficient time passes without | |
146 | .B rfreezefs | |
147 | receiving either | |
148 | .B THAW | |
149 | or | |
150 | .B KEEPALIVE | |
151 | tokens, or an invalid token is received, or it receives one of a number | |
939429c0 | 152 | of signals \(en currently |
f6b4ffdc MW |
153 | .BR SIGINT , |
154 | .BR SIGQUIT , | |
155 | .BR SIGTERM , | |
156 | .BR SIGHUP , | |
157 | .BR SIGALRM , | |
158 | .BR SIGILL , | |
159 | .BR SIGSEGV , | |
160 | .BR SIGBUS , | |
161 | .BR SIGFPE , | |
162 | or | |
939429c0 MW |
163 | .B SIGABRT |
164 | \(en then | |
f6b4ffdc MW |
165 | .B rfreezefs |
166 | will thaw the filesystems and report a failure. | |
167 | .PP | |
168 | Diagnostics are reported to standard error. Exit statuses have specific | |
169 | meanings: | |
170 | .TP | |
171 | .B 0 | |
172 | Successful completion. Filesystems were frozen and thawed as required. | |
173 | .TP | |
174 | .B 1 | |
175 | Problem with command-line arguments. No filesystems were frozen. | |
176 | .TP | |
177 | .B 2 | |
178 | Environmental problem, typically a system call failure: e.g., a file | |
179 | failed to open, or there was a problem with the network communications. | |
180 | Either no filesystems were frozen, or all filesystems were successfully | |
181 | thawed again. | |
182 | .TP | |
183 | .B 3 | |
dfc3e9b1 MW |
184 | Timeout or invalid data. Either no connections containing the |
185 | .B FREEZE | |
186 | token were made in time, or no data was received for a long enough | |
187 | period after the filesystems were frozen, or an invalid token was | |
188 | received. In the first case, no filesystems were frozen; in the other | |
189 | two cases, the filesystems were successfully thawed. | |
f6b4ffdc MW |
190 | .TP |
191 | .B 4 | |
192 | Crash. The | |
193 | .B rfreezefs | |
194 | program received a fatal signal after it had started to freeze | |
195 | filesystems. Under these circumstances, it thaws the filesystems, | |
196 | removes the signal handler, and sends itself the signal again, but if | |
197 | that doesn't work then | |
198 | .B rfreezefs | |
199 | exits with this status code. All frozen filesystems were successfully | |
200 | thawed again. | |
201 | .TP | |
202 | .B 112 | |
203 | Failure during filesystem thaw (mnemonic: European emergency number). | |
204 | Some filesystems | |
205 | .I failed | |
206 | to thaw, and are still frozen. You might have some joy with | |
207 | .BR SysRq-j , | |
208 | though in the author's experience that doesn't work and you'll probably | |
209 | have to reboot. At least your filesystems are consistent... | |
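A caller might turn the exit statuses listed above into a decision as in the following Python sketch. The function name snapshot_verdict and the verdict strings are hypothetical. Treating an unavailable status (None) like 112 follows the advice given under Background; reporting any other value as "unknown" is an assumption of this example.

```python
def snapshot_verdict(status):
    """Map an rfreezefs exit status to a coarse verdict string.

    `status` is the numeric exit status, or None if it could not be
    retrieved (for instance because the transport itself is blocked).
    """
    if status is None or status == 112:
        # Some filesystem may still be frozen: emergency action is needed.
        return "emergency"
    if status == 0:
        return "ok"
    if status in (1, 2, 3, 4):
        # Something went wrong, but nothing is left frozen.
        return "failed-but-thawed"
    # Not a status documented above (e.g. one produced by the transport).
    return "unknown"
```

A verdict of "ok" only describes how rfreezefs finished; the snapshot should still be trusted only if the THAWED token was received.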
210 | .SS Background | |
211 | When frozen, a filesystem's backing block device is put in a consistent | |
212 | state (as if unmounted), and write operations to it are delayed until | |
213 | the filesystem is thawed again. In the meantime, it's possible to take | |
214 | a consistent snapshot of the block device. When a filesystem is | |
215 | directly mounted on an LVM logical volume, the kernel detects this | |
216 | situation and automatically freezes the filesystem while the snapshot is | |
217 | being prepared. If the logical volume and filesystem are on separate | |
218 | hosts, though, the filesystem must be frozen manually, which is why | |
219 | .B rfreezefs | |
220 | is useful. | |
221 | .PP | |
222 | The idea is to run | |
223 | .B rfreezefs | |
224 | using | |
225 | .BR ssh (1) | |
226 | or | |
227 | .BR userv (1), | |
228 | or some other means of acquiring the necessary privilege level. You | |
229 | read the port number and tokens, connect to the socket, and send the | |
230 | .B FREEZE | |
231 | token followed by a newline. You now wait to receive the | |
232 | .B FROZEN | |
233 | token from | |
234 | .BR rfreezefs . | |
235 | Once you have received this, the filesystems are frozen: you can safely | |
236 | take snapshots. If this will take an extended amount of time, you | |
237 | should send | |
238 | .B KEEPALIVE | |
239 | tokens to the connection at intervals in order to prevent | |
240 | .B rfreezefs | |
241 | from timing out and thawing the filesystems (but see the | |
242 | .B "Security notes" | |
243 | below). When your snapshot is prepared, send the | |
244 | .B THAW | |
245 | token, and wait for the | |
246 | .B THAWED | |
247 | token in response. If this is received, the snapshot was completed | |
248 | successfully and the filesystems are properly thawed again. If you | |
249 | don't receive the | |
250 | .B THAWED | |
251 | token then something bad might have happened (e.g., the filesystem might | |
252 | have been prematurely thawed) and the snapshot is suspect. If the exit | |
253 | status is 112 then at least one filesystem is still frozen and some | |
254 | emergency action is needed. If you can't retrieve the exit status then | |
255 | it's possible that your transport is blocked for trying to write to the | |
256 | frozen filesystem (this is especially likely if | |
257 | .B / | |
258 | or | |
259 | .B /var | |
260 | is frozen) and you should react as if the status was 112. | |
261 | .SS Security notes | |
262 | The | |
263 | .B rfreezefs | |
264 | program uses randomly chosen tokens to form a simple code which is | |
265 | revealed to the caller. It is assumed that this information is kept | |
266 | secret from adversaries, e.g., by ensuring that it is only transmitted | |
267 | over local pipes (as used by | |
268 | .BR userv (1)) | |
269 | and/or secure network transports such as SSH (see | |
270 | .BR ssh (1)). | |
271 | The author believes that the worst possible outcome is that the host | |
272 | wedges up because an important filesystem is frozen, and | |
273 | .B rfreezefs | |
274 | therefore strives to prevent that from happening. In particular, | |
275 | cryptographic transport implementations such as SSH may attempt to log | |
276 | messages to frozen filesystems or otherwise wedge themselves: | |
277 | .B rfreezefs | |
278 | deliberately uses only kernel-implemented transports for its | |
279 | communication needs once the filesystems are frozen. | |
280 | .PP | |
281 | Most of the tokens are used at most once in the protocol. In | |
282 | particular, the | |
283 | .B FROZEN | |
284 | token can't be sent by an adversary in advance of the filesystem being | |
285 | frozen, since (under the assumption that the tokens are kept secret) it | |
286 | is only revealed in the clear after a successful freeze. Similarly, the | |
287 | .B THAWED | |
288 | token is only transmitted if the filesystems are thawed as a result of a | |
289 | .B THAW | |
290 | request (rather than a dropped connection, timeout, or some other | |
291 | problem). If the client only sends the | |
292 | .B THAW | |
293 | request once its snapshot is complete, then a | |
294 | .B THAWED | |
295 | response indicates that the filesystems remained frozen until the | |
296 | snapshot was indeed completed and therefore the snapshot is consistent. | |
297 | .PP | |
298 | The exception is the | |
299 | .B KEEPALIVE | |
300 | token, which may be sent repeatedly. After it is first revealed, an | |
301 | adversary can hijack the connection and replay the | |
302 | .B KEEPALIVE | |
303 | token to keep the filesystems frozen indefinitely. You can recover from | |
304 | this by severing the connection somehow, or by sending | |
305 | .B rfreezefs | |
306 | a signal. It is therefore recommended that | |
307 | .B KEEPALIVE | |
308 | tokens not be sent unless necessary. The timeout is currently set to | |
309 | 60s, which ought to be adequate for most snapshot mechanisms. | |
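One way to follow the recommendation above is to plan keepalives only when the snapshot is expected to run long. The Python sketch below is an illustration under stated assumptions: the function name keepalive_offsets and the safety parameter are hypothetical, the 60s default is the timeout quoted above and may change, and the timeout is assumed to restart whenever a KEEPALIVE token is received.

```python
def keepalive_offsets(expected_seconds, timeout=60.0, safety=0.5):
    """Return the times (seconds after FROZEN) at which to send KEEPALIVE.

    The list is empty when the snapshot is expected to finish within
    safety * timeout seconds, so that no keepalive is revealed needlessly.
    """
    interval = timeout * safety
    if expected_seconds <= interval:
        return []
    offsets = []
    k = 1
    # Schedule a keepalive every `interval` seconds until the snapshot ends.
    while k * interval < expected_seconds:
        offsets.append(k * interval)
        k += 1
    return offsets
```

For example, a snapshot expected to take 10 seconds gets an empty schedule, while one expected to take 100 seconds gets keepalives at 30, 60 and 90 seconds with the default settings.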
310 | .SH BUGS | |
311 | There ought to be a better one-time-token protocol for keepalives. I | |
312 | want to keep cryptography out of this program, though. | |
313 | .SH SEE ALSO | |
314 | .BR fsfreeze (8), | |
315 | .BR random (4), | |
316 | .BR lvm (8), | |
317 | .BR ssh (1), | |
318 | .BR userv (1). | |
319 | .SH AUTHOR | |
320 | Mark Wooding, <mdw@distorted.org.uk> |