   1 PRELOAD-HACKS
   2 ~~~~~~~~~~~~~
   3
   4 What is it?
   5
   6         The preload-hacks distribution contains a couple of LD_PRELOAD-
   7         able libraries which I find very useful.  Well, one useful one,
   8         and one which is really handy in theory.
   9
  10         uopen   Traps when a process is trying to open(2) a Unix-domain
  11                 socket, and does the appropriate socket(2)/connect(2)
  12                 dance instead.  I've no idea why it doesn't work like
  13                 this in the first place.
  14
  15         noip    Traps when a process is trying to make an Internet
  16                 socket, and makes a Unix-domain socket instead.
  17
  18         The first one, uopen, is the one which is useful in theory but
  19         which I've not really made much use of in practice.
  20
  21
  22 uopen
  23
  24         The main use-case is variable signatures.  Many mail and news
  25         clients nowadays have built-in sigmonsters, which choose a
  26         .signature at random from a collection.  Some don't, of course,
  27         which is a shame.  It would be nice if the sigmonster was
  28         detachable, so you could just write a sigmonster and attach it
  29         to your favourite newsreader.  It would be extra nice if
  30         newsreaders (and mail clients) didn't have to use some kind of
  31         weirdo sigmonster interface just to do this stupid thing with
  32         .signatures.
  33
  34         All mail and news clients know how to read a .signature file.
  35         It's why it's got that name.  So the right answer seems to be to
  36         make this file magically have different contents each time it's
  37         read.  Noticing when someone tries to read a regular file is
  38         just awful so let's not think about that idea any more.  We
  39         could make .signature be a named pipe; but named pipe servers
  40         are very difficult to get right when there are multiple
  41         simultaneous clients.  Sockets are, of course, the right answer
  42         when client/server architectures come up.  And we've got a
  43         convenient way of stashing sockets in the filesystem: PF_UNIX
  44         sockets.
  45
  46         So, we write our sigmonster:
  47
  48         $ fwd -d from unix:$HOME/.signature to exec.fortune
  49
  50         And now we check to see whether it works.
  51
  52         $ cat ~/.signature
  53         cat: /home/mdw/.signature: No such device or address
  54
  55         Hmm.  That blows.  Surely it's obvious how to read from a
  56         socket.  But, no, the kernel won't do the socket/connect thing
  57         for us.
  58
  59         Enter uopen.
  60
  61         $ uopen cat ~/.signature
  62         Noise proves nothing.  Often a hen who has merely laid an egg
  63         cackles as if she laid an asteroid.
  64                 -- Mark Twain
  65
  66         Joy!
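
        What uopen does on the program's behalf can be sketched in a few
        lines.  This is only an illustration in Python, not uopen's
        actual code (which is an LD_PRELOAD library).  The names
        read_path, serve_once and demo are made up for the example, and
        the errno check assumes Linux, where open(2) on a socket fails
        with ENXIO.

```python
import errno
import os
import socket
import tempfile
import threading


def serve_once(listener, payload):
    """Accept one client, send payload, hang up: a stand-in sigmonster."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(payload)


def read_path(path):
    """Read path like a file, falling back to socket()/connect()."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError as e:
        # Linux reports ENXIO for open(2) on a socket; other systems
        # may report something else (EOPNOTSUPP is accepted here too).
        if e.errno not in (errno.ENXIO, errno.EOPNOTSUPP):
            raise
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(path)
        chunks = []
        while True:
            chunk = s.recv(4096)
            if not chunk:          # server closed: end of "file"
                break
            chunks.append(chunk)
        return b"".join(chunks)


def demo():
    """Bind a socket in a scratch directory and read it back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "signature")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
            listener.bind(path)
            listener.listen(1)
            t = threading.Thread(target=serve_once,
                                 args=(listener, b"-- [mdw]\n"))
            t.start()
            data = read_path(path)
            t.join()
    return data
```

        The difference with uopen is that the fallback happens inside
        the intercepted open call, so the program being run needs no
        changes at all.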
  67
  68         This isn't perfect.  The file is weird and not a proper file.
  69         Emacs will refuse to visit it as a result.  But it /will/
  70         happily insert the contents of the file into existing buffers.
  71         Hopefully other editors are similar.  `less' wants the -f option
  72         before it will bother.  But actually it works pretty well.
  73
  74         The right place for the functionality of uopen is in the kernel.
  75         It shouldn't be difficult.  I even submitted a patch to the
  76         Linux kernel list to do precisely that, once, back in the days
  77         of 2.0.x.  It was ignored, and I gave up; the patch bitrotted
  78         hopelessly.  My LD_PRELOAD hack still works.  There's no
  79         configuration.  It just works.
  80
  81         My .signature has been `-- [mdw]' for years now, and that's
  82         unlikely to change.  So I don't actually use uopen very much.
  83         But it's cool to know that it exists.
  84
  85
  86 noip
  87
  88   The basic idea
  89
  90         This I use every day.  All the time.  Here's the use case.
  91         We'll see some more examples later.
  92
  93         Some random program has a client/server split between the main
  94         guts of the thing and its user interface, and the two
  95         communicate over TCP sockets.  There are lots of examples: SLIME
  96         (the Superior Lisp Interaction Mode for Emacs) runs a Common
  97         Lisp system as a separate process.  The SAGE notebook runs a web
  98         server and you're meant to use a Javascript-supporting web
  99         browser to drive it.  All sorts of stuff.  Usually the
 100         programmer knew just enough to remember to bind the server's
 101         listening socket to 127.0.0.1, to stop everyone on the Internet
 102         from connecting, but often the security consciousness stops
 103         there.  If you're very lucky, there's some sort of password
 104         mechanism.
 105
 106         The problem, of course, occurs on a multi-user system.  Binding
 107         to localhost doesn't stop any other user of the same machine
 108         from connecting.  In the cases of SLIME and SAGE, this is a big
 109         problem: both provide a full programming environment
 110         (respectively Common Lisp and Python) which would let an
 111         attacker do anything he likes in your name.
 112
 113         Passwords are wretched as a security mechanism.  Besides, I
 114         shouldn't need a damned password to talk to one of my own
 115         processes from another one of my own processes!  The operating
 116         system should be able to ensure that processes owned by the same
 117         user can communicate securely.  There's a whole filesystem with
 118         access control and everything.
 119
 120         The right answer is to use Unix-domain sockets, which live in
 121         the filesystem and have proper access control applied to them.
 122         But programmers are lazy, and Unix-domain sockets don't exist on
 123         Windows (well, unless you install Cygwin, but I can see why
 124         that's an unpopular idea).
 125
 126         The noip LD_PRELOAD hack intercepts the socket(2) system call.
 127         If the process is asking for a PF_INET socket, then it hands out
 128         a PF_UNIX socket instead.  If the process tries to bind(2) its
 129         socket to 127.0.0.1:12345, say, then noip binds it to
 130         /tmp/noip-USER/127.0.0.1:12345 instead (having previously
 131         created the directory /tmp/noip-USER and made sure that nobody
 132         else can get to it).  If the process tries to connect(2)
 133         somewhere, noip fixes up the address.  The noip hack intercepts
 134         14 different system calls in order to prevent its systematic
 135         dishonesty from being discovered.
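
        The bind-side mapping described above can be sketched as
        follows.  This is a Python illustration rather than noip's own
        code.  The function names are hypothetical, and the ownership
        and permission checks are an assumption about what `made sure
        that nobody else can get to it' might involve.

```python
import os
import socket
import stat


def socket_dir(base, user):
    """Create (or check) the private directory BASE/noip-USER."""
    path = os.path.join(base, "noip-" + user)
    os.makedirs(path, mode=0o700, exist_ok=True)
    st = os.lstat(path)
    # Refuse a directory that is not ours or that others can enter.
    if (not stat.S_ISDIR(st.st_mode) or st.st_uid != os.getuid()
            or st.st_mode & 0o077):
        raise PermissionError("unsafe socket directory: " + path)
    return path


def unix_path(sockdir, host, port):
    """Map an Internet address to a socket name inside sockdir."""
    return os.path.join(sockdir, "%s:%d" % (host, port))


def bind_fake(sockdir, host, port):
    """Hand back a Unix-domain socket bound where HOST:PORT would be."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.bind(unix_path(sockdir, host, port))
    return s
```

        With base set to /tmp and user to USER, a bind to
        127.0.0.1:12345 lands on /tmp/noip-USER/127.0.0.1:12345, and
        the filesystem permissions on the directory do the access
        control.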
 136
 137   Configuration
 138
 139         Running a program under noip effectively only allows it to talk
 140         to other programs running under noip.  This is sort of the idea,
 141         but it's rather restrictive in practice.  I can happily run
 142
 143         $ noip emacs
 144
 145         and start up SLIME, and Emacs and SLIME will communicate
 146         securely over a Unix-domain socket without either of them
 147         noticing.  But now Emacs can't talk to anything other than
 148         SLIME, which makes w3m-el less useful than it used to be, and,
 149         worse, my Common Lisp programs can't talk to anything external
 150         either, which may make writing network-aware Lisp programs
 151         annoying.
 152
 153         It gets worse with SAGE.  I can run
 154
 155         $ noip sage -notebook
 156
 157         and in another window
 158
 159         $ noip iceweasel http://localhost:8000/
 160
 161         (or Firefox, on Ubuntu), and the two will communicate happily.
 162         But now my Iceweasel is crippled and can't actually talk to the
 163         rest of the Internet.  The point of the exercise was to make my
 164         SAGE process secure, not to make me run two copies of Iceweasel
 165         and have to cope with the inevitable profile fork.
 166
 167         So noip can be configured.  It still defaults to safety:
 168         whenever the process asks for a new Internet socket, noip hands
 169         it a fake plastic Unix-domain socket instead.  But when the
 170         process tries to bind or connect its socket, noip will look the
 171         address up in a list to decide what to do.  If the result comes
 172         back `allow', then noip will do a three-card Monte, rustling up
 173         a real PF_INET socket and replacing the plastic imitation; if
 174         the result comes back `deny' then noip will continue with its
 175         elaborate deception.
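
        One plausible way to do the swap is with dup2(2), so that the
        descriptor number the program already holds starts referring to
        a real Internet socket.  The text above doesn't say how noip
        actually does it, so treat this Python sketch as an assumption;
        swap_for_real and family_of are names invented for the example.

```python
import os
import socket


def family_of(fd):
    """Report the address family of the socket behind descriptor fd."""
    probe = socket.socket(fileno=os.dup(fd))   # family is auto-detected
    try:
        return probe.family
    finally:
        probe.close()


def swap_for_real(fd, sock_type=socket.SOCK_STREAM):
    """Replace the socket behind fd with a fresh PF_INET socket."""
    real = socket.socket(socket.AF_INET, sock_type)
    try:
        # After dup2, fd names the Internet socket; the old Unix-domain
        # socket is closed as a side effect.
        os.dup2(real.fileno(), fd)
    finally:
        real.close()                           # fd keeps the socket alive
```

        Note that in this sketch any options the program set on the
        fake socket are lost in the exchange.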
 176
 177         The configuration file lives in $HOME/.noip.  Mine says
 178         something like this.
 179
 180         ## standard configuration
 181
 182         ## debug
 183         realconnect +172.29.199.2:25
 184         realconnect +172.29.199.2:53
 185         realconnect +172.29.199.2:80
 186         realconnect +172.29.199.2:3128
 187         realconnect +127.0.0.1:6010-6020
 188         realconnect -127.0.0.0/8
 189         realconnect -local
 190
 191         (172.29.199.2 is the IP address of the machine I took this
 192         from.)  What this says is as follows.
 193
 194           * Don't produce debugging output, but let me turn it on easily
 195             if I feel the urge.
 196
 197           * Allow direct connection to my SMTP server, on port 25.  (The
 198             `+' means `allow'.)
 199
 200           * Allow conversations with my local DNS server.  (The noip
 201             hack is not particularly discriminating.  It replaces UDP
 202             sockets with Unix-domain datagram sockets, just as it
 203             replaces TCP sockets with Unix-domain stream sockets.)
 204
 205           * Allow conversations with my local web server.
 206
 207           * Allow conversations with my local squid proxy.
 208
 209           * Allow conversations with SSH-forwarded X displays.
 210
 211           * Don't allow any other communication with anything else on
 212             the loopback network 127.0.0.0/8.  (I've still no idea why
 213             each machine needs 16 million IP addresses for talking to
 214             itself.  The `-' means `deny'.)
 215
 216           * Don't allow any other communication with any of my other
 217             local IP addresses.  (noip will work out which IP addresses
 218             are local from your network interface configuration.)
 219
 220           * And finally, implicitly, allow anything else.
 221
 222         The rules follow the squid convention: the default is to do
 223         whatever the last rule didn't do, so if the last rule says
 224         `deny' then the default is `allow', and vice versa.
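
        The rule evaluation can be sketched like this.  It's a Python
        illustration, not noip's parser: it handles only the rule
        shapes shown in the example above (IPv4 only), and it assumes
        that the first matching rule wins, which is what the ordering
        of the example suggests.  The local_addrs argument stands in
        for the addresses noip would read from the interface
        configuration, and denying when there are no rules at all is
        also an assumption.

```python
import ipaddress

ANY_PORT = (0, 65535)


def parse_rule(text):
    """Parse `+ADDR[/LEN][:PORT[-PORT]]' or `+local' (or `-...')."""
    allow = text.startswith("+")
    if not allow and not text.startswith("-"):
        raise ValueError("rule must start with + or -: %r" % text)
    body = text[1:]
    if body == "local":
        return allow, "local", ANY_PORT
    addr, _, ports = body.partition(":")
    if ports:
        lo, _, hi = ports.partition("-")
        port_range = (int(lo), int(hi or lo))
    else:
        port_range = ANY_PORT
    return allow, ipaddress.ip_network(addr), port_range


def decide(rules, host, port, local_addrs=()):
    """Return True for `allow', False for `deny'."""
    ip = ipaddress.ip_address(host)
    for allow, net, (lo, hi) in rules:
        hit = (ip in local_addrs) if net == "local" else (ip in net)
        if hit and lo <= port <= hi:
            return allow
    # Squid convention: the default is the opposite of the last rule.
    return (not rules[-1][0]) if rules else False
```

        Fed the configuration above, this allows 172.29.199.2:25 and
        127.0.0.1:6010, denies 127.0.0.1:8000, and allows any address
        which no rule mentions.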
 225
 226         Armed with this configuration, I now routinely run both Emacs
 227         and Iceweasel exclusively under the control of noip.  And I've
 228         done this for several years.
 229
 230   SSH tricks
 231
 232         SSH is made of win.  Its X forwarding is lovely.  Its port
 233         forwarding divine.  Almost.
 234
 235         Here's a common scenario.  I'm running on a multi-user server,
 236         shared with several other people whom I don't necessarily trust.
 237         I want to check some files out from my office's version-control
 238         system.  Traditional answer:
 239
 240         $ ssh -L 12345:vcs.work.com:345 mdw@gateway.work.com
 241
 242         Now I can run
 243
 244         $ vcs -d localhost:12345 checkout ...
 245
 246         and all works well.  Of course, anyone else on the server can do
 247         the same thing, so I've just leaked my company's secret sauce.
 248         (I don't believe in secret sauce, but I ought to show willing.)
 249
 250         How do I fix this?  Easy!
 251
 252         $ noip ssh -L 12345:vcs.work.com:345 mdw@gateway.work.com
 253
 254         $ noip vcs -d localhost:12345 checkout ...
 255
 256         And it all works.  In this case, in particular, it's essential
 257         that the /same/ SSH process binds a safe, plastic local end to
 258         its forwarded VCS port, and is able to make a real, potentially
 259         dangerous Internet connection to gateway.work.com.  Of course,
 260         since I run Emacs under noip anyway, all the version control
 261         stuff that Emacs does magically finds the SSH tunnel and works
 262         without me having to care.
 263
 264   Testing
 265
 266         noip provides a handy way of testing network servers and so on
 267         safely.  For a start, you can run your test server apparently on
 268         the same port as the real one.  Because noip consults the
 269         environment variable NOIP_SOCKETDIR to find out where to put its
 270         sockets, you can run two at a time and they don't interfere.
 271         And noip doesn't care what port numbers your program tries to
 272         bind, so you don't need to jump through stupid hoops in order to
 273         test programs which use `privileged' ports.
 274
 275   Other applications
 276
 277         There are certainly loads of handy things you can do with noip.
 278         If you think of one, let me know!
 279
 280                                                                 Mark Wooding
 281                                                         mdw@distorted.org.uk
 282
 283 \f
 284 Local variables:
 285 mode: text
 286 fill-column: 72
 287 End: