PRELOAD-HACKS
~~~~~~~~~~~~~

What is it?

	The preload-hacks distribution contains a couple of LD_PRELOAD-
	able libraries which I find very useful.  Well, one useful one,
	and one which is really handy in theory.

	uopen	Traps when a process is trying to open(2) a Unix-domain
		socket, and does the appropriate socket(2)/connect(2)
		dance instead.  I've no idea why it doesn't work like
		this in the first place.

	noip	Traps when a process is trying to make an Internet
		socket, and makes a Unix-domain socket instead.

	The first one is the one which is useful in theory but I've not
	really made much use of in practice.


uopen

	The main use-case is variable signatures.  Many mail and news
	clients nowadays have built-in sigmonsters, which choose a
	.signature at random from a collection.  Some don't, of course,
	which is a shame.  It would be nice if the sigmonster was
	detachable, so you could just write a sigmonster and attach it
	to your favourite newsreader.  It would be extra nice if
	newsreaders (and mail clients) didn't have to use some kind of
	weirdo sigmonster interface just to do this stupid thing with
	.signatures.

	All mail and news clients know how to read a .signature file.
	It's why it's got that name.  So the right answer seems to be to
	make this file magically have different contents each time it's
	read.  Noticing when someone tries to read a regular file is
	just awful so let's not think about that idea any more.  We
	could make .signature be a named pipe; but named pipe servers
	are very difficult to get right when there are multiple
	simultaneous clients.  Sockets are, of course, the right answer
	when client/server architectures come up.  And we've got a
	convenient way of stashing sockets in the filesystem: PF_UNIX
	sockets.

	So, we write our sigmonster:

	$ fwd -d from unix:$HOME/.signature to exec.fortune

	And now we check to see whether it works.

	$ cat ~/.signature
	cat: /home/mdw/.signature: No such device or address

	Hmm.  That blows.  Surely it's obvious how to read from a
	socket.  But, no, the kernel won't do the socket/connect thing
	for us.

	Enter uopen.

	$ uopen cat ~/.signature
	Noise proves nothing.  Often a hen who has merely laid an egg 
	cackles as if she laid an asteroid.
                -- Mark Twain

	Joy!

	This isn't perfect.  The file is weird and not a proper file.
	Emacs will refuse to visit it as a result.  But it /will/
	happily insert the contents of the file into existing buffers.
	Hopefully other editors are similar.  `less' wants the -f option
	before it will bother.  But actually it works pretty well.
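
	What uopen has to do on the process's behalf can be sketched in
	C roughly as follows.  This is a minimal illustration and not
	the actual uopen source: the name open_or_connect is made up,
	and it relies on Linux reporting ENXIO when open(2) is pointed
	at a socket (other systems may use a different error).

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Open PATH for reading.  If open(2) fails with ENXIO, assume PATH
 * names a Unix-domain socket and connect to it instead.  Returns a
 * file descriptor, or -1 with errno set.
 */
int open_or_connect(const char *path)
{
	struct sockaddr_un sa;
	int fd, err;

	/* Ordinary files (and ordinary failures) are passed through. */
	fd = open(path, O_RDONLY);
	if (fd >= 0 || errno != ENXIO) return fd;

	/* The path must fit in a socket address. */
	if (strlen(path) >= sizeof(sa.sun_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}
	memset(&sa, 0, sizeof(sa));
	sa.sun_family = AF_UNIX;
	strcpy(sa.sun_path, path);

	/* The socket(2)/connect(2) dance. */
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0) return -1;
	if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		err = errno;
		close(fd);
		errno = err;
		return -1;
	}
	return fd;
}
```

	A preload library would wrap something like this around the
	real open(2), so the calling program just sees a descriptor it
	can read from.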

	The right place for the functionality of uopen is in the kernel.
	It shouldn't be difficult.  I even submitted a patch to the
	Linux kernel list to do precisely that, once, back in the days
	of 2.0.x.  It was ignored, and I gave up; the patch bitrotted
	hopelessly.  My LD_PRELOAD hack still works.  There's no
	configuration.  It just works.

	My .signature has been `-- [mdw]' for years now, and that's
	unlikely to change.  So I don't actually use uopen very much.
	But it's cool to know that it exists.


noip

  The basic idea

	This I use every day.  All the time.  Here's the use case.
	We'll see some more examples later.

	Some random program has a client/server split between the main
	guts of the thing and its user interface, and the two
	communicate over TCP sockets.  There are lots of examples: SLIME
	(the Superior Lisp Interaction Mode for Emacs) runs a Common
	Lisp system as a separate process.  The SAGE notebook runs a web
	server and you're meant to use a Javascript-supporting web
	browser to drive it.  All sorts of stuff.  Usually the
	programmer knew just enough to remember to bind the server's
	listening socket to 127.0.0.1, to stop everyone on the Internet
	from connecting, but often the security consciousness stops
	there.  If you're very lucky, there's some sort of password
	mechanism.

	The problem, of course, occurs on a multi-user system.  Binding
	to localhost doesn't stop any other user of the same machine
	from connecting.  In the cases of SLIME and SAGE, this is a big
	problem: both provide a full programming environment
	(respectively Common Lisp and Python) which would let an
	attacker do anything he likes in your name.

	Passwords are wretched as a security mechanism.  Besides, I
	shouldn't need a damned password to talk to one of my own
	processes from another one of my own processes!  The operating
	system should be able to ensure that processes owned by the same
	user can communicate securely.  There's a whole filesystem with
	access control and everything.

	The right answer is to use Unix-domain sockets, which live in
	the filesystem and have proper access control applied to them.
	But programmers are lazy, and Unix-domain sockets don't exist on
	Windows (well, unless you install Cygwin, but I can see why
	that's an unpopular idea).

	The noip LD_PRELOAD hack intercepts the socket(2) system call.
	If the process is asking for a PF_INET socket, then it hands out
	a PF_UNIX socket instead.  If the process tries to bind(2) its
	socket to 127.0.0.1:12345, say, then noip binds it to
	/tmp/noip-USER/127.0.0.1:12345 instead (having previously
	created the directory /tmp/noip-USER and made sure that nobody
	else can get to it).  If the process tries to connect(2)
	somewhere, noip fixes up the address.  The noip hack intercepts
	14 different system calls in order to prevent its systematic
	dishonesty from being discovered.
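
	The address fix-up described above can be sketched in C like
	this.  It's an illustration only, not noip's own code: the
	function name fake_address and its signature are invented for
	the purpose, and it handles IPv4 addresses only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Translate the Internet address SIN into a Unix-domain address SUN
 * naming a socket in the directory SOCKDIR, so that 127.0.0.1:12345
 * becomes SOCKDIR/127.0.0.1:12345.  Returns 0 on success, or -1 if
 * the address can't be formatted or the path would be too long.
 */
int fake_address(const char *sockdir, const struct sockaddr_in *sin,
		 struct sockaddr_un *sun)
{
	char addr[INET_ADDRSTRLEN];
	int n;

	if (!inet_ntop(AF_INET, &sin->sin_addr, addr, sizeof(addr)))
		return -1;
	memset(sun, 0, sizeof(*sun));
	sun->sun_family = AF_UNIX;
	n = snprintf(sun->sun_path, sizeof(sun->sun_path), "%s/%s:%u",
		     sockdir, addr, (unsigned)ntohs(sin->sin_port));
	if (n < 0 || (size_t)n >= sizeof(sun->sun_path)) return -1;
	return 0;
}
```

	An interposed bind(2) or connect(2) would call something like
	this and then hand the translated address to the real system
	call.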

  Configuration

	Running a program under noip effectively only allows it to talk
	to other programs running under noip.  This is sort of the idea,
	but it's rather restrictive in practice.  I can happily run

	$ noip emacs

	and start up SLIME, and Emacs and SLIME will communicate
	securely over a Unix-domain socket without either of them
	noticing.  But now Emacs can't talk to anything other than
	SLIME, which makes w3m-el less useful than it used to be, and,
	worse, my Common Lisp programs can't talk to anything external
	either, which may make writing network-aware Lisp programs
	annoying.

	It gets worse with SAGE.  I can run

	$ noip sage -notebook

	and in another window

	$ noip iceweasel http://localhost:8000/

	(or Firefox, on Ubuntu), and the two will communicate happily.
	But now my Iceweasel is crippled and can't actually talk to the
	rest of the Internet.  The point of the exercise was to make my
	SAGE process secure, not to make me run two copies of Iceweasel
	and have to cope with the inevitable profile fork.

	So noip can be configured.  It still defaults to safety:
	whenever the process asks for a new Internet socket, noip hands
	it a fake plastic Unix-domain socket instead.  But when the
	process tries to bind or connect its socket, noip will look the
	address up in a list to decide what to do.  If the result comes
	back `allow', then noip will do a three-card Monte, rustling up
	a real PF_INET socket and replacing the plastic imitation; if
	the result comes back `deny' then noip will continue with its
	elaborate deception.
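
	One way to swap a socket out from under a process without it
	noticing is dup2(2), which lets the new socket take over the old
	descriptor number.  The sketch below assumes that mechanism; it
	is not taken from noip's source, and swap_socket is a made-up
	name.  A real implementation would also have to carry over any
	socket options and descriptor flags already set on the old
	socket.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Replace the socket open on descriptor FD with a fresh socket of the
 * given FAMILY and TYPE, keeping the same descriptor number.  Returns
 * 0 on success, or -1 with errno set (FD is left alone on failure).
 */
int swap_socket(int fd, int family, int type)
{
	int nfd, err;

	/* Make the replacement.  It lands on some other descriptor,
	 * because FD is still in use.
	 */
	nfd = socket(family, type, 0);
	if (nfd < 0) return -1;

	/* Atomically close the old socket and put the new one in its
	 * place, then drop the spare descriptor.
	 */
	if (dup2(nfd, fd) < 0) {
		err = errno;
		close(nfd);
		errno = err;
		return -1;
	}
	close(nfd);
	return 0;
}
```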

	The configuration file lives in $HOME/.noip.  Mine says
	something like this.

	## standard configuration

	## debug
	realconnect +127.0.0.1:6010-6020, +[::1]:6010-6020
	realconnect +127.0.0.1:53, +[::1]:53
	realconnect +local:22
	realconnect -127.0.0.0/8, -[::1]
	realconnect -local

	What this says is as follows.

	  * Don't produce debugging output, but let me turn it on easily
	    if I feel the urge.

	  * Allow conversations with SSH-forwarded X displays, which
	    listen on the loopback interface.  Notice that the IPv6
	    address must be enclosed in square brackets because colons
	    are having to do double-duty here.

	  * Allow conversations with my local DNS server.  (I run
	    `unbound' on all of my servers, to do DNSSEC validation.
	    The noip hack is not particularly discriminating.  It
	    replaces UDP sockets with Unix-domain datagram sockets, just
	    as it replaces TCP sockets with Unix-domain stream sockets.)

	  * Allow conversations with my local SSH server.

	  * Don't allow any other communication with anything else on
	    the loopback network 127.0.0.0/8.  (I've still no idea why
	    each machine needs 16 million IPv4 addresses for talking to
	    itself.  The `-' means `deny'.)

	  * Don't allow any other communication with any of my other
	    local IP addresses.  (noip will work out which IP addresses
	    are local from your network interface configuration.)

	  * And finally, implicitly, allow anything else.

	The rules follow the squid convention: the default is to do
	whatever the last rule didn't do, so if the last rule says
	`deny' then the default is `allow', and vice versa.
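
	The lookup might be sketched in C like this.  The names struct
	rule and acl_allows are hypothetical, and the sketch handles
	IPv4 only.  It assumes that the first matching rule wins (which
	is what the ordering of the example above suggests) and that an
	empty list means `deny', in keeping with defaulting to safety.

```c
#include <stddef.h>
#include <stdint.h>

/* One access-control rule.  Addresses are in host byte order. */
struct rule {
	int allow;			/* 1 for `+', 0 for `-' */
	uint32_t addr, mask;		/* matches if (a & mask) == addr */
	uint16_t minport, maxport;	/* inclusive port range */
};

/* Decide whether a real connection to ADDR:PORT is allowed by the N
 * rules at R.  Returns 1 for `allow' and 0 for `deny'.
 */
int acl_allows(const struct rule *r, size_t n,
	       uint32_t addr, uint16_t port)
{
	size_t i;

	/* The first rule which matches decides the matter. */
	for (i = 0; i < n; i++) {
		if ((addr & r[i].mask) == r[i].addr &&
		    r[i].minport <= port && port <= r[i].maxport)
			return r[i].allow;
	}

	/* Nothing matched: do whatever the last rule didn't do.  With
	 * no rules at all, stay safe and deny (an assumption).
	 */
	return n ? !r[n - 1].allow : 0;
}
```

	With rules corresponding to the IPv4 lines of the example, a
	connection to 127.0.0.1:6015 is allowed, one to 127.0.0.1:8000
	is denied, and anything off the loopback network falls through
	to the implicit `allow'.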

	Armed with this configuration, I now routinely run both Emacs
	and Iceweasel exclusively under the control of noip.  And I've
	done this for several years.

  SSH tricks

	SSH is made of win.  Its X forwarding is lovely.  Its port
	forwarding divine.  Almost.

	Here's a common scenario.  I'm running on a multi-user server,
	shared with several other people whom I don't necessarily trust.
	I want to check some files out from my office's version-control
	system.  Traditional answer:

	$ ssh -L 12345:vcs.work.com:345 mdw@gateway.work.com

	Now I can run

	$ vcs -d localhost:12345 checkout ...

	and all works well.  Of course, anyone else on the server can do
	the same thing, so I've just leaked my company's secret sauce.
	(I don't believe in secret sauce, but I ought to show willing.)

	How do I fix this?  Easy!

	$ noip ssh -L 12345:vcs.work.com:345 mdw@gateway.work.com

	$ noip vcs -d localhost:12345 checkout ...

	And it all works.  In this case, in particular, it's essential
	that the /same/ SSH process binds a safe, plastic local end to
	its forwarded VCS port, and is able to make a real, potentially
	dangerous Internet connection to gateway.work.com.  Of course,
	since I run Emacs under noip anyway, all the version control
	stuff that Emacs does magically finds the SSH tunnel and works
	without me having to care.

  Testing

	noip provides a handy way for testing network servers and so on
	safely.  For a start, you can run your test server apparently on
	the same port as the real one.  Because noip consults the
	environment variable NOIP_SOCKETDIR to find out where to put its
	sockets, you can run two at a time and they don't interfere.
	And noip doesn't care what port numbers your program tries to
	bind, so you don't need to jump through stupid hoops in order to
	test programs which use `privileged' ports.
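
	Choosing the socket directory might look something like the C
	sketch below.  Only the NOIP_SOCKETDIR variable and the
	/tmp/noip-USER default come from the description above; the
	function name socket_dir is made up, and looking the user name
	up with getpwuid(3) is an assumption about how USER gets filled
	in.

```c
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the name of the directory in which to keep fake sockets into
 * BUF, which is SZ bytes long.  Returns 0 on success, or -1 if the
 * user can't be identified or the name doesn't fit.
 */
int socket_dir(char *buf, size_t sz)
{
	const char *p = getenv("NOIP_SOCKETDIR");
	struct passwd *pw;
	int n;

	if (p && *p) {
		/* The environment overrides the default. */
		n = snprintf(buf, sz, "%s", p);
	} else {
		/* Fall back to a per-user directory under /tmp. */
		pw = getpwuid(getuid());
		if (!pw) return -1;
		n = snprintf(buf, sz, "/tmp/noip-%s", pw->pw_name);
	}
	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}
```

	Two test runs started with different NOIP_SOCKETDIR settings
	then get different directories, and so can both appear to bind
	the same address and port without colliding.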

  Other applications

	There are certainly loads of handy things you can do with noip.
	If you think of one, let me know!

								Mark Wooding
							mdw@distorted.org.uk


Local variables:
mode: text
fill-column: 72
End: