ick-proxy
ick-proxy - custom web proxy for rewriting URLs
ick-proxy [ options ] [ subcommand | --multiuser ]
ick-proxy [ options ] -t test-url
ick-proxy is a specialist web proxy whose job is to rewrite URLs and return 302 (Moved Temporarily) redirections for them.
You might use ick-proxy if there is a class of URL which you frequently follow links to, but which you would prefer to have modified before you visit them.
For example, some web sites provide their content in multiple formats, distinguished by some aspect of the URL. (E.g. BBC News provides low-graphics and high-graphics versions of all its news articles.) You might have a preferred style in which to read such pages, and wish to arrange that any page you read is shown in your style, even when following a link from someone who had cited the other kind of URL. ick-proxy can solve this for you by automatically rewriting all such URLs into the form you wanted, no matter whether the URL was entered manually into the address bar or followed from some other unrelated web page.
(Of course, some web sites of this type provide their own internal cookie mechanism for accommodating your viewing preferences, in which case it's almost certainly simpler to use that. But some don't, and in that case ick-proxy can help.)
To configure ick-proxy, you provide a script written in the Ick language (described below), which implements a function called rewrite taking one string argument and returning a string value. This function should transform any URL you want rewriting into the URL it should be rewritten as. URLs that you do not want rewriting should be returned unchanged.
(Your script must be idempotent: rewriting a URL twice should give the same result as doing so once. In other words, the output from your rewrite function should always be unchanged if fed back to the function as input.)
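As an illustration of the idempotence requirement, here is a minimal sketch of a rewrite function that adds a query suffix only to URLs which do not yet have a query string. The host example.com and the suffix ?view=plain are hypothetical, chosen purely for illustration; the functions len, substr and index are the standard ones described later in this document.

```
/* Hypothetical example: ask example.com for its plain view. */
string rewrite(string url)
{
    string pfx = "http://example.com/";
    string sfx = "?view=plain";

    /* Only touch URLs on the chosen site which have no '?' yet. */
    if (len(url) >= len(pfx) &&
        substr(url, 0, len(pfx)) == pfx &&
        index(url, "?") < 0) {
        url = url + sfx;
    }
    return url;
}
```

Feeding the output back in leaves it unchanged, because the rewritten URL now contains a ‘?’. Had the function appended the suffix unconditionally, every pass would lengthen the URL and the script would not be idempotent.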
ick-proxy will take this script, and optionally a .pac file describing your conventional web proxy requirements, and will output a replacement .pac file which you should configure your web browser to use. This .pac file will contain a Javascript translation of your rewrite script, so that your web browser can identify URLs which require rewriting and pass them to ick-proxy, which will return 302 Moved Temporarily responses containing the rewritten URLs. Those URLs in turn, and any URLs which did not need rewriting in the first place, will be retrieved in the manner specified by your input .pac file, or by direct access if you did not supply an input .pac.
(ick-proxy has no capability to actually fetch web pages; the only thing it knows how to do is to return 302s. So if you ask it to proxy a URL which does not require rewriting, it will have no option but to return a 501 internal error code. Hence, the configuration it supplies to your browser must be careful to send no URL to ick-proxy which does not need redirection.)
ick-proxy can be run in various different modes.
In its default mode, if invoked without arguments and without --multiuser, it will run as an X client. It reads the calling user's configuration files, writes out its custom .pac file, attaches to your X server, and forks off into the background. It will last as long as your X session does (unless it crashes or is killed), and when your X session terminates it will detect this and terminate as well.
$ ick-proxy
Alternatively, you can run it as a wrapper around a subcommand, by providing that command as arguments on the command line. In this mode ick-proxy will continue running until the subcommand terminates, and will then shut down. For example, you might run it as a wrapper around your web browser itself:
$ ick-proxy firefox
In both of the above modes ick-proxy will write its output .pac file to the file system, by default as ~/.ick-proxy/output.pac (though this location is configurable; see the next section). In order to actually enable URL rewriting in your browser, you would then configure the browser to read its proxy configuration from a URL along the lines of file:///home/username/.ick-proxy/output.pac. If you need ick-proxy to re-read its configuration files during its run, you can send it the SIGHUP signal.
ick-proxy also supports a third, rather different operating mode: it can run as a system-wide daemon providing its service to all users of a system. You invoke this mode using the --multiuser option, typically as root:
$ ick-proxy --multiuser
In this mode, ick-proxy will no longer write its output .pac files into the file system. Instead it will allocate a central port to listen on (880 by default). On that port it will perform no proxying functions; all it will do is to serve its generated .pac files over HTTP. So a user wanting to use ick-proxy would then configure their web browser to retrieve its proxy configuration from a URL of the form http://localhost:880/pac/username.
When a multi-user ick-proxy is asked for a .pac file for a particular user, it will allocate a secondary port on which to perform proxying for that user; it will then read that user's configuration files out of their home directory, and return an output .pac which cites the secondary port it has allocated. Subsequent .pac requests for the same user will cause ick-proxy to re-read the user's configuration, but to re-use the same port number. .pac requests for other users will result in separate port numbers being allocated.
In practice, the author has found that the most convenient mode of use seems to be the default one: start ick-proxy without arguments from within your .xsession script, and then it will write a static .pac file into your home directory and run for the lifetime of your X session. However, it is rumoured that browsers are required to be able to retrieve .pac files over HTTP but not required to be able to read them from static files. This suggests that the multi-user mode may technically be the most standards-compliant and hence in principle the most likely to work on all browsers. Unfortunately, the author has found that in practice at least one browser has problems retrieving .pac files over HTTP but copes fine with static files, so draw your own conclusions...
To select different running modes:
--multiuser
-t url
If none of the above options is given, the default mode is to attach to your X session and run in single-user mode.
To configure single-user mode (whether X-attached or wrapping a subprogram):
-s script-file
Specifies the Ick script file containing the rewrite function. Default is ‘~/.ick-proxy/rewrite.ick’.
-i input-pac
Specifies the input .pac file describing the user's conventional web proxy preferences. Default is ‘~/.ick-proxy/input.pac’. It is not an error for this file not to exist: if ick-proxy cannot read it, it will assume you did not wish to use a conventional web proxy at all.
-o output-pac
Specifies the file to which ick-proxy will write the output .pac file which configures the browser to use ick-proxy for URLs requiring rewriting. Default is ‘~/.ick-proxy/output.pac’.
To configure the X-client mode:
-display display
Specifies the X display to attach to, in place of $DISPLAY.
To configure the multi-user mode:
-p port
Specifies the primary port on which ick-proxy will listen in multi-user mode. Default is 880.
-u username
Causes ick-proxy in multi-user mode to drop root privileges by setting its user ID to that of username. This will be done after it binds to its primary port (since that port number can be less than 1024).
This section describes the Ick language, in which rewrite scripts are written.
In brief: the Ick language is roughly C-like, but simplified, and in particular it has a very simple type system which supports no compound types at all but does support arbitrarily sized strings as a basic type.
The Ick language has basically C-like syntax.
At the top level, a source file consists of function definitions, variable declarations, and nothing else. A function definition is of the form
return-type function-name ( [ type param [ , type param ... ] ] )
{
    variable-declarations
    statements
}
and a variable declaration is of the form
type varname [ = expression ] [ , varname [ = expression ] ... ];
The only valid types are string, int and bool. The pseudo-type void may also be used as the return type of a function (indicating that the function returns no value at all), but not for any variable or function parameter.
(To declare a function with no arguments, the word void may be used between the parentheses in place of the parameter list, as an alternative syntax to simply leaving the parentheses empty.)
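The following fragment sketches these top-level forms together. All the names in it are hypothetical, and it assumes (as in C, though this document does not say so explicitly) that top-level variables are visible inside functions and keep their values between calls.

```
/* Fragment only: a complete script would also define rewrite(). */

int hits = 0, limit = 10;       // top-level variables with initialisers
string site = "example.com";    // hypothetical host name

void count_hit(void)            // 'void' in place of the parameter list
{
    hits = hits + 1;            // assumes top-level variables are visible here
}

bool under_limit()              // empty parentheses mean the same thing
{
    return hits < limit;
}
```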
ick-proxy requires that scripts written in this language provide a function called ‘rewrite’, taking one string argument and returning a string. So the simplest possible rewrite script, which does nothing at all, might look like this:
string rewrite(string url)
{
    return url;
}
Comments in Ick are like those in C and C++: either contained between ‘/*’ and ‘*/’ (without nesting), or between ‘//’ and the next newline.
Valid statements are listed below.
The statement
expression;
has the effect of evaluating the expression, including any side effects, and ignoring its result (if any). This type of statement can be used to perform assignments, increments and decrements, function calls, or a combination of those.
The statement
return [ expression ];
immediately terminates the current instance of the function in which it is invoked. If an expression is supplied, then its value is the return value of the function (and the type of the expression must be the same as the function's return type, which must not be void). If no expression is supplied, then no value is returned (and the function's return type must be void).
The statements
break;
continue;
must be contained within at least one loop construction (while, for or do). Both of them immediately terminate the current iteration of the innermost loop containing them; break also terminates the entire loop, whereas continue merely causes the next iteration to begin.
The statement
if (expression) then-statement [ else else-statement ]
evaluates expression (which must have boolean type). If the result is true, it runs then-statement; otherwise it runs else-statement if provided.
The statement
while (expression) statement
evaluates expression (which must have boolean type). If the result is false, it does nothing further. If the result is true, it runs statement, and then starts all over again (evaluating expression again, and potentially continuing to loop).
The statement
do statement while (expression);
first runs statement. Then it evaluates expression (which must have boolean type). If the result is false, it does nothing further; otherwise, it starts all over again (running statement again, and potentially continuing to loop).
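As a sketch of the difference between the two loop forms, the hypothetical helpers below use a while loop where the body may need to run zero times, and a do loop where it must run at least once. Note that the controlling expressions are genuine booleans: something like while (len(s)) would be rejected, since len returns an integer.

```
/* Remove any trailing slashes from s. The test comes first, so a
 * string with no trailing slash is returned untouched. */
string strip_slashes(string s)
{
    while (len(s) > 0 && substr(s, len(s) - 1) == "/")
        s = substr(s, 0, len(s) - 1);
    return s;
}

/* Count the decimal digits of n (assumed non-negative). The body
 * runs before the test, so 0 is counted as having one digit. */
int digits(int n)
{
    int d = 0;
    do {
        d++;
        n /= 10;
    } while (n > 0);
    return d;
}
```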
The statement
for ( [ expr1 ] ; [ expr2 ] ; [ expr3 ] ) statement
starts by evaluating expr1 and ignoring any result.
Next it evaluates expr2, which must have boolean type. If the result is false, it does nothing further. If the result is true, it runs statement, then evaluates expr3 and ignores any result, and then goes back to the evaluation of expr2, potentially continuing to loop.
A continue statement within statement does not skip the evaluation of expr3.
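A small sketch of a for loop using continue follows; the function name is hypothetical. Because continue does not skip expr3, the counter i still advances on iterations that are cut short.

```
/* Count the '/' characters in s. */
int count_slashes(string s)
{
    int i, n = 0;
    for (i = 0; i < len(s); i++) {
        if (substr(s, i, i + 1) != "/")
            continue;           // i++ is still evaluated
        n++;
    }
    return n;
}
```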
Anywhere a single statement is syntactically valid, a braced block may appear instead:
{
    variable-declarations
    statements
}
Variables declared within this block are only valid within the block. If they include initialisers, they are initialised every time execution enters the block.
Expressions use ordinary infix syntax, with a restricted subset of the usual C operators. The accepted operators are listed below. Each subheading indicates a group of operators with the same precedence, and the operators are listed from lowest to highest precedence.
The expression
leftexpr , rightexpr
has the value of rightexpr, but before it evaluates rightexpr it first evaluates leftexpr and ignores the result.
leftexpr and rightexpr need not have the same type, and either or both may even be void. The type of the entire comma expression is the same as the type of rightexpr.
The expression
variable = expression
has the value of expression, and the side effect of copying that value into variable. variable and expression must have the same type, and of course the type of the expression as a whole is the same type again.
The compound assignment expressions
variable += expression
variable -= expression
variable *= expression
variable /= expression
variable &&= expression
variable ||= expression
are equivalent, respectively, to
variable = variable + expression
variable = variable - expression
variable = variable * expression
variable = variable / expression
variable = variable && expression
variable = variable || expression
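Since these forms are defined by their expansions, += should also work as string concatenation, and &&= and ||= accumulate boolean conditions. The following sketch (with hypothetical names) relies on that reading.

```
/* Build "http://host/path", or return "" if path is empty. */
string build(string host, string path)
{
    string url = "http://";
    bool ok = true;

    url += host;                // same as url = url + host
    url += "/" + path;
    ok &&= len(path) > 0;       // same as ok = ok && (len(path) > 0)

    return ok ? url : "";
}
```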
The expression
condexpr ? trueexpr : falseexpr
has the value of trueexpr if condexpr evaluates to true, or of falseexpr if condexpr evaluates to false.
condexpr must have boolean type. Either or both of trueexpr and falseexpr may have void type, in which case the expression as a whole has void type as well; otherwise trueexpr and falseexpr must have the same type, which is also the type of the whole expression.
The expressions
leftexpr && rightexpr
leftexpr || rightexpr
have, respectively, the value of the logical AND and logical OR of their operands. Both operands must have boolean type, and the expressions as a whole have boolean type too.
These operators are guaranteed to short-circuit: that is, if evaluating leftexpr leaves the value of the entire expression in no doubt (i.e. leftexpr is false in an && expression, or true in an || expression) then rightexpr is not evaluated at all (so its side effects, if any, will not occur).
The && and || operators have the same precedence, and associate with themselves, but may not associate with one another. That is, you can legally write either of
expr1 && expr2 && expr3
expr1 || expr2 || expr3
but it is an error to write either of
expr1 && expr2 || expr3
expr1 || expr2 && expr3
and you must instead use parentheses to disambiguate the relative priority of the operators.
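For instance, a mixed condition has to be written with explicit grouping, as in this sketch (the function, its parameters and the host name are hypothetical):

```
bool wanted(string url, bool forced)
{
    bool is_http = index(url, "http://") == 0;
    bool on_site = index(url, "example.com") >= 0;  // hypothetical host

    /* Writing 'is_http && on_site || forced' would be an error. */
    return (is_http && on_site) || forced;
}
```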
The expressions
leftexpr < rightexpr
leftexpr <= rightexpr
leftexpr > rightexpr
leftexpr >= rightexpr
leftexpr == rightexpr
leftexpr != rightexpr
return true if and only if leftexpr compares, respectively, less than, less than or equal to, greater than, greater than or equal to, equal to, or unequal to rightexpr.
leftexpr and rightexpr must both have the same type, which must be either string or integer. The expressions as a whole have boolean type.
The expressions
leftexpr + rightexpr
leftexpr - rightexpr
return, respectively, the sum and difference of leftexpr and rightexpr.
leftexpr and rightexpr must have the same type, and the expressions as a whole have the same type. That type must be integer for the - operator; for the + operator it may be either integer or string. In the latter case, the operation performed is string concatenation.
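Because both operands must have the same type, an integer cannot be appended to a string directly; it has to be converted first, for example with the standard function itoa described later. A brief sketch, with a hypothetical function name:

```
/* Produce e.g. "page-3" from ("page", 3). */
string label(string name, int n)
{
    /* 'name + n' would mix string and int, which is not allowed. */
    return name + "-" + itoa(n);
}
```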
The expressions
leftexpr * rightexpr
leftexpr / rightexpr
return, respectively, the product and quotient of leftexpr and rightexpr.
leftexpr and rightexpr must both have integer type, and the expressions as a whole have the same type.
The expressions
+ expression
- expression
have, respectively, the same value as expression and the arithmetic negative of the value of expression. expression must have integer type, and the expression as a whole has integer type too.
The expression
! expression
has the value of the boolean negation of the value of expression. expression must have boolean type, and the expression as a whole has boolean type too.
The expressions
++ variable
-- variable
have, respectively, the effect of adding 1 to variable and subtracting 1 from it. Their value is the value of variable after it is modified. variable must have integer type, and the expression as a whole has integer type too.
The expressions
variable ++
variable --
have, respectively, the effect of adding 1 to variable and subtracting 1 from it. Their value is the value of variable before it is modified. variable must have integer type, and the expression as a whole has integer type too.
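The difference between the prefix and postfix forms can be seen in this short sketch (the function name is hypothetical):

```
int demo(void)
{
    int i = 5, a, b;

    a = i++;        // a receives 5; i becomes 6
    b = ++i;        // i becomes 7; b receives 7

    return a + b;   // 12
}
```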
The expression
( expression )
has the same type and value as expression.
The expression
function-name ( [ argument [ , argument ... ] ] )
has the effect of calling the named function, with its parameters set to the values of the argument expressions in order. The types of the argument expressions must match the types of the parameters of the function; the type of the expression as a whole is the return type of the function, and its value (if any) is equal to the value returned by any return statement within the function body.
Functions are overloaded by their number and type of parameters. That is, you can independently define two functions with the same name, as long as their lists of parameter types are distinct.
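As a sketch of overloading, here are two hypothetical helpers sharing one name and differing only in their parameter lists, in the same way as the standard index function does:

```
/* True if needle occurs anywhere in s. */
bool contains(string s, string needle)
{
    return index(s, needle) >= 0;
}

/* True if needle occurs in s at or after position start. */
bool contains(string s, string needle, int start)
{
    return index(s, needle, start) >= 0;
}
```

A call such as contains(url, "?") selects the first definition, while contains(url, "/", 7) selects the second.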
The expression
variable-name
has the type and value of the contents of the named variable.
The expressions
true
false
have boolean type, and their values are respectively boolean truth and boolean falsehood.
Finally, expressions can also be integer literals and string literals.
An integer literal consists of either a sequence of decimal digits starting with a non-zero one, or a sequence of octal digits starting with a zero, or a sequence of hexadecimal digits preceded by ‘0x’.
A string literal consists of a sequence of characters enclosed in double quotes. Within those quotes, the backslash character is special, and must introduce one of the following sequences:
\a
\b
\f
\n
\r
\t
\v
\\
\"
\ followed by a new line in the source
\x followed by up to two hex digits
\ followed by up to three octal digits
Multiple string literals may also be specified in immediate succession, and will be automatically concatenated.
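A brief sketch combining several of these forms (the function name is hypothetical, and the letters given in the comment assume ASCII character codes):

```
string sample(void)
{
    /* Three adjacent literals, joined into a single string:
     *   tab and newline escapes;
     *   hex 41 and octal 102 (the letters A and B in ASCII);
     *   embedded double quotes. */
    return "col1\tcol2\n" "\x41\102" " said \"hi\"";
}
```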
The Ick execution environment pre-defines a number of standard functions you can use for string processing. Those functions are listed below.
int len(string str)
Returns the length of str.
string substr(string str, int start, int end)
Returns the substring of str starting at character position start (counting the first character in the string as zero), and continuing until position end. The character at position start is included, but the one at position end is not.
string substr(string str, int start)
Returns the substring of str starting at character position start (counting the first character in the string as zero), and continuing until the end of the string.
int atoi(string str)
Interprets str as a sequence of decimal digits (with optional minus sign) encoding an integer, and returns that integer.
string itoa(int i)
Encodes i as a string containing a sequence of decimal digits (with optional minus sign) encoding an integer, and returns that string.
int ord(string str)
Returns the character code of the first character in str, or zero if str is empty.
string chr(int c)
Returns a string containing a single character with code c, or the empty string if c is zero.
int index(string haystack, string needle)
Searches for the string needle occurring anywhere in the string haystack. Returns the first position at which it occurs, or -1 if it does not occur at all.
int index(string haystack, string needle, int start)
Same as above, but only counts matches at or after the position start.
int rindex(string haystack, string needle)
int rindex(string haystack, string needle, int start)
Same as index, but returns the last position at which needle occurs rather than the first. (Still returns -1 if it does not occur at all.)
int min(int a, int b)
Returns the smaller of the two integers a and b.
int max(int a, int b)
Returns the larger of the two integers a and b.
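The sketch below shows how these functions might be combined to pick URLs apart. All four helper names are hypothetical, and the first assumes URLs of the form "http://host/...".

```
/* Host part of an http URL, e.g. "example.com" for
 * "http://example.com/a/b.html"; "" if there is no "http://" prefix. */
string host_of(string url)
{
    int end;
    if (index(url, "http://") != 0)
        return "";
    end = index(url, "/", 7);       // first slash after "http://"
    if (end < 0)
        end = len(url);             // no path: host runs to the end
    return substr(url, 7, end);
}

/* Text after the last slash, e.g. "b.html" for the URL above;
 * the whole string if it contains no slash at all. */
string last_component(string s)
{
    return substr(s, rindex(s, "/") + 1);   // rindex gives -1 if absent
}

/* Add 1 to a decimal number held in a string: "41" becomes "42". */
string bump(string number)
{
    return itoa(atoi(number) + 1);
}

/* At most the first n characters of s. */
string truncate(string s, int n)
{
    return substr(s, 0, max(0, min(n, len(s))));
}
```

The min and max calls in truncate keep the end position within the string, since this document does not say what substr does with positions outside it.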
Here is a simple example of an ick-proxy configuration which always rewrites BBC News article URLs to reference the low-graphics version.
bool strprefix(string str, string pfx)
{
    return (len(str) >= len(pfx) &&
            substr(str, 0, len(pfx)) == pfx);
}

string rewrite(string url)
{
    if (strprefix(url, "http://news.bbc.co.uk/1/hi/")) {
        url = substr(url, 0, 24) + "low" + substr(url, 26);
    }
    return url;
}
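The same effect could be had without hand-counted character positions. The sketch below is one possible variant, not part of the original example: replace_prefix is a hypothetical helper, and it assumes that substr returns the empty string when start equals the length of the string.

```
/* If s begins with from, return s with that prefix replaced by to;
 * otherwise return s unchanged. */
string replace_prefix(string s, string from, string to)
{
    if (len(s) >= len(from) && substr(s, 0, len(from)) == from)
        return to + substr(s, len(from));
    return s;
}

string rewrite(string url)
{
    return replace_prefix(url,
                          "http://news.bbc.co.uk/1/hi/",
                          "http://news.bbc.co.uk/1/low/");
}
```

This remains idempotent because the replacement prefix does not itself begin with the prefix being replaced, so a rewritten URL no longer matches.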
None currently known, other than the fact that the entire concept is utterly disgusting (hence the program's name).
ick-proxy is free software, distributed under the MIT licence. Type ick-proxy --licence to see the full licence text.