Manual for GUI versions of ick-proxy

This manual describes how to use the GUI versions of the ick-proxy cross-browser URL rewriting utility. It covers the Windows and Mac OS X versions.

Chapter 1: Introduction

Suppose you regularly visit a website which can present its content in multiple formats, depending on the URL. For example, BBC News supplies all of its articles in a simple version without too much fancy HTML, and in a highly formatted version with additional borders and links round the sides. The simple versions of the pages have ‘/low/’ in the URL, and the complicated ones have ‘/hi/’.

You might have a strong preference for one of these formats over the other. For example, if you have an old web browser, or a slow computer, or a small screen, or a slow network connection, you might want to always see the simple version of the page. So if you saw a hyperlink elsewhere on the web (e.g. on a blog) which linked to the complicated version of a page, you would really like your web browser to go to the simple version instead when you clicked on that link.

Some websites of this type recognise this desire, and will offer you the ability to set cookies in your browser which cause them to return pages the way you want to see them, instead of the way the person who linked to them wanted to see them. If the website in question is one of these, you're fine: simply go through its configuration process, set the cookie, and then you'll always see news articles (or whatever) in the format you wanted.

However, some websites do not provide this feature. If you have to deal with such a website on a regular basis, it would be convenient if you could configure your web browser to do the same job itself: whenever you click on a link pointing at (let us say) a BBC News article with ‘/hi/’ somewhere in the URL, the browser would modify the URL so that ‘/hi/’ is replaced with ‘/low/’.

ick-proxy can do this for you, in a manner largely independent of your web browser. It offers a general mechanism for you to specify a set of URLs to be rewritten.

There are other reasons you might want the same capability. Another one involves VPNs: in some VPN setups, you might be able to access the same web server by two different host names, such that one host name causes you to connect across the open Internet and another causes you to connect across the secured VPN. If you have to exchange any sensitive data with that web server, you might set up ick-proxy to rewrite URLs using the insecure host name so that they use the secure one instead.

Whatever your reason for wanting it, ick-proxy supplies general URL rewriting across multiple web browsers and operating systems.

The biggest downside to ick-proxy is that you need to be able to program in order to configure it: the rules about what URLs should be rewritten and how are expressed as a program in a small custom programming language. If you aren't competent to write small programs, you will need to get somebody else to configure ick-proxy for you, or not use it at all.

Chapter 2: Mechanism

The mechanism by which ick-proxy works is somewhat disgusting (hence the first half of its name).

The ick-proxy program itself acts as an HTTP web proxy (hence the second half of the name!). When the browser asks it to fetch a URL which needs rewriting, ick-proxy lies to the browser: instead of fetching the page it was asked to fetch, it returns a ‘302 Moved Temporarily’ response code which causes the browser to believe that the target web site redirected it to a different URL. The different URL in question, of course, is the one you configured ick-proxy to rewrite the original one as; and of course the target web site did not actually return that redirection, because ick-proxy never actually passed on the request to it in the first place.

However, it isn't as simple as just telling your web browser to use ick-proxy as a web proxy for all connections. If you did that, then your browser would also ask ick-proxy to retrieve URLs that don't need rewriting. That would mean ick-proxy had to actually go and connect to the web site in question and fetch real web pages, which would require it to do real work and be much more complex and probably less reliable.

So instead, you configure your web browser in such a way that the browser itself examines every URL to decide whether it needs rewriting. If it does, the URL will be fetched via ick-proxy, which will return a mendacious 302 as described above. The browser will then attempt to fetch the rewritten URL, and this time its proxy configuration will not tell it to go via ick-proxy, and it will instead retrieve the page in the usual manner (either by direct connection, or via your normal conventional web proxy if you have one set up).

Many different web browsers can be configured to do this, using a mechanism called Proxy Auto-Configuration (PAC). This lets you load a piece of JavaScript into your web browser which examines a URL and decides what proxy the browser should use, if any, to fetch it. So as well as actually acting as a web proxy, ick-proxy also does the work of constructing this PAC file out of your URL rewriting configuration. If you want to use a real web proxy in addition to ick-proxy, you must supply a PAC file as an additional input to this process; ick-proxy will combine your input PAC (if any) with your rewrite script, and produce a (probably) much more complicated output PAC which you should then configure your browser to point at.

(If you want to use a real web proxy in addition to ick-proxy, but you had previously set up that web proxy by means of a browser GUI rather than having a PAC file to describe it, then you will need to write a PAC which implements the same rules. This shouldn't be too hard if you have enough programming experience to be configuring ick-proxy in the first place.)

Chapter 3: Setting up ick-proxy

This chapter describes how to set up ick-proxy.

First, you need to write a program which performs all the URL rewriting you want ick-proxy to do. This is written in the custom ‘Ick’ language; see chapter 5 for a specification.

On Windows, you should install this script file in the ‘Application Data’ directory of your user profile, e.g. ‘C:\Documents and Settings\username\Application Data’, where ‘username’ stands for your own login name (or wherever user profiles are kept on your local Windows system). Under that directory you should create a subdirectory called ‘ick-proxy’, and you should save the script file under the name ‘rewrite.ick’ in that subdirectory. So, for example, the full pathname might be ‘C:\Documents and Settings\username\Application Data\ick-proxy\rewrite.ick’.

On Mac OS X, this file should go in your home directory, in a directory called ‘.ick-proxy’. So the full pathname might be ‘~/.ick-proxy/rewrite.ick’.

If you want to use a conventional web proxy as well as ick-proxy, you should find or write a .pac file (JavaScript proxy configuration) which specifies that proxy or combination of proxies, and install that under the name ‘input.pac’ in the same subdirectory of your profile. So, for example, the full pathname to this file might be ‘C:\Documents and Settings\username\Application Data\ick-proxy\input.pac’ on Windows (with your own login name in place of ‘username’), or ‘~/.ick-proxy/input.pac’ on a Mac.

Now run the ick-proxy application. This will put an icon in the System Tray (Windows) or the Dock (Mac OS X), and quietly sit there apparently doing nothing.

You should find, however, that on startup ick-proxy has read the configuration file(s) you set up above, and has written a new .pac file into its configuration directory called ‘output.pac’. So, for example, the full pathname to this file might be ‘C:\Documents and Settings\username\Application Data\ick-proxy\output.pac’ on Windows (with your own login name in place of ‘username’), or ‘~/.ick-proxy/output.pac’ on a Mac.

Your next step is to configure your web browser to use this file as its proxy configuration. If you're lucky, your browser preferences will include a Browse button which lets you browse around the file system and point the browser at this file.

Once you've set that up, you should be ready to go. Try fetching a URL that you want to be rewritten, and see what happens!

If that works, you will probably want to perform the final step: configure ick-proxy to start up again the next time you log in. On Windows, you can do this by putting a shortcut to it in the ‘Startup’ folder of your Start Menu (having first created that folder if it doesn't already exist). On Mac OS X, this is configured by going to System Preferences > Accounts, selecting your account, and then editing the Startup Items list to include the ick-proxy application.

(If you are using Internet Explorer, you may find that there are problems because IE caches the results computed by the PAC file on a per-host basis. Microsoft KnowledgeBase #271361 gives a workaround for this, although the author has not successfully managed to get IE to work even with that.)

3.1 Command-line options (Windows only)

On Windows, you can configure ick-proxy to store its configuration files somewhere other than the default location, by providing options on its command line. (If you put a shortcut to ick-proxy in your Startup folder, you should be able to edit the command line stored in that shortcut by right-clicking on it and selecting ‘Properties’.)

-s script-file
Specify an alternative location for ick-proxy to find your URL-rewriting script, instead of looking for ick-proxy\rewrite.ick in your user profile.
-i input-pac-file
Specify an alternative location for ick-proxy to find your input PAC file, instead of looking for ick-proxy\input.pac in your user profile.
-o output-pac-file
Specify an alternative location for ick-proxy to generate your output PAC file, instead of storing it as ick-proxy\output.pac in your user profile.

On Mac OS X, these options are not available, since OS X applications do not conveniently support command-line arguments. Instead, you can create symbolic links in your ~/.ick-proxy directory pointing at the real locations of the files.

Chapter 4: Using ick-proxy

While ick-proxy is running, right-clicking on its System Tray or Dock icon will give you a small menu. This menu will allow you to shut the program down, of course.

It also contains an option which causes ick-proxy to re-read its configuration files and regenerate the output PAC. You might use this if you were modifying your rewriting configuration in mid-session. Don't forget, however, that after using this option you should restart your web browser or otherwise make sure it reloads the output PAC; if the web browser's configuration does not match that of ick-proxy then your browser may fetch URLs directly which you wanted rewritten, or (perhaps worse) direct URLs to ick-proxy which it is no longer prepared to rewrite, resulting in HTTP error messages.

If you find that no URL rewriting is happening at all, this may be because your rewriting script had an error in it. Load your output.pac file into a text editor and have a look at it. You may find that it contains a comment at the top explaining that ick-proxy was unable to parse your rewriting script, and giving an error message and a location in the source file. In that case, edit your rewriting script and fix the error, then ask ick-proxy to reload its configuration again and see if matters have improved.

Chapter 5: Specification of the Ick language

This chapter describes the Ick language, in which rewrite scripts are written.

In brief: the Ick language is roughly C-like, but simplified, and in particular it has a very simple type system which supports no compound types at all but does support arbitrarily sized strings as a basic type.

5.1 Syntax

The Ick language has basically C-like syntax.

At the top level, a source file consists of function definitions, variable declarations, and nothing else. A function definition is of the form

return-type function-name ( [ type param [ , type param ... ] ] )
{
    variable-declarations
    statements
}

and a variable declaration is of the form

type varname [ = expression ] [ , varname [ = expression ] ... ];

The only valid types are string, int and bool. The pseudo-type void may also be used as the return type of a function (indicating that the function returns no value at all), but not for any variable or function parameter.

(To declare a function with no arguments, the word void may be used between the parentheses in place of the parameter list, as an alternative syntax to simply leaving the parentheses empty.)

ick-proxy requires that scripts written in this language provide a function called ‘rewrite’, taking one string argument and returning a string. So the simplest possible rewrite script, which does nothing at all, might look like this:

string rewrite(string url)
{
    return url;
}

Comments in Ick are like those in C and C++: either contained between ‘/*’ and ‘*/’ (without nesting), or between ‘//’ and the next newline.
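
As an illustration of the top-level structure and of both comment styles, here is a sketch of a script containing a top-level variable declaration alongside its functions. The variable name, the host name and the helper function are all hypothetical, and the sketch assumes (as in C) that a variable declared at the top level is visible inside the functions that follow it.

```ick
/* A top-level variable declaration. The host name is
 * made up purely for the purposes of this example. */
string old_host = "http://old.example.com/";

// A hypothetical helper with no arguments, declared using
// the alternative 'void' parameter-list syntax.
int old_host_len(void)
{
    return len(old_host);
}

string rewrite(string url)
{
    // This script still rewrites nothing; it only shows the layout.
    return url;
}
```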

5.2 Statements

Valid statements are listed below.

5.2.1 Expression statements

The statement

    expression;

has the effect of evaluating the expression, including any side effects, and ignoring its result (if any). This type of statement can be used to perform assignments, increments and decrements, function calls, or a combination of those.

5.2.2 Return statements

The statement

    return [ expression ];

immediately terminates the current instance of the function in which it is invoked. If an expression is supplied, then its value is the return value of the function (and the type of the expression must be the same as the function's return type, which must not be void). If no expression is supplied, then no value is returned (and the function's return type must be void).

5.2.3 Break and continue statements

The statements

    break;
    continue;

must be contained within at least one loop construction (while, for or do). Both of them immediately terminate the current iteration of the innermost loop containing them; break also terminates the entire loop, whereas continue merely causes the next iteration to begin.
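
A minimal sketch of break in use, relying on the while and if statements described below. The function name skip_digits is hypothetical, and the numbers 48 and 57 are the ASCII codes of the characters ‘0’ and ‘9’.

```ick
// Hypothetical helper: returns the position of the first
// character in str that is not a decimal digit, or len(str)
// if every character is a digit.
int skip_digits(string str)
{
    int i = 0;
    while (i < len(str)) {
        int c = ord(substr(str, i));
        if (c < 48 || c > 57)
            break;          // not a digit: terminate the whole loop
        i++;
    }
    return i;
}
```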

5.2.4 If statements

The statement

    if (expression) then-statement [ else else-statement ]

evaluates expression (which must have boolean type). If the result is true, it runs then-statement; otherwise it runs else-statement if provided.

5.2.5 While statements

The statement

    while (expression) statement

evaluates expression (which must have boolean type). If the result is false, it does nothing further. If the result is true, it runs statement, and then starts all over again (evaluating expression again, and potentially continuing to loop).

5.2.6 Do statements

The statement

    do statement while (expression);

first runs statement. Then it evaluates expression (which must have boolean type). If the result is false, it does nothing further; otherwise, it starts all over again (running statement again, and potentially continuing to loop).

5.2.7 For statements

The statement

    for ( [ expr1 ] ; [ expr2 ] ; [ expr3 ] ) statement

starts by evaluating expr1 and ignoring any result.

Next it evaluates expr2, which must have boolean type. If the result is false, it does nothing further. If the result is true, it runs statement, then evaluates expr3 and ignores any result, and then goes back to the evaluation of expr2, potentially continuing to loop.

A continue statement within statement does not skip the evaluation of expr3.
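
The following sketch shows a for loop in which continue is used. The function name count_slashes is hypothetical; the point of interest is that the i++ in the loop header still runs after continue, so the loop always makes progress.

```ick
// Hypothetical helper: counts the '/' characters in s.
int count_slashes(string s)
{
    int i, n = 0;
    for (i = 0; i < len(s); i++) {
        if (substr(s, i, i + 1) != "/")
            continue;       // skips n++ below, but i++ still runs
        n++;
    }
    return n;
}
```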

5.2.8 Statement blocks

Anywhere a single statement is syntactically valid, a braced block may appear instead:

    {
        variable-declarations
        statements
    }

Variables declared within this block are only valid within the block. If they include initialisers, they are initialised every time execution enters the block.
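
As a sketch of block-local variables, the hypothetical function below declares a variable inside the body of a loop. Its initialiser is evaluated again each time the block is entered, so it picks up a different character on every iteration.

```ick
// Hypothetical helper: returns str with every space replaced by "+".
string plus_for_space(string str)
{
    string out = "";
    int i;
    for (i = 0; i < len(str); i++) {
        // 'ch' exists only inside this block, and its initialiser
        // is evaluated afresh on every iteration of the loop.
        string ch = substr(str, i, i + 1);
        if (ch == " ")
            ch = "+";
        out = out + ch;
    }
    return out;
}
```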

5.3 Expressions

Expressions use ordinary infix syntax, with a restricted subset of the usual C operators. The accepted operators are listed below. Each subheading indicates a group of operators with the same precedence, and the operators are listed from lowest to highest precedence.

5.3.1 The comma operator

The expression

    leftexpr , rightexpr

has the value of rightexpr, but before it evaluates rightexpr it first evaluates leftexpr and ignores the result.

leftexpr and rightexpr need not have the same type, and either or both may even be void. The type of the entire comma expression is the same as the type of rightexpr.

5.3.2 Assignment operators

The expression

    variable = expression

has the value of expression, and the side effect of copying that value into variable. variable and expression must have the same type, and of course the type of the expression as a whole is the same type again.

The compound assignment expressions

    variable += expression
    variable -= expression
    variable *= expression
    variable /= expression
    variable &&= expression
    variable ||= expression

are equivalent, respectively, to

    variable = variable + expression
    variable = variable - expression
    variable = variable * expression
    variable = variable / expression
    variable = variable && expression
    variable = variable || expression
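
Since + performs concatenation on strings (see section 5.3.6), the equivalence above implies that += can be used to append to a string variable. A small sketch, using a hypothetical function called repeat:

```ick
// Hypothetical helper: returns n copies of s joined together.
string repeat(string s, int n)
{
    string out = "";
    while (n > 0) {
        out += s;       // same as: out = out + s
        n -= 1;         // same as: n = n - 1
    }
    return out;
}
```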

5.3.3 The conditional operator

The expression

    condexpr ? trueexpr : falseexpr

has the value of trueexpr if condexpr evaluates to true, or of falseexpr if condexpr evaluates to false.

condexpr must have boolean type. Either or both of trueexpr and falseexpr may have void type, in which case the expression as a whole has void type as well; otherwise trueexpr and falseexpr must have the same type, which is also the type of the whole expression.
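
For instance, a hypothetical helper which extracts the scheme part of a URL might use the conditional operator to deal with the case where no colon is present. This is only a sketch; the function name is not part of ick-proxy.

```ick
// Hypothetical helper: returns the text before the first ':'
// in url, or the empty string if url contains no ':' at all.
string scheme_of(string url)
{
    int colon = index(url, ":");
    return colon < 0 ? "" : substr(url, 0, colon);
}
```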

5.3.4 Logical operators

The expressions

    leftexpr && rightexpr
    leftexpr || rightexpr

have, respectively, the value of the logical AND and logical OR of their operands. Both operands must have boolean type, and the expressions as a whole have boolean type too.

These operators are guaranteed to short-circuit: that is, if evaluating leftexpr leaves the value of the entire expression in no doubt (i.e. leftexpr is false in an && expression, or true in an || expression) then rightexpr is not evaluated at all (so its side effects, if any, will not occur).

The && and || operators have the same precedence, and associate with themselves, but may not associate with one another. That is, you can legally write either of

    expr1 && expr2 && expr3
    expr1 || expr2 || expr3

but it is an error to write either of

    expr1 && expr2 || expr3
    expr1 || expr2 && expr3

and you must instead use parentheses to disambiguate the relative priority of the operators.
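
The sketch below shows both properties at once. The function plain_web_url is hypothetical; strprefix is the helper from the example in section 5.5. Inside strprefix, short-circuiting ensures that the length check runs first, so substr is never asked for more characters than str contains. Inside plain_web_url, the parentheses around the || expression are mandatory because it is combined with &&.

```ick
bool strprefix(string str, string pfx)
{
    return (len(str) >= len(pfx) &&
            substr(str, 0, len(pfx)) == pfx);
}

// Hypothetical helper: true for http or https URLs which
// do not contain a '?' character.
bool plain_web_url(string url)
{
    return (strprefix(url, "http://") || strprefix(url, "https://"))
        && index(url, "?") < 0;
}
```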

5.3.5 Comparison operators

The expressions

    leftexpr < rightexpr
    leftexpr <= rightexpr
    leftexpr > rightexpr
    leftexpr >= rightexpr
    leftexpr == rightexpr
    leftexpr != rightexpr

return true if and only if leftexpr compares, respectively, less than, less than or equal to, greater than, greater than or equal to, equal to, or unequal to rightexpr.

leftexpr and rightexpr must both have the same type, which must be either string or integer. The expressions as a whole have boolean type.

5.3.6 Additive operators

The expressions

    leftexpr + rightexpr
    leftexpr - rightexpr

return, respectively, the sum and difference of leftexpr and rightexpr.

leftexpr and rightexpr must have the same type, and the expressions as a whole have the same type. That type must be integer for the - operator; for the + operator it may be either integer or string. In the latter case, the operation performed is string concatenation.
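
Because both operands of + must have the same type, an integer has to be converted with itoa (see section 5.4) before it can be appended to a string. A minimal sketch, in which the function name and its purpose are hypothetical:

```ick
// Hypothetical helper: appends the number of the page after
// 'page' to the string 'base'.
string next_page_url(string base, int page)
{
    // Writing 'base + page' would be an error, because the
    // operand types differ. The inner + is integer addition;
    // the outer + is string concatenation.
    return base + itoa(page + 1);
}
```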

5.3.7 Multiplicative operators

The expressions

    leftexpr * rightexpr
    leftexpr / rightexpr

return, respectively, the product and quotient of leftexpr and rightexpr.

leftexpr and rightexpr must both have integer type, and the expressions as a whole have the same type.

5.3.8 Unary operators

The expressions

    + expression
    - expression

have, respectively, the same value as expression and the arithmetic negative of the value of expression. expression must have integer type, and the expression as a whole has integer type too.

The expression

    ! expression

has the value of the boolean negation of the value of expression. expression must have boolean type, and the expression as a whole has boolean type too.

The expressions

    ++ variable
    -- variable

have, respectively, the effect of adding 1 to variable and subtracting 1 from it. Their value is the value of variable after it is modified. variable must have integer type, and the expression as a whole has integer type too.

The expressions

    variable ++
    variable --

have, respectively, the effect of adding 1 to variable and subtracting 1 from it. Their value is the value of variable before it is modified. variable must have integer type, and the expression as a whole has integer type too.
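
The difference between the prefix and postfix forms can be seen in the following sketch (the function name is hypothetical):

```ick
int increment_demo(void)
{
    int a = 5, b, c;
    b = a++;        // b is 5; a is now 6
    c = ++a;        // a is now 7, and so is c
    return b + c;   // 5 + 7 = 12
}
```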

5.3.9 Core expression components

The expression

    ( expression )

has the same type and value as expression.

The expression

    function-name ( [ argument [ , argument ... ] ] )

has the effect of calling the named function, with its parameters set to the values of the argument expressions in order. The types of the argument expressions must match the types of the parameters of the function; the type of the expression as a whole is the return type of the function, and its value (if any) is the value returned by whichever return statement terminated that call.

Functions are overloaded by their number and type of parameters. That is, you can independently define two functions with the same name, as long as their lists of parameter types are distinct.
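
As a sketch of overloading, the two hypothetical functions below share the name show but take different parameter types; the types of the arguments at each call site select which one is called.

```ick
string show(int i)
{
    return itoa(i);
}

string show(bool b)
{
    return b ? "true" : "false";
}

// Hypothetical helper using both versions of show.
string describe(string url)
{
    return show(len(url)) + " characters, empty: " + show(len(url) == 0);
}
```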

The expression

    variable-name

has the type and value of the contents of the named variable.

The expressions

    true
    false

have boolean type, and their values are respectively boolean truth and boolean falsehood.

Finally, expressions can also be integer literals and string literals.

An integer literal consists of either a sequence of decimal digits starting with a non-zero one, or a sequence of octal digits starting with a zero, or a sequence of hexadecimal digits preceded by ‘0x’.

A string literal consists of a sequence of characters enclosed in double quotes. Within those quotes, the backslash character is special, and must introduce one of the following sequences:

\a
The alert or bell character (ASCII value 7).
\b
The backspace character (ASCII value 8).
\f
The form feed character (ASCII value 12).
\n
The new line or line feed character (ASCII value 10).
\r
The carriage return character (ASCII value 13).
\t
The horizontal tab character (ASCII value 9).
\v
The vertical tab character (ASCII value 11).
\\
A literal backslash.
\"
A literal double quote.
\ followed by a new line in the source
Causes the new line in the source to be ignored, so you can break a single string literal across multiple source lines.
\x followed by up to two hex digits
Encodes the character with the code given by the hex digits.
\ followed by up to three octal digits
Encodes the character with the code given by the octal digits.

Multiple string literals may also be specified in immediate succession, and will be automatically concatenated.
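
The fragment below sketches several of these features in top-level variable declarations. All of the variable names are hypothetical, and the sketch assumes that white space (including a newline) is permitted between adjacent literals, as it is in C.

```ick
string tab_and_newline = "name\tvalue\n";
string quoted = "He said \"hello\" and left a \\ behind";

// Hex 41 and octal 101 both denote character code 65, which
// is 'A' in ASCII, so this string should be "AA".
string two_letters = "\x41" "\101";

// Adjacent literals are joined into a single string.
string long_prefix = "http://www.example.com/a/rather/long/"
                     "path/prefix/";
```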

5.4 Standard library

The Ick execution environment pre-defines a number of standard functions you can use for string processing. Those functions are listed below.

int len(string str)

Returns the length of str.

string substr(string str, int start, int end)

Returns the substring of str starting at character position start (counting the first character in the string as zero), and continuing until position end. The character at position start is included, but the one at position end is not.

string substr(string str, int start)

Returns the substring of str starting at character position start (counting the first character in the string as zero), and continuing until the end of the string.

int atoi(string str)

Interprets str as a sequence of decimal digits (with optional minus sign) encoding an integer, and returns that integer.

string itoa(int i)

Returns a string containing the decimal representation of i: a sequence of decimal digits, preceded by a minus sign if i is negative.

int ord(string str)

Returns the character code of the first character in str, or zero if str is empty.

string chr(int c)

Returns a string containing a single character with code c, or the empty string if c is zero.

int index(string haystack, string needle)

Searches for the string needle occurring anywhere in the string haystack. Returns the first position at which it occurs, or -1 if it does not occur at all.

int index(string haystack, string needle, int start)

Same as above, but only counts matches at or after the position start.

int rindex(string haystack, string needle)
int rindex(string haystack, string needle, int start)

Same as index, but returns the last position at which needle occurs rather than the first. (Still returns -1 if it does not occur at all.)

int min(int a, int b)

Returns the smaller of the two integers a and b.

int max(int a, int b)

Returns the larger of the two integers a and b.
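
Several of these functions can be combined to build more general string utilities. The sketch below defines a hypothetical replace_all function out of index, substr and len. It assumes that index and substr both accept a position equal to the length of the string (returning -1 and the empty string respectively), which this manual does not state explicitly.

```ick
// Hypothetical helper: returns str with every occurrence of
// 'from' replaced by 'to'. 'from' must not be empty.
string replace_all(string str, string from, string to)
{
    string out = "";
    int pos = 0;
    int hit = index(str, from);
    while (hit >= 0) {
        // Copy the text before the match, then the replacement.
        out = out + substr(str, pos, hit) + to;
        pos = hit + len(from);
        hit = index(str, from, pos);
    }
    // Copy whatever follows the last match.
    return out + substr(str, pos);
}
```

Under those assumptions, replace_all("a/hi/b/hi/c", "/hi/", "/low/") should return "a/low/b/low/c".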

5.5 Example

Here is a simple example of an ick-proxy configuration which rewrites BBC News article URLs to reference the low-graphics version always.

// Returns true if str begins with pfx.
bool strprefix(string str, string pfx)
{
    return (len(str) >= len(pfx) &&
            substr(str, 0, len(pfx)) == pfx);
}

string rewrite(string url)
{
    if (strprefix(url, "http://news.bbc.co.uk/1/hi/")) {
        // Characters 24 and 25 of the URL are the "hi" to be replaced.
        url = substr(url, 0, 24) + "low" + substr(url, 26);
    }
    return url;
}
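
The VPN scenario mentioned in chapter 1 can be handled in much the same way. The following is only a sketch: both host names are invented for the purposes of the example, and you would substitute the names used by your own VPN setup.

```ick
bool strprefix(string str, string pfx)
{
    return (len(str) >= len(pfx) &&
            substr(str, 0, len(pfx)) == pfx);
}

string rewrite(string url)
{
    // Hypothetical host names: the first is reached across the
    // open Internet, the second across the VPN.
    string insecure = "http://intranet.example.com/";
    string secure = "http://intranet.vpn.example.com/";

    if (strprefix(url, insecure))
        url = secure + substr(url, len(insecure));
    return url;
}
```

Taking the length of the prefix with len, rather than writing character positions as numbers, means that the host names can be changed without recounting any offsets.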

Appendix A: Licence

ick-proxy is copyright 2004-8 Simon Tatham.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


[$Id: gui0.but 7922 2008-03-12 21:24:08Z simon $]
[$Id: icklang.but 7922 2008-03-12 21:24:08Z simon $]
[$Id: gui1.but 7922 2008-03-12 21:24:08Z simon $]