Rules of standard semaphore
---------------------------

Each semaphore character is formed by holding two identical flags in
different positions, chosen from the positions on this diagram:

       3  4  5
        \ | /
    2 ___\|/___ 6
         /|\
        / | \
       1  0  7

It makes no difference which flag points in which direction, so the
list of the 28 possible positions is as follows:

    (0,1), (0,2), ..., (0,7), (1,2), (1,3), ..., (1,7), (2,3), ..., (6,7).

In letter mode, these correspond, respectively, to the characters
a-i,k-u,y,\b,#,j,v,w,x,z; in number mode, the first 10 characters
instead correspond to 1-9,0. \b is "backspace" (meaning ignore the last
character). # and j switch to and from number mode, respectively. In
this way, any lower-case letter or number can be encoded in semaphore.

Rules of our extended semaphore
-------------------------------

To encode characters which are not in [a-z0-9], we extended semaphore as
follows. The #...j notation no longer encodes numbers. Instead, if X is
a number between 0 and 255, then #Xj represents a byte whose value is X.
E.g. the string "V3ry $1lly" would be encoded as
"#86j#51jry #36j#49jlly".

In this way, we can encode arbitrary ASCII characters - and indeed,
through UTF-8, arbitrary Unicode characters (such as Chinese characters:
太極拳) - in such a way that a message consisting just of lowercase Roman
letters encodes exactly as it does in standard semaphore.

We treat the "both arms down" position as the space character (ASCII
32). This is consistent with standard semaphore, where you put your arms
down to indicate pauses like word-breaks. Newlines are preserved;
wherever there was a newline in the original text there is a newline in
the encoded text.

There's no need for us to encode the backspace (\b) character as
"annul", because our filter is perfect and never makes mistakes.
However, it will correctly decode a semaphore message containing the
"annul" character.

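The extended rules above can be sketched in a few lines. This is an
illustration in Python rather than the script's own Perl, and the
function names encode_extended and decode_extended are hypothetical; it
only turns text into the intermediate character string (it draws
nothing) and it does not handle the "annul" character.

```python
import re


def encode_extended(text: str) -> str:
    """Encode text using the extended #Xj byte notation described above."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if "a" <= ch <= "z" or ch in " \n":
            # Lower-case letters, spaces and newlines pass through unchanged.
            out.append(ch)
        else:
            # Every other byte (digits, capitals, UTF-8 bytes) becomes #Xj.
            out.append(f"#{byte}j")
    return "".join(out)


def decode_extended(encoded: str) -> str:
    """Invert encode_extended; assumes well-formed input without 'annul'."""
    raw = bytearray()
    for match in re.finditer(r"#(\d+)j|(.)", encoded, flags=re.DOTALL):
        if match.group(1) is not None:
            raw.append(int(match.group(1)))   # a #Xj byte
        else:
            raw.append(ord(match.group(2)))   # a plain character
    return raw.decode("utf-8")
```

With this sketch, encode_extended("V3ry $1lly") gives the string shown
in the example above, and decode_extended reverses it.
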
Representing semaphore in text art
----------------------------------

To "draw" semaphore, we draw a full-stop head, with arms pointing in the
correct direction. For example, the character "c" is position (0,3), and
so is drawn like this:

    \.
     |

How the script encodes/decodes semaphore
----------------------------------------

Most of the script is fairly simple, and has been explained in SOLUTION.
However, there are these two cryptic lines in the encode and decode
routines respectively:

    # replace each character with "X.Y", X and Y representing arm positions
    s!.!for($a=0,$b=-96+ord$&;$b>7-$a;$b-=7-$a++){}
        $b+=$a;$a>5||$b+$a<5?"$b.$a":"$a.$b"!eg;

    # derive character from "X.Y" representing arm positions
    s!(.)\.(.)!chr(96+(0,7,13,18,22,25,27)[$1>$2?$2:$1]+abs$2-$1)!ge;

The way they work is this. Given arm positions (x,y) as in the diagram
above, with x < y, you can calculate which character they represent as
follows. The character is the k'th in the list
a-i,k-u,y,\b,#,j,v,w,x,z, where

    k = y-x + (the sum of the first x numbers in the list 7,6,5,4,3,2).

(There are 7 pairs whose lower position is 0, 6 whose lower position is
1, and so on, so that sum counts the pairs which come before (x,x+1) in
the list.) So as in the example above, (0,3) gives k = 3-0 + 0 = 3, i.e.
character "c", which is correct. If you think about it for a moment, or
check some examples, you'll believe our general formula. The two cryptic
lines just implement this formula in a small amount of space.

In fact, we use a shorter version of the second line in the real
program:

    s!(.)\.(.)!chr 125-(7.5-($1>$2?$2:$1))**2/2+abs$2-$1!ge

This is just a way of encoding the running sums of the list
"7,6,5,4,3,2,1" in a quadratic formula, and "completing the square", to
take up less space. The sum of the first x numbers is x(15-x)/2, and
96 + x(15-x)/2 = 125 - (7.5-x)**2/2 - 0.875. The leftover fraction does
no harm, because chr uses only the integer part of its argument.
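
For readers who would rather check the formula than take it on trust,
here is a readable sketch of the same arithmetic. It is written in
Python rather than the script's Perl, all of the names in it are
hypothetical, and ALPHABET is simply our reading of the character list
given in the text; the Perl lines themselves only produce chr(96+k).

```python
from itertools import combinations

# The 28 arm-position pairs in the order listed in the rules: (0,1) ... (6,7).
PAIRS = list(combinations(range(8), 2))

# Characters in the same order: a-i, k-u, y, \b, #, j, v, w, x, z.
ALPHABET = list("abcdefghi" "klmnopqrstu" "y") + ["\b", "#", "j", "v", "w", "x", "z"]

# Running sums of 7,6,5,4,3,2 (the lookup list used in the decode line).
OFFSETS = (0, 7, 13, 18, 22, 25, 27)


def index_from_arms(x: int, y: int) -> int:
    """k = (y - x) + sum of the first x numbers of 7,6,5,4,3,2."""
    lo, hi = min(x, y), max(x, y)
    return OFFSETS[lo] + (hi - lo)


def index_closed_form(x: int, y: int) -> int:
    """Same k via the quadratic; int() drops the .875, as chr does in Perl."""
    lo, hi = min(x, y), max(x, y)
    return int(125 - (7.5 - lo) ** 2 / 2 + (hi - lo)) - 96


def arms_from_index(k: int) -> tuple[int, int]:
    """Inverse mapping, mirroring the loop in the encode line."""
    a, b = 0, k
    while b > 7 - a:      # skip past the 7-a pairs whose lower arm is a
        b -= 7 - a
        a += 1
    return a, b + a       # b was the offset y-x; add a back to get y


def char_from_arms(x: int, y: int) -> str:
    """Look up the character for a pair of arm positions (either order)."""
    return ALPHABET[index_from_arms(x, y) - 1]
```

Looping over PAIRS and comparing index_from_arms with index_closed_form
is a quick way to convince yourself that the quadratic and the lookup
list agree for every position.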