### String formatting, with bells, whistles, and gongs
###
### (c) 2013 Mark Wooding
###

###----- Licensing notice ---------------------------------------------------
###
### This file is part of Chopwood: a password-changing service.
###
### Chopwood is free software; you can redistribute it and/or modify
### it under the terms of the GNU Affero General Public License as
### published by the Free Software Foundation; either version 3 of the
### License, or (at your option) any later version.
###
### Chopwood is distributed in the hope that it will be useful,
### but WITHOUT ANY WARRANTY; without even the implied warranty of
### MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
### GNU Affero General Public License for more details.
###
### You should have received a copy of the GNU Affero General Public
### License along with Chopwood; if not, see
### <http://www.gnu.org/licenses/>.

from __future__ import with_statement

import contextlib as CTX
import re as RX
from cStringIO import StringIO
import sys as SYS

###--------------------------------------------------------------------------
### A quick guide to the formatting machinery.
###
### This is basically a re-implementation of Common Lisp's FORMAT function in
### Python.  It differs in a few respects.
###
###   * Most essentially, Python's object and argument-passing models aren't
###     the same as Lisp's.  In fact, for our purposes, they're a bit better:
###     Python's sharp distinction between positional and keyword arguments
###     is often extremely annoying, but here it becomes a clear benefit.
###     Inspired by Python's own enhanced string-formatting machinery (the
###     new `str.format' method and the `string.Formatter' class), we provide
###     additional syntax to access keyword arguments by name, positional
###     arguments by position (without moving the cursor as manipulated by
###     `~*'), and for selecting individual elements of arguments by indexing
###     or attribute lookup.
###
###   * Unfortunately, Python's I/O subsystem is much less rich than Lisp's.
###     We lack streams which remember their cursor position, and so can't
###     implement the `~&' (fresh line) or `~T' (horizontal tab) operators
###     usefully.  Moreover, the Python pretty-printer is rather less well
###     developed than the XP-based Lisp pretty-printer, so the
###     pretty-printing operations are unlikely to be implemented any time
###     soon.
###
###   * This implementation is missing a number of formatting directives just
###     because they're somewhat tedious to write, such as the detailed
###     floating-point printing provided by `~E', `~F' and `~G'.  These might
###
### Formatting takes place in two separable stages.  First, a format string
### is compiled into a formatting operation.  Then, the formatting operation
### can be applied to sets of arguments.  State for these two stages is
### maintained in fluid variable sets `COMPILE' and `FORMAT'.
###
### There are a number of protocols involved in making all of this work.
### They're described in detail as we come across them, but here's an
###
###   * Output is determined by formatting-operation objects, typically (but
###     not necessarily) subclasses of `BaseFormatOperation'.  A format
###     string is compiled into a single compound formatting operation.
###
###   * Formatting operations determine what to output from their own
###     internal state and from formatting arguments.  The latter are
###     collected from argument-collection objects which are subclasses of
###
###   * Formatting operations can be modified using parameters, which are
###     supplied either through the format string or from arguments.  To
###     abstract over this distinction, parameters are collected from
###     parameter-collection objects which are subclasses of `BaseParameter'.
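###
### As an illustration of the two-stage design, here is a minimal sketch
### which is separate from the machinery in this file.  A toy control
### language with a single `~A' directive is compiled once into a list of
### operations, and the result is then applied to several argument lists.
### The names `sketch_compile' and `sketch_apply' are hypothetical and exist
### only for this example; the real compiler and operations are defined
### further down.

```python
def sketch_compile(control):
  """Compile CONTROL into a list of (KIND, TEXT) operations."""
  ops = []
  i, n = 0, len(control)
  while i < n:
    ## Collect literal text up to the next tilde (or the end).
    j = control.find('~', i)
    if j < 0: j = n
    if j > i: ops.append(('lit', control[i:j]))
    ## The only directive this toy language understands is `~A'.
    if j < n:
      if control[j + 1:j + 2].upper() != 'A':
        raise ValueError('unknown directive at position %d' % j)
      ops.append(('arg', None))
      j += 2
    i = j
  return ops

def sketch_apply(ops, *args):
  """Apply the compiled OPS to ARGS, returning the output string."""
  out = []
  it = iter(args)
  for kind, text in ops:
    if kind == 'lit': out.append(text)
    else: out.append(str(next(it)))
  return ''.join(out)

## Compile once, apply twice.
op = sketch_compile('Hello, ~A; goodbye, ~A!')
print(sketch_apply(op, 'world', 'moon'))
print(sketch_apply(op, 1, 2))
```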

## State for format-time processing.  The base state is established by the
## `format' function, though various formatting operations will rebind
## portions of the state while they perform recursive processing.  The
## variables are as follows.
##
## argmap       The map (typically a dictionary) of keyword arguments to be
##              formatted.  These can be accessed only through `=KEY' or
##
## argpos       The index of the next positional argument to be collected.
##              The `~*' directive works by setting this variable.
##
## argseq       The sequence (typically a list) of positional arguments to be
##              formatted.  These are collected in order (as modified by the
##              `~*' directive), or may be accessed through `=INDEX' or
##
## escape       An escape procedure (usually created by `Escape()') to be
##              called by `~^'.
##
## last_multi_p A boolean, indicating that there are no more lists of
##              arguments (e.g., from `~:{...~}'), so `~:^' should escape if
##              it is encountered.
##
## multi_escape An escape procedure (usually created by `Escape()') to be
##              called by `~:^'.
##
## pushback     Some formatting operations, notably `~@[...~]', read
##              arguments without consuming them, so a subsequent operation
##              should collect the same argument.  This works by pushing the
##              arguments onto the `pushback' list.
##
## write        A function which writes its single string argument to the

## State for compile-time processing.  The base state is established by the
## `compile' function, though some formatting operations will rebind portions
## of the state while they perform recursive processing.  The variables are
##
## control      The control string being parsed.
##
## delim        An iterable (usually a string) of delimiter directives.  See
##              the `FormatDelimiter' class and the `collect_subformat'
##              function for details of this.
##
## end          The end of the portion of the control string being parsed.
##              There might be more of the string, but we should pretend that
##
## opmaps       A list of operation maps, i.e., dictionaries mapping
##              formatting directive characters to the corresponding
##              formatting operation classes.  The list is searched in order,
##              and the first match is used.  This can be used to provide
##              local extensions to the formatting language.
##
## start        The current position in the control string.  This is advanced
##              as pieces of the string are successfully parsed.
148 ###--------------------------------------------------------------------------
149 ### A few random utilities.
153 Return the number of positional arguments remaining.
155 This will /include/ pushed-back arguments, so this needn't be monotonic
156 even in the absence of `~*' repositioning.
158 return len(FORMAT.pushback) + len(FORMAT.argseq) - FORMAT.argpos
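
## To make the interaction between `argseq', `argpos' and `pushback' concrete,
## here is a small self-contained sketch.  The class `SketchArgs' is
## hypothetical: it holds the three variables as ordinary attributes rather
## than in the `FORMAT' fluid, and its methods mirror the counting rule above
## and the collect-and-peek behaviour of the argument collectors.

```python
class SketchArgs(object):
  """Hypothetical stand-in for the `argseq', `argpos', `pushback' state."""
  def __init__(me, argseq):
    me.argseq = list(argseq)
    me.argpos = 0
    me.pushback = []
  def remaining(me):
    """Count the arguments still available, including pushed-back ones."""
    return len(me.pushback) + len(me.argseq) - me.argpos
  def get(me):
    """Consume an argument: pushed-back arguments take priority."""
    if me.pushback: return me.pushback.pop()
    arg = me.argseq[me.argpos]
    me.argpos += 1
    return arg
  def peek(me, i = 0):
    """Return the argument I places ahead without consuming anything."""
    n = len(me.pushback)
    if n > i: return me.pushback[n - i - 1]
    else: return me.argseq[me.argpos + i - n]

st = SketchArgs(['a', 'b', 'c'])
first = st.get()                # consumes 'a'; two arguments remain
st.pushback.append(first)       # hand it back; three remain again
```

## Note that `remaining' went down and then up again here, which is the
## non-monotonic behaviour mentioned in the docstring above.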
161 def bind_args(args, **kw):
163 Context manager: temporarily establish a different collection of arguments.
165 If the ARGS have a `keys' attribute, then they're assumed to be a mapping
166 object and are set as the keyword arguments, preserving the positional
167 arguments; otherwise, the positional arguments are set and the keyword
168 arguments are preserved.
170 Other keyword arguments to this function are treated as additional `FORMAT'
171 variables to be bound.
173 if hasattr(args, 'keys'):
174 with FORMAT.bind(argmap = args, **kw): yield
176 with FORMAT.bind(argseq = args, argpos = 0, pushback = [], **kw): yield
178 ## Some regular expressions for parsing things.
179 R_INT = RX.compile(r'[-+]?[0-9]+')
180 R_WORD = RX.compile(r'[_a-zA-Z][_a-zA-Z0-9]*')
182 ###--------------------------------------------------------------------------
183 ### Format string errors.
185 class FormatStringError (Exception):
187 An exception type for reporting errors in format control strings.
189 Its most useful feature is that it points out where the error is in a
190 vaguely useful way. Attributes are as follows.
192 control The offending format control string.
194 msg The error message, as a human-readable string.
196 pos The position at which the error was discovered. This might
197 be a little way from the actual problem, but it's usually
201 def __init__(me, msg, control, pos):
203 Construct the exception, given a message MSG, a format CONTROL string,
204 and the position POS at which the error was found.
212 Present a string explaining the problem, including a dump of the
213 offending portion of the string.
215 s = me.control.rfind('\n', 0, me.pos) + 1
216 e = me.control.find('\n', me.pos)
217 if e < 0: e = len(me.control)
218 return '%s\n %s\n %*s^\n' % \
219 (me.msg, me.control[s:e], me.pos - s, '')
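
## A standalone sketch of the same error-report layout, for illustration.  The
## function name `sketch_error_report' and the two-space indent are
## assumptions made for this example; the line-finding logic follows the
## method above.

```python
def sketch_error_report(msg, control, pos):
  """Return MSG, the offending line of CONTROL, and a caret under POS."""
  ## Find the line of CONTROL which contains position POS.
  s = control.rfind('\n', 0, pos) + 1
  e = control.find('\n', pos)
  if e < 0: e = len(control)
  ## Indent the line and the caret by the same amount so that they line up.
  return '%s\n  %s\n  %s^\n' % (msg, control[s:e], ' ' * (pos - s))

## Position 15 is the `Q' on the second line of the control string.
report = sketch_error_report('unknown directive', 'first line\nabc~Qdef', 15)
print(report)
```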
221 def format_string_error(msg):
222 """Report an error in the current format string."""
223 raise FormatStringError(msg, COMPILE.control, COMPILE.start)
225 ###--------------------------------------------------------------------------
226 ### Argument collection protocol.
228 ## Argument collectors abstract away the details of collecting formatting
229 ## arguments. They're used both for collecting arguments to be output, and
230 ## for parameters designated using the `v' or `!ARG' syntaxes.
232 ## There are a small number of primitive collectors, and some `compound
233 ## collectors' which read an argument using some other collector, and then
234 ## process it in some way.
236 ## An argument collector should implement the following methods.
238 ## get() Return the argument variable.
240 ## pair() Return a pair of arguments.
243 ## Return a string representation of the collector. If FORCEP,
244 ## always return a string; otherwise, a `NextArg' collector
245 ## returns `None' to indicate that no syntax is required to
248 class BaseArg (object):
250 Base class for argument collectors.
252 This implements the `pair' method by calling `get' and hoping that the
253 corresponding argument is indeed a sequence of two items.
257 """Trivial constructor."""
262 Return a pair of arguments, by returning an argument which is a pair.
267 """Print a useful string representation of the collector."""
268 return '#<%s "=%s">' % (type(me).__name__, me.tostr(True))
270 class NextArg (BaseArg):
271 """The default argument collector."""
275 Return the next argument.
277 If there are pushed-back arguments, then return the one most recently
278 pushed back. Otherwise, return the next argument from `argseq',
281 if FORMAT.pushback: return FORMAT.pushback.pop()
284 FORMAT.argpos = i + 1
288 """Return a pair of arguments, by fetching two separate arguments."""
293 def tostr(me, forcep):
294 """Convert the default collector to a string."""
295 if forcep: return '+'
## Because `NextArg' collectors are used so commonly, and they're all the
300 ## same, we make a distinguished one and try to use that instead. Nothing
301 ## goes badly wrong if you don't use this, but you'll use more memory than
302 ## strictly necessary.
304 class ThisArg (BaseArg):
305 """Return the current positional argument without consuming it."""
307 """Return the positional argument I on from the current position."""
308 n = len(FORMAT.pushback)
309 if n > i: return FORMAT.pushback[n - i - 1]
310 else: return FORMAT.argseq[FORMAT.argpos + i - n]
312 """Return the next argument."""
315 """Return the next two arguments without consuming either."""
316 return me._get(0), me._get(1)
317 def tostr(me, forcep):
    """Convert the collector to a string."""
323 class SeqArg (BaseArg):
325 A primitive collector which picks out the positional argument at a specific
328 def __init__(me, index): me.index = index
329 def get(me): return FORMAT.argseq[me.index]
330 def tostr(me, forcep): return '%d' % me.index
332 class MapArg (BaseArg):
334 A primitive collector which picks out the keyword argument with a specific
337 def __init__(me, key): me.key = key
338 def get(me): return FORMAT.argmap[me.key]
339 def tostr(me, forcep): return '%s' % me.key
341 class IndexArg (BaseArg):
343 A compound collector which indexes an argument.
345 def __init__(me, base, index):
349 return me.base.get()[me.index]
350 def tostr(me, forcep):
351 return '%s[%s]' % (me.base.tostr(True), me.index)
353 class AttrArg (BaseArg):
355 A compound collector which returns an attribute of an argument.
357 def __init__(me, base, attr):
361 return getattr(me.base.get(), me.attr)
362 def tostr(me, forcep):
363 return '%s.%s' % (me.base.tostr(True), me.attr)
365 ## Regular expression matching compound-argument suffixes.
366 R_REF = RX.compile(r'''
367 \[ ( [-+]? [0-9]+ ) \]
369 | \. ( [_a-zA-Z] [_a-zA-Z0-9]* )
374 Parse an argument collector from the current format control string.
376 The syntax of an argument is as follows.
378 ARG ::= COMPOUND-ARG | `{' COMPOUND-ARG `}'
380 COMPOUND-ARG ::= SIMPLE-ARG
381 | COMPOUND-ARG `[' INDEX `]'
382 | COMPOUND-ARG `.' WORD
384 SIMPLE-ARG ::= INT | WORD | `+' | `@'
386 Surrounding braces mean nothing, but may serve to separate the argument
387 from a following alphabetic formatting directive.
389 A `+' means `the next pushed-back or positional argument'. It's useful to
390 be able to say this explicitly so that indexing and attribute references
391 can be attached to it: for example, in `~={thing}@[~={+.attr}A~]'.
393 An integer argument selects the positional argument with that index; a
394 negative index counts backwards from the end, as is usual in Python.
396 A word argument selects the keyword argument with that key.
400 s, e = COMPILE.start, COMPILE.end
402 ## If it's delimited then pick through the delimiter.
404 if s < e and c[s] == '{':
408 ## Make sure there's something to look at.
409 if s >= e: raise FormatStringError('missing argument specifier', c, s)
411 ## Find the start of the breadcrumbs.
419 m = R_INT.match(c, s, e)
420 getarg = SeqArg(int(m.group()))
423 m = R_WORD.match(c, s, e)
424 if not m: raise FormatStringError('unknown argument specifier', c, s)
425 getarg = MapArg(m.group())
428 ## Now parse indices and attribute references.
430 m = R_REF.match(c, s, e)
432 if m.group(1): getarg = IndexArg(getarg, int(m.group(1)))
433 elif m.group(2): getarg = IndexArg(getarg, m.group(2))
434 elif m.group(3): getarg = AttrArg(getarg, m.group(3))
435 else: raise FormatStringError('internal error (weird ref)', c, s)
438 ## Finally, check that we have the close delimiter we want.
440 if s >= e or c[s] != brace:
441 raise FormatStringError('missing close brace', c, s)
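
## The grammar above can be sketched as a small standalone parser.  This is a
## simplified illustration under stated assumptions, not the function above:
## it handles only the INT and WORD forms of SIMPLE-ARG (not `+' or `@'),
## integer indices and attribute references, and it returns a plain function
## of (ARGSEQ, ARGMAP) rather than a collector object.  All of the `SK_' and
## `sketch_' names are hypothetical.

```python
import re

SK_INT = re.compile(r'[-+]?[0-9]+')
SK_WORD = re.compile(r'[_a-zA-Z][_a-zA-Z0-9]*')
SK_REF = re.compile(r'\[([-+]?[0-9]+)\]|\.([_a-zA-Z][_a-zA-Z0-9]*)')

def sketch_index(get, i):
  """Wrap GET so that its result is indexed by I."""
  return lambda seq, kw: get(seq, kw)[i]

def sketch_attr(get, a):
  """Wrap GET so that attribute A of its result is returned."""
  return lambda seq, kw: getattr(get(seq, kw), a)

def sketch_parse_arg(spec):
  """Parse SPEC; return a function of (ARGSEQ, ARGMAP) fetching the value."""
  ## Surrounding braces mean nothing, so strip them.
  if spec.startswith('{'):
    if not spec.endswith('}'): raise ValueError('missing close brace')
    spec = spec[1:-1]
  ## The simple argument: an integer position or a keyword name.
  m = SK_INT.match(spec)
  if m:
    index = int(m.group())
    get = lambda seq, kw: seq[index]
  else:
    m = SK_WORD.match(spec)
    if not m: raise ValueError('unknown argument specifier')
    key = m.group()
    get = lambda seq, kw: kw[key]
  ## Any number of index and attribute suffixes may follow.
  pos = m.end()
  while pos < len(spec):
    m = SK_REF.match(spec, pos)
    if not m: raise ValueError('junk after argument specifier')
    if m.group(1) is not None: get = sketch_index(get, int(m.group(1)))
    else: get = sketch_attr(get, m.group(2))
    pos = m.end()
  return get

## Fetch element 1 of the keyword argument `rows'.
second = sketch_parse_arg('{rows[1]}')([], {'rows': ['a', 'b']})
```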
448 ###--------------------------------------------------------------------------
449 ### Parameter collectors.
451 ## These are pretty similar in shape to argument collectors. The required
452 ## methods are as follows.
454 ## get() Return the parameter value.
456 ## tostr() Return a string representation of the collector. (We don't
457 ## need a FORCEP argument here, because there are no default
460 class BaseParameter (object):
462 Base class for parameter collector objects.
464 This isn't currently very useful, because all it provides is `__repr__',
465 but the protocol might get more complicated later.
467 def __init__(me): pass
468 def __repr__(me): return '#<%s "%s">' % (type(me).__name__, me.tostr())
470 class LiteralParameter (BaseParameter):
472 A literal parameter, parsed from the control string.
474 def __init__(me, lit): me.lit = lit
475 def get(me): return me.lit
477 if me.lit is None: return ''
478 elif isinstance(me.lit, (int, long)): return str(me.lit)
479 else: return "'%c" % me.lit
481 ## Many parameters are omitted, so let's just reuse a distinguished collector
483 LITNONE = LiteralParameter(None)
485 class RemainingParameter (BaseParameter):
487 A parameter which collects the number of remaining positional arguments.
489 def get(me): return remaining()
490 def tostr(me): return '#'
492 ## These are all the same, so let's just have one of them.
493 REMAIN = RemainingParameter()
495 class VariableParameter (BaseParameter):
497 A variable parameter, fetched from an argument.
499 def __init__(me, arg): me.arg = arg
500 def get(me): return me.arg.get()
502 s = me.arg.tostr(False)
505 VARNEXT = VariableParameter(NEXTARG)
507 ###--------------------------------------------------------------------------
508 ### Formatting protocol.
510 ## The formatting operation protocol is pretty straightforward. An operation
511 ## must implement a method `format' which takes no arguments, and should
512 ## produce its output (if any) by calling `FORMAT.write'. In the course of
513 ## its execution, it may collect parameters and arguments.
515 ## The `opmaps' table maps formatting directives (which are individual
516 ## characters, in upper-case for letters) to functions returning formatting
517 ## operation objects. All of the directives are implemented in this way.
518 ## The functions for the base directives are actually the (callable) class
519 ## objects for subclasses of `BaseFormatOperation', though this isn't
522 ## The constructor functions are called as follows:
524 ## FUNC(ATP, COLONP, GETARG, PARAMS, CHAR)
525 ## The ATP and COLONP arguments are booleans indicating respectively
526 ## whether the `@' and `:' modifiers were set in the control string.
527 ## GETARG is the collector for the operation's argument(s). The PARAMS
528 ## are a list of parameter collectors. Finally, CHAR is the directive
## character (so directives with similar behaviour can use the same
532 class FormatLiteral (object):
534 A special formatting operation for printing literal text.
536 def __init__(me, s): me.s = s
537 def __repr__(me): return '#<%s %r>' % (type(me).__name__, me.s)
538 def format(me): FORMAT.write(me.s)
540 class FormatSequence (object):
  A special formatting operation for applying a collection of other operations
545 def __init__(me, seq):
548 return '#<%s [%s]>' % (type(me).__name__,
549 ', '.join(repr(p) for p in me.seq))
551 for p in me.seq: p.format()
553 class BaseFormatOperation (object):
555 The base class for built-in formatting operations (and, probably, most
558 Subclasses should implement a `_format' method.
560 _format(ATP, COLONP, [PARAM = DEFAULT, ...])
561 Called to produce output. The ATP and COLONP flags are from
562 the constructor. The remaining function arguments are the
563 computed parameter values. Arguments may be collected using
564 the `getarg' attribute.
566 Subclasses can set class attributes to influence the constructor.
568 MINPARAM The minimal number of parameters acceptable. If fewer
569 parameters are supplied then an error is reported at compile
570 time. The default is zero.
572 MAXPARAM The maximal number of parameters acceptable. If more
573 parameters are supplied then an error is reported at compile
574 time. The default is zero; `None' means that there is no
575 maximum (but this is unusual).
577 Instances have a number of useful attributes.
579 atp True if an `@' modifier appeared in the directive.
581 char The directive character from the control string.
583 colonp True if a `:' modifier appeared in the directive.
585 getarg Argument collector; may be called by `_format'.
587 params A list of parameter collector objects.
590 ## Default bounds on parameters.
591 MINPARAM = MAXPARAM = 0
593 def __init__(me, atp, colonp, getarg, params, char):
595 Constructor: store information about the directive, and check the bounds
598 A subclass should call this before doing anything fancy such as parsing
599 the control string further.
602 ## Store information.
609 ## Check the parameters.
611 if len(params) < me.MINPARAM: bad = True
612 elif me.MAXPARAM is not None and len(params) > me.MAXPARAM: bad = True
614 format_string_error('bad parameters')
617 """Produce output: call the subclass's formatting function."""
618 me._format(me.atp, me.colonp, *[p.get() for p in me.params])
621 """Convert the operation to a directive string."""
622 return '~%s%s%s%s%s' % (
623 ','.join(a.tostr() for a in me.params),
624 me.colonp and ':' or '',
625 me.atp and '@' or '',
626 (lambda s: s and '={%s}' % s or '')(me.getarg.tostr(False)),
630 """Produce a readable (ahem) version of the directive."""
631 return '#<%s "%s">' % (type(me).__name__, me.tostr())
633 class FormatDelimiter (BaseFormatOperation):
635 A fake formatting operation which exists to impose additional syntactic
636 structure on control strings.
638 No `_format' method is actually defined, so `FormatDelimiter' objects
639 should never find their way into the output pipeline. Instead, they are
640 typically useful in conjunction with the `collect_subformat' function. To
  this end, the constructor will fail if its directive character is not
  listed as an expected delimiter in `COMPILE.delim'.
645 def __init__(me, *args):
647 Constructor: make sure this delimiter is expected in the current context.
649 super(FormatDelimiter, me).__init__(*args)
650 if me.char not in COMPILE.delim:
651 format_string_error("unexpected close delimiter `~%s'" % me.char)
653 ###--------------------------------------------------------------------------
654 ### Parsing format strings.
656 def parse_operator():
658 Parse the next portion of the current control string and return a single
659 formatting operator for it.
  If we have reached the end of the control string (as stored in
  `COMPILE.end') then return `None'.
666 s, e = COMPILE.start, COMPILE.end
668 ## If we're at the end then stop.
669 if s >= e: return None
671 ## If there's some literal text then collect it.
673 i = c.find('~', s, e)
676 return FormatLiteral(c[s:i])
678 ## Otherwise there's a formatting directive to collect.
681 ## First, collect arguments.
691 if s >= e: raise FormatStringError('missing argument character', c, s)
692 aa.append(LiteralParameter(c[s]))
694 elif c[s].upper() == 'V':
698 COMPILE.start = s + 1
701 aa.append(VariableParameter(getarg))
706 m = R_INT.match(c, s, e)
708 aa.append(LiteralParameter(int(m.group())))
710 if s >= e or c[s] != ',': break
713 ## Maybe there's an explicit argument.
714 if s < e and c[s] == '=':
715 COMPILE.start = s + 1
721 ## Next, collect the flags.
727 if atp: raise FormatStringError('duplicate at flag', c, s)
730 if colonp: raise FormatStringError('duplicate colon flag', c, s)
736 ## We should now have a directive character.
737 if s >= e: raise FormatStringError('missing directive', c, s)
740 for map in COMPILE.opmaps:
742 except KeyError: pass
745 raise FormatStringError('unknown directive', c, s)
750 return op(atp, colonp, getarg, aa, ch)
752 def collect_subformat(delim):
754 Parse formatting operations from the control string until we find one whose
755 directive character is listed in DELIM.
757 Where an operation accepts multiple sequences of formatting directives, the
758 first element of DELIM should be the proper closing delimiter. The
759 traditional separator is `~;'.
762 with COMPILE.bind(delim = delim):
766 format_string_error("missing close delimiter `~%s'" % delim[0])
767 if isinstance(p, FormatDelimiter) and p.char in delim: break
769 return FormatSequence(pp), p
771 def compile(control):
773 Parse the whole CONTROL string, returning the corresponding formatting
777 with COMPILE.bind(control = control, start = 0, end = len(control),
783 return FormatSequence(pp)
785 ###--------------------------------------------------------------------------
788 def format(out, control, *args, **kw):
790 Format the positional args and keywords according to the CONTROL, and write
793 The output is written to OUT, which may be one of the following.
795 `True' Write to standard output.
797 `False' Write to standard error.
799 `None' Return the output as a string.
801 Any object with a `write' attribute
802 Call `write' repeatedly with strings to be output.
805 Call the object repeatedly with strings to be output.
807 The CONTROL argument may be one of the following.
809 A string or unicode object
810 Compile the string into a formatting operation and use that.
812 A formatting operation
813 Apply the operation to the arguments.
816 ## Turn the output argument into a function which we can use easily. If
817 ## we're writing to a string, we'll have to extract the result at the end,
818 ## so keep track of anything we have to do later.
819 final = U.constantly(None)
821 write = SYS.stdout.write
823 write = SYS.stderr.write
827 final = strio.getvalue
828 elif hasattr(out, 'write'):
835 ## Turn the control argument into a formatting operation.
836 if isinstance(control, basestring):
837 op = compile(control)
841 ## Invoke the formatting operation in the correct environment.
842 with FORMAT.bind(write = write, pushback = [],
843 argseq = args, argpos = 0,
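
## The output-designator dispatch described in the docstring above can be
## sketched on its own as follows.  This is an illustration under stated
## assumptions rather than the code of `format' itself: `sketch_writer' is a
## hypothetical name, and the fallback import is there only so that the
## sketch also runs on Python versions without `cStringIO'.

```python
import sys
try: from cStringIO import StringIO
except ImportError: from io import StringIO

def sketch_writer(out):
  """Return a pair (WRITE, FINAL) for the output designator OUT."""
  ## By default there is nothing to return at the end.
  final = lambda: None
  if out is True: write = sys.stdout.write
  elif out is False: write = sys.stderr.write
  elif out is None:
    ## Collect the output in a string buffer and hand it back at the end.
    strio = StringIO()
    write, final = strio.write, strio.getvalue
  elif hasattr(out, 'write'): write = out.write
  elif callable(out): write = out
  else: raise TypeError('bad output designator: %r' % (out,))
  return write, final

## Writing to a string: call WRITE as often as needed, then FINAL once.
write, final = sketch_writer(None)
write('hello, '); write('world')
result = final()
```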
850 ###--------------------------------------------------------------------------
851 ### Standard formatting directives.
853 ## A dictionary, in which we'll build the basic set of formatting operators.
854 ## Callers wishing to implement extensions should include this in their
857 COMPILE.opmaps = [BASEOPS]
859 ## Some standard delimiter directives.
860 for i in [']', ')', '}', '>', ';']: BASEOPS[i] = FormatDelimiter
862 class SimpleFormatOperation (BaseFormatOperation):
864 Common base class for the `~A' (`str') and `~S' (`repr') directives.
866 These take similar parameters, so it's useful to deal with them at the same
867 time. Subclasses should implement a method `_convert' of one argument,
868 which returns a string to be formatted.
870 The parameters are as follows.
872 MINCOL The minimum number of characters to output. Padding is added
873 if the output string is shorter than this.
875 COLINC Lengths of padding groups. The number of padding characters
876 will be MINPAD more than a multiple of COLINC.
878 MINPAD The smallest number of padding characters to write.
880 PADCHAR The padding character.
882 If the `@' modifier is given, then padding is applied on the left;
883 otherwise it is applied on the right.
888 def _format(me, atp, colonp,
889 mincol = 0, colinc = 1, minpad = 0, padchar = ' '):
890 what = me._convert(me.getarg.get())
892 p = mincol - n - minpad + colinc - 1
897 elif atp: what = (p * padchar) + what
898 else: what = what + (p * padchar)
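
## A standalone sketch of the padding rule described in the docstring above:
## start with MINPAD padding characters, then add COLINC more at a time until
## the result is at least MINCOL characters wide.  The function
## `sketch_pad' is hypothetical, and it is written as a simple loop for
## clarity rather than following the arithmetic used by `_format'.

```python
def sketch_pad(what, atp = False,
               mincol = 0, colinc = 1, minpad = 0, padchar = ' '):
  """Pad the string WHAT following the MINCOL/COLINC/MINPAD/PADCHAR rules."""
  if colinc < 1: raise ValueError('COLINC must be positive')
  ## The padding is always MINPAD plus a whole number of COLINC groups.
  npad = minpad
  while len(what) + npad < mincol: npad += colinc
  ## With `@', pad on the left; otherwise pad on the right.
  if atp: return npad * padchar + what
  else: return what + npad * padchar

print(repr(sketch_pad('abc', mincol = 6)))
print(repr(sketch_pad('abc', atp = True, mincol = 6, padchar = '.')))
```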
901 class FormatString (SimpleFormatOperation):
902 """~A: convert argument to a string."""
903 def _convert(me, arg): return str(arg)
904 BASEOPS['A'] = FormatString
906 class FormatRepr (SimpleFormatOperation):
907 """~S: convert argument to readable form."""
908 def _convert(me, arg): return repr(arg)
909 BASEOPS['S'] = FormatRepr
911 class IntegerFormat (BaseFormatOperation):
  Common base class for the integer formatting directives `~D', `~B', `~O',
916 These take similar parameters, so it's useful to deal with them at the same
917 time. There is a `_convert' method which does the main work. By default,
918 `_format' calls this with the argument and the value of the class attribute
919 `RADIX'; complicated subclasses might want to override this behaviour.
921 The parameters are as follows.
923 MINCOL Minimum column width. If the output is smaller than this
924 then it will be padded on the left. The default is 0.
926 PADCHAR Character to use to pad the output, should this be necessary.
927 The default is space.
929 COMMACHAR If the `:' modifier is present, then use this character to
930 separate groups of digits. The default is `,'.
932 COMMAINTERVAL If the `:' modifier is present, then separate groups of this
933 many digits. The default is 3.
935 If `@' is present, then a sign is always written; otherwise only `-' signs
941 def _convert(me, n, radix, atp, colonp,
942 mincol = 0, padchar = ' ',
943 commachar = ',', commainterval = 3):
945 Convert the integer N into the given RADIX, under the control of the
946 formatting parameters supplied.
949 ## Sort out the sign. We'll deal with it at the end: for now it's just a
951 if n < 0: sign = '-'; n = -n
955 ## Build in `dd' a list of the digits, in reverse order. This will make
956 ## the commafication easier later. The general radix conversion is
957 ## inefficient but we can make that better later.
962 if radix == 10: dd = revdigits(str(n))
963 elif radix == 8: dd = revdigits(oct(n))
964 elif radix == 16: dd = revdigits(hex(n).upper())
968 q, r = divmod(n, radix)
        if r < 10: ch = chr(ord('0') + r)
        elif r < 36: ch = chr(ord('A') - 10 + r)
        else: ch = chr(ord('a') - 36 + r)
973 if not dd: dd.append('0')
975 ## If we must commafy then do that.
980 if i >= commainterval: ndd.append(commachar); i = 0
985 if sign: dd.append(sign)
987 ## Maybe we must pad the result.
988 s = ''.join(reversed(dd))
989 npad = mincol - len(s)
990 if npad > 0: s = npad*padchar + s
995 def _format(me, atp, colonp, mincol = 0, padchar = ' ',
996 commachar = ',', commainterval = 3):
997 me._convert(me.getarg.get(), me.RADIX, atp, colonp, mincol, padchar,
998 commachar, commainterval)
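
## The conversion steps above (sign, reversed digits, optional digit
## grouping, left padding) can be sketched as one self-contained function.
## `sketch_format_int' is a hypothetical name; unlike `_convert', the sketch
## returns the string instead of writing it, and it accepts radices from 2 to
## 36 only.

```python
def sketch_format_int(n, radix = 10, atp = False, colonp = False,
                      mincol = 0, padchar = ' ',
                      commachar = ',', commainterval = 3):
  """Convert the integer N to a string in RADIX (between 2 and 36)."""
  if not 2 <= radix <= 36: raise ValueError('radix out of range')
  ## Sort out the sign; it is attached at the end.
  if n < 0: sign = '-'; n = -n
  elif atp: sign = '+'
  else: sign = ''
  ## Build the digits in reverse order, least significant first.
  digits = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
  dd = []
  while n:
    n, r = divmod(n, radix)
    dd.append(digits[r])
  if not dd: dd.append('0')
  ## Insert a separator after every COMMAINTERVAL digits if requested.
  if colonp:
    ndd = []
    for i, d in enumerate(dd):
      if i and i % commainterval == 0: ndd.append(commachar)
      ndd.append(d)
    dd = ndd
  if sign: dd.append(sign)
  ## Reverse into reading order and pad on the left.
  s = ''.join(reversed(dd))
  return (mincol - len(s)) * padchar + s

print(sketch_format_int(1234567, colonp = True))
print(sketch_format_int(255, 16))
```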
1000 class FormatDecimal (IntegerFormat):
1001 """~D: Decimal formatting."""
1003 BASEOPS['D'] = FormatDecimal
1005 class FormatBinary (IntegerFormat):
1006 """~B: Binary formatting."""
1008 BASEOPS['B'] = FormatBinary
1010 class FormatOctal (IntegerFormat):
1011 """~O: Octal formatting."""
1013 BASEOPS['O'] = FormatOctal
1015 class FormatHex (IntegerFormat):
1016 """~X: Hexadecimal formatting."""
1018 BASEOPS['X'] = FormatHex
1020 class FormatRadix (IntegerFormat):
1021 """~R: General integer formatting."""
1023 def _format(me, atp, colonp, radix = None, mincol = 0, padchar = ' ',
1024 commachar = ',', commainterval = 3):
      raise ValueError('Not implemented')
1027 me._convert(me.getarg.get(), radix, atp, colonp, mincol, padchar,
1028 commachar, commainterval)
1029 BASEOPS['R'] = FormatRadix
1031 class FormatSuppressNewline (BaseFormatOperation):
1033 ~newline: suppressed newline and/or spaces.
1035 Unless the `@' modifier is present, don't print the newline. Unless the
1036 `:' modifier is present, don't print the following string of whitespace
1039 R_SPACE = RX.compile(r'\s*')
1040 def __init__(me, *args):
1041 super(FormatSuppressNewline, me).__init__(*args)
1042 m = me.R_SPACE.match(COMPILE.control, COMPILE.start, COMPILE.end)
1043 me.trail = m.group()
1044 COMPILE.start = m.end()
1045 def _format(me, atp, colonp):
1046 if atp: FORMAT.write('\n')
1047 if colonp: FORMAT.write(me.trail)
1048 BASEOPS['\n'] = FormatSuppressNewline
1050 class LiteralFormat (BaseFormatOperation):
1052 A base class for formatting operations which write fixed strings.
1054 Subclasses should have an attribute `CHAR' containing the string (usually a
1055 single character) to be written.
1057 These operations accept a single parameter:
1059 COUNT The number of copies of the string to be written.
1062 def _format(me, atp, colonp, count = 1):
1063 FORMAT.write(count * me.CHAR)
1065 class FormatNewline (LiteralFormat):
1066 """~%: Start a new line."""
1068 BASEOPS['%'] = FormatNewline
1070 class FormatTilde (LiteralFormat):
  """~~: Print a literal `~'."""
1073 BASEOPS['~'] = FormatTilde
1075 class FormatCaseConvert (BaseFormatOperation):
1077 ~(...~): Case-convert the contained output.
1079 The material output by the contained directives is subject to case
1080 conversion as follows.
1082 no modifiers Convert to lower-case.
1083 @ Make initial letter upper-case and remainder lower.
1084 : Make initial letters of words upper-case.
1085 @: Convert to upper-case.
1087 def __init__(me, *args):
1088 super(FormatCaseConvert, me).__init__(*args)
1089 me.sub, _ = collect_subformat(')')
1090 def _format(me, atp, colonp):
1093 with FORMAT.bind(write = strio.write):
1096 inner = strio.getvalue()
1098 if colonp: out = inner.upper()
1099 else: out = inner.capitalize()
1101 if colonp: out = inner.title()
1102 else: out = inner.lower()
1104 BASEOPS['('] = FormatCaseConvert
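
## For reference, the mapping from modifiers to case conversions in the table
## above can be written as a plain function.  `sketch_case_convert' is a
## hypothetical name used only for this illustration; it relies on the
## standard string methods `lower', `capitalize', `title' and `upper'.

```python
def sketch_case_convert(text, atp = False, colonp = False):
  """Case-convert TEXT according to the `@' and `:' modifiers."""
  if atp:
    if colonp: return text.upper()
    else: return text.capitalize()
  else:
    if colonp: return text.title()
    else: return text.lower()

for atp, colonp in [(False, False), (True, False), (False, True), (True, True)]:
  print(sketch_case_convert('hello WORLD', atp, colonp))
```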
1106 class FormatGoto (BaseFormatOperation):
1108 ~*: Seek in positional arguments.
1110 There may be a parameter N; the default value depends on which modifiers
1111 are present. Without `@', skip forwards or backwards by N (default
1112 1) places; with `@', move to argument N (default 0). With `:', negate N,
1113 so move backwards instead of forwards, or count from the end rather than
  the beginning.  (Exception: `~@:0*' leaves no arguments remaining, whereas
  `~@-0*' is the same as `~@0*', and starts again from the beginning.)
1117 BUG: The list of pushed-back arguments is cleared.
1120 def _format(me, atp, colonp, n = None):
1125 else: n = len(FORMAT.argseq)
1126 if n < 0: n += len(FORMAT.argseq)
1132 FORMAT.pushback = []
1133 BASEOPS['*'] = FormatGoto
1135 class FormatConditional (BaseFormatOperation):
1137 ~[...[~;...]...[~:;...]~]: Conditional formatting.
1139 There are three variants, which are best dealt with separately.
1141 With no modifiers, apply the Nth enclosed piece, where N is either the
1142 parameter, or the argument if no parameter is provided. If there is no
1143 such piece (i.e., N is negative or too large) and the final piece is
1144 introduced by `~:;' then use that piece; otherwise produce no output.
1146 With `:', there must be exactly two pieces: apply the first if the argument
1147 is false, otherwise the second.
1149 With `@', there must be exactly one piece: if the argument is not `None'
1150 then push it back and apply the enclosed piece.
1155 def __init__(me, *args):
1157 ## Store the arguments.
1158 super(FormatConditional, me).__init__(*args)
1160 ## Collect the pieces, and keep track of whether there's a default piece.
1165 piece, delim = collect_subformat('];')
1166 if nextdef: default = piece
1167 else: pieces.append(piece)
1168 if delim.char == ']': break
1170 if default: format_string_error('multiple defaults')
1173 ## Make sure the syntax matches the modifiers we've been given.
1174 if (me.colonp or me.atp) and default:
1175 format_string_error('default not allowed here')
1176 if (me.colonp and len(pieces) != 2) or \
1177 (me.atp and len(pieces) != 1):
1178 format_string_error('wrong number of pieces')
1182 me.default = default
1184 def _format(me, atp, colonp, n = None):
1186 arg = me.getarg.get()
1187 if arg: me.pieces[1].format()
1188 else: me.pieces[0].format()
1190 arg = me.getarg.get()
1192 FORMAT.pushback.append(arg)
1193 me.pieces[0].format()
1195 if n is None: n = me.getarg.get()
1196 if 0 <= n < len(me.pieces): piece = me.pieces[n]
1197 else: piece = me.default
1198 if piece: piece.format()
1199 BASEOPS['['] = FormatConditional
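### A small sketch of the piece selection described for the unmodified form
### of `~[...~]' above.  The function `select_piece' is hypothetical; it uses
### plain strings in place of compiled pieces, and `None' to stand for `no
### default piece'.

```python
def select_piece(pieces, default, n):
    """Return the Nth piece, or DEFAULT if N is out of range.

    Hypothetical helper for illustration only.  DEFAULT plays the role of a
    final piece introduced by `~:;'; it is `None' if there is no such piece,
    in which case the caller should produce no output.
    """
    if 0 <= n < len(pieces):
        return pieces[n]
    return default


if __name__ == '__main__':
    pieces = ["zero", "one", "two"]
    for n in [1, 7, -1]:
        print(select_piece(pieces, "many", n))
```

### Negative indices deliberately select the default rather than counting
### from the end as Python's own list indexing would.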
1201 class FormatIteration (BaseFormatOperation):
1203 ~{...~}: Repeated formatting.
1205 Repeatedly apply the enclosed formatting directives to a sequence of
1206 different arguments. The directives may contain `~^' to escape early.
1208 Without `@', an argument is fetched and is expected to be a sequence; with
1209 `@', the remaining positional arguments are processed.
1211 Without `:', the enclosed directives are simply applied until the sequence
1212 of arguments is exhausted: each iteration may consume any number of
1213 arguments (even zero, though this is likely a bad plan) and any left over
1214 are available to the next iteration. With `:', each element of the
1215 sequence of arguments is itself treated as a collection of arguments --
1216 either positional or keyword depending on whether it looks like a map --
1217 and exactly one such element is consumed in each iteration.
1219 If a parameter is supplied then perform at most this many iterations. If
1220 the closing delimiter bears a `:' modifier, and the parameter is not zero,
1221 then the enclosed directives are applied once even if the argument sequence is empty.
1224 If the formatting directives are empty then a formatting string is fetched
1225 using the argument collector associated with the closing delimiter.
1230 def __init__(me, *args):
1231 super(FormatIteration, me).__init__(*args)
1232 me.body, me.end = collect_subformat('}')
1234 def _multi(me, body):
1236 Treat the positional arguments as a sequence of argument sets to be
1239 args = NEXTARG.get()
1240 with U.Escape() as esc:
1241 with bind_args(args, multi_escape = FORMAT.escape, escape = esc,
1242 last_multi_p = not remaining()):
1245 def _single(me, body):
1247 Format arguments from a single argument sequence.
1251 def _loop(me, each, max):
1253 Apply the function EACH repeatedly. Stop if no positional arguments
1254 remain; if MAX is not `None', then stop after that number of iterations.
1255 The EACH function is passed a formatting operation representing the body
1258 if me.body.seq: body = me.body
1259 else: body = compile(me.end.getarg.get())
1260 oncep = me.end.colonp
1263 if max is not None and i >= max: break
1264 if (i > 0 or not oncep) and not remaining(): break
1268 def _format(me, atp, colonp, max = None):
1269 if colonp: each = me._multi
1270 else: each = me._single
1271 with U.Escape() as esc:
1272 with FORMAT.bind(escape = esc):
1276 with bind_args(me.getarg.get()):
1278 BASEOPS['{'] = FormatIteration
1280 class FormatEscape (BaseFormatOperation):
1282 ~^: Escape from iteration.
1284 Conditionally leave an iteration early.
1286 There may be up to three parameters: call them X, Y and Z. If all three
1287 are present then exit unless Y is between X and Z (inclusive); if two are
1288 present then exit if X = Y; if only one is present, then exit if X is
1289 zero. Obviously these are more useful if at least one of X, Y and Z is
1292 With no parameters, exit if there are no positional arguments remaining.
1293 With `:', check the number of argument sets (as read by `~:{...~}') rather
1294 than the number of arguments in the current set, and escape from the entire
1295 iteration rather than from processing the current set.
1298 def _format(me, atp, colonp, x = None, y = None, z = None):
1299 if z is not None: cond = x <= y <= z
1300 elif y is not None: cond = x != y
1301 elif x is not None: cond = x != 0
1302 elif colonp: cond = not FORMAT.last_multi_p
1303 else: cond = remaining()
1305 if colonp: FORMAT.multi_escape()
1306 else: FORMAT.escape()
1307 BASEOPS['^'] = FormatEscape
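### A standalone sketch of the exit rules documented for `~^' above, written
### as a predicate which is true when the iteration should be left.  The
### name `should_escape' and its REMAINING argument (a count of positional
### arguments still to be read) are assumptions made for illustration; the
### `:' modifier is not modelled here.

```python
def should_escape(x=None, y=None, z=None, remaining=0):
    """Decide whether `~^' should leave the iteration.

    Hypothetical helper for illustration only.  X, Y and Z are the optional
    parameters; REMAINING is only consulted if no parameters are given.
    """
    if z is not None:
        return not (x <= y <= z)      # three: exit unless X <= Y <= Z
    if y is not None:
        return x == y                 # two: exit if X = Y
    if x is not None:
        return x == 0                 # one: exit if X is zero
    return remaining == 0             # none: exit if nothing is left


if __name__ == '__main__':
    print(should_escape(1, 5, 10))
    print(should_escape(3, 3))
    print(should_escape(0))
    print(should_escape(remaining=2))
```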
1309 class FormatRecursive (BaseFormatOperation):
1311 ~?: Recursive formatting.
1313 Without `@', read a pair of arguments: use the first as a format string,
1314 and apply it to the arguments extracted from the second (which may be a
1317 With `@', read a single argument: use it as a format string and apply it to
1318 the remaining arguments.
1320 def _format(me, atp, colonp):
1321 with U.Escape() as esc:
1323 control = me.getarg.get()
1324 op = compile(control)
1325 with FORMAT.bind(escape = esc): op.format()
1327 control, args = me.getarg.pair()
1328 op = compile(control)
1329 with bind_args(args, escape = esc): op.format()
1330 BASEOPS['?'] = FormatRecursive
1332 ###----- That's all, folks --------------------------------------------------