Notes on Lisp style

* Language subset and extensions

None of ANSI Common Lisp is off-limits.

I think my Lisp style is rather more imperative in flavour than that of
most modern Lisp programmers.  It's probably closer to historical Lisp
practice in that regard, even though I wasn't writing Lisp back then.  A
lot of this is because I don't assume that the Lisp implementation
handles tail calls properly: Common Lisp is not Scheme.

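As a rough illustration of what this means in practice, the following
sketch sums a list with an explicit loop, so it doesn't depend on the
implementation merging tail calls.  The function ~sum-list~ is
hypothetical, and isn't taken from the codebase.

#+BEGIN_SRC lisp
;; Hypothetical example: SUM-LIST is invented for illustration.
(defun sum-list (numbers)
  "Return the sum of the list NUMBERS, using an explicit loop.

   A tail-recursive helper would compute the same result, but it might
   exhaust the stack on a long list if tail calls aren't merged."
  (let ((total 0))
    (dolist (n numbers total)
      (incf total n))))
#+END_SRC
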
I make extensive use of CLOS, and macros.  On a couple of occasions I've
made macros which use CLOS generic function dispatch to compute their
expansions.  The parser language is probably the best example of this in
the codebase.

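A minimal sketch of the general technique follows; it assumes nothing
about how the parser language actually does it.  The macro hands its
arguments to a generic function, and methods with ~eql~ specializers
compute the expansion.  The names ~shape-area~ and ~expand-shape-area~
are invented for illustration.

#+BEGIN_SRC lisp
;; The generic function must be available when the macro is expanded, so
;; define it at compile time as well as at load time.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defgeneric expand-shape-area (kind args)
    (:documentation "Return a form computing the area of a KIND shape."))
  (defmethod expand-shape-area ((kind (eql :circle)) args)
    (destructuring-bind (radius) args
      ;; Bind the radius form once, so that it isn't evaluated twice.
      (let ((r (gensym "RADIUS")))
        `(let ((,r ,radius)) (* pi ,r ,r)))))
  (defmethod expand-shape-area ((kind (eql :rectangle)) args)
    (destructuring-bind (width height) args
      `(* ,width ,height))))

(defmacro shape-area (kind &rest args)
  "Expand into code computing the area of a KIND shape described by ARGS.

   KIND is a keyword, not evaluated; new kinds of shape can be supported
   by defining further methods on `expand-shape-area'."
  (expand-shape-area kind args))
#+END_SRC
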
I like hairy ~format~ strings.  I've intentionally opted to leave them
as challenges to the reader rather than explain them.

I've avoided hairy ~loop~ for the most part, not because I dislike it
strongly but because others do and I don't find that it wins big enough
for the fight to be worthwhile.

I only use ~&aux~ lambda-list parameters in ~defstruct~ BOA
constructors, for special effects.

I use ~car~, not ~first~, and ~cdr~, not ~rest~.  Similarly, I use
~cadr~, not ~second~, and I'm not afraid to use ~cddr~ or ~cadar~.

Similarly, I've not used ~elt~, preferring to know what kind of sequence
I'm dealing with, or using the built-in sequence functions.

I'm happy to use ~1+~, and I like the brevity of ~1-~ enough to use it
despite its terrible name.

There are no reader syntax extensions in the code.  This is because I
couldn't think of any way they'd be especially helpful, and not because
I'm in any way opposed to them.

The main translator, in the ~SOD~ package, tries to assume very little
beyond ANSI Common Lisp and what's included in just about every serious
implementation: specifically, MOP introspection, and Gray streams.
There's intentionally no MOP intercession.

The frontend additionally makes use of ~cl-launch~, but the dependency
is actually quite weak, and it could be replaced with a different, maybe
implementation-specific, mechanism fairly easily.  I'm keen to take
patches which improve frontend portability.

I'm more tolerant of extensions and external dependencies in the test
suite, which makes additional use of ~xlunit~.  Running the test suite
isn't essential to getting the translator built, so this isn't as much
of a problem.


* Layout

I pretty much let Emacs indent my code for me, based on information
collected by SLIME.  Some exceptions:

  + DSLs (e.g., the parser language) have their own space of macros
    which Emacs doesn't understand and for the most part I haven't
    bothered to teach it.

  + Emacs sometimes does a bad job with hairy ~loop~ and requires manual
    fixing.  Since I don't use hairy ~loop~ much, this isn't a major
    problem.

  + Emacs indents lambda lists really badly.  I often prefer to put the
    entire lambda list on its own line rather than split it.  If I have
    to split a simple lambda list, without lambda-list keywords, I just
    align the start of each subsequent line with the start of the first
    argument.  I break hairy lambda lists before lambda-list keywords,
    and the start of a subsequent line aligns with the first argument
    name following the lambda-list keyword which begins the group, so
    that the lambda-list keyword stands out.

    : (defun many-arguments (first second third
    :                        fourth fifth)
    :   ...)

    : (defun hairy-arguments (first second third
    :                         &optional fourth fifth
    :                                   sixth
    :                         &rest others)
    :   ...)

    I don't know what I'd do if I had a hairy lambda list with so many
    mandatory positional arguments that I had to split them.  So far,
    this situation hasn't come up.

Lines are 77 characters at most, except for strange special effects.
Don't ask.  This is not negotiable, though.  Don't try to tell me that
your monitor is very wide so you can read longer lines.  My monitor is
likely at least as wide.  On the other hand, most lines are easily short
enough to fit in my narrow columns, so the right hand side of a wide
window would be mostly blank.  This seems wasteful to me, when I could
fill that space with more code.

Lisp code does have a tendency to march across to the right quite
rapidly given a chance.  I have a number of strategies for dealing with
this.

  + Break a long nested calculation into pieces, giving names to the
    intermediate results, in a ~let*~ form.

  + Hoist deeply nested complex computations out into ~flet~ or
    ~labels~, and then invoke them from inside whatever complicated
    conditional mess was needed to decide what to do.

  + Shrug my shoulders and let code dribble down the right hand side for
    a bit.

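The first two strategies might look something like the following sketch,
in which ~let*~ names the intermediate results and ~flet~ hoists the
shared computation out of the conditional.  The function
~describe-order~ and its arithmetic are invented purely for
illustration.

#+BEGIN_SRC lisp
;; Hypothetical example: DESCRIBE-ORDER is invented for illustration.
(defun describe-order (price quantity discount)
  "Return a string describing an order of QUANTITY items at PRICE each.

   DISCOUNT is the fraction of the gross cost to be deducted."
  (let* ((gross (* price quantity))
         (saving (* gross discount))
         (net (- gross saving)))
    (flet ((report (label)
             (format nil "~A: ~D item~:P, ~D net" label quantity net)))
      (cond ((zerop quantity) (report "empty"))
            ((plusp saving) (report "discounted"))
            (t (report "standard"))))))
#+END_SRC
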

* Packages and exporting

A package collects symbols which are given meanings in one or more
source files.  If a package's code is all in one file, then the package
definition can be put in that file too; otherwise I put it in its own
file.

I don't put ~:export~ in package definitions.  Instead, I scatter calls
to the ~export~ function throughout the code, right next to where the
relevant symbol is defined.  This has three important advantages.

  + You can tell, when you're reading the code which defines ~foo~,
    whether ~foo~ is exported and therefore a defined part of the
    package interface.

  + When you know that you're writing a thing which will form part of
    the package interface, you don't have to go off and edit some other
    file to export it.

  + A master list of exported symbols becomes a merge hazard: if two
    different branches add symbols to nearby pieces of the master list
    then you get a merge conflict for no especially good reason.

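For example, an exported definition might be written as in the following
sketch.  The name ~frobnicate~ is made up, and placing the ~export~ call
immediately before the definition, rather than after it, is only an
assumption about the arrangement.

#+BEGIN_SRC lisp
;; Hypothetical example: the export sits right beside the definition.
(export 'frobnicate)
(defun frobnicate (thing)
  "Return a fresh one-element list containing THING."
  (list thing))
#+END_SRC
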
There's an apparent disadvantage: there's no immediately visible master
list of exported symbols.  But that's not a big problem:

: (loop for s being the external-symbols of pkg collect s)

See ~doc/list-symbols.lisp~ for more sophisticated reporting.  (In
particular, this identifies what kind of thing(s) each external symbol
names.)


* Comments and file structuring

A file starts with a big ~;;;~ comment bearing the Emacs ~-*-lisp-*-~
marker, a quick description, and copyright and licensing boilerplate.  I
don't use four-semicolon comments, and I only use ~#|~ ... ~|#~ for
special effects.

Then there's package stuff.  There may be a ~cl:defpackage~ form (with
explicit package qualifier) if the relevant package doesn't have its own
package definition file.  I use gensyms (uninterned symbols, written
~#:foo~) to name packages: strings don't seem right, and interned
symbols would leak into some unrelated package.

Then there's ~cl:in-package~.  As with ~defpackage~, I use a gensym to
name the package.  I can't think offhand of a good reason to have a file
with sections `in' more than one package.  So, the ~in-package~ form
goes at the top of the file, before the first section header.  If
sections are going to end up in separate packages, I think I'd put a
~cl:in-package~ at the top of each section in case I wanted to reorder
them.

The rest of the file consists of Lisp code.  I don't use page boundaries
~^L~ to split files up.  Instead, I use big banner comments for this:

: ;;;--------------------------------------------------------------------------
: ;;; Section title.

Sections don't usually have internal comments, but if they did they'd
also be ~;;;~ comments.

Almost all definitions get documentation strings.  I've tried to be
consistent about formatting.

  + Docstring lines are 77 characters or less.

  + The first line gives a summary of what the thing does.  The summary,
    together with the SLIME-generated synopsis, is likely enough to
    remind you what the thing does.

  + The rest of the lines are indented by three spaces, and explain
    carefully what the thing does and what all the parameters mean.

Smallish functions and macros don't usually need any further
commentary.  Big functions often need to be split into bitesize pieces
with their own internal ~;;~ comments.  The idea is that these comments
should explain the code's overall strategy to the reader, and help them
figure out how a piece fits into that strategy.

Winged, single ~;~ comments are very rare.

Files end, as a result of long tradition, with a comment

: ;;;----- That's all, folks --------------------------------------------------

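Putting the conventions in this section together, a small file might be
laid out roughly as follows.  This is only a sketch: the ~style-example~
package and the ~greet~ function are hypothetical, and the copyright and
licensing boilerplate is left out.

#+BEGIN_SRC lisp
;;; -*-lisp-*-
;;;
;;; A hypothetical example file, showing the overall layout
;;;
;;; (Copyright and licensing boilerplate would go here.)

;; The package is named by an uninterned symbol, so nothing is interned
;; in whatever package happens to be current when the file is read.
(cl:defpackage #:style-example
  (:use #:common-lisp))

(cl:in-package #:style-example)

;;;--------------------------------------------------------------------------
;;; Greetings.

(export 'greet)
(defun greet (name)
  "Return a string greeting NAME.

   NAME may be any object; it is printed as if by `princ'."
  (format nil "Hello, ~A!" name))

;;;----- That's all, folks --------------------------------------------------
#+END_SRC
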

* Macro style

I don't mind complicated macros if they're doing something worthwhile.
They need to have good documentation strings, though.

That said, where possible I've tried to factor macros into an actual
macro providing the syntactic sugar, and a function which receives the
parameters and $\eta$-expanded forms, and does the actual work.

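A minimal sketch of this factoring follows.  The macro does nothing but
wrap its body in a ~lambda~ and pass it along; the function does the
work.  Both ~with-logged-errors~ and ~call-with-logged-errors~ are
invented names, not part of the codebase.

#+BEGIN_SRC lisp
;; Hypothetical example: the function which does the actual work.
(defun call-with-logged-errors (label thunk)
  "Call THUNK with no arguments, logging any error which it signals.

   If THUNK returns normally, return its values.  If it signals an error,
   write a message beginning with LABEL to `*error-output*' and return
   nil."
  (handler-case (funcall thunk)
    (error (condition)
      (format *error-output* "~&~A: ~A~%" label condition)
      nil)))

;; The macro only provides the syntactic sugar.
(defmacro with-logged-errors ((label) &body body)
  "Evaluate BODY, logging errors with LABEL rather than propagating them."
  `(call-with-logged-errors ,label (lambda () ,@body)))
#+END_SRC
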
It's extremely bad taste for a macro to evaluate its evaluable
parameters in any order other than strictly left to right, or to
evaluate them more than once.

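One common way to respect this rule is to bind each argument form to a
fresh variable, in order, before using any of them.  The following
sketch shows the idea; the ~clamp~ macro is hypothetical, and a plain
function would serve just as well here.

#+BEGIN_SRC lisp
;; Hypothetical example: CLAMP is invented for illustration.
(defmacro clamp (low value high)
  "Return VALUE, limited so as to lie between LOW and HIGH inclusive.

   Each argument form is evaluated exactly once, in left-to-right order."
  (let ((lo (gensym "LOW"))
        (val (gensym "VALUE"))
        (hi (gensym "HIGH")))
    ;; LET evaluates its initial-value forms in the order written.
    `(let ((,lo ,low) (,val ,value) (,hi ,high))
       (max ,lo (min ,val ,hi)))))
#+END_SRC
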

* Data structures

I've tended to be happy with plain lists for homogeneous-ish
collections.  Strongly heterogeneous collections (other than input
syntax, destructured using ~defmacro~ or ~destructuring-bind~) I've
tended to make a proper data type for.

My first instinct when defining a new structure is to use ~defclass~.
While it's annoyingly verbose, it has the immense benefit over
~defstruct~ that it's safe to redefine CLOS classes in a running image
without the world breaking, and I usually find it necessary to add or
change slots while I'm working on new code.  Once a piece of code has
settled down and I have a good feel for what my structure is actually
doing, I might switch the ~defclass~ for a ~defstruct~.  Several
questions influence my decision.

  + Do slot accesses need to be really fast?  My usual Lisp
    implementations aggressively optimize ~defstruct~ accessor
    functions.

  + Have I subclassed my class?  While I can move over a
    single-inheritance tree using ~:include~, it seems wrong to do this
    most of the time.  Also, I'd be precluding subclasses from multiple
    inheritance, and I'd either have to prohibit subclassing by
    extensions or have to commit to ~defstruct~ in the documentation.
    In general, I'm much happier committing to ~defclass~.

  + Are there methods specialized on my class?  Again, structure classes
    make fine method specializers, but it doesn't seem right.

Apart from being hard to redefine, ~defstruct~ does a pretty good job of
making a new structure type.  I tend to tidy up a few rough edges.

  + The default predicate always has ~-p~ appended.  If the class name
    is a single word, then I'll explicitly name the predicate with a
    simple ~p~ suffix.  For example, ~ship~ would have the predicate
    ~shipp~, rather than ~ship-p~.

  + If there are slots I can't default then I'll usually provide a BOA
    constructor which sets them from required parameters; other slots
    I'll set from optional or keyword parameters according to my taste
    and judgement.

  + Slots mustn't be given names which are external in any package.
    Unfortunately, slot names are used in constructing accessor names,
    and sometimes the right accessor name involves a prohibited symbol.
    I've mostly addressed this by naming the slot ~%foo~, and then
    providing inline reader and writer functions.  (CLOS class
    definitions don't have this problem because you get to set the
    accessor function names independently of the slot names.)

  + BOA constructors are strange.  You can set the initial slots based
    on an arbitrary computation on the provided parameters, but you have
    to roll up your sleeves and mess with ~&aux~ parameters to pull it
    off.

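The following sketch shows these adjustments applied together to the
~ship~ example.  The slots and the size computation are invented for
illustration: the slot is called ~%length~ because ~length~ is external
in the ~COMMON-LISP~ package, and an ~&aux~ parameter computes the
~size~ slot from the length.

#+BEGIN_SRC lisp
;; Hypothetical example: the slots of SHIP are invented for illustration.
(defstruct (ship
            (:predicate shipp)
            (:constructor make-ship
                (name %length &optional (crew 0)
                 &aux (size (if (> %length 100) :large :small)))))
  "A ship, with a name, a length, a crew count, and a size category."
  name
  %length
  crew
  size)

;; Provide the accessor name we actually wanted for the `%length' slot.
(declaim (inline ship-length (setf ship-length)))
(defun ship-length (ship)
  "Return the length of SHIP."
  (ship-%length ship))
(defun (setf ship-length) (value ship)
  "Set the length of SHIP to VALUE."
  (setf (ship-%length ship) value))
#+END_SRC
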

* Naming

I'm a traditionalist in some ways, and one of the reasons I like Lisp is
the richness of its history and tradition.

In other languages, I tend to use single- or two-letter names for
variables and structure slots; not so much in Lisp.  Other languages
express more using punctuation, so the names stand out easily; I find
that short names can be lost more easily in Lisp.

I've also tended to go for fairly prosaic names, taking my inspiration
from the CLOS MOP.  While I mourn the loss of whimsical names like
~haulong~ and ~haipart~, I've tried to avoid inventing more of them.

There's a convention, which I think comes from ML, of using ~_~ where a
binding occurrence of a variable name is expected, to signify that the
corresponding value is to be discarded.  Common Lisp, alas, doesn't
have such a convention.  Instead, there's a sequence of silly names used
with the same intention, and the bindings are then explicitly ignored
with a declaration.  The names begin ~hunoz~, ~hukairz~, and (I think)
~huaskt~.

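In use, the convention looks something like the following sketch, in
which ~middle-of-three~ is a hypothetical function: the unwanted parts
of the list are bound to ~hunoz~ and ~hukairz~, and then ignored.

#+BEGIN_SRC lisp
;; Hypothetical example: MIDDLE-OF-THREE is invented for illustration.
(defun middle-of-three (triple)
  "Return the second element of the three-element list TRIPLE."
  (destructuring-bind (hunoz middle hukairz) triple
    (declare (ignore hunoz hukairz))
    middle))
#+END_SRC
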

* Declarations

The code is light on declarations, other than ~ignore~ and similar used
to muffle warnings.  The macros try to do sensible things with
declarations, and I think they succeed fairly well, but there might be
bugs and rough edges.  I know that some are just broken because, for
actual correctness, declarations provided by the caller need to be split
up into a number of different parts of the expansion, which in turn
requires figuring out what the declarations mean and which bindings
they're referring to.  That's not completely impossible, assuming that
there aren't implementation-specific declarations with crazy syntax
mixed in there, but it's more work than seems worthwhile.


* COMMENT Emacs cruft

#+LATEX_CLASS: strayman

## LocalWords:  CLOS ish destructure destructured accessor specializers
## LocalWords:  accessors DSLs gensym gensyms

## Local variables:
## mode: org
## End:
 318 ## mode: org
 319 ## End: