From 1528431bae65721c81c5a3a1fd99b4231a936fe0 Mon Sep 17 00:00:00 2001
Message-Id: <1528431bae65721c81c5a3a1fd99b4231a936fe0.1716728691.git.mdw@distorted.org.uk>
From: Mark Wooding
Date: Thu, 1 Oct 2015 11:49:41 +0100
Subject: [PATCH] STYLE: Document Lisp programming style.
Organization: Straylight/Edgeware

From: Mark Wooding

---
 STYLE | 287 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 287 insertions(+)
 create mode 100644 STYLE

diff --git a/STYLE b/STYLE
new file mode 100644
index 0000000..12e848a
--- /dev/null
+++ b/STYLE
@@ -0,0 +1,287 @@
+Notes on Lisp style
+
+* Language subset and extensions
+
+None of ANSI Common Lisp is off-limits.
+
+I make extensive use of CLOS, and macros. On a couple of occasions I've
+made macros which use CLOS generic function dispatch to compute their
+expansions. The parser language is probably the best example of this in
+the codebase. I like hairy ~format~ strings.
+
+I've avoided hairy ~loop~ for the most part, not because I dislike it
+strongly but because others do and I don't find that it wins big enough
+for the fight to be worthwhile.
+
+I only use ~&aux~ lambda-list parameters in ~defstruct~ BOA
+constructors, for special effects.
+
+I use ~car~, not ~first~, and ~cdr~, not ~rest~. Similarly, I use
+~cadr~, not ~second~, and I'm not afraid to use ~cddr~ or ~cadar~.
+
+Similarly, I've not used ~elt~, preferring to know what kind of sequence
+I'm dealing with, or using the built-in sequence functions.
+
+I'm happy to use ~1+~, and I like the brevity of ~1-~ enough to use it
+despite its terrible name.
+
+There are no reader syntax extensions in the code. This is because I
+couldn't think of any way they'd be especially helpful, and not because
+I'm in any way opposed to them.
+
+The main translator, in the ~SOD~ package, tries to assume very little
+beyond ANSI Common Lisp and what's included in just about every serious
+implementation: specifically, MOP introspection, and Gray streams.
+There's intentionally no MOP intercession.
+
+The frontend additionally uses ~cl-launch~, but the dependency is actually
+quite weak, and it could be replaced with a different, maybe
+implementation-specific, mechanism fairly easily. I'm keen to take
+patches which improve frontend portability.
+
+I'm more tolerant of extensions and external dependencies in the test
+suite, which makes additional use of ~xlunit~. Running the test suite
+isn't essential to getting the translator built, so this isn't as much
+of a problem.
+
+
+* Layout
+
+I pretty much let Emacs indent my code for me, based on information
+collected by SLIME. Some exceptions:
+
+  + DSLs (e.g., the parser language) have their own space of macros
+    which Emacs doesn't understand and for the most part I haven't
+    bothered to teach it.
+
+  + Emacs sometimes does a bad job with hairy ~loop~ and requires manual
+    fixing. Since I don't use hairy ~loop~ much, this isn't a major
+    problem.
+
+Lines are 77 characters at most, except for strange special effects.
+Don't ask. This is not negotiable, though. Don't try to tell me that
+your monitor is very wide so you can read longer lines. My monitor is
+likely at least as wide. On the other hand, most lines are easily short
+enough to fit in my narrow columns, so the right hand side of a wide
+window would be mostly blank. This seems wasteful to me, when I could
+fill that space with more code.
+
+Lisp code does have a tendency to march across to the right quite
+rapidly given a chance. I have a number of strategies for dealing with
+this.
+
+  + Break a long nested calculation into pieces, giving names to the
+    intermediate results, in a ~let*~ form.
+
+  + Hoist deeply nested complex computations out into an ~flet~ or
+    ~labels~, and then invoke it from inside whatever complicated
+    conditional mess was needed to decide what to do.
+
+  + Shrug my shoulders and let code dribble down the right hand side for
+    a bit.
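
As a rough illustration of the first two strategies for taming rightward
drift, here is a sketch using made-up names (~mean-squared-error~ and
friends are hypothetical and not from the codebase). The first definition
nests everything; the second names the intermediate results in a ~let*~
and hoists the inner computation into an ~flet~.

```lisp
;; Hypothetical example: none of these names come from the codebase.

;; Nested version: the calculation marches off to the right.
(defun mean-squared-error/nested (xs ys)
  (/ (reduce #'+ (mapcar (lambda (x y) (expt (- x y) 2)) xs ys))
     (length xs)))

;; Same calculation, with the inner computation hoisted into an FLET and
;; the intermediate results named in a LET*.
(defun mean-squared-error (xs ys)
  (flet ((squared-difference (x y)
           (expt (- x y) 2)))
    (let* ((squares (mapcar #'squared-difference xs ys))
           (total (reduce #'+ squares))
           (n (length xs)))
      (/ total n))))
```

Both definitions compute the same value; the second trades a little
vertical space for less rightward drift.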
+
+
+* Packages and exporting
+
+A package collects symbols which are given meanings in one or more
+source files. If a package's code is all in one file, then the package
+definition can be put in that file too; otherwise I put it in its own
+file.
+
+I don't put ~:export~ in package definitions. Instead, I scatter calls
+to the ~export~ function throughout the code, right next to where the
+relevant symbol is defined. This has three important advantages.
+
+  + You can tell, when you're reading the code which defines ~foo~,
+    whether ~foo~ is exported and therefore a defined part of the
+    package interface.
+
+  + When you know that you're writing a thing which will form part of
+    the package interface, you don't have to go off and edit some other
+    file to export it.
+
+  + A master list of exported symbols becomes a merge hazard: if two
+    different branches add symbols to nearby pieces of the master list
+    then you get a merge conflict for no especially good reason.
+
+There's an apparent disadvantage: there's no immediately visible master
+list of exported symbols. But that's not a big problem:
+
+: (loop for s being the external-symbols of pkg collect s)
+
+See ~doc/list-symbols.lisp~ for more sophisticated reporting. (In
+particular, this identifies what kind of thing(s) each external symbol
+names.)
+
+
+* Comments and file structuring
+
+A file starts with a big ~;;;~ comment bearing the Emacs ~-*-lisp-*-~
+marker, a quick description, and copyright and licensing boilerplate. I
+don't use four-semicolon comments, and I only use ~#|~ ... ~|#~ for
+special effects.
+
+Then there's package stuff. There may be a ~cl:defpackage~ form (with
+explicit package qualifier) if the relevant package doesn't have its own
+package definition file.
+
+Then there's ~cl:in-package~. As with ~defpackage~, I use a gensym to name
+the package. I can't think offhand of a good reason to have a file with
+sections `in' more than one package.
+So the ~in-package~ form goes at the top of the file, before the first
+section header. If sections are going to end up in separate packages, I
+think I'd put a ~cl:in-package~ at the top of each section in case I
+wanted to reorder them.
+
+The rest of the file consists of Lisp code. I don't use page boundaries
+~^L~ to split files up. Instead, I use big banner comments for this:
+
+: ;;;--------------------------------------------------------------------------
+: ;;; Section title.
+
+Sections don't usually have internal comments, but if they did they'd
+also be ~;;;~ comments.
+
+Almost all definitions get documentation strings. I've tried to be
+consistent about formatting.
+
+  + Docstring lines are 77 characters or fewer.
+
+  + The first line gives a summary of what the thing does. The summary,
+    together with the SLIME-generated synopsis, is likely enough to
+    remind you what the thing does.
+
+  + The rest of the lines are indented by three spaces, and explain
+    carefully what the thing does and what all the parameters mean.
+
+Smallish functions and macros don't usually need any further
+commentary. Big functions often need to be split into bitesize pieces
+with their own internal ~;;~ comments. The idea is that these comments
+should explain the code's overall strategy to the reader, and help them
+figure out how a piece fits into that strategy.
+
+Winged, single ~;~ comments are very rare.
+
+Files end, as a result of long tradition, with a comment
+
+: ;;;----- That's all, folks --------------------------------------------------
+
+
+* Macro style
+
+I don't mind complicated macros if they're doing something worthwhile.
+They need to have good documentation strings, though.
+
+That said, where possible I've tried to factor macros into an actual
+macro providing the syntactic sugar, and a function which receives the
+parameters and $\eta$-expanded forms, and does the actual work.
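
A minimal sketch of this macro-plus-function factoring, assuming a
hypothetical special variable ~*depth*~ and the hypothetical names
~call-with-deeper-nesting~ and ~with-deeper-nesting~ (none of them from
the codebase). The function does the work and receives the body as a
closure, which is the $\eta$-expanded form; the macro only supplies
syntax.

```lisp
;; Hypothetical names throughout; a sketch of the pattern, not code from
;; the project.

(defvar *depth* 0
  "Current nesting depth (for illustration only).")

(defun call-with-deeper-nesting (amount thunk)
  "Call THUNK with `*depth*' increased by AMOUNT; return THUNK's values."
  (let ((*depth* (+ *depth* amount)))
    (funcall thunk)))

(defmacro with-deeper-nesting ((&optional (amount 1)) &body body)
  "Evaluate BODY with `*depth*' increased by AMOUNT (default 1).

   AMOUNT is evaluated exactly once, before BODY.  The values of the
   last form in BODY are returned."
  `(call-with-deeper-nesting ,amount (lambda () ,@body)))
```

Because the macro expands into an ordinary function call, ~amount~ is
evaluated once and before the body, and the function can be redefined
without recompiling the macro's callers.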
+
+It's extremely bad taste for a macro to evaluate its evaluable
+parameters in any order other than strictly left to right, or to
+evaluate them more than once.
+
+
+* Data structures
+
+I've tended to be happy with plain lists for homogeneous-ish
+collections. Strongly heterogeneous collections (other than input
+syntax, destructured using ~defmacro~ or ~destructuring-bind~) I've
+tended to make a proper data type for.
+
+My first instinct when defining a new structure is to use ~defclass~.
+While it's annoyingly verbose, it has the immense benefit over
+~defstruct~ that it's safe to redefine CLOS classes in a running image
+without the world breaking, and I usually find it necessary to add or
+change slots while I'm working on new code. Once a piece of code has
+settled down and I have a good feel for what my structure is actually
+doing, I might switch the ~defclass~ for a ~defstruct~. Several
+questions influence my decision.
+
+  + Do slot accesses need to be really fast? My usual Lisp
+    implementations aggressively optimize ~defstruct~ accessor
+    functions.
+
+  + Have I subclassed my class? While I can move over a
+    single-inheritance tree using ~:include~, it seems wrong to do this
+    most of the time. Also, I'd be precluding subclasses from multiple
+    inheritance, and I'd either have to prohibit subclassing by
+    extensions or have to commit to ~defstruct~ in the documentation.
+    In general, I'm much happier committing to ~defclass~.
+
+  + Are there methods specialized on my class? Again, structure classes
+    make fine method specializers, but it doesn't seem right.
+
+Apart from being hard to redefine, ~defstruct~ does a pretty good job of
+making a new structure type. I tend to tidy up a few rough edges.
+
+  + The default predicate always has ~-p~ appended. If the class name
+    is a single word, then I'll explicitly name the predicate with a
+    simple ~p~ suffix. For example, ~ship~ would have the predicate
+    ~shipp~, rather than ~ship-p~.
+
+  + If there are slots I can't default then I'll usually provide a BOA
+    constructor which sets them from required parameters; other slots
+    I'll set from optional or keyword parameters according to my taste
+    and judgement.
+
+  + Slots mustn't be given names which are external in any package.
+    Unfortunately, slot names are used in constructing accessor names,
+    and sometimes the right accessor name involves a prohibited symbol.
+    I've mostly addressed this by naming the slot ~%foo~, and then
+    providing inline reader and writer functions. (CLOS class
+    definitions don't have this problem because you get to set the
+    accessor function names independently of the slot names.)
+
+  + BOA constructors are strange. You can set the initial slots based
+    on an arbitrary computation on the provided parameters, but you have
+    to roll up your sleeves and mess with ~&aux~ parameters to pull it
+    off.
+
+
+* Naming
+
+I'm a traditionalist in some ways, and one of the reasons I like Lisp is
+the richness of its history and tradition.
+
+In other languages, I tend to use single- or two-letter names for
+variables and structure slots; not so much in Lisp. Other languages
+express more using punctuation, so the names stand out easily; I find
+that short names can be lost more easily in Lisp.
+
+I've also tended to go for fairly prosaic names, taking my inspiration
+from the CLOS MOP. While I mourn the loss of whimsical names like
+~haulong~ and ~haipart~, I've tried to avoid inventing more of them.
+
+There's a convention, which I think comes from ML, of using ~_~ where a
+binding occurrence of a variable name is expected, to signify that the
+corresponding value is to be discarded. Common Lisp, alas, doesn't have
+such a convention. Instead, there's a sequence of silly names used with
+the same intention, and the bindings are then explicitly ignored with a
+declaration. The names begin ~hunoz~, ~hukairz~, and (I think)
+~huaskt~.
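
For illustration, a sketch of how such placeholder names might be used
for discarded bindings. ~hunoz~ and ~hukairz~ are the names mentioned
above; the function names ~quotient-only~ and ~third-of-three~ are
hypothetical.

```lisp
;; Hypothetical functions, for illustration only.

(defun quotient-only (number divisor)
  "Return the integer quotient of NUMBER by DIVISOR; drop the remainder."
  (multiple-value-bind (quotient hunoz) (floor number divisor)
    (declare (ignore hunoz))
    quotient))

(defun third-of-three (list)
  "Return the third element of the three-element LIST."
  (destructuring-bind (hunoz hukairz third) list
    (declare (ignore hunoz hukairz))
    third))
```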
+
+
+* Declarations
+
+The code is light on declarations, other than ~ignore~ and similar used
+to muffle warnings. The macros try to do sensible things with
+declarations, and I think they succeed fairly well, but there might be
+bugs and rough edges. I know that some are just broken because, for
+actual correctness, declarations provided by the caller need to be split
+up into a number of different parts of the expansion, which in turn
+requires figuring out what the declarations mean and which bindings
+they're referring to. That's not completely impossible, assuming that
+there aren't implementation-specific declarations with crazy syntax
+mixed in there, but it's more work than seems worthwhile.
+
+
+* COMMENT Emacs cruft
+
+#+LATEX_CLASS: strayman
+
+## LocalWords: CLOS ish destructure destructured accessor specializers
+## LocalWords: accessors DSLs gensym
+
+## Local variables:
+## mode: org
+## End:
-- 
[mdw]