From 831d9dd77565efaf4f50550c37eea2710c761d96 Mon Sep 17 00:00:00 2001
From: Ian Jackson
Date: Thu, 23 Feb 2017 19:37:53 +0000
Subject: [PATCH] wip. infectionsness is wrong - should depend on object contents, not choice of name in HEAD

---
 plan.txt | 133 +++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 110 insertions(+), 23 deletions(-)

diff --git a/plan.txt b/plan.txt
index 803aadd..dcda9c2 100644
--- a/plan.txt
+++ b/plan.txt
@@ -34,8 +34,8 @@ Everywhere in the git object formats and git protocols, a new object
 name (with hash function indicator) is permitted where an old object
 name is permitted. A single object refers to all the objects it
 references by the same hash function; in general this might be a
-different hash function to the hash function by this particular object
-was itself referenced or obtained.
+different hash function to the hash function by which this particular
+object was itself referenced or obtained.
 
 As an exception, it is forbidden to refer to a tree object by a name
 other than the hash function it uses to name its subtrees. If this
@@ -72,12 +72,17 @@ the user:
    existing objects and notes that this hash function is now
    `ENABLED PRESENT'.
 
+   If a hash collision is detected, we crash immediately.
+
  * OBSOLESCENT: Every object in the object store has its hash
    calculated using H. However, H is known to possibly have collisions
    which we try to tolerate. When a collision occurs, the object text
    which is currently in the object store is preferred and the "new"
-   object is thrown away. Local creation of new objects with
-   references using H is forbidden.
+   object is thrown away.
+
+   Local creation of new objects with references using H is
+   discouraged. Specifically, if another hash function is ENABLED, we
+   will use that instead.
 
    This is used as part of a gradual desupport strategy.
    When the hash function is in this stage, existing history in all existing object
@@ -119,11 +124,11 @@ the user:
    This allows us to finally retire a hash function entirely. We
    effectively throw away all the history which uses H.
 
-During transfer protocols, the receiver will say which hashes are
-obsolete or forgotten, and the sender will not follow such references
-when computing the set of objects to send. So receivers will not
-receive the objects which were named only by obsolete or forgotten
-names.
+During transfer protocols, the receiver will say which hashes it
+thinks are obsolete or forgotten, and the sender will not follow such
+references when computing the set of objects to send. So receivers
+will not receive the objects which were named only by obsolete or
+forgotten names.
 
 
 Naming in newly-generated objects, queries, etc.
@@ -151,7 +156,7 @@ overrideable by configuration.)
 This (together with the `forbidden' state, above) ensures that
 switching a project to use a new hash function is a deliberate
 decision: the default hash function needs to be changed to make the
-first first commit with the new hash function. After that, provided
+first commit with the new hash function. After that, provided
 the server accepts it, it's infectious.
 
 
@@ -160,16 +165,24 @@ Naming of refs other than HEAD
 
 A ref refers to an object by one of its names. However, operations
 like git-show-ref convert that name to the default format (see above).
-git-gc rewrites ref names to the default format.
+git-gc rewrites ref names to the default format iff that is newer.
 
 
 Remote protocol
 
-During the negotation, a client needs to specify what names it
-understands, and which it prefers (its default).
+During the negotiation, a receiver needs to specify what hashes it
+understands.
+
+When the sender is listing its refs, the names are converted to a
+hash understood by the client if necessary. If this is not necessary,
+they are left unchanged.
 
-When the server is listing its refs, the names are converted to the
-client's preferred format.
+When a receiver is updating refs, it should follow the sender's
+idea of a hash change iff it's an upgrade (and the new function is
+ENABLED). That is, if the sender sends name H2 for some ref, and the
+receiver has H1, but these refer to the same object, then the receiver
+should update its own ref name from H1 to H2 iff H2 uses a newer hash
+function.
 
 
 Equality testing
@@ -180,10 +193,14 @@ for both objects.
 
 This is going to be quite annoying.
 
+We should provide a convenient utility which tests whether two object
+names refer to the same object.
+
 Note that semantically identical trees may (now) have different tree
 objects because those tree objects might contain different object
-names. So tree comparison cannot any longer be done by comparing
-names; rather an invocation of git diff is needed.
+names. So (in some contexts at least) tree comparison cannot any
+longer be done by comparing names; rather an invocation of git diff is
+needed, or explicit generation of a tree object with the right name.
 
 
 Transition plan
@@ -191,18 +208,88 @@ Transition plan
 Y0: Implement all of the above. Test it.
 
     Default configuration:
-      SHA-1 is ENABLED and is default HEAD hash
-
+      SHA-1 is ENABLED
       SHA-512 is FORBIDDEN in bare repos
       SHA-512 is ENABLED in trees with working trees
+      default HEAD hash is SHA-1
+
+    Effects:
+
+      Existing projects will not switch to SHA-512 willy-nilly.
+      New projects will still use SHA-1.
+
+      Incompatible new-style commits cannot be pushed without server
+      admin effort (or until future upgrade).
+
+      So all old git clients still work.
+
+Y4: SHA-512 by default for new projects.
+    Conversion enabled for existing projects.
+    Old git software is now pretty firmly deprecated.
+
+    Default configuration change:
+
+      When creating a new bare tree, a configuration dropping is left
+      (in `config') which specifies that SHA-1 is OBSOLESCENT
+
+      Default status for SHA-512 is FORBIDDEN if SHA-1 is ENABLED,
+      or ENABLED if SHA-1 is OBSOLESCENT.
+
+      default HEAD hash is newest ENABLED hash.
+
+    Effects:
+
+      When creating a new working tree, it starts using SHA-512.
+      A new server tree will accept SHA-512.
+
+      Existing server trees do not yet accept SHA-512. They publish
+      their SHA-1 hashes, so clients make commits with SHA-1.
+
+      To convert a project, an administrator would set SHA-1 to
+      OBSOLESCENT on the server. All clones after that will have HEAD
+      with a SHA-512 name. Fetches and pulls will update to SHA-512
+      names.
+
+will , and push one SHA-512 commit to
+      mainline.
+
+
+
+    Default configuration change:
+
+    Effects:
+
+      When creating a new tree with working tree with git init (ie, no
+      HEAD), the default HEAD hash is set to SHA-512 (because SHA-1 is
+      OBSOLESCENT in a new tree and therefore SHA-512 is the only
+      ENABLED hash and is the default).
+
+      Newly minted server trees accept SHA-512.
+
+
+      start using SHA-512 by default.
+
+Y6: Existing projects start being converted infectiously.
+    It is hard to stop this happening.
+    Old git software is firmly stuffed.
+
+    Default configuration change:
+      SHA-1 is OBSOLESCENT
+      (default for SHA-512, and HEAD hash, computed as in Y4)
+
+    Result is that by default all software
+
+      (Projects which do not want to convert need to set SHA-1 to
+      ENABLED, explicitly, on their
 
-Y5: New projects should start using SHA-512.
+Y6: Existing projects start using SHA-512.
 
     Default configuration change:
+      SHA-512 is ENABLED
+      SHA-1 is OBSOLESCENT
+      (default default HEAD hash is already SHA-512)
-      SHA-512 becomes ENABLED in *new* bare repos but remains
-      FORBIDDEN in existing ones
-
+      In existing repositories where no special action
--
-- 
2.30.2