-From ijackson Mon Sep 26 15:37:19 +0100 2016
-X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil]
- [nil "Monday" "26" "September" "2016" "15:37:19" "+0100" "Ian Jackson" "ijackson@chiark.greenend.org.uk" nil nil "Intent to commit craziness - source package unpacking" "^From:" nil nil "9" nil nil nil nil nil nil nil nil nil nil]
- nil)
-X-Mozilla-Status: 0001
-X-Mozilla-Status2: 00000000
-MIME-Version: 1.0
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Message-ID: <22505.12959.668142.478444@chiark.greenend.org.uk>
-X-Mailer: VM 8.2.0b under 24.4.1 (i586-pc-linux-gnu)
-From: Ian Jackson <ijackson@chiark.greenend.org.uk>
-To: debian-dpkg@lists.debian.org,
- Guido Guenther <agx@debian.org>,
- Bernhard R. Link <brlink@debian.org>,
- vcs-pkg-discuss@lists.alioth.debian.org
-Subject: Intent to commit craziness - source package unpacking
-Date: Mon, 26 Sep 2016 15:37:19 +0100
-
-tl;dr:
-
- * dpkg developers, please tell me whether I am making assumptions
- that are likely to become false. Particularly, on the behaviour of
- successive runs of dpkg-source --before-build with successively
- longer series files.
-
- * git-buildpackage and git-dpm developers, please point me to
- information about what metadata to put into the commit message for
- a git commit which represents a dpkg-source quilt patch. I would
- like these commits to be as convenient for gbp and git-dpm users as
- possible.
-
-
-Hi.
-
-Currently when dgit needs to import a .dsc into git, it just uses
-dpkg-source -x, and git-add. The result is a single commit where the
-package springs into existence fully formed. This is not as good as
-it could be. I would like to represent (in the git pseudohistory) the
-way that the resulting tree is constructed from the input objects.
-
-In particular, I would like to: represent the input tarballs as a
-commit each (which all get merged together as if by git merge -s
-subtree), and for quilt packages, each patch as a commit. But I want
-to avoid (as much as possible) reimplementing the package extraction
-algorithm in dpkg-source.
+We would like to: represent the input tarballs as a commit each (which
+all get merged together as if by git merge -s subtree), and for quilt
+packages, each patch as a commit. But we want to avoid (as much as
+possible) reimplementing the package extraction algorithm in
+dpkg-source.
dpkg-source does not currently provide interfaces that look like they
-are intended for what I want to do. And dgit wants to work with old
-versions of dpkg, so I don't want to block on getting such interfaces
-added (even supposing that a sane interface could be designed, which
-is doubtful).
-
-So I intend to do as follows. (Please hold your nose.)
+are intended for what dgit wants to do. And dgit wants to work with
+old versions of dpkg, so I have implemented the following algorithm
+rather than wait for such interfaces to be added (even supposing that
+a sane interface could be designed, which is doubtful):
-* dgit will untar each input tarball (other than the Debian tarball).
+* dgit will untar each input tarball.
This will be done by scanning the .dsc for things whose names look
like (compressed) tarballs, and using the interfaces provided by
Dpkg::Compression to get at the tarball.
Each input tarball unpack will be done separately, and will be
- followed by git-add and git-write tree, to obtain a git tree object
+ followed by git add and git write-tree, to obtain a git tree object
corresponding to the tarball contents.
That tree object will be made into a commit object with no parents.
with the right upstream version component, and the information found
there used for the commit object's metadata.)
-* dgit will then run dpkg-source -x --skip-patches.
+* For `3.0 (quilt)', dgit will run
+ dpkg-source -x --skip-patches
+
+ git plumbing will be used to make the result into a tree and a
+ commit. The commit will have as parents all the tarballs previously
+ mentioned. The main orig tarball will be the leftmost parent and
+ the debian tarball the rightmost parent. The metadata will come
+ from the .dsc and/or the final changelog entry.
- Again, git plumbing will be used to make this into a tree and a
- commit. The commit will have as parents all the tarballs previous
- mentioned. The metadata will come from the .dsc and/or the
- final changelog entry.
+ dgit will then run dpkg-source --before-build and record the resulting
+ tree, too.
-* dgit will look to see if the package is `3.0 (quilt)' and if so
- whether it has a series file. (dgit already rejects packages with
- distro-specific series files, so we need worry only about a single
- debian/patches/series file.)
+ Then, dgit will switch back to the patches-unapplied version and use
+ `gbp pq import' (in the private working area) to turn the
+ patches-unapplied tree into a patches-applied one.
- If there is a series file, dgit will read it into memory. It will
- then iterate over the series file, and each time:
- - write into its playground a series file containing one
- more non-comment non-empty line to previously
- - run dpkg-source --before-build (which will apply that
- additional patch)
- - make git tree and commit objects, using the metadata from
- the relevant patch file to make the commit (if available)
- - each commit object has as a parent the previous commit
- (either the previous commit, or the commit resulting from
- dpkg-source -x)
+ Finally dgit will check that the gbp pq generated patches-applied
+ version has the same git tree object as the one generated by
+ dpkg-source --before-build.
- After this the series file has been completely rewritten.
+* For source formats other than `3.0 (quilt)', dgit will simply run
+ dpkg-source -x.
-* dgit will then run one final invocation of dpkg-source
- --before-build. This ought not to produce any changes, but if
- it does, they will be represented as another commit.
+ Again, it will make that into a tree and a commit.
+
+* For source formats with only a single file entry in the .dsc, the
+ (one) tarball is not imported separately (since its tree object
+ would be the same as the extracted object), and the commit of the
+ dpkg-source -x output has no parents.
* As currently, there will be a final no-change-to-the-tree
pseudomerge commit which stitches the package into the relevant dgit
- suite branch; ie something that looks as if it was made with git
- merge -s ours.
+ suite branch. (By `pseudomerge' we mean something that looks as if
+ it was made with git merge -s ours.)
* As currently, dgit will take steps so that none of the git trees
discussed above contain a .pc directory.
upstream version.
* For `3.0 (quilt)' each patch's changes to the upstream files appears
- as a single git commit (as is the effect of the debian tarball).
- For `1.0' non-native, the effect of the diff is represented as a
+ as a single git commit (as is the effect of the debian tarball);
+ also, there is a commit object whose tree is just the debian/
+ directory, which might well be the same as certain debian-only git
+ workflow trees.
+
+* For `1.0' non-native, the effect of the diff is represented as a
commit. So eg `git blame' will show synthetic commits corresponding
to the correct parts of the input source package.
-* It is possible to `git-cherry-pick' etc. commits representing `3.0
+* It is possible to `git cherry-pick' etc. commits representing `3.0
(quilt)' patches. It is even possible fish out the patch stack as
git branch and rebase it elsewhere etc., since the patch stack is
represented as a contiguous series of commits which make only the
* No back doors into the innards of dpkg-source (nor changes to
dpkg-dev) are required.
-* dgit does grow a dependency on Dpkg::Compression.
+* dgit does grow a dependency on git-buildpackage.
* Knowledge of the source format embedded in dgit is is restricted to
- iterating over tarballs and manipulating debian/patches/series,
- which dgit already does.
-
-* dgit now depends on dpkg-source --before-build idempotently applying
- patches as they successively appear on debian/patches/series.
-
-* Perhaps the git commits generated by dgit to represent patches can
- be made to round-trip nicely into tools like git-dpm and
- git-buildpackage.
-
- I have found the information about tags in gbp-dch(1), but that
- doesn't seem like it's applicable.
-
- I have also found the information about tags in gbp-pq(1). From
- that it looks like I ought to generate "Gbp-Pq: Name" and "Gbp-Pq:
- Topic".
-
-* The scheme I describe avoids introducing a dependency from dgit to
- git-buildpackage. I might be able to replace the
- successive-patch-application part with an appropriate invocation of
- gbp-pq. Would that be better ?
-
- Bear in mind that because the output of gbp-pq import doesn't
- contain debian/patches, I would need to rewrite its output (perhaps
- with git-filter-branch).
-
-
-Comments welcome. Please be quick - this is very close to the top of
-my dgit todo list.
-
-
-Thanks,
-Ian.
-
-
---
-Ian Jackson <ijackson@chiark.greenend.org.uk> These opinions are my own.
-
-If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
-a private address which bypasses my fierce spamfilter.
-
-From ijackson Wed Sep 28 10:50:49 +0100 2016
-X-VM-v5-Data: ([nil nil nil nil t nil nil nil nil nil nil nil nil nil nil nil]
- [nil "Wednesday" "28" "September" "2016" "10:50:49" "+0100" "Ian Jackson" "ijackson@chiark.greenend.org.uk" "<22507.37497.633622.843659@chiark.greenend.org.uk>" nil "Re: Intent to commit craziness - source package unpacking" "^From:" nil nil "9" nil nil nil nil nil nil nil nil nil nil]
- nil)
-X-Mozilla-Status: 0003
-X-Mozilla-Status2: 00000000
-MIME-Version: 1.0
-Content-Type: text/plain; charset=iso-8859-1
-Content-Transfer-Encoding: quoted-printable
-Message-ID: <22507.37497.633622.843659@chiark.greenend.org.uk>
-In-Reply-To: <20160928010117.nqe2prbsbaqkbjza@gaara.hadrons.org>
-References: <22505.12959.668142.478444@chiark.greenend.org.uk>
- <20160928010117.nqe2prbsbaqkbjza@gaara.hadrons.org>
-X-Mailer: VM 8.2.0b under 24.4.1 (i586-pc-linux-gnu)
-From: Ian Jackson <ijackson@chiark.greenend.org.uk>
-To: Guillem Jover <guillem@debian.org>
-Cc: debian-dpkg@lists.debian.org,
- Guido Guenther <agx@debian.org>,
- "Bernhard R. Link" <brlink@debian.org>,
- vcs-pkg-discuss@lists.alioth.debian.org
-Subject: Re: Intent to commit craziness - source package unpacking
-Date: Wed, 28 Sep 2016 10:50:49 +0100
-
-Guillem Jover writes ("Re: Intent to commit craziness - source package =
-unpacking"):
-> On Mon, 2016-09-26 at 15:37:19 +0100, Ian Jackson wrote:
-> > tl;dr:
-> >=20
-> > * dpkg developers, please tell me whether I am making assumptions
-> > that are likely to become false. Particularly, on the behaviour=
- of
-> > successive runs of dpkg-source --before-build with successively
-> > longer series files.
->=20
-> For format =AB3.0 (quilt)=BB, that seems fine, to the point I'm fine =
-even
-> documenting this, which I can probably do for 1.18.11.
-
-Great.
-
-> For other formats, such as =AB2.0=BB, I don't think that's true, but =
-I
-> assume you don't care about that one anyway. But just mentioning
-> because this behavior is probably format-specific. For =AB2.0=BB I
-> think it could be fixed, and should not be too hard (not sure if it's=
-
-> worth it though).
-
-I think the right approach is perhaps to use --skip-patches and
---before-build only with 3.0 (quilt). The that would leave 2.0 (or
-other strange or future formats) producing a correct (although
-possibly sub-optimal) import.
-
-> > dpkg-source does not currently provide interfaces that look like th=
-ey
-> > are intended for what I want to do. And dgit wants to work with ol=
-d
-> > versions of dpkg, so I don't want to block on getting such interfac=
-es
-> > added (even supposing that a sane interface could be designed, whic=
-h
-> > is doubtful).
->=20
-> Even then I'm still interested in a decription of what you'd need
-> ideally, to take into account when having a pass at cleaning up that
-> part of the interface. I think you could be interested in a cleaner
-> Dpkg::Source::* hierarchy, for the mid/long-term?
-
-For `3.0 (quilt)' explicit interfaces for applying and unapplying
-individual patches would help. But really IMO such an interface ought
-to be exposed on the command line rather than (or as well as) via a
-Perl module.
-
-Beyond that I find it hard to see what could make dgit's life easier.
-Since dgit wants to construct a commit graph representing the source
-package's innards, unless dpkg-source explicitly provides an interface
-along those lines ("please output a graph of unpacked source tree
-states and corresponding commit messages") dgit is still going to have
-to know specially about most of the source package formats.
-
-> > * dgit will untar each input tarball (other than the Debian tarball=
-).
-> >=20
-> > This will be done by scanning the .dsc for things whose names loo=
-k
-> > like (compressed) tarballs, and using the interfaces provided by
-> > Dpkg::Compression to get at the tarball.
->=20
-> Hmm, Dpkg::Source::Archive is currently private, but I might have a
-> look at making it public if that would be helpful here.
-
-I think the amount of logic I would have to replicate is minimal.
-
-> > * As currently, dgit will take steps so that none of the git trees
-> > discussed above contain a .pc directory.
->=20
-> As long as the directory does not disappear from the working tree,
-> that should work.
-
-Right, indeed it won't.
-
-Thanks for your comments. I feel unblocked :-).
-
-Ian.
-
---=20
-Ian Jackson <ijackson@chiark.greenend.org.uk> These opinions are my o=
-wn.
-
-If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
-a private address which bypasses my fierce spamfilter.
+ some relatively straightforward processing of filenames found in
+ .dsc files.
+* dgit now depends on dpkg-source -x --skip-patches followed by
+ dpkg-source --before-build being the same as dpkg-source -x
+ (for `3.0 (quilt)').
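
The parent-ordering rule described above (main orig tarball leftmost,
debian tarball rightmost, no parents when the .dsc lists only one
file) can be illustrated with a small sketch. This is not dgit's code
(dgit is written in Perl). The function name parent_order, the
filename patterns and the set of compression suffixes are assumptions
made for illustration only.

```python
import re

# Assumption: tarballs are recognised purely by filename, and only these
# compression suffixes are considered. The real logic lives in dgit.
_COMP = r"(\.(gz|bz2|xz|lzma))?$"
_TARBALL = re.compile(r"\.tar" + _COMP)
_ORIG_MAIN = re.compile(r"\.orig\.tar" + _COMP)
_DEBIAN = re.compile(r"\.debian\.tar" + _COMP)


def parent_order(dsc_files):
    """Return the tarballs from a .dsc file list in commit-parent order.

    Hypothetical helper: main orig tarball first, additional component
    tarballs next (in .dsc order), debian tarball last. A .dsc with a
    single file entry yields no parents at all.
    """
    if len(dsc_files) <= 1:
        # The lone tarball is not imported separately.
        return []

    # Ignore anything that is not a tarball (.diff.gz, .asc signatures).
    tarballs = [name for name in dsc_files if _TARBALL.search(name)]

    def rank(name):
        if _ORIG_MAIN.search(name):
            return 0  # leftmost parent
        if _DEBIAN.search(name):
            return 2  # rightmost parent
        return 1      # component tarballs such as .orig-docs.tar.gz

    # sorted() is stable, so equal-rank entries keep their .dsc order.
    return sorted(tarballs, key=rank)


if __name__ == "__main__":
    files = [
        "foo_1.0-1.debian.tar.xz",
        "foo_1.0.orig-docs.tar.gz",
        "foo_1.0.orig.tar.gz",
        "foo_1.0.orig.tar.gz.asc",
    ]
    for name in parent_order(files):
        print(name)
```

Each name returned would correspond to one parentless tarball commit,
and the list order would be the parent order of the commit made from
the dpkg-source -x --skip-patches output.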