[PATCH] Add support for git option: `pack.packSizeLimit`
Cathy J. Fitzpatrick
cathy at cathyjf.com
Fri Dec 15 12:27:19 GMT 2023
The standard `git-config(1)` setting of `pack.packSizeLimit` is used to
specify the maximum size of a pack file when repacking a repository.
This setting is important when working with a remote git hosting
provider that imposes a maximum file size on files stored on the remote
server. For example, GitHub currently imposes a maximum size of 100 MiB
per file stored on its servers.
Until now, git-remote-gcrypt has ignored the `pack.packSizeLimit`
setting when repacking the repository. This setting has been ignored
because git-remote-gcrypt has supplied the `--stdout` flag to
`git-pack-objects(1)`, and that flag implicitly causes
`git-pack-objects(1)` to ignore the value of `pack.packSizeLimit`.
This patch modifies git-remote-gcrypt so that it will respect the value
of `pack.packSizeLimit`. This is achieved by modifying the invocation of
`git-pack-objects(1)` so that the `--stdout` argument is not supplied.
Instead, the pack files are written to the same temporary directory that
git-remote-gcrypt already uses for other purposes.
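As a rough illustration of the behavior described above (not part of this
patch), the following sketch packs a few large blobs to a path prefix with
a size limit in effect. The scratch repository, the blob sizes, and the
`1m` limit are assumptions chosen only for the demonstration; the number
of packs produced may vary.

```shell
#!/bin/sh
# Sketch only: the scratch repository, blob sizes, and the 1m limit are
# illustrative assumptions, not values taken from the patch.
set -e
command -v git >/dev/null 2>&1 || { echo "git not found; skipping"; exit 0; }

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Three blobs of random (incompressible) data, about 700 KiB each.
: > "$work/objlist"
for i in 1 2 3
do
    dd if=/dev/urandom of="$work/blob$i" bs=1024 count=700 2>/dev/null
    git hash-object -w "$work/blob$i" >> "$work/objlist"
done

# Writing to a path prefix (rather than --stdout) lets pack.packSizeLimit
# take effect; one hash is printed per pack file written.
hashes=$(git -c pack.packSizeLimit=1m pack-objects -q "$work/pack_raw" \
    < "$work/objlist")
pack_count=$(printf '%s\n' "$hashes" | wc -l | tr -d ' ')

echo "packs written: $pack_count"
ls "$work"/pack_raw-*.pack

# Clean up the scratch directory.
cd /
rm -rf "$work"
```

Each printed hash corresponds to one `pack_raw-<hash>.pack` file, which is
why the caller has to be prepared for more than one result.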
The code that invokes `git-pack-objects(1)` is also modified to handle the
possibility that more than one pack file might be produced (if the size
of the pack would exceed the value of `pack.packSizeLimit`). Previously,
git-remote-gcrypt was able to assume that `git-pack-objects(1)` would
always produce exactly one pack file, but with this patch, that is no
longer the case if the user has specified `pack.packSizeLimit`. To
address this, it was necessary to introduce a loop in two places, to
iterate over each of the generated pack files, instead of assuming that
there would always be exactly one pack file.
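The two loops can be sketched in isolation as follows. This is a
simplified illustration rather than the code in the patch: the hashes,
the `id$n` identifiers, and the printed `PUT` lines are made-up stand-ins
for the real pack hashes, encrypted pack IDs, and uploads.

```shell
#!/bin/sh
# Hypothetical stand-in for the newline-separated hashes printed by the
# packing step; the values are made up for illustration.
pack_hashes="aaa111
bbb222
ccc333"
nl='
'

# First loop: one iteration per generated pack. Feeding the loop from a
# here-document keeps it in the current shell, so variables assigned
# inside the loop are still set afterwards.
pairs=
n=0
while IFS= read -r hash
do
    n=$((n + 1))
    pairs="${pairs}${pairs:+$nl}id$n:$hash"
done <<EOF
$pack_hashes
EOF

# Second loop: split each "id:hash" record on the colon.
uploaded=$(printf '%s\n' "$pairs" | while IFS=: read -r id hash
do
    printf 'PUT %s from packP-%s\n' "$id" "$hash"
done)

printf '%s\n' "$uploaded"
```

The first loop reads from a here-document rather than a pipe because the
list it builds is needed later, outside the loop; the second loop only
performs side effects, so running it at the end of a pipeline is fine.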
This patch does not change the git-remote-gcrypt protocol in any way.
Repositories created with the new version of git-remote-gcrypt can still
be read with older versions of git-remote-gcrypt. And, of course,
repositories created with older versions of git-remote-gcrypt can be
read with the new version. The change is fully backward- and
forward-compatible. Indeed, this is true of the `pack.packSizeLimit`
setting in general. As the manual for `git-config(1)` observes,
"the git:// protocol is unaffected" by the value of
`pack.packSizeLimit`.
Although storing repositories encrypted by git-remote-gcrypt on the
servers of Git hosting services such as GitHub has a variety of
drawbacks, it is a supported use case, and it can make sense for
certain kinds of repositories. This patch makes it easier to work with
these backends by handling any maximum file size restrictions
imposed by the services, and, for maximum simplicity, the interface
for this patch relies solely on a standard `git-config(1)` setting,
namely, the `pack.packSizeLimit` setting.
This patch also modifies the README to document the behavior of the
`pack.packSizeLimit` setting as it affects git-remote-gcrypt. Finally,
this patch amends the section of the README relating to the
*GCRYPT_FULL_REPACK* environment variable to clarify that, in order
to force a repack of the repository, the variable must be set to a value
_other than the empty string_.
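The distinction between "set" and "set to a non-empty value" can be
illustrated with a small sketch. The helper `wants_full_repack` is
hypothetical and exists only for this example; it is assumed, not quoted,
that the script's own check behaves like a non-empty test.

```shell
#!/bin/sh
# Hypothetical helper (not part of git-remote-gcrypt): succeeds only when
# GCRYPT_FULL_REPACK is set to a non-empty value.
wants_full_repack()
{
    [ -n "${GCRYPT_FULL_REPACK:-}" ]
}

unset GCRYPT_FULL_REPACK
wants_full_repack && r1=yes || r1=no      # unset

GCRYPT_FULL_REPACK=
wants_full_repack && r2=yes || r2=no      # set, but empty

GCRYPT_FULL_REPACK=1
wants_full_repack && r3=yes || r3=no      # set and non-empty

echo "unset=$r1 empty=$r2 nonempty=$r3"
# prints: unset=no empty=no nonempty=yes
```

Under that assumption, exporting the variable with an empty value would
have no effect, whereas a value such as `1` would force the repack.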
Signed-off-by: Cathy J. Fitzpatrick <cathy at cathyjf.com>
---
README.rst | 17 ++++++++++++++++-
git-remote-gcrypt | 46 +++++++++++++++++++++++++++++++---------------
2 files changed, 47 insertions(+), 16 deletions(-)
diff --git a/README.rst b/README.rst
index a7c41a2..6bc91d7 100644
--- a/README.rst
+++ b/README.rst
@@ -105,11 +105,26 @@ The following ``git-config(1)`` variables are supported:
If this flag is set to ``true``, git-remote-gcrypt will refuse to push,
unless ``--force`` is passed, or refspecs are prefixed with ``+``.
+``pack.packSizeLimit``
+ This is a standard git configuration variable.
+
+ In the context of git-remote-gcrypt, this variable, if set, specifies the
+ maximum size of the packfiles to be uploaded to the backend. As in
+ standard git, this value should be an integer, optionally suffixed with
+ "k", "m", or "g". If a packfile exceeds the maximum size, it will be
+ split into several files before being uploaded. This splitting is
+ transparent to the user and does not affect use of the repository.
+
+ This variable is useful when working with a backend that imposes a maximum
+ file size, such as GitHub, which currently imposes a maximum file size of
+ 100m.
+
Environment variables
=====================
*GCRYPT_FULL_REPACK*
- When set (to anything), this environment variable forces a full repack when pushing.
+ When set (to anything other than the empty string), this environment
+ variable forces a full repack when pushing.
Examples
========
diff --git a/git-remote-gcrypt b/git-remote-gcrypt
index 7e7240f..97684aa 100755
--- a/git-remote-gcrypt
+++ b/git-remote-gcrypt
@@ -739,7 +739,8 @@ do_push()
# The manifest is encrypted.
local r_revlist= pack_id= key_= obj_= src_= dst_= \
r_pack_delete= tmp_encrypted= tmp_objlist= tmp_manifest= \
- force_passed=
+ force_passed= tmp_pack_prefix= r_new_pack_list= \
+ new_pack_object_ids= object_id=
ensure_connected
@@ -787,6 +788,7 @@ EOF
fi
fi
+ tmp_pack_prefix="$Tempdir/pack_raw"
tmp_encrypted="$Tempdir/packP"
tmp_objlist="$Tempdir/objlP"
@@ -798,17 +800,28 @@ EOF
# Only send pack if we have any objects to send
if [ -s "$tmp_objlist" ]
then
- key_=$(genkey "$Packkey_bytes")
- pack_id=$(export GIT_ALTERNATE_OBJECT_DIRECTORIES=$Tempdir;
- pipefail git pack-objects --stdout < "$tmp_objlist" |
- pipefail ENCRYPT "$key_" |
- tee "$tmp_encrypted" | gpg_hash "$Hashtype")
-
- append_to @Packlist "pack :${Hashtype}:$pack_id $key_"
- if isnonnull "$r_pack_delete"
- then
- append_to @Keeplist "keep :${Hashtype}:$pack_id 1"
- fi
+ # This will return more than one object_id if the user's git
+ # configuration includes `pack.packSizeLimit` and the size of the
+ # packfile is greater than the specified size limit. Hence, we need
+ # to iterate through the returned objects.
+ new_pack_object_ids=$(GIT_ALTERNATE_OBJECT_DIRECTORIES=$Tempdir \
+ git pack-objects "$tmp_pack_prefix" < "$tmp_objlist")
+ while IFS= read -r object_id
+ do
+ key_=$(genkey "$Packkey_bytes")
+ pack_id=$(pipefail ENCRYPT "$key_" < "$tmp_pack_prefix-$object_id.pack" | \
+ tee "$tmp_encrypted-$object_id" | gpg_hash "$Hashtype")
+ rm -f -- "$tmp_pack_prefix-$object_id.pack"
+
+ append_to @r_new_pack_list "$pack_id:$object_id"
+ append_to @Packlist "pack :${Hashtype}:$pack_id $key_"
+ if isnonnull "$r_pack_delete"
+ then
+ append_to @Keeplist "keep :${Hashtype}:$pack_id 1"
+ fi
+ done <<EOF
+$new_pack_object_ids
+EOF
fi
# Generate manifest
@@ -824,16 +837,19 @@ repo $Repoid
$Extnlist
EOF
- # Upload pack
+ # Upload pack (or packs, if applicable)
if [ -s "$tmp_objlist" ]
then
- PUT "$URL" "$pack_id" "$tmp_encrypted"
+ xecho "$r_new_pack_list" | while IFS=':' read -r pack_id object_id
+ do
+ PUT "$URL" "$pack_id" "$tmp_encrypted-$object_id"
+ rm -f -- "$tmp_encrypted-$object_id"
+ done
fi
# Upload manifest
PUT "$URL" "$Manifestfile" "$tmp_manifest"
- rm -f "$tmp_encrypted"
rm -f "$tmp_objlist"
rm -f "$tmp_manifest"
--
2.43.0