[wishlist] git-remote-gcrypt - "shared" mode?
lists-sgo-software-discuss at onerussian.com
Wed Sep 23 19:47:05 BST 2020
NB: this is a re-send of what was initially a private email. The follow-up
from Joey Hess (who was CCed) is quoted at the end. FWIW: it is OK with me
for the "passphrase" to be the sole source of entropy, but it must not be
present in any repository; it would be provided "programmatically".
Hi Sean,
Thanks for taking care of maintaining git-remote-gcrypt!
I think we have never corresponded. A fellow DataLad/Debian/NeuroDebian/...
developer here who pesters Joey quite a bit about git-annex ;)
CCing Joey since he might have ideas as well and it might need
some work on the git-annex end (but I didn't want to pollute
git-annex's branchable yet).
Big goal:
In the course of the DataLad project I want to get git/git-annex repos
with protected data deposited publicly, e.g. on github.com +
LFS/S3/whatever, fully encrypted, so that only people with access to the
"origin" of the data (could be the NIH data archive, etc.) would be able
to decrypt the git objects and the git-annex "data".
I do not want to "manually" register any keys, or otherwise manage
those encrypted clones besides an occasional "push" of
updates. The ability to get data should depend solely on the users'
ability to access/authenticate to the original data source.
I was thinking of using git-annex with encryption on the remote for
annexed content, and then publishing git objects encrypted to github via
gcrypt::.
I would like to get a setup similar to git-annex's "shared" mode
(https://git-annex.branchable.com/encryption/#index2h2) BUT with a key
having a passphrase, so the key could be shared publicly, but getting
access to the key alone would not be enough.
That passphrase could then be a sha512 (or any other long hash) of some
content (e.g. a table with filenames fetched from NIH) which users
would first need to fetch and provide to a helper tool to get that hash.
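To illustrate the idea, here is a minimal sketch of such a helper, assuming
the "manifest" is available as a byte string. The function name
passphrase_from_manifest and the example file names are hypothetical; no such
tool exists in git-remote-gcrypt, git-annex, or DataLad as far as this message
is concerned.

```python
import hashlib


def passphrase_from_manifest(manifest: bytes) -> str:
    """Return the SHA-512 hex digest of the manifest bytes.

    Hypothetical helper: the digest would serve as the passphrase.
    """
    return hashlib.sha512(manifest).hexdigest()


# Made-up manifest content, standing in for a table of filenames
# fetched from the original data source.
manifest = b"sub-01/anat/T1w.nii.gz\nsub-01/func/bold.nii.gz\n"
passphrase = passphrase_from_manifest(manifest)
print(len(passphrase))  # 128 hex characters
```

For this to work, every user would have to obtain a byte-identical manifest
(same ordering, line endings, and encoding), otherwise the hashes differ.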
In the longer run we could provide convenience at the DataLad level (e.g.
soon the NIH archive will have an API, DataLad can authenticate to it, we
can fetch the desired "manifest", compute the passphrase and be able to
decrypt).
Do you think something like "shared" mode could be worked out for gcrypt
where the key is also shipped along but would require a passphrase to be
made available to it somehow (e.g. via envvar)?
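As a rough sketch of the "via envvar" part only: the snippet below reads a
passphrase from an environment variable and hands it to gpg on stdin. The
variable name GCRYPT_PASSPHRASE and both function names are assumptions made
up for illustration; git-remote-gcrypt has no such setting. It assumes
GnuPG 2.1 or later for --pinentry-mode loopback.

```python
import os
import subprocess

# Hypothetical variable name, not an existing git-remote-gcrypt setting.
ENV_VAR = "GCRYPT_PASSPHRASE"


def gpg_decrypt_argv(encrypted_path: str, output_path: str) -> list:
    """Build a gpg command line that reads the passphrase from stdin (fd 0)."""
    return [
        "gpg", "--batch", "--yes",
        "--pinentry-mode", "loopback",
        "--passphrase-fd", "0",
        "--output", output_path,
        "--decrypt", encrypted_path,
    ]


def decrypt_with_env_passphrase(encrypted_path, output_path, env=os.environ):
    """Decrypt a file with gpg, taking the passphrase from the environment."""
    passphrase = env.get(ENV_VAR)
    if not passphrase:
        raise RuntimeError(f"{ENV_VAR} is not set")
    # The passphrase goes to gpg's stdin, so it never appears in argv.
    subprocess.run(
        gpg_decrypt_argv(encrypted_path, output_path),
        input=(passphrase + "\n").encode("utf-8"),
        check=True,
    )
```

Passing the passphrase over stdin rather than as a command-line argument
keeps it out of the process list.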
Joey, could git-annex's special remote shared mode of encryption
also acquire a passphrase?
Or maybe you see a better way to accomplish the mission?
While at it, a minor related wishlist request for gcrypt:: remotes:
could it be made possible to specify paths within the repo to be shared
as-is (without encryption), i.e. to be "pushed" from e.g. a configurable
path within the repository without applying any encryption to them?
Then I would specify README.md, which users would see, and where we
could then provide a dataset description and a summary of how to get it
installed.
Reply from Joey:
yoh at onerussian.com wrote:
> Do you think something like "shared" mode could be worked out for gcrypt
> where the key is also shipped along but would require a passphrase to be
> made available to it somehow (e.g. via envvar)?
The passphrase would have to be the entire source of entropy, there
can't be another key that's usefully mixed with the passphrase unless
it's shipped separately from the repo.
> Joey, Could git-annex's special remote shared mode of encryption
> also acquire a passphrase?
git-annex's shared mode uses gnupg with a symmetric key, which gnupg
does call a passphrase, but of course that's hidden from the git-annex
user. There's no particular reason another mode couldn't also use a
passphrase, either as the full entropy or combined with a shared encryption
key. Combining might be useful to allow weaker passwords to be used.
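To make the "combined" option concrete, here is one possible sketch using
PBKDF2 from Python's standard library, with the shared key acting as the
salt. This is an assumption chosen for illustration, not how git-annex
implements any of its encryption modes; derive_key, the placeholder key, and
the iteration count are all made up.

```python
import hashlib


def derive_key(passphrase: str, shared_key: bytes, iterations: int = 200_000) -> bytes:
    """Mix a passphrase with a shared key via PBKDF2-HMAC-SHA512.

    The shared key is used as the salt; the result is 64 bytes long.
    """
    return hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), shared_key, iterations
    )


# Placeholder shared key; a real one would be random.
shared_key = bytes.fromhex("00112233445566778899aabbccddeeff")
key = derive_key("correct horse battery staple", shared_key)
print(len(key))  # 64
```

As noted above, the shared key only contributes to security if it is shipped
separately from the repo; if it sits next to the encrypted data, the
passphrase remains the entire source of entropy.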
Cheers,
--
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
WWW: http://www.linkedin.com/in/yarik