Re: rfc: why are we still distributing the portage tree via rsync?
From: Rich Freeman <rich0-AT-gentoo.org>
To: gentoo-dev <gentoo-dev-AT-lists.gentoo.org>
Subject: Re: rfc: why are we still distributing the portage tree via rsync?
Date: Tue, 3 Jul 2018 11:38:03 -0400
Message-ID: <CAGfcS_kzzo4gkGKrORnuRw6hxvPJzwy6-2OL5A-7Rdgki0Un0w@mail.gmail.com>

On Tue, Jul 3, 2018 at 11:22 AM William Hubbs <williamh@gentoo.org> wrote:
>
> Mostly because of the recent "trustless infrastructure" thread, I am
> wondering why we are still distributing the portage tree primarily
> via rsync instead of git?
>
> Can someone educate me on that, and is it worth considering moving away
> from rsync distribution?
>

Here are the pros/cons that I've seen come up in the past:

1. emerge-webrsync is probably more secure at the moment, because
emerge --sync with git leaves the tree corrupt if it doesn't verify.
That seems like something that could be fixed, and which should be
fixed regardless (presumably somebody just has to do the work - I
can't imagine the portage team would turn away patches).

2. git seems to be more efficient for frequent syncing, while rsync
seems to be more efficient for infrequent syncing. I'd guess the
crossover is somewhere around a week or a few weeks, but I don't have
data to support that.

3. we have more rsync mirrors, though with the possibility of using
mirrors like github I don't see why this matters (as long as we
actually secure distribution).

4. by default git tends to accumulate history, which can eat up disk
space. I imagine this could be automatically trimmed if users wanted,
though during syncing it would at least need to store all the commits
between the last-fetched and next-fetched, and that means fetching
things that might have been subsequently removed/changed.

Personally I stick with git. I want the history anyway, and since I
sync frequently it involves WAY less disk IO and seems to be very
network-efficient as well.

--
Rich