At his blog, Chris Ball announces "GitTorrent," his new project designed to let developers host Git repositories on BitTorrent. The system takes advantage of Git's ability to run over arbitrary network protocols. "We ask for the commit we want and connect to a node with BitTorrent, but once connected we conduct this Smart Protocol negotiation in an overlay connection on top of the BitTorrent wire protocol, in what’s called a BitTorrent Extension. Then the remote node makes us a packfile and tells us the hash of that packfile, and then we start downloading that packfile from it and any other nodes who are seeding it using Standard BitTorrent. We can authenticate the packfile we receive, because after we uncompress it we know which Git commit our graph is supposed to end up at; if we don’t end up there, the other node lied to us, and we should try talking to someone else instead." The project is, obviously, a new one that still has important ground to cover—such as dealing with comments or pull requests—but there are interesting ideas to consider already.
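The authentication step in the quoted passage can be sketched in a few lines. This is not GitTorrent's code, only a simplified illustration: it assumes the packfile has already been unpacked into (type, body) pairs, and the helper names `git_object_id` and `graph_ends_at` are hypothetical. The one firm fact used is how Git names an object: the SHA-1 of a `<type> <length>\0` header followed by the object body.

```python
import hashlib
from typing import Iterable, Tuple


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the Git object id: SHA-1 over '<type> <len>\\0' + body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def graph_ends_at(wanted_commit: str,
                  received: Iterable[Tuple[str, bytes]]) -> bool:
    """Hypothetical check: did the peer send an object whose id is the
    commit we asked for? A real client would also walk the parents and
    trees; this sketch only checks the tip."""
    return any(git_object_id(t, b) == wanted_commit for t, b in received)


# A toy commit body standing in for real unpacked data (assumption).
commit_body = b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n\ninitial\n"
wanted = git_object_id("commit", commit_body)

honest_peer = [("blob", b"hello\n"), ("commit", commit_body)]
lying_peer = [("blob", b"hello\n")]

print(graph_ends_at(wanted, honest_peer))
print(graph_ends_at(wanted, lying_peer))
```

Because the object id is derived from the content, a peer cannot supply different bytes under the requested commit id without the mismatch being detected.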
Announcing GitTorrent: A Decentralized GitHub
Posted May 29, 2015 21:12 UTC (Fri) by rillian (subscriber, #11344) [Link]
Announcing GitTorrent: A Decentralized GitHub
Posted May 30, 2015 1:29 UTC (Sat) by pabs (subscriber, #43278) [Link]
Also, when do we get git-remote-ipfs?
Announcing GitTorrent: A Decentralized GitHub
Posted May 29, 2015 22:35 UTC (Fri) by flewellyn (subscriber, #5047) [Link]
Announcing GitTorrent: A Decentralized GitHub
Posted May 29, 2015 22:53 UTC (Fri) by dlang (✭ supporter ✭, #313) [Link]
Which isn't to say that it's not worthwhile, just that the headline is bad.
Announcing GitTorrent: A Decentralized GitHub
Posted May 30, 2015 21:40 UTC (Sat) by hirnbrot (subscriber, #89469) [Link]
Announcing GitTorrent: A Decentralized GitHub
Posted May 31, 2015 0:04 UTC (Sun) by dlang (✭ supporter ✭, #313) [Link]
no, "github" is the set of services provided by a specific company. It's nowhere close to "everything remotely related to git"
Announcing GitTorrent: A Decentralized GitHub
Posted Jun 1, 2015 15:35 UTC (Mon) by flussence (subscriber, #85566) [Link]
And on closer inspection, all this does is distribute the bandwidth-heavy part of cloning a git repository - everything else is just as hub-like as before.
Announcing GitTorrent: A Decentralized GitHub
Posted Jun 1, 2015 22:01 UTC (Mon) by paulj (subscriber, #341) [Link]
Announcing GitTorrent: A Decentralized GitHub
Posted Jun 2, 2015 8:54 UTC (Tue) by fb (subscriber, #53265) [Link]
[...]
Lack of ways to serve code over "git://" is not what makes GitHub popular.
The power of GitHub lies in the way its platform gives you a formal & standard way to do pull-requests. It has a well defined way to send a PR to a given project owner, and a well defined way to comment back and forth on the pull-request. Same for bugs.
The fact that it is the very same identical interface for all projects I interact with (and that such interface is "good enough") is its killer feature.
How do these folks expect a PR to take place? Over a mailing list?
Announcing GitTorrent: A Decentralized GitHub
Posted Jun 2, 2015 7:42 UTC (Tue) by fb (subscriber, #53265) [Link]
Announcing GitTorrent: A Decentralized GitHub
Posted May 30, 2015 7:34 UTC (Sat) by graemes (subscriber, #3788) [Link]
No IPv6 support, no UDP holepunching
Posted May 30, 2015 15:42 UTC (Sat) by jch (guest, #51929) [Link]
Announcing GitTorrent: A Decentralized GitHub
Posted May 31, 2015 15:13 UTC (Sun) by jond (subscriber, #37669) [Link]
Announcing GitTorrent: A Decentralized GitHub
Posted May 31, 2015 22:47 UTC (Sun) by njwhite (subscriber, #51848) [Link]
As I read it this just uses bittorrent's DHT functionality to find hosts with the needed repo and then downloads a packfile from one of them, rather than parts of a packfile from multiple hosts at once. In which case the speed that comes from downloading from many peers in parallel with 'normal' bittorrent usage isn't present. But perhaps I just missed that part?
Announcing GitTorrent: A Decentralized GitHub
Posted Jun 1, 2015 4:32 UTC (Mon) by Otus (subscriber, #67685) [Link]
So it should download from multiple nodes using normal BitTorrent. However, will other nodes have *that* particular packfile? Will they know to also create it somehow or is parallelism dependent on someone else having requested that exact set of changes before?
I haven't looked at the code yet, so I don't know.
Announcing GitTorrent: A Decentralized GitHub
Posted Jun 1, 2015 5:43 UTC (Mon) by zenaan (subscriber, #3778) [Link]
Example git pack file parameter/ configuration variations:
- compression on X number of CPUs
- maximum packfile size
- more trees/ branches in this repo than on that repo
A way is needed to capture these packfile config variations and distribute them to other git servers (perhaps on a standardized branch name or ??).
In this way many "git servers" (or git torrent clients) can participate in a parallel/ multi server "git torrent". E.g. if "nrcpus" is set to say 8, a dual cpu git mirror needs to be able to reproduce the same packfile as though it too had 8 cpus. And if it's your "fork" of the "master repo", then any branches you add to your fork need to be separated into separate pack files, so that your primary pack files match the master repo's pack files (in order to participate in the git torrent swarm). This type of setup could potentially be very useful for deduplication on a site like github - though one might expect they already have some solution for this (git clone -s ?).
Unless a particular set of "pack file" parameters is standardized, participation in any particular git torrent would just require designation of the "master" repo - so the pack file params are obtained from it. And come to think of it, the "master"'s branches need to be designated anyway.
"Non master" repos could of course use their own pack file parameters, but would not be able to participate in the swarm.
It doesn't sound too hard to conceptualize, so one would hope it's possible to implement this.
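Why the parameters have to match can be shown with a small sketch. It does not build a real Git packfile; it assumes that plain zlib compression of a byte string stands in for packing, and `pack_digest` is a hypothetical helper. The point is only that the same content compressed with different settings yields different bytes, and therefore a different hash for the swarm to agree on.

```python
import hashlib
import zlib


def pack_digest(payload: bytes, level: int) -> str:
    """Hash the compressed form of payload, as a stand-in for hashing a
    packfile produced with a given compression setting."""
    return hashlib.sha1(zlib.compress(payload, level)).hexdigest()


# Stand-in for repository content (assumption, not real Git objects).
payload = b"tree and blob contents would go here\n" * 200

fast = pack_digest(payload, 1)   # mirror configured for speed
small = pack_digest(payload, 9)  # mirror configured for size
again = pack_digest(payload, 9)  # second mirror, same setting as 'small'

print("level 1 vs level 9 agree:", fast == small)
print("level 9 vs level 9 agree:", small == again)
```

Two mirrors holding identical commits can only seed the same torrent if the second comparison, not the first, describes their setup.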
Announcing GitTorrent: A Decentralized GitHub
Posted Jun 1, 2015 12:22 UTC (Mon) by pclouds (subscriber, #76590) [Link]
Conceptually it may not be hard, but implementation is hard. By forcing certain object layout rules, you may have a lower compression ratio, or slower pack access, and may consume more power. Git tries to reuse deltas from existing packs to produce a new pack. This makes it quick to assemble a pack, but also nondeterministic. There are also threads stealing jobs from one another in the above link. Resumable clone is a frequent request, and we still don't have it now.
Announcing GitTorrent: A Decentralized GitHub
Posted Jun 2, 2015 4:59 UTC (Tue) by zenaan (subscriber, #3778) [Link]
Also, for scenarios which benefit from pack file torrents, the marginal reduction in compression (increase in pack file size) due to the need for determinism may very well be a price worth paying (marginal increase in local storage in order to distribute downloads) - local policy strikes again.
As long as my local mirror wishes to maintain repo torrent participation, then when the authoritative server tells me it is choosing a new parameter set, then I have all the commits and the new compression parameter set, so I can re-pack. [Although it may make sense to have a new "torrent ID" (dunno what that's called sorry) - either way, participating servers can locally regenerate the torrent pack files when this is deterministic.]
It's up to the "authoritative git server" admin to make the policy decision of how long to keep with a current deterministic torrentable pack file parameter set, and when to update to a new/tuned set. This is always a local policy matter! "We can't do that because it's not the best policy for maximum compression" is not the right answer here...
As "deterministic pack file parameter set" is tuned, this is simply a new version of the deterministic pack file format. A git torrent server provides its current set of parameters to others who have configured this server to be authoritative for this repo.
The parameters e.g. pack file size, compression version, branch set included by this server etc, are all server local (or "authoritative server"-local to be precise). So any torrent scenario implies an "authoritative server" for a particular repo. If I am a torrent repo mirror, the "authoritative torrent upstream" is merely a local config.
This not only sounds easy, it is easy - even in the face of compression technology changes and "tuning" over time - that's merely a "version" increase or new parameter set provided by the "authoritative" repo server, and is local policy to that server.
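The "version increase or new parameter set" idea could look roughly like the following sketch. Everything here is hypothetical: the parameter names, the branch tip values, and the `swarm_id` helper are illustrations only, not part of Git or GitTorrent. The sketch assumes the authoritative server publishes its parameter set and branch tips, and that mirrors derive a shared identifier from a canonical serialization of both.

```python
import hashlib
import json


def swarm_id(params: dict, branch_tips: dict) -> str:
    """Derive an identifier from the authoritative server's published
    settings. Sorting keys makes the result independent of dict order."""
    canonical = json.dumps(
        {"params": params, "tips": branch_tips},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


# Hypothetical parameter set published by the authoritative server.
upstream = {"version": 1, "max_pack_size": "2g", "compression": 9}
# The same settings as read by a mirror, in a different key order.
mirror = {"compression": 9, "version": 1, "max_pack_size": "2g"}
tips = {"refs/heads/master": "placeholder-commit-id"}

same_swarm = swarm_id(upstream, tips) == swarm_id(mirror, tips)

# A tuned parameter set gets a new version and so a new identifier.
tuned = dict(upstream, version=2, compression=6)
new_swarm = swarm_id(tuned, tips) != swarm_id(upstream, tips)

print(same_swarm, new_swarm)
```

A mirror that wants to stay in the swarm would repack whenever the identifier published upstream changes. Whether the repack reproduces the same bytes is a separate question, and that is the determinism problem raised earlier in the thread.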
Copyright © 2015, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds