|From:||Junio C Hamano <gitster-AT-pobox.com>|
|To:||Linus Torvalds <torvalds-AT-linux-foundation.org>|
|Subject:||Re: [git patches] libata updates, GPG signed (but see admin notes)|
|Date:||Wed, 09 Nov 2011 09:26:42 -0800|
|Cc:||"Ted Ts'o" <tytso-AT-mit.edu>, Shawn Pearce <spearce-AT-spearce.org>, git-AT-vger.kernel.org, James Bottomley <James.Bottomley-AT-hansenpartnership.com>, Jeff Garzik <jeff-AT-garzik.org>, Andrew Morton <akpm-AT-linux-foundation.org>, linux-ide-AT-vger.kernel.org, LKML <linux-kernel-AT-vger.kernel.org>|
Linus Torvalds <firstname.lastname@example.org> writes:

> No, no, don't consider my "put in the merge message" a winner at all.
>
> I personally dislike it, and don't really think it's a wonderful thing
> at all. It really does have real downsides:
>
>  - internal signatures really *are* a disaster for maintenance. You
> can never fix them if they need fixing (and "need fixing" may well be
> "you want to re-sign things after a repository format change")
>
>  - they are ugly as heck, and you really don't want to see them in
> 99.999% of all cases.
>
> So putting those things in the merge commit message may have some
> upsides, but it has tons of downsides too.
>
> I think your refs/audit/ idea should be given real thought, because
> maybe that's the right idea.

With the latest round of touch-ups, modulo a few bugs I will be fixing
before the 1.7.8 final, I think what we have is more or less OK in the
shorter term and should be ready for general consumption. The ugliness
is gone, but the issue around internal signatures may remain to be
solved in the longer term. At least, by storing the full contents of
the tag today in an extended header, when we figure out how a detached
signature should really work, we could convert by extracting them from
the history.

In a separate message earlier in the thread, you raised another issue.

> I hate how anonymous our branches are. Sure, we can use good names for
> them, but it was a mistake to think we should describe the repository
> (for gitweb), rather than the branch.
>
> Ok, "hate" is a strong word. I don't "hate" it. I don't even think
> it's a major design issue. But I do think that it would have been
> nicer if we had had some branch description model.

At first glance, our branch model is indeed peculiar in that a branch
does not have a global identity. The scope of its name is local to the
repository, and it is just a pointer into the history.

A "note" [*1*] that can annotate a commit long after the commit is made
is not a good way to describe what a branch is about, because the tip
of the branch can advance beyond the commit that is annotated by such a
note. A commit on a branch does not serve as a good anchoring point to
describe the branch.

However, a commit that merges the history of a branch, whether the
merged branch is from a local repository or from a remote one, does
serve as a good anchoring point. The work on a branch is as complete as
possible at the time of the merge, and the committer who merges the
branch agrees with both the objective and the implementation of the
work done on the branch, and that is why the merge is made [*2*].
Describing what the history of the side branch was about in the
resulting merge is a perfectly sensible way to explain the branch.

So in that sense, I am very happy with the way the merge message
template uses the pull request tag to let the lieutenant explain and
defend the history behind the tag used for the pull request. Such an
explanation does not have to be keyed with anybody's local branch name
(e.g. "for-linus" would mean different things for different pull
requests even from the same person), but keying it with the resulting
merge commit is a sensible way to leave the record in the history.

After justifying with the above two paragraphs that it is perfectly
sensible to record the annotations on commits and not on "branch
names", I do agree that we would eventually want to be able to have
such annotations on commits after the fact.

Neither "tags" nor "notes" is necessarily a very good mechanism,
however, for the purpose of "signed pull requests" and "signed commits"
[*3*]. Here are some pros and cons:

 - tags must be named, but the only thing we need is to be able to look
   the contents (with signature if signed) up given a commit object.
   Unlike the usual "I want to check out v3.0 release" look-up that
   goes from tag names to the commits, annotation look-ups go the other
   way, do not have to have a tagname, and having a tagname does not
   help our look-up in any way.

   If we want to use tags to annotate various commits by various people
   and keep them around, we would need a global namespace that would
   not cause them to clash (we can work this around by using the object
   name of the tag, e.g. renaming 'for-linus' tag to
   $(git rev-parse tags/for-linus), but that is merely a workaround of
   having to name things that do not have to be named in the first
   place).

   As a local storage machinery for annotations, tags hanging below the
   refs/tags/ (or refs/audit for that matter) hierarchy with their own
   names are an inappropriate model.

 + tags can auto-follow the commits when object transfer happens (at
   least in the fetch direction), and for the purpose of "signed pull
   requests" and "signed commits", this is a desirable property.

   When a repository gains a commit, the annotations attached to the
   commit that are missing from the receiving repository are
   automatically transferred from the place the commit comes from.
   Annotations given to other commits that are not transferred into the
   repository do not come to the repository.

 - "git notes" is represented as a commit that records a tree that
   holds the entire mapping from commit to its annotations, and the
   only way to transfer it is to send it together with its history as a
   whole. It does not have the nice auto-following property that
   transfers only the relevant annotations.

 + "git notes" maps commits to their annotations in the right
   direction: from the object name of an annotated object to its
   annotation.

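To make the look-up direction discussed in the list above concrete, here
is a minimal Python sketch. It is not git code: the two dictionaries,
the function names and the placeholder object names are all
hypothetical, chosen only to show why a store keyed by name answers a
different question from the one annotations ask.

```python
# Hypothetical in-memory stand-ins; neither reflects git's on-disk format.
tags = {                      # tag name -> object name of the tagged commit
    "v3.0": "commit-1",
    "for-linus": "commit-2",
}
annotations = {               # object name of a commit -> its annotations
    "commit-2": ["blob-a", "blob-b"],
}


def commit_for_tag(name):
    """Release-style look-up: start from a name, end at a commit."""
    return tags[name]


def tags_for_commit(commit):
    """Going the other way through a name-keyed store scans every name."""
    return sorted(name for name, target in tags.items() if target == commit)


def annotations_for_commit(commit):
    """Annotation look-up: keyed by the annotated object, no name needed."""
    return annotations.get(commit, [])
```

In this toy model the name "for-linus" plays no part in answering "what
is attached to commit-2?", which is the point made about tagnames above.
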
In the longer term, I think we would need to extend the system in the
following way:

 - Introduce a mapping mechanism that can be locally used to map names
   of the objects being annotated to names of other objects (most
   likely blobs but there is nothing that fundamentally prevents you
   from annotating a commit with a tree). The current "git notes" might
   be a perfectly suitable representation of this, or it may turn out
   to be lacking (I haven't thought things through), but the important
   point is that this "mapping store" is _local_. fsck, repack and
   prune need to be told that objects that store the annotation are
   reachable from the annotated objects.

 - Introduce a protocol extension to transfer this mapping information
   for objects being transferred in an efficient way. When "rev-list
   --objects have..want" tells us that the receiving end (in either
   fetch/push direction) would have an object at the end of the primary
   transfer (note that I did not say "an object will be sent in this
   transfer transaction"; "have" does not come into the picture), we
   make sure that missing annotations attached to the object are also
   transferred, and the new mapping is registered at the receiving end.

The detailed design for the latter needs more thought. The
auto-following of tags works even if nothing is being fetched in the
primary transfer (i.e. "git fetch" && "git fetch" back to back to
update our origin/master with the master at the origin) when a new tag
is added to an ancient part of the history that leads to the master at
the origin, but this is exactly because the sending end advertises all
the available tags and the objects they point at so that we can tell
which new tags added to an old object are missing from the receiving
end. This obviously would not scale well when we have tens of thousands
of objects to annotate.

Perhaps an entry in the "mapping store" would record:

 - The object name of the object being annotated;

 - The object name of the annotation;

 - The "timestamp", i.e.
   when the association between the above two was made--this can be
   local to the repository and a simple counter would do.

and also maintain the last "timestamp" at which this repository sent
annotations to the remote (one timestamp per remote repository).

When we push, we would send annotations pertaining to the objects
reachable from what we are pushing (not limited by what they already
have, as the whole point of this exercise is to allow us to transfer
annotations added to an object long after the object was created and
sent to the remote) that are newer than that "timestamp".

Similarly, when fetching, we would send the "timestamp" at which this
repository last fetched annotations from the other end (which means we
would need one such "timestamp" per remote repository) and let the
remote side decide the set of new annotations they added since we last
synched that are on objects reachable from what we "want".

Or something like that.


[Footnote]

*1* By this word, I do not necessarily mean what the "git notes"
command manipulates. A tag that points at a commit is also equally a
good vehicle to annotate a commit after the fact.

*2* For this reason, it may make sense to "commit -S" such a merge
commit. The "mergetag" asserts the authenticity of the pull request
from the lieutenant whose history is being integrated, and the "gpgsig"
asserts the authenticity of the merge itself--the fact that it was made
by the integrator.

*3* I do not mean what "git commit -S" parked in 'pu' produces, which
is to store the signature in the commit. Adding "Signed-off-by:" after
the fact to an existing commit by many people is a more appropriate
example.
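
The "mapping store" entry and the per-remote "timestamp" proposed in
the message can be pictured with a small Python sketch. This is only an
illustration under stated assumptions, not an existing git interface:
the class and method names are hypothetical, the object names are
placeholders, and the set of reachable objects is simply handed in by
the caller, standing in for whatever a real reachability walk would
compute.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entry:
    annotated: str    # object name of the object being annotated
    annotation: str   # object name of the annotation
    stamp: int        # value of the local counter when the pair was recorded


@dataclass
class MappingStore:
    """Hypothetical local store; a plain counter serves as the timestamp."""
    entries: list = field(default_factory=list)
    counter: int = 0
    last_sent: dict = field(default_factory=dict)  # remote -> last stamp sent

    def annotate(self, annotated, annotation):
        self.counter += 1
        entry = Entry(annotated, annotation, self.counter)
        self.entries.append(entry)
        return entry

    def to_push(self, remote, reachable):
        """Entries newer than the last push to `remote` on reachable objects.

        The filter deliberately ignores what the remote already has, so an
        annotation added to an old object is still selected.
        """
        since = self.last_sent.get(remote, 0)
        return [e for e in self.entries
                if e.stamp > since and e.annotated in reachable]

    def mark_pushed(self, remote):
        self.last_sent[remote] = self.counter


store = MappingStore()
store.annotate("commit-a", "blob-1")
store.annotate("commit-b", "blob-2")
first = store.to_push("origin", {"commit-a", "commit-b"})
store.mark_pushed("origin")

# An annotation attached long after commit-a was first sent.
store.annotate("commit-a", "blob-3")
second = store.to_push("origin", {"commit-a", "commit-b"})
```

Under these assumptions the second push selects only the entry recorded
after the first one. The fetch direction described in the message would
mirror this, with the fetching side sending its stored "timestamp" and
the remote applying the same filter to its own store.
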
Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds