|
|
Log in / Subscribe / Register

Evans: A data model for Git (and other docs updates)

On her blog, Julia Evans writes about improving Git documentation, including a new data model man page she wrote with Marie LeBlanc Flanagan, and updates to the pages for several other Git sub-commands (add, checkout, push, and pull). As part of the process, she asked Git users to describe problems they had run into in the documentation, which helped guide the changes that she made.
I'm excited about this because understanding how Git organizes its commit and branch data has really helped me reason about how Git works over the years, and I think it's important to have a short (1600 words!) version of the data model that's accurate.

The "accurate" part turned out to not be that easy: I knew the basics of how Git's data model worked, but during the review process I learned some new details and had to make quite a few changes (for example how merge conflicts are stored in the staging area).



to post comments

Contributing via GitHub

Posted Jan 10, 2026 20:27 UTC (Sat) by epa (subscriber, #39769) [Link] (1 responses)

I used GitGitGadget to make all of my contributions. This meant that I could make a GitHub pull request (a workflow I’m comfortable with) and GitGitGadget would convert my PRs into the system the Git developers use (emails with patches attached). GitGitGadget worked great and I was very grateful to not have to learn how to send patches by email with Git.
Interesting. I wonder whether that would work for the kernel?

Contributing via GitHub

Posted Jan 10, 2026 23:54 UTC (Sat) by neggles (subscriber, #153254) [Link]

Could it? absolutely, but it requires maintainer buy-in and currently many kernel maintainers consider the difficulty of contributing via the mailing-list-patch-email workflow, to be a desirable feature, not a bug.

I don't entirely disagree; making it easier to contribute will tend to increase the volume of crappy contributions faster than high-quality ones, especially these days. But the difficulty hurdle doesn't just keep people from sending patches, it also keeps people from reviewing / testing them; the number of eyes on the mailing lists is high, but the number of eyes on a web based forge is much higher.

I don't know if it works out to a net benefit in practice, but some subsystems/maintainers seem pretty happy with their forge-based or forge-capable (via gitgitgadget style bridges) workflows, so as usual it's not a technical problem, but a social one.

Changes were road-tested on Mastodon social network

Posted Jan 11, 2026 14:18 UTC (Sun) by gdt (subscriber, #6284) [Link] (1 responses)

For me the most interesting part of the documentation was watching it happen. Julia Evans road-tested changes in Mastodon, using a straightforward set of rules to keep conversations productive. "Ship early, ship often" but for documentation, in a medium which invites feedback.

Julia Evans also writes zines on programming topics. They sometimes unsuspectingly walk into messy aspects of what should be simple tasks, such as how do you robustly add a directory to PATH? Upon which Julia hacks through the thickets on the reader's behalf.

Changes were road-tested on Mastodon social network

Posted Jan 14, 2026 7:56 UTC (Wed) by stephenjudd (guest, #3227) [Link]

Yes, I think Evans makes a huge contribution to the community through her zines and social media, making many mysteries clear for noob and elder alike.

Refusing to use PRs

Posted Jan 11, 2026 17:51 UTC (Sun) by lyda (subscriber, #7429) [Link] (13 responses)

Refusing to use PRs/MRs or whatever on Github, Gitlab or Forgejo is just annoying. It freezes out a bunch of developers.

A project I use insists on emailed changes. They lose them on a regular basis. It's hard to keep track of what changes are out there.

Emailed patches were great when there weren't other options. There are now. Folks should use them. Forgejo is a single binary - it's easier to install than a mail server. Sheesh.

Supposedly there are some folks setting up a uucp2 network. If I ever end up with a change for an "email your patch" project I'm emailing it from a uucp email address and with a uucp command to copy the patch.

Huge hat tip to Julia Evans for giving folks info on a nice tool to be able to contribute to git development, but it's a massive shame it's needed.

Refusing to use PRs

Posted Jan 11, 2026 23:52 UTC (Sun) by alx.manpages (subscriber, #145117) [Link] (12 responses)

YMMV.

I refuse to review PRs instead of email patches. I find email much easier to deal with.

And mutt(1) is an example where moving from email patches to gitlab resulted in loosing contributions.

Web-based contributions aren't a magic stick that will bump contributions in every project.

Refusing to use PRs

Posted Jan 12, 2026 0:43 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (11 responses)

What are your thoughts on this proposed tool? <https://blog.buenzli.dev/announcing-development-on-flirt> If you could locally review a PR and push it back (even a partial review) with the ability to ignore things like Github's awful behavior in the face of rebases, would that at least make it so that you could *accept* someone using a PR as a way to get patches in your inbox?

FWIW, I also dislike reviewing code on Github (I find GitLab to be far better and Forgejo has largely copied Github here :/ ). But even GitLab has its annoying behaviors I'd like to avoid if possible.

My main issue with emailed patches is that I have no idea what "state" it is in and, AFAIK, pings go into the same /dev/null bucket. So I have no idea if I need a new revision, need to ping someone else, or what. It's quite frustrating as an occasional contributor to a project.

Refusing to use PRs

Posted Jan 12, 2026 1:20 UTC (Mon) by alx.manpages (subscriber, #145117) [Link] (10 responses)

> What are your thoughts on this proposed tool? <https://blog.buenzli.dev/announcing-development-on-flirt>

You can do the same with git-range-diff(1). git-range-diff(1) is amazing. And you can incorporate it in the cover letter of patch series with git-format-patch(1)'s --range-diff.

I documented how to do this in the Linux man-pages project contributing guidelines:
<https://git.kernel.org/pub/scm/docs/man-pages/man-pages.g...>

I use it even to document my github PRs, in projects I'm forced to use github. See the 'Revisions' in this comment:
<https://github.com/shadow-maint/shadow/pull/1471#issue-37...> ('git rd' is an alias I use: 'rd = range-diff master..@{u} master@{u}..HEAD').

>If you could locally review a PR

You can review github PRs locally. Github usually has a link at the bottom of a PR with instructions to merge it manually with the command line. You can follow those to pull the PR into your local git repo. I do that when reviewing a non-trivial PR.

> with the ability to ignore things like Github's awful behavior in the face of rebases,

If the contributor is aware and willing to be nice, it can help. What I do is separate rebases from code changes. You can ignore the rebases (and even better if you provide a range-diff that shows it's a no-op), and then do a separate push for the actual changes. Bonus points again if it shows the range-diff.

> would that at least make it so that you could *accept* someone using a PR as a way to get patches in your inbox?

No; email is just so much simpler. I mean orders of magnitude.

As an exception, I can accept anything. If someone comes with a contribution and says it's their first time sending email patches, I help as much as possible, and even write the patches myself if they give me some instructions.

But as a norm, I wouldn't.

> My main issue with emailed patches is that I have no idea what "state" it is in

Agree. That's a problem with the reviewers. I personally always explicitly confirm by email when I've applied a patch, so that contributors are aware that it's done. But some people just apply the patch, and remain silent, assuming you'll eventually realize.

If I haven't said anything, I probably either forgot for some reason, or have it in the queue of "this patch is boring or difficult, I'll review it when I'm fresh" (a.k.a., unread mail). In either of those cases, I appreciate pings, and I always respond after a ping, even if just to confirm that I forgot or have it in my queue.

> and, AFAIK, pings go into the same /dev/null bucket.

I think silence is disrespectful, indeed.

> So I have no idea if I need a new revision, need to ping someone else, or what. It's quite frustrating as an occasional contributor to a project.

I'd say that when in doubt, ask explicitly. People tend to reply to direct explicit questions. Sometimes they don't, but it often works. I assume some people are just not aware of how frustrating silence can be on contributors.

Refusing to use PRs

Posted Jan 12, 2026 3:16 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (4 responses)

This is beyond even `git-range-diff` and local merging. I'm aware of those things. Things I want from this tool that it should enable:

- PR changes things in "areas" A and B. I ask Alice to review A and Bob to review B. Using these, I can see that everything has been reviewed.
- Part A is reviewed by Alice. In my review, I "import" that review to mark it as "reviewed" so that I can focus on the "rest".
- PR is split into multiples after one round of review: bring what I reviewed in the full thing into one of the "sharded" PRs
- review on machine X, push review state, resume on machine Y

The other part is about managing open questions (this is where I find Github to fall over hardest because of its discussion resolution behaviors). If I notice something and say "hey, Charlie should look at this", it should be open until Charlie reviews that part. Same with changes made in response to a comment: the original commenter should be the one to say "yes, this is acceptable", not "autoresolve because it was changed". An intermediate state of "updated, please reverify" is also useful to track updates in response to parts of reviews.

Refusing to use PRs

Posted Jan 12, 2026 10:59 UTC (Mon) by alx.manpages (subscriber, #145117) [Link] (3 responses)

> - PR changes things in "areas" A and B. I ask Alice to review A and Bob to review B. Using these, I can see that everything has been reviewed.

Usual practice in email is to reply to each patch with

Reviewed-by: My Name <me@example.com>

when you have finished reviewing the an entire patch.
With that you keep track of who reviewed each patch.

When the contributor sends a revision of the patch set, it will include those tags in the commits (if the reviewed commits haven't changed), which inform reviewers that certain patches have already been reviewed by some people.

When you've only reviewed part of a patch, you reply inline adding something more informal like 'LGTM' after the parts you've reviewed. Then maybe you reply to the parts you won't review by saying you won't review those, and ask someone else to review them.

I have never seen issues with that.

Some people are using b4(1) and/or Patchwork to automate some of that, but I still use bare git(1) and MUA for my work and I'm happy with its simplicity. I'm quite scared of the complexity of those tools (see what happened to Kees Cook recently with b4(1)).

> The other part is about managing open questions

I manage open questions by leaving email marked as unread. So far it works reasonably well.

Refusing to use PRs

Posted Jan 12, 2026 20:01 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (2 responses)

> Usual practice in email is to reply to each patch with

Partial patch reviews are still missing. Also doesn't handle split patches and the like.

> With that you keep track of who reviewed each patch.

The problem is that not everyone has the entire list archive on hand at all times. If I hop onto a patch series in situ, I may be missing discussion history because it's not well-linked on lists. Personally, I prefer to reply to series 1's 00 patch with further series, but some maintainers don't like that because they toss the whole thread around. But that sounds like one can "spoil" a thread by replying with a (proposed) patch of their own which seems…unfortunate.

> When the contributor sends a revision of the patch set, it will include those tags in the commits (if the reviewed commits haven't changed), which inform reviewers that certain patches have already been reviewed by some people.

Will it? `b4` probably does it, but I've always ended up copying things manually. There's also no verification that the tags are actually truly sourced.

> When you've only reviewed part of a patch, you reply inline adding something more informal like 'LGTM' after the parts you've reviewed. Then maybe you reply to the parts you won't review by saying you won't review those, and ask someone else to review them.

I have no idea how I would keep track of this team-wide in any efficient manner without structured data. Trailers probably don't suffice (Partially-reviewed-by? what parts?).

> I manage open questions by leaving email marked as unread. So far it works reasonably well.

The thing is that such state is local to you. I work with teams of developers and knowing Sally has an open question about something can help prevent me from prematurely merging it because I cannot see her unread state (or even interpret it if I do see it).

Refusing to use PRs

Posted Jan 12, 2026 22:46 UTC (Mon) by alx.manpages (subscriber, #145117) [Link] (1 responses)

> > With that you keep track of who reviewed each patch.
>
> The problem is that not everyone has the entire list archive on hand at all times. If I hop onto a patch series in situ, I may be missing discussion history because it's not well-linked on lists. Personally, I prefer to reply to series 1's 00 patch with further series, but some maintainers don't like that because they toss the whole thread around. But that sounds like one can "spoil" a thread by replying with a (proposed) patch of their own which seems…unfortunate.

I also prefer like you having all v2,v3,... be replies to v1's 00 patch. As you said, that makes all the context available to a reader without much effort.

I just realized I haven't documented this in the contributing guidelines. I should document this now.

> > When the contributor sends a revision of the patch set, it will include those tags in the commits (if the reviewed commits haven't changed), which inform reviewers that certain patches have already been reviewed by some people.
>
> Will it? `b4` probably does it, but I've always ended up copying things manually. There's also no verification that the tags are actually truly sourced.

I do it manually. (Very) recently, I started appending a 'Message-ID:' tag right below each review tag, documenting the mail from which I took the tag. That makes sure I don't make mistakes, or at least that if I make mistakes one can easily check them. I don't expect frequent mistakes.

Another case is malicious tags. If one adds false tags, someone would likely realize and then bust the author. FWIW, I've seen invalid tags added in a patch generated by an LLM. We completely prohibit LLMs in the Linux man-pages project, and this served as yet another proof that the contribution was from an LLM, so I banned the user.

Refusing to use PRs

Posted Jan 12, 2026 23:45 UTC (Mon) by alx.manpages (subscriber, #145117) [Link]

Regarding replying to previous iterations of a patch set, I've documented it now:

<https://git.kernel.org/pub/scm/docs/man-pages/man-pages.g...>

Refusing to use PRs

Posted Jan 12, 2026 10:02 UTC (Mon) by Lennie (subscriber, #49641) [Link] (2 responses)

Something I keep wondering...

At the moment we have tools like Github and Gitlab automatically create a Markdown-based wiki:

some_project.git becomes some_project.wiki.git

Why of why, don't we have this too ?:

some_project.git becomes some_project.issues.git

This means we can replace the medium (email or web interface, etc.) and we don't depend on the tool (email client or web interface) anymore either.

Maybe people need to agree on some format. Or maybe it first needs to be part of git itself, so people can agree on it and adopt it ?

Refusing to use PRs

Posted Jan 12, 2026 10:14 UTC (Mon) by Lennie (subscriber, #49641) [Link] (1 responses)

I should add, maybe git-bug already solved most if not all of that.

Refusing to use PRs

Posted Jan 12, 2026 19:52 UTC (Mon) by mathstuf (subscriber, #69389) [Link]

You still need some kind of separate infrastructure to do things like:

- needs some kind of side database for accounts (forges provide this for their project metadata already)
- notify users when activity on an issue they're "subscribed to" (more metadata one would want to track) changes (after all, maybe they want email notifications too!)

I've come to the conclusion that issue trackers (all of them) are terrible for *task* tracking because of the disparity in sources. Not all of my work happens on a single forge (or even any forge at all!). The devtodo tool I wrote pulls things in for Github, but other forges shouldn't be too hard. However, they are invaluable for *coordination* with other developers and users, so some kind of public activity should be visible eventually.

Refusing to use PRs

Posted Jan 13, 2026 9:53 UTC (Tue) by donald.buczek (subscriber, #112892) [Link] (1 responses)

> I use it even to document my github PRs, in projects I'm forced to use github. See the 'Revisions' in this comment:
<https://github.com/shadow-maint/shadow/pull/1471#issue-37...> ('git rd' is an alias I use: 'rd = range-diff master..@{u} master@{u}..HEAD').

Hmm, I don't know. You've come up with a personal standard that is more of a hindrance than a help to a potential reviewer. How is anyone supposed to know what it means? Where is revision “v1”? Why are there suddenly “b” and “c” variants in ‘v5’ but no “a”? Which version was or is now in the PR branch? In terms of content, there is free text under the versions, which is good, but then there is a privately defined git command (“git rd”) that has no clear meaning for anyone else and a list of commits where it is also unclear what they should be used for.

You might be better off sticking to the standard of the respective project instead of your personal one.

Refusing to use PRs

Posted Jan 13, 2026 12:04 UTC (Tue) by alx.manpages (subscriber, #145117) [Link]

> > I use it even to document my github PRs, in projects I'm forced to use github. See the 'Revisions' in this comment: <https://github.com/shadow-maint/shadow/pull/1471#issue-37...> ('git rd' is an alias I use: 'rd = range-diff master..@{u} master@{u}..HEAD').
>
> Hmm, I don't know. You've come up with a personal standard that is more of a hindrance than a help to a potential reviewer. How is anyone supposed to know what it means?

That is a project where I contribute often and I help co-maintain. Maintainers know what to expect.

> Where is revision “v1”?

What I'm showing are range-diffs, so that if you've reviewed a previous revision, you know what changed for the next one.

v1 would compare to /dev/null, but that wouldn't be useful, since you're new to the patch set, so you need to review the entire PR at whatever revision it is.

> Why are there suddenly “b” and “c” variants in ‘v5’ but no “a”?

v5 is v5a. I differentiate minor revisions from major ones, so that reviewers know to expect unimportant changes there. That focuses attention on the major revisions.

> Which version was or is now in the PR branch?

The latest, of course. In that case, v5c.

> In terms of content, there is free text under the versions, which is good, but then there is a privately defined git command (“git rd”) that has no clear meaning for anyone else and a list of commits where it is also unclear what they should be used for.

When I contribute to other projects, I use regular `git range-diff ...`. But in this project, people know what me and what it means. After all, the names I give to branches and remotes locally are not useful to reviewers. What counts is the commit hashes, which are still visible, and of course the diffs.

> You might be better off sticking to the standard of the respective project instead of your personal one.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds