Git 2.44.0 released [LWN.net]

Git 2.44.0 released

Posted Feb 24, 2024 5:38 UTC (Sat) by intelfx (subscriber, #130118) [Link] (12 responses)

> including the git replay command for faster, server-side rebasing

I had to wonder for a few seconds what "server-side" means here, considering that in Git (which is distributed) there isn't supposed to be _a_ server-side...

Apparently this means merely that `git replay` does not require a worktree to operate, which makes it particularly suitable for rebase-like operations on bare repositories (which are typically found on dedicated Git hosts).
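To make the "no worktree needed" point concrete, here is a minimal sketch that rebases a branch inside a bare repository. The scratch directory, the branch names `main` and `topic`, and the file names are all made up for illustration; the only real requirement is a Git that ships `git replay` (2.44 or newer), and the script skips the replay step otherwise.

```shell
#!/bin/sh
# Sketch: rebase "topic" onto "main" inside a bare repository (no worktree).
set -e

demo=$(mktemp -d)                      # throwaway scratch directory
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# 1. Build a tiny diverged history in an ordinary repository.
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/main  # name the unborn branch "main"
echo base  > base.txt;  git add base.txt;  git commit -q -m "base"
git checkout -q -b topic
echo topic > topic.txt; git add topic.txt; git commit -q -m "topic work"
git checkout -q main
echo main  > main.txt;  git add main.txt;  git commit -q -m "main moves on"

# 2. Copy it into a bare repository, the way a hosting service would store it.
git clone -q --bare "$demo/work" "$demo/host.git"
cd "$demo/host.git"

# 3. Replay main..topic on top of main, without ever checking anything out.
if git --list-cmds=builtins | grep -qx replay; then
    # In 2.44 the command prints update-ref instructions instead of
    # moving refs itself, so feed them to update-ref.
    git replay --onto main main..topic | git update-ref --stdin
    git log --oneline --graph main topic
else
    echo "this git has no 'replay' builtin; skipping" >&2
fi
```

Running the same rebase with `git rebase` would fail in `host.git`, because rebase needs a work tree to apply each commit.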

Git 2.44.0 released

Posted Feb 24, 2024 10:44 UTC (Sat) by grawity (subscriber, #80596) [Link] (4 responses)

> considering that in Git (which is distributed) there isn't supposed to be _a_ server-side...

I believe distributed means that it's not supposed to *require* a server, not that it's not supposed to *have* one. There's a difference.

Git 2.44.0 released

Posted Feb 24, 2024 11:55 UTC (Sat) by intelfx (subscriber, #130118) [Link] (3 responses)

Likewise, there is a difference between what I said and the statement that Git is not supposed to have a server. What I meant is that in Git, there is no "server-side" as in "the party that performs operations on behalf of clients" (and that there is little to no semantic difference between a Git server and a Git client).

Git 2.44.0 released

Posted Feb 24, 2024 17:09 UTC (Sat) by Nahor (subscriber, #51583) [Link] (2 responses)

> Git (which is distributed) there isn't supposed to be _a_ server-side

Quite the contrary, "distributed" means _all_ Git installations have a server side. "distributed" just means there isn't a _dedicated_ server application/machine and a dedicated client, like in centralized systems. Git is always both. And as such a part of its code _is_ dedicated to serving data (listening for and answering incoming requests, ...). And as you mentioned afterwards, that "server side" is highlighted by the notion of "bare repository".

Git 2.44.0 released

Posted Feb 25, 2024 5:05 UTC (Sun) by intelfx (subscriber, #130118) [Link] (1 responses)

> "distributed" just means there isn't a _dedicated_ server application/machine and a dedicated client, like in centralized systems

That's exactly the meaning I intended to convey, as explained in the message you were directly replying to.

Git 2.44.0 released

Posted Feb 25, 2024 17:55 UTC (Sun) by Nahor (subscriber, #51583) [Link]

>> "distributed" just means there isn't a _dedicated_ server application/machine and a dedicated client, like in centralized systems
>That's exactly the meaning I intended to convey, as explained in the message you were directly replying to.

That's not how I understood either of your posts (or how I understood the fact that you posted your confusion in the first place despite your eventual realization of what the changelog was talking about). Even in your last post, you still said:
> What I meant is that in Git, there is no "server-side" as in "the party that performs operations on behalf of clients"

My understanding is that you feel that the notion of server in Git is a stretch (*), an abuse of the word for working with bare repositories. And I'm arguing that it's the opposite, that Git is a true server (and a client, all combined into a single app), and the bare repository is a consequence of that.
You can even start that server with the `git-daemon` command.

(*) Although, I admit I'm not clear about what you mean by "performs operations", so maybe that's where the dichotomy is. It feels to me like you have some restrictive idea of what operations a server is supposed to do. Since Git can do the same "operations" as an FTP or HTTP server (put/store/save + get/load) and more, I personally don't see a distinction between them and Git, i.e. Git is as much a server as they are.

Git 2.44.0 released

Posted Feb 26, 2024 9:22 UTC (Mon) by farnz (subscriber, #17727) [Link] (6 responses)

In every networked git operation, there is a server side and a client side. Which repository is server-side and which is client-side is determined by which command you ran and where, but there is always a server side in git.

The thing that git doesn't have is a concept of an authority that resolves conflicts; both client and server are equally authoritative in any operation, and indeed it's possible for you to swap client and server in any interaction and come to the same result (albeit with different commands to get there). This contrasts with something like git-svn or SVK, where there is an authoritative server (the SVN server), and thus the authoritative server can be called upon to resolve conflicts between non-authoritative users of the repo.

Git 2.44.0 released

Posted Feb 26, 2024 14:19 UTC (Mon) by paulj (subscriber, #341) [Link] (5 responses)

If I run git clone of a git repo on a distributed FS, what is the server and the client? E.g., if I (cd /tmp; git clone /dist/repo/foo) on, say, one of the lock-servers or some-kind-of-masterish-members, and it has to fetch chunks from a "JBOD member of the distributed FS", which is the server and which is the client? :)

Git 2.44.0 released

Posted Feb 26, 2024 14:27 UTC (Mon) by farnz (subscriber, #17727) [Link] (4 responses)

From git's point of view, the client side is the repo you're running in, and the server side is the remote repo. So, in the case you describe, git sees /tmp/foo as the "client" side, and "/dist/repo/foo" as the "server" side. If you then do a "git push" from "/dist/repo/foo" to "/tmp/foo", the roles swap over - /tmp/foo becomes the "server" side.

And this applies no matter how convoluted you make the underlying storage system :-)
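The role swap can be tried with two local directories. This is only a sketch: the directory names stand in for /dist/repo/foo and /tmp/foo, and the branch name `incoming` is an arbitrary choice (Git refuses by default to push into the checked-out branch of a non-bare repository, so the push targets a different branch).

```shell
#!/bin/sh
# Sketch: the same two repositories act as "server" and "client" in turn.
set -e

demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The repository that starts out as the "server" side.
git init -q "$demo/dist-foo"
cd "$demo/dist-foo"
git symbolic-ref HEAD refs/heads/main
echo one > file.txt; git add file.txt; git commit -q -m "first"

# Clone: tmp-foo is the "client", dist-foo is the "server".
git clone -q "$demo/dist-foo" "$demo/tmp-foo"

# Commit in dist-foo, then push from there: now dist-foo is the
# "client" and tmp-foo is the "server" receiving the objects.
echo two >> file.txt; git commit -q -a -m "second"
git push -q "$demo/tmp-foo" main:incoming

git -C "$demo/tmp-foo" log --oneline incoming
```

Either direction ends with the same commits present in both repositories; only the command and the place it was run from differ.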

Git 2.44.0 released

Posted Feb 26, 2024 16:29 UTC (Mon) by paulj (subscriber, #341) [Link] (3 responses)

I think they're both just git repos really. It's not really client/server, it's "Git reads blobs from 2 repos. One, the 'remote' can be RO". It's the same git reading both repos.

At most, you're feeding it the blobs of the other repo using some kind of network connection and some helper to retrieve the blobs from the storage. E.g. with git over SSH it's still git on both sides, and both machines can generally be things you'd consider "clients". At least, I - not completely rarely - do git ops between ostensibly client machines (i.e. machines with GUIs that I log into).

But.. meh. ;)

Git 2.44.0 released

Posted Feb 26, 2024 17:34 UTC (Mon) by farnz (subscriber, #17727) [Link] (2 responses)

Internally, git treats one side as a server and the other as a client. But it's kinda malleable which one's which, because neither repo is "more authoritative" than the other, so all operations exist in symmetrical forms where the "local" repo uses "client" operations, and the "remote" repo uses "server" operations.

Git 2.44.0 released

Posted Feb 26, 2024 21:46 UTC (Mon) by Wol (subscriber, #4433) [Link] (1 responses)

I'd draw my distinction as the server is a bare repo, while the client is a working repo, but even there the distinction is iffy ...

Cheers,
Wol

Git 2.44.0 released

Posted Feb 27, 2024 14:25 UTC (Tue) by geert (subscriber, #98403) [Link]

The side that has a bare repo is typically only usable as a server.
If both sides are not bare repos, the fancy "new" term is not client-server, but peer2peer ;-)
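For anyone wanting to check which kind of repository they are looking at, a small sketch using `git rev-parse` (the two directory names are made up):

```shell
#!/bin/sh
# Sketch: tell a bare repository apart from one with a work tree.
set -e

demo=$(mktemp -d)
git init -q --bare "$demo/hosted.git"   # no work tree, just the object store
git init -q "$demo/checkout"            # ordinary repository with a work tree

bare_flag=$(git -C "$demo/hosted.git" rev-parse --is-bare-repository)
work_flag=$(git -C "$demo/checkout" rev-parse --is-bare-repository)
in_tree=$(git -C "$demo/hosted.git" rev-parse --is-inside-work-tree)

echo "hosted.git bare?             $bare_flag"
echo "checkout bare?               $work_flag"
echo "hosted.git inside work tree? $in_tree"
```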

Git 2.44.0 released

Posted Feb 25, 2024 0:24 UTC (Sun) by intgr (guest, #39733) [Link] (1 responses)

> * Add support for GitLab CI.

What's this about? Is git considering a move to GitLab?

Git 2.44.0 released

Posted Feb 25, 2024 2:44 UTC (Sun) by ABCD (subscriber, #53650) [Link]

It looks like it's mostly a way for GitLab (the organization) to more easily test the changes they are sending upstream, per https://git.kernel.org/pub/scm/git/git.git/commit/?h=v2.4...

Git replay

Posted Feb 26, 2024 13:56 UTC (Mon) by Tobu (subscriber, #24111) [Link] (2 responses)

Git replay looks good, I can use it to rebase stacked histories like so:

git replay --onto linus/master linus/master..remote/feature linus/master..feature |git update-ref --stdin

It's better than git rebase in that it works fast and won't update worktree timestamps unnecessarily (you do need to start with a clean work tree and finalize with a git checkout feature, or git reset --hard if already on that branch), but it's also lacking useful functionality: instead of skipping previously applied commits (eg those that moved from remote/feature to linus/master), it generates empty commits. Which makes it hard to run git range-diff ...remote/feature to see what's changed.
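A self-contained sketch of that stacked workflow, for anyone who wants to try it without touching a real tree. The branch names `main`, `part1` and `part2` are invented stand-ins for linus/master, remote/feature and feature, and both stacked branches are local here; the replay step is skipped on a Git older than 2.44.

```shell
#!/bin/sh
# Sketch: replay two stacked branches onto a moved upstream in one go.
set -e

demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$demo/stack"
cd "$demo/stack"
git symbolic-ref HEAD refs/heads/main
echo base > base.txt; git add base.txt; git commit -q -m "base"

# part1 sits on the old main, part2 sits on top of part1.
git checkout -q -b part1
echo one > one.txt; git add one.txt; git commit -q -m "part 1"
git checkout -q -b part2
echo two > two.txt; git add two.txt; git commit -q -m "part 2"

# Upstream moves on. Stay on main so neither replayed branch is checked out.
git checkout -q main
echo new > new.txt; git add new.txt; git commit -q -m "upstream change"

if git --list-cmds=builtins | grep -qx replay; then
    # Both ranges are replayed; the resulting ref updates go to update-ref.
    git replay --onto main main..part1 main..part2 | git update-ref --stdin
    # Only now does the work tree change, and only once.
    git checkout -q part2
    git log --oneline --graph main part1 part2
else
    echo "this git has no 'replay' builtin; skipping" >&2
fi
```

Because the work tree is written a single time by the final checkout, files that did not change between the old and new `part2` keep their timestamps.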

Git replay

Posted Feb 27, 2024 3:44 UTC (Tue) by newren (subscriber, #5160) [Link] (1 responses)

> Git replay looks good, I can use it to rebase stacked histories like so:
>
> git replay --onto linus/master linus/master..remote/feature linus/master..feature |git update-ref --stdin

Glad folks are finding it useful. Note, though, that this only works for linear histories; if any of the commits in the range is a merge, you'll hit "replaying merge commits is not yet supported". There are plans to support merge commits (fairly detailed ones), but it doesn't exist yet.

> It's better than git rebase in that it works fast and won't update worktree timestamps unnecessarily (you do need to start with a clean work tree and
> finalize with a git checkout feature, or git reset --hard if already on that branch),

Why do you need to start with a clean work tree? Have you found a bug, or are you being careful about starting with a clean worktree because you are using a hard reset afterward and that would have problems with the changes? (Part of the point of git replay is that you don't necessarily want or need to check out the result afterwards, so having a clean worktree should be irrelevant for such cases.)

> but it's also lacking useful functionality: instead of skipping previously applied commits (eg those that moved from remote/feature to
> linus/master), it generates empty commits. Which makes it hard to run git range-diff ...remote/feature to see what's changed.

Boy, oh boy is it lacking functionality. I was super surprised that folks at GitLab and GitHub thought it useful in its current state. But they did, and apparently you and others do too, so it's probably a good thing they decided to push to have it made available. Anyway, adding an --empty={stop,drop,roll} flag (similar to those found in am & rebase) to either stop and ask the user how to handle the now-empty commit, or to automatically drop such commits, or to roll them up into the end result would indeed be a useful addition. There are many other capabilities I'd love to work on: giving users the ability to handle conflicts and continue instead of simply dying, replaying merges much smarter than rebase does, allowing a few more general revision flags like --ancestry-path, completely new and simplified interactivity handling, etc., etc. If only $employer hadn't decided it was time to invest in stuff other than git, and/or if I only had more free time. :-( I'm sure I'll get to it eventually, it just may be a while.

Git replay

Posted Feb 28, 2024 10:52 UTC (Wed) by Tobu (subscriber, #24111) [Link]

> Glad folks are finding it useful. Note, though, that this only works for linear histories; if any of the commits in the range is a merge, you'll hit "replaying merge commits is not yet supported". There are plans to support merge commits (fairly detailed ones), but it doesn't exist yet.

Yes, for me at least this is enough: stacking non-merge commits as I would with git rebase.

> Why do you need to start with a clean work tree? Have you found a bug, or are you being careful about starting with a clean worktree because you are using a hard reset afterward and that would have problems with the changes? (Part of the point of git replay is that you don't necessarily want or need to check out the result afterwards, so having a clean worktree should be irrelevant for such cases.)

The second: because I pipe into update-ref updating the current branch, followed by a hard reset. If there was a way to advance the current branch with autostash, that would be safer.

I'm used to working with low-level and high-level git commands depending on what works, and currently I can work with replay as a low-level command, git filter-branch to prune the empty commits (git filter-branch --prune-empty linus/master.. in this example — I looked at filter-repo and it doesn't take ranges), and finally git update-ref with manual stashing.
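The pruning step on its own looks roughly like this. It is a sketch on a throwaway repository: the empty commit is created by hand with `--allow-empty` to stand in for one that replay would have produced, the branch names are invented, and the range is spelled out as `main..feature` rather than the shorter `main..` form.

```shell
#!/bin/sh
# Sketch: drop empty commits from a branch with filter-branch --prune-empty.
set -e

demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1  # skip filter-branch's warning and pause

git init -q "$demo/repo"
cd "$demo/repo"
git symbolic-ref HEAD refs/heads/main
echo base > base.txt; git add base.txt; git commit -q -m "base"

git checkout -q -b feature
git commit -q --allow-empty -m "already upstream (now empty)"
echo work > work.txt; git add work.txt; git commit -q -m "real work"

before=$(git rev-list --count main..feature)
git filter-branch --prune-empty main..feature >/dev/null
after=$(git rev-list --count main..feature)

echo "commits on feature before: $before, after: $after"
```

filter-branch keeps a backup of the old branch tip under `refs/original/`, which is handy if the pruning removed more than intended.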

I'm not sure yet what would be a reasonable high-level interface: presumably just updating the refs from git replay would work (git replay --update-refs --autostash). Mimicking the rebase interface is an option but I don't know if I'd use it, because the point for me is not to reflect intermediate states inside the worktree, and that seems to preclude the interactive bits like --continue / --skip / --abort. Another totally different approach would be to stash worktree timestamps with their blob hashes somewhere and apply them back for unchanged files; then updating the worktree with in-between states wouldn't be a problem.