Git considers SHA-256, Rust, LLMs, and more

By Jonathan Corbet
October 21, 2025
The Git source-code management system is a foundational tool upon which much of the free-software community is based. For many people, Git simply works, though perhaps in quirky ways, so the activity of its development community may not often appear on their radar. There is a lot happening in the Git world at the moment, though, as the project works toward a 3.0 release sometime in 2026. Topics of interest in the Git community include the SHA-256 transition, the introduction of code written in Rust, and how the project should view contributions created with the assistance of large language models.

Moving to SHA-256

Hashes are a core part of how Git works; they are used to identify commits, but also to identify the individual files ("blobs") managed in a Git repository. The security of the repository (and, specifically, the integrity of the chain of commits that leads to any given state of the repository) is no stronger than the security of the hash that is used. Git, since the beginning, has used the SHA-1 hash algorithm, which is increasingly viewed as being insecure. It has been understood for years that, sooner or later, Git will have to move to using a different hash algorithm.
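As a rough illustration of how a hash identifies a blob, the sketch below computes an object ID the way Git does for loose objects: the hash is taken over a short header (the object type, a space, the content length, and a NUL byte) followed by the content itself. The function name git_object_id is made up for this example, and the sketch ignores everything else Git does (compression, storage, trees, and commits).

```python
import hashlib


def git_object_id(data: bytes, obj_type: str = "blob", algo: str = "sha1") -> str:
    """Return the hex object ID Git would assign to `data`.

    `git_object_id` is an illustrative helper, not part of any Git API.
    """
    # Git hashes "<type> <size>\0" followed by the raw content.
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\x00"
    return hashlib.new(algo, header + data).hexdigest()


if __name__ == "__main__":
    content = b"hello world\n"
    # Same content, two different identities depending on the hash in use.
    print("sha1:  ", git_object_id(content, algo="sha1"))
    print("sha256:", git_object_id(content, algo="sha256"))
```

The same content gets a 40-character ID under SHA-1 and a 64-character ID under SHA-256, which is one reason repositories using different hash functions cannot exchange objects without a translation layer.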

So far, that move has been repeatedly pushed to the "later" column. That is not to say that no work has been done in that area; LWN first covered the effort to move to SHA-256 in 2020, with an update in 2022. Git has had the ability to manage a repository using SHA-256 hashes since the 2.29 release in 2020. That is only part of the job, though; before SHA-256 can be widely used, there needs to be a solution for interoperability between SHA-1 and SHA-256 repositories. Git is a distributed system, with hundreds or thousands of repositories existing even for relatively small projects. Converting all of those repositories to a new hash function simultaneously is simply not going to happen, so there must be a way to move commits between repositories using different hash functions.

Writing that sort of interoperability code is the kind of task that few developers are aching to take on. So it is not surprising that, in this case, few have. The task has fallen to brian m. carlson, who has done almost all of the SHA-256 work. This work is progressing slowly; a patch series focused mostly on documentation updates looks set to land in the next Git release. But, as carlson said recently, there is a lot still to be done if the planned 3.0 release is to switch to SHA-256 by default:

The SHA-256 interoperability work is not done yet. My estimate of this work is 200–400 patches, of which about 100 are done. If the original schedule is maintained, this would require writing up to 75 patches and sending in 100 patches per cycle, which is unrealistic without additional contributors.

He also pointed out that some of the Git-based forge systems are more advanced than others with regard to readiness for this change. The project as a whole seems undecided as to whether the completion of the interoperability code is a required feature for the 3.0 release or not. There is a desire, though, to set some sort of date for the SHA-256 switch, to put pressure on forges and such to be ready, if for no other reason.

Rust

When Linus Torvalds first wrote Git in 2005, he naturally wrote it in C, and that is still the language that the project uses. As is the case with many other C projects, though, there is an interest in moving to a safer language — Rust, in this case. Some Git developers are already working in Rust; notably, carlson is implementing some of the SHA-256 interoperability code in that language. There is also a reimplementation of the xdiff library in Rust by Ezekiel Newren that is making the rounds. Rust, it seems, is in Git's future.

The first step in that direction is likely to be this patch series from Patrick Steinhardt, which introduces an optional Rust module as a "trial balloon" to help users and distributors adapt to the new building requirements. The series includes a documentation change indicating that Rust will become mandatory for building Git as of the 3.0 release. This change seems likely to land in a near-term Git release as well. Steinhardt has also been working on some improvements to Git's continuous-integration infrastructure to enable testing the Rust side of the build.

Large language models

Many projects have been struggling with whether (and how) to accept code that was produced with the help of large language models (LLMs); the Git project is no exception. Some projects are cautiously opening the door to such contributions; Git is being more cautious than most. Partly, that may be a result of its 2025 Google Summer of Code experience, where nearly all of the proposals received were LLM-generated; a first attempt at a related policy was considered at that time. Christian Couder recently posted an updated proposed policy for LLM-generated code that, in part, reads:

The Developer's Certificate of Origin requires contributors to certify that they know the origin of their contributions to the project and that they have the right to submit it under the project's license. It's not yet clear that this can be legally satisfied when submitting significant amount of content that has been generated by AI tools.

Another issue with AI generated content is that AIs still often hallucinate or just produce bad code, commit messages, documentation or output, even when you point out their mistakes.

To avoid these issues, we will reject anything that looks AI generated, that sounds overly formal or bloated, that looks like AI slop, that looks good on the surface but makes no sense, or that senders don't understand or cannot explain.

There has been some discussion of this proposal, with carlson saying that it is not firm enough. Chuck Wolber worried that it reads like a total rejection of LLM-generated code, which he seemingly does not support. Elijah Newren said that he has already contributed some LLM-generated documentation and wondered if it needed to be reverted. Git maintainer Junio Hamano has posted a firmer variant of the proposed policy that is derived from the one used by the QEMU project. More discussion is to be expected, but it seems that the Git project will remain relatively unwelcoming to machine-generated contributions for the foreseeable future.

Other stuff

It will probably not be in the next release, but sometime thereafter Git will include some documentation of its data model contributed by Julia Evans. A change that more users may notice, from Phillip Wood, makes "main" the default branch name. There has been a desire to move away from "master" for some time; the change is likely to be made in the 3.0 release. The biggest concern about that change at this point, seemingly, is the existing body of Git tutorials using "master", which could prove especially confusing for just the sort of new users those tutorials are aimed at. To head off confusion, Git is likely to include one other change providing a hint for people who want to change the name back.

The Git project celebrated its 20th anniversary this year; in those two decades, Git has become one of the most important tools in a software developer's toolbox. After all that time, it remains clear that the job is not yet done. Development of Git is proceeding rapidly, and does not appear to be set to slow down anytime soon.




master/main change

Posted Oct 21, 2025 15:40 UTC (Tue) by jhe (subscriber, #164815) [Link] (3 responses)

How would I keep the HEAD symrefs up to date when upstream projects change from master to main? Every time upstream deletes the branch that previously was their HEAD, all mirrors of that repo end up with a dangling symref. Current solution is doing a fresh git clone, but this is not sustainable for upstream.

master/main change

Posted Oct 21, 2025 16:04 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (2 responses)

Usually you can just git switch main and it will figure itself out (--guess is on by default).

If you have more than one remote, you can (I think) write something like git switch -t origin/main. If you have local changes, you'll have to decide what to do with them, and there are flags for that (see git-switch(1)).

master/main change

Posted Oct 21, 2025 16:24 UTC (Tue) by jhe (subscriber, #164815) [Link] (1 responses)

That's what I'm doing (nano'ing the HEAD because git switch refuses to work in a bare repository) with the 1500 git mirrors. Whack-a-mole but on payroll.

master/main change

Posted Oct 21, 2025 17:11 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

Don't do that. Use [1] and write a five-line bash script instead. It will save you so much time over manually nano'ing individual HEAD files one at a time.

[1]: https://git-scm.com/docs/git-symbolic-ref
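A minimal sketch of the kind of script suggested above, written in Python rather than bash. It assumes the mirrors are bare repositories sitting directly under one directory; the root /srv/mirrors and all of the helper names are hypothetical. It runs in dry-run mode by default and only prints the git symbolic-ref commands it would execute.

```python
import subprocess
from pathlib import Path


def find_bare_repos(root: Path) -> list[Path]:
    """Heuristic: a child directory holding HEAD, objects/ and refs/ is a bare repo."""
    return sorted(
        head.parent
        for head in root.glob("*/HEAD")
        if (head.parent / "objects").is_dir() and (head.parent / "refs").is_dir()
    )


def symbolic_ref_command(repo: Path, branch: str = "main") -> list[str]:
    """Build the git command that points HEAD at refs/heads/<branch>."""
    return ["git", "-C", str(repo), "symbolic-ref", "HEAD", f"refs/heads/{branch}"]


def branch_exists(repo: Path, branch: str) -> bool:
    """Ask git whether refs/heads/<branch> exists in the mirror."""
    result = subprocess.run(
        ["git", "-C", str(repo), "show-ref", "--verify", "--quiet",
         f"refs/heads/{branch}"]
    )
    return result.returncode == 0


def repoint_mirrors(root: Path, branch: str = "main", dry_run: bool = True) -> None:
    """Repoint HEAD in every bare repo under `root` (hypothetical layout)."""
    for repo in find_bare_repos(root):
        cmd = symbolic_ref_command(repo, branch)
        if dry_run:
            print(" ".join(cmd))
        elif branch_exists(repo, branch):
            # Only touch HEAD when the target branch is actually present.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    repoint_mirrors(Path("/srv/mirrors"))  # assumed mirror root; dry run only
```

Calling repoint_mirrors with dry_run=False applies the change, and skips any mirror that does not yet have the target branch.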

gitk

Posted Oct 21, 2025 15:42 UTC (Tue) by adobriyan (subscriber, #30858) [Link]

If gitk doesn't work with SHA-256 repo for you, clone gitk from https://github.com/j6t/gitk.git

This is what is pulled into git's git. alias gitk='~/distfiles/git/gitk.git/gitk'.

Google told me that 2.42 should be OK except even 2.49 doesn't have latest gitk usable with SHA-256.

This is LLMs for you.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds