Git archive generation meets Hyrum's law
Posted Feb 3, 2023 7:47 UTC (Fri) by mb (subscriber, #50428)
Parent article: Git archive generation meets Hyrum's law
But it's just plain wrong to do that for on-the-fly generated archives.
These build systems should never have depended on on-the-fly generated archives.
They should have used released archives (maybe plus patches).
They should have cached the archives on a server they control. What if GitHub goes down?
Having everything on somebody else's server is a very very bad trend.
This trend will hit us in the face in the foreseeable future. (Was anybody not affected by the worldwide downtime of the MS cloud a couple of days ago?)
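For illustration, the "cache the archives on a server you control" idea can be sketched in Python. This is only a minimal sketch under stated assumptions: `fetch_cached`, the cache layout, and the checksum-as-filename scheme are hypothetical choices made for the example, not something any particular build system does.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_cached(url: str, expected_sha256: str, cache_dir: Path) -> Path:
    """Return a local copy of url, downloading only if it is not cached yet.

    Hypothetical helper: the cache is keyed by the expected checksum, so a
    changed upstream file cannot silently replace an already verified copy.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / expected_sha256
    if cached.exists() and sha256_of(cached) == expected_sha256:
        return cached  # upstream is not contacted at all

    tmp = cache_dir / (expected_sha256 + ".part")
    with urllib.request.urlopen(url) as resp, tmp.open("wb") as out:
        shutil.copyfileobj(resp, out)
    actual = sha256_of(tmp)
    if actual != expected_sha256:
        tmp.unlink()
        raise ValueError(f"checksum mismatch for {url}: got {actual}")
    tmp.replace(cached)
    return cached
```

Once an archive has been verified and cached this way, later builds neither depend on the upstream host being reachable nor on it regenerating the same bytes.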
Posted Feb 3, 2023 11:09 UTC (Fri)
by cortana (subscriber, #24596)
[Link] (8 responses)
> These build systems should never have depended on on-the-fly generated archives.

You're not wrong, but to pick the latest release of a random GitHub project: NetBox 3.4.4's only artefacts are these on-the-fly generated archives. So consumers of this content have no choice but to use them...
Posted Feb 3, 2023 11:25 UTC (Fri)
by bof (subscriber, #110741)
[Link] (7 responses)
I get that for the largest projects, that becomes a bit undesirable, but for other stuff, where's the problem?
Posted Feb 3, 2023 11:33 UTC (Fri)
by cortana (subscriber, #24596)
[Link] (2 responses)
(Admittedly I believe there's an option to git-clone which does this, only last time I used it I don't think it worked.)
The philosophical reason is a strongly held belief that a software release is a thing with certain other things attached to it (release notes, a source archive, maybe some binary archives). Once created, those artefacts are immutable.
If a software project isn't doing that then it's not a mature project doing release management, it's a developer chucking whatever works for them over the wall. Which is fine, most projects start that way; but we've all been taking advantage of the convenience of GitHub's generated-on-the-fly source archives, instead of automating the creation of these source archives as part of our release processes and attaching them to GitHub releases.
As another poster said, for projects which _do_ do that they then have the problem that the GitHub 'source archive' links can't be removed, so now users have to learn "don't click on the obvious link to get the source code, instead you have to download this particularly-named archive attached to the release and ignore those other ones". GitHub really needs a setting that a project can set to get rid of those links!
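As a rough illustration of automating the source archive as part of a release process, here is a minimal Python sketch. It is an assumption-laden example, not how `git archive` or GitHub builds archives: `build_release_tarball` is a hypothetical helper that packs a directory with sorted entries and fixed metadata, so that the same inputs give the same bytes on the same Python and zlib versions.

```python
import gzip
import io
import tarfile
from pathlib import Path


def build_release_tarball(src_dir: Path, prefix: str, mtime: int = 0) -> bytes:
    """Pack the files under src_dir into .tar.gz bytes with fixed metadata.

    Hypothetical helper: entries are sorted and timestamps, owners and
    modes are normalised, so the result does not depend on checkout time.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for path in sorted(p for p in src_dir.rglob("*") if p.is_file()):
            rel = path.relative_to(src_dir).as_posix()
            info = tarfile.TarInfo(f"{prefix}/{rel}")
            info.size = path.stat().st_size
            info.mtime = mtime
            info.mode = 0o755 if path.stat().st_mode & 0o100 else 0o644
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            with path.open("rb") as fh:
                tar.addfile(info, fh)
    # mtime=0 keeps the gzip header free of the build time as well.
    return gzip.compress(buf.getvalue(), compresslevel=9, mtime=0)
```

Even with this normalisation, a different compressor version may still produce different bytes, which is the argument for generating the archive once, attaching it to the release, and never regenerating it.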
Posted Feb 3, 2023 11:52 UTC (Fri)
by Wol (subscriber, #4433)
[Link] (1 responses)
There is. I think it's called a shallow clone. Something like --depth=1.
But last I heard, for a lot of projects, the size of the shallow clone is actually a *large* percentage of the full archive.
Cheers,
Wol
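For reference, a shallow clone of a single tag can be sketched as follows. The `--depth` and `--branch` options are real `git clone` options; the helper names and the URL in the usage note are made up for the example. `tree_size` is only there so the size claim above can be checked against a full clone of the same project.

```python
import subprocess
from pathlib import Path


def shallow_clone_command(url: str, ref: str, dest: Path) -> list[str]:
    """Build the argv for a depth-1 clone of a single branch or tag."""
    return ["git", "clone", "--depth", "1", "--branch", ref, url, str(dest)]


def shallow_clone(url: str, ref: str, dest: Path) -> None:
    """Run the clone; raises CalledProcessError if git fails."""
    subprocess.run(shallow_clone_command(url, ref, dest), check=True)


def tree_size(root: Path) -> int:
    """Total size in bytes of all regular files under root, .git included."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())
```

Usage would look like `shallow_clone("https://example.org/project.git", "v1.0", Path("project-v1.0"))`, with a hypothetical URL and tag, followed by comparing `tree_size()` of the result with that of a full clone.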
Posted Feb 3, 2023 17:02 UTC (Fri)
by cortana (subscriber, #24596)
[Link]
Posted Feb 3, 2023 15:10 UTC (Fri)
by nim-nim (subscriber, #34454)
[Link] (3 responses)
When releasing, you *like* working with dead dumb archives at the end of a curl-able URL, with a *single* version of all the files you are releasing, and a *single* license attached to this archive (the multi-vendored repos devs so love are a legal soup which is hell to release without legal risks).
Posted Feb 3, 2023 15:29 UTC (Fri)
by farnz (subscriber, #17727)
[Link] (1 responses)
And the reason we're seeing pain here is that people are not actually working with "dead dumb archives" - the people being hurt are working with archives that are produced on-demand by git archive, and have been assuming that they are dumb archives, not the result of a computation on a git repo.
Basically, they've got something that's isomorphic to a git shallow clone at depth 1, but they thought they had a dumb archive. Oops.
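To make the "result of a computation" point concrete, here is a small Python sketch. It does not reproduce what git or GitHub actually do; the two compression levels merely stand in for two different compressor implementations, and the file name and contents are invented. The same tar content is compressed twice, and checksums are compared before and after decompression.

```python
import gzip
import hashlib
import io
import tarfile


def make_tar(files: dict[str, bytes]) -> bytes:
    """Build an uncompressed tar with fixed metadata from a name -> bytes map."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(files):
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Invented content; any non-trivial payload would do.
payload = "\n".join(f"line {i}: {i * i}" for i in range(2000)).encode()
tar_bytes = make_tar({"demo-1.0/README": payload})

# Stand-ins for the compressor before and after an upgrade (an assumption).
old_archive = gzip.compress(tar_bytes, compresslevel=1, mtime=0)
new_archive = gzip.compress(tar_bytes, compresslevel=9, mtime=0)

same_bytes = sha256(old_archive) == sha256(new_archive)
same_content = sha256(gzip.decompress(old_archive)) == sha256(
    gzip.decompress(new_archive)
)
print("compressed checksums match:", same_bytes)
print("decompressed checksums match:", same_content)
```

A checksum pinned against the compressed file breaks as soon as the compression step changes, even though the files inside the archive are unchanged.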
Posted Feb 3, 2023 16:32 UTC (Fri)
by nim-nim (subscriber, #34454)
[Link]
The lingua franca of release management is dumb archives, because devs like to move from svn to hg to git to whatever, or even keep bulky binary files (images, fonts, music, whatever) out of source control entirely, so anything semi-generic will curl archives from a list of URLs. And if GitHub only provides reliably (for dubious values of reliably) generated archives, that's what people will use.
Posted Feb 3, 2023 15:56 UTC (Fri)
by paulj (subscriber, #341)
[Link]
Various build system generators support specifying a git commit as dependency and doing a git shallow clone to obtain it.
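What such a pinned-commit fetch might boil down to can be sketched in Python. This is a guess at the general shape, not the implementation of any specific build system generator; the helper names are hypothetical. Note that fetching a bare commit hash only works if the server allows it, and the final check assumes `commit` is a full hash.

```python
import subprocess
from pathlib import Path


def pinned_fetch_commands(url: str, commit: str, dest: Path) -> list[list[str]]:
    """Return the git commands that check out one commit without its history."""
    git = ["git", "-C", str(dest)]
    return [
        ["git", "init", str(dest)],
        git + ["remote", "add", "origin", url],
        git + ["fetch", "--depth", "1", "origin", commit],
        git + ["checkout", "--detach", "FETCH_HEAD"],
    ]


def fetch_pinned(url: str, commit: str, dest: Path) -> None:
    """Run the commands, then confirm HEAD is exactly the requested commit."""
    for cmd in pinned_fetch_commands(url, commit, dest):
        subprocess.run(cmd, check=True)
    head = subprocess.run(
        ["git", "-C", str(dest), "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    if head != commit:
        raise RuntimeError(f"expected {commit}, got {head}")
```

The commit hash identifies the content itself, so this kind of pin does not depend on how any archive happens to be compressed.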