Sapling source-code management system - no staging area?

Posted Nov 16, 2022 19:24 UTC (Wed) by sdalley (subscriber, #18550)
Parent article: Meta's Sapling source-code management system

Hm, it doesn't have a staging area, and touts this as an advantage. I must say, losing this feature, now that I've come to appreciate it in git, really feels like a backward step...

Sapling source-code management system - no staging area?

Posted Nov 17, 2022 0:05 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (14 responses)

That's a Mercurialism. Sapling inherited it.

In general, Git's attitude seems to be that high-level concepts like commits and rebases should be understood directly in terms of their low-level on-disk representations as trees, refs, etc. Git's on-disk representation uses a staging area, so therefore you have a staging area as part of the UI. Mercurial does not do this. In Mercurial, the on-disk representation is considered an implementation detail, subject to revision at any time, and you are expected to understand commits as primitive objects. A staging area would be redundant to what Mercurial calls a "secret commit" (i.e. a commit that you don't intend to push, and that the tooling will prevent you from pushing accidentally), so Mercurial does not supply a staging area, even though there is some denormalization under the hood. This is a relatively small difference of opinion, but an important one.
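
To make the "secret commit" idea concrete, here is a minimal Python sketch. It is a toy model, not Mercurial's implementation: the `Changeset` class and the `outgoing` helper are hypothetical names invented for illustration, and the only behavior assumed is the one described above, namely that a changeset marked secret is left out of a push until its phase is changed.

```python
from dataclasses import dataclass, field


@dataclass
class Changeset:
    """Toy changeset: a description, a phase, and the files it touches."""
    description: str
    phase: str = "draft"  # one of "secret", "draft", "public"
    files: set = field(default_factory=set)


def outgoing(changesets):
    """Return what a push would send: everything not marked secret."""
    return [c for c in changesets if c.phase != "secret"]


wip = Changeset("wip: half-finished refactor", phase="secret", files={"parser.py"})
done = Changeset("fix typo in README", files={"README"})

# While the work-in-progress changeset is secret, it stays local.
before = [c.description for c in outgoing([done, wip])]

# Moving it to the draft phase makes it eligible for pushing.
wip.phase = "draft"
after = [c.description for c in outgoing([done, wip])]

print(before)
print(after)
```

In this model the secret changeset plays the role of the staging area: it can be reworked freely and only becomes shareable once its phase changes.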

Sapling source-code management system - no staging area?

Posted Nov 17, 2022 5:16 UTC (Thu) by jthill (subscriber, #56558) [Link] (2 responses)

Git's not abstract, it's concrete, that's true; it uses abstractions to help understand and describe what's possible, not to limit it. Pretending that that's some sort of abstract principle rather than for concrete benefit is rather spectacularly missing the point.

Sapling source-code management system - no staging area?

Posted Nov 17, 2022 17:54 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (1 responses)

> Pretending that that's some sort of abstract principle rather than for concrete benefit is rather spectacularly missing the point.

Ironically, this sentence is too abstract, and I have no idea what you are talking about.

Sapling source-code management system - no staging area?

Posted Nov 11, 2024 18:58 UTC (Mon) by jthill (subscriber, #56558) [Link]

Okay, this might be such an extreme necro it qualifies as actually weird but trying to find words for a reply to this has been niggling at my hindbrain all this time. That acknowledged,

Git is: a dag of snapshots plus annotated tags in the object db; local refs; and an index for tracking work on (often constructing new) snapshots. That's it. Everything else, everything else, is in whatever's-useful-in-your-work territory.

There's software design that starts with some perceived ideal/need and jumps straight to abstractions, which are then explained and implemented. This is the root of the "implementation details don't matter" view of software: the "abstract principle first" sort of design that views any behavior not covered by the abstraction as aberrant, egregious.

Then there's software design that starts with basically a data structure and asks "what use can be made of it?" It might start out as a design for a perceived need, but abstractions are just ways of talking about the effects you can get.

What I'm saying is: Git's the second kind. Anything you can do with a dag of snapshots and re-hanging local labels, you can do with Git. The people who want definitive and elegant abstractions tend to express distaste for this; they'll call Git's UI a leaky abstraction and get more pejorative from there. And I think that's where they're entirely missing the point. Git's a tool, a data structure plus commands to work with it. The Git interface uses abstractions to talk about the useful things you can do, not to define what's proper.
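
The three pieces named in this comment (a dag of snapshots in an object db, local refs, and an index) can be sketched in a few lines of Python. This is a toy under loud assumptions: `ToyRepo` and its methods are hypothetical, the object ids are just SHA-1 hashes of a JSON record, and nothing here mirrors Git's real object format.

```python
import hashlib
import json


class ToyRepo:
    """Toy model: an object store of snapshots, movable refs, and an index."""

    def __init__(self):
        self.objects = {}  # object id -> commit record (content-addressed)
        self.refs = {}     # local label -> object id
        self.index = {}    # path -> content, the snapshot under construction

    def add(self, path, content):
        # Record a file in the index; nothing is committed yet.
        self.index[path] = content

    def commit(self, ref, message):
        # Freeze the index as a snapshot whose parent is the ref's current target.
        parent = self.refs.get(ref)
        record = {
            "tree": dict(self.index),
            "parents": [parent] if parent else [],
            "message": message,
        }
        oid = hashlib.sha1(json.dumps(record, sort_keys=True).encode()).hexdigest()
        self.objects[oid] = record
        self.refs[ref] = oid  # re-hang the label on the new snapshot
        return oid

    def log(self, ref):
        # Walk first parents from the ref back to the root.
        oid = self.refs.get(ref)
        messages = []
        while oid:
            record = self.objects[oid]
            messages.append(record["message"])
            oid = record["parents"][0] if record["parents"] else None
        return messages


repo = ToyRepo()
repo.add("a.txt", "one")
first = repo.commit("main", "first")
repo.add("a.txt", "two")
second = repo.commit("main", "second")
repo.refs["topic"] = first  # re-hanging a label is just a dictionary write

print(repo.log("main"))
print(repo.log("topic"))
```

Everything beyond these three dictionaries is, in the comment's terms, a way of talking about what you can do with them.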

Sapling source-code management system - no staging area?

Posted Nov 19, 2022 15:01 UTC (Sat) by kleptog (subscriber, #1183) [Link] (5 responses)

Whether git requires a staging area internally is irrelevant; I consider it one of Git's greatest innovations and contributions to the world of VCSs. The staging area is simply a "subset of the changes between the current commit and the working tree". Often when working on a piece of code I find and fix other unrelated issues I run across, and then afterwards use "git add -p" to select which chunks I want in which commit. Trying to do this with Mercurial with the MQ extension was an exercise in frustration.

The staging area is useful precisely *because it is not a commit*. You can add chunks, remove chunks, edit chunks in preparation for commit and only at the last moment do you actually make the commit. When making significant changes, it's not always immediately apparent which parts go where and having a separate staging area helps manage this.

I'm not sure I could go back to a VCS without a staging area. Secret commits seem like a straitjacket in comparison.
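
The workflow described here, where the staging area is a subset of the pending changes that can be grown and shrunk until commit time, might be modelled like this. It is only a sketch: `Hunk` and `Worktree` are hypothetical names, and real hunks are of course diffs rather than one-line summaries.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hunk:
    path: str
    summary: str


@dataclass
class Worktree:
    changes: list                               # every hunk that differs from the current commit
    staged: list = field(default_factory=list)  # the chosen subset

    def stage(self, hunk):
        # Like picking a chunk in "git add -p": only real changes can be staged.
        if hunk in self.changes and hunk not in self.staged:
            self.staged.append(hunk)

    def unstage(self, hunk):
        # Changing your mind costs nothing; no commit exists yet.
        if hunk in self.staged:
            self.staged.remove(hunk)

    def commit(self):
        # Only now does the staged subset become a commit.
        committed = list(self.staged)
        self.changes = [h for h in self.changes if h not in committed]
        self.staged = []
        return committed


feature = Hunk("parser.py", "add new option")
typo = Hunk("README", "fix typo")
tree = Worktree(changes=[feature, typo])

tree.stage(feature)
tree.stage(typo)
tree.unstage(typo)  # keep the unrelated fix out of this commit
first = tree.commit()

tree.stage(typo)
second = tree.commit()

print(first)
print(second)
```

The unrelated typo fix ends up in its own commit without ever having been part of the first one.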

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 0:56 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (4 responses)

The MQ extension is informally deprecated, and probably would be formally deprecated if it didn't have a tiny sliver of use cases which changeset obsolescence does not cover (and which Git does not cover either, to my understanding).

> The staging area is useful precisely *because it is not a commit*. You can add chunks, remove chunks, edit chunks in preparation for commit and only at the last moment do you actually make the commit.

You can do all of those things with a secret commit, too. hg commit -i will happily prompt you for the precise chunks you want, let you edit them, etc., in exactly the same way as git add -p. The only difference is the terminology.

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 1:14 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (1 responses)

Just for clarity, here's the full equivalence:

When the staging area is empty:

* git commit does nothing, so it has no equivalent.
* git commit -a is equivalent to hg commit
* git add [file] is equivalent to hg commit --secret [file]
* git add -p is equivalent to hg commit -i
* git reset --mixed does nothing, so it has no equivalent.
* If a file is newly created or deleted, you have to run hg add/remove on it. hg forget will stop tracking a file without deleting it. This also applies to the nonempty case.

When the staging area is nonempty:

* git commit is equivalent to hg phase -d . (last argument is a dot and is the hg equivalent of HEAD)
* git commit -a is equivalent to hg phase -d . && hg amend (commands can be run in either order)
* git add [file] is equivalent to hg amend [file]
* git add -p is equivalent to hg amend -i
* git reset --mixed is equivalent to hg uncommit --no-keep
* Since the staging area has a description like any other commit, you might want to change it. hg amend -e will change the description, but also does a regular amend; you can pass additional arguments to tell it not to include any files in the amend, or make an alias for that if you need to do it frequently.

Bonus feature: You can stack multiple staging areas on top of each other, by using commit instead of amend. Git can't do that without using something like stash, which requires you to fiddle with an entirely different set of commands.
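
For quick reference, the two lists above can be written down as a lookup table. This Python sketch only transcribes the pairs claimed in this comment, with the `git add -p` entry for the empty case taken from the correction posted a minute later; the strings are not checked against Mercurial itself, and `EQUIVALENTS` and `hg_equivalent` are hypothetical names.

```python
# Pairs copied from the thread, not verified against Mercurial.
# Key: (git command, staging area is empty). Value: hg command, or None
# where the comment says the git command does nothing in that state.
EQUIVALENTS = {
    ("git commit", True): None,
    ("git commit -a", True): "hg commit",
    ("git add [file]", True): "hg commit --secret [file]",
    ("git add -p", True): "hg commit --secret -i",
    ("git reset --mixed", True): None,
    ("git commit", False): "hg phase -d .",
    ("git commit -a", False): "hg phase -d . && hg amend",
    ("git add [file]", False): "hg amend [file]",
    ("git add -p", False): "hg amend -i",
    ("git reset --mixed", False): "hg uncommit --no-keep",
}


def hg_equivalent(git_command, staging_empty):
    """Look up the hg command the thread pairs with a git command."""
    return EQUIVALENTS[(git_command, staging_empty)]


print(hg_equivalent("git add -p", staging_empty=True))
print(hg_equivalent("git commit", staging_empty=False))
```

The table makes the pattern visible: with nothing staged, the hg side creates a secret commit; with something staged, it amends that commit or changes its phase.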

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 1:15 UTC (Sun) by NYKevin (subscriber, #129325) [Link]

> * git add -p is equivalent to hg commit -i

Rather, hg commit --secret -i, assuming you still want to work on it some more.

Sapling source-code management system - no staging area?

Posted Nov 21, 2022 6:30 UTC (Mon) by roc (subscriber, #30627) [Link] (1 responses)

This.

There is simply no reason for the staging area to exist. If it were more fungible than a commit, that would be an indication to make commits more fungible, not to introduce an entirely new concept.

Sapling source-code management system - no staging area?

Posted Dec 6, 2022 15:54 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

I'd be interested in more formal mechanisms to replace what I'm using the index for in one project[1]. There, the index is used to skip actually checking out files to disk that don't need to be there for index operations, while still leaving a place for conflict files and other file-based operations to occur. But without such a mechanism, the index is *very* useful to me.

[1] https://gitlab.kitware.com/utils/rust-git-workarea

Sapling source-code management system - no staging area?

Posted Nov 19, 2022 15:21 UTC (Sat) by Wol (subscriber, #4433) [Link] (4 responses)

> In general, Git's attitude seems to be that high-level concepts like commits and rebases should be understood directly in terms of their low-level on-disk representations as trees, refs, etc. Git's on-disk representation uses a staging area, so therefore you have a staging area as part of the UI.

That has a MAJOR benefit. If your understanding of the abstraction is different from mine, there is no "source of truth" to put us right. With git, you just point to the on-disk structure and say "There!".

There is another MAJOR benefit. While I can't speak to the stats, higher Mathematics requires the ability of abstract thought. Somewhere I came across the "fact", that people acquire this ability about age 14, and maybe *less than half* the population EVER acquire it. In other words, you have to be above average to understand how Mercurial works? Even worse, there's no source of truth to tell you whether you're right?

(If you remember that long screed about Pick and Relational, we have exactly the same thing - Pick may be abstract but it is heavily defined in how it maps to disk structures. Relational is defined in mathematical tuples and how it works is "ignore that man behind the curtain". Things are so much easier to understand when they map to real-world concepts you can build on.)

Cheers,
Wol

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 0:58 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (3 responses)

> That has a MAJOR benefit. If your understanding of the abstraction is different from mine, there is no "source of truth" to put us right. With git, you just point to the on-disk structure and say "There!".

The program's behavior is the source of truth. The abstraction is what it is, no more and no less.

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 8:40 UTC (Sun) by Wol (subscriber, #4433) [Link] (2 responses)

The point behind Relational, though, is that you're not supposed to know about the program - detailed behaviour especially ... and what happens if there are multiple implementations ...

Cheers,
Wol

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 15:49 UTC (Sun) by kleptog (subscriber, #1183) [Link] (1 responses)

That's something lecturers like to tell you at university and what academics like to write about in papers. Out in the real world you need to know which database you're using because SQL doesn't standardise many important things. Like whether an empty string is NULL or not. Indexes are completely implementation specific. You need to know if the implementation you're using is smart enough to reorder left/inner/outer/semi/anti-joins or whether you have to figure it out yourself.

Basically, SQL is a language with many dialects. No large application can ignore the characteristics of the specific implementation they're using.
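
The empty-string point is easy to check against any one engine. The sketch below uses Python's bundled sqlite3 module, so it shows SQLite's behavior only; the table name `t` and column `name` are made up for the example. As far as I know, Oracle Database is the usual example of an engine that goes the other way and treats an empty string as NULL.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")
conn.executemany(
    "INSERT INTO t (name) VALUES (?)",
    [("",), (None,), ("x",)],  # an empty string, a NULL, and an ordinary value
)

# SQLite keeps '' and NULL distinct, so each query matches exactly one row.
null_count = conn.execute("SELECT COUNT(*) FROM t WHERE name IS NULL").fetchone()[0]
empty_count = conn.execute("SELECT COUNT(*) FROM t WHERE name = ''").fetchone()[0]
conn.close()

print(null_count, empty_count)
```

Code that relies on the two counts being separate would behave differently on an engine that folds empty strings into NULL, which is the kind of implementation detail the comment is talking about.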

Sapling source-code management system - no staging area?

Posted Nov 21, 2022 8:34 UTC (Mon) by Wol (subscriber, #4433) [Link]

> Basically, SQL is a language with many dialects. No large application can ignore the characteristics of the specific implementation they're using.

This. Because Pick *expects* you to know the characteristics of the database, the reality is that they're all very similar. We did a major port between two different implementations once, and the bulk of the work was just *tweaking* the DataBASIC so it compiled on the new system. (That plus QA, of course.)

Now I'm working on yet another different implementation, I'm not noticing any real differences. The biggest, off the top of my head, is the lack of the SEQUENTIAL file type (a table optimised for sequential numeric keys). I guess the standard dynamic hash has improved ...

Cheers,
Wol