
Meta's Sapling source-code management system

Meta has announced the open-source release of part of its internal source-code management system, called Sapling.

Sapling began 10 years ago as an initiative to make our monorepo scale in the face of tremendous growth. Public source control systems were not, and still are not, capable of handling repositories of this size. Breaking up the repository was also out of the question, as it would mean losing monorepo’s benefits, such as simplified dependency management and the ability to make broad changes quickly. Instead, we decided to go all in and make our source control system scale.

Starting as an extension to the Mercurial open source project, it rapidly grew into a system of its own with new storage formats, wire protocols, algorithms, and behaviors. Our ambitions grew along with it, and we began thinking about how we could improve not only the scale but also the actual experience of using source control.

At this point, only the client side of the system has been released; the company "hopes to" release the rest later.



Meta's Sapling source-code management system

Posted Nov 16, 2022 15:26 UTC (Wed) by rjek (subscriber, #94501) [Link] (1 responses)

I can never help but read "Monorepo" as it is sung in The Simpsons classic "Monorail".

Meta's Sapling source-code management system

Posted Nov 17, 2022 15:05 UTC (Thu) by ldearquer (guest, #137451) [Link]

Now that's two of us :)

Meta's Sapling source-code management system

Posted Nov 16, 2022 16:15 UTC (Wed) by eplanit (guest, #121769) [Link] (17 responses)

I'm not so impressed by the results of Meta's "ambition" so far. Maybe it's a good VCS, but it seems to come from quite a self-aggrandizing attitude: our problems are so unique, and solvable by nobody else except ourselves.

Meta's Sapling source-code management system

Posted Nov 16, 2022 16:27 UTC (Wed) by mgk (guest, #74833) [Link]

... you beat me to it. +1

Meta's Sapling source-code management system

Posted Nov 16, 2022 17:12 UTC (Wed) by Sesse (subscriber, #53779) [Link] (1 responses)

Large monorepos have been solved many times before, but I don't know if anyone else has published their solutions?

I'm not really sure what the client alone is good for, though. I assume there's no public server?

Meta's Sapling source-code management system

Posted Nov 16, 2022 17:18 UTC (Wed) by geofft (subscriber, #59789) [Link]

The client is capable of cloning Git repos - that's the example that they show in the blog post. I think the blog post is a little confusing because they've rolled up a whole bunch of good ideas into a single system (which is understandable: that's the system they were using internally). My reading of the blog post is that you can see some of the good ideas, like the way the CLI shows history and handles stacked changes, with Sapling pointed at a Git repo, but other good ideas, like whatever they've done to address large monorepos, require pointing it at a Sapling server.

The blog post says "You can now try its various features using Sapling’s built-in Git support to clone any of your existing repositories." and "Many of our scale features require using a Sapling-specific server and are therefore unavailable in our initial client release."

Note that in addition to the Sapling CLI, they're also releasing ReviewStack, an alternative user interface for reviewing GitHub pull requests. The code appears to be in the Sapling repo, and there's also a public instance of it at https://reviewstack.dev .

Meta's Sapling source-code management system

Posted Nov 16, 2022 17:13 UTC (Wed) by geofft (subscriber, #59789) [Link] (13 responses)

They're not terribly unique in reaching that conclusion, though, nor are they unique in trying to solve their own VCS scaling problems internally. Google has their own deeply proprietary VCS. Microsoft has a highly-customized version of Git, most of which they've recently upstreamed. My own employer, which is much smaller than either, has custom tooling (which we've open-sourced, but to my knowledge nobody else has adopted) because until about two years ago, thanks to work by GitHub (Microsoft), GitLab, and others, pure upstream Git wasn't even in the running.

I'm hoping the end result of this is something similar to what Microsoft did with Scalar - the good ideas from it get merged into upstream Git, instead of it becoming yet another standalone VCS.

This is also a pathway that Meta themselves is familiar with: Instagram released their internal CPython fork Cinder https://github.com/facebookincubator/cinder not because they want people to use Cinder itself but because they want it as a public base of discussion to upstream the good ideas into actual CPython, so they can eventually drop the fork.

Meta's Sapling source-code management system

Posted Nov 17, 2022 3:56 UTC (Thu) by bartoc (guest, #124262) [Link] (1 responses)

(I work for Microsoft)

Yeah, the up-streamed stuff from gvfs/scalar is also pretty vastly improved over what was in the old fork. sparse index/worktree/clones are way better than gvfs because you don't need to worry about some random program enumerating the git repo (including getting file sizes) and causing gvfs to download everything. I had both TortoiseGit and WinDirStat do this, it's quite annoying.

Something not in git upstream that I would love to see is a way to automatically symlink/junction git submodules (the ones in .git/modules) to some central area. scalar (the one from git-for-windows) does seem to _somehow_ do this, I think using the alternates mechanism and a shim clone/fetch command, but it's not super clean. I would be happy just getting modules pointing to exactly the same initial remote pointing somewhere common; it would at least make it harder to end up with 10 different copies of LLVM's repo on my machine.

Oh, another pretty easy win (on linux, at least, but perhaps on windows and mac with a compatibility shim) would be teaching git-checkout-index to use copy_file_range when available. The first "chunk" of the (unpacked) git object doesn't match the first "chunk" of the checked-out file so it's kinda filesystem specific if this works. And ofc on windows even if you have a compat shim almost nobody can use it because ReFS/btrfs/zfs are not widely used (I think those are all the CoW filesystems with windows implementations).
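To make the copy_file_range idea concrete, here is a minimal Python sketch of the syscall as exposed by `os.copy_file_range` (Linux, Python 3.8+). It is not how git-checkout-index works: real Git objects are compressed on disk, so the sketch assumes a hypothetical uncompressed object whose payload follows a fixed-length header, which is the "first chunk doesn't match" problem mentioned above. The function name `copy_payload` and the `header_len` parameter are made up for illustration.

```python
import os


def copy_payload(src_path, dst_path, header_len=0, chunk=1 << 20):
    """Copy src_path, minus its first header_len bytes, into dst_path.

    Tries os.copy_file_range first (the kernel may share extents on CoW
    filesystems) and falls back to plain read/write if it is unavailable
    or the filesystem refuses it. Returns the number of bytes copied.
    """
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = header_len
        remaining = max(os.fstat(src).st_size - header_len, 0)
        fast = hasattr(os, "copy_file_range")
        while remaining > 0:
            if fast:
                try:
                    n = os.copy_file_range(src, dst, remaining, offset_src=offset)
                except OSError:
                    fast = False  # unsupported here; retry with read/write
                    continue
            else:
                os.lseek(src, offset, os.SEEK_SET)
                data = os.read(src, min(chunk, remaining))
                n = len(data)
                written = 0
                while written < n:
                    written += os.write(dst, data[written:])
            if n == 0:
                break  # source ended early; stop rather than loop forever
            offset += n
            remaining -= n
        return offset - header_len
    finally:
        os.close(src)
        os.close(dst)
```

Whether the kernel can actually share blocks when the source offset is not block-aligned depends on the filesystem, which is the caveat raised in the comment.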

Meta's Sapling source-code management system

Posted Nov 19, 2022 4:00 UTC (Sat) by sionescu (subscriber, #59410) [Link]

> sparse index/worktree/clones are way better than gvfs because you don't need to worry about some random program enumerating the git repo (including getting file sizes) and causing gvfs to download everything. I had both TortoiseGit and WinDirStat do this, it's quite annoying.

That would be throwing the baby out with the bathwater. Having a single total view of the repo has so many benefits that if some devtool can't cope with the size, then it's time to blacklist or fix it.

Meta's Sapling source-code management system

Posted Nov 17, 2022 10:43 UTC (Thu) by nysan (guest, #81015) [Link] (10 responses)

In the end, all monorepos end up as what clearcase used to be.
A big configspec to select different versions from different subdirectories of the monorepo. And then comes a corporate merger, and now you have two monorepos. :-O

Compare above with an google-repo XML file in a git repo, describing multiple sub-git-repos.

It's essentially the same thing. Monorepo is just way worse.

Meta's Sapling source-code management system

Posted Nov 17, 2022 12:20 UTC (Thu) by khim (subscriber, #9252) [Link] (7 responses)

> In the end, all monorepos end up into what clearcase used to be. A big configspec to select different versions from different subdirectories of the monorepo.

How much time does this “end” need? AFAIK Google's hasn't devolved into that.

The trick is simple: don't provide means to combine two versions. Period. If you need two versions of some third-party code for some reason then you just create two directories. Like Python2 vs Python3 difference was handled in the linux distros for years, too.

> And then comes a corporate merger, and now you have two monorepos. :-O

Why is that a problem? As long as you don't start weird automerger schemes and just treat code from another repo as “third party” and import code in the appropriate fashion everything works. Google does that with abseil AFAIK.

> Compare above with an google-repo XML file in a git repo, describing multiple sub-git-repos.

repo exists because Google needed something similar to a monorepo but open-sourced. I deal with it on my $DAYJOB. It kinda-sorta works but is just so flaky, cumbersome and unreliable compared to a normal monorepo.

Meta's Sapling source-code management system

Posted Nov 17, 2022 12:51 UTC (Thu) by nysan (guest, #81015) [Link] (6 responses)

"The trick is simple: don't provide means to combine two versions. Period."

OK, so you have 40 ppl working on feature X, and 40 ppl working on feature Y.
In the end, you need to integrate and test X and Y together, since they are dependent.
Merging X first, and Y second does compile, but can't be integration-tested.

How would you do this, in case you only allow a single CM version in the monorepo ?

Meta's Sapling source-code management system

Posted Nov 17, 2022 14:07 UTC (Thu) by pkolloch (subscriber, #21709) [Link]

Incremental integration with feature flags.
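One reading of "incremental integration with feature flags", sketched in Python: both teams commit to trunk continuously, but each feature's code path sits behind a flag that defaults to off, so X and Y can be switched on together in an integration test before either is enabled for users. The `Flags` class, the flag names, and `handle_request` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Flags:
    """Hypothetical in-memory flag store; every flag defaults to off."""
    enabled: set = field(default_factory=set)

    def is_on(self, name: str) -> bool:
        return name in self.enabled


def handle_request(flags: Flags) -> list:
    """Trunk code that contains both unfinished features, each gated."""
    steps = ["legacy path"]
    if flags.is_on("feature_x"):
        steps.append("feature X")
    # Y depends on X, so it only activates when both flags are on.
    if flags.is_on("feature_x") and flags.is_on("feature_y"):
        steps.append("feature Y")
    return steps


production = Flags()  # nothing enabled for users yet
integration_test = Flags(enabled={"feature_x", "feature_y"})
```

With this shape, the 80 people all commit to the same trunk, and "integrating X and Y together" means running the test suite with both flags on rather than merging two branches.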

Meta's Sapling source-code management system

Posted Nov 17, 2022 14:36 UTC (Thu) by khim (subscriber, #9252) [Link] (4 responses)

> Merging X first, and Y second does compile, but can't be integration-tested.

If that's a monorepo then there is no merging. Individual commits are merged, of course (you cannot have a few thousand people working on the same code and not have some conflicts), but features are never implemented in branches.

Branches are for bugfixes.

Android couldn't follow that model 100% because of organisational issues, but it tries.

> OK, so you have 40 ppl working on feature X, and 40 ppl working on feature Y.

That just means that 80 people are committing to the trunk, what's the problem?

Meta's Sapling source-code management system

Posted Nov 17, 2022 22:06 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (3 responses)

> Branches are for bugfixes.

Not even. At least in my experience, branches are for releases. You branch at a point where the build is green, do any cherrypicks that you need, cut a release on the branch, and that's it. No merging. The branch just gets abandoned (maybe we GC it eventually?).

The corollary to this: If your code does not run at HEAD, that's your problem. You cannot make your own private branch where you use some ancient version of libfoo that nobody else is willing to support. When libfoo updates, everyone is expected to update with it, or else your code stops building (and, eventually, stops running in production). Depending on the size and reasonableness of the breakage, the people who maintain (and/or vendor) libfoo will probably be expected to help you transition to the new version, or even to do it for you, but you can't just say "we like the old version better" and expect that to end the discussion.

The corollary to the corollary: You really want to have good test coverage, because the libfoo maintainers can't be reasonably expected to find the breakage if the tests all pass (or if there are no tests).

Meta's Sapling source-code management system

Posted Nov 17, 2022 22:25 UTC (Thu) by khim (subscriber, #9252) [Link]

This is similar to crater run, I guess.

Only crater run ensures that compiler can be updated (and not other libraries) while in monorepo everything is supposed to work like that (but you can also update all the clients, which is the whole reason it's a monorepo).

Meta's Sapling source-code management system

Posted Nov 29, 2022 3:37 UTC (Tue) by brooksmoses (guest, #88422) [Link]

Yup; in my experience the "branches are for bugfixes" comes up when you need to fast-track a very specific bugfix, and so you cherrypick it onto the release branch of the existing release and then make a new release from that branch.

Meta's Sapling source-code management system

Posted Nov 29, 2022 13:59 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

Hmm. We use topic branches for *all* development (there are a few exceptions; mainly automatic development version number bumps, but nothing manual). Branches for releases are `-s ours` merged into more recent branches (this preserves an "all history is reachable from HEAD" property and means we can trivially resurrect any old branch for maintenance as needed). But we also have strict vendoring rules and mangle everything to avoid conflicts with anything that could be loaded in the same process (such is life when you make SDK-like things, not end-user products).

Meta's Sapling source-code management system

Posted Nov 17, 2022 22:52 UTC (Thu) by bartoc (guest, #124262) [Link] (1 responses)

I broadly agree (also, note that if you have a multi-repo scheme like this you can use git namespaces to keep the separate heads, branches, and tags while using the same object store and thus possibly deduplicating more things).

However, I think repo is .... not that good. It's mostly submodules plus some features that are almost always not a good idea. I don't think this is really true of repo given its age and Gerrit integration, but a lot of these tools feel like someone reading that submodules were problematic somewhere and just reinventing them without really understanding them. Basically all the criticisms of submodules have easy solutions or are just misunderstandings about how git works.

Meta's Sapling source-code management system

Posted Nov 18, 2022 1:09 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

Eh. We use submodules and I still don't like them. I don't think there are *better* solutions that are as easy to use when they are updated "often enough" (subtree extraction/merging works for "infrequent" updates). The lack of easy sharing between worktrees and local forks is painful as well when coupled with poor support for shallow cloning the things. My biggest gripe is `git archive` just punting instead of doing anything useful. `git-archive-all` is better, but still doesn't handle the cornercases that we end up hitting (custom attributes are only supported at the top-level; we can't export-ignore either because we need to query for other attributes).

The solutions exist, but are spread out, not easy to stitch together, or just end up being custom code.

Meta's Sapling source-code management system

Posted Nov 16, 2022 17:17 UTC (Wed) by q_q_p_p (guest, #131113) [Link] (2 responses)

At least it uses GPL license, too bad it conflicts with sl - decades of muscle training to not make that typo wasted :)

Meta's Sapling source-code management system

Posted Nov 16, 2022 17:47 UTC (Wed) by Sesse (subscriber, #53779) [Link] (1 responses)

sudo apt install sl? :-)

Meta's Sapling source-code management system

Posted Nov 16, 2022 18:08 UTC (Wed) by q_q_p_p (guest, #131113) [Link]

yeah, sapling ( https://github.com/facebook/sapling#sapling-cli ) conflicts with it ;)

Meta's Sapling source-code management system

Posted Nov 16, 2022 18:20 UTC (Wed) by IanKelling (subscriber, #89418) [Link] (7 responses)

This is a fork of Mercurial, a GPLv2 program that accepts contributions under GPLv2. Yet, facebook is requiring contributions under a CLA which gives them a permissive license on contributions. They accepted a huge body of code without those conditions when they forked.

https://engineering.fb.com/2022/11/15/open-source/sapling...

"I’d also like to thank the Mercurial open source community for all their collaboration and inspiration" but this is also an announcement that their contributions are no longer welcome under GPL as they were before. Unless I'm missing something, that seems like a rather backhanded thank you.

Meta's Sapling source-code management system

Posted Nov 17, 2022 10:42 UTC (Thu) by paulj (subscriber, #341) [Link] (6 responses)

I'm not sure this is a fork of the mercurial code-base.

The internal history of this, as I understood it (a couple of years out of date, and not near any team responsible - just as a user), is that FB started with mercurial. They then had to heavily customise hg to make it work at the ever greater scales they had internally with more and more code and developers working on it. Until they had effectively completely rewritten the back-end to use a Facebook specific, distributed object store - that's the "eden" bit in the source code I think (maybe simplified / pared-down for external use, I don't know). The front-end hg tools I think were heavily modified too. Sl however is a from scratch rewrite. I think it started out as a wrapper around the hg tools, but grew into a standalone front-end. There was also an effort to reimplement the back-end in Rust, along with front-end tooling for that - I think that's the "Mononoke" bit in the code, IIRC.

Last I remember, I /think/ the developer workflow still had some odd cases where you needed to use the hg commands, but for nearly all stuff you could use sl for your daily work-flow.

I presume that progressed to the stage where the completely rewritten Facebook^WMeta front-end + backend, sl / mononoke, can do everything itself, and is feature complete - and hence this can be released as "sapling" (retro-fitted name).

Meta's Sapling source-code management system

Posted Nov 17, 2022 10:47 UTC (Thu) by paulj (subscriber, #341) [Link] (1 responses)

Oh, and 'sl' was pretty cool and useful. I liked it.

I'd be a bit sceptical of using Facebook stuff outside of FB. There are internal brownie points for releasing stuff as open-source perhaps, but there are few to none for taking the time to maintain open-source stuff. Also, the internal culture is to build everything from a mono-repo, and have no concern for backward compatibilities (other than the non-atomic roll outs of binaries/artifacts from a build from said mono-repo). So I'd hate to depend on FB code outside of FB. In particular, the FB C++ library (folly) maintainers explicitly are hostile to attempts to make life easier for maintaining code out of FBCode that depends on their stuff.

Meta's Sapling source-code management system

Posted Nov 29, 2022 12:25 UTC (Tue) by scientes (guest, #83068) [Link]

> I'd be a bit sceptical of using Facebook stuff outside of FB.

Except ZSTD.

Meta's Sapling source-code management system

Posted Nov 17, 2022 15:28 UTC (Thu) by IanKelling (subscriber, #89418) [Link] (3 responses)

> I'm not sure this is a fork of the mercurial code-base.

I downloaded the repo. It is a copy of the mercurial repo from 2005 onward until it forks.

Meta's Sapling source-code management system

Posted Nov 17, 2022 16:45 UTC (Thu) by paulj (subscriber, #341) [Link] (2 responses)

Which code-base are you looking at?

I'm looking at https://github.com/facebook/sapling and - at a high-level anyway - I don't see anything from mercurial in the head/tip, nor in the history. I know internally that Mononoke was a from-scratch rewrite. And the Eden object store and SCM backend isn't from mercurial either.

I could be confused, but can you be more explicit about what part of Sapling started out as mercurial code?

Meta's Sapling source-code management system

Posted Nov 18, 2022 13:52 UTC (Fri) by IanKelling (subscriber, #89418) [Link] (1 responses)

Yes, in that repo you link to, if you do git log --stat and go to the end of the output, there are thousands of commits to mercurial starting in 2005. What history are you looking at? It isn't definitive, it will take a detailed analysis, but facebook should be the one clearly explaining whether mercurial copyrights are still a part of this codebase; that is another wrong they have done here. Also, you said "rewritten" as if that makes it no longer a fork, but that in no way excludes it from being a fork of the same program https://en.wikipedia.org/wiki/Derivative_work.

Meta's Sapling source-code management system

Posted Nov 18, 2022 14:15 UTC (Fri) by paulj (subscriber, #341) [Link]

Yeah, sorry, you're right. The fb mercurial repo got merged in, to a subdir at some point - in gitk that happens in the middle of the history, and I had looked at top and bottom. :)

The mercurial code is there at: https://github.com/facebook/sapling/tree/main/eden/scm/ed... - maybe in other places.

Sapling source-code management system - no staging area?

Posted Nov 16, 2022 19:24 UTC (Wed) by sdalley (subscriber, #18550) [Link] (15 responses)

Hm, it doesn't have a staging area, and touts this as an advantage. I must say, losing this feature now that I've come to appreciate it in git really feels like a backward step...

Sapling source-code management system - no staging area?

Posted Nov 17, 2022 0:05 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (14 responses)

That's a Mercurialism. Sapling inherited it.

In general, Git's attitude seems to be that high-level concepts like commits and rebases should be understood directly in terms of their low-level on-disk representations as trees, refs, etc. Git's on-disk representation uses a staging area, so therefore you have a staging area as part of the UI. Mercurial does not do this. In Mercurial, the on-disk representation is considered an implementation detail, subject to revision at any time, and you are expected to understand commits as primitive objects. A staging area would be redundant to what Mercurial calls a "secret commit" (i.e. a commit that you don't intend to push, and that the tooling will prevent you from pushing accidentally), so Mercurial does not supply a staging area, even though there is some denormalization under the hood. This is a relatively small difference of opinion, but an important one.

Sapling source-code management system - no staging area?

Posted Nov 17, 2022 5:16 UTC (Thu) by jthill (subscriber, #56558) [Link] (2 responses)

Git's not abstract, it's concrete, that's true; it uses abstractions to help understand and describe what's possible, not to limit it. Pretending that that's some sort of abstract principle rather than for concrete benefit is rather spectacularly missing the point.

Sapling source-code management system - no staging area?

Posted Nov 17, 2022 17:54 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (1 responses)

> Pretending that that's some sort of abstract principle rather than for concrete benefit is rather spectacularly missing the point.

Ironically, this sentence is too abstract, and I have no idea what you are talking about.

Sapling source-code management system - no staging area?

Posted Nov 11, 2024 18:58 UTC (Mon) by jthill (subscriber, #56558) [Link]

Okay, this might be such an extreme necro it qualifies as actually weird but trying to find words for a reply to this has been niggling at my hindbrain all this time. That acknowledged,

Git is: a dag of snapshots plus annotated tags in the object db; local refs; and an index for tracking work on (often constructing new) snapshots. That's it. Everything else, everything else, is in whatever's-useful-in-your-work territory.

There's software design that starts with some perceived ideal/need and jumps straight to abstractions which are then explained and implemented, this is the root of the "implementation details don't matter" view of software, the "abstract principle first" sort of design that views any behavior not covered by the abstraction as aberrant, egregious.

Then there's software design that starts with basically a data structure and asks "what use can be made of it", it might start out as a design for a perceived need but abstractions are just ways of talking about the effects you can get.

What I'm saying is: Git's the second kind. Anything you can do with a dag of snapshots and re-hanging local labels, you can do with Git. The people who want definitive and elegant abstractions tend to express distaste for this, they'll call Git's UI a leaky abstraction and get more pejorative from there. And I think that's where they're entirely missing the point. Git's a tool, a data structure plus commands to work with it. The Git interface uses abstractions to talk about the useful things you can do, not to define what's proper.
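As a rough illustration of that "data structure first" description, here is a toy Python model with exactly the three pieces named above: an object store holding a DAG of snapshots, local refs, and an index. It is a sketch only; the class name `TinyRepo` is invented, and hashing a JSON record with SHA-1 is a stand-in for Git's real object format, not a description of it.

```python
import hashlib
import json


class TinyRepo:
    """Toy model: content-addressed snapshots, movable refs, and an index."""

    def __init__(self):
        self.objects = {}  # object id -> snapshot record
        self.refs = {}     # local label -> object id
        self.index = {}    # path -> content staged for the next snapshot

    def add(self, path, content):
        """Record a path in the index; nothing is snapshotted yet."""
        self.index[path] = content

    def commit(self, ref, message):
        """Snapshot the index, link it to the ref's current target, move the ref."""
        parent = self.refs.get(ref)
        record = {
            "tree": dict(self.index),
            "parents": [parent] if parent else [],
            "message": message,
        }
        oid = hashlib.sha1(json.dumps(record, sort_keys=True).encode()).hexdigest()
        self.objects[oid] = record
        self.refs[ref] = oid  # "re-hang" the label on the new snapshot
        return oid

    def history(self, ref):
        """Walk first parents from a ref and return the commit messages."""
        messages = []
        oid = self.refs.get(ref)
        while oid:
            record = self.objects[oid]
            messages.append(record["message"])
            oid = record["parents"][0] if record["parents"] else None
        return messages


repo = TinyRepo()
repo.add("a.txt", "1")
first = repo.commit("main", "first")
repo.add("b.txt", "2")
second = repo.commit("main", "second")
repo.refs["topic"] = first  # a branch is just another label on the DAG
```

Everything else (rebases, cherry-picks, stashes) can be read as some combination of creating snapshots and moving labels in a structure like this one.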

Sapling source-code management system - no staging area?

Posted Nov 19, 2022 15:01 UTC (Sat) by kleptog (subscriber, #1183) [Link] (5 responses)

Whether git requires a staging area internally is irrelevant, I consider it one of Git's greatest innovations and contributions to the world of VCSs. The staging area is simply a "subset of the changes between the current commit and the working tree". Often when working on a piece of code I find and fix other unrelated issues I run across. And then afterwards use "git add -p" to select which chunks I want in which commit. Trying to do this with Mercurial with the MQ extension was an exercise in frustration.

The staging area is useful precisely *because it is not a commit*. You can add chunks, remove chunks, edit chunks in preparation for commit and only at the last moment do you actually make the commit. When making significant changes, it's not always immediately apparent which parts go where and having a separate staging area helps managing this.

I'm not sure I could go back to a VCS without a staging area. Secret commits seem like a straitjacket in comparison.

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 0:56 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (4 responses)

The MQ extension is informally deprecated, and probably would be formally deprecated if it didn't have a tiny sliver of use cases which changeset obsolescence does not cover (and which Git does not cover either, to my understanding).

> The staging area is useful precisely *because it is not a commit*. You can add chunks, remove chunks, edit chunks in preparation for commit and only at the last moment do you actually make the commit.

You can do all of those things with a secret commit, too. hg commit -i will happily prompt you for the precise chunks you want, let you edit them, etc, in exactly the same way as git add -p. The only difference is the terminology.

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 1:14 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (1 responses)

Just for clarity, here's the full equivalence:

When the staging area is empty:

* git commit does nothing, so it has no equivalent.
* git commit -a is equivalent to hg commit
* git add [file] is equivalent to hg commit --secret [file]
* git add -p is equivalent to hg commit -i
* git reset --mixed does nothing, so it has no equivalent.
* If a file is newly created or deleted, you have to run hg add/remove on it. hg forget will stop tracking a file without deleting it. This also applies to the nonempty case.

When the staging area is nonempty:

* git commit is equivalent to hg phase -d . (last argument is a dot and is the hg equivalent of HEAD)
* git commit -a is equivalent to hg phase -d . && hg amend (commands can be run in either order)
* git add [file] is equivalent to hg amend [file]
* git add -p is equivalent to hg amend -i
* git reset --mixed is equivalent to hg uncommit --no-keep
* Since the staging area has a description like any other commit, you might want to change it. hg amend -e will change the description, but also does a regular amend; you can pass additional arguments to tell it not to include any files in the amend, or make an alias for that if you need to do it frequently.

Bonus feature: You can stack multiple staging areas on top of each other, by using commit instead of amend. Git can't do that without using something like stash, which requires you to fiddle with an entirely different set of commands.

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 1:15 UTC (Sun) by NYKevin (subscriber, #129325) [Link]

> * git add -p is equivalent to hg commit -i

Rather, hg commit --secret -i, assuming you still want to work on it some more.
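For reference, the mapping from the parent comment, with this correction applied, can be written down as a small lookup table. This Python sketch only restates the claims made in the thread; the command strings are copied from the comments and have not been checked against a particular Mercurial version, and the function name `hg_equivalent` is made up.

```python
# (staging-area state, git command) -> hg command, as listed in the thread.
EQUIVALENTS = {
    ("empty", "git commit -a"): "hg commit",
    ("empty", "git add [file]"): "hg commit --secret [file]",
    ("empty", "git add -p"): "hg commit --secret -i",  # corrected form
    ("nonempty", "git commit"): "hg phase -d .",
    ("nonempty", "git commit -a"): "hg phase -d . && hg amend",
    ("nonempty", "git add [file]"): "hg amend [file]",
    ("nonempty", "git add -p"): "hg amend -i",
    ("nonempty", "git reset --mixed"): "hg uncommit --no-keep",
}


def hg_equivalent(git_command, staging_area_empty):
    """Return the hg command the thread pairs with git_command, or None.

    None covers the cases the comment describes as doing nothing, such as
    `git commit` or `git reset --mixed` with an empty staging area.
    """
    state = "empty" if staging_area_empty else "nonempty"
    return EQUIVALENTS.get((state, git_command))
```

The table makes the underlying point visible: in this scheme the "staging area" is just a secret commit, so the same git command maps to `commit` when no such commit exists yet and to `amend` when one does.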

Sapling source-code management system - no staging area?

Posted Nov 21, 2022 6:30 UTC (Mon) by roc (subscriber, #30627) [Link] (1 responses)

This.

There is simply no reason for the staging area to exist. If it was more fungible than a commit, that would be an indication to make commits more fungible, not to introduce an entirely new concept.

Sapling source-code management system - no staging area?

Posted Dec 6, 2022 15:54 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

I'd be interested in more formal mechanisms to replace what I'm using the index for in one project[1] where the index is used to skip actually checking out files to disk that don't need to be there for index operations while still leaving a place for conflict files and other file-based operations to occur. But without such a mechanism, the index is *very* useful to me.

[1]https://gitlab.kitware.com/utils/rust-git-workarea

Sapling source-code management system - no staging area?

Posted Nov 19, 2022 15:21 UTC (Sat) by Wol (subscriber, #4433) [Link] (4 responses)

> In general, Git's attitude seems to be that high-level concepts like commits and rebases should be understood directly in terms of their low-level on-disk representations as trees, refs, etc. Git's on-disk representation uses a staging area, so therefore you have a staging area as part of the UI.

That has a MAJOR benefit. If your understanding of the abstraction is different from mine, there is no "source of truth" to put us right. With git, you just point to the on-disk structure and say "There!".

There is another MAJOR benefit. While I can't speak to the stats, higher Mathematics requires the ability of abstract thought. Somewhere I came across the "fact", that people acquire this ability about age 14, and maybe *less than half* the population EVER acquire it. In other words, you have to be above average to understand how Mercurial works? Even worse, there's no source of truth to tell you whether you're right?

(If you remember that long screed about Pick and Relational, we have exactly the same thing - Pick may be abstract but it is heavily defined in how it maps to disk structures. Relational is defined in mathematical tuples and how it works is "ignore that man behind the curtain". Things are so much easier to understand when they map to real-world concepts you can build on.)

Cheers,
Wol

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 0:58 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (3 responses)

> That has a MAJOR benefit. If your understanding of the abstraction is different from mine, there is no "source of truth" to put us right. With git, you just point to the on-disk structure and say "There!".

The program's behavior is the source of truth. The abstraction is what it is, no more and no less.

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 8:40 UTC (Sun) by Wol (subscriber, #4433) [Link] (2 responses)

The point behind Relational, though, is that you're not supposed to know about the program - detailed behaviour especially ... and what happens if there are multiple implementations ...

Cheers,
Wol

Sapling source-code management system - no staging area?

Posted Nov 20, 2022 15:49 UTC (Sun) by kleptog (subscriber, #1183) [Link] (1 responses)

That's something lecturers like to tell you at university and what academics like to write about in papers. Out in the real world you need to know which database you're using because SQL doesn't standardise many important things. Like whether an empty string is NULL or not. Indexes are completely implementation specific. You need to know if the implementation you're using is smart enough to reorder left/inner/outer/semi/anti-joins or whether you have to figure it out yourself.

Basically, SQL is a language with many dialects. No large application can ignore the characteristics of the specific implementation they're using.

Sapling source-code management system - no staging area?

Posted Nov 21, 2022 8:34 UTC (Mon) by Wol (subscriber, #4433) [Link]

> Basically, SQL is a language with many dialects. No large application can ignore the characteristics of the specific implementation they're using.

This. Because Pick *expects* you to know the characteristics of the database, the reality is that they're all very similar. We did a major port between two different implementations once, and the bulk of the work was just *tweaking* the DataBASIC so it compiled on the new system. (That plus QA, of course.)

Now I'm working on yet another different implementation, I'm not noticing any real differences. The biggest, off the top of my head, is the lack of the SEQUENTIAL file type (a table optimised for sequential numeric keys). I guess the standard dynamic hash has improved ...

Cheers,
Wol

Meta's Sapling source-code management system

Posted Nov 17, 2022 10:34 UTC (Thu) by nysan (guest, #81015) [Link]

Ugh,

monorepo is a bad idea for so many reasons.
https://gerrit.googlesource.com/git-repo anyone ?

Meta's Sapling source-code management system

Posted Nov 19, 2022 18:57 UTC (Sat) by ahornby (subscriber, #3366) [Link]

For a sneak peek at what the server and VFS are likely to be based on: https://github.com/facebook/sapling/tree/main/eden/mononoke and https://github.com/facebook/sapling/tree/main/eden/fs

No commit signing

Posted Nov 22, 2022 13:23 UTC (Tue) by zdzichu (subscriber, #17118) [Link]

My exploration of Sapling was cut short, at the first commit. There's no GPG signing implemented yet :(
There's an open issue #218 asking for that.


Copyright © 2022, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds