
GitHub?

Posted May 16, 2017 17:48 UTC (Tue) by mathstuf (subscriber, #69389)
In reply to: GitHub? by NightMonkey
Parent article: A proposal to move GNOME to GitLab

Importing projects into Github can be lossy. For example, issues and PRs share a namespace and there's no way to turn off PRs on a project. If someone opens a PR during an issue migration and your goal is to align issue numbers, you're forever off-by-one. I've imported 3 projects into Gitlab with 16000+ issues each[1] and the issue numbers today are the same as in the pre-Gitlab era (Mantis *shudder*).

Other benefits of Gitlab that I've liked and that Github doesn't have (or that I'm unaware of):

- ability to move issues between repositories (given enough permissions on both repos);
- "discussions" on MRs where users can say "yes, this comment has been addressed" keeping discussions concise;
- diffs between MR pushes are readily available (not just via a notification or email link);
- terminology: "merge request" is better than "pull request" IMO, since I don't care if you pull my code; I'd like it merged ;);
- self-hosting so that you can have admin access if that's necessary;
- add "TODO" items for issues and MRs; and
- TODO/notifications are not cleared just by visiting the target page (adding a comment clears it).

Things I'd like Gitlab to implement/enable:

- emails for MR updates;
- emails for your own actions;
- MR reviews are Enterprise-only (I believe there is a push to have them in CE as well);
- non-Gitlab-CI testing is a second-class citizen;
- forks between all repos in a "fork network" (I have a *really old* MR that does this, but it got stale due to a lack of time); and
- hook JSON objects are unstable and undocumented (see the parsing sketch at the end of this comment).

I can find issues/MR links for those interested in this list.

[1] They had a shared numbering namespace before.
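
As for the hooks point above: since the payload shape isn't a documented contract, anything consuming it has to parse defensively. A rough Python sketch (the field names here, like object_kind and object_attributes, are taken from payloads I've observed, not from a stable API):

    import json

    def parse_merge_request_hook(raw):
        """Pick out only the fields we need from a Gitlab webhook payload.

        The payload layout isn't a documented contract, so treat every
        field as optional and fail loudly when it moves.
        """
        payload = json.loads(raw)
        if payload.get("object_kind") != "merge_request":
            return None  # some other event type; ignore it
        attrs = payload.get("object_attributes") or {}
        try:
            return {
                "project_id": payload["project"]["id"],
                "mr_iid": attrs["iid"],
                "state": attrs["state"],
            }
        except KeyError as exc:
            raise ValueError("hook payload changed shape; missing %s" % exc)

Every consumer ends up re-deriving a mapping like this for itself, which is the real cost of the payloads being undocumented.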



GitHub?

Posted May 17, 2017 3:06 UTC (Wed) by raven667 (subscriber, #5198)

> non-Gitlab-CI testing is a second-class citizen;

I have been using the gitlab runner for very basic CI, using tito to build RPMs and createrepo on commit, but it looks like you could integrate with anything if you can drive it with scripts run from the gitlab runner account. Do you consider this an unusually messy way to do integration, and would you instead expect plugins for gitlab itself to drive the APIs of other CI tools? I'm interested in this area and just want to be explicit in my understanding of your critique.
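
Concretely, the glue I mean is nothing fancier than a script the runner executes. A minimal sketch (the publish directory is made up, and I'm relying on tito's --output flag to drop RPMs there):

    #!/usr/bin/env python
    """Minimal glue a gitlab runner job could execute: build an RPM with
    tito, then regenerate the yum metadata with createrepo. The publish
    directory is a placeholder for wherever the repo actually lives."""
    import subprocess

    REPO_DIR = "/srv/repos/myproject"  # hypothetical publish location

    def build_and_publish():
        # tito builds RPMs from the current checkout into REPO_DIR
        subprocess.check_call(["tito", "build", "--rpm", "--output", REPO_DIR])
        # regenerate repodata so yum/dnf clients see the new package
        subprocess.check_call(["createrepo", REPO_DIR])

    if __name__ == "__main__":
        build_and_publish()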

GitHub?

Posted May 17, 2017 12:37 UTC (Wed) by mathstuf (subscriber, #69389)

Our testing matrix has about 20 or so axes, 8 of which really matter and maybe 3 more which only apply if other options are set correctly. I don't know the best way to express such things in a flat YAML file, especially if some need extra things to be done on the machine itself (like having CUDA or a specially built Homebrew package available). The projects can also take up to 45 minutes to build and test, so we don't have the hardware to test every single MR that comes in (we also need real GPUs, so Docker containers or cloud machines are generally out). Instead, we use buildbot, mediated by another bot which checks for permission to start testing. This can be used to focus testing on only certain platforms or against builds which have settings a certain way (e.g., a Windows-specific change doesn't need to waste the macOS builder's time).
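
The shape of that gate is roughly this (a sketch only; the trigger phrase and allow-list handling are simplified stand-ins for what the bot actually does):

    # Rough shape of the gating bot: only schedule builds when an allowed
    # user asks, optionally narrowed to specific platforms. The command
    # syntax and allow-list here are invented for illustration.

    ALLOWED_TESTERS = {"alice", "bob"}
    KNOWN_PLATFORMS = {"linux", "macos", "windows"}

    def handle_comment(author, body, schedule):
        """schedule(platform) submits a build to the matching builder."""
        if author not in ALLOWED_TESTERS:
            return  # no permission; ignore the request
        if not body.startswith("Do: test"):
            return  # not a test request
        requested = set(body.split()[2:]) & KNOWN_PLATFORMS
        # e.g. "Do: test windows" skips the macOS builders entirely
        for platform in requested or KNOWN_PLATFORMS:
            schedule(platform)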

On the Gitlab side, if you're using a fork-based workflow (as we are), you need Developer access to a repo to set commit statuses, so our main bot (which runs with admin privileges for other Gitlab-specific reasons) adds our buildbot account to all new repos. Neither bot with the tokens runs arbitrary code from the projects, so leaking via a malicious MR isn't trivial at least. It also means that if a fork turns off the Pipelines feature on their repo, viewing the statuses is basically broken. There are existing issues for these problems as well.
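
For reference, the status-setting itself is just Gitlab's commit status API; the Developer-access requirement applies to this POST (the instance URL, token, and names below are placeholders):

    import requests

    GITLAB_API = "https://gitlab.example.com/api/v4"  # placeholder instance
    TOKEN = "..."  # buildbot account token; needs Developer access

    def post_status(project_id, sha, state, name):
        """Set a commit status; state is pending/running/success/failed."""
        resp = requests.post(
            "%s/projects/%d/statuses/%s" % (GITLAB_API, project_id, sha),
            headers={"PRIVATE-TOKEN": TOKEN},
            data={"state": state, "name": name},
        )
        resp.raise_for_status()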

GitHub?

Posted May 17, 2017 16:44 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)

JFYI, both AWS and Azure have instance types with GPU.

GitHub?

Posted May 17, 2017 17:51 UTC (Wed) by mathstuf (subscriber, #69389)

Yeah, but we still need macOS tested too :/. I also think those GPUs tend to be homogeneous, which makes testing wider ranges less viable. If we wanted to go full-cloud for testing, it'd be really expensive (either persistent machines to take advantage of incremental builds, or full builds every time) and I think we'd have to manage things across 3 or 4 providers. Just having the hardware local makes build management uniform at least, even if maintenance is more of a burden (all of the builder descriptions live in one place and use a uniform description "language", rather than some being Docker containers, others being VM images, and whatever one uses for provisioning macOS testers).

And one project supports platforms that will never be in the cloud (VS 2008, macOS 10.7 (or so?), HP-UX, AIX, Solaris, etc.), so we're still back to some kind of local test infrastructure management solution.

GitHub?

Posted May 17, 2017 18:50 UTC (Wed) by excors (subscriber, #95769)

You could have short-lived VMs with persistent storage, so you can still do incremental builds despite starting a fresh VM each time. Or use separate machines for building and for running the tests. EC2's on-demand GPU instances cost ~50%-100% more per hour than reserved ones, so they're cheaper if you're only using them <50% of the time (after rounding up each use to an integer number of hours).

("Cheaper" is relative of course, it looks like EC2's cheapest modern one (with half of a GRID K520) is around $3K/year reserved, and the ones with more compute power cost more, so probably not worthwhile if you could get away with a cheap consumer GPU and don't need any of the other cloud features.)

GitHub?

Posted May 17, 2017 19:18 UTC (Wed) by mathstuf (subscriber, #69389)

For clean builds, our build/test cycle takes about 20–40 minutes (depending on platform) on machines with 10-core hyperthreaded CPUs for one project; another takes at least 45 on the same hardware. Incremental builds can be as short as 5 minutes, or 15 for the larger project. The machines cost ~$2500 up front and can run multiple builds at once (depending on the project). I don't know what the electricity costs. There's also a benefit in being able to sit down at a machine to see what's gone wrong on it (helpful when you're dealing with things like rendering differences).

GitHub?

Posted May 17, 2017 20:00 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)

Yes, for GPU it might make sense to do local builds.

BTW, one of our clients uses pre-built containers to run stuff on expensive instances (with 1 TB of RAM). The build is handled on cheap instance types and only the final containers are run on the expensive instances, which are spun down once the calculation is over.

You can also sometimes get GPU instances very cheaply on the EC2 spot market.

