I have spent the past year and a half moving an R&D team’s repositories, somewhere between 120 and 140 of them, from a self-hosted GitLab installation to GitHub. The migration is largely done, the per-repo CI is running, and the daily flow on GitHub has settled.
Where We Started
The team had been on self-hosted GitLab for years, with Jenkins doing the package builds. The repositories covered Debian packaging, Java services, Perl libraries, Python tooling, and a long tail of test fixtures and shared infrastructure. The build pipelines were Jenkins jobs configured per repository, with a dedicated build host and JFrog Artifactory as the package registry.
The decision to move to GitHub was made above me. The mandate was: keep the build pipelines running, do not break the release cadence, do not lose history. There was no maintenance window long enough for a hard cutover.
The Migration Tool
I built a small migration tool around the GitHub and GitLab APIs. It read the GitLab project list, created the matching GitHub repository, mirrored all branches and tags, copied the description and topics, set up the team permissions, and recorded which repositories had been moved. It supported re-runs, because the first pass on a busy repository would always miss something (late branches, force-pushed tags, the occasional rename), and the tool needed to pick up from where it left off without rewriting history that had already settled.
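The re-run behaviour can be sketched as a state record plus a mirror push. Everything here is illustrative, not the actual tool: the `state.json` file name, the `status` field, and the helper names are assumptions, and the real tool talked to the GitHub and GitLab APIs rather than taking bare URLs.

```python
import json
import subprocess
from pathlib import Path

# Hypothetical record of which repositories have been migrated.
STATE_FILE = Path("state.json")

def load_state() -> dict:
    """Return the per-repo migration record, or an empty one on the first run."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {}

def save_state(state: dict) -> None:
    STATE_FILE.write_text(json.dumps(state, indent=2))

def repos_to_migrate(all_repos: list[str], state: dict) -> list[str]:
    """A repository needs another pass unless it is recorded as fully mirrored."""
    return [r for r in all_repos if state.get(r, {}).get("status") != "done"]

def mirror_repo(gitlab_url: str, github_url: str) -> None:
    """Mirror every branch and tag.

    `git push --mirror` makes the destination's refs match the source exactly,
    so re-running it after a late branch or a force-pushed tag only transfers
    what changed since the previous pass.
    """
    subprocess.run(["git", "clone", "--mirror", gitlab_url, "work.git"], check=True)
    subprocess.run(["git", "-C", "work.git", "push", "--mirror", github_url], check=True)
```

The `--mirror` flags are what make re-runs cheap: a second pass over an already-migrated repository is close to a no-op.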
The bulk of the migration ran in batches through the year. The biggest single batch moved a couple of dozen repositories in one evening; the rest happened in smaller waves as the corresponding sub-teams switched their daily flow.
Per-Repo CI in the Rough-Edges Era
GitHub Actions in early 2022 was rougher than it is today. Reusable workflows were still in beta and only went GA in August. There was no clean way to share workflow logic across repositories. Composite actions existed but only at the step level, not at the job or workflow level. There were no job summaries, no native concept of required workflows across an organisation, and self-hosted runner groups were still maturing.
This meant the per-repo workflows had to be hand-rolled and copy-pasted across every repository. To keep the duplication manageable, I split the shared logic into a central CI orchestrator repository that owned the heavier concerns: building per-service images, uploading Debian packages to the artifact registry, copying source tarballs between development and production registries, and monitoring the self-hosted runner pool. The per-repo workflows stayed thin and called into the central repository for anything that would otherwise drift.
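With sharing only possible at the step level, a thin per-repo workflow could check out the central repository and call a composite action from it. The sketch below assumes an invented layout: `example-org/ci-central`, the `deb-builders` runner label, and the `build-debian-package` action are all placeholders, not the team's actual names.

```yaml
# .github/workflows/build.yml in each repository (illustrative)
name: build
on: [push]

jobs:
  package:
    runs-on: [self-hosted, deb-builders]   # hypothetical runner label
    steps:
      - uses: actions/checkout@v2
      # Check out the central orchestrator repo so its composite
      # actions are available locally.
      - uses: actions/checkout@v2
        with:
          repository: example-org/ci-central   # hypothetical central repo
          path: .ci
      # The step-level composite action owns the shared build-and-upload
      # logic, so the per-repo workflow stays thin.
      - uses: ./.ci/actions/build-debian-package
        with:
          registry: dev
```

The second checkout is the workaround for the missing job- and workflow-level reuse of that era: the composite action lives in one place, and each repository's workflow is little more than a pointer to it.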
Driving the Design
The technical work was a small fraction of the job. Most of the time went into stakeholder collaboration: agreeing on a workflow naming convention with the maintainers of each language family, deciding which concerns belonged in the per-repo workflow versus the central orchestrator, sizing the runner pool, and picking a secrets management strategy that worked for both human-triggered runs and scheduled rebuilds.
The design choices that mattered most were these: keep the per-repo CI deterministic and minimal; centralise anything that touched the artifact registry; never let two repositories disagree about the build steps for the same package type; and preserve the ability to fall back to Jenkins for a few weeks during the transition, in case a build pipeline had to be debugged against the old system.
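One way to enforce the "no two repositories disagree" rule is a drift check that compares each repository's workflow against a canonical copy. This is a sketch of the idea under that assumption, not the team's actual tooling; the function names are mine.

```python
import hashlib

def workflow_digest(text: str) -> str:
    """Hash a workflow body, ignoring trailing whitespace per line so
    purely cosmetic differences do not count as drift."""
    normalised = "\n".join(line.rstrip() for line in text.splitlines())
    return hashlib.sha256(normalised.encode()).hexdigest()

def find_drift(canonical: str, repos: dict[str, str]) -> list[str]:
    """Return the names of repositories whose workflow no longer
    matches the canonical copy."""
    want = workflow_digest(canonical)
    return sorted(name for name, body in repos.items()
                  if workflow_digest(body) != want)
```

Run on a schedule, a check like this turns silent divergence into a visible list of repositories to bring back in line.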
What’s Next
Reusable workflows went GA a few months ago and the next phase is to consolidate the per-repo workflows behind a small set of reusable ones. That work has started but is not yet at the point where it makes sense to write about. For now, the migration is done, the team is on GitHub day-to-day, and the build pipelines run on GitHub Actions.