[DevOps Architecture] Version Control - Git + Monorepo + GitHub Flow Is the Standard

About this article

As the third installment of the “DevOps Architecture” category in the series “Architecture Crash Course for the Generative-AI Era,” this article explains version control.

Version control is the land registry of a codebase. With Git as the de facto standard, repo structure, branch strategy, and tag operation are the premises of every dev process, and getting them wrong destabilizes everything built on top. This article covers when to use Trunk-Based Development, GitHub Flow, and Git Flow; monorepo vs multi-repo; SemVer and tag operation; and SVN-to-Git migration.

How Git became the standard

In 2005, Linus Torvalds built the first version of Git in about 2 weeks for Linux kernel development. The direct trigger was a licensing dispute between the kernel community and BitKeeper, the commercial VCS it had been using. Git arrived with traits that now seem obvious: no central server, distributed, fast, and lightweight branches.

After GitHub launched in 2008, OSS projects converged on Git, and closed corporate development largely followed. Commercial VCSes (Perforce, ClearCase) remain in some large enterprises and in the game industry, but there is almost no reason to pick them for a new project. In practice, Git is the only realistic choice today.

| Generation | VCS | Characteristics |
| --- | --- | --- |
| 1st | RCS, CVS | Per-file, centralized |
| 2nd | Subversion (SVN), Perforce | Per-repo, centralized |
| 3rd (modern) | Git, Mercurial | Distributed, fast, lightweight branches |

Why Git won

Technically, Git’s strengths were “distributed, fast, lightweight branches,” but what decided its spread was the social layer GitHub added. Pull Requests (PRs), Issues, and Stars fundamentally changed the experience of code review and OSS contribution, and developers quickly came to use Git in order to put their code on GitHub.

| Element | Why Git is superior |
| --- | --- |
| Distributed | Even if the central server goes down, everyone has a complete copy |
| Lightweight branches | Created/switched in milliseconds, no resistance to discarding |
| Rich merge strategies | Use rebase / squash / merge commit per situation |
| Hosting integration | PR flow standardized via GitHub / GitLab / Bitbucket |
| AI training-data depth | ChatGPT / Copilot handle Git operations fluently |

In AI-driven development, how fluently AI tools handle a VCS has become a deciding factor in selection. Perforce and ClearCase have thin training data, so AI agents handle commits and conflict resolution far less reliably with them. That alone is reason enough to drop them from new selection.

What to decide in version control

Version control design doesn’t end with “use Git.” Decide it by combining the following 5 axes. Each is costly to change later, so the rule is to decide at project start.

| Axis | Choices |
| --- | --- |
| Repo structure | Monorepo, multi-repo, hybrid |
| Branch strategy | Trunk-Based, GitHub Flow, Git Flow |
| Merge method | Squash merge, Rebase merge, Merge commit |
| Tag/release operation | Semantic Versioning, date tags, none |
| Large files | Git LFS, Git Annex, separate management |

The combination of these 5 axes sets the premises for CI/CD, test strategy, and review operation. Repo structure in particular is the furthest upstream: get it wrong and everything downstream has to be redone.

Monorepo vs multi-repo

The biggest issue in repo structure is monorepo (everything in one repo) vs multi-repo (separate repo per service). The impression “monorepo is for big enterprises” is outdated, and today monorepo is strong even for small/mid-size teams.

| Axis | Monorepo | Multi-repo |
| --- | --- | --- |
| Version management | Common version for all code | Independent per service |
| Cross-cut changes | Complete in 1 PR | Multiple PRs needed |
| CI | Change-range testing required (slow) | Per-repo, fast |
| Permission management | Controlled by CODEOWNERS | Per repo |
| Representative tools | Nx, Turborepo, Bazel, pnpm workspace | (no special tools) |
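
The “change-range testing” in the CI row can be illustrated with a minimal sketch. The package names, directory layout, and dependency graph below are hypothetical, and this is not how Nx, Turborepo, or Bazel are implemented; it only shows the idea of testing the packages a change touches plus everything that depends on them.

```python
from collections import deque

# Hypothetical monorepo layout: package name -> directory prefix.
PACKAGE_DIRS = {
    "web": "apps/web/",
    "api": "apps/api/",
    "shared-types": "packages/shared-types/",
}

# Hypothetical dependency graph: package -> packages it depends on.
DEPENDS_ON = {
    "web": ["shared-types"],
    "api": ["shared-types"],
    "shared-types": [],
}


def affected_packages(changed_files):
    """Return the sorted names of packages that need testing for a change set."""
    # 1. Packages whose own files changed.
    touched = {
        name
        for name, prefix in PACKAGE_DIRS.items()
        for path in changed_files
        if path.startswith(prefix)
    }
    # 2. Invert the graph so we can walk from a package to its dependents.
    dependents = {name: [] for name in DEPENDS_ON}
    for name, deps in DEPENDS_ON.items():
        for dep in deps:
            dependents[dep].append(name)
    # 3. Breadth-first walk: anything depending on a touched package is affected.
    affected = set(touched)
    queue = deque(touched)
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return sorted(affected)
```

With this graph, a change under `apps/web/` only requires testing `web`, while a change under `packages/shared-types/` fans out to all three packages. That fan-out is why shared code in a monorepo makes CI slower unless the affected set is computed.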

Google, Meta, Uber, and Airbnb use monorepos. Amazon leans multi-repo (the “two-pizza team, one service, one repo” philosophy). The question is not which is correct, but which matches the organization’s communication structure (Conway’s law).

Criteria for choosing monorepo

Monorepo is favorable when code is mutually dependent and bulk changes across components are frequent. Conversely, for highly independent groups of services, multi-repo is lighter to operate.

| Situation | Recommended |
| --- | --- |
| Front + back + shared types | Monorepo (TypeScript power maxed by centralized types) |
| 10+ microservices held by independent teams | Multi-repo (clarify team boundaries) |
| Library + using app dev simultaneously | Monorepo (instant verification via local references) |
| Acquired/subsidiary independent organizations gathered | Multi-repo (often not worth integration cost) |

In my experience, monorepo lands well for about 80% of teams. Multi-repo only works when the organization is truly independent, which in practice means a clear division of labor at a scale of dozens to hundreds of people. When in doubt, the safe order is to start with a monorepo and split into multiple repos as scale demands.

Monorepo is the top candidate when in doubt. Multi-repo carries the burden of proof for independence.

Branch strategy - 3 patterns

Branch strategy is also covered in the CI/CD article, but from the version-control perspective the core question is how short-lived branches are. The longer a branch lives, the heavier merge conflicts get and the slower reviews become.

gitGraph
    commit id: "main"
    branch feature-A
    commit id: "feat A1"
    commit id: "feat A2"
    checkout main
    branch feature-B
    commit id: "feat B"
    checkout main
    merge feature-A id: "PR#1 squash"
    merge feature-B id: "PR#2 squash"
    commit id: "release"

| Strategy | Characteristics | Suited for |
| --- | --- | --- |
| Trunk-Based Development | Short-lived feature branches, hours-to-day to main | High-grade CI/CD, expert teams |
| GitHub Flow | feature branch + PR + main | Web services, modern dev standard |
| Git Flow | main + develop + release + hotfix | Packaged products, version-parallel management |

For new projects, GitHub Flow or Trunk-Based Development is the standard. Git Flow is too complex and excessive for continuously deployed services. It only shines for packaged products and on-prem distributed software, where several released versions must be maintained independently, and those business models are declining.

Merge method - Squash / Rebase / Merge

There are 3 ways to merge a PR, and the one you adopt greatly changes how readable the commit history is. None is inherently superior; what matters is using one consistently within the team.

| Method | Result | Pros | Cons |
| --- | --- | --- | --- |
| Squash merge | Compress whole PR into 1 commit | Concise main history, per-PR revert | Fine-grained PR-internal history lost |
| Rebase merge | Stack PR commits onto main in order | Linear history, easy to follow | Premise: thorough commit conventions |
| Merge commit | Leaves PR branch and merge commit | History of “merged this PR” remains | main history gets complex |

The current mainstream is Squash merge. Especially in small/mid teams, following history at PR granularity is more practical and pairs well with auto-generation of release notes. Rebase merge functions when expert teams thoroughly use Conventional Commits.

Mixing the 3 is the worst. Unifying on Squash causes the least friction.

Tags and release operation

Tags identifying releases tend to be undervalued in version control. Teams that can’t instantly identify “which code is in production” always get stuck on incidents.

| Naming | Format | Suited for |
| --- | --- | --- |
| Semantic Versioning (SemVer) | v2.3.1 (Major.Minor.Patch) | Libraries, packaged products |
| CalVer (Calendar Versioning) | 2026.04.01 | SaaS, frequently-released products |
| Build numbers | build-1234 | Internal CI/CD identification |
| Date + Git hash | 20260422-abc123 | Simple production tracking |

SemVer only works on teams that can guarantee the semantics that a major bump signals a compatibility break. It breaks down if breaking changes are dumped into a major release without planning. CalVer makes it clear at a glance when the code dates from, which suits SaaS well.
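
To make the SemVer semantics concrete, here is a minimal sketch of identifying the latest release tag and detecting a major (compatibility-breaking) upgrade. It assumes tags of the plain form `vMAJOR.MINOR.PATCH` as in the table above and deliberately ignores pre-release and build-metadata suffixes; the function names are illustrative, not part of any tool.

```python
import re

# Matches tags like "v2.3.1"; pre-release suffixes are out of scope for this sketch.
SEMVER_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_semver(tag):
    """Return (major, minor, patch) as integers, or None if the tag is not SemVer."""
    match = SEMVER_TAG.match(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def latest_release(tags):
    """Pick the highest SemVer tag, comparing numerically rather than as strings."""
    valid = [(parse_semver(tag), tag) for tag in tags if parse_semver(tag) is not None]
    if not valid:
        return None
    return max(valid)[1]


def is_breaking_upgrade(old_tag, new_tag):
    """True when the major number increases, i.e. a compatibility break is signaled."""
    old, new = parse_semver(old_tag), parse_semver(new_tag)
    if old is None or new is None:
        raise ValueError("both tags must look like vMAJOR.MINOR.PATCH")
    return new[0] > old[0]
```

Numeric comparison matters because plain string sorting would rank `v2.9.1` above `v2.10.0`. Non-SemVer tags such as `build-1234` are skipped rather than treated as errors, since the table above allows them to coexist for internal CI identification.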

Standard practice is to publish tags and changelogs together via GitHub Releases or GitLab Releases. Conventional Commits plus automated release-note generation (release-please, semantic-release) is the modern standard.
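
The link between Conventional Commits and automatic versioning can be sketched as follows. This is a simplified illustration of the idea, not the actual rule set or API of release-please or semantic-release: it assumes `feat` maps to a minor bump, `fix` to a patch bump, and a `!` marker or a `BREAKING CHANGE:` footer to a major bump.

```python
import re

# Conventional Commits header: type, optional (scope), optional "!", then ": description".
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]+\))?(?P<bang>!)?: .+")


def bump_kind(messages):
    """Return "major", "minor", "patch", or None for a list of commit messages."""
    kind = None  # None means nothing release-worthy was found
    for message in messages:
        lines = message.splitlines()
        header = lines[0] if lines else ""
        match = HEADER.match(header)
        if match is None:
            continue  # free-form commits carry no version information
        if match.group("bang") or "BREAKING CHANGE:" in message:
            return "major"
        if match.group("type") == "feat":
            kind = "minor"
        elif match.group("type") == "fix" and kind is None:
            kind = "patch"
    return kind


def next_version(current, messages):
    """Compute the next vMAJOR.MINOR.PATCH tag from the commits since `current`."""
    major, minor, patch = (int(part) for part in current.lstrip("v").split("."))
    kind = bump_kind(messages)
    if kind == "major":
        return f"v{major + 1}.0.0"
    if kind == "minor":
        return f"v{major}.{minor + 1}.0"
    if kind == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    return current  # only docs/chore-style commits: no new release
```

Free-form commit messages fall through the `continue` branch and contribute nothing, which is the practical reason a commit convention is a prerequisite for release automation.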

Large files and Git LFS

Git is bad at large binary files. Committing multi-GB videos, images, ML models, or game assets as ordinary Git objects bloats the repo until a clone takes hours.

| Method | Characteristics |
| --- | --- |
| Git LFS (Large File Storage) | Git manages just pointers, body in separate storage |
| DVC (Data Version Control) | ML-oriented, integrates with S3 etc. for data version control |
| External storage (S3, GCS) + reference | Don’t put in Git at all, only version managed separately |
| Git Annex | More flexible than Git LFS but high learning cost |

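Git LFS is the most widely adopted, and GitHub, GitLab, and Bitbucket support it out of the box. But LFS has pitfalls too: every changed version of a large file is stored in full, so assets that change often or diverge across many branches balloon storage costs. ML datasets at the hundreds-of-GB scale are better served by DVC or S3 references.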
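
Before choosing between LFS, DVC, and external storage, it helps to know which files are actually large. The following is a minimal sketch that walks a working tree and lists files above a size threshold. The 50 MiB default is an assumption chosen for illustration, not a Git or LFS limit, and the function name is hypothetical.

```python
import os


def find_large_files(root, threshold_bytes=50 * 1024 * 1024):
    """Return (relative_path, size) pairs at or above the threshold, largest first."""
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own object store; only working-tree files are of interest.
        dirnames[:] = [name for name in dirnames if name != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # a symlink's target may not exist or may live elsewhere
            size = os.path.getsize(path)
            if size >= threshold_bytes:
                results.append((os.path.relpath(path, root), size))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Running this once before the first commit gives a candidate list for LFS tracking or external storage, which is cheaper than removing large blobs from history afterwards.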

Committing binaries as ordinary files kills the repo. Design for LFS or external references from the start.

SVN-to-Git migration decision

Some sites still run SVN (Subversion) today, mainly large enterprises with 10+ years of operation or organizations with underdeveloped CI/CD. Migrating from SVN to Git takes considerable effort, but staying on SVN is almost never a viable option.

| SVN-continuation downside | Content |
| --- | --- |
| Heavy branches | Designed via directory copies, high switching cost |
| Not distributed | All stop on central-server failure |
| AI tools unsupported | Copilot/Cursor etc. premise Git |
| Bad chemistry with modern CI/CD | GitHub Actions / GitLab CI premise Git |
| Hiring disadvantage | New grads to mid-career have almost no SVN experience |

The standard migration tool is git-svn, which converts a repository to Git while preserving history. A full migration takes weeks to months, but from the 3 viewpoints of hiring, dev speed, and AI usage, no reason to stay remains. Once migration is decided, the rule is a short, decisive cutover finished in one go. A phased migration turns into hell during the period when both SVN and Git must be maintained.

The longer SVN is kept on life support, the more debt accumulates. A short, decisive migration is the best option.

.gitignore and handling secrets

The No. 1 source of accidents in version control is accidentally committing secrets (API keys, passwords, tokens). Once committed, they remain in Git history. Even a force push cannot reliably erase them from clones, forks, and caches, so they must be treated as leaked.

| Countermeasure | Content |
| --- | --- |
| Thorough .gitignore | Always exclude .env, *.pem, secrets/ |
| pre-commit hooks | Auto-detect with gitleaks, detect-secrets |
| Secret Scanning (GitHub standard) | Detect secrets at push, block before public |
| Rotation on leak | Instantly invalidate all leaked keys/tokens |

In 2022, Toyota disclosed that a development contractor had accidentally published source code containing an access key to a public GitHub repository, leaving about 300,000 customer records accessible for nearly 5 years. The key was invalidated after discovery, but nothing can retroactively remove 5 years of exposure. .gitignore and pre-commit hooks are an area where “we’ll do it later because it’s a hassle” is not acceptable.
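
The kind of check a pre-commit hook performs can be sketched in a few lines. The path patterns come from the table above (.env, *.pem, secrets/); the two content patterns (an AWS-style access key ID and a PEM private-key header) are illustrative assumptions. Real tools such as gitleaks and detect-secrets ship far larger rule sets and entropy checks, so treat this as an explanation of the mechanism, not a replacement for them.

```python
import fnmatch
import re

# Path patterns from the countermeasure table (.env, *.pem, secrets/).
BLOCKED_PATHS = [".env", "*.pem", "secrets/*"]

# Illustrative content patterns only; real scanners use many more rules.
CONTENT_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def check_staged_file(path, text):
    """Return a list of human-readable findings for one staged file."""
    findings = []
    basename = path.rsplit("/", 1)[-1]
    for pattern in BLOCKED_PATHS:
        # Match against both the full path and the file name itself.
        if fnmatch.fnmatch(path, pattern) or fnmatch.fnmatch(basename, pattern):
            findings.append(f"{path}: path matches blocked pattern {pattern!r}")
    for label, regex in CONTENT_PATTERNS.items():
        if regex.search(text):
            findings.append(f"{path}: content looks like {label}")
    return findings
```

A hook would call `check_staged_file` for every staged file and abort the commit when any finding is returned. Note that `fnmatch` is only an approximation of .gitignore matching rules, which is acceptable here because the goal is to block, not to mirror Git's ignore semantics.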

Numerical gates and operational metrics for version control

Note: Industry baseline values as of April 2026. Will become outdated as technology and the talent market shift, so requires periodic updates.

Version control health is practically tracked numerically. Below are industry-standard metrics.

| Metric | Recommended | What to do if exceeded |
| --- | --- | --- |
| Feature branch lifespan | 2-3 days | Consider forced merge over 1 week, long-life is conflict hell |
| Lines changed per PR | ~400 | Split over 1000, review formalizes |
| Direct push to main | 0 | All via PR, enforce by branch protection |
| Commit unit | Conventional Commits compliant | feat:/fix:/docs: etc., auto-versioning via semantic-release |
| Repo clone time | ~30 seconds | Consider LFS / shallow clone if exceeded |
| Secret Scanning alert response | Within 5 minutes | Instantly rotate leaked keys |
| .gitignore leak incidents | 0 | Physically block via pre-commit hook |
| Tagging rule | Unified SemVer/CalVer | Mixing makes history untraceable |
| Monorepo CI runtime | Within 10 min on PR | Shorten via change-range testing (Nx / Turbo) |
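
Two of the gates above, branch lifespan and PR size, are easy to turn into automated checks. The sketch below encodes the thresholds from the table (2-3 days and 1 week for branches, ~400 and 1000 lines for PRs). The status labels and function names are hypothetical, and how the inputs are obtained (for example from a hosting API) is left out.

```python
from datetime import date

# Thresholds taken from the metrics table above.
BRANCH_WARN_DAYS = 3    # upper end of the recommended 2-3 day lifespan
BRANCH_MAX_DAYS = 7     # over 1 week: force a merge decision
PR_TARGET_LINES = 400   # recommended size
PR_MAX_LINES = 1000     # beyond this, split the PR


def branch_status(created, today):
    """Classify a feature branch by age in days."""
    age_days = (today - created).days
    if age_days > BRANCH_MAX_DAYS:
        return "merge-or-close"
    if age_days > BRANCH_WARN_DAYS:
        return "warn"
    return "ok"


def pr_size_status(lines_changed):
    """Classify a PR by its total changed lines."""
    if lines_changed > PR_MAX_LINES:
        return "split"
    if lines_changed > PR_TARGET_LINES:
        return "warn"
    return "ok"
```

Keeping the thresholds as named constants makes it straightforward to revisit them when the baseline values are updated.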

Secret Scanning becoming a standard GitHub feature since 2022 dramatically shortened the time to discovering such incidents. The 2022 Toyota incident, in which about 300,000 customer records were accessible for nearly 5 years, symbolizes the cost of the era before Secret Scanning.

Secrets must be physically blocked before commit. “Be careful” is not an operating procedure that works.

Version-control-operation pitfalls and forbidden moves

Typical accident patterns in version control. All lead to “breaking code history” / “losing organizational trust”.

| Forbidden move | Why it’s bad |
| --- | --- |
| Newly adopt SVN/Perforce/ClearCase | Thin AI training data, hiring disadvantage, zero rationality today |
| Commit .env or private keys to Git | Remains in history and caches, effectively unrecoverable, as in the 2022 Toyota 300k-record incident |
| Force push to main/develop | Other members’ work erased, history rewriting fails audit |
| Leave branches over a week before merging | Conflict hell, “80% done” reports continuing 3 weeks |
| Commit binary files (videos, models) as ordinary files | Repo bloats to tens of GB, clone takes hours |
| Mix merge methods (Squash/Rebase/Merge commit) | History gets complex, “per-PR revert” becomes impossible |
| Adopt Git Flow on continuously-deployed SaaS | Too complex, dev speed drops, switch to GitHub Flow |
| No commit-message convention | History search and release-note automation impossible |
| Leave branches over a year old | Backlog graveyard, periodic triage to delete |
| Mix monorepo vs multi-repo by mood | Toolchain disperses, CI duplicate operation |

The GitLab database-deletion incident of January 2017 (the on-call engineer deleted the production DB during incident response, 4 of 5 backup mechanisms did not work, and recovery used a 6-hour-old snapshot) symbolizes the lesson: evaluate backups not by “taken” but by “restorable.”

Force push is absolutely forbidden on main/develop, while rebasing your own feature branch is allowed. The point is knowing where it is forbidden.

Version control in the AI era

In AI-driven development, version-control design directly affects AI agent performance. AI fluency in Git operations is a given; what is decisive is whether repo structure, commit conventions, and branch operation are in a form AI can understand.

| Favored in the AI era | Disfavored in the AI era |
| --- | --- |
| Git + GitHub (standard) | SVN, Perforce (thin AI training data) |
| Monorepo (related code in one place) | Multi-repo (context dispersed) |
| Conventional Commits | Free-form commits |
| Trunk-Based / GitHub Flow | Git Flow (complex, AI mistakes it) |
| README, ADR in Git as Markdown | Confluence, Notion, verbal tradition |

When asking an AI agent “implement feature X in this repo,” having related code together in one repo with structured commit history dramatically raises AI’s success rate. Conversely, multi-repo with dependencies scattered across 10 repos prevents AI from grasping the whole picture, with frequent rework.

In the AI era, monorepo + Conventional Commits is overwhelmingly superior.

Common misconceptions 1

Version control tends to be taken lightly as “done once you use Git,” but design gaps always surface later.

Using Git means version control is OK

Git is just a tool. The essence of version control is the design of repo structure, branch strategy, merge policy, and tag operation. If the team has not agreed on these, version control is unmanaged even with Git.

Monorepo only works because it’s Google

Today, with Turborepo, Nx, and pnpm workspaces mature, even a 5-person team can operate a monorepo. The “tool for large scale” view is outdated; small teams in particular benefit from monorepo.

Common misconceptions 2

Branches are safer kept long

Long-lived branches are a breeding ground for merge hell. Branches older than 1 week need main merged into them continuously, or conflicts snowball. Merging back to main within 2-3 days is healthy.

Force push is absolute evil

Force push to main/develop is indeed forbidden, but on your own feature branch, rebasing to tidy up history is standard practice. The correct understanding is not “absolutely forbidden” but “forbidden in specific places.”

Using Git doesn’t equal version control done. Design is separate.

Author’s note - GitLab January 2017 DB-deletion incident

No discussion of version control is complete without GitLab’s production-DB-deletion incident of January 31, 2017. During late-night incident response, the on-call engineer accidentally deleted the production PostgreSQL database directory. The command was meant to run on the standby system.

What made it worse: despite 5 backup mechanisms being in place, 4 of them weren’t functioning. GitLab eventually recovered from a 6-hour-old snapshot, losing roughly 5,000 projects and 5,000 comments created in that window.

GitLab reported on the incident publicly in real time, including on Twitter, and published a detailed postmortem. Lessons such as “backups mean nothing without periodic restore drills” and “minimize manual work and rely on IaC and automation” strongly influenced industry practice afterwards. The stance of reporting openly while the incident was still in progress is cited as a fine example of blameless culture.

Backups are evaluated by “restorable” not “taken”. Backups without drills are just comfort.

What to decide - what is your project’s answer?

For each of the following, try to articulate your project’s answer in 1-2 sentences. Starting work while these are vague always invites later questions like “why did we decide this again?”

  • VCS (Git only)
  • Repo structure (monorepo / multi-repo)
  • Branch strategy (Trunk-Based / GitHub Flow / Git Flow)
  • Merge method (Squash / Rebase / Merge commit)
  • Tag-naming rule (SemVer / CalVer / build numbers)
  • Large-file handling (LFS / DVC / external reference)
  • .gitignore and secret-detection policy
  • SVN etc. legacy-VCS migration plan (if applicable)

How to make the final call

The essence of version control is making code history traceable, and using Git alone doesn’t get you even halfway there. Design the 5 elements (repo structure, branch strategy, merge method, tag operation, and secret management) as a set; it only works when every team member follows the same rules. The first step is not confusing the tool with the operational design.

Today, my recommended standard composition is “Git + GitHub + monorepo + GitHub Flow + Squash merge + Conventional Commits + SemVer/CalVer + Git LFS + Secret Scanning.” Whether judged by compatibility with AI-driven development, flexibility in scaling from small to large, or hiring, no better option exists. Limit multi-repo to truly independent large organizations, treat SVN as migrate-only, and keep Perforce/ClearCase out of scope for new selection.

Selection priorities

  1. Default to monorepo - put burden of proof on multi-repo for independence
  2. Choose GitHub Flow or Trunk-Based for branch strategy - avoid Git Flow
  3. Squash merge + Conventional Commits to structure history
  4. Secret Scanning + pre-commit to physically block leaks

“When in doubt, monorepo + GitHub Flow.” This handles 80% of teams.

Summary

This article covered version control, including Git’s superiority, monorepo vs multi-repo, branch strategies, merge methods, tag operation, LFS, secrets, and the AI-era optimal form.

Default to monorepo, choose GitHub Flow or Trunk-Based, structure history via Squash merge + Conventional Commits, physically block leaks with Secret Scanning + pre-commit. That is the practical answer for version control in 2026.

Next time we’ll cover dev environment and local execution (Docker Compose, Dev Container, cloud IDE).

I hope you’ll read the next article as well.