About this article
As the third installment of the "DevOps Architecture" category in the series "Architecture Crash Course for the Generative-AI Era," this article explains version control.
Version control is the land registry of a codebase. With Git as the de facto standard today, repo structure, branch strategy, and tag operation are premises of every dev process - getting them wrong undermines everything built on top. This article covers the use of Trunk-Based / GitHub Flow / Git Flow, monorepo vs multi-repo, SemVer/tag operation, and SVN-to-Git migration.
How Git became the standard
In 2005, Linus Torvalds built Git in about two weeks for Linux kernel development. The direct trigger was the licensing trouble between the prior commercial VCS (BitKeeper) and the kernel community, and Git arrived with what are now taken-for-granted traits: "no central server, distributed, fast, lightweight branches."
After GitHub's 2008 launch, OSS standardized on Git, and corporate closed-source development largely unified on Git as well. Commercial VCSes (Perforce, ClearCase) still remain in some large enterprises and the gaming industry, but there is almost no reason to choose them for new projects. Today, Git is in practice the only choice.
| Generation | VCS | Characteristics |
|---|---|---|
| 1st | RCS, CVS | Per-file, centralized |
| 2nd | Subversion (SVN), Perforce | Per-repo, centralized |
| 3rd (modern) | Git, Mercurial | Distributed, fast, lightweight branches |
Why Git won
Technically, Git's strengths were "distributed, fast, lightweight branches," but the deciding factor in its spread was socialization via GitHub. Mechanisms like PRs (Pull Requests), Issues, and Stars fundamentally changed the code-review and OSS-contribution experience, swinging developers in one go to a state of "use Git in order to put it on GitHub."
| Element | Why Git is superior |
|---|---|
| Distributed | Even if central server goes down, everyone has a complete copy |
| Lightweight branches | Created/switched in milliseconds, no resistance to discarding |
| Rich merge strategies | Use rebase / squash / merge commit per situation |
| Hosting integration | PR flow standardized via GitHub / GitLab / Bitbucket |
| AI training-data depth | ChatGPT / Copilot are fluent in Git operations |
In the AI-driven dev era, how fluently AI can drive Git operations became a deciding factor in selection. Perforce and ClearCase have thin training data, so AI agents cannot reliably handle commits or conflict resolution on them. That alone is reason enough to drop them from new selection.
What to decide in version control
Version control design doesn't end with "use Git" - decide by combining the following 5 axes. Each is costly to change later, so the rule is to decide them at project start.
| Axis | Choices |
|---|---|
| Repo structure | Monorepo, multi-repo, hybrid |
| Branch strategy | Trunk-Based, GitHub Flow, Git Flow |
| Merge method | Squash merge, Rebase merge, Merge commit |
| Tag/release operation | Semantic Versioning, date tags, none |
| Large files | Git LFS, Git Annex, separate management |
This 5-axis combination decides the premises of CI/CD, test strategy, and review operation. Repo structure in particular is the most upstream - getting it wrong means redoing everything.
Monorepo vs multi-repo
The biggest issue in repo structure is monorepo (everything in one repo) vs multi-repo (separate repo per service). The impression that "monorepo is for big enterprises" is outdated; today monorepo is a strong option even for small and mid-size teams.
| Axis | Monorepo | Multi-repo |
|---|---|---|
| Version management | Common version for all code | Independent per service |
| Cross-cut changes | Complete in 1 PR | Multiple PRs needed |
| CI | Change-range testing required (slow) | Per-repo, fast |
| Permission management | Controlled by CODEOWNERS | Per repo |
| Representative tools | Nx, Turborepo, Bazel, pnpm workspace | (no special tools) |
Google, Meta, Uber, and Airbnb use monorepos. Amazon leans multi-repo (the "two-pizza team, one service one repo" philosophy). It's not a question of which is correct, but of matching the organization's communication structure (Conway's law).
Criteria for choosing monorepo
Monorepo is favorable when code is mutually dependent and bulk changes are frequently needed. Conversely, with highly independent service groups, multi-repo is lighter to operate.
| Situation | Recommended |
|---|---|
| Front + back + shared types | Monorepo (centralized types maximize TypeScript's power) |
| 10+ microservices held by independent teams | Multi-repo (clarify team boundaries) |
| Library + using app dev simultaneously | Monorepo (instant verification via local references) |
| Acquired/subsidiary independent organizations gathered | Multi-repo (often not worth integration cost) |
In my experience, monorepo lands well for 80% of teams. Multi-repo only functions when the organization is truly independent, which in practice means a clear division of labor at dozens-to-hundreds-of-people scale. When in doubt, start with a monorepo and decompose into multi-repo as scale demands - that is the safe order.
Monorepo is the top candidate when in doubt. Multi-repo carries the burden of proof for independence.
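As a concrete picture, a small-team monorepo of the kind described above can be sketched with pnpm workspaces. All directory and package names below are illustrative, not taken from any specific project:

```shell
# Hypothetical monorepo layout managed by pnpm workspaces.
# apps/* hold deployables, packages/* hold shared code (e.g. shared types).
cd "$(mktemp -d)"
mkdir -p apps/web apps/api packages/shared-types

# pnpm-workspace.yaml declares which directories are workspace packages
cat > pnpm-workspace.yaml <<'EOF'
packages:
  - "apps/*"
  - "packages/*"
EOF

find . -mindepth 1 -type d | sort   # show the resulting layout
```

With this shape, `apps/web` and `apps/api` can depend on `packages/shared-types` via a local workspace reference, so a type change and both consumers land in one PR - the "cross-cut changes complete in 1 PR" property from the table above.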
Branch strategy - 3 patterns
Branch strategy is also touched on in the CI/CD article, but from version control's perspective the core question is how short-lived branches are. The longer branches live, the heavier merge conflicts get and the slower reviews become.
```mermaid
gitGraph
    commit id: "main"
    branch feature-A
    commit id: "feat A1"
    commit id: "feat A2"
    checkout main
    branch feature-B
    commit id: "feat B"
    checkout main
    merge feature-A id: "PR#1 squash"
    merge feature-B id: "PR#2 squash"
    commit id: "release"
```
| Strategy | Characteristics | Suited for |
|---|---|---|
| Trunk-Based Development | Short-lived feature branches, merged to main within hours to a day | High-grade CI/CD, expert teams |
| GitHub Flow | feature branch + PR + main | Web services, modern dev standard |
| Git Flow | main + develop + release + hotfix | Packaged products, version-parallel management |
For new projects, GitHub Flow or Trunk-Based is the standard. Git Flow is too complex - excessive for continuously-deployed services. Git Flow only shines in cases like packaged products and on-prem distributed software where versions must be operated independently in parallel, and those business models are decreasing.
Merge method - Squash / Rebase / Merge
There are 3 ways to merge a PR, and which you adopt greatly changes commit-history readability. Not which is superior - what matters is unifying within the team.
| Method | Result | Pros | Cons |
|---|---|---|---|
| Squash merge | Compress whole PR into 1 commit | Concise main history, per-PR revert | Fine-grained PR-internal history lost |
| Rebase merge | Stack PR commits onto main in order | Linear history, easy to follow | Premise: thorough commit conventions |
| Merge commit | Leaves PR branch and a merge commit | History of "merged this PR" remains | main history gets complex |
The current mainstream is Squash merge. Especially in small/mid teams, following history at PR granularity is more practical and pairs well with auto-generation of release notes. Rebase merge functions when expert teams thoroughly use Conventional Commits.
Mixing the 3 is the worst. Unifying on Squash causes the least friction.
Tags and release operation
Tags identifying releases tend to be undervalued in version control. Teams that cannot instantly identify which code is in production always get stuck during incidents.
| Naming | Format | Suited for |
|---|---|---|
| Semantic Versioning (SemVer) | v2.3.1 (Major.Minor.Patch) | Libraries, packaged products |
| CalVer (Calendar Versioning) | 2026.04.01 | SaaS, frequently-released products |
| Build numbers | build-1234 | Internal CI/CD identification |
| Date + Git hash | 20260422-abc123 | Simple production tracking |
SemVer only functions on teams that can guarantee its semantics - that a Major bump signals a compatibility break. It won't work if breaking changes are dumped into Major without planning. CalVer has the benefit that "how old is this code" is clear at a glance, a good fit for SaaS.
Standard practice is leaving tags and changelogs together via GitHub Releases or GitLab Release. Conventional Commits + auto-release-note generation (release-please, semantic-release) is the modern standard.
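The tag mechanics themselves are simple; the discipline is in the naming. A minimal sketch of annotated SemVer tags and answering "what is in production" - version numbers and messages here are illustrative:

```shell
# Sketch: annotated tags mark releases; git describe identifies the
# release nearest to the current commit.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.email demo@example.com
git config user.name demo

git commit -q --allow-empty -m "feat: checkout flow"
git tag -a v1.2.0 -m "release 1.2.0"          # annotated tag: author, date, message
git commit -q --allow-empty -m "fix: rounding error in totals"
git tag -a v1.2.1 -m "release 1.2.1"

git tag -l "v1.2.*"    # list the release tags
git describe --tags    # nearest tag reachable from HEAD
```

Annotated tags (`-a`) rather than lightweight ones are the usual choice for releases, since they record who tagged and when - exactly the audit trail an incident review needs.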
Large files and Git LFS
Git is bad at large binary files. Normally committing several-GB videos, images, ML models, or game assets bloats the repo to where clone takes hours.
| Method | Characteristics |
|---|---|
| Git LFS (Large File Storage) | Git manages just pointers, body in separate storage |
| DVC (Data Version Control) | ML-oriented, integrates with S3 etc. for data version control |
| External storage (S3, GCS) + reference | Donât put in Git at all, only version managed separately |
| Git Annex | More flexible than Git LFS but high learning cost |
Git LFS is most adopted, with GitHub/GitLab/Bitbucket standard-supporting it. But LFS has pitfalls too - mass branch creation balloons storage costs, so hundreds-of-GB-scale ML datasets suit DVC or S3 references better.
Normally committing binaries kills the repo. Design with LFS or external references from the start.
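Routing files through LFS is driven by `.gitattributes`; `git lfs track` just appends lines to it. The sketch below writes the file directly so it runs without git-lfs installed - the file patterns are illustrative:

```shell
# Sketch: the .gitattributes entries that route large binaries through LFS.
# Git then stores only small pointer files; the bodies live in LFS storage.
cd "$(mktemp -d)"
cat > .gitattributes <<'EOF'
*.mp4   filter=lfs diff=lfs merge=lfs -text
*.psd   filter=lfs diff=lfs merge=lfs -text
*.onnx  filter=lfs diff=lfs merge=lfs -text
EOF
grep -c 'filter=lfs' .gitattributes   # count of patterns routed through LFS
```

The key point is that `.gitattributes` is itself committed, so every clone applies the same routing - adding patterns after large files are already in history does not shrink the repo.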
SVN-to-Git migration decision
There are still sites running SVN (Subversion) today, mainly large enterprises with 10+ years of operation or organizations with underdeveloped CI/CD. SVN-to-Git migration takes large effort, but there is almost no case left for staying on SVN.
| SVN-continuation downside | Content |
|---|---|
| Heavy branches | Designed via directory copies, high switching cost |
| Not distributed | All stop on central-server failure |
| AI tools unsupported | Copilot/Cursor etc. premise Git |
| Bad chemistry with modern CI/CD | GitHub Actions / GitLab CI premise Git |
| Hiring disadvantage | New grads to mid-career have almost no SVN experience |
The standard migration tool is git-svn (convert to Git while preserving history). Full migration takes weeks to months, but from the three viewpoints of hiring, dev speed, and AI usage, no reasons to stay remain. Once migration is decided, the rule is a short decisive battle: finish it in one go. Phased migration turns into hell during the period of maintaining both SVN and Git in parallel.
The longer SVN is kept on life support, the more debt accumulates. Ending the migration in one short decisive battle is the front-runner.
.gitignore and handling secrets
The No.1 source of accidents in version control is accidentally committing secrets (API keys, passwords, tokens). Once committed, they remain in Git history, and even a force push cannot erase them from clones, forks, and caches - treat them as effectively leaked.
| Countermeasure | Content |
|---|---|
| Thorough .gitignore | Always exclude .env, *.pem, secrets/ |
| pre-commit hooks | Auto-detect with gitleaks, detect-secrets |
| Secret Scanning (GitHub standard) | Detect secrets at push, block before public |
| Rotation on leak | Instantly invalidate all leaked keys/tokens |
In 2022, a Toyota subsidiary accidentally published an access key to GitHub, leaving about 300,000 customer records accessible for five years. After discovery they invalidated the key, but there is no way to retroactively zero out five years of leak risk. .gitignore and pre-commit hooks are an area where "we'll get to it later" is not acceptable.
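The shape of a pre-commit guard can be sketched in a few lines. This is deliberately minimal - real teams should layer gitleaks or detect-secrets on top, and the path patterns here are illustrative only:

```shell
# Sketch: install a minimal pre-commit hook that aborts a commit when a
# staged path looks like a secret file (.env, *.pem, secrets/).
cd "$(mktemp -d)"
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# list staged paths; refuse the commit if any matches a secret pattern
if git diff --cached --name-only | grep -Eq '(^|/)\.env$|\.pem$|(^|/)secrets/'; then
  echo "blocked: a staged path matches a secret pattern" >&2
  exit 1
fi
EOF
chmod +x .git/hooks/pre-commit
echo "pre-commit hook installed"
```

Because hooks live in `.git/hooks` and are not cloned, teams usually distribute them via a framework like pre-commit or a setup script - otherwise the guard exists only on one machine.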
Numerical gates and operational metrics for version control
Note: Industry baseline values as of April 2026. Will become outdated as technology and the talent market shift, so requires periodic updates.
Version control health is practically tracked numerically. Below are industry-standard metrics.
| Metric | Recommended | What to do if exceeded |
|---|---|---|
| Feature branch lifespan | 2-3 days | Consider forcing a merge past 1 week; long-lived branches are conflict hell |
| Lines changed per PR | ~400 | Split past 1000 lines; review becomes a formality |
| Direct push to main | 0 | All via PR, enforce by branch protection |
| Commit unit | Conventional Commits compliant | feat:/fix:/docs: etc., auto-versioning via semantic-release |
| Repo clone time | ~30 seconds | Consider LFS / shallow clone if exceeded |
| Secret Scanning alert response | Within 5 minutes | Instantly rotate leaked keys |
| .gitignore leak incidents | 0 | Physically block via pre-commit hook |
| Tagging rule | Unified SemVer/CalVer | Mixing makes history untraceable |
| Monorepo CI runtime | Within 10 min on PR | Shorten via change-range testing (Nx / Turbo) |
Secret Scanning becoming a GitHub-standard feature in 2022 dramatically shortened the time to incident discovery. The 2022 Toyota-subsidiary incident (300k customer records accessible for five years) symbolizes the cost of the era before Secret Scanning.
Secrets must be physically blocked before commit. "Be careful" is not an operation that functions.
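The "Conventional Commits compliant" gate from the table above is also mechanically checkable. Below is a minimal check against a common subset of the format, the kind of rule a commit-msg hook would enforce - the type list is a typical choice, not the full spec:

```shell
# Sketch: validate a commit message against a subset of Conventional
# Commits: type(optional-scope)!: description
check_msg() {
  printf '%s' "$1" | grep -Eq '^(feat|fix|docs|chore|refactor|test)(\([a-z0-9-]+\))?!?: .+'
}

check_msg "feat(auth): add token rotation" && echo "pass"
check_msg "fixed a thing" || echo "reject: free-form message"
```

Wiring a check like this into a commit-msg hook or CI job is what makes semantic-release-style automated versioning and release notes possible downstream.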
Version-control-operation pitfalls and forbidden moves
Typical accident patterns in version control. All of them lead to broken code history or lost organizational trust.
| Forbidden move | Why itâs bad |
|---|---|
| Newly adopt SVN/Perforce/ClearCase | Thin AI training data, hiring disadvantage, zero rationality today |
| Commit .env or private keys to Git | Remains in history and caches, effectively unrecoverable - see the 2022 Toyota-subsidiary 300k-record incident |
| Force push to main/develop | Other membersâ work erased, history rewriting fails audit |
| Leave branches over a week before merging | Conflict hell, "80% done" reports continuing for 3 weeks |
| Normally commit binary files (videos, models) | Repo bloats to tens of GB, clone takes hours |
| Mix merge methods (Squash/Rebase/Merge commit) | History gets complex, âper-PR revertâ becomes impossible |
| Adopt Git Flow on continuously-deployed SaaS | Too complex, dev speed drops, switch to GitHub Flow |
| No commit-message convention | History search and release-note automation impossible |
| Leave branches over a year old | Backlog graveyard, periodic triage to delete |
| Mix monorepo vs multi-repo by mood | Toolchain disperses, CI duplicate operation |
The GitLab January 2017 DB-deletion incident (production DB deleted during on-call, 4 of 5 backups didn't function, recovered from a 6-hour-old snapshot) is a symbol of the lesson that backups are evaluated not by "taken" but by "restorable."
Force push is absolutely forbidden on main/develop; rebase on your own feature branches is allowed. The point is distinguishing where it is forbidden.
Version control in the AI era
In AI-driven dev, version-control design directly affects AI agent performance. AI fluency in Git operations is a premise; whatâs decisive is whether repo structure, commit conventions, and branch operation are in AI-understandable form.
| Favored in the AI era | Disfavored in the AI era |
|---|---|
| Git + GitHub (standard) | SVN, Perforce (thin AI training data) |
| Monorepo (related code in one place) | Multi-repo (context dispersed) |
| Conventional Commits | Free-form commits |
| Trunk-Based / GitHub Flow | Git Flow (complex, AI mistakes it) |
| README, ADR in Git as Markdown | Confluence, Notion, verbal tradition |
When asking an AI agent to "implement feature X in this repo," having related code together in one repo with a structured commit history dramatically raises the AI's success rate. Conversely, in a multi-repo setup with dependencies scattered across 10 repos, the AI cannot grasp the whole picture, and rework is frequent.
In the AI era, monorepo + Conventional Commits is overwhelmingly superior.
Common misconceptions 1
Version control tends to be seen lightly as "done once we use Git," but design gaps always surface later.
Using Git means version control is OK
Git is just a tool. The design of repo structure, branch strategy, merge policy, and tag operation is the essence of version control - if these are not agreed within the team, version control is effectively absent even with Git in place.
Monorepo only works because itâs Google
Today, with Turborepo, Nx, and pnpm workspaces mature, even a 5-person team can operate a monorepo. The "tool for large scale" view is outdated; small teams especially benefit from monorepo.
Common misconceptions 2
Branches are safer kept long
Long-lived branches are a hotbed of merge hell. Branches older than a week need continuous merging from main, or conflicts snowball. Returning to main within 2-3 days is healthy.
Force push is absolute evil
Force push to main/develop is indeed forbidden, but on your own feature branch, rebasing to organize history is standard practice. The correct understanding is not "absolutely forbidden" but "forbidden in specific places."
Using Git does not equal version control being done. Design is a separate matter.
Authorâs note - GitLab January 2017 DB-deletion incident
To talk about version control, you can't skip GitLab's January 31, 2017 production-DB-deletion incident. The on-call engineer, during midnight incident response, accidentally deleted the production PostgreSQL database directory. The command was supposed to run on the standby system.
What deepened the severity: despite having 5 types of backups, 4 of them weren't functioning. They eventually recovered from a 6-hour-old snapshot, losing about 300 projects and 5,000 comments in between.
GitLab publicly livestreamed the recovery and fully published a detailed postmortem. Lessons such as "backups must be verified by periodic restore drills, otherwise they are meaningless" and "minimize manual human work; use IaC and automation thoroughly" had a strong impact on industry standards thereafter. The very stance of broadcasting openly, hiding nothing during the incident, is told as a fine example of blameless culture.
Backups are evaluated by "restorable," not "taken." Backups without restore drills are just comfort.
What to decide - what is your projectâs answer?
For each of the following, try to articulate your project's answer in 1-2 sentences. Starting work while these remain vague always invites later questions like "why did we decide this again?"
- VCS (Git only)
- Repo structure (monorepo / multi-repo)
- Branch strategy (Trunk-Based / GitHub Flow / Git Flow)
- Merge method (Squash / Rebase / Merge commit)
- Tag-naming rule (SemVer / CalVer / build numbers)
- Large-file handling (LFS / DVC / external reference)
- .gitignore and secret-detection policy
- Legacy-VCS migration plan (SVN etc., if applicable)
How to make the final call
The essence of version control is making code history traceable - using Git alone doesn't even get you halfway there. Design the five axes - repo structure, branch strategy, merge method, tag operation, and secret management - as a set; it only functions when all team members operate under the same rules. Not confusing the tool with operational design is the first step.
Today my recommended standard composition is "Git + GitHub + monorepo + GitHub Flow + Squash merge + Conventional Commits + SemVer/CalVer + Git LFS + Secret Scanning." Whether for AI-driven-dev compatibility, flexibility scaling from small to large, or hiring, no better option exists. Multi-repo is limited to truly independent large organizations, SVN is migrate-only, and Perforce/ClearCase are out of scope for new selection.
Selection priorities
- Default to monorepo - put burden of proof on multi-repo for independence
- Choose GitHub Flow or Trunk-Based for branch strategy - avoid Git Flow
- Squash merge + Conventional Commits to structure history
- Secret Scanning + pre-commit to physically block leaks
"When in doubt, monorepo + GitHub Flow." This handles 80% of teams.
Summary
This article covered version control, including Gitâs superiority, monorepo vs multi-repo, branch strategies, merge methods, tag operation, LFS, secrets, and the AI-era optimal form.
Default to monorepo, choose GitHub Flow or Trunk-Based, structure history via Squash merge + Conventional Commits, physically block leaks with Secret Scanning + pre-commit. That is the practical answer for version control in 2026.
Next time weâll cover dev environment and local execution (Docker Compose, Dev Container, cloud IDE).
I hope youâll read the next article as well.
Series: Architecture Crash Course for the Generative-AI Era (56/89)