Choosing a Deployment Model — On-Prem / Cloud / Hybrid

About this article

This article is the second deep dive in the “System Architecture” category of the Architecture Crash Course for the Generative-AI Era series. It covers the deployment model: “where to put the system, and who runs it.”

Whether you own the physical equipment, rent cloud capacity, or mix both, the decision trades off initial cost, operating cost, security, and flexibility. The article covers the five forms (on-prem / public / private / hybrid / multi-cloud), plus recommended configurations and a selection flow by scale.

What is a deployment model in the first place

A deployment model is, roughly speaking, “the form of where you put the system and who takes care of it.”

Imagine restaurant real-estate strategies. Opening in your own building (on-premises) gives maximum freedom but costs a fortune in construction and maintenance. Leasing a tenant space (public cloud) is cheap up front and comes with shared facilities, but even changing the wall color requires following the building’s rules. You can also fit out part of your own building tenant-style (private cloud), or mix both (hybrid). Which real-estate form you choose trades off initial cost, rent, freedom, and ease of exit, and deployment model selection works the same way.

Why deployment model selection matters

If you proceed without thinking it through, a “cloud for now” start can end with the discovery, six months later, that regulations require domestic data storage, forcing an architecture rebuild. This is not just a technology decision; it is a strategic call in which business, regulation, and technology intertwine.

Where does the system live, and who runs it?

This is the question that comes right after choosing the application form, in the planning stage of any new system: own physical equipment, rent cloud, or mix both. The choice trades off initial cost, operating cost, security, and flexibility all at once.

In the early 2010s the choice was a binary “on-prem vs cloud.” Today the options have multiplied (public, private, hybrid, multi-cloud), and the optimum varies with company size, industry, regulations, and existing assets. In Japan’s finance and public sectors, regulation constrains whether and how cloud can be used at all, so pure technical judgment doesn’t suffice.

Deployment-model selection is closer to a CTO / CIO-level executive decision than a pure tech call. Engineers cannot decide it alone; legal, procurement, and executive teams must be in the room. Understanding this first prevents you from writing a technically optimal proposal that fails internally. The deployment model is a strategic decision at the intersection of business, regulation, and technology.

The five major forms

flowchart LR
    ON[On-prem<br/>Self-owned hardware]
    PRIV[Private cloud<br/>Dedicated cloud env]
    HYB[Hybrid<br/>On-prem + cloud]
    PUB[Public cloud<br/>AWS/Azure/GCP shared]
    MULTI[Multi-cloud<br/>Multiple clouds]
    ON --> PRIV
    PRIV --> HYB
    HYB --> PUB
    PUB --> MULTI
    ON -. "Manage everything" .- LABEL_L[Heavy management]
    MULTI -. "Manage minimum" .- LABEL_R[Light management]
    classDef onp fill:#fee2e2,stroke:#dc2626;
    classDef priv fill:#fef3c7,stroke:#d97706;
    classDef hyb fill:#f0f9ff,stroke:#0369a1;
    classDef pub fill:#dbeafe,stroke:#2563eb;
    classDef mul fill:#fae8ff,stroke:#a21caf;
    class ON onp;
    class PRIV priv;
    class HYB hyb;
    class PUB pub;
    class MULTI mul;

| Form | Substance |
| --- | --- |
| On-premises | Own and operate physical servers in-house |
| Public cloud | Shared use of AWS / Azure / GCP, etc. |
| Private cloud | Cloud environment dedicated to one organization |
| Hybrid cloud | On-prem and cloud coupled |
| Multi-cloud | Multiple clouds combined |

The majority of new systems are built on public cloud, but for established enterprises hybrid is the realistic answer. “Fully migrated to cloud” is actually a minority position; many large Japanese enterprises keep their core systems on-prem and only build new systems in cloud.

On-premises

On-premises is the traditional model: buy and install physical servers, network gear, and storage in your own data center or server room. Before “cloud” existed as a concept, this was the only choice.

| Pros | Cons |
| --- | --- |
| Highest customizability | Very high initial cost |
| Physical isolation = high security | Hardware procurement takes weeks to months |
| Full control over latency | All ops in-house |
| Reuse existing assets | Disaster / outage recovery is hard |

On-prem still persists in financial-institution core systems, government agencies, healthcare, and manufacturing-control systems. The reasons are regulation, security requirements, and legacy weight — non-technical “things you can’t move” cases dominate. “It’s old so it’s still here” isn’t accurate; “there’s a reason it’s still here” is closer to truth.

Picking on-prem fresh in 2026 is rare. But when you are adding features on top of existing assets, staying on-prem remains an option.

Public cloud

Public cloud is shared infrastructure provided by hyperscalers — AWS, Azure, Google Cloud — across many tenants. “Zero upfront, pay-as-you-go” is the structure that powered the post-2010 startup explosion.

| Pros | Cons |
| --- | --- |
| Near-zero initial cost | Customization limits |
| Pay-as-you-go cost optimization | Sharing physical resources with others |
| Usable from day one | Hard to satisfy strict security standards in some cases |
| Auto-scaling, resource changes | Vendor lock-in risk |

Representative services: AWS EC2 / S3 / Lambda, Azure VM / App Service, GCP Compute Engine / Cloud Run. 80%+ of new services are built on public cloud — the de facto default. Twenty years after EC2’s 2006 launch, “new = public cloud” is locked in.

Default new projects to public cloud. To pick anything else, be ready to explain “why not public cloud” — if you can’t, the call is emotional, not technical.

Private cloud

Private cloud is an independent cloud environment dedicated to one organization. The aim is reconciling “cloud flexibility” with “dedicated-environment safety,” common in finance, healthcare, and government.

| Type | Description |
| --- | --- |
| On-prem-style | Build a cloud environment in your own data center |
| Hosted-style | Carve out a dedicated section inside a major cloud (current mainstream) |

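Hosted-style is the mainstream: “physically managed by the cloud provider but logically dedicated.” There is no need to build your own DC, initial cost is lower, and dedicated-environment safety is preserved. Note that AWS Outposts and Azure Stack, which often come up in this context, run vendor-designed infrastructure inside your own data center, so they sit closer to the on-prem style.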

| Pros | Cons |
| --- | --- |
| Public-cloud flexibility | Higher cost than public |
| Higher security bar | Construction / ops effort |
| No noisy-neighbor effects | Resource provisioning may be slow |

Private cloud is a cross between cloud and a dedicated environment. It is particularly effective in regulated industries.

Private cloud realization on AWS

Multiple services are available depending on the isolation level required. Reason backward from the requirements: “how much do we actually need to isolate from others?”

| Service | Isolation level | Use |
| --- | --- | --- |
| Amazon VPC | Logical (virtual network) | Most basic isolation |
| AWS PrivateLink | Internal traffic | Communication without traversing the Internet |
| Direct Connect / VPN | External traffic | Dedicated-line connection |
| EC2 Dedicated Hosts | Physical | Don’t share physical hosts |
| AWS Outposts | Site-level | AWS environment in your own DC |

Logical isolation via VPC is free and used by nearly all AWS users. Dedicated Hosts is the upper-tier option for finance / healthcare with “no physical sharing” mandates — a last-resort weapon. “Just in case” physical isolation balloons cost by multiples.

Raise isolation in steps as required. Excess maps directly to waste.
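
As a rough illustration of "reason backward from the requirements," the sketch below maps a few yes/no mandates to the lowest isolation level that satisfies them. The enum ordering, the grouping of PrivateLink and Direct Connect into one level, and the parameter names are assumptions made for this example, not an AWS classification.

```python
from enum import IntEnum


class Isolation(IntEnum):
    """Isolation levels from the table above, ordered weakest to strongest.

    The numeric ordering is an assumption made for this sketch.
    """
    LOGICAL = 1       # Amazon VPC
    PRIVATE_PATH = 2  # AWS PrivateLink / Direct Connect / VPN
    PHYSICAL = 3      # EC2 Dedicated Hosts
    SITE = 4          # AWS Outposts


def required_isolation(no_internet_path: bool,
                       no_shared_hosts: bool,
                       must_run_in_own_dc: bool) -> Isolation:
    """Return the lowest level that satisfies the stated mandates."""
    if must_run_in_own_dc:
        return Isolation.SITE
    if no_shared_hosts:
        return Isolation.PHYSICAL
    if no_internet_path:
        return Isolation.PRIVATE_PATH
    # No explicit mandate: stay at the cheapest level.
    return Isolation.LOGICAL
```

Starting from the weakest level and stepping up only when a mandate forces it keeps "just in case" physical isolation out of the design.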

Hybrid cloud

Hybrid cloud combines on-prem (or private cloud) with public cloud, integrated. For Japanese enterprises this is the actual center of gravity.

Typical patterns: “keep existing on-prem systems, build new on cloud,” “core on-prem, frontend on cloud,” “sensitive data on-prem, analytics on cloud.” With circumstances preventing full cloud migration (existing assets, regulation, audit, executive decisions), only new builds go to cloud — that’s hybrid in practice.

| Pros | Cons |
| --- | --- |
| Use cloud while keeping legacy systems | Hard to optimize cost |
| Physically isolate sensitive data | System and operations complexity |
| Phased cloud migration is possible | Need expertise in both |

The biggest reason to pick hybrid is the organizational reality of “want full cloud, can’t drop existing assets.” It’s a compromise stitching past and future, not a technical optimum. Operational cost is high; engineering load is heavy. Whether it’s healthy or pure debt depends on whether you frame it as “a transition midpoint” with a real plan to consolidate to a single cloud later.

Hybrid cloud is enterprises’ realistic landing zone. Full-cloud migration turns out to be unexpectedly hard.

Multi-cloud

Multi-cloud combines multiple public clouds — AWS + Azure, AWS + GCP. Goals are “avoid vendor lock-in” and “pick best-of-breed by use case,” but operational difficulty spikes.

| Pros | Cons |
| --- | --- |
| Best service per use case | System / operational complexity |
| Avoid vendor dependency | Hard to optimize cost |
| Distribute outage impact | Specialty knowledge per environment |

The thinking: “compute on AWS, AI on GCP, Office integration on Azure.” Beautiful in theory. Unifying ops, monitoring, and security across all environments is extremely hard, and many cases land at “don’t do it.”

“Just-in-case multi-cloud” is a textbook soft choice. Without specialists for both clouds, mid-sized teams take on operational debt the moment they adopt it. The practical floor for multi-cloud is 3+ dedicated infra engineers. Below that, update tracking, monitoring unification, and permission management alone melt the team.

Pick multi-cloud only with a clear reason. Defaulting to it spikes ops cost.

On-prem vs cloud cost comparison

The reflexive belief that “cloud is cheaper” isn’t always true; depending on scale and time horizon, on-prem can be cheaper. Use TCO (Total Cost of Ownership) as the comparison axis.

| Item | On-prem | Public cloud |
| --- | --- | --- |
| Initial | Tens of thousands to millions of dollars | Near zero |
| Monthly | Depreciation + electricity + ops labor | Pay-as-you-go |
| 5-year TCO (small / mid) | Higher | Lower |
| 5-year TCO (large / heavy load) | Lower (long term) | Higher (load growth explodes cost) |

Famously, Dropbox in 2015-2016 ran the “Magic Pocket” project to migrate its storage backend from AWS to its own data center, and publicly reported tens of millions of dollars in annual cost savings over multiple years. Even Amazon internally is moving certain workloads back to in-house DCs.

Cloud is overwhelmingly better for “fast start, flexible,” but cost spikes when you use it heavily at scale. Not a panacea.
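
To make the TCO comparison concrete, here is a minimal sketch that totals both models over a time horizon and finds the month where cumulative cloud spend overtakes on-prem. The cost model (a fixed upfront cost plus flat monthly ops for on-prem, a cloud bill growing by a constant monthly rate) and all function names are simplifying assumptions for illustration; real inputs would come from your own quotes and bills.

```python
from typing import Optional


def on_prem_tco(upfront: float, monthly_ops: float, months: int) -> float:
    """Upfront purchase plus flat monthly cost (power, labor, depreciation)."""
    return upfront + monthly_ops * months


def cloud_tco(first_month: float, monthly_growth: float, months: int) -> float:
    """Sum of a pay-as-you-go bill that grows by a fixed rate each month."""
    total, bill = 0.0, first_month
    for _ in range(months):
        total += bill
        bill *= 1 + monthly_growth
    return total


def break_even_month(upfront: float, monthly_ops: float,
                     first_month: float, monthly_growth: float,
                     horizon: int = 120) -> Optional[int]:
    """First month where cumulative cloud spend exceeds on-prem, else None."""
    for month in range(1, horizon + 1):
        if cloud_tco(first_month, monthly_growth, month) > on_prem_tco(
                upfront, monthly_ops, month):
            return month
    return None
```

If `break_even_month` returns `None` or a value beyond the period you can realistically plan for, the numbers favor staying on cloud.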

That said, this is the “Dropbox-scale” discussion. 90%+ of services close before they reach Dropbox scale, and the ones that do take 5-10 years. “We’ll be Dropbox-scale eventually so let’s start on-prem” is canonical premature optimization and almost always fails.

Selection criteria

Selection lands on three axes: cost, business characteristics, security requirements.

| Case | Recommended |
| --- | --- |
| No constraints, want to minimize cost and ops | Public cloud only |
| Strict requirements on data location | Hybrid cloud |
| Mandatory Google / Microsoft integration | Vendor-leaning or multi-cloud |
| Large existing on-prem assets | Hybrid (phased migration) |
| Finance / healthcare / government | Private cloud / on-prem |

For new startups and individuals: “don’t think, public cloud.” Picking on-prem without a special reason is anachronistic and even hurts hiring. As of 2026, recruiting young engineers to “on-prem-centric” companies grows harder every year.

Even within “public cloud,” the right configuration shifts with org size. Pair monthly cost guidance with the operational capacity required:

| Phase | Monthly infra (est) | Recommended | Dedicated infra people |
| --- | --- | --- | --- |
| MVP / individual | up to $300 | Single public cloud, 1 region, managed-first | 0 (split duty) |
| Early startup | $300-3k | Single cloud, multi-AZ, IaC mandatory | 0.5 (split duty) |
| Mid-sized SaaS | $3k-30k | Single cloud, multi-AZ + DR in 2 regions | 1-3 |
| Enterprise core | $30k+ | Hybrid (legacy + cloud), dedicated lines, AWS Organizations | 5+ |
| Finance / public / healthcare | Industry-dependent | Private cloud or hybrid + compliance certifications | 10+ |

The practical floor for multi-cloud / hybrid is 3+ dedicated infra people. Below that, just tracking updates, unifying monitoring, and managing permissions melts the team. Picking multi-cloud “for the future” is a textbook soft choice; adopting it only once it is actually needed is not too late.

Until single-cloud causes pain at your scale, “the courage to lean on one” saves operations.
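
The phase table can be read as a simple lookup keyed on monthly spend. The sketch below encodes the four spend-based rows and the three-person floor; the regulated-industry row is left out because its budget is industry-dependent. Treating each upper bound as inclusive, and the function names themselves, are assumptions of this example.

```python
PHASES = [
    # (upper bound of monthly infra spend in USD, phase, recommended setup)
    (300, "MVP / individual",
     "Single public cloud, 1 region, managed-first"),
    (3_000, "Early startup",
     "Single cloud, multi-AZ, IaC mandatory"),
    (30_000, "Mid-sized SaaS",
     "Single cloud, multi-AZ + DR in 2 regions"),
    (float("inf"), "Enterprise core",
     "Hybrid (legacy + cloud), dedicated lines, AWS Organizations"),
]

# Floor stated in the article for running hybrid or multi-cloud.
MIN_PEOPLE_FOR_HYBRID_OR_MULTI = 3


def phase_for(monthly_usd: float) -> tuple[str, str]:
    """Return (phase, recommended setup) for a monthly infra spend."""
    for ceiling, phase, setup in PHASES:
        if monthly_usd <= ceiling:
            return phase, setup
    raise ValueError(f"cannot classify spend: {monthly_usd!r}")


def can_run_hybrid_or_multi(dedicated_infra_people: float) -> bool:
    """True when the team meets the staffing floor for hybrid / multi-cloud."""
    return dedicated_infra_people >= MIN_PEOPLE_FOR_HYBRID_OR_MULTI
```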

Hybrid / multi-cloud traps

Especially with hybrid / multi-cloud, ops cost roughly doubles the moment you adopt, so simply avoiding the forbidden moves below prevents the worst injuries.

| Forbidden move | Why |
| --- | --- |
| Adopting multi-cloud “just in case” at the start | Hiring specialists for both, unified monitoring, SSO unification: cost and complexity grow exponentially, not linearly |
| Same IP range on on-prem and cloud | Routing collision over VPN / dedicated lines; CIDR design must be unified up front |
| Runbook for one side only in hybrid | On-prem has runbooks but cloud is by hand (or vice versa): a recipe for breakdown |
| Duplicated auth between cloud and on-prem | Password / permission drift, audit failure; unify via AD / IdP |
| Hybrid construction without estimating egress | Monthly cloud-to-on-prem data movement bills can hit thousands of dollars |
| Multi-cloud + DIY abstraction layer | “Cloud-neutral” home-grown wrappers hit a maintenance ceiling in 6 months. Beyond Kubernetes, DIY is a bad fit |
| “Cloud is cheaper” as the sole reason for a full migration off on-prem | TCO varies with scale, and on-prem can be cheaper. 3-year TCO comparison is the rule |
| Not checking data-residency regions | GDPR and similar can make overseas-region data storage a regulatory violation |

Cost comparison should use 3-year TCO. Dropbox’s “AWS -> own DC” story stands up only with multi-year TCO; judging from a few months of bills produces wrong conclusions every time.

Don’t pick hybrid / multi-cloud below 3 dedicated people. Operations don’t function.
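
Three of the traps in the table above (overlapping IP ranges, unestimated egress, unchecked data residency) lend themselves to mechanical pre-flight checks. The following is a minimal sketch using only the standard library; the function names, the flat per-GB price model, and any rate or region you pass in are assumptions for illustration, not provider-quoted values.

```python
import ipaddress
from itertools import combinations


def overlapping_ranges(cidrs: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of named networks whose address ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [(a, b)
            for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
            if net_a.overlaps(net_b)]


def monthly_egress_usd(gb_per_day: float, usd_per_gb: float,
                       days: int = 30) -> float:
    """Rough monthly cloud-to-on-prem transfer bill at a flat per-GB rate."""
    return gb_per_day * days * usd_per_gb


def regions_outside_policy(used: set[str], allowed: set[str]) -> set[str]:
    """Regions in use that the residency policy does not allow."""
    return used - allowed


# Example: the on-prem range swallows the cloud VPC range, so routing collides.
clashes = overlapping_ranges({
    "on-prem": "10.0.0.0/16",
    "cloud-vpc": "10.0.128.0/20",
    "cloud-dr": "10.1.0.0/16",
})
```

Running checks like these before the VPN or dedicated line is ordered is far cheaper than renumbering a network afterward.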

AI decision axes

With AI-driven development as the assumption, the dominant axis collapses to one point: “does it complete in IaC?”

| AI-era favorable | AI-era unfavorable |
| --- | --- |
| Public cloud + full IaC | On-prem, manual builds |
| Lean on a single cloud (AI’s context concentrates) | Multi-cloud (config dispersed, AI hallucinates) |
| Containers + Kubernetes manifests | GUI-required PaaS consoles |
| Hybrid with explicit IaC boundaries | On-prem left as “touchable black box” |

The new design principle is not “AI can write it, so complexity is OK” — it’s “keep architecture inside what AI can write.”

Put together, selection proceeds in four steps:

  1. Identify regulation / industry-mandated exclusions (finance, healthcare, public).
  2. Estimate the weight of existing on-prem assets (if undroppable, phased hybrid).
  3. Cut multi-cloud by ops headcount (under 3 dedicated infra engineers -> single cloud).
  4. Use IaC-ability as the axis for judging future debt.
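
The four steps can be sketched as a small decision function. This is one possible encoding, not a definitive rule set: the `Project` fields, the result strings, and the warning texts are invented for this example, and the staffing threshold uses the three-person floor the article states for hybrid and multi-cloud.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    regulated_industry: bool       # finance, healthcare, public sector
    undroppable_on_prem: bool      # existing assets that cannot be retired
    dedicated_infra_people: float
    wants_multi_cloud: bool
    fully_iac: bool                # whole setup expressible as IaC


@dataclass
class Decision:
    model: str
    warnings: list[str] = field(default_factory=list)


MIN_PEOPLE_FOR_COMPLEX = 3  # floor for hybrid / multi-cloud


def select_model(p: Project) -> Decision:
    # Step 1: regulation or industry mandates come first.
    if p.regulated_industry:
        d = Decision("private cloud / on-prem")
    # Step 2: heavy existing assets push toward phased hybrid.
    elif p.undroppable_on_prem:
        d = Decision("hybrid (phased migration)")
    else:
        d = Decision("public cloud (single vendor)")

    # Step 3: cut complexity the team cannot staff.
    understaffed = p.dedicated_infra_people < MIN_PEOPLE_FOR_COMPLEX
    if p.wants_multi_cloud and understaffed:
        d.warnings.append("drop multi-cloud: too few dedicated infra people")
    if d.model.startswith("hybrid") and understaffed:
        d.warnings.append("hybrid is understaffed: hire or narrow the scope")

    # Step 4: anything outside IaC is counted as future debt.
    if not p.fully_iac:
        d.warnings.append("not fully expressed in IaC: count as future debt")
    return d
```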

The weekend bill incident (industry case)

A common case at companies new to cloud: someone spins up a verification environment on Friday night, forgets to stop it over the weekend, and the month-end bill comes in at 10x the expected amount. GPU instances, large RDS instances, and NAT Gateways that were spun up “just to try briefly” and never stopped are the classic offenders.

Cloud’s “only what you use” banner trips you up because there are kinds of resources billed even when not used. NAT Gateways, ELBs, and unattached Elastic IPs are the canonical ones; the person who spun them up doesn’t notice while a few dollars/day melt.
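
The arithmetic behind a forgotten weekend is simple enough to sketch. The helper names below are hypothetical, and the hourly rates you feed in must come from your provider's current price list; none are assumed here.

```python
from datetime import datetime


def hours_between(start: datetime, end: datetime) -> float:
    """Elapsed hours between two timestamps."""
    return (end - start).total_seconds() / 3600


def idle_cost(hourly_usd: dict[str, float], hours: float) -> float:
    """Cost of leaving every listed resource running for the given hours."""
    return sum(hourly_usd.values()) * hours


# Friday 19:00 to Monday 09:00 is 62 hours of billing with nobody watching.
weekend_hours = hours_between(datetime(2026, 1, 9, 19), datetime(2026, 1, 12, 9))
```

Multiplying even a modest combined hourly rate by 62 hours shows why a single forgotten GPU instance or database dominates a small team's monthly bill.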

The fix is simple: set cost-cap alerts (AWS Budgets, Google Cloud budget alerts) and get notified when monthly thresholds are crossed. Cloud operations without these are an accident waiting to happen.
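
The logic behind a threshold alert is small. The sketch below is plain Python that mirrors the idea, not the AWS Budgets or Google Cloud API; the function names, the default thresholds, and the linear month-end forecast are assumptions of this example.

```python
def crossed_thresholds(spend_to_date: float, monthly_budget: float,
                       thresholds: tuple[float, ...] = (0.5, 0.8, 1.0)) -> list[float]:
    """Return every budget fraction the current spend has reached."""
    ratio = spend_to_date / monthly_budget
    return [t for t in thresholds if ratio >= t]


def forecast_month_end(spend_to_date: float, day_of_month: int,
                       days_in_month: int) -> float:
    """Naive linear projection of the month-end bill from spend so far."""
    return spend_to_date / day_of_month * days_in_month
```

Alerting on the forecast as well as on actual spend catches a runaway resource in the first days of the month, before the actual-spend thresholds are reached.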

Cloud’s “pay-as-you-go” has parts billed even when idle. Cost alerts are mandatory.

What you must decide — what’s your project’s answer?

Articulate your project’s answer in 1-2 sentences for each:

  • Base deployment model (Cloud / On-prem / Hybrid)
  • Cloud vendor (AWS / GCP / Azure / domestic)
  • Isolation level (shared / logical / physical)
  • Data location (country, region, regulation handling)
  • Existing-system integration
  • Cost ceiling and monitoring
  • DR / BCP (Business Continuity Plan) requirements
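
One lightweight way to keep this checklist honest is to hold the answers in a small record and list whatever is still blank. The class and field names below are illustrative assumptions that mirror the bullets above.

```python
from dataclasses import dataclass, fields


@dataclass
class DeploymentDecisions:
    """One short answer per checklist item; empty string means undecided."""
    base_model: str = ""
    cloud_vendor: str = ""
    isolation_level: str = ""
    data_location: str = ""
    existing_system_integration: str = ""
    cost_ceiling_and_monitoring: str = ""
    dr_bcp_requirements: str = ""


def unanswered(decisions: DeploymentDecisions) -> list[str]:
    """Names of checklist items that still have no answer."""
    return [f.name for f in fields(decisions)
            if not getattr(decisions, f.name).strip()]
```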

Recording the decision rationale

Deployment-model selection has a wide blast radius and high cost to reverse, so recording the rationale as an ADR (Architecture Decision Record) at decision time is strongly recommended. Here is a concrete example:

| Item | Content |
| --- | --- |
| Title | Run production on containers (ECS Fargate) |
| Status | Accepted |
| Context | Current EC2 manual provisioning takes 30+ minutes per deploy, and environment-drift incidents occur ~3 times/year. Goal: reduce ops load while improving deploy speed |
| Decision | Adopt AWS ECS Fargate and containerize all services |
| Rationale | (1) Eliminates EC2 OS management and patching, saving ~20 hours/month in ops; (2) Container images lock the environment, eliminating “works in dev but not in prod”; (3) Fargate is serverless, so autoscaling config is minimal |
| Rejected alternatives | EKS (Kubernetes): overhead too large for a 4-person team. Lambda: existing app is stateful, migration cost too high |
| Outcome | Containerization requires CI/CD pipeline setup. Dockerfile standardization and image scanning become additional tasks |

Store ADRs as Markdown in docs/adr/ in the code repo, and approve them through the same PR-review flow as code. Having “why we chose this” visible at a glance later is the greatest value of an ADR.
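
A minimal sketch of generating such a file follows. The numbered `NNNN-slug.md` naming scheme and the heading layout are common conventions assumed here, not something the article prescribes; only the `docs/adr/` location comes from the text.

```python
import re
from pathlib import PurePosixPath


def adr_path(number: int, title: str) -> PurePosixPath:
    """Build docs/adr/NNNN-slug.md from a sequence number and a title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return PurePosixPath("docs/adr") / f"{number:04d}-{slug}.md"


def render_adr(title: str, status: str, sections: dict[str, str]) -> str:
    """Render an ADR as Markdown: title, status, then each named section."""
    parts = [f"# {title}", "", "## Status", "", status]
    for heading, body in sections.items():
        parts += ["", f"## {heading}", "", body]
    return "\n".join(parts) + "\n"
```

For example, `adr_path(1, "Run production on containers (ECS Fargate)")` yields `docs/adr/0001-run-production-on-containers-ecs-fargate.md`, and the rendered text can be committed in the same pull request as the change it justifies.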

Summary

This article covered deployment-model selection — “where does the system live, and who runs it?”

Public cloud is the overwhelming default for new projects. Hybrid and multi-cloud should be limited to cases with clear reasons and the operating capacity. In the AI era, single-cloud-with-IaC’s advantage only widens.

The next article covers the biggest decision after picking public cloud: cloud vendor selection (AWS / Azure / GCP).

I hope you’ll read the next article as well.