About this article
This article is the second deep dive in the “System Architecture” category of the Architecture Crash Course for the Generative-AI Era series, covering the deployment model — “where to put the system, and who runs it.”
Whether you own the physical equipment, rent cloud, or mix both, this decision trades off initial cost, operating cost, security, and flexibility against one another. The article covers the five forms (on-prem / public / private / hybrid / multi-cloud) plus recommended configurations and a selection flow by scale.
What is a deployment model in the first place
A deployment model is, roughly speaking, “the form of where you put the system and who takes care of it.”
Imagine restaurant real-estate strategies. Opening in your own building (on-premises) gives maximum freedom but costs a fortune in construction and maintenance. Leasing a tenant space (public cloud) is cheap up-front with shared facilities, but even changing the wall color requires following the building’s rules. You can also outfit part of your own building as tenant-style (private cloud), or mix both (hybrid). Which real-estate form you choose trades off initial cost, rent, freedom, and ease of exit, and deployment model selection works exactly the same way.
Why deployment model selection matters
If you proceed without thinking it through, “cloud for now” can mean discovering six months later that regulations require domestic data storage — forcing an architecture rebuild. This is not just a technology decision; it’s a strategic call intertwining business, regulation, and technology.
Where does the system live, and who runs it?
This is the question that comes right after choosing the application form, in the planning stage of any new system. Own physical equipment, rent cloud, or mix both. The choice trades off initial cost, operating cost, security, and flexibility all at once.
In the early 2010s the choice was a binary “on-prem vs cloud.” Today the options have multiplied (public, private, hybrid, multi-cloud) and the optimum varies with company size, industry, regulations, and existing assets. In Japan’s finance and public sectors, regulation constrains whether and how cloud may be used at all, making this an area where pure technical judgment doesn’t suffice.
Deployment-model selection is closer to a CTO / CIO-level executive decision than a pure tech call. Engineers cannot decide it alone; legal, procurement, and executive teams must be in the room. Understanding this first prevents you from writing a technically optimal proposal that fails internally. Deployment model is a strategic decision at the intersection of business, regulation, and technology.
The five major forms
flowchart LR
ON[On-prem<br/>Self-owned hardware]
PRIV[Private cloud<br/>Dedicated cloud env]
HYB[Hybrid<br/>On-prem + cloud]
PUB[Public cloud<br/>AWS/Azure/GCP shared]
MULTI[Multi-cloud<br/>Multiple clouds]
ON --> PRIV
PRIV --> HYB
HYB --> PUB
PUB --> MULTI
ON -. "Manage everything" .- LABEL_L[Heavy management]
MULTI -. "Manage minimum" .- LABEL_R[Light management]
classDef onp fill:#fee2e2,stroke:#dc2626;
classDef priv fill:#fef3c7,stroke:#d97706;
classDef hyb fill:#f0f9ff,stroke:#0369a1;
classDef pub fill:#dbeafe,stroke:#2563eb;
classDef mul fill:#fae8ff,stroke:#a21caf;
class ON onp;
class PRIV priv;
class HYB hyb;
class PUB pub;
class MULTI mul;
| Form | Substance |
|---|---|
| On-premises | Own and operate physical servers in-house |
| Public cloud | Shared use of AWS / Azure / GCP, etc. |
| Private cloud | Cloud environment dedicated to one organization |
| Hybrid cloud | On-prem and cloud coupled |
| Multi-cloud | Multiple clouds combined |
The majority of new systems are built on public cloud, but for established enterprises hybrid is the realistic answer. “Fully migrated to cloud” is actually a minority position; many large Japanese enterprises keep their core systems on-prem and only build new systems in cloud.
On-premises
On-premises is the traditional model: buy and install physical servers, network gear, and storage in your own data center or server room. Before “cloud” existed as a concept, this was the only choice.
| Pros | Cons |
|---|---|
| Highest customizability | Very high initial cost |
| Physical isolation = high security | Hardware procurement takes weeks to months |
| Full control over latency | All ops in-house |
| Reuse existing assets | Disaster / outage recovery is hard |
On-prem still persists in financial-institution core systems, government agencies, healthcare, and manufacturing-control systems. The reasons are regulation, security requirements, and legacy weight — non-technical “things you can’t move” cases dominate. “It’s old so it’s still here” isn’t accurate; “there’s a reason it’s still here” is closer to truth.
Picking on-prem fresh in 2026 is rare. But adding features on top of existing assets still leaves on-prem as an option.
Public cloud
Public cloud is shared infrastructure provided by hyperscalers — AWS, Azure, Google Cloud — across many tenants. “Zero upfront, pay-as-you-go” is the structure that powered the post-2010 startup explosion.
| Pros | Cons |
|---|---|
| Near-zero initial cost | Customization limits |
| Pay-as-you-go cost optimization | Sharing physical resources with others |
| Usable from day one | Hard to satisfy strict security standards in some cases |
| Auto-scaling, resource changes | Vendor lock-in risk |
Representative services: AWS EC2 / S3 / Lambda, Azure VM / App Service, GCP Compute Engine / Cloud Run. 80%+ of new services are built on public cloud — the de facto default. Twenty years after EC2’s 2006 launch, “new = public cloud” is locked in.
Default new projects to public cloud. To pick anything else, be ready to explain “why not public cloud” — if you can’t, the call is emotional, not technical.
Private cloud
Private cloud is an independent cloud environment dedicated to one organization. The aim is reconciling “cloud flexibility” with “dedicated-environment safety,” common in finance, healthcare, and government.
| Type | Description |
|---|---|
| On-prem-style | Build a cloud environment in your own data center |
| Hosted-style | Carve out a dedicated section inside a major cloud (current mainstream) |
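Hosted-style is the mainstream: capacity that is “physically managed by the cloud provider but logically dedicated,” for example a VPC combined with EC2 Dedicated Hosts. No need to build your own DC, lower initial cost, dedicated-environment safety preserved. Note that AWS Outposts and Azure Stack install vendor-supplied hardware in your own facility, so they sit closer to the on-prem style.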
| Pros | Cons |
|---|---|
| Public-cloud flexibility | Higher cost than public |
| Higher security bar | Construction / ops effort |
| No noisy-neighbor effects | Resource provisioning may be slow |
Private cloud is the middle ground between cloud and a dedicated environment. Particularly effective in regulated industries.
Private cloud realization on AWS
Multiple services are available depending on the isolation level required. Reason backward from the requirements: “how much do we actually need to isolate from others?”
| Service | Isolation level | Use |
|---|---|---|
| Amazon VPC | Logical (virtual network) | Most basic isolation |
| AWS PrivateLink | Internal traffic | Communication without traversing the Internet |
| Direct Connect / VPN | External traffic | Dedicated-line (Direct Connect) or encrypted-tunnel (VPN) connection to on-prem |
| EC2 Dedicated Hosts | Physical | Don’t share physical hosts |
| AWS Outposts | Site-level | AWS environment in your own DC |
Logical isolation via VPC is free and used by nearly all AWS users. Dedicated Hosts is the upper-tier option for finance / healthcare with “no physical sharing” mandates — a last-resort weapon. “Just in case” physical isolation balloons cost by multiples.
Raise isolation in steps as required. Excess maps directly to waste.
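The “raise isolation in steps” idea can be sketched as a lookup that returns the lowest tier covering every stated requirement. This is a minimal sketch, not an AWS-defined scheme: the requirement names are hypothetical, and treating the tiers in the table above as one strictly ordered ladder is a simplifying assumption.

```python
from enum import IntEnum


class Isolation(IntEnum):
    """Tiers from the table above, ordered weakest to strongest (assumed ordering)."""
    LOGICAL = 1       # Amazon VPC
    PRIVATE_PATH = 2  # AWS PrivateLink, Direct Connect / VPN
    PHYSICAL = 3      # EC2 Dedicated Hosts
    SITE = 4          # AWS Outposts


# Hypothetical requirement flags mapped to the lowest tier that satisfies each.
REQUIREMENT_FLOOR = {
    "separate_network": Isolation.LOGICAL,
    "no_internet_traversal": Isolation.PRIVATE_PATH,
    "no_shared_hardware": Isolation.PHYSICAL,
    "data_must_stay_on_site": Isolation.SITE,
}


def minimum_isolation(requirements: set) -> Isolation:
    """Return the lowest tier that covers every stated requirement."""
    unknown = set(requirements) - set(REQUIREMENT_FLOOR)
    if unknown:
        raise ValueError(f"unknown requirements: {sorted(unknown)}")
    # With no stated requirement, stay at the baseline (VPC) instead of
    # adding isolation "just in case".
    return max(
        (REQUIREMENT_FLOOR[r] for r in requirements),
        default=Isolation.LOGICAL,
    )


if __name__ == "__main__":
    print(minimum_isolation({"separate_network"}).name)
    print(minimum_isolation({"separate_network", "no_shared_hardware"}).name)
```

The point of the design is that the function never returns a higher tier than the requirements demand, which mirrors the “excess maps directly to waste” rule.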
Hybrid cloud
Hybrid cloud combines on-prem (or private cloud) with public cloud, integrated. For Japanese enterprises this is the actual center of gravity.
Typical patterns: “keep existing on-prem systems, build new on cloud,” “core on-prem, frontend on cloud,” “sensitive data on-prem, analytics on cloud.” With circumstances preventing full cloud migration (existing assets, regulation, audit, executive decisions), only new builds go to cloud — that’s hybrid in practice.
| Pros | Cons |
|---|---|
| Use cloud while keeping legacy systems | Hard to optimize cost |
| Physically isolate sensitive data | System and operations complexity |
| Phased cloud migration is possible | Need expertise in both |
The biggest reason to pick hybrid is the organizational reality of “want full cloud, can’t drop existing assets.” It’s a compromise stitching past and future, not a technical optimum. Operational cost is high; engineering load is heavy. Whether it’s healthy or pure debt depends on whether you frame it as “a transition midpoint” with a real plan to consolidate to a single cloud later.
Hybrid cloud is enterprises’ realistic landing zone. Full-cloud migration turns out to be unexpectedly hard.
Multi-cloud
Multi-cloud combines multiple public clouds — AWS + Azure, AWS + GCP. Goals are “avoid vendor lock-in” and “pick best-of-breed by use case,” but operational difficulty spikes.
| Pros | Cons |
|---|---|
| Best service per use case | System / operational complexity |
| Avoid vendor dependency | Hard to optimize cost |
| Distribute outage impact | Specialty knowledge per environment |
The thinking: “compute on AWS, AI on GCP, Office integration on Azure.” Beautiful in theory. Unifying ops, monitoring, and security across all environments is extremely hard, and many cases land at “don’t do it.”
“Just-in-case multi-cloud” is a textbook naive choice. Without specialists for both clouds, mid-sized teams take on operational debt the moment they adopt it. The practical floor for multi-cloud is 3+ dedicated infra engineers. Below that, update tracking, monitoring unification, and permission management alone overwhelm the team.
Pick multi-cloud only with a clear reason. Defaulting to it spikes ops cost.
On-prem vs cloud cost comparison
The reflexive assumption that “cloud is cheaper” isn’t always true; depending on scale and time horizon, on-prem can be cheaper. Use TCO (Total Cost of Ownership) as the accurate comparison axis.
| Item | On-prem | Public cloud |
|---|---|---|
| Initial | Tens of thousands to millions of dollars | Near zero |
| Monthly | Depreciation + electricity + ops labor | Pay-as-you-go |
| 5-year TCO (small / mid) | Higher | Lower |
| 5-year TCO (large / heavy load) | Lower (long term) | Higher (load growth explodes cost) |
Famously, Dropbox in 2015-2016 ran the “Magic Pocket” project to migrate its storage backend from AWS to its own data centers, and publicly reported tens of millions of dollars in annual cost savings over multiple years. Other companies have since publicly described moving selected workloads off public cloud as well.
Cloud is overwhelmingly better for “fast start, flexible,” but cost spikes when you use it heavily at scale. Not a panacea.
That said, this is the “Dropbox-scale” discussion. 90%+ of services close before they reach Dropbox scale, and the ones that do take 5-10 years. “We’ll be Dropbox-scale eventually so let’s start on-prem” is canonical premature optimization and almost always fails.
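The TCO comparison above can be made concrete with a small calculation. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, the cloud bill grows by a fixed percentage each month, and on-prem capacity is treated as fixed (no hardware refresh or expansion), which flatters on-prem at high growth.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OnPrem:
    upfront: float        # hardware purchase and installation
    monthly_fixed: float  # power, space, and ops labor


@dataclass(frozen=True)
class Cloud:
    monthly_start: float   # bill in the first month
    monthly_growth: float  # fractional growth per month, e.g. 0.03 for 3%


def on_prem_tco(model: OnPrem, months: int) -> float:
    """Cumulative on-prem cost after the given number of months."""
    return model.upfront + model.monthly_fixed * months


def cloud_tco(model: Cloud, months: int) -> float:
    """Cumulative cloud cost, with the bill compounding month over month."""
    total, bill = 0.0, model.monthly_start
    for _ in range(months):
        total += bill
        bill *= 1 + model.monthly_growth
    return total


def break_even_month(on_prem: OnPrem, cloud: Cloud, horizon: int = 120) -> Optional[int]:
    """First month in which cumulative cloud cost exceeds on-prem, or None."""
    for month in range(1, horizon + 1):
        if cloud_tco(cloud, month) > on_prem_tco(on_prem, month):
            return month
    return None


if __name__ == "__main__":
    # Illustrative numbers only, not benchmarks.
    small = break_even_month(OnPrem(200_000, 8_000), Cloud(2_000, 0.0), horizon=60)
    heavy = break_even_month(OnPrem(200_000, 8_000), Cloud(10_000, 0.05), horizon=60)
    print("small service break-even month:", small)
    print("heavy-growth service break-even month:", heavy)
```

If the function returns `None` within the horizon you care about, cloud stays cheaper for that whole period, which is the small / mid row of the table.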
Selection criteria
Selection lands on three axes: cost, business characteristics, security requirements.
| Case | Recommended |
|---|---|
| No constraints, want to minimize cost and ops | Public cloud only |
| Strict requirements on where data is stored | Hybrid cloud |
| Mandatory Google / Microsoft integration | Vendor-leaning or multi-cloud |
| Large existing on-prem assets | Hybrid (phased migration) |
| Finance / healthcare / government | Private cloud / on-prem |
For new startups and individuals: “don’t think, public cloud.” Picking on-prem without a special reason is anachronistic and even hurts hiring. As of 2026, recruiting young engineers to “on-prem-centric” companies gets harder every year.
Recommended configurations by scale
Even within “public cloud,” the right configuration shifts with org size. Pair monthly cost guidance with the operational capacity required:
| Phase | Monthly infra (est) | Recommended | Dedicated infra people |
|---|---|---|---|
| MVP / individual | up to $300 | Single public cloud, 1 region, managed-first | 0 (split duty) |
| Early startup | $300-3k | Single cloud, multi-AZ, IaC mandatory | 0.5 (split duty) |
| Mid-sized SaaS | $3k-30k | Single cloud, multi-AZ + DR in 2 regions | 1-3 |
| Enterprise core | $30k+ | Hybrid (legacy + cloud), dedicated lines, AWS Organizations | 5+ |
| Finance / public / healthcare | Industry-dependent | Private cloud or hybrid + compliance certifications | 10+ |
The practical floor for multi-cloud / hybrid is 3+ dedicated infra people. Below that, just tracking updates, unifying monitoring, and managing permissions overwhelms the team. Picking multi-cloud “for the future” is a naive choice; adopting it when the need actually arrives is not too late.
Until single-cloud causes pain at your scale, “the courage to lean on one” saves operations.
Hybrid / multi-cloud traps
With hybrid / multi-cloud in particular, ops cost roughly doubles the moment you adopt it, so simply avoiding the forbidden moves below prevents most of the serious damage.
| Forbidden move | Why |
|---|---|
| Adopting multi-cloud “just in case” at the start | Hiring specialists for both, unified monitoring, SSO unification — cost and complexity grow exponentially, not linearly |
| Same IP range on on-prem and cloud | Routing collision over VPN / dedicated lines; CIDR design must be unified up front |
| Runbook for one side only in hybrid | On-prem has runbooks but cloud is by hand (or vice versa) — recipe for breakdown |
| Duplicated auth between cloud and on-prem | Password / permission drift, audit failure; unify via AD / IdP |
| Hybrid construction without estimating egress | Monthly cloud-to-on-prem data movement bills can hit thousands of dollars |
| Multi-cloud + DIY abstraction layer | “Cloud-neutral” home-grown wrappers hit a maintenance ceiling in 6 months. Beyond what Kubernetes already abstracts, DIY layers are a bad fit |
| “Cloud is cheaper” as the sole reason for full cloud migration | TCO varies with scale, and on-prem can be cheaper. Compare multi-year TCO (3 years at minimum) |
| Not checking data-residency regions | GDPR and similar can make overseas-region data storage a regulatory violation |
Cost comparison should use multi-year TCO, 3 years at minimum. Dropbox’s “AWS -> own DC” story stands up only with multi-year TCO; judging from a few months of bills produces wrong conclusions every time.
Don’t pick hybrid / multi-cloud below 3 dedicated people. Operations don’t function.
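The “same IP range on on-prem and cloud” trap from the table above is cheap to check before any VPN or dedicated line is ordered. Here is a minimal sketch using Python’s standard `ipaddress` module; the network names and CIDR blocks are made up for illustration.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlaps(ranges: dict) -> list:
    """Return every pair of named CIDR blocks whose address ranges overlap."""
    # ip_network raises ValueError on malformed input or set host bits,
    # so typos surface here rather than in production routing.
    networks = {name: ip_network(cidr) for name, cidr in ranges.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    # Hypothetical address plan: the VPC was carved out of the DC's range by mistake.
    plan = {
        "on_prem_dc": "10.0.0.0/16",
        "aws_vpc": "10.0.128.0/20",
        "dr_vpc": "172.16.0.0/16",
    }
    for a, b in find_overlaps(plan):
        print(f"CIDR collision: {a} overlaps {b}")
```

Running a check like this in CI against the address plan is one way to make “CIDR design must be unified up front” enforceable rather than a convention.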
AI decision axes
With AI-driven development as the assumption, the dominant axis collapses to one point: “can the whole environment be expressed in IaC?”
| AI-era favorable | AI-era unfavorable |
|---|---|
| Public cloud + full IaC | On-prem, manual builds |
| Lean on a single cloud (AI’s context concentrates) | Multi-cloud (config dispersed, AI hallucinates) |
| Containers + Kubernetes manifests | GUI-required PaaS consoles |
| Hybrid with explicit IaC boundaries | On-prem left as an “untouchable black box” |
The new design principle is not “AI can write it, so complexity is OK” — it’s “keep architecture inside what AI can write.”
- Identify regulation / industry-mandated exclusions (finance, healthcare, public).
- Estimate the weight of existing on-prem assets (if undroppable, phased hybrid).
- Cut multi-cloud by ops headcount (fewer than 3 dedicated infra engineers -> single cloud).
- Use IaC-ability as the axis for judging future debt.
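The checklist above reads naturally as an ordered decision procedure. The following is a minimal sketch of that ordering, not a definitive rule set: the `Project` fields are hypothetical, the multi-cloud headcount floor defaults to the 3 dedicated engineers stated elsewhere in this article, and a real decision would also involve legal and executive input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    regulated_industry: bool         # finance, healthcare, public sector
    undroppable_on_prem: bool        # existing assets that cannot be retired
    dedicated_infra_engineers: int
    needs_second_cloud: bool = False  # a concrete, stated reason for another vendor


def recommend(project: Project, multicloud_floor: int = 3) -> str:
    """Walk the checks in order and return the first model that applies."""
    # 1. Regulation and industry mandates override everything else.
    if project.regulated_industry:
        return "private cloud or on-prem"
    # 2. Existing assets that cannot be dropped force a phased hybrid.
    if project.undroppable_on_prem:
        return "hybrid (phased migration)"
    # 3. Multi-cloud needs both a clear reason and enough people to run it.
    if project.needs_second_cloud and project.dedicated_infra_engineers >= multicloud_floor:
        return "multi-cloud"
    # 4. Otherwise default to the option that is easiest to express fully in IaC.
    return "single public cloud with full IaC"


if __name__ == "__main__":
    startup = Project(regulated_industry=False, undroppable_on_prem=False,
                      dedicated_infra_engineers=0)
    print(recommend(startup))
```

Putting regulation first matters: a later check can never override a mandate, which matches the warning about discovering residency rules six months in.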
The weekend bill incident (industry case)
A common case at companies new to cloud: someone spins up a verification environment Friday night and forgets to stop it over the weekend, and the month-end bill comes in at 10x the expected amount. GPU instances, large RDS instances, and NAT Gateways that were started “just to try briefly” and never stopped are the classic offenders.
Cloud’s “only what you use” banner trips you up because there are kinds of resources billed even when not used. NAT Gateways, ELBs, and unattached Elastic IPs are the canonical ones; the person who spun them up doesn’t notice while a few dollars/day melt.
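The fix is simple: set budget alerts (AWS Budgets, Google Cloud Billing budgets and alerts) that notify on monthly threshold crossings. These alert rather than cap spending by default, so someone still has to act on the notification. Cloud operations without them are an accident waiting to happen.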
Cloud’s “pay-as-you-go” has parts billed even when idle. Cost alerts are mandatory.
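To see how quickly a forgotten environment adds up, and what a threshold alert would have reported, consider the following sketch. The hourly rates are hypothetical placeholders, not real price-list values, and the threshold check only mimics the idea behind a budget alert; it does not call any cloud API.

```python
from datetime import datetime

# Hypothetical hourly rates in USD for illustration only.
# Real prices vary by provider, region, and instance size.
HOURLY_RATE_USD = {
    "gpu_instance": 3.00,
    "large_rds": 1.00,
    "nat_gateway": 0.05,
}


def idle_cost(resources: list, started: datetime, stopped: datetime) -> float:
    """Cost of leaving the given resources running between two timestamps."""
    hours = (stopped - started).total_seconds() / 3600
    if hours < 0:
        raise ValueError("stopped must not be earlier than started")
    return round(sum(HOURLY_RATE_USD[r] for r in resources) * hours, 2)


def crossed_thresholds(spend: float, budget: float,
                       thresholds: tuple = (0.5, 0.8, 1.0)) -> list:
    """Return the budget fractions that the current spend has reached."""
    return [t for t in thresholds if spend >= budget * t]


if __name__ == "__main__":
    friday_night = datetime(2026, 1, 9, 20, 0)   # a Friday
    monday_morning = datetime(2026, 1, 12, 9, 0)  # the following Monday
    cost = idle_cost(["gpu_instance", "large_rds", "nat_gateway"],
                     friday_night, monday_morning)
    print("weekend idle cost (USD):", cost)
    print("thresholds crossed on a 300 USD budget:", crossed_thresholds(cost, 300))
```

Even with these modest placeholder rates, one 61-hour weekend consumes most of a small monthly budget, which is why the alert needs to fire on partial thresholds and not only at 100%.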
What you must decide — what’s your project’s answer?
Articulate your project’s answer in 1-2 sentences for each:
- Base deployment model (Cloud / On-prem / Hybrid)
- Cloud vendor (AWS / GCP / Azure / domestic)
- Isolation level (shared / logical / physical)
- Data location (country, region, regulation handling)
- Existing-system integration
- Cost ceiling and monitoring
- DR / BCP (Business Continuity Plan) requirements
Recording the decision rationale
Deployment-model selection has a wide blast radius and high cost to reverse, so recording the rationale as an ADR (Architecture Decision Record) at decision time is strongly recommended. Here is a concrete example:
| Item | Content |
|---|---|
| Title | Run production on containers (ECS Fargate) |
| Status | Accepted |
| Context | Current EC2 manual provisioning takes 30+ minutes per deploy, and environment-drift incidents occur ~3 times/year. Goal: reduce ops load while improving deploy speed |
| Decision | Adopt AWS ECS Fargate and containerize all services |
| Rationale | (1) Eliminates EC2 OS management and patching, saving ~20 hours/month in ops. (2) Container images lock the environment, eliminating “works in dev but not in prod.” (3) Fargate is serverless, so autoscaling config is minimal |
| Rejected alternatives | EKS (Kubernetes): overhead too large for a 4-person team. Lambda: existing app is stateful, migration cost too high |
| Outcome | Containerization requires CI/CD pipeline setup. Dockerfile standardization and image scanning become additional tasks |
Store ADRs as Markdown in docs/adr/ in the code repo, approved through the same PR-review flow. Having “why we chose this” visible at a glance later is the greatest value of an ADR.
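As one possible way to keep ADR files uniform, a small helper can derive the file path and render the Markdown body. This is a sketch under assumptions: the zero-padded number plus slug naming scheme and the section layout are conventions chosen for illustration, not something the article prescribes, and only a subset of the table’s fields is rendered.

```python
import re
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Adr:
    number: int
    title: str
    status: str
    context: str
    decision: str


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def adr_path(adr: Adr, root: Path = Path("docs/adr")) -> Path:
    """Assumed naming scheme: docs/adr/NNNN-slug.md."""
    return root / f"{adr.number:04d}-{slugify(adr.title)}.md"


def render(adr: Adr) -> str:
    """Render the ADR as a Markdown document with one section per field."""
    return (
        f"# {adr.number}. {adr.title}\n\n"
        f"## Status\n\n{adr.status}\n\n"
        f"## Context\n\n{adr.context}\n\n"
        f"## Decision\n\n{adr.decision}\n"
    )


if __name__ == "__main__":
    adr = Adr(
        number=1,
        title="Run production on containers (ECS Fargate)",
        status="Accepted",
        context="EC2 manual provisioning takes 30+ minutes per deploy.",
        decision="Adopt AWS ECS Fargate and containerize all services.",
    )
    print(adr_path(adr).as_posix())
    print(render(adr))
```

Writing the rendered text to the returned path and opening a pull request then puts the decision through the same review flow as code.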
Summary
This article covered deployment-model selection — “where does the system live, and who runs it?”
Public cloud is the overwhelming default for new projects. Hybrid and multi-cloud should be limited to cases with clear reasons and the operating capacity. In the AI era, single-cloud-with-IaC’s advantage only widens.
The next article covers the biggest decision after picking public cloud: cloud vendor selection (AWS / Azure / GCP).
I hope you’ll read the next article as well.
📚 Series: Architecture Crash Course for the Generative-AI Era (7/89)