System Architecture

Choosing a Runtime — VM / Container / Serverless / Wasm


About this article

This article is the fourth deep dive in the “System Architecture” category of the Architecture Crash Course for the Generative-AI Era series, covering runtime selection — “where the app actually runs.”

Docker's 2013 launch shifted the field from VM-led to containers-as-default within a decade, and FaaS and Wasm are now pressing in on top. This article compares the five layers (bare metal / VM / container / serverless / WebAssembly), gives recommended configurations by scale and use, and covers the migration traps.

What is a runtime environment in the first place

A runtime environment is, roughly speaking, “the platform that actually runs the programs you write.”

Imagine a kitchen for cooking. Your home kitchen (bare metal / VM) lets you set up any equipment you want, but cleaning and maintenance are all on you. A shared kitchen (container) comes with basic equipment, and you just bring your own tools. A cloud kitchen for delivery only (FaaS) uses the kitchen only when orders come in, and you pay only for what you use. How much you manage yourself versus how much you delegate — that trade-off is what runtime environment selection is about.

Why runtime environment selection matters

Once code is written for a specific environment, migrating to another requires partial rewrites. The initial choice is light and the later reversal is heavy; that asymmetry means a casual pick leads to pain later.

How much do you manage, and how much do you let go of?

Selection touches development efficiency, operational burden, cost, performance, and scalability. Until the 2010s, VMs were mainstream; in the 2020s, containers became the default. FaaS and Wasm are spreading rapidly on top.

Reframing the question from “which one” to “how much management do I let go of” produces sharper judgment.

Runtime selection is not a two-way door: once code is written for a given runtime, moving to another involves partial rewrites. For FaaS-style event-driven designs in particular, going back to a regular server is major surgery.

The asymmetry between “easy to try” and “expensive to undo” means a casual “let’s try” stance gets punished later.

Runtime is the trade-off of “how much you manage yourself.” Let go of as much as you can — that’s the rule.

The five major options

Modern runtimes split into five layers. The further down the list, the narrower your management scope and the easier the ops, traded against freedom and performance.

```mermaid
flowchart TB
    BARE[Bare metal<br/>HW through app, all yours]
    VM[VM<br/>OS through app yours]
    CON[Container<br/>App + runtime only]
    FAAS[FaaS<br/>Function code only]
    WASM[WebAssembly<br/>Run in sandbox]
    BARE --> VM --> CON --> FAAS --> WASM
    LEFT[Heavy management<br/>High freedom<br/>High ops load] -.- BARE
    WASM -.- RIGHT[Light management<br/>Low freedom<br/>Light ops load]
    classDef heavy fill:#fee2e2,stroke:#dc2626;
    classDef mid fill:#fef3c7,stroke:#d97706;
    classDef modern fill:#dbeafe,stroke:#2563eb;
    classDef light fill:#fae8ff,stroke:#a21caf;
    class BARE heavy;
    class VM mid;
    class CON,FAAS modern;
    class WASM light;
```
| Option | Management scope | Examples |
| --- | --- | --- |
| Bare metal | Hardware through app | Direct physical-server ops |
| VM (virtual machine) | OS through app | AWS EC2, VMware |
| Container | App + runtime only | Docker, Kubernetes |
| Serverless (FaaS) | App code only | AWS Lambda, Cloud Functions |
| WebAssembly (Wasm) | In browser / Wasm runtime | Cloudflare Workers |

The principle: “pick the easiest one.” Within what satisfies functional and performance requirements, let go of as much management as you can. That’s the spine.

Housing analogy: bare metal = house with land (yard and plumbing all yours), VM = condo (building shared, unit yours), container = furnished weekly rental (move anytime), FaaS = capsule hotel (charged only when sleeping), Wasm = airport lounge (in and out instantly). Keep this in mind and the placement of each form clicks.

Bare metal

Bare metal runs the OS directly on a physical server, with the app on top. Before the cloud existed, this was the only choice, and most on-prem systems still run this way.

| Pros | Cons |
| --- | --- |
| No virtualization overhead | All hardware management is yours |
| Full hardware control | Scaling is physically constrained |
| Highest security via physical isolation | Long procurement / setup lead time |
| Can use existing physical assets | Recovery is hard |

There's essentially no reason to pick bare metal for new builds. Candidates are limited to HFT (high-frequency trading), real-time computation, scientific computing, and special GPU workloads: uses where no overhead is tolerable. For typical web apps and backends, bare metal is best avoided.

It still gets used only in cases that don’t tolerate even virtualization overhead.

VM (virtual machine)

A VM runs multiple virtual computers on one physical server. Each VM has its own OS and runs without interfering with the others. Cloud offerings such as AWS EC2 and Azure Virtual Machines are delivered as VMs.

There are two virtualization styles. The hosted style runs virtualization software (VMware Workstation, etc.) on top of a normal OS and is common in development. Production favors the hypervisor style (AWS Nitro, VMware ESXi), where the virtualization layer sits directly on the hardware for higher efficiency; EC2 runs this way.

| Pros | Cons |
| --- | --- |
| Run without minding hardware | Guest-OS layer overhead |
| Flexible resources / scaling | Build and operate from the OS up |
| Good for migrating legacy on-prem apps | Slower start than containers |
| Full OS-level isolation | Lower density than containers |

It remains a strong entry point for cloud migration. For new builds, containers are the favorite.

Container

A container packages the application and its required libraries / config into a lightweight execution unit. Where VMs virtualize at the OS level, containers share the host’s OS kernel and isolate only the application layer. Startup goes from minutes to seconds (or milliseconds).

The biggest milestone: ending the “works in dev, breaks in prod” problem. Distributed as a Docker image, the app and its runtime ship together and behave identically anywhere.

That portability dramatically shortened CI/CD cycles and was the decisive factor in the spread of DevOps. Docker started in 2013 as an internal tool at the PaaS vendor dotCloud, was open-sourced, and rewrote the industry within a decade.

| Pros | Cons |
| --- | --- |
| Fast startup (seconds) | Container learning curve |
| Environment identity guaranteed | Orchestration (k8s) is complex |
| High resource efficiency, dense packing | Kernel-level isolation weaker than VMs |
| Pairs well with CI/CD | Stateful workloads need extra design |

The 2026 de facto standard: new web apps go into Docker containers by default.

Container operating tools

To build and run containers you need containerization tools plus orchestration tools for managing fleets of containers. The field standards:

| Category | Tool | Description |
| --- | --- | --- |
| Containerization | Docker | Overwhelming share; effectively the standard |
| Orchestration | Kubernetes (k8s) | Google-originated, general-purpose orchestrator |
| Managed k8s | EKS / AKS / GKE | Each cloud's managed k8s offering |
| Light orchestration | AWS ECS / Cloud Run | Easier container ops |

Kubernetes is powerful but carries sharp operational overhead. It is overkill for small and mid scale, and adopting it without the staffing to match melts the ops team. Options that run containers more easily than k8s, such as AWS ECS, App Runner, and GCP Cloud Run, are a better fit at smaller scale.

K8s is a powerful tool with high ops cost. Match it to scale.

Serverless (FaaS)

FaaS lets you upload and run code function by function; AWS Lambda, Azure Functions, and Google Cloud Functions are the representative services. Broadly, “serverless” means any service that requires no server management, but in the field it usually means FaaS.

Developers don't need to think about hardware, OS, or containers; they just write business logic. Billing is pay-as-you-go on execution time, so cost is $0 while idle. “Run things at very low cost” is the headline.
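The pay-as-you-go model can be sketched with a few lines of arithmetic. The unit prices below are hypothetical placeholders, not any vendor's actual rates; the point is only that an idle month costs nothing and a busy month scales with execution time:

```python
def faas_monthly_cost(invocations: int, avg_duration_ms: float, memory_gb: float,
                      price_per_gb_second: float, price_per_million_requests: float) -> float:
    """Estimate a month's FaaS bill from execution time and request count."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * memory_gb
    compute_cost = gb_seconds * price_per_gb_second
    request_cost = (invocations / 1_000_000) * price_per_million_requests
    return compute_cost + request_cost

# Hypothetical unit prices for illustration only; check your provider's pricing page.
PRICE_GB_S, PRICE_PER_M = 0.0000167, 0.20

print(faas_monthly_cost(0, 100, 0.5, PRICE_GB_S, PRICE_PER_M))          # idle month -> 0.0
print(faas_monthly_cost(1_000_000, 100, 0.5, PRICE_GB_S, PRICE_PER_M))  # ~one dollar
```

Compare that with a VM or container, which bills for every hour it sits idle.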

| Pros | Cons |
| --- | --- |
| Zero infra management | Cold starts (startup latency) |
| Per-call billing, zero idle cost | Execution-time caps (e.g., Lambda 15 min) |
| Auto scaling | Stateless |
| Easy deploys | Bad fit for long-running / high-frequency workloads |

FaaS shines for event-driven, async batch, webhook handling, lightweight per-API processing. Using it for an always-on web server is the classic landmine.
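In FaaS, the unit of deployment is just a handler function. A minimal sketch in the AWS Lambda handler style, runnable locally with a fake event (the event shape and field names here are illustrative assumptions, not a fixed contract):

```python
import json

def handler(event, context):
    """Entry point invoked once per event, e.g. a webhook POST routed via an API gateway.

    There is no server process to manage: the platform starts an instance on
    demand, bills for the execution time, and scales instance count automatically.
    """
    body = json.loads(event.get("body") or "{}")
    order_id = body.get("order_id")
    if order_id is None:
        return {"statusCode": 400, "body": json.dumps({"error": "order_id required"})}
    # One short, self-contained unit of work per invocation:
    # e.g. enqueue a notification, resize an image, record the webhook.
    return {"statusCode": 200, "body": json.dumps({"received": order_id})}

# Local invocation with a fake event (context is unused in this sketch):
print(handler({"body": json.dumps({"order_id": "ord_123"})}, None))
```

Note the function does one short piece of work and exits; anything long-lived or stateful belongs elsewhere, as the limits below show.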

FaaS limits and mitigation

| Limit | Substance | Mitigation |
| --- | --- | --- |
| Cold start | First call after inactivity can take seconds | Provisioned Concurrency, scheduled warm-up invocations |
| Execution-time cap | e.g., Lambda 15 min / Azure Functions 10 min | Split work via Step Functions, etc. |
| Stateless | Memory does not persist across invocations | External stores such as DynamoDB |
| Hard to debug | Difficult to run locally | SAM / Serverless Framework |
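The stateless row is the one that changes how you write code: anything that must survive an invocation goes to an external store. A sketch of the pattern, with a plain dict standing in for DynamoDB or Redis (the store here is a hypothetical stand-in, not a real SDK client):

```python
# Module-level variables like this are NOT reliable state in FaaS: every cold
# start gets a fresh instance, and concurrent invocations get separate copies.
local_counter = 0

# Stand-in for an external store (DynamoDB, Redis, ...). In production this
# would be an SDK client; a dict keeps the sketch self-contained and runnable.
external_store: dict[str, int] = {}

def count_visit(user_id: str) -> int:
    """Increment and return a per-user counter that survives across invocations."""
    external_store[user_id] = external_store.get(user_id, 0) + 1
    return external_store[user_id]

print(count_visit("u1"))  # 1
print(count_visit("u1"))  # 2
```

The rule of thumb: treat each invocation as if it starts from a blank process, because sometimes it does.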

Cold starts especially hurt user-facing APIs. Picking FaaS for VIP-user APIs or low-latency requirements is a bad fit; Wasm or containers are the favorites there.

FaaS shines on “light, short, event-driven.” Identifying the wrong fits matters.

WebAssembly (Wasm)

Wasm was originally created to run high-performance workloads (3D, video editing, physics) in the browser. High-level languages (C++, Rust, Go) compile to .wasm binaries, which the browser or a standalone Wasm runtime executes, often far faster than equivalent JavaScript.

What has drawn attention recently is Wasm as a server-side runtime, adopted in edge-compute environments such as Cloudflare Workers and Fastly Compute@Edge. Far faster cold starts than FaaS, safe isolation, and multi-language support position it as FaaS's next generation.

| Pros | Cons |
| --- | --- |
| Cold starts in milliseconds | Ecosystem still maturing |
| High safety via sandboxing | Limited language / library support |
| One binary format for many languages | Debugging tools immature |
| Well suited to the edge | Restricted file I/O and OS features |

The “frontrunner for edge execution.” Positioned as a next-generation runtime that overcomes FaaS’s limits.

FaaS vs Edge Runtime vs BaaS — similar but not the same

They are conflated under the “no server management” banner, but where they run and what they cover differ.

FaaS: per-function, on-demand execution in a specific region. Edge Runtime: a lightweight runtime that starts in milliseconds at edge locations (CDN PoPs) worldwide. BaaS (Backend as a Service): bundles auth, DB, functions, and real-time features, the backend as a whole. They sit at different layers and are not mutually exclusive; they can be combined.

For individual development and MVPs, BaaS + Edge Runtime has cemented itself over the last 2-3 years as the configuration with near-zero in-house backend. For streaming LLM responses, an Edge Runtime vastly outperforms regular FaaS, which is often the deciding factor.

| Aspect | FaaS | Edge Runtime | BaaS |
| --- | --- | --- | --- |
| Examples | AWS Lambda, Cloud Functions | Cloudflare Workers, Vercel Edge | Supabase, Firebase, Convex |
| Cold start | 100s of ms to seconds | Near-zero (ms) | Built-in functions are Edge-equivalent |
| Execution time | 15-min cap, etc. | ~30 s cap, streaming handled separately | Service-dependent |
| Strengths | Async batch, webhooks | Streaming, lightweight APIs | Auth + DB + functions bundled |
| Lock-in | Mid-high | Low-mid | High (DB and all, hard to move) |

BaaS pushes “write no backend” to its limit. It has the highest cost-effectiveness for personal projects, with a heavier exit cost as you scale.

Five-method overview

Lining up the five methods on key axes — moving left to right, management load drops:

| Aspect | Bare metal | VM | Container | FaaS | Wasm |
| --- | --- | --- | --- | --- | --- |
| Management scope | All | OS + app | App + runtime only | Code only | Code only |
| Startup | Slow (minutes) | Slow (minutes) | Fast (seconds) | Fairly fast | Fastest (ms) |
| Cost | Fixed, high | Fixed | Variable | Pay-as-you-go | Pay-as-you-go |
| Scaling | Hard | Mostly manual | Auto | Auto | Auto |
| Vendor lock-in | Low | Mid | Mid | High | Low |
| Suited case | HFT, scientific computing | First step off on-prem | General new web apps | Batch, webhook processing | Edge low-latency APIs |

On vendor lock-in: containers and Wasm are portable and comparatively easy to move, while FaaS is tightly coupled to cloud-specific features and hard to migrate. Picking FaaS means accepting that lock-in.

Selection by case

| Case | Recommended |
| --- | --- |
| New SaaS-style web app | Container (+ ECS / Cloud Run) |
| Migrate on-prem web app with minimal change | VM (EC2) |
| Migrate on-prem web app while modernizing ops | Container |
| Event-driven batch / webhooks | FaaS (Lambda) |
| Ultra-low-latency edge processing | Wasm (Cloudflare Workers, etc.) |
| Special ultra-fast workloads | Bare metal |

For new web apps, container as the default, supplemented with FaaS for spot work (image transformation, notifications, webhook receipt), is the most common configuration in the field.

“Container is the default” is correct, but the right granularity shifts with scale and use. Team-size affinity:

| Team size | App nature | Recommended runtime | Orchestration |
| --- | --- | --- | --- |
| Up to 3 | Web / API | Cloud Run / App Runner / Vercel | None (managed) |
| 3-10 | Web + batch | Container (ECS Fargate) + Lambda (batch) | ECS (k8s is too early) |
| 10-30 | Microservices | Container + EKS / GKE | k8s (dedicated ops mandatory) |
| 30+ | Multi-service, multi-region | k8s + service mesh | k8s + Istio / Linkerd |
| Special | Edge low-latency | Wasm (Cloudflare Workers, etc.) | - |
| Special | HFT / scientific | Bare metal | - |

The practical floor for k8s adoption is roughly a 10-person team with one dedicated ops engineer. Below that, the learning cost of CRDs, Helm, Ingress, certificate renewal, and network policies exhausts everyone.

ECS Fargate / Cloud Run is roughly “90% of k8s functionality at 10% of the learning cost” — the default for small/mid scale.

K8s is a weapon for orgs that fit it. Reaching for it too early melts the ops team.

Runtime-migration traps

Migrate one layer at a time: VM to container, container to FaaS or Wasm. Skipping layers usually breaks things.

| Forbidden move | Why |
| --- | --- |
| Moving a VM-hosted app straight to FaaS | Stateful behavior, 15-min+ processing, and sessions don't fit FaaS |
| Dockerizing while still writing logs to files | Logs are lost on container restart; send them to stdout/stderr |
| Adopting k8s without dedicated infra staff | YAML, RBAC, Helm, CRDs, Ingress, CNI, certificates are all in scope; without dedicated ops, incidents pile up |
| Running an always-on web API on FaaS | Cold starts of hundreds of ms to seconds hurt UX; Edge Runtime or a container is the answer |
| Wasm for workloads with heavy file access or native libraries | Support is limited; compatibility shims break |
| Container base images pinned to the latest tag | Reproducibility is lost; one day the build stops working |
| No retry-strategy design for FaaS | Automatic retries run the same payment / email twice |
| Designing 15-min+ long-running work on FaaS | Hits the timeout ceiling (Lambda 15 min) and gets cut mid-run; long tasks go to Step Functions or a queue + worker |
| Containerizing with environment dependencies hard-coded | Host-specific paths, ports, and env vars baked in break immediately in another environment |
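The retry row is the most subtle trap: FaaS platforms may deliver the same event more than once, so side effects need an idempotency guard. A sketch of the pattern (the in-memory set and list below are stand-ins; in production the processed-ID record would live in an external store with a TTL, ideally written with a conditional/atomic operation):

```python
processed_event_ids: set[str] = set()  # in production: external store with a TTL
charges: list[str] = []                # stand-in for the real side effect

def handle_payment_event(event: dict) -> str:
    """Charge at most once per event, even if the platform retries delivery."""
    event_id = event["id"]
    if event_id in processed_event_ids:
        return "duplicate: skipped"        # retry of an already-handled event
    processed_event_ids.add(event_id)
    charges.append(event["order_id"])      # the side effect we must not repeat
    return "charged"

# The platform delivers the same event twice; the charge happens once.
print(handle_payment_event({"id": "evt_1", "order_id": "ord_9"}))  # charged
print(handle_payment_event({"id": "evt_1", "order_id": "ord_9"}))  # duplicate: skipped
```

Note the check-then-set here is not atomic across concurrent instances; real stores offer conditional writes for exactly this reason.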

Migrate in three stages: trial in one feature, then monitored production use, then full adoption. A big-bang full switch is a forbidden move.

Migrate gradually. Full switch is the shortest path to a production outage.

AI decision axes

With AI-driven development as the assumption, the selection axis pivots from “human learning cost” to “performance, startup speed, and lock-in.”

| AI-era favorable | AI-era unfavorable |
| --- | --- |
| Container (portable, IaC-complete) | Bare-metal manual ops |
| Wasm (fast start, low lock-in) | Vendor-specific runtimes |
| FaaS with OIDC + IaC management | Click-through GUI runtime config |
| Kubernetes (AI writes the YAML) | Custom in-house ops automation |
  1. Filter exclusion candidates by physical / performance constraints (latency, GPU, cold start).
  2. Decide k8s / ECS / Cloud Run granularity by ops-staff thickness.
  3. Adjust FaaS density by portability (future cloud migration possibility).
  4. Don’t fixate on one method; design with use-by-use selection.
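The four steps above can be sketched as a rule-based selector. The thresholds and return labels are this sketch's own assumptions, chosen to mirror the article's tables, not fixed rules:

```python
def pick_runtime(needs_bare_metal: bool, edge_latency_critical: bool,
                 event_driven_only: bool, dedicated_ops_engineers: int) -> str:
    """Toy decision flow: physical constraints first, then team size."""
    # Step 1: physical / performance constraints filter candidates first.
    if needs_bare_metal:
        return "bare metal"
    if edge_latency_critical:
        return "Wasm (edge runtime)"
    # Steps 3-4: purely event-driven workloads can live entirely on FaaS,
    # accepting its lock-in; mixed workloads keep a container spine.
    if event_driven_only:
        return "FaaS"
    # Step 2: orchestration granularity follows ops staffing.
    if dedicated_ops_engineers >= 1:
        return "container + k8s"
    return "container + managed (ECS / Cloud Run)"

print(pick_runtime(False, False, False, 0))  # container + managed (ECS / Cloud Run)
```

In practice step 4 means running this per workload rather than once per company: one system can legitimately land on container for the API and FaaS for the batch jobs.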

Industry case: adopting k8s at small scale

Around 2018-2020, many reports surfaced of single-digit-size teams adopting Kubernetes in production and getting buried weekly in CRDs (Custom Resource Definitions, k8s's extension mechanism) and Helm charts (k8s manifest packaging). Monitoring, logging, Ingress, certificate renewal, and network policies all demanded specialist knowledge to operate correctly, which was obvious overkill for teams of fewer than three.

Teams of the same era using AWS ECS or GCP Cloud Run got the same capability at a fifth of the operational load. There are many stories of teams that adopted k8s and switched back to ECS within six months.

K8s is the standard for large orgs, but picking it without the scale to match produces suffering. It is the canonical over-reach landmine.

K8s is a weapon for orgs at scale. At small scale, ECS / Cloud Run is enough.

What you must decide — what’s your project’s answer?

Articulate your project’s answer in 1-2 sentences for each:

  • Primary runtime (VM / container / FaaS / Wasm)
  • Orchestration (k8s / ECS / managed)
  • Scaling strategy (auto / manual / scheduled)
  • Tolerance for cold start
  • Acceptable lock-in
  • Existing-asset (VM / on-prem) policy
  • Runtime language support state

Summary

This article covered runtime selection in the cloud era — VM, container, serverless, Wasm.

The 2026 default: spine on containers, supplement minimally with use-specialized choices. K8s is a weapon for orgs that match its scale; ECS / Cloud Run is enough for small/mid teams.

The next article covers OS selection — Linux / Windows / UNIX.

Back to series TOC -> ‘Architecture Crash Course for the Generative-AI Era’: How to Read This Book

I hope you’ll read the next article as well.

📚 Series: Architecture Crash Course for the Generative-AI Era (9/89)