System Architecture

Choosing a Runtime — VM / Container / Serverless / Wasm


About this article

This article is the fourth deep dive in the “System Architecture” category of the Architecture Crash Course for the Generative-AI Era series, covering runtime selection — “where the app actually runs.”

Docker's 2013 launch shifted the field from VM-led to containers-as-default within a decade, and FaaS and Wasm are now pressing in on top. This article compares the five layers (bare metal / VM / container / serverless / WebAssembly), gives recommended configurations by scale and use, and covers the migration traps.

What is a runtime environment in the first place

A runtime environment is, roughly speaking, “the platform that actually runs the programs you write.”

Imagine a kitchen for cooking. Your home kitchen (bare metal / VM) lets you set up any equipment you want, but cleaning and maintenance are all on you. A shared kitchen (container) comes with basic equipment, and you just bring your own tools. A cloud kitchen for delivery only (FaaS) uses the kitchen only when orders come in, and you pay only for what you use. How much you manage yourself versus how much you delegate — that trade-off is what runtime environment selection is about.

Why runtime environment selection matters

Once code is written for a specific environment, migrating to another requires partial rewrites. The initial choice is light and the later reversal is heavy; that asymmetry means a casual pick leads to pain later.

How much do you manage, and how much do you let go of?

Selection touches development efficiency, operational burden, cost, performance, and scalability. Until the 2010s, VMs were mainstream; in the 2020s, containers became the default. FaaS and Wasm are spreading rapidly on top.

Reframing the question from “which one” to “how much management do I let go of” produces sharper judgment.

Runtime selection is not a two-way door: once code is written for a given runtime, moving to another involves partial rewrites. For FaaS-style event-driven designs in particular, going back to a regular server is major surgery.

The asymmetry between “easy to try” and “expensive to undo” means a casual “let’s try” stance gets punished later.

Runtime is the trade-off of “how much you manage yourself.” Let go of as much as you can — that’s the rule.

The five major options

Modern runtimes split into five layers. The further down the list, the narrower your management scope and the easier the ops, traded against freedom and performance.

```mermaid
flowchart TB
    BARE[Bare metal<br/>HW through app, all yours]
    VM[VM<br/>OS through app yours]
    CON[Container<br/>App + runtime only]
    FAAS[FaaS<br/>Function code only]
    WASM[WebAssembly<br/>Run in sandbox]
    BARE --> VM --> CON --> FAAS --> WASM
    LEFT[Heavy management<br/>High freedom<br/>High ops load] -.- BARE
    WASM -.- RIGHT[Light management<br/>Low freedom<br/>Light ops load]
    classDef heavy fill:#fee2e2,stroke:#dc2626;
    classDef mid fill:#fef3c7,stroke:#d97706;
    classDef modern fill:#dbeafe,stroke:#2563eb;
    classDef light fill:#fae8ff,stroke:#a21caf;
    class BARE heavy;
    class VM mid;
    class CON,FAAS modern;
    class WASM light;
```
| Option | Management scope | Examples |
| --- | --- | --- |
| Bare metal | Hardware through app | Direct physical-server ops |
| VM (virtual machine) | OS through app | AWS EC2, VMware |
| Container | App + runtime only | Docker, Kubernetes |
| Serverless (FaaS) | App code only | AWS Lambda, Cloud Functions |
| WebAssembly (Wasm) | In browser / Wasm runtime | Cloudflare Workers |

The principle: “pick the easiest one.” Within what satisfies functional and performance requirements, let go of as much management as you can. That’s the spine.

Housing analogy: bare metal = house with land (yard and plumbing all yours), VM = condo (building shared, unit yours), container = furnished weekly rental (move anytime), FaaS = capsule hotel (charged only when sleeping), Wasm = airport lounge (in and out instantly). Keep this in mind and the placement of each form clicks.

Bare metal

Bare metal runs the OS directly on a physical server, with the app on top. Before the cloud existed, this was the only choice, and most on-prem systems still run this way.

| Pros | Cons |
| --- | --- |
| No virtualization overhead | All hardware management is yours |
| Full hardware control | Scaling is physically constrained |
| Highest security via physical isolation | Long procurement / setup lead time |
| Can use existing physical assets | Recovery is hard |

There's essentially no reason to pick bare metal for new builds. Candidates are limited to HFT (high-frequency trading), real-time computation, scientific computing, and special GPU workloads: uses where no overhead is tolerable. For typical web apps and backends, bare metal is best avoided.

It still gets used only in cases that don’t tolerate even virtualization overhead.

VM (virtual machine)

A VM runs multiple virtual computers on one physical server. Each VM has its own OS and runs without interfering with the others. Cloud offerings such as AWS EC2 and Azure Virtual Machines are delivered as VMs.

There are two virtualization styles. The hosted style runs virtualization software (VMware Workstation, etc.) on top of a normal OS and is common in development. Production favors the hypervisor style (AWS Nitro, VMware ESXi), where the virtualization layer sits directly on the hardware for higher efficiency; EC2 runs this way.

| Pros | Cons |
| --- | --- |
| Run without minding hardware | Guest-OS layer overhead |
| Flexible resources / scaling | Build and operate from the OS up |
| Good for migrating legacy on-prem apps | Slower start than containers |
| Full OS-level isolation | Lower density than containers |

It remains a strong entry point for cloud migration. For new builds, containers are the favorite.

Container

A container packages the application and its required libraries / config into a lightweight execution unit. Where VMs virtualize at the OS level, containers share the host’s OS kernel and isolate only the application layer. Startup goes from minutes to seconds (or milliseconds).

The biggest milestone: ending the “works in dev, breaks in prod” problem. Distributed as a Docker image, the app and its runtime ship together and behave identically anywhere.

That portability dramatically shortened CI/CD cycles and was the decisive factor in the spread of DevOps. Docker started in 2013 as an internal tool at the PaaS vendor dotCloud, was open-sourced, and rewrote the industry within a decade.

| Pros | Cons |
| --- | --- |
| Fast startup (seconds) | Container learning curve |
| Environment identity guaranteed | Orchestration (k8s) is complex |
| High resource efficiency, dense packing | Kernel-level isolation weaker than VMs |
| Pairs well with CI/CD | Stateful workloads need extra design |

The 2026 de facto standard: new web apps go into Docker containers by default.

Container operating tools

To build and run containers you need containerization tools plus orchestration tools for managing fleets of containers. The field standards:

| Category | Tool | Description |
| --- | --- | --- |
| Containerization | Docker | Overwhelming share; effectively the standard |
| Orchestration | Kubernetes (k8s) | Google-originated, general-purpose orchestrator |
| Managed k8s | EKS / AKS / GKE | Each cloud's managed k8s offering |
| Light orchestration | AWS ECS / Cloud Run | Easier container ops |

Kubernetes is powerful but carries sharp operational overhead. It is overkill for small and mid scale, and adopting it without the staffing to match melts the ops team. Options that run containers more easily than k8s, such as AWS ECS, App Runner, and GCP Cloud Run, are a better fit at smaller scale.

K8s is a powerful tool with high ops cost. Match it to scale.

Serverless (FaaS)

FaaS lets you upload and run code function by function; AWS Lambda, Azure Functions, and Google Cloud Functions are the representative services. Broadly, “serverless” means any service that requires no server management, but in the field it usually means FaaS.

Developers don't need to think about hardware, OS, or containers; they just write business logic. Billing is pay-as-you-go on execution time, so cost is $0 while idle. “Run things at very low cost” is the headline.
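The pay-as-you-go model can be sketched with a few lines of arithmetic. The unit prices below are hypothetical placeholders, not any vendor's actual rates; the point is only that an idle month costs nothing and a busy month scales with execution time:

```python
def faas_monthly_cost(invocations: int, avg_duration_ms: float, memory_gb: float,
                      price_per_gb_second: float, price_per_million_requests: float) -> float:
    """Estimate a month's FaaS bill from execution time and request count."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * memory_gb
    compute_cost = gb_seconds * price_per_gb_second
    request_cost = (invocations / 1_000_000) * price_per_million_requests
    return compute_cost + request_cost

# Hypothetical unit prices for illustration only; check your provider's pricing page.
PRICE_GB_S, PRICE_PER_M = 0.0000167, 0.20

print(faas_monthly_cost(0, 100, 0.5, PRICE_GB_S, PRICE_PER_M))          # idle month -> 0.0
print(faas_monthly_cost(1_000_000, 100, 0.5, PRICE_GB_S, PRICE_PER_M))  # ~one dollar
```

Compare that with a VM or container, which bills for every hour it sits idle.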

| Pros | Cons |
| --- | --- |
| Zero infra management | Cold starts (startup latency) |
| Per-call billing, zero idle cost | Execution-time caps (e.g., Lambda 15 min) |
| Auto scaling | Stateless |
| Easy deploys | Bad fit for long-running / high-frequency workloads |

FaaS shines for event-driven, async batch, webhook handling, lightweight per-API processing. Using it for an always-on web server is the classic landmine.
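In FaaS, the unit of deployment is just a handler function. A minimal sketch in the AWS Lambda handler style, runnable locally with a fake event (the event shape and field names here are illustrative assumptions, not a fixed contract):

```python
import json

def handler(event, context):
    """Entry point invoked once per event, e.g. a webhook POST routed via an API gateway.

    There is no server process to manage: the platform starts an instance on
    demand, bills for the execution time, and scales instance count automatically.
    """
    body = json.loads(event.get("body") or "{}")
    order_id = body.get("order_id")
    if order_id is None:
        return {"statusCode": 400, "body": json.dumps({"error": "order_id required"})}
    # One short, self-contained unit of work per invocation:
    # e.g. enqueue a notification, resize an image, record the webhook.
    return {"statusCode": 200, "body": json.dumps({"received": order_id})}

# Local invocation with a fake event (context is unused in this sketch):
print(handler({"body": json.dumps({"order_id": "ord_123"})}, None))
```

Note the function does one short piece of work and exits; anything long-lived or stateful belongs elsewhere, as the limits below show.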

FaaS limits and mitigation

| Limit | Substance | Mitigation |
| --- | --- | --- |
| Cold start | First call after inactivity can take seconds | Provisioned Concurrency, scheduled warm-up invocations |
| Execution-time cap | e.g., Lambda 15 min / Azure Functions 10 min | Split work via Step Functions, etc. |
| Stateless | Memory does not persist across invocations | External stores such as DynamoDB |
| Hard to debug | Difficult to run locally | SAM / Serverless Framework |
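The stateless row is the one that changes how you write code: anything that must survive an invocation goes to an external store. A sketch of the pattern, with a plain dict standing in for DynamoDB or Redis (the store here is a hypothetical stand-in, not a real SDK client):

```python
# Module-level variables like this are NOT reliable state in FaaS: every cold
# start gets a fresh instance, and concurrent invocations get separate copies.
local_counter = 0

# Stand-in for an external store (DynamoDB, Redis, ...). In production this
# would be an SDK client; a dict keeps the sketch self-contained and runnable.
external_store: dict[str, int] = {}

def count_visit(user_id: str) -> int:
    """Increment and return a per-user counter that survives across invocations."""
    external_store[user_id] = external_store.get(user_id, 0) + 1
    return external_store[user_id]

print(count_visit("u1"))  # 1
print(count_visit("u1"))  # 2
```

The rule of thumb: treat each invocation as if it starts from a blank process, because sometimes it does.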

Cold starts especially hurt user-facing APIs. Picking FaaS for VIP-user APIs or low-latency requirements is a bad fit; Wasm or containers are the favorites there.

FaaS shines on “light, short, event-driven.” Identifying the wrong fits matters.

WebAssembly (Wasm)

Wasm was originally created to run high-performance workloads (3D, video editing, physics) in the browser. High-level languages (C++, Rust, Go) compile to .wasm binaries, which the browser or a standalone Wasm runtime executes, often far faster than equivalent JavaScript.

What has drawn attention recently is Wasm as a server-side runtime, adopted in edge-compute environments such as Cloudflare Workers and Fastly Compute@Edge. Far faster cold starts than FaaS, safe isolation, and multi-language support position it as FaaS's next generation.

| Pros | Cons |
| --- | --- |
| Cold starts in milliseconds | Ecosystem still maturing |
| High safety via sandboxing | Limited language / library support |
| One binary format for many languages | Debugging tools immature |
| Well suited to the edge | Restricted file I/O and OS features |

The “frontrunner for edge execution.” Positioned as a next-generation runtime that overcomes FaaS’s limits.

FaaS vs Edge Runtime vs BaaS — similar but not the same

They are conflated under the “no server management” banner, but where they run and what they cover differ.

FaaS: per-function, on-demand execution in a specific region. Edge Runtime: a lightweight runtime that starts in milliseconds at edge locations (CDN PoPs) worldwide. BaaS (Backend as a Service): bundles auth, DB, functions, and real-time features, the backend as a whole. They sit at different layers and are not mutually exclusive; they can be combined.

For individual development and MVPs, BaaS + Edge Runtime has cemented itself over the last 2-3 years as the configuration with near-zero in-house backend. For streaming LLM responses, an Edge Runtime vastly outperforms regular FaaS, which is often the deciding factor.

| Aspect | FaaS | Edge Runtime | BaaS |
| --- | --- | --- | --- |
| Examples | AWS Lambda, Cloud Functions | Cloudflare Workers, Vercel Edge | Supabase, Firebase, Convex |
| Cold start | 100s of ms to seconds | Near-zero (ms) | Built-in functions are Edge-equivalent |
| Execution time | 15-min cap, etc. | ~30 s cap, streaming handled separately | Service-dependent |
| Strengths | Async batch, webhooks | Streaming, lightweight APIs | Auth + DB + functions bundled |
| Lock-in | Mid-high | Low-mid | High (DB and all, hard to move) |

BaaS pushes “write no backend” to its limit. It has the highest cost-effectiveness for personal projects, with a heavier exit cost as you scale.

Five-method overview

Lining up the five methods on key axes — moving left to right, management load drops:

| Aspect | Bare metal | VM | Container | FaaS | Wasm |
| --- | --- | --- | --- | --- | --- |
| Management scope | All | OS + app | App + runtime only | Code only | Code only |
| Startup | Slow (minutes) | Slow (minutes) | Fast (seconds) | Fairly fast | Fastest (ms) |
| Cost | Fixed, high | Fixed | Variable | Pay-as-you-go | Pay-as-you-go |
| Scaling | Hard | Mostly manual | Auto | Auto | Auto |
| Vendor lock-in | Low | Mid | Mid | High | Low |
| Suited case | HFT, scientific computing | First step off on-prem | General new web apps | Batch, webhook processing | Edge low-latency APIs |

On vendor lock-in: containers and Wasm are portable and comparatively easy to move, while FaaS is tightly coupled to cloud-specific features and hard to migrate. Picking FaaS means accepting that lock-in.

Selection by case

| Case | Recommended |
| --- | --- |
| New SaaS-style web app | Container (+ ECS / Cloud Run) |
| Migrate on-prem web app with minimal change | VM (EC2) |
| Migrate on-prem web app while modernizing ops | Container |
| Event-driven batch / webhooks | FaaS (Lambda) |
| Ultra-low-latency edge processing | Wasm (Cloudflare Workers, etc.) |
| Special ultra-fast workloads | Bare metal |

For new web apps, container as the default, supplemented with FaaS for spot work (image transformation, notifications, webhook receipt), is the most common configuration in the field.

“Container is the default” is correct, but the right granularity shifts with scale and use. Team-size affinity:

| Team size | App nature | Recommended runtime | Orchestration |
| --- | --- | --- | --- |
| Up to 3 | Web / API | Cloud Run / App Runner / Vercel | None (managed) |
| 3-10 | Web + batch | Container (ECS Fargate) + Lambda (batch) | ECS (k8s is too early) |
| 10-30 | Microservices | Container + EKS / GKE | k8s (dedicated ops mandatory) |
| 30+ | Multi-service, multi-region | k8s + service mesh | k8s + Istio / Linkerd |
| Special | Edge low-latency | Wasm (Cloudflare Workers, etc.) | - |
| Special | HFT / scientific | Bare metal | - |

The practical floor for k8s adoption is roughly a 10-person team with one dedicated ops engineer. Below that, the learning cost of CRDs, Helm, Ingress, certificate renewal, and network policies exhausts everyone.

ECS Fargate / Cloud Run is roughly “90% of k8s functionality at 10% of the learning cost” — the default for small/mid scale.

K8s is a weapon for orgs that fit it. Reaching for it too early melts the ops team.

Runtime-migration traps

Migrate one layer at a time: VM to container, container to FaaS or Wasm. Skipping layers usually breaks things.

| Forbidden move | Why |
| --- | --- |
| Moving a VM-hosted app straight to FaaS | Stateful behavior, 15-min+ processing, and sessions don't fit FaaS |
| Dockerizing while still writing logs to files | Logs are lost on container restart; send them to stdout/stderr |
| Adopting k8s without dedicated infra staff | YAML, RBAC, Helm, CRDs, Ingress, CNI, certificates are all in scope; without dedicated ops, incidents pile up |
| Running an always-on web API on FaaS | Cold starts of hundreds of ms to seconds hurt UX; Edge Runtime or a container is the answer |
| Wasm for workloads with heavy file access or native libraries | Support is limited; compatibility shims break |
| Container base images pinned to the latest tag | Reproducibility is lost; one day the build stops working |
| No retry-strategy design for FaaS | Automatic retries run the same payment / email twice |
| Designing 15-min+ long-running work on FaaS | Hits the timeout ceiling (Lambda 15 min) and gets cut mid-run; long tasks go to Step Functions or a queue + worker |
| Containerizing with environment dependencies hard-coded | Host-specific paths, ports, and env vars baked in break immediately in another environment |
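The retry row is the most subtle trap: FaaS platforms may deliver the same event more than once, so side effects need an idempotency guard. A sketch of the pattern (the in-memory set and list below are stand-ins; in production the processed-ID record would live in an external store with a TTL, ideally written with a conditional/atomic operation):

```python
processed_event_ids: set[str] = set()  # in production: external store with a TTL
charges: list[str] = []                # stand-in for the real side effect

def handle_payment_event(event: dict) -> str:
    """Charge at most once per event, even if the platform retries delivery."""
    event_id = event["id"]
    if event_id in processed_event_ids:
        return "duplicate: skipped"        # retry of an already-handled event
    processed_event_ids.add(event_id)
    charges.append(event["order_id"])      # the side effect we must not repeat
    return "charged"

# The platform delivers the same event twice; the charge happens once.
print(handle_payment_event({"id": "evt_1", "order_id": "ord_9"}))  # charged
print(handle_payment_event({"id": "evt_1", "order_id": "ord_9"}))  # duplicate: skipped
```

Note the check-then-set here is not atomic across concurrent instances; real stores offer conditional writes for exactly this reason.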

Migrate in three stages: trial in one feature, then monitored production use, then full adoption. A big-bang full switch is a forbidden move.

Migrate gradually. Full switch is the shortest path to a production outage.

AI decision axes

With AI-driven development as the assumption, the selection axis pivots from “human learning cost” to “performance, startup speed, and lock-in.”

| AI-era favorable | AI-era unfavorable |
| --- | --- |
| Container (portable, IaC-complete) | Bare-metal manual ops |
| Wasm (fast start, low lock-in) | Vendor-specific runtimes |
| FaaS with OIDC + IaC management | Click-through GUI runtime config |
| Kubernetes (AI writes the YAML) | Custom in-house ops automation |
  1. Filter exclusion candidates by physical / performance constraints (latency, GPU, cold start).
  2. Decide k8s / ECS / Cloud Run granularity by ops-staff thickness.
  3. Adjust FaaS density by portability (future cloud migration possibility).
  4. Don’t fixate on one method; design with use-by-use selection.
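The four steps above can be sketched as a rule-based selector. The thresholds and return labels are this sketch's own assumptions, chosen to mirror the article's tables, not fixed rules:

```python
def pick_runtime(needs_bare_metal: bool, edge_latency_critical: bool,
                 event_driven_only: bool, dedicated_ops_engineers: int) -> str:
    """Toy decision flow: physical constraints first, then team size."""
    # Step 1: physical / performance constraints filter candidates first.
    if needs_bare_metal:
        return "bare metal"
    if edge_latency_critical:
        return "Wasm (edge runtime)"
    # Steps 3-4: purely event-driven workloads can live entirely on FaaS,
    # accepting its lock-in; mixed workloads keep a container spine.
    if event_driven_only:
        return "FaaS"
    # Step 2: orchestration granularity follows ops staffing.
    if dedicated_ops_engineers >= 1:
        return "container + k8s"
    return "container + managed (ECS / Cloud Run)"

print(pick_runtime(False, False, False, 0))  # container + managed (ECS / Cloud Run)
```

In practice step 4 means running this per workload rather than once per company: one system can legitimately land on container for the API and FaaS for the batch jobs.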

Industry case: adopting k8s at small scale

Around 2018-2020, many reports surfaced of single-digit-size teams adopting Kubernetes in production and getting buried weekly in CRDs (Custom Resource Definitions, k8s's extension mechanism) and Helm charts (k8s manifest packaging). Monitoring, logging, Ingress, certificate renewal, and network policies all demanded specialist knowledge to operate correctly, which was obvious overkill for teams of fewer than three.

Teams of the same era using AWS ECS or GCP Cloud Run got the same capability at a fifth of the operational load. There are many stories of teams that adopted k8s and switched back to ECS within six months.

K8s is the standard for large orgs, but picking it without the scale to match produces suffering. It is the canonical over-reach landmine.

K8s is a weapon for orgs at scale. At small scale, ECS / Cloud Run is enough.

What you must decide — what’s your project’s answer?

Articulate your project’s answer in 1-2 sentences for each:

  • Primary runtime (VM / container / FaaS / Wasm)
  • Orchestration (k8s / ECS / managed)
  • Scaling strategy (auto / manual / scheduled)
  • Tolerance for cold start
  • Acceptable lock-in
  • Existing-asset (VM / on-prem) policy
  • Runtime language support state

Summary

This article covered runtime selection in the cloud era — VM, container, serverless, Wasm.

The 2026 default: spine on containers, supplement minimally with use-specialized choices. K8s is a weapon for orgs that match its scale; ECS / Cloud Run is enough for small/mid teams.

The next article covers OS selection — Linux / Windows / UNIX.

Back to series TOC -> ‘Architecture Crash Course for the Generative-AI Era’: How to Read This Book

I hope you’ll read the next article as well.

📚 Series: Architecture Crash Course for the Generative-AI Era (9/89)