About this article
This article is the fifth deep dive in the “System Architecture” category of the Architecture Crash Course for the Generative-AI Era series, covering OS selection.
Server OS selection is one of the hardest decisions to redo: applications, middleware, runbooks, monitoring tools, and staff skills are all bound to the OS. Be ready to live with the choice for 10 years. The article covers the three families (Linux / Windows Server / commercial UNIX), distribution selection, license / EOL management, and CPU architecture (x86 / ARM).
A 10-year commitment
The main options are Linux, Windows Server, and commercial UNIX. For new projects in 2026, the actual decisions narrow to two: which Linux distribution and whether to use ARM. Windows Server gets picked when there’s an integration requirement with an existing Microsoft platform; commercial UNIX gets picked only to maintain existing core systems.
The OS belongs in the “decide it at project planning” category. Once code, runbooks, and monitoring tools are written for one OS, switching is essentially a rebuild. Defaulting to “just go with what we know” without weighing the alternatives locks the team into that choice for the next 10 years.
OS is “a foundation you can’t change later.” Decide at planning time.
The major options
flowchart TB
SERVER([Server OS]) --> L["Linux<br/>Overwhelming majority (90%+ new)"]
SERVER --> W["Windows Server<br/>AD/.NET/Office integration"]
SERVER --> U["Commercial UNIX<br/>(AIX/Solaris)<br/>Effectively gone for new builds"]
L --> L1[Ubuntu/Debian family]
L --> L2[RHEL/Rocky/AlmaLinux family]
L --> L3[Amazon Linux/Container OSes]
classDef root fill:#fef3c7,stroke:#d97706;
classDef linux fill:#dbeafe,stroke:#2563eb,stroke-width:2px;
classDef win fill:#f0f9ff,stroke:#0369a1;
classDef legacy fill:#fee2e2,stroke:#dc2626;
class SERVER root;
class L,L1,L2,L3 linux;
class W win;
class U legacy;
| OS | Main use | Position |
|---|---|---|
| Linux | Web/app servers, container infrastructure | Overwhelming majority |
| Windows Server | Active Directory (Microsoft’s identity platform), .NET, Office | Specific use cases |
| UNIX (AIX / Solaris, etc.) | Legacy large-scale finance / core systems | Effectively gone for new builds |
For new projects, the prevailing attitude is “if you pick anything but Linux, you owe a reason.” With cloud and containers as the assumption, building a rationale to pick anything else is harder than just picking Linux.
Linux
Linux is the open-source, UNIX-like OS that Linus Torvalds began in 1991. It now runs the world’s servers, smartphones (Android), and embedded devices, and it is the dominant foundation for cloud VMs, containers, and supercomputers.
| Pros | Cons |
|---|---|
| Free (most distros) | No GUI-ops culture, learning cost |
| Lightweight, fast startup | Enterprise commercial support costs money |
| Highest affinity with containers and cloud | Inter-distro command differences |
| Rich OSS / middleware available | Desktop usage still minority |
Most AWS EC2 / GCP / Azure instances run Linux. It is the de facto standard for server use, and “Linux when in doubt” is the modern default.
Linux distributions
A Linux distribution is “the complete Linux package” — Linux kernel + various software bundled. They split into Red Hat family and Debian family, with different package-management commands and config layouts.
| Family | Examples | Package manager |
|---|---|---|
| Red Hat family | RHEL / Rocky Linux / AlmaLinux / Amazon Linux | yum / dnf (RPM) |
| Debian family | Debian / Ubuntu | apt (DEB) |
| Other | SUSE / Alpine / Arch | Custom |
Pick the family the team is comfortable with. Enterprises lean toward the Red Hat family; developers and startups lean toward the Debian family (especially Ubuntu). After Red Hat announced in December 2020 that CentOS 8 support would end early, at the end of 2021, Rocky Linux and AlmaLinux emerged as RHEL-compatible successors and have become mainstream.
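The family split shows up directly in automation: a script has to know which package manager it is talking to. As a minimal sketch, assuming the host exposes the usual `os-release` fields `ID` and `ID_LIKE`, the family can be resolved like this. The mapping table and function names are hypothetical, chosen for illustration only.

```python
import shlex

# Hypothetical mapping for illustration; extend it for the distros you run.
FAMILY_PACKAGE_MANAGER = {
    "rhel": "dnf",
    "fedora": "dnf",
    "debian": "apt",
    "suse": "zypper",
    "alpine": "apk",
    "arch": "pacman",
}


def parse_os_release(text: str) -> dict[str, str]:
    """Parse KEY=value lines in the os-release format into a dict."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values may be shell-quoted, e.g. ID_LIKE="rhel centos fedora".
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


def package_manager(fields: dict[str, str]) -> str:
    """Return the package manager for the distro, or "unknown"."""
    # Check the distro's own ID first, then the families it declares.
    candidates = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    for name in candidates:
        if name in FAMILY_PACKAGE_MANAGER:
            return FAMILY_PACKAGE_MANAGER[name]
    return "unknown"


# Illustrative sample content, not copied from a real host.
SAMPLE = 'NAME="Rocky Linux"\nID="rocky"\nID_LIKE="rhel centos fedora"\n'
print(package_manager(parse_os_release(SAMPLE)))
```

On a real host the text would come from `/etc/os-release`. Resolving through `ID_LIKE` rather than a hard-coded list of distro names keeps the sketch working for RHEL-compatible rebuilds that declare their lineage.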
Distribution selection by case
| Situation | Recommended |
|---|---|
| Enterprise, commercial support priority | RHEL (Red Hat Enterprise Linux) |
| RHEL-equivalent for free | Rocky Linux / AlmaLinux |
| Developer-friendly, info density | Ubuntu LTS |
| On AWS | Amazon Linux 2023 |
| Embedded / ultra-light (container base) | Alpine Linux |
| SAP and other ERP (Enterprise Resource Planning) platforms | SUSE Linux Enterprise |
Ubuntu LTS (Long Term Support) gives 5 years of support and is the favorite for startups and in-house development. Finance, public sector, and large-enterprise core systems pick RHEL’s commercial support — willing to pay tens of thousands of dollars annually for the comfort of getting Red Hat on the line during incidents.
For new Linux builds, choose among Ubuntu LTS, Amazon Linux, and the RHEL family. AI tools generate accurate shell and configuration code for all three.
Windows Server
Windows Server is Microsoft’s commercial server OS. Even in the Linux era, companies built around the Microsoft platform (Active Directory, .NET, SQL Server, Office) still pick it as the first candidate.
| Fits | Reason |
|---|---|
| Active Directory-centric authentication | AD is a Windows Server feature |
| Existing .NET business apps | Tightly coupled to Windows Server |
| Office / SharePoint integration | Microsoft-product affinity |
| Hyper-V virtualization | Microsoft hypervisor |
| Pros | Cons |
|---|---|
| Easy GUI-based ops | High licensing cost |
| Full Microsoft compatibility | Slow startup, high memory |
| Strong enterprise support | Container ecosystem weaker than Linux |
| Strong AD integration | High running cost on cloud |
It is mostly used to extract value from existing Microsoft assets. For a new cloud-native web app, it is a bad fit.
Skip Windows Server for new builds outside Microsoft-platform integration.
Commercial UNIX
Commercial UNIX (AIX from IBM, HP-UX, Solaris) was tied to a specific vendor’s hardware and used in large-scale, high-reliability workloads. From the 1990s into the early 2000s it ran financial core, airline reservation, and telco systems, but Linux’s rise drove share down significantly.
| Trait | Description |
|---|---|
| Extremely high stability | Multi-year non-stop operation track record |
| Vendor-specific hardware | Tied to IBM Power, SPARC, etc. |
| High licensing / maintenance cost | Can run hundreds of thousands of dollars annually |
| Scarce skill base | Engineer aging and decline |
Commercial UNIX is essentially never picked for new builds. Most current work falls under “maintain the existing core system, gradually migrate to Linux.”
Don’t pick for new builds. Only relevant for legacy maintenance.
License forms
OS license form maps directly to cost structure. Cloud generally bills “OS-included by the hour,” but you can also bring existing OS licenses (BYOL — Bring Your Own License).
| Form | Examples | Trait |
|---|---|---|
| OSS, free | Rocky / AlmaLinux / Debian / Ubuntu | Free, self-managed |
| OSS + commercial subscription | RHEL / Ubuntu Pro / SUSE | Annual contract with support |
| Commercial product | Windows Server / AIX / Solaris | License purchase + maintenance |
RHEL / Ubuntu Pro is the hybrid model: “OSS but with commercial support.” At scale, the annual support fee buys the comfort of getting official help during production incidents.
Support period (EOL)
EOL (End of Life) is the day OS vendors stop providing security patches. Continuing past EOL means accumulating critical security risk as vulnerabilities go unfixed. Pick a version with EOL well past your operational period.
| OS | Standard support end |
|---|---|
| RHEL 9 | 2032 |
| Ubuntu 24.04 LTS | 2029 (2036 with Pro extension) |
| Amazon Linux 2023 | 2028 |
| Windows Server 2022 | 2031 |
| Debian 12 | 2028 (LTS included) |
For systems that will run 10+ years, picking near-EOL is a landmine. Within a few years you trigger an OS-update project at significant cost.
EOL = patches stopped. Pick a version with support longer than the project.
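The rule can be checked mechanically at planning time. The sketch below filters the versions in the table above against a planned operating period. It pins each support-end year to 1 January, which is a deliberately conservative assumption, and the function names and the optional `margin_years` parameter are hypothetical.

```python
from datetime import date

# Support-end years from the table above. Each year is pinned to 1 January,
# a deliberately conservative assumption; check the vendor lifecycle pages
# for the exact dates before relying on them.
SUPPORT_END = {
    "RHEL 9": date(2032, 1, 1),
    "Ubuntu 24.04 LTS": date(2029, 1, 1),
    "Amazon Linux 2023": date(2028, 1, 1),
    "Windows Server 2022": date(2031, 1, 1),
    "Debian 12": date(2028, 1, 1),
}


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 February to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def viable_versions(launch: date, operating_years: int,
                    margin_years: int = 0) -> list[str]:
    """Return versions whose support outlasts the planned operating period."""
    required = add_years(launch, operating_years + margin_years)
    return sorted(name for name, end in SUPPORT_END.items() if end >= required)


# Hypothetical project: launch in April 2026, run for 5 years.
print(viable_versions(date(2026, 4, 1), operating_years=5))
```

Running the same check with a safety margin shows how quickly the candidate list shrinks, which is the point of doing this before development starts rather than after.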
CPU architecture
Equally important is CPU architecture. x86_64 (amd64) was overwhelmingly dominant traditionally, but ARM64 has risen since the early 2020s in cloud, mobile, and Apple Silicon as a low-power, low-cost option.
| CPU | Trait | Examples |
|---|---|---|
| x86_64 (amd64) | Most common, max compatibility | Intel Xeon / AMD EPYC |
| ARM64 (aarch64) | Power-efficient, cost-effective | AWS Graviton / Apple M-series |
| RISC-V | Open architecture, emerging | Embedded / IoT |
AWS Graviton (ARM) instances can run roughly 20-40% cheaper than x86 instances of equivalent performance, depending on the workload. Web servers, Java apps, and many managed services (RDS, ElastiCache, etc.) support Graviton, so skipping the ARM evaluation for new builds is a missed opportunity.
In cloud, ARM adoption can cut cost 20-40%. Always evaluate for new builds.
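To make the 20-40% range concrete, the arithmetic can be sketched as follows. The monthly bill and the one-time migration cost are hypothetical figures, and the function names are illustrative; only the percentage range comes from the text above.

```python
def monthly_saving_range(monthly_x86_cost: float,
                         low_pct: int = 20,
                         high_pct: int = 40) -> tuple[float, float]:
    """Return the (low, high) monthly saving for the given percentage range."""
    return (monthly_x86_cost * low_pct / 100,
            monthly_x86_cost * high_pct / 100)


def break_even_months(migration_cost: float, monthly_saving: float) -> float:
    """Months until a one-time migration cost is recovered by the savings."""
    if monthly_saving <= 0:
        raise ValueError("monthly_saving must be positive")
    return migration_cost / monthly_saving


# Hypothetical figures: a 10,000/month x86 bill and a 30,000 one-time
# cost for validation and migration work.
low, high = monthly_saving_range(10_000)
print(f"Saving per month: {low:.0f} to {high:.0f}")
print(f"Saving per year:  {low * 12:.0f} to {high * 12:.0f}")
print(f"Break-even: {break_even_months(30_000, high):.1f} "
      f"to {break_even_months(30_000, low):.1f} months")
```

Including the break-even step matters because the saving is recurring while the validation effort is one-off; for a new build the migration cost is close to zero, which is why evaluating ARM up front is cheap.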
Container-specialized lightweight OSes
Container-only OSes optimized for the container era have appeared, surpassing general Linux on security, startup speed, and ops automation, with growing adoption on Kubernetes platforms.
| OS | Trait | Provider |
|---|---|---|
| Bottlerocket | AWS EKS/ECS-specific, auto-update | AWS |
| Flatcar Container Linux | Kubernetes-oriented, immutable | CNCF (Cloud Native Computing Foundation) |
| Talos Linux | API-managed, K8s-only | Sidero Labs |
These OSes typically ship without interactive SSH login by default and are managed through APIs, which minimizes attack surface while automating ops. They are worth considering when running Kubernetes seriously.
Selection criteria
OS selection is decided by “existing assets, people, use case.”
| Situation | Recommended |
|---|---|
| No special constraints, web app | Ubuntu LTS or Amazon Linux |
| Enterprise, support priority | RHEL (paid support) |
| Container ops on AWS | Amazon Linux 2023 or Bottlerocket |
| Microsoft-centric company | Windows Server |
| Existing UNIX core system | AIX / Solaris continuing -> phased Linux migration |
| Cost-first | Ubuntu / Rocky + ARM64 |
For new SaaS / web services, Ubuntu LTS + ARM64 is overwhelmingly the best 2026 choice. Plenty of learning material, easy hiring, low cloud cost.
EOL management / patching ladder
Note: industry rates as of April 2026. Periodic refresh required.
OS isn’t “pick and forget” — the real work is the 10-year run to EOL. What matters changes by phase.
| Phase | When | What | Cadence |
|---|---|---|---|
| Selection | Project planning | EOL >= operating period + 2 years | Decide in 1 week |
| Patching | Post-launch up to 2 years before EOL | Monthly CVE patches; Critical within 72 hours | Monthly + emergency |
| Migration evaluation | 2 years before EOL | Stand up next-LTS validation | Half year to 1 year ahead |
| Production migration | 1 year before EOL | Blue-Green migration, parallel run | 3-6 months |
| Stop the old OS | At EOL | Cleanly shut down — don’t leave vulnerable servers | Immediate |
Continuing to run an EOL’d OS recreates the conditions behind the 2017 Equifax incident (an unpatched Apache Struts vulnerability, about 147M records leaked). Manual patching doesn’t scale; modern practice automates it with tools like AWS Systems Manager Patch Manager.
EOL: act backward from the announcement date. Last-minute discovery doubles migration cost.
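Working backward from the EOL date can be sketched in a few lines. The offsets (evaluation 2 years before EOL, production migration 1 year before) come from the phase table above; the EOL date used in the example, the function names, and the phase labels are assumptions for illustration.

```python
from datetime import date


def years_before(d: date, years: int) -> date:
    """Shift a date back by whole years, mapping 29 February to 28 February."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def eol_milestones(eol: date) -> dict[str, date]:
    """Derive the start of each phase by counting back from the EOL date."""
    return {
        "migration_evaluation_start": years_before(eol, 2),
        "production_migration_start": years_before(eol, 1),
        "old_os_shutdown": eol,
    }


def current_phase(today: date, eol: date) -> str:
    """Name the phase from the table that applies on a given day."""
    m = eol_milestones(eol)
    if today >= m["old_os_shutdown"]:
        return "stop the old OS"
    if today >= m["production_migration_start"]:
        return "production migration"
    if today >= m["migration_evaluation_start"]:
        return "migration evaluation"
    return "patching"


# Hypothetical EOL date for illustration only.
EOL = date(2029, 5, 31)
for name, when in eol_milestones(EOL).items():
    print(f"{name}: {when.isoformat()}")
print(current_phase(date(2026, 4, 1), EOL))
```

A check like this is small enough to run in a scheduled job against an inventory of hosts, so that the move from patching to migration evaluation is triggered by the calendar rather than by someone noticing.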
Patching / migration traps
| Forbidden move | Why |
|---|---|
| Apply patches in production all at once | Kernel-patch side effects break middleware startup; do Blue-Green or Canary |
| Keep using an OS whose roadmap was killed (e.g. CentOS 8) | In December 2020, CentOS 8’s end of support was pulled forward from 2029 to the end of 2021, leaving about 1 year |
| Apply Windows Server major versions without business-app verification | .NET major incompatibilities crash production apps |
| Run apt upgrade / yum update interactively in production | Without version pinning and rollback procedures, unrecoverable incidents happen |
| Decide x86_64 -> ARM64 with unit tests only | JVM, native modules, Docker images differ in ARM support; integration and prod-equivalent load testing required |
| Treat OSS commercial subscription as “free” and not renew | Production incident with no official support — hours to days of guesswork |
| Batch OS updates into one annual release | The diff becomes too large to bisect when something breaks; update monthly or quarterly instead |
After Equifax, companies that ignored known vulnerabilities took on lawsuits, fines, and stock drops. Security patches are “updates that don’t add features” — automate and apply continuously.
“It’s running, don’t touch it” is a hotbed of security incidents. Patches are not optional.
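The alternative to patching everything at once is a canary-style rollout: patch one host, verify it, then widen the blast radius step by step. The sketch below is a generic illustration under stated assumptions, not the behavior of any specific patching tool. The `patch` and `healthy` callables and the host names are hypothetical stand-ins for whatever your tooling provides.

```python
from typing import Callable, Sequence


def rollout_waves(hosts: Sequence[str], canary_count: int = 1,
                  growth: int = 2) -> list[list[str]]:
    """Split hosts into a small canary wave followed by growing waves."""
    if canary_count < 1 or growth < 1:
        raise ValueError("canary_count and growth must be at least 1")
    waves: list[list[str]] = []
    size, start = canary_count, 0
    while start < len(hosts):
        waves.append(list(hosts[start:start + size]))
        start += size
        size *= growth  # each wave is `growth` times larger than the last
    return waves


def apply_in_waves(waves: Sequence[Sequence[str]],
                   patch: Callable[[str], None],
                   healthy: Callable[[str], bool]) -> tuple[list[str], bool]:
    """Patch wave by wave; halt before the next wave if any host is unhealthy."""
    patched: list[str] = []
    for wave in waves:
        for host in wave:
            patch(host)
            patched.append(host)
        if not all(healthy(host) for host in wave):
            return patched, False  # later waves are left untouched
    return patched, True


# Hypothetical fleet of ten web servers.
hosts = [f"web-{i:02d}" for i in range(1, 11)]
for number, wave in enumerate(rollout_waves(hosts), start=1):
    print(f"wave {number}: {wave}")
```

The design choice is that a failed health check stops the rollout with most of the fleet still on the known-good version, which keeps rollback small and makes the faulty patch easy to identify.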
The AI-era lens
With AI-driven development as the assumption, “engineer familiarity” drops as a selection criterion, replaced by “AI training data volume” and “command standardization.”
Ubuntu LTS and the RHEL family (including Amazon Linux) have huge public information, Stack Overflow answers, and GitHub Issues; AI generation of shell, Ansible (declarative server-config automation), and Dockerfile is highly accurate. Minor distros and commercial UNIX have sparse AI training data; command generation hallucinates frequently. That’s a fatal gap.
| AI-era favorable | AI-era unfavorable |
|---|---|
| Ubuntu LTS / Amazon Linux (max info) | Minor distros (Slackware, etc.) |
| Container-specialized OSes like Bottlerocket | Commercial UNIX (AIX, Solaris) |
| ARM64 + Linux (info / cases growing) | Windows Server GUI ops |
| Shell / systemd / apt/dnf-complete config | Vendor-tool-dependent ops |
Windows Server can become more manageable with code-driven approaches such as PowerShell DSC, WinGet, and WSL2, but Linux’s AI accuracy is clearly higher. The talent supply is not likely to swing back toward Windows either.
In the AI era, “Ubuntu LTS / Amazon Linux” is effectively the default. ARM64 cost optimization is also easy.
Common misreadings
- RHEL is paid, so it’s higher-functioning -> RHEL and Rocky/AlmaLinux are essentially the same internally. The difference is “support.” If you don’t need support, Rocky/Alma is enough; failing to make that call adds wasted annual fees.
- Windows Server is easier to operate -> GUI looks easier but loses badly on automation / IaC. In the AI era, “manual is easier” flips into a liability.
- Latest distros are safer -> The latest release has a shorter support period and less documentation. The enterprise rule is LTS (Long Term Support); the criterion is “how long can I use it?” not “how new is it?”
- ARM compatibility is sketchy, avoid it -> Major middleware and language runtimes have caught up; web-app workloads are largely fine. Avoiding ARM without evaluating it gives up 20-40% in cost for no technical reason.
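One way to replace the “ARM is sketchy” assumption with evidence is to check which container images already publish an ARM64 variant. The sketch below inspects multi-platform image indexes, assuming each one has already been fetched as JSON (for example with `docker manifest inspect`) and follows the OCI image index layout with a `manifests` list and `platform` entries. The image names and sample data are hypothetical.

```python
import json


def supported_platforms(index: dict) -> set[str]:
    """Collect "os/architecture" strings from an image index document."""
    platforms: set[str] = set()
    for manifest in index.get("manifests", []):
        platform = manifest.get("platform") or {}
        os_name = platform.get("os")
        arch = platform.get("architecture")
        if os_name and arch:
            platforms.add(f"{os_name}/{arch}")
    return platforms


def images_missing_arm64(indexes: dict[str, dict]) -> list[str]:
    """List images with no confirmed linux/arm64 variant.

    A single-platform image has no "manifests" list, so this sketch
    reports it as unconfirmed rather than guessing.
    """
    return sorted(name for name, index in indexes.items()
                  if "linux/arm64" not in supported_platforms(index))


# Hypothetical sample data shaped like an image index.
SAMPLE = json.loads("""
{
  "app-base": {"manifests": [
    {"platform": {"os": "linux", "architecture": "amd64"}},
    {"platform": {"os": "linux", "architecture": "arm64"}}
  ]},
  "legacy-agent": {"manifests": [
    {"platform": {"os": "linux", "architecture": "amd64"}}
  ]}
}
""")
print(images_missing_arm64(SAMPLE))
```

A list like this only narrows the question. Images that do publish ARM64 variants still need integration and load testing before the migration decision is made.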
The CentOS 8 incident (industry case)
Teams that had picked CentOS 8 and built on it were hit by the sudden December 2020 announcement that support would end at the end of 2021. Projects that were expected to run until 2029 had to migrate within about 1 year.
One Japanese SI (systems integrator) spent millions of dollars and several months of unplanned work swapping CentOS 8 for Rocky Linux across dozens of deployed sites. Stories like “we had just kicked off a new project on CentOS 8, and the meeting the following week was hellish” abound.
The lesson is simple: being open source does not guarantee a reliable roadmap. For free distros too, check “who’s behind it” and “have they moved EOL before?” as a line of defense.
“Should be usable for a long time” can flip on one announcement. Verify the backer’s reliability.
What you must decide — what’s your project’s answer?
Articulate your project’s answer in 1-2 sentences for each:
- OS family (Linux / Windows / UNIX)
- Distribution (RHEL family / Debian family / other)
- Version and EOL date
- CPU architecture (x86_64 / ARM64)
- License contract (free / paid sub / commercial)
- Patching policy (manual / auto / Blue-Green)
- Support contract scope
Common failure patterns
- Adopted CentOS 8, forced quick migration -> December 2020 EOL acceleration; 2029 became ~1-year window. Mass migrations to Rocky/Alma.
- Picked a near-EOL version -> OS-update project hits within a few years.
- Picked Windows Server “by feel” -> Licensing fees add up, budget overrun.
- Stayed on x86 without checking ARM -> Missing the 20-40% Graviton savings.
- Free Linux without a support contract -> Production incident with no one to call, days of root-cause hunting.
How to make the final call
OS is a 5-10 year unchangeable foundation, so the decision core sits on long support period and realistic talent supply, not short-term preference.
Eliminate any OS whose EOL is shorter than your operating period; then evaluate whether you need commercial support (can you call the official line during incidents?) by industry and risk tolerance. That’s the first filter.
For new Linux, the choice is effectively “Ubuntu LTS / Amazon Linux / RHEL family.” All three have rich AI training data, so shell / IaC generation accuracy doesn’t differ much. The differentiator is CPU architecture. Whether you adopt ARM64 (Graviton, etc.) changes cloud cost by 20-40%.
Skip Windows Server for new builds outside Microsoft-platform integration.
Selection priority:
- EOL must exceed operating period + 2 years.
- Existing-platform integration (Microsoft / UNIX legacy).
- Commercial support contract sized to industry risk.
- CPU architecture: always evaluate ARM64 (cost axis).
The default is “Ubuntu LTS / Amazon Linux + ARM64.” Top on info density, cost, and AI productivity.
Summary
This article covered OS selection — Linux / Windows / UNIX comparison, distributions, EOL management, ARM architecture.
New Linux: pick from Ubuntu LTS / Amazon Linux / RHEL family. Windows Server: only when integrating with Microsoft platforms. Commercial UNIX: only legacy maintenance. CPU: ARM64 evaluation is mandatory. That is the realistic answer for 2026.
The next article covers datastore (overall data placement at the system-architecture level).
I hope you’ll read the next article as well.
📚 Series: Architecture Crash Course for the Generative-AI Era (10/89)