Appendix

Best-Practice Catalog — When in Doubt, Lean on These

About this article

This article is the second installment of the “Appendix” category in the Architecture Crash Course for the Generative-AI Era series, covering the best-practice catalog.

Where the anti-pattern catalog was a reverse lookup of “landmines you must not step on”, this article is the forward-lookup catalog of “when in doubt, start here.” Each domain gets a one-page distillation of the boring but reliable standard stack. Use it as the skeleton for new projects, the final check before reviews, or the underlay for explaining choices to other teams.

```mermaid
flowchart TB
    NEW([New project<br/>= when in doubt, lean on standards])
    A[Architecture overall<br/>YAGNI / Modular monolith]
    I[Infra<br/>Managed / IaC / Single cloud]
    D[Data<br/>PostgreSQL / 3NF normalization]
    APP[App<br/>SOLID / Small classes]
    F[Frontend<br/>TypeScript+Next.js+Tailwind]
    S[Security<br/>IDaaS / Passkey / Secrets Manager]
    O[Monitoring & ops<br/>SLO / Datadog / On-call]
    P[Process<br/>GitHub Flow / Squash / Small PRs]
    AI[AI era<br/>Standard FW / Type safety / Rich training data]
    NEW --> A
    NEW --> I
    NEW --> D
    NEW --> APP
    NEW --> F
    NEW --> S
    NEW --> O
    NEW --> P
    NEW --> AI
    classDef new fill:#fef3c7,stroke:#d97706,stroke-width:2px;
    classDef good fill:#dcfce7,stroke:#16a34a;
    class NEW new;
    class A,I,D,APP,F,S,O,P,AI good;
```

Architecture-wide standards

Before any specific tech choice comes your posture toward the design principles that have proven widely effective across the industry. None are flashy, but obeying them alone keeps you off the “burning project” list.

| Practice | Content | Rationale |
| --- | --- | --- |
| YAGNI (only what you need now) | Don’t build layers / abstractions of uncertain use | Unused code is the prime suspect for technical debt |
| Choose Boring Technology | Prefer options with 2-3+ years of track record | Information density and adoption suppress hallucinations |
| Leave reasoning in ADRs | One-pager per decision | Your main readers are your future self and your successor |
| Standard libraries / SaaS first | Avoid reinventing the wheel | Custom implementation breeds vulnerabilities and maintenance cost |
| Always measure before choosing | “Feels faster” is not evidence | Perceived and measured performance routinely differ by ~30% |

A documented technical compromise is stronger in long-term operations than an undocumented technically correct answer.
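The “always measure before choosing” row is cheap to honor. Here is a minimal sketch using the stdlib `timeit` module to compare two candidate implementations of the same task before committing to one; the functions are illustrative stand-ins for whatever you are actually deciding between.

```python
import timeit

# Two candidate implementations of the same task (illustrative only):
# building one string out of 1,000 short pieces.
def concat_plus():
    s = ""
    for i in range(1000):
        s += str(i)
    return s

def concat_join():
    return "".join(str(i) for i in range(1000))

# Measure both candidates instead of guessing which "feels faster".
plus_s = timeit.timeit(concat_plus, number=200)
join_s = timeit.timeit(concat_join, number=200)
print(f"+= loop: {plus_s:.4f}s  join: {join_s:.4f}s")
```

Swap in the real candidates; the point is that the decision record can cite a number instead of a feeling.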

Infrastructure / deployment standards

The cloud / runtime defaults that work for startups and mid-sized teams alike 90% of the time. Stretching to K8s or multi-cloud only becomes necessary when revenue and team size grow substantially.

| Practice | Content | Phase |
| --- | --- | --- |
| Lean on a single cloud | One of AWS / GCP / Azure | All phases (up to ~$100M revenue) |
| ECS Fargate / Cloud Run | Standard for container ops | MVP to mid-size, before reaching for K8s |
| Manage all resources via Terraform / CDK | Ban manual setup completely | From engineer #1 |
| RDS in private subnet | Never put DBs on public networks | No exceptions |
| 2 AZ, RTO 1h / RPO 15min | Minimum availability bar | Minimum target for business systems |

Phase-by-phase in practice: MVP runs on ECS Fargate single-AZ, RDS t4g.small, ~$30/month. Growth phase (DAU 100k+) adds 2 AZ + Auto Scaling + CloudFront. Enterprise (internal business, regulated) layers on Multi-AZ + VPC endpoints + AWS Control Tower.

The default play is single cloud + managed services. Distributed and DIY are too early for 90% of teams.

Data standards

Because data, unlike applications, cannot be rebuilt, the first choice ripples for five years. The current industry default is RDB + strict schema definitions at the core, and AI-era assumptions don’t change that.

| Practice | Content | Why |
| --- | --- | --- |
| PostgreSQL as first choice | Schema, JSONB, pgvector, extensibility — all present | Closes off the schemaless escape hatch |
| Separate OLTP and OLAP early | Don’t mix operational and analytical DBs | Analytical queries on production are dangerous |
| History tables or Event Sourcing | Don’t overwrite-update; keep history | Pays off for audit, AI, incident analysis |
| dbt tests / Great Expectations | Automate data quality checks | By the time you notice, inconsistencies are in the tens of thousands |
| Backups must run restore drills | Quarterly recovery rehearsal | Having backups doesn’t mean recovery works |

Numeric gates: tables over 10M rows need partitioning; >10k RPS triggers streaming (Kafka / Kinesis); DWH selection from Redshift / BigQuery / Snowflake.

Data architecture failures are 5x heavier than application architecture failures. Compromises here echo for five years.

Application standards

Code design boils down to “≤300 lines per file, ≤50 lines per method, max 3 levels of nesting.” Sticking to those primitive numeric limits prevents most maintainability problems, and does so more reliably than DDD or Clean Architecture theatrics.

| Practice | Content | Threshold |
| --- | --- | --- |
| Single Responsibility Principle splits | One class / file / method = one responsibility | ≤300 lines/file, ≤50 lines/method |
| Business logic in the app | Don’t push it into stored procedures | Preserves DB migration optionality |
| Don’t swallow errors | catch -> log + rethrow | Swallowing is fatal for incident detection |
| Domain-term naming | Avoid data, manager, util | Names that make intent readable |
| Constructor injection | Avoid all-static | Foundation for testability |
| Optional / Result types | Replace null with type-encoded states | Eliminates missed null checks |

Teams that quietly stick to numeric upper bounds tend to produce more in five years than teams flexing complex theory.

Frontend standards

The current industry standard is the three-piece set: meta-framework + utility CSS + managed authentication. Hand-rolled auth, in-house CSS systems, and raw React routing rarely produce returns proportional to their effort.

| Practice | Content | Examples |
| --- | --- | --- |
| Use a meta-framework | Don’t hand-roll routing, SSR, build | Next.js / Astro / Remix |
| JWT in HttpOnly Cookie + BFF | localStorage storage banned | Hide tokens behind a BFF |
| Auth via SaaS | Clerk / Auth.js / Auth0 / Cognito | DIY auth is a vulnerability factory |
| Tailwind + shadcn/ui | Don’t build a custom CSS design system | Optimal for hiring and learning cost |
| Images via CDN transform + WebP / AVIF | Target LCP under 2.5s | Core Web Vitals work |
| JS bundle ≤ 170KB (after gzip) | Code-split for staged delivery | Initial render under 3s on 3G |
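For the “JWT in HttpOnly Cookie + BFF” row, the essential artifact is the `Set-Cookie` header the BFF emits. A stdlib sketch of what that header should contain (the cookie name and token value are placeholders; a real BFF would set the token it received from the IDaaS):

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "<jwt-from-idaas>"       # placeholder token value
cookie["session"]["httponly"] = True  # invisible to document.cookie / XSS
cookie["session"]["secure"] = True    # sent over HTTPS only
cookie["session"]["samesite"] = "Lax" # blunts CSRF
cookie["session"]["path"] = "/"

header = cookie.output(header="Set-Cookie:")
print(header)
```

Because the cookie is `HttpOnly`, frontend JavaScript never touches the token at all, which is exactly why the localStorage pattern is banned.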

For SEO-critical pages: SSG / ISR. For dashboards: CSR + API. For content plus interactivity: SSR + RSC (React Server Components — server-rendered, shipping HTML and minimal JS to the client). That’s the current conventional split.

“Raw React with hand-rolled routing” is, in 2026, a poor choice. Riding a meta-framework is the standard.

Security standards

Security is cheapest when standardized from day one; bolting it on later costs 100x more. The industry default is delegate, defense-in-depth, least privilege. Avoid in-house implementation thoroughly.

| Practice | Content | Required level |
| --- | --- | --- |
| Auth delegated to IDaaS | Auth0 / Cognito / Clerk / Okta | Day 1 of new services |
| MFA mandatory for all users | TOTP / Passkey over SMS | No exception for admins |
| Standardize on Passkey | FIDO2-based passwordless | Standard for new services today |
| TLS 1.3 mandatory, 1.2 minimum | Disable 1.0 / 1.1 | Enforced on all traffic |
| Secrets in Vault / Secrets Manager | Detect Git contamination via pre-commit hook | Day 1 of development |
| Zero Trust assumed | Don’t trust the inside of the VPN | Authenticate every request |
| PII masking in logs | Don’t log raw personal data | Data protection law compliance |
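The “detect Git contamination via pre-commit hook” row boils down to scanning staged text for credential-shaped strings. A minimal sketch with illustrative patterns; real projects should use gitleaks or detect-secrets rather than hand-rolled rules like these:

```python
import re

# Illustrative credential patterns, not an exhaustive ruleset.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                       # AWS access key ID
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    re.compile(r"(?i)(password|secret|api[_-]?key)\s*[:=]\s*['\"][^'\"]{8,}"),
]

def find_secrets(text: str) -> list[str]:
    """Return matched snippets so the pre-commit hook can fail the commit."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits

staged = 'db_password = "hunter2hunter2"\nregion = "ap-northeast-1"\n'
print(find_secrets(staged))
```

Wired into a pre-commit hook, a non-empty return list aborts the commit before the secret ever reaches Git history, where removing it is far more painful.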

“Don’t build it, delegate it” is the iron rule. The organizations in any country where in-house security implementation is justified can be counted on one hand.

Monitoring / operations standards

“Running” and “observable” are different things. Build the visualization stack from the start and run operations on numbers, not gut feel — the SRE-style standard configuration.

| Practice | Content | Target |
| --- | --- | --- |
| Standardize structured logs (JSON) | Format that assumes search and aggregation | From day 1 |
| SLO / SLI / Error budget | Discuss availability numerically | 99.9% (43 min downtime per month) |
| Three pillars assembled | Metrics, logs, traces unified | Datadog / New Relic / Grafana Stack |
| On-call + PagerDuty | Reliably reach a human with alerts | Night shifts always rotate |
| Runbook maintenance | Document procedures for major incidents | Granular enough for new hires to handle nights |
| Mandatory postmortems | Record cause and countermeasure post-incident | Recurrence prevention, not blame |
| Production changes only via CI/CD | Ban production SSH | Reproducibility and auditability guaranteed |
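Two rows above are directly codeable: structured JSON logs and the error budget arithmetic behind “99.9% = 43 min per month.” A minimal sketch with only the stdlib; field names are illustrative, and real stacks often use a library such as python-json-logger instead:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so search/aggregation work day 1."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("payment accepted order_id=%s", "ord-42")

def error_budget_minutes(slo: float, days: int = 30) -> float:
    # 99.9% over a 30-day month -> 43.2 minutes of allowed downtime,
    # the figure quoted in the SLO row.
    return days * 24 * 60 * (1 - slo)
```

The error budget turns “is the system reliable enough?” into a number you can spend: deploys and experiments continue while budget remains, and stop when it is exhausted.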

“What’s not visualized may as well not exist” is the operations community’s shared assumption.

Process / organization standards

The most common project failure mode is winning on tech and losing on process. The three things that determine long-term operational success: decision records, phased migration, buyer-side understanding.

| Practice | Content | Effect |
| --- | --- | --- |
| Leave decisions in ADRs | One-pager per “why” | Prevents the “nobody can answer in 3 years” problem |
| Strangler Fig phased migration | Avoid big-bang rewrites | Avoids years and millions in burning projects |
| PoC -> implementation order | Always validate uncertain tech first | Estimation accuracy moves by 2x+ |
| Mix architects and implementers | Avoid isolated idealism | Designs aligned with the field |
| Use Conway’s Law in reverse | Design team structure from the desired system boundaries | Align team and API boundaries |
| Buyer also understands architecture | Don’t outsource without understanding | Operations handoff stays possible |

A state where “the buyer doesn’t understand what’s running their tech” is a future explosion. This is industry common knowledge.

AI-era standards

When AI-driven development is assumed, “can AI fluently write and read this?” moves to the center of the selection axis. Four things determine AI-compatibility: a mainstream framework, type safety, declarative style, and CLI operability.

| Practice | Content | AI-era effect |
| --- | --- | --- |
| Lean on mainstream frameworks | Next.js / Django / Rails / FastAPI | Volume of training data drives productivity |
| Make types and schemas explicit | TypeScript / Pydantic / dbt models | Suppresses AI hallucinations |
| Tools operable via CLI / API / IaC | Avoid GUI-only tools | Lets AI take over operations |
| Build data catalogs and metadata | Document descriptions, tags, relationships | RAG and AI agents jump in accuracy |
| AI-generated code goes through normal review | No skip-the-checks deploys | Prevents vulnerabilities from reaching production |
| Design assuming pgvector / Pinecone | Don’t bolt vector search on later | Cuts cost of adding RAG features |
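The “make types and schemas explicit” row can be sketched with only the stdlib. Real projects would reach for Pydantic; this illustrative dataclass (all names hypothetical) shows the underlying idea — an explicit schema is a contract that both humans and AI-generated code get checked against:

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    amount_jpy: int
    paid: bool

    def __post_init__(self):
        # Fail fast when a field's runtime value drifts from its declared
        # type (works here because the annotations are plain classes).
        for f in fields(self):
            if not isinstance(getattr(self, f.name), f.type):
                raise TypeError(f"{f.name} must be {f.type.__name__}")

inv = Invoice("inv-001", 1200, paid=True)
```

When an AI agent hallucinates `amount_jpy="1,200"`, the schema rejects it at construction time instead of letting the bad value flow into the database.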

The current design principle centers on “can AI fluently write and read this?” Choosing along that axis ends up producing human-friendly designs as a side effect.

Combinations that win with “boring tech”

Stack Overflow running .NET + SQL Server + Redis on 9 servers for 100M+ monthly hits is the canonical “winner who picked boring tech” story. Conversely, Uber’s post-2,200-microservice consolidation into DOMA (Domain-Oriented Microservice Architecture) is frequently cited as “the post-mortem on going to maximum decomposition.”

The current no-fail default combination is below. For new projects, pull from this stack and only swap the elements you really need to.

| Layer | Default |
| --- | --- |
| Cloud | AWS (Tokyo region) |
| Runtime | ECS Fargate |
| DB | PostgreSQL + pgvector |
| Backend | Python (FastAPI / Django) or Go |
| Frontend | Next.js + Tailwind + shadcn/ui |
| Auth | Clerk or Auth.js |
| Monitoring | Datadog or Grafana Cloud |
| IaC | Terraform |
| CI/CD | GitHub Actions |

Teams that can keep a boring stack running straight for five years are, in the end, the strongest teams.

Author’s note — picking “the standard” is a fight against feeling lame

The unexpectedly hard part of architecting, repeatedly mentioned in industry circles, is shaking off the three temptations of “latest,” “cool,” “educational.” Conference-spotlight cases assume “substantial scale, substantial team, substantial budget” — copying them with a 10-person team usually means the core feature work doesn’t happen.

Shopify still running Ruby on Rails as a monolith at massive scale, Basecamp deliberately building HEY’s email service on “boring tech,” Amazon still using C++ and an in-house RPC behind S3 after 20 years — all of these point to the courage to accept being boring as a common trait of winners.

Teams that swallowed the lameness and leaned on standards keep humming five years later. The thing an architect should be proud of is not “how new the technology is” but “the product has been running for five years without stopping.” That, more than anything, is what people who keep doing this work for a long time keep in mind.

Self-check checklist

Confirm whether the standards are in place. Failing 3+ items is a red zone; revisit the relevant references.

  • Production decisions are recorded in ADRs.
  • All resources managed via Terraform / CDK (no manual setup).
  • RDS / DBs always in private subnets.
  • Auth delegated to IDaaS (Auth0 / Cognito / Clerk / etc.).
  • MFA mandatory for all users.
  • Logs output as structured JSON.
  • SLOs explicitly defined numerically (e.g., 99.9%).
  • Production changes always via CI/CD.
  • Runbooks documented for major incidents.
  • Backup-restore drills run regularly.

How to make the final call

The essence of best practices is adopting the combination most field teams have survived with as your opening move. Other locally superior options exist, but options where information density, talent availability, and operational track record all line up are typically narrowed to one or two at any given moment.

In the AI era this trend deepens. Frameworks AI writes fluently in produce multi-fold productivity boosts, widening the gap with niche options. Boring but standard directly translates into designs friendly to humans and AI alike.

Selection priority

  1. Standard / majority — information density and hire-ability are long-term winners.
  2. Managed / SaaS — wins on vulnerabilities and operational cost over DIY.
  3. Type-safe / declarative — boosts both AI accuracy and maintainability.
  4. Phased migration possible — avoid big-bang rewrites, leave room to swap.

“Choose the standard, and nobody can blame you in 3 years.” Eccentric selections, even when successful, become tribal knowledge; when they fail, they isolate you.

Summary

This article covered the best-practice catalog end-to-end — domain-by-domain defaults, boring tech, the AI-era standard stack, and the design posture for systems that hum quietly for five years.

Lean on the standard, delegate to managed, fasten with types, migrate in phases. That is the realistic answer for best practices in 2026.

The next article covers the “major incident catalog” — Knight Capital, Equifax, SolarWinds, CrowdStrike, and other cases where the industry paid hundreds of millions of dollars. A practical reference for learning from those bills.

Back to series TOC -> ‘Architecture Crash Course for the Generative-AI Era’: How to Read This Book

I hope you’ll read the next article as well.