About this article
As the fifth and final installment of the "Solution Architecture" category in the series "Architecture Crash Course for the Generative-AI Era," this article explains PoC design.
A PoC is an investment made to produce a decision; a PoC that ends without an answer is a failure. This article covers pre-defined Go/No-Go criteria, period setting (within 3 months), differences from an MVP, AI-PoC specifics (accuracy, hallucination rate), and weekly PoC cycles: design that does not end with "so what?"
What is a PoC in the first place?
Think of a tasting session. Before officially adding a new dish to the menu, you prepare a small batch to verify the taste, the cost, and the kitchen operations. The purpose is to test small before committing to full investment: "Is it really good?" "Does it justify the cost?"
A PoC (Proof of Concept) is the IT version of a tasting session. Before investing tens of millions to hundreds of millions of yen in full-scale development, you verify technical and business feasibility with a small prototype and obtain material for a Go/No-Go decision.
Without a PoC, jumping straight into full-scale development means the entire investment is wasted the moment a technical impossibility is discovered. A PoC is the mechanism that keeps failures small.
Why PoC is needed
Lower risk before full-scale development
Verifying with a few million yen before committing tens to hundreds of millions to full-scale development minimizes the loss on failure.
Reduce uncertainty
New technology, new operations, AI adoption: many elements cannot be known without trying them. Obtaining reliable information through a PoC is the rational move.
Build a basis for decisions
When convincing management requires a demonstration, a small prototype that actually runs is more eloquent than any document.
PoC vs prototype vs MVP
PoC, prototype, and MVP look similar but are different things. Because their purposes differ, their design policies differ too.
| Type | Purpose | Users |
|---|---|---|
| PoC (Proof of Concept) | Verify "feasibility" | Internal stakeholders |
| Prototype | Verify "usability" | Some users |
| MVP (Minimum Viable Product) | Minimum form for market launch | Real users |
A PoC is an internal experiment for a Go/No-Go judgment; an MVP is a product that measures whether value emerges in the market. Confusing the two breaks the design.
What to verify in PoC
A PoC does not verify everything; the iron rule is to narrow down to the most uncertain parts. Define what, once proven, lets you proceed to full-scale development.
| Verification target | Example |
|---|---|
| Tech feasibility | Whether it really works with this tech |
| Performance achievability | Whether processing speed meets requirements |
| Business fit | Whether it will actually be used in the field |
| Data quality | Whether the available data yields the expected results |
| Cost validity | Whether it can be built at the expected cost |
| Vendor capability | Whether the candidate vendors can really deliver |
A PoC that verifies what is already known is a waste. Choose only the unknown and uncertain parts.
What not to verify in PoC
A PoC should also make clear what it will not verify. Vagueness here makes the PoC bloat until it becomes indistinguishable from full-scale development.
| Should not verify | Reason |
|---|---|
| Fine UI design | Handled in the full-scale phase after the PoC |
| Scalability | Hard to judge at small scale |
| Full production data | Samples are enough |
| Already-verified tech | No point in verifying it again |
| All-feature implementation | Scope explosion |
Go/No-Go judgment criteria
The most important element of PoC design is the Go/No-Go judgment criteria. Deciding in advance that "if this number is achieved, Go; if not, No-Go" prevents emotional disputes after the PoC.
flowchart TB
START([PoC start<br/>document criteria upfront])
EXEC[PoC execution<br/>1-3 months]
M1{Processing time<br/>P95 <= 500ms?}
M2{Accuracy >= 90%?}
M3{Cost<br/>within 1.5x of expected?}
GO[Go: full-scale-dev approval]
PIVOT[Pivot: scope change]
NOGO[No-Go: stop<br/>consider alternatives]
START --> EXEC --> M1
M1 -->|Yes| M2
M1 -->|No| NOGO
M2 -->|Yes| M3
M2 -->|partial| PIVOT
M2 -->|No| NOGO
M3 -->|Yes| GO
M3 -->|No| PIVOT
classDef start fill:#fef3c7,stroke:#d97706;
classDef step fill:#dbeafe,stroke:#2563eb;
classDef good fill:#dcfce7,stroke:#16a34a,stroke-width:2px;
classDef pivot fill:#fae8ff,stroke:#a21caf;
classDef bad fill:#fee2e2,stroke:#dc2626;
class START,EXEC start;
class M1,M2,M3 step;
class GO good;
class PIVOT pivot;
class NOGO bad;
Starting a PoC without pre-deciding the judgment criteria is the worst mistake: after it finishes, you end up disputing whether it was a success or a failure.
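Judgment criteria are much harder to renegotiate when they are written as executable checks rather than slideware. Below is a minimal, hypothetical Python sketch encoding the three gates from the flowchart above; the thresholds, and the near-miss band that turns an accuracy shortfall into a Pivot rather than a No-Go, are illustrative assumptions.

```python
from dataclasses import dataclass

# Illustrative gate values mirroring the flowchart above; tune per project.
@dataclass(frozen=True)
class GoNoGoCriteria:
    max_p95_latency_ms: float = 500.0  # processing-time gate
    min_accuracy: float = 0.90         # accuracy gate
    max_cost_ratio: float = 1.5        # actual cost / expected cost

@dataclass(frozen=True)
class PocResult:
    p95_latency_ms: float
    accuracy: float
    cost_ratio: float

def judge(result: PocResult, c: GoNoGoCriteria = GoNoGoCriteria()) -> str:
    """Apply the pre-agreed gates in order, as in the flowchart."""
    if result.p95_latency_ms > c.max_p95_latency_ms:
        return "No-Go"  # hard performance failure
    if result.accuracy < c.min_accuracy:
        # Assumption: a near miss (within 5 points) suggests rescoping, not stopping.
        return "Pivot" if result.accuracy >= c.min_accuracy - 0.05 else "No-Go"
    if result.cost_ratio > c.max_cost_ratio:
        return "Pivot"  # it works, but not at this cost; rescope
    return "Go"

print(judge(PocResult(p95_latency_ms=420, accuracy=0.93, cost_ratio=1.2)))  # Go
```

Committing a file like this at PoC kickoff, next to the plan, gives the report meeting a single unambiguous verdict to discuss.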
PoC period design
The principle for a PoC is short, with a clear deadline. Keeping it within 3 months at most is realistic; anything longer means the scope is too wide.
| Period | Suited PoC |
|---|---|
| 1-2 weeks | Tech selection, vendor evaluation |
| 1 month | Single-feature feasibility |
| 2-3 months | Verification including operations |
| 3+ months | Closer to full-scale development than a PoC |
The classic PoC trap is a verification that drags on forever because no deadline was set. The iron rule is to time-box the work and produce an answer.
PoC team structure
The principle for a PoC team is few people working in a short, concentrated burst. A large team cannot move quickly, and decisions get delayed.
| Role | Headcount guideline |
|---|---|
| Architect (lead) | 1 |
| Engineer | 1-3 |
| Business expert | 1 |
| Project manager | 0.5 |
| External vendor (when needed) | 1-2 |
The ideal is 5 people or fewer, minimizing communication cost. In a large PoC team, the management cost exceeds the benefit.
Gap between PoC and full-scale development
A working PoC does not mean full-scale development will succeed. The PoC only shows that something can be done; scale, operations, and maintenance are separate issues.
| Area insufficient in a PoC | Detail |
|---|---|
| Large-scale data | Behavior at 100x or 1000x scale |
| Concurrent users | Behavior under simultaneous use |
| Production operations | 24/7 operations, incident response |
| Security | Production-level countermeasures |
| Governance | Permissions, audits |
| Integration with other systems | Connections in the real environment |
PoC success != project success. Designing how the PoC results will carry over into full-scale development is just as important.
AI PoC specifics
AI / ML PoCs need verification axes different from conventional ones. Beyond "does it run," what matters is "does it produce business value" and "can it maintain accuracy over time."
| AI-specific verification item | Detail |
|---|---|
| Data quality | Whether the training data is sufficient |
| Accuracy / recall | Whether the level is usable in the business |
| Hallucination | Rate of fabricated answers from LLMs (Large Language Models) |
| Continuous learning | Accuracy degradation over time |
| Explainability | Transparency of the reasoning behind judgments |
| Cost | Actual inference costs |
LLM PoCs can significantly exceed expectations on a small sample and yet lose accuracy at scale, so careful evaluation is needed.
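One way to keep that evaluation honest is to score a fixed, labeled sample set on every PoC cycle. The sketch below is a simplified, hypothetical illustration: exact-match accuracy and a human-judged grounded flag stand in for the rubric-based or LLM-as-judge scoring a real PoC would likely need.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    question: str
    expected: str   # ground-truth answer agreed with business experts
    answer: str     # model output collected during the PoC
    grounded: bool  # human judgment: is the answer supported by the sources?

def evaluate(samples: list[Sample]) -> dict[str, float]:
    """Compute the two headline AI-PoC metrics from a labeled sample set."""
    n = len(samples)
    accuracy = sum(s.answer.strip() == s.expected.strip() for s in samples) / n
    hallucination_rate = sum(not s.grounded for s in samples) / n
    return {"accuracy": accuracy, "hallucination_rate": hallucination_rate}

# Hypothetical samples for illustration only.
samples = [
    Sample("What is our refund window?", "30 days", "30 days", grounded=True),
    Sample("Who approves POs over JPY 1M?", "The CFO", "The CEO", grounded=False),
]
print(evaluate(samples))  # {'accuracy': 0.5, 'hallucination_rate': 0.5}
```

Tracking these two numbers week over week also surfaces the accuracy-at-scale degradation the paragraph above warns about.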
Choices after PoC
After the PoC, choose from three options. Ending in No-Go is also a perfectly valid PoC outcome.
| Option | Content |
|---|---|
| Go (full-scale development) | Goal achieved; full-scale investment starts |
| No-Go (stop) | Hard to realize; consider an alternative approach |
| Pivot (direction change) | Partial success; reconsider with a changed scope |
A culture that does not treat No-Go as shameful is important. A failed PoC is a success that prevented a failed full-scale investment; some organizations even reward failed PoCs.
PoC outputs
PoC outputs are not just running code; they include the documents that support the judgment. The iron rule is to leave them in a form management can read.
| Output | Content |
|---|---|
| Working prototype | Verification code |
| Evaluation report | Measurement results, judgment |
| Go / No-Go recommendation | Next recommended action |
| Risk list | Notes for serious development |
| Estimate (refined version) | Recalculated ROI for full-scale development |
| Demo video | For management |
For management, working code plus a one-page summary is the most effective combination.
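The refined estimate is often nothing more than the original ROI calculation with PoC-measured actuals plugged in. A minimal sketch, with hypothetical numbers used purely for illustration:

```python
def refined_roi(
    annual_benefit_jpy: float,   # e.g. hours saved x loaded rate, measured in the PoC
    build_cost_jpy: float,       # full-scale development estimate, refined with PoC actuals
    annual_run_cost_jpy: float,  # ops/inference cost extrapolated from PoC measurements
    years: int = 3,
) -> float:
    """Simple multi-year ROI: (total benefit - total cost) / total cost."""
    total_cost = build_cost_jpy + annual_run_cost_jpy * years
    total_benefit = annual_benefit_jpy * years
    return (total_benefit - total_cost) / total_cost

# Hypothetical figures: JPY 30M/year benefit, JPY 50M build, JPY 8M/year to run.
print(f"{refined_roi(30_000_000, 50_000_000, 8_000_000):.0%}")  # 22%
```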
Decision criterion 1: uncertainty level
The higher a project's uncertainty, the higher the value of a PoC. With known technology and experience from similar projects, a PoC is unnecessary.
| Uncertainty | PoC needed? |
|---|---|
| Known tech, known operations | Unneeded |
| New tech, known operations | Tech PoC recommended |
| Known tech, new operations | Business PoC recommended |
| New tech, new operations | Multiple PoCs required |
| Research element | Exploratory PoC + R&D (Research and Development) |
Decision criterion 2: investment scale
The bigger the full-scale investment, the higher the value of a PoC. A full-blown PoC for a small project is excessive.
| Full-scale investment | Recommended PoC |
|---|---|
| ~JPY 5M | No PoC; go straight to full-scale development |
| JPY 5-30M | Lightweight PoC (1 month) |
| JPY 30-100M | Full PoC (2-3 months) |
| JPY 100M+ | Multiple PoCs + phased approach |
How to choose by case
New-tech selection / vendor comparison
A 1-2 week tech PoC plus a quantitative comparison table. Run three candidate vendors' products against the same scenario and compare performance, usability, and cost. Dedicate 1-2 engineers, and pre-agree the judgment criteria with clear thresholds for performance numbers and licensing fees.
AI / LLM utilization projects
A 1-4 week AI PoC plus evaluation of accuracy, cost, and hallucination rate. Verify with real samples drawn from internal data, prototype with Dify or LangChain, and base the Go judgment on "business-usable accuracy of at least X% and monthly inference cost within JPY Y." Note the risk of accuracy degradation at production scale.
Operations reform / RPA / workflow
A 2-3 month business PoC with field-user participation. Select 5-10 pilot users from the business departments, have them use the system in real operations for one month, and measure the reduction in time and mistakes. The judgment criteria are "achieving a monthly reduction of X hours" and the user-satisfaction score.
Large-scale core reform / JPY 100M+
Multiple PoCs in parallel plus phased decision gates. Run a tech PoC, a data PoC, and a business PoC in parallel, hold a Go/No-Go judgment meeting after each one, and approve full-scale development only after all of them pass. Select vendors through the PoC as well, to measure their actual capability.
Numerical gates for PoC scale and period
Note: industry baseline values as of April 2026. They will become outdated as technology and the talent market shift, so they require periodic updates.
The iron rule for a PoC is short and clear. Below are industry-standard guidelines.
| PoC scale | Period | People | Budget guideline | Judgment criteria |
|---|---|---|---|---|
| Tech-selection PoC | 1-2 weeks | 1-2 | ~JPY 1M | Performance numbers + licensing fees |
| Single-feature feasibility PoC | 1 month | 2-3 | JPY 1-5M | Whether it technically works |
| AI/LLM PoC | 1-4 weeks | 2-3 | JPY 1-5M | Accuracy + cost + hallucination rate |
| Business-included PoC | 2-3 months | 3-5 | JPY 5-20M | Achieving the business-time-reduction goal |
| Pre-full-scale-investment PoC | Within 3 months | 5 or fewer | 5-10% of full-scale investment | Multiple criteria achieved simultaneously |
A PoC over 3 months is close to full-scale development and a sign that the scope should be reviewed. The PoC-budget guideline is 5-10% of the full-scale investment: for a JPY 100M project, a PoC budget of JPY 5-10M is appropriate. In the AI era, a 1-week PoC has become realistic.
A PoC means within 3 months, 5 people or fewer, and numerical judgment criteria. Miss any of these and you fall into the PoC hell of being unable to judge.
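These gates are simple enough to check mechanically at planning time. A small sketch, assuming the guidelines above (the 3-month and 5-person caps and the 5-10% budget band); the function name and structure are illustrative:

```python
def lint_poc_plan(months: float, headcount: int, budget_jpy: float,
                  full_scale_investment_jpy: float) -> list[str]:
    """Check a PoC plan against the numerical gates above; return warnings."""
    warnings = []
    if months > 3:
        warnings.append("Over 3 months: closer to full-scale development; rescope.")
    if headcount > 5:
        warnings.append("More than 5 people: communication cost will exceed the benefit.")
    ratio = budget_jpy / full_scale_investment_jpy
    if not 0.05 <= ratio <= 0.10:
        warnings.append(f"Budget is {ratio:.0%} of the full-scale investment; "
                        "5-10% is the guideline.")
    return warnings

# A JPY 6M PoC for a JPY 100M project, 2 months, 4 people: passes all gates.
print(lint_poc_plan(months=2, headcount=4, budget_jpy=6_000_000,
                    full_scale_investment_jpy=100_000_000))  # []
```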
PoC-design pitfalls and forbidden moves
Typical accident patterns in PoCs. All of them end in noncommittal reports of "it sort of worked."
| Forbidden move | Why it is bad |
|---|---|
| Start the PoC without deciding Go/No-Go criteria | Post-hoc disputes over success vs. failure; report meetings drift |
| Set no upper bound on the PoC period | It continues forever; "we almost have the answer" stretches into 6 months |
| Try to verify all features | Scope explodes; it becomes indistinguishable from full-scale development |
| Run a PoC on known tech and known operations | Nothing to verify; a waste of resources |
| Run with a large team (10+) | Management cost exceeds the benefit; 5 or fewer is the principle |
| Ship PoC code into production as-is | The quality is not production-ready; a rewrite is required |
| Treat No-Go as shameful | Loses the value of preventing a major failure; forcing a Go invites a firestorm |
| Run 14 AI PoCs in parallel with zero conclusions | The organization burns out; management loses its expectations for AI |
| Report PoC results with running code alone | A one-page summary plus a demo video is what lands with management |
| Confuse MVP / prototype / PoC in practice | The purposes differ (internal judgment vs. market launch), so the design breaks down |
Netflix's "Test and Learn" culture (hundreds to thousands of A/B tests a year, each with its success conditions, failure conditions, and period coded in advance, and statistical significance judged automatically) is a success case that systematically eliminates un-judgable PoCs. In contrast, the 14-parallel-PoC hell case (each department running its own, no judgment criteria, zero conclusions a year later) shows the cost of vague purposes and vague judgment criteria.
Go/No-Go criteria are insurance for the PoC and insurance for human relations. Write them on a single A4 page and have everyone sign it.
| Misconception | Reality |
|---|---|
| "A PoC must not fail" → fear of failure | No-Go is a kind of success; it has the value of preventing a major failure |
| "Extending the PoC period will bring success" → dragging it out | A PoC without an answer will not produce one through extension; a reset is needed |
AI decision axes
| AI-era favorable | AI-era unfavorable |
|---|---|
| 1-week PoC, high-frequency verification | 3-month-fixed PoC plan |
| Multiple-case parallel verification | Verify only 1 case |
| AI-premised business design | Conventional-business PoC |
| Continuous small PoCs | One-shot large PoC |
- Pre-decide the Go/No-Go criteria as numbers → shut down the disputes born of vague judgment
- Narrow down to the most uncertain parts only → a PoC that verifies known tech is a waste
- Few people, within 3 months → without a deadline it continues forever
- Make it weekly through AI utilization → high frequency, parallel cases, fail fast and learn fast
What to decide - what is your project's answer?
For each of the following, try to articulate your project's answer in one or two sentences. Starting work while these are vague always invites later questions like "why did we decide this again?"
- Verification purpose (what to prove)
- Judgment criteria (Go/No-Go numbers)
- Period (usually within 1-3 months)
- Team structure (few people, a clear lead)
- Verification scope (do, donât do)
- Outputs (code, report, demo)
- Post-PoC direction (full-scale development / stop / Pivot)
Author's note - "PoC hell" cases that wasted a year
Stories of PoCs with vague purposes and vague judgment criteria steadily exhausting organizations are told again and again.
One frequently reported case: a large enterprise, under a policy of "utilize generative AI in operations," had each department launch generative-AI PoCs independently. A year later, 14 PoCs were running in parallel, all of them had "sort of worked" without reaching a conclusion, and zero had gone into full-scale deployment. A PoC without judgment criteria becomes "content for report meetings" rather than a success or a failure; field engineers burn out, management loses its expectations for AI, and the organization falls into a vicious circle.
In contrast, Netflix's "Test and Learn" culture is cited as a PoC-design success case. Netflix runs hundreds to thousands of feature A/B tests a year, but for each test it pre-declares the success conditions, failure conditions, and period in code, with mechanisms that judge automatically once the results become statistically significant. Go/No-Go is decided mechanically without waiting for human judgment, systematically eliminating "unjudgable PoCs."
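The sketch below is not Netflix's actual tooling, but it illustrates the idea of pre-declaring conditions in code with a plain two-proportion z-test: the success condition, significance threshold, and period are fixed before the test starts, and the verdict is mechanical. All names and numbers are hypothetical.

```python
from math import sqrt

# Pre-declared test definition: success/failure conditions and period, fixed up front.
MIN_LIFT = 0.02  # success condition: at least a +2pt conversion lift
ALPHA_Z = 1.96   # ~95% two-sided significance threshold
MAX_DAYS = 14    # failure condition: no significant result within the period

def z_score(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-proportion z-test comparing conversion rates of variants A and B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)           # pooled conversion rate
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))  # pooled standard error
    return (p_b - p_a) / se

def auto_judge(conv_a: int, n_a: int, conv_b: int, n_b: int, days_elapsed: int) -> str:
    lift = conv_b / n_b - conv_a / n_a
    if abs(z_score(conv_a, n_a, conv_b, n_b)) >= ALPHA_Z:  # result is significant
        return "Go" if lift >= MIN_LIFT else "No-Go"
    return "No-Go" if days_elapsed >= MAX_DAYS else "keep running"

# Hypothetical traffic: variant B converts 7.0% vs. 4.8% for A after one week.
print(auto_judge(conv_a=480, n_a=10_000, conv_b=700, n_b=10_000, days_elapsed=7))  # Go
```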
The two cases show the same truth from opposite sides: a PoC's value lies in producing a decision, and a PoC that produces no decision is nothing but an ongoing cost. Go/No-Go conditions are insurance for the PoC and insurance for human relations.
Summary
This article covered PoC design: Go/No-Go criteria, period, team structure, differences from an MVP, AI-PoC specifics, and moving to weekly cycles.
Pre-decide the Go/No-Go criteria, narrow the scope to the uncertain parts, cap the period at 3 months, and run weekly cycles. That is the practical answer for PoC design in 2026.
This was the final installment of the "Solution Architecture" category. Next time we will start a new category, Case Studies, digging into how the judgment axes learned across all the categories so far combine in the field, through comparisons of real cases by scale and phase.
I hope you'll read the next article as well.
Series: Architecture Crash Course for the Generative-AI Era (79/89)