World Models in Product Development
or How to Stop Worrying and Love AI
Article 1 of 3: From Agile to World Models
Introduction: The Pattern We've Seen Before
Companies are spending more than $500 billion on AI this year. Software development teams question whether they need project managers when AI can ideate, develop requirements, and iterate with the team. Applications can be created from whole cloth and deployed by developers alone.
In the wider corporate world, AI prompts in browsers and applications promise to automate clerical tasks as well as the operational tasks that once defined project management. Yet 95% of these business-focused AI projects deliver no return on investment, and a growing number of investors and analysts are demanding, "show us the money!"
We have seen a very similar pattern before, with Agile, and we spent twenty years figuring things out through trial, error, and painful transformation. The difference is that AI adoption is measured in months and years, not decades. Delay will have existential costs.
The Central Challenge
AI adoption requires the same fundamental shift Agile needed: role clarity combined with reliable data pipelines. But this time we're armed with experience, and we can't afford another generation of costly mistakes.
The solution lies in world models: co-evolving representations of intent, roles, and data that adapt as AI changes where bottlenecks emerge in the product development cycle. This isn't a new framework to learn. It's an evolution of the roles you already have, informed by a decade of experience moving product development teams toward the processes needed to successfully adopt AI.
The Core Problem: Why AI Projects Are Failing
The Numbers Tell a Story
Initial enthusiasm for AI tools leads to high usage rates that then fall off precipitously. Companies introduce AI programs and see short-term productivity gains followed by rapid declines. The pattern is consistent across industries and use cases.
Three Root Causes
1. The Trust Gap
AI is running into serious hurdles even among developers. Unreliable tools carry a cost: when AI breaks, behaves inconsistently, or returns too much slop, no amount of prompt engineering or agentic magic will overcome lagging trust and the high cost of engagement.
Questions plague adoption:
- What can we safely enter into context windows?
- How do we validate AI outputs in high-stakes domains like legal, medical, and financial services?
- What accountability frameworks apply when AI produces errors?
The slop problem creates a vicious cycle: when AI breaks or produces inconsistent results, developer collaboration suffers, and impressive solo accomplishments may not scale to teams. No amount of technical sophistication can recover from broken trust.
2. Data Fragmentation
Many organizations are finding that their internal data is too fragmented or messy for AI to use reliably. Connecting AI with legacy infrastructure is often more expensive than the AI software itself.
This problem has deep roots in Agile philosophy. Agile conditioned us to maximize output from short-term investment, creating a lossy approach to information and documentation. The marginal value of documentation was deemed low, and the value of capturing the intent behind changes was deemed even lower. Product managers focused on the next release rather than the past.
The result: data wastelands. "A wiki is where information goes to die."
AI tools are particularly good at creating value from data. Given a product development philosophy shaped by Agile's disregard for documentation, it should be little surprise that AI is not attracting paying customers. Without data, there is no value proposition.
3. Role Confusion
Outside software development, AI diffusion is not delivering on its promise of systemic core business value. Experience in calculating ROI is sparse. Staff need guidance through cultural adaptation, along with coaching and clarity about their roles along the way.
The fundamental questions remain unanswered:
- Where does AI fit in existing workflows?
- Who can direct and oversee AI operations?
- How do we address fears about job displacement?
- What happens when AI lowers the cost of cognitive tasks but creates new bottlenecks in human review?
CEOs adopt AI-first initiatives with savvy investor outreach but unclear expectations. The pressure to conform is strong, especially when actions generate soundbites that please investors. "Reduced unnecessary headcount by adopting AI" is currently in vogue, even when the productivity gains don't materialize.
Why This Feels Familiar
These challenges mirror exactly what we experienced during early Agile adoption:
- Trust issues with new tools and processes
- Data infrastructure problems
- Role confusion and cultural resistance
- CEO pressure for quick wins to demonstrate value to investors
We've seen this all before. The question is whether we've learned enough to avoid repeating the same mistakes.
What Agile Taught Us: The Evolution from Iteration to Data-First
The Agile Philosophy and Its Blindspot
All flavors of Agile evolved to protect the most expensive and time-consuming aspects of product development: generally, building and testing a product. Agile introduced team-based rituals to factor product requirements into smaller parts and then apply a dynamic "measure twice, cut once" approach to implementation.
What Agile Optimized For:
- Protecting expensive resources: development time and testing cycles
- Adapting requirements within sprints based on feedback
- Team collaboration and iterative delivery
What Agile Deprioritized:
- Documentation ("low marginal value")
- Capturing intent behind decisions ("even lower value")
- Data versioning and curation
- Cross-functional knowledge sharing beyond immediate sprint needs
By design, feature requirements and use cases evolved within sprints, and it took tremendous work to coordinate and sequence integration, validation, and documentation. Market pressures drove teams to spend time on work that provided immediate value in support of sprint goals.
Agile never solved the core challenge of curating and versioning data related to the product development cycle. Good GitFlow and DevOps mitigated this data loss to some extent, but even the clear benefits of automation weren't enough to support maintaining test cases over time.
The Inflection Point: When Automation Changed Everything
Prior to AI, the previous "big shift" was toward automation. QA morphed into Test Automation, which became part of automated DevOps pipelines with change control and deployment. This introduced one problem that ruled them all: automation requires clear requirements against which to test.
The Challenges That Emerged (circa 2010-2021):
Teams from startups to global power companies had to adjust to a much faster develop-and-deploy cycle. Just-in-time changes made during refinement, in-sprint, and in feature branches required integration with other teams' work.
Prior to automation, the relatively slow manual merge process provided time for PM review, "enough" testing, and multiple attempts at deployment, ideally with some trailing initiative to create documentation and occasionally give Customer Support insights into changes and issues.
With automation, that buffer disappeared. Accurate data and documentation suddenly carried real production value. To support automation, Agile teams had to spend time on work that provided longer-term value: versioned requirements, documentation, and validated test coverage.
The Failed First Response
Very few recognized that role-based tribal knowledge, experience, and wisdom were the key to successful digital transformation. The initial response was to include everyone: Agile teams adjusted by adding members from design teams as well as customer support, finance, and marketing. Stakeholders were invited into the fold.
Why It Failed:
This helped surface a few issues but more generally increased overhead, slowed development, and made it more difficult for team members to understand what was going on. People are not very good at understanding their own roles in complex systems.
The Solution That Worked: Role-Based Sources of Truth
The next iteration of data-first Agile resolved most of the problems. Representatives from Product Management, Design, Development, Validation, Deployment, and other stakeholders were still included in teams, but they were now accountable for being the sources of truth for their respective roles.
Team Contracts with SLAs:
- If a requirement was unclear or inconsistent, the design team was required to solve the problem
- Changes were reflected in versioned design specifications and integration tests
- Documentation was updated before a change was considered done
- Customer support and marketing were involved in the business decision to deploy features
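A minimal sketch of how such a contract might be encoded so the SLAs become checkable rather than aspirational. The role names and fields here are illustrative assumptions, not a prescribed format:

```python
from dataclasses import dataclass

@dataclass
class Contract:
    role: str  # who is the source of truth
    owns: str  # what artifact this role maintains
    sla: str   # what "done" requires from this role

# Illustrative contracts; real ones come from your team's agreements.
contracts = [
    Contract("Design", "versioned design specs",
             "unclear requirements resolved before development starts"),
    Contract("Validation", "integration tests",
             "spec changes reflected in tests before merge"),
    Contract("Documentation", "user-facing docs",
             "updated before a change is considered done"),
]

def change_is_done(signoffs: dict) -> bool:
    """A change is done only when every contracted role has signed off."""
    return all(signoffs.get(c.role, False) for c in contracts)

print(change_is_done({"Design": True, "Validation": True}))  # False: docs unsigned
```

The value isn't the code; it's that "done" stops being a matter of opinion.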
Each Role as Source of Truth:
- Product Management: Requirements and business context
- Design: User flows and specifications
- Development: Implementation and technical decisions
- Validation: Test coverage and acceptance criteria
- Deployment: Release management and infrastructure
Building Trust Through Success
The companies that successfully incorporated automation and DevOps pipelines focused on working with the people in each role. They understood what each role needed to gain trust in the rest of the system and, just as importantly, secured critical buy-in for taking on a few small tasks and processes in addition to existing responsibilities.
One problem with a distributed model like this is that the productivity gains depend on everyone trusting that it works. The most effective way to build that trust while moving to a data-focused Agile process is success.
The Timeline:
In general, it took three to six months for teams to look back and wonder how they had ever done things any other way. Successful feature launches involved Marketing and Finance. Customer Service had the new user flows and known issues in hand prior to release. Documentation always reflected the current version.
Stakeholders granted teams more autonomy. Customers were happier. CEOs were rewarded by investors.
Key Lessons for AI Adoption
- Role clarity matters more than tools - The best AI won't fix unclear responsibilities
- Data pipelines require buy-in from everyone - One broken link breaks the chain
- Trust takes time but not decades - 3-6 months for process adoption, 2-3 years for full cultural change
- Success breeds adoption - Small wins matter more than grand visions
- Incremental role changes work better than wholesale transformation
Why AI Is Different (And Why the Fundamentals Remain the Same)
The Accelerated Timeline
Agile adoption played out over twenty-plus years across industries, with each organization learning painful lessons independently. Digital transformations were decades-long journeys, with shiny new approaches emerging every few years to encourage CEO engagement and cajole the laggards.
AI adoption is happening at a fundamentally different pace. The growing pains and failures appear more quickly and publicly. The successes are difficult to quantify, though far easier to measure than in the early days of Agile. Market pressure, competitive advantage, and investor expectations compress decision cycles from years to months.
Critical difference: Delay has existential costs in the AI era. Organizations that fall behind may not have time to catch up.
The Expanded Scope
AI adoption is faster than Agile's and spans a broader range of industries. Agile primarily focused on software development and gradually diffused into other business verticals. AI represents simultaneous adoption across finance, healthcare, legal, marketing, and operations.
More diverse use cases mean more complex failure modes. The market for AI tools is in its infancy, both in frontier models and in how to use them. Meanwhile, AI-washing already gives CEOs leverage to rationalize massive capital investments and layoffs.
The Dynamic Bottleneck Problem
This is where AI fundamentally differs from Agile in a way that demands new thinking.
Agile's Assumption:
Development and testing are the expensive bottlenecks. Build team rituals and processes to optimize around protecting these resources.
AI's Reality:
AI lowers the cost of cognitive tasks that can be expressed as language. The bottleneck moves dynamically depending on which tasks AI can handle effectively.
Examples of Shifting Bottlenecks:
- Code generation: Development is no longer the primary constraint as code review, integration testing, and technical validation become bottlenecks
- Content creation: Writing is cheap and fast, but editorial judgment, brand consistency, and strategic messaging require more human oversight
- Data analysis: Generating insights happens quickly, but validating accuracy, determining business implications, and making decisions based on analysis require expertise
- Customer support: Response generation is instant, but empathy, escalation judgment, complex problem-solving, and relationship management can't be automated
The Catch-22:
The bottleneck in AI adoption is not the pace of incremental improvements to frontier models. No matter the hype, AI does not have the domain experience, intuition, or wisdom to divine human intent, ask the right questions, or reward innovation.
The costs of human labor in data preparation and cleaning far eclipse even the costs of training, which is itself orders of magnitude more costly in compute and intellectual effort than improving fine-tuning, expanding context windows, or adding RAG.
Without clean, structured data, AI can't deliver value. But without AI proving value, organizations won't invest in the data infrastructure needed to make it work.
The Similarities That Matter
Despite these differences, the fundamental problem is identical to what we faced with Agile.
Both require:
- Clear role definitions that adapt to new production economics
- Data pipelines that capture intent and context, not just output
- Trust-building through incremental success
- Cultural change that respects human capabilities rather than trying to replace them
The human problems remain constant:
- Skills gaps and training needs
- Cultural resistance to change
- CEO pressure for quick wins and investor optics
- Fear of job displacement
- Difficulty measuring ROI during transition periods
What's Genuinely New
Challenges Specific to AI:
- Opacity: AI decision-making is harder to audit than human or rule-based decisions
- Reliability variance: Performance varies wildly by domain and task type
- Context limitations: How much information can be safely and effectively provided?
- Hallucinations and slop: AI confidently produces wrong answers
- Rapid evolution: Best practices for current models may not apply to next-generation models
Why Traditional Change Management Fails
Standard change management approaches assume:
- Stable tools with predictable capabilities
- Clear best practices emerging from early adopters
- ROI calculations based on established use cases
AI reality demands different thinking:
- Tools and capabilities evolving monthly
- Best practices still being discovered through experimentation
- ROI dependent on organization-specific data quality and role adaptation
We can't wait for someone else to figure it out. Each organization needs a framework that co-evolves with AI capabilities.
The World Model Approach: Co-Evolving with AI
What Is a World Model?
A world model is a dynamic, shared representation of how intent, roles, data, and validation interact within your product development cycle, designed to adapt as AI changes the economics of production.
This is not another framework to learn. It's an approach to thinking about your existing roles and processes through a lens that makes AI integration natural rather than disruptive.
The Core Concept:
In AI research, world models refer to internal representations that systems build to model how states change over time in response to specific actions. These models enable AI to predict outcomes and plan effectively.
In business, world models serve a parallel function: they create explicit, shared representations of how decisions, data, and outcomes connect across roles. Just as AI builds internal representations of patterns, organizations need external representations of their knowledge flows that both humans and AI can navigate.
Why "World Models" and Not Just "Process Documentation":
The term emphasizes that these representations:
- Are dynamic (they evolve as capabilities and bottlenecks change)
- Are shared (everyone can see how the pieces fit together)
- Model change (they capture not just what is, but how things transform)
- Enable co-evolution between human practices and AI capabilities
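To make the idea concrete, here is a minimal sketch of a world model's skeleton as a data structure. Everything in it is an illustrative assumption rather than a prescribed schema; the point is that intent, ownership, and handoffs become explicit and queryable by both humans and AI:

```python
from dataclasses import dataclass, field

@dataclass
class Role:
    name: str             # e.g. "Product Management"
    source_of_truth: str  # where this role's canonical data lives

@dataclass
class WorldModel:
    intents: dict = field(default_factory=dict)  # why the work exists
    roles: dict = field(default_factory=dict)    # who owns which data
    flows: list = field(default_factory=list)    # (from_role, to_role) handoffs

    def handoffs_for(self, role_name: str) -> list:
        """Make a role's dependencies explicit instead of tribal."""
        return [f for f in self.flows if role_name in f]

wm = WorldModel()
wm.roles["PM"] = Role("Product Management", "requirements repo")
wm.roles["Design"] = Role("Design", "versioned design specs")
wm.flows.append(("PM", "Design"))
print(wm.handoffs_for("PM"))  # [('PM', 'Design')]
```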
The Three Layers of a World Model
Layer 1: Intent Mapping
This layer captures the "why" behind work:
- What problem are we solving? (Product Management domain)
- What does success look like? (Design and Business domains)
- What constraints matter? (Technical, regulatory, resource domains)
AI's role in this layer: Help articulate and validate intent through natural language interaction. Surface conflicts or ambiguities in stated goals.
Human's role in this layer: Provide domain expertise, business intuition, and strategic wisdom that AI lacks. Make judgment calls about priorities and tradeoffs.
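A hedged sketch of what an intent record in this layer could look like, with the three questions above as fields. The names are assumptions for illustration, and the gap check stands in for the kind of ambiguity-surfacing an AI assistant can do:

```python
from dataclasses import dataclass, field

@dataclass
class Intent:
    problem: str           # what problem are we solving? (Product Management)
    success_criteria: str  # what does success look like? (Design and Business)
    constraints: list = field(default_factory=list)  # technical, regulatory, resource

    def open_questions(self) -> list:
        """Gaps an AI assistant could flag; resolving them stays a human call."""
        gaps = []
        if not self.success_criteria:
            gaps.append("success criteria undefined")
        if not self.constraints:
            gaps.append("no constraints recorded")
        return gaps

checkout = Intent(problem="Cart abandonment is too high", success_criteria="")
print(checkout.open_questions())  # ['success criteria undefined', 'no constraints recorded']
```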
Layer 2: Role-Based Data Pipelines
This layer defines how work flows between roles:
- Each role maintains its source of truth
- Data flows between roles with clear handoffs and SLAs
- Changes are versioned with context about why, not just what
- Dependencies and blockers are explicit
AI's role in this layer: Automate data transformation between formats. Flag inconsistencies. Suggest connections between related information.
Human's role in this layer: Make judgment calls when data conflicts. Resolve ambiguities. Validate that AI suggestions make sense in context.
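One way to picture this layer is a versioned handoff log that records the rationale alongside the artifact. The fields and the example entry below are illustrative assumptions:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Handoff:
    from_role: str
    to_role: str
    artifact: str   # e.g. "checkout-flow spec v12"
    rationale: str  # why the change was made, not just what changed
    at: datetime

log = [Handoff(
    "Design", "Development", "checkout-flow spec v12",
    "simplified address form after usability tests showed drop-off",
    datetime.now(timezone.utc),
)]

# Humans and AI can both query the log later to recover intent.
for h in log:
    print(f"{h.from_role} -> {h.to_role}: {h.artifact} ({h.rationale})")
```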
Layer 3: Continuous Validation
This layer ensures quality at every stage:
- Are we building the right thing? (Product validation)
- Are we building it right? (Technical validation)
- Can we support what we built? (Operational validation)
AI's role in this layer: Pattern recognition across large datasets. Identify anomalies and edge cases. Suggest test scenarios based on historical issues.
Human's role in this layer: Contextual judgment about which issues matter most. Risk assessment for edge cases. Stakeholder communication about tradeoffs.
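A small sketch of the checkpoint idea: AI raises flags, but nothing ships until a human has ruled on each one. The function and its inputs are assumptions for illustration:

```python
def release_gate(ai_flags: list, human_decisions: dict) -> bool:
    """Pass only when every AI-raised flag has an explicit human decision."""
    unresolved = [f for f in ai_flags if f not in human_decisions]
    if unresolved:
        print(f"Blocked: awaiting human review of {unresolved}")
        return False
    # Humans may also reject: a reviewed-and-denied flag still blocks release.
    return all(human_decisions.values())

flags = ["edge case: empty cart checkout", "latency regression on search"]
decisions = {"edge case: empty cart checkout": True}
print(release_gate(flags, decisions))  # False: one flag still unreviewed
```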
Why This Enables AI Adoption
Addresses the Trust Gap:
Clear accountability at each step means AI suggestions can be traced back to source data and validated by the responsible human role. There are explicit checkpoints at critical junctures where humans review and approve AI contributions.
Trust builds gradually as teams see AI catching issues they would have missed while also learning when to override AI suggestions based on context it doesn't have.
Solves the Data Problem:
When each role sees immediate utility from maintaining their source of truth, data quality becomes a natural priority rather than an afterthought. Documentation becomes a byproduct of the work rather than additional overhead.
The data pipelines serve humans first: clearer handoffs, fewer misunderstandings, less rework. AI benefits second by having clean, contextualized data to learn from.
Clarifies Roles:
AI handles tasks expressible as language and pattern recognition. Humans focus on intent, judgment, and validation. As AI capabilities grow, roles evolve rather than disappear.
The question shifts from "will AI replace this role?" to "how does this role's focus change as AI handles routine cognitive tasks?"
Provides an Evolutionary Path:
No "big bang" transformation required. Start with current roles and minimal process additions. Add AI tools where they provide clear value. Expand as trust and data quality improve.
Teams can adopt at human pace because the framework anticipates that AI capabilities will continue evolving. The goal is not to optimize for today's AI but to create structures that adapt as AI improves.
How This Differs from Other Approaches
vs. AI-First Transformation:
AI-first approaches often assume AI can replace complex human judgment and push organizations to reorganize around AI capabilities. This creates resistance and often fails when AI can't deliver on promises.
World models start with humans and add AI incrementally. The focus is on amplifying existing strengths rather than wholesale replacement.
vs. Traditional Change Management:
Traditional change management assumes you're moving from a known current state to a known future state and that the tools are relatively stable.
World models are designed for dynamic capabilities. They emphasize co-evolution rather than transformation to a fixed end-state and include built-in adaptation mechanisms.
vs. Data Governance Initiatives:
Data governance typically focuses on compliance, access control, and preventing misuse. It's often seen as overhead that slows teams down.
World models focus on active data utility in daily workflows. They connect data quality directly to productivity gains, making maintenance feel valuable rather than burdensome.
The Feedback Loop
World models improve over time through a continuous cycle:
- Explicit representation of current state (roles, data flows, workflows)
- AI interaction reveals gaps, inefficiencies, and opportunities
- Human refinement based on what actually works in practice
- Update the model with new patterns and improved practices
- Repeat as AI capabilities and business needs evolve
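The cycle can be pictured as a simple loop. This is purely illustrative; the refine argument stands in for the human judgment applied at step three:

```python
def feedback_cycle(model: dict, gaps: list, refine) -> dict:
    """One iteration: AI interaction surfaces gaps, humans refine, the model updates."""
    for gap in gaps:                                # step 2: gaps revealed by AI use
        model = refine(model, gap)                  # step 3: human refinement
    model["version"] = model.get("version", 0) + 1  # step 4: update the model
    return model                                    # step 5: repeat next cycle

wm = {"patterns": []}
wm = feedback_cycle(wm, ["handoff to Support undocumented"],
                    lambda m, g: {**m, "patterns": m["patterns"] + [g]})
print(wm)  # {'patterns': ['handoff to Support undocumented'], 'version': 1}
```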
This cycle creates institutional knowledge that:
- Survives role transitions and employee turnover
- Can be queried by both humans and AI
- Evolves with your organization's unique context
- Becomes increasingly valuable over time as it captures more experience
The world model becomes a living asset that reflects not just how work is done but why it's done that way: the accumulated wisdom that makes your organization effective.
The Difference Between This and the Last Twenty Years
Agile adoption was painful because we didn't know where it was going. We experimented, failed, learned slowly, and gradually converged on practices that worked.
We do know where AI adoption needs to go now. We've already learned the hard lessons:
- Role clarity matters more than sophisticated tools
- Data pipelines create value for humans first, AI second
- Trust comes from demonstrated success, not executive mandates
- Incremental change beats big-bang transformation
- People with domain expertise can't be replaced by prompts, but they can be amplified
The Promise of World Models
You don't need perfect AI. You don't need complete data. You don't need to transform everything at once.
You need a shared understanding of how work flows through your organization and a framework that adapts as AI changes what's possible.
Start small. Measure honestly. Evolve constantly.
The companies that will thrive with AI aren't the ones with the fanciest models or the most ambitious transformation programs.
They're the ones who figured out how to make AI and humans genuinely better together: not by replacing people with prompts, but by building world models that amplify human judgment with machine capability.
You already have the roles. You already have the people. You already have the domain knowledge.
Now you have a roadmap to put them together in a way that actually works.
Next Steps
Ready to start?
- Map your current state (Weeks 1-2): Identify where your sources of truth actually live
- Find your bottleneck (Weeks 3-4): Track where work waits in your current process
- Pick one pipeline (Month 2): Choose a high-value workflow to optimize first
Need deeper guidance?
Each section of this summary will expand into a full chapter with:
- Detailed examples from specific industries
- Templates and tools for each role
- Case studies of successful implementations
- Common pitfalls and how to avoid them
About This Work
This summary draws on ten years of experience guiding product development teams through the exact role and process transformations needed to adopt AI successfully. The approach, rooted in Bespoke Agile principles developed since 2016, anticipated the data-first, role-based solutions that AI adoption requires.
For organizations ready to navigate AI adoption without repeating Agile's costly mistakes, consulting and full-time engagement opportunities are available.
Learn more: bespokeagile.com