You don’t need a perfect data warehouse to start with AI
The two-year data-warehouse project is the most expensive way to delay AI value. Governed agents can work your data where it already lives — today.
Here's the advice that quietly kills AI initiatives: "first we need to centralize everything into a clean data warehouse, then we'll do AI." It sounds responsible. It's usually a two-year detour that delivers AI value roughly never.
Why "warehouse first" is a trap
Enterprise data-warehouse and lakehouse projects routinely run 12–24 months and cost a fortune before anyone sees a result. By the time the project is "done," the requirements have changed and the AI landscape has moved on. Worst of all, a warehouse is only a copy of your data: it captures the rows, not the workflow. The renewal process, the approval chain, and the way your team actually works live in the source systems, not in a star schema.
Example: an association spends 18 months and six figures consolidating Salesforce NPSP, Sage Intacct, and Cvent into a warehouse so it can "do AI." Meanwhile the actual need — a board packet that pulls current numbers from all three — could have run on day one against the live systems. The warehouse delivered a dataset; the business needed a workflow.
Agents work your data where it lives
A governed agent doesn't need everything pre-centralized. It reaches into your systems where they are — CRM, finance, documents, events — over governed connectors (MCP, REST, A2A, custom workers), pulls exactly what the task needs, and works with it under policy. Your data stays in its system of record; the agent brings the question to the data, not the other way around.
Example: "prep Q3 renewals." The agent queries NPSP for lapsing members, Intacct for payment status, and Cvent for recent engagement — live, today — and drafts the campaign. No warehouse, no 18 months, and the numbers are current because they came straight from the source.
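To make the pattern concrete, here is a minimal Python sketch of the "prep Q3 renewals" flow. Everything in it is an assumption for illustration: the three fetch functions are hypothetical stand-ins for governed connectors to the CRM, finance, and events systems, the sample records are invented, and the Policy class is one simple way to model an access allowlist, not any particular product's API.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical stand-ins for live connectors. A real deployment would call
# each system's own API here; the records below are invented sample data.
def fetch_lapsing_members() -> List[dict]:
    return [{"member_id": "M1", "name": "Ada"}, {"member_id": "M2", "name": "Lin"}]

def fetch_payment_status() -> Dict[str, str]:
    return {"M1": "paid", "M2": "overdue"}

def fetch_recent_engagement() -> Dict[str, int]:
    return {"M1": 3, "M2": 0}

@dataclass(frozen=True)
class Policy:
    """Assumed governance rule: the agent may only read allowlisted sources."""
    allowed_sources: frozenset

    def check(self, source: str) -> None:
        if source not in self.allowed_sources:
            raise PermissionError(f"policy denies access to {source}")

def prep_renewals(policy: Policy) -> List[dict]:
    # Check every source against policy before reading anything.
    for source in ("crm", "finance", "events"):
        policy.check(source)

    members = fetch_lapsing_members()
    payments = fetch_payment_status()
    engagement = fetch_recent_engagement()

    # Join in memory on member_id; nothing is copied into a warehouse.
    return [
        {
            "member_id": m["member_id"],
            "name": m["name"],
            "payment_status": payments.get(m["member_id"], "unknown"),
            "events_attended": engagement.get(m["member_id"], 0),
        }
        for m in members
    ]

if __name__ == "__main__":
    policy = Policy(frozenset({"crm", "finance", "events"}))
    for row in prep_renewals(policy):
        print(row)
```

The design choice worth noticing is that the join happens at question time, against whatever the source systems hold right now, and the policy check runs before any read. Removing a source from the allowlist stops the task instead of silently returning partial numbers.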
But what about messy data?
Data quality problems are real: messy, inconsistent, duplicated data genuinely holds AI back. The mistake is thinking you must fix all of it upfront. You don't. Agents can work with imperfect data and flag what doesn't reconcile, which tells you exactly which data is worth cleaning because it's actually in the path of a decision. You fix the data that matters, in priority order, as you go, instead of boiling the ocean before you've shipped anything.
Example: the reconciliation agent surfaces that 4% of member records have mismatched IDs across two systems. Now you have a targeted cleanup worth doing — not a blanket "clean everything" mandate that never ends.
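A reconciliation check of this kind can be quite small. The sketch below is one possible shape, under stated assumptions: records from the two systems are matched on email address (a hypothetical choice of join key), each system is represented as a plain dictionary of email to member ID, and the sample data is invented.

```python
from typing import Dict, List, Tuple

def reconcile_member_ids(
    crm: Dict[str, str], finance: Dict[str, str]
) -> Tuple[List[str], float]:
    """Compare member IDs across two systems, keyed by email (assumed join key).

    Returns the emails whose IDs disagree or are missing on one side,
    plus the mismatch rate over all emails seen in either system.
    """
    emails = sorted(set(crm) | set(finance))
    # A record is flagged when the IDs differ or one side has no record.
    flagged = [e for e in emails if crm.get(e) != finance.get(e)]
    rate = len(flagged) / len(emails) if emails else 0.0
    return flagged, rate

if __name__ == "__main__":
    # Invented sample data for illustration only.
    crm = {"a@example.org": "M1", "b@example.org": "M2", "c@example.org": "M3"}
    finance = {"a@example.org": "M1", "b@example.org": "M9"}
    flagged, rate = reconcile_member_ids(crm, finance)
    print(f"{len(flagged)} records to review ({rate:.0%} of members)")
    for email in flagged:
        print(" ", email, crm.get(email), finance.get(email))
```

The output is a short, specific list of records to review, which is what turns "clean everything" into a scoped cleanup task.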
When a warehouse is the right call
None of this means warehouses are bad. For genuine large-scale analytics — cross-year trend modeling, BI across millions of rows, ML on historical data — a warehouse or lakehouse (Snowflake, Databricks, BigQuery) is the right tool, and we build those too. The point is narrower: a warehouse is not a prerequisite for AI value. Don't let "we're not data-ready" be the reason you wait two years to start.
Your data is your DNA — and DNA works in place, in the living system. You don't have to extract it all into a lab first to put it to work.
In practice, that means three things: start where the data already is, govern the access, and let the value compound. See why the governance layer matters before you point agents at live systems.