When we talk about AI, it’s tempting to jump straight to use cases, models, and clever demos. But the hard truth I’ve learned — across Amazon, Microsoft, Visible, and now at Property Finder — is this:
If your data is a mess and your tech stack is brittle, AI will expose that faster than any audit.
This is why I often say: AI doesn’t start with LLMs. It starts with plumbing.
Before you can scale trust, automation, or decision augmentation, you need clean, contextual, and composable data. You need resilient infrastructure that scales without collapsing under load. And you need a team that treats infrastructure not as a cost center, but as the foundation of intelligence.
If that sounds unglamorous, that’s because it is. But every AI transformation I’ve seen succeed — and every one I’ve seen fail — has hinged on this single truth: you’re only as smart as your systems allow you to be.
The Oxygen Analogy
We’ve heard for years that “data is the new oil.” But oil is a commodity. It’s extracted, refined, and sold. In reality, for AI, a better analogy is this:
Data is the oxygen. Infrastructure is the lungs.
Without high-quality, accessible data flowing through the system, your models can’t breathe. And without the right architecture to transport and process that data, you’re choking before you even start.
A Harvard Business Review survey confirmed what many of us in the trenches already know: 91% of professionals agree a reliable data foundation is essential for AI, but only 55% feel confident in their company’s readiness. And nearly every executive I meet nods when I share this: 99% of ML projects hit a data quality problem at some point.
That’s not an edge case. That’s the default.
At Visible: Cloud as Our Accelerator
When I led the AI transformation at Visible, Verizon’s digital telco, we built an AI routing and prediction engine that resolved 70% of support tickets before a human ever got involved. It was one of the most operationally meaningful AI deployments I’ve seen — not flashy, but deeply effective.
But here’s the part no one talks about on stage: We were able to do it because our backend was clean and cloud-native.
We were on Google Cloud Platform (GCP), and our data was unified. Our infrastructure could scale elastically. When we wanted to train a model on millions of support interactions, we didn't wait on procurement or batch exports; we just did it. That infrastructure agility saved us months of delay. It gave our engineers and data scientists real freedom to explore, test, and deploy.
Infrastructure, not just algorithms, won us that outcome.
Why Property Finder Is Rebuilding Its Core
Fast forward to today. At Property Finder, we’re operating in a market where data inconsistency isn’t just a technical bug — it’s a trust risk.
Real estate is high-friction, high-emotion, and high-stakes. And yet, most of the ecosystem is built on fragmented data: partial listings, inconsistent taxonomy, missing geolocation tags, and legacy systems that don’t talk to each other.
So the first thing we did wasn’t launch an LLM. We started with unification:
- We restructured our data services team to own retrieval-ready pipelines and build real-time APIs.
- We enforced strict taxonomy standards across agents, listings, and user behavior.
- We invested in verified listings and SuperAgent attribution, not as product features, but as trust anchors for downstream AI.
- And we are building a foundational platform strategy that relies not on brittle ETLs but on event-driven data and modular services, ready for consumption by internal teams, partner APIs, and machine learning alike (sketched below).
Because the moment your AI model depends on manual exports or weekend QA to make sense of data — it’s already dead on arrival.
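To make "event-driven and retrieval-ready" concrete, here is a minimal sketch of what a validated listing event could look like. The field names, taxonomy values, and choice of Pydantic are illustrative assumptions, not our actual schema or stack:

```python
# Illustrative sketch only: field names, taxonomy values, and validation
# rules are hypothetical, not Property Finder's actual schema.
from datetime import datetime, timezone
from enum import Enum
from uuid import uuid4

from pydantic import BaseModel, Field, field_validator


class PropertyType(str, Enum):
    # A closed taxonomy: producers cannot invent their own labels.
    APARTMENT = "apartment"
    VILLA = "villa"
    TOWNHOUSE = "townhouse"


class ListingEvent(BaseModel):
    event_id: str = Field(default_factory=lambda: str(uuid4()))
    listing_id: str
    property_type: PropertyType  # enforced at write time, not in a weekend QA pass
    latitude: float = Field(ge=-90, le=90)    # no more missing geolocation tags
    longitude: float = Field(ge=-180, le=180)
    verified: bool               # the trust anchor downstream AI depends on
    occurred_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))

    @field_validator("listing_id")
    @classmethod
    def listing_id_non_empty(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("listing_id must be non-empty")
        return v


event = ListingEvent(
    listing_id="L-12345",
    property_type=PropertyType.APARTMENT,
    latitude=25.2048,
    longitude=55.2708,
    verified=True,
)
print(event.model_dump_json())  # ready to publish to a stream, a store, or an API
```

The point is not the library. It is that the taxonomy is enforced where the data is produced, so every consumer downstream, human or model, can trust what arrives.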
Why Most Companies Are Still Playing Catch-Up
Here’s what I see with most AI-hungry organizations today: the desire is strong, but the groundwork is shallow.
- 54% of leaders say they lack the infrastructure for AI.
- Over 92% cite data issues as their #1 barrier.
- And yet… billions are being spent on consultants and model training.
It’s like trying to install a smart home system in a house with faulty wiring. You can buy all the gadgets you want — the lights won’t work.
When I speak with other CTOs, many of whom are under pressure from their boards to “show AI progress,” I offer a simple metric:
Can your teams find, access, and trust your most critical datasets in under five minutes? If not, stop building models. Start fixing plumbing.
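If you want to make that test executable, here is one hedged sketch. It assumes a hypothetical dataset_registry table with owner, freshness, and row-count columns; the names are invented purely to make the check concrete:

```python
# Hypothetical five-minute trust check. The dataset_registry table and its
# columns are illustrative stand-ins, not a real catalog API.
import sqlite3  # stand-in for whatever warehouse or catalog you actually query
from datetime import datetime, timedelta, timezone

MAX_STALENESS = timedelta(hours=24)


def dataset_is_trustworthy(conn: sqlite3.Connection, dataset: str) -> bool:
    row = conn.execute(
        "SELECT owner, last_updated_utc, row_count "
        "FROM dataset_registry WHERE name = ?",
        (dataset,),
    ).fetchone()
    if row is None:
        return False  # you cannot even *find* it: the five-minute test fails here
    owner, last_updated_utc, row_count = row
    # last_updated_utc is assumed to be an ISO-8601 string with a UTC offset
    fresh = datetime.now(timezone.utc) - datetime.fromisoformat(last_updated_utc) < MAX_STALENESS
    return bool(owner) and fresh and row_count > 0
```

If a check this simple cannot pass for your most critical datasets, no model built on top of them will.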
The Infrastructure Stack That Scales with AI
So what does “AI-ready” architecture look like in practice? In my experience, it means four key pillars:
- Unified Data Platform: A single source of truth — or at least a virtual one. This can be a data lake, a lakehouse, or a federated query layer. But it must abstract the fragmentation without compromising lineage.
- MLOps Pipelines: Reproducibility is key. You need versioned models, automated deployment paths, CI/CD for ML, and a robust feedback loop to retrain or retire models as behavior shifts.
- Real-Time Stream Processing: Batch AI is fine for historical reports. But decision augmentation (recommendations, alerts, dynamic scoring) needs event-driven, stream-first architecture. Kafka, Flink, Pub/Sub — pick your poison, but make it real-time.
- Governance-as-Code: You can’t bolt on compliance later. The best teams are now baking data lineage, bias audits, and explainability protocols into their build processes — much like DevSecOps.
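To ground the third pillar, here is a minimal stream-first scoring loop. It uses the kafka-python client for illustration; the topic names, broker address, and score_listing() stub are assumptions, and Flink or Pub/Sub would serve the same role:

```python
# Minimal stream-first scoring loop. Topics, broker, and score_listing()
# are illustrative assumptions, not a production design.
import json

from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "listings.events",  # hypothetical topic of validated listing events
    bootstrap_servers="localhost:9092",
    group_id="listing-scoring",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda d: json.dumps(d).encode("utf-8"),
)


def score_listing(event: dict) -> float:
    """Placeholder for a real model call; here, verified listings simply score higher."""
    return 0.9 if event.get("verified") else 0.4


for msg in consumer:
    event = msg.value
    # Score the moment the event arrives: no batch export, no nightly window.
    producer.send("listings.scored", {**event, "quality_score": score_listing(event)})
```

The design point: scoring happens per event, at arrival time, so there is no nightly batch sitting between a signal and a decision.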
We're drawing on this exact blueprint at Property Finder. And we're not doing it all ourselves: we're partnering with Amazon Bedrock, Anthropic, and experts at UC Berkeley and the University of Toronto to integrate the best expertise into our platforms on our terms, not the other way around.
The Legacy Challenge and the Cultural One
Every established company has skeletons in its tech closet — legacy systems, hand-coded ETLs, siloed CRMs. The question isn’t whether they exist. It’s whether you’ve named them, mapped the dependencies, and created bridges, not excuses.
At Microsoft, one of our hardest transformations was connecting legacy SharePoint and Exchange infrastructures with new cloud-native analytics layers. It wasn’t glamorous. But without it, Office 365 never would’ve scaled.
And beyond the tech debt, there’s the human one: IT teams working in isolation. Data teams chasing quarterly dashboards. No one really owning data as a product.
At PF, we fixed this by creating a data platform org that sits between infrastructure and insight — tasked with making data not just accessible, but useful. We are working on data catalogs, lineage tracking, and self-serve layers so even business users can query the system without a support ticket.
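As an illustration of what "data as a product" means at the record level, here is a hypothetical catalog entry with lineage attached. The structure and names are invented for this sketch, not taken from our internal catalog:

```python
# Hypothetical catalog entry: the fields show what "data as a product"
# records, not what any particular catalog tool exposes.
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str
    owner: str                   # a named team, not a ticket queue
    description: str
    upstream: list[str] = field(default_factory=list)  # lineage: where it comes from
    sla_hours: int = 24          # a freshness promise to consumers


verified_listings = CatalogEntry(
    name="analytics.verified_listings",
    owner="data-platform",
    description="Listings that passed verification, one row per listing.",
    upstream=["raw.listing_events", "raw.verification_results"],
)

# A business user can now answer "where does this come from, who owns it,
# how fresh should it be?" without filing a support ticket.
print(verified_listings.owner, verified_listings.upstream, verified_listings.sla_hours)
```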
This isn’t just about pipelines. It’s about creating a culture where data is respected — not feared.
My Final Advice for Leaders Reading This
If you’re a CEO, CTO, or business leader under pressure to “do something in AI,” here’s my plea:
Don’t start with the demo. Start with the foundation.
Make one investment this quarter: diagnose your data infrastructure. Identify where your AI ambitions are being quietly throttled by broken plumbing. And treat infrastructure upgrades not as tech-debt cleanup but as the first step in your AI go-to-market plan.
Because AI is not just code. It’s context.
And context starts with data that is clean, connected, and composable.
TL;DR (But Make It Honest)
- AI readiness = data + infrastructure + discipline. Don’t let anyone sell you otherwise.
- Clean data is not a byproduct — it’s the product before the product.
- Cloud-native, event-driven, governed systems are the only way to scale AI responsibly.
- Fix the house before you buy the furniture.