Most Companies Aren't Agent-Native. They Just Use AI.
Every company I talk to these days says they’re “doing AI.” Their board asks about AI strategy. Their CEO mentions AI in earnings calls. Their marketing team has added “AI-powered” to the website.
But here’s what I actually see when I walk into an engineering org: most of them are using Copilot. Some of them have built chatbots or added a smart search feature. Maybe a few have spun up an internal agent or two. And almost all of them think they’re further along than they actually are.
The problem isn’t ambition. The problem is clarity. Everyone’s playing the same game, but nobody’s keeping score with the same rules.
At Seeko, we’ve worked with dozens of companies trying to figure out where they actually stand with AI, and we’ve noticed a pattern. There’s a spectrum. Most companies cluster between two levels, and they almost always overestimate which one they’re on. More importantly, there’s a jump somewhere in the middle that looks easy on paper but breaks organizations in practice.
Let me walk you through the Agent-Native Maturity Spectrum.
Level 0: You’re Using Traditional Software
This one’s straightforward. No AI. No ML pipelines. No language models integrated into your product. You’re building features the old way, with deterministic code paths and human-authored logic.
Honestly? Some companies should stay here longer than they do. Not every product needs an AI integration today.
Level 1: Your Developers Use AI
This is where most companies actually start, but they don’t call it that. A developer asks ChatGPT a question. Someone’s using Copilot for code completion. Your team is more productive, sure. But your architecture? Completely unchanged.
The key insight here: the AI isn’t part of your product. It’s not part of your infrastructure. It’s a tool that developers happen to use, like Stack Overflow or a debugger.
This is Level 1, and it’s honestly fine: the team productivity gains are real. But when someone asks “are you an AI-first company?” and you say yes because you’re using Copilot… that’s where the confusion starts.
Level 2: You’ve Added AI Features
Now things start looking real. You’ve built a smart search feature. You’ve added auto-categorization to your product. Maybe you’ve created an internal tool that uses an LLM to classify customer feedback or generate summaries.
You’ve done engineering work here. Real engineering work. You’ve integrated a model, built evaluation loops (probably), handled rate limiting, managed costs. Your architecture now has AI components in it.
The difference from Level 1: the AI is in your product now. Your customers see it. Your revenue depends on it. You’ve made architectural decisions to include it.
But here’s the thing that separates Level 2 from everything that comes after: you’re still in control. Your human engineers designed the workflows. Your product is still human-operated. The AI features support human decision-making, not replace it.
Most companies I talk to think they’re at Level 3. They’re usually at Level 2. And most of them can’t articulate what separates the two.
The Jump to Level 3: Where Everything Gets Harder
Here’s where the maturity spectrum stops being about which features you’ve shipped and starts being about fundamental engineering.
Level 3 is agent-assisted operations. Your agents handle defined workflows with human oversight. You’re not just adding an AI feature to your product anymore. You’re building systems where agents make decisions, take actions, and operate with some degree of autonomy.
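To make that concrete, here’s a minimal sketch of what a human-oversight gate can look like in practice. Everything in it (the Action type, the RISKY set, the $500 threshold) is hypothetical, not borrowed from any real framework:

```python
# A minimal sketch of Level 3: the agent proposes actions for a defined
# workflow, but anything risky waits for a human. All names here are
# hypothetical placeholders.

from dataclasses import dataclass, field

@dataclass
class Action:
    name: str
    params: dict = field(default_factory=dict)

RISKY = {"refund_customer", "delete_record", "email_external"}

def requires_approval(action: Action) -> bool:
    # The policy is human-authored and deterministic, even though the
    # actions themselves come from a model.
    return action.name in RISKY or action.params.get("amount", 0) > 500

def run_workflow(proposed_actions: list[Action]) -> None:
    for action in proposed_actions:
        if requires_approval(action):
            print(f"HOLD for human review: {action}")  # agent pauses here
        else:
            print(f"Executing: {action}")

run_workflow([
    Action("summarize_ticket", {"ticket_id": "T-1042"}),
    Action("refund_customer", {"amount": 120}),
])
```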
This sounds simple. It’s not.
To operate reliably with agents, you need infrastructure that most companies don’t have. Addy Osmani wrote about agentic engineering, and his framing is useful: you can’t just drop an agent into a system and expect it to work. You need eval frameworks to verify behavior. You need observability to understand what the agent actually did and why. You need state management patterns. You need auth models that work for agents, not just humans.
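Observability here doesn’t have to mean a big platform on day one. A sketch of the smallest useful version, assuming nothing beyond the standard library: wrap every tool call so it emits a structured trace event you can query later when you need to answer “what did the agent do, and why?”

```python
# Illustrative sketch: record every agent tool call as a structured
# trace event, so the agent's behavior can be reconstructed after the fact.

import functools
import json
import time

def traced(tool_fn):
    @functools.wraps(tool_fn)
    def wrapper(*args, **kwargs):
        event = {"tool": tool_fn.__name__, "args": args,
                 "kwargs": kwargs, "ts": time.time()}
        try:
            result = tool_fn(*args, **kwargs)
            event["status"] = "ok"
            return result
        except Exception as exc:
            event["status"] = f"error: {exc}"
            raise
        finally:
            # In production this would go to a tracing backend, not stdout.
            print(json.dumps(event, default=str))
    return wrapper

@traced
def search_orders(customer_id: str) -> list:
    return []  # stand-in for a real lookup the agent might call

search_orders("C-7731")
```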
More concretely: you need AGENTS.md files. You need codebases that agents can actually navigate and understand. Greg Brockman at OpenAI recommended that teams start writing AGENTS.md documentation specifically for agents to consume, not for human readers. You need eval infrastructure that catches drift before your agent breaks in production.
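And “eval infrastructure that catches drift” can start as a gate in your deploy pipeline: run a fixed suite of cases, compare the pass rate against a stored baseline, and block the deploy if it slips. A sketch with placeholder names and numbers throughout:

```python
# Sketch of a drift gate: block deployment if the agent's pass rate on a
# fixed eval suite drops more than a tolerance below the stored baseline.
# The baseline, tolerance, and agent call are all placeholders.

import sys

BASELINE_PASS_RATE = 0.92   # recorded from the last known-good release
TOLERANCE = 0.03

def agent_answer(prompt: str) -> str:
    return "stub"  # replace with a real agent call

def run_case(case: dict) -> bool:
    # Stand-in grading: exact match against the expected output. Real
    # suites might use rubric graders or a grading model instead.
    return agent_answer(case["input"]) == case["expected"]

def run_eval_suite(cases: list[dict]) -> float:
    passed = sum(1 for case in cases if run_case(case))
    return passed / len(cases)

cases = [{"input": "q1", "expected": "stub"},
         {"input": "q2", "expected": "stub"}]

rate = run_eval_suite(cases)
if rate < BASELINE_PASS_RATE - TOLERANCE:
    print(f"Eval drift: {rate:.2%} vs baseline {BASELINE_PASS_RATE:.2%}")
    sys.exit(1)  # fail the pipeline before the agent reaches production
print(f"Eval pass rate {rate:.2%}: OK to deploy")
```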
This is engineering work. Not feature work. Architecture work.
The jump from Level 2 to Level 3 is where 95% of agent pilots fail, according to data from Composio. Not because the agents aren’t smart enough. Because the organizations weren’t structured to support them. They built an agent, threw it at their codebase, and watched it hallucinate or degrade because nobody had built the observability to catch it.
(I wrote about this lifecycle in more detail in Agent-Native Systems Become More Deterministic Over Time, Not Less.)
Levels 4 and 5: Agent-Native by Design
Once you’re past Level 3, things get conceptually cleaner, though no less complex.
Level 4 is full agent-native architecture. Your system is designed for agent orchestration from the ground up. You’re running multi-agent systems with intelligent routing. You’re making cost-aware decisions about which model to use for which task. Your entire deployment pipeline is eval-driven. When you deploy, you’re verifying agent behavior at scale, not hoping for the best.
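Cost-aware routing, for example, often reduces to a small decision table: the cheapest model that can handle the task wins. A hypothetical sketch; the model names, prices, and difficulty scores are all placeholders, not real pricing:

```python
# Hypothetical sketch of cost-aware model routing: send each task to the
# cheapest model whose capability covers it.

MODELS = [
    {"name": "small-fast",  "cost_per_1k": 0.0002, "max_difficulty": 1},
    {"name": "mid-general", "cost_per_1k": 0.0030, "max_difficulty": 2},
    {"name": "large-smart", "cost_per_1k": 0.0150, "max_difficulty": 3},
]

def route(task: dict) -> str:
    """Pick the cheapest model capable enough for the task."""
    eligible = [m for m in MODELS if m["max_difficulty"] >= task["difficulty"]]
    return min(eligible, key=lambda m: m["cost_per_1k"])["name"]

print(route({"kind": "classify_ticket", "difficulty": 1}))  # -> small-fast
print(route({"kind": "multi_step_plan", "difficulty": 3}))  # -> large-smart
```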
Level 5 is where your organization itself becomes agent-native. Your team structure reflects agent capabilities. Your hiring, your career paths, your business model all account for the fact that agents are doing work that humans used to do. This is rare. I haven’t seen many companies here yet.
The Honest Take
Here’s what I want to say directly: you probably don’t need to be at Level 5. Most companies shouldn’t be. The framework isn’t a roadmap to higher numbers. It’s a clarity tool.
Some companies should stay at Level 2 forever. If you’ve built a product with AI features that your customers love and your unit economics work, you’re good. Don’t rebuild your architecture just because the spectrum goes up to 5.
The problem is knowing where you are.
I’ve sat in rooms with engineering leaders who were sure they were building Level 4 systems. They had a prototype agent that sometimes worked. They had aspirations to multi-agent orchestration. But when I asked about eval coverage, observability, state management, they had nothing. They were at Level 2, trying to bootstrap Level 4.
I’ve also sat with teams who were doing serious Level 3 work, building real agent-assisted operations with proper infrastructure, and they were underselling themselves. They thought they should be further along.
The LangChain State of AI Agents report captures this well. 89% of companies said they have observability in place. Only 52% actually have proper eval infrastructure. That gap tells you something: most teams have monitoring, but they don’t have the systematic verification that Level 3 actually requires.
What I See in Engagements
When Seeko walks into an organization, we ask one simple question: what can your agents do that you’ve actually verified they can do reliably?
The answer usually determines your level faster than any survey.
Level 1: “Well, our developers use AI tools.”

Level 2: “We’ve built features that use language models.”

Level 3: “We have agents handling specific workflows, and we know they work because we’ve tested them.”

Level 4: “We’ve designed our entire architecture around agent orchestration and we deploy changes based on eval results.”
The difference between a company that understands its level and a company that doesn’t is usually six months of engineering velocity.
If you’re at Level 2 and you know it, you can ship faster. You’re not building eval infrastructure you don’t need yet. You’re focused on the product.
If you’re at Level 2 but you think you’re at Level 4, you’re going to waste a lot of time. You’ll build complexity you don’t have the infrastructure to manage. You’ll deploy agents that degrade in ways you can’t see.
Where This Matters Most
The hardest transition is 2 to 3, but the most important thing is simply knowing where you are.
You don’t need to be at the highest level. You need to be at the right level with the right infrastructure.
If you’re building agent-assisted operations, you need observability. You need evals. You need to understand when your agent is working and when it’s not. That’s not negotiable at Level 3.
If you’re at Level 2, optimize for product velocity. Don’t prematurely build for orchestration you don’t need yet.
And if someone asks you what level your company is at, the honest answer is usually one level lower than you’d guess.
That’s not a failure. That’s clarity. And clarity is how you actually move up the spectrum.
Thinking through the same questions?
We help companies figure out what AI actually changes for their business — and build the systems to act on it. Strategy, architecture, automations.
Tell us what you're working on →