
What Agent-First Development Looks Like in Enterprise Data Teams


Anthropic’s top engineers don’t write most of their code anymore. The codebase is largely built by Claude Code. That’s not a line from a press release — it’s a statement from the people building one of the most scrutinized AI companies in the world, backed by internal research Anthropic has published. Their own data shows employees went from using Claude in 28% of their daily work to 59% in under twelve months. Self-reported productivity gains went from 20% to 50% in the same period. Across engineering, 27% of Claude-assisted tasks were things that simply would not have been completed otherwise. Not done faster. Not done cheaper. Not done at all.

Now ask yourself: what is stopping your data team from doing the same thing?

If your answer involves infrastructure constraints, governance, or organizational readiness — you’re not wrong. But you’re probably closer to the edge than you think. Agent-first development is already running inside traditional enterprises on existing infrastructure. The gap between where most data teams are today and where they could be is not primarily a technology problem. It’s an adoption problem. And that gap is five iterations wide.

The last mile problem nobody talks about

The 80% data-cleaning statistic has been cited at every analytics conference for a decade. Everyone who’s lived it knows it’s directionally right but imprecise. The more useful number: senior data scientists, the ones your organization pays between $180,000 and $250,000 a year, routinely spend somewhere between 30 and 40 percent of their time on work that doesn’t require senior data science judgment. Formatting executive decks. Writing documentation no one reads until something breaks. Building reports from scratch because the stakeholder needed it in a slightly different cut than last quarter’s version.

These numbers vary by team and by organization — your situation will differ. But the pattern is consistent enough to name: the analyst spent the morning doing the analysis. They’re spending the afternoon packaging it.

That is not a productivity problem. It’s a capital misallocation problem.

The analysis exists. The insight exists. Your senior data scientist ran the query, interpreted the output, and formed a clear point of view before lunch. What they’re doing from lunch until end of day is building the artifact that makes the insight visible to the people who need to act on it. The bottleneck is rarely the thinking. The bottleneck is almost always the packaging.

This is the inefficiency that agent-first workflows actually solve — and it’s the one most enterprise AI conversations miss entirely. Organizations get excited about AI doing analysis. The larger opportunity is AI handling everything between finishing the analysis and the decision-maker seeing it.

What agent-first actually means in practice

Let’s be precise, because “agent-first” has been stretched to cover things it shouldn’t.

It does not mean replacing your data scientists. It means restructuring what they focus on. The human stays at the strategic layer: setting objectives, evaluating outputs, making judgment calls about what matters. The agent handles execution: generation, formatting, iteration, documentation. A division of labor, not a substitution.

The Anthropic example is worth looking at beyond engineering. Non-engineering teams — security, alignment, support, operations — are increasingly using Claude to handle work that previously required technical dependencies. Security analysts are using it to navigate unfamiliar codebases. Non-technical employees are resolving operational problems that previously required routing to a developer. The pattern isn’t “AI does everything.” It’s “AI lets each person execute without pulling in someone else.”

That last part matters most in a traditional enterprise. The biggest tax on analytical teams isn’t headcount — it’s dependency chains. The analyst who needs a developer to build the report. The data scientist who needs a designer to make the deck presentable. Agent-first workflows collapse those chains. Each person becomes more capable across functions than their job description says they should be.

Five iterations to production: the honest account

Here’s what the sales pitch leaves out: the first iteration rarely works cleanly.

The first pass usually produces something structurally correct and tonally wrong. The framework of the analysis is there, the insight is framed, but the formatting is off, the vocabulary doesn’t match internal standards, and the chart labels are too small for the screen it’ll be presented on.

That’s iteration one. An observation, not a failure.

The second iteration, you’re specific about what’s wrong. Not “fix the formatting” — that’s a complaint, not a direction. “The executive summary should lead with the business implication, not the methodology. Move the confidence interval explanation to an appendix.” The agent produces a revised version in two minutes.

Third iteration: the visualization. A bar chart for a trend conversation doesn’t communicate movement — it communicates magnitude. You ask for a line chart with a rolling average overlay. The agent builds it. You evaluate it against what the slide needs to do.

Fourth iteration: the language. “Statistically significant uplift” is not what an executive audience needs. They need “the change was real and here’s what it means for the decision you’re about to make.” You tighten the narrative.

Fifth iteration: the final check. Does this match the format the leadership team uses? Does the title answer the question or raise it? Is there a clear ask on the last slide?

For a well-scoped deliverable, that process runs under thirty minutes on a productive day — though ambiguous inputs and complex tasks take longer, especially when early direction is unclear. What changes is not the number of iterations but their cost. Each cycle is minutes, not hours, and the human is directing, not executing. The quality of the output scales directly with the quality of the judgment being applied — which is why this amplifies experienced practitioners rather than bypassing them.

Why existing infrastructure changes the conversation

Here’s the part that surprises most enterprise technology leaders: none of this requires a new licensing agreement.

The workflow above can run on Databricks using a model the organization is already running. Not a new cloud contract. Not a separate AI procurement. The existing data infrastructure, the existing output format, the existing model. The agent layer sits on top of what’s already there.

Agent Skills

What makes this portable is how the skill layer is built. In the Agent Skills paradigm, a skill is a small, version-controlled package the agent can discover and load on demand — typically a folder with a required SKILL.md file (YAML metadata + Markdown instructions), plus optional resources (scripts, templates, reference docs). This matters because it enables progressive disclosure: instead of stuffing every procedure into one giant system prompt, the agent starts with a lightweight “menu” of skill names and descriptions (discovery), loads full instructions only when a task matches (activation), then follows the procedure and pulls in any bundled resources as needed (execution).
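The discovery/activation split above can be sketched in a few lines of Python. This is a minimal illustration, not a real agent framework API — the class, function names, and matching logic are invented for this example; the point is that full instructions stay out of context until a task matches.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Skill:
    # Hypothetical in-memory stand-in for a skill package's SKILL.md
    name: str
    description: str   # loaded at startup: cheap, always in context
    instructions: str  # loaded only on activation: kept out of context until needed

SKILLS = [
    Skill(
        name="exec-readout",
        description="Turn an analysis notebook into a leadership-ready readout",
        instructions="1. Lead with the business implication.\n2. Appendix for methodology.",
    ),
    Skill(
        name="weekly-report",
        description="Assemble the standing weekly metrics report",
        instructions="1. Pull last week's numbers.\n2. Use the standard template.",
    ),
]

def discover() -> str:
    """Phase 1: the lightweight 'menu' of names and descriptions."""
    return "\n".join(f"- {s.name}: {s.description}" for s in SKILLS)

def activate(task: str) -> Optional[Skill]:
    """Phase 2: load full instructions only when the task matches a skill name."""
    for s in SKILLS:
        if s.name.replace("-", " ") in task.lower():
            return s  # in a real system, this is where SKILL.md would be read
    return None
```

A real implementation would read the name and description from each skill folder’s YAML metadata instead of hardcoding them, but the control flow — menu first, full instructions on demand — is the same.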

A concrete example: your team might have an “exec-readout” skill that tells the agent exactly how to turn an analysis notebook into the artifact your leadership expects — what the first paragraph must answer, the three bullets that always appear, what gets pushed to an appendix, and the final “decision ask” format. Another analyst doesn’t need to remember the conventions (or re-prompt for them). The agent loads that skill only when the request looks like an executive readout, producing output that’s consistent across people and quarters.
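Following the SKILL.md convention described above (YAML metadata plus Markdown instructions), such a skill might look like this. The file contents are invented for illustration; only the structure follows the stated format.

```markdown
---
name: exec-readout
description: Turn an analysis notebook into the standard leadership readout.
  Use when the request asks for an executive summary, readout, or board slide.
---

# Executive readout procedure

1. The first paragraph answers the business question directly. No methodology.
2. Exactly three supporting bullets, each tied to a number from the analysis.
3. Confidence intervals, caveats, and query details go to an appendix.
4. The final slide contains a single, explicit decision ask.
```

Because this is a plain file, the conventions travel with it: anyone can read it, review a diff when it changes, and get the same output regardless of who prompted the agent.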

To be clear: a markdown file is not a governance framework. Regulated industries still need access controls, audit logging, and model documentation before these workflows touch production data. That’s real work, and it shouldn’t be skipped. The point of packaging skills as plain files is auditability (anyone can read what the agent was told), maintainability (diffs, reviews, versioning), and interoperability (the same skill package can be used across skills-compatible agent products). The technical foundation runs on infrastructure you already have — governance is an obligation, not a reason to wait.

Organizations waiting for a fully governed, vendor-packaged enterprise solution are paying a cost while they wait. The gap between where they are and where they could be is measured in analyst-hours and decision latency. It compounds every quarter.

The capacity reallocation argument

Stop framing this as a productivity improvement.

The word “productivity” sends the conversation into the wrong room. It implies squeezing more throughput from the same process. More output, same people, lower cost per unit. That framing positions AI as an efficiency tool, which immediately puts you on defense with your team — because efficiency tools have historically meant headcount reductions. If your data scientists hear “AI productivity” and start updating their LinkedIn, you’ve already lost the adoption battle before it starts.

The right frame is capacity reallocation.

Your senior data scientists have a fixed number of hours. A meaningful share of those hours currently goes to work that doesn’t require their expertise. If an agent workflow redirects that time, you don’t have a more productive team in the narrow sense. You have a team whose capacity is pointed at harder problems — the revenue questions, the forecast models, the cost optimization analyses that have been sitting in the backlog for months because no one gets to them.

Same headcount. Higher ceiling on what they deliver. And the work is more interesting, which matters if you want to keep the people who are good enough to leave.


Present it to your board that way: not cost reduction, but a different allocation of expensive analytical capacity toward the problems that move the business.

What traditional enterprises get wrong about adoption

The cultural distance between how AI adoption gets sold and how it actually spreads in a large organization is real, and most companies walk straight into it.

The standard failure mode: leadership decides the organization will “adopt AI,” issues guidance, schedules training, measures compliance rates. Six months later, the tools are technically deployed and barely used. The organization has the vocabulary of AI adoption without anyone changing how they actually work. Compliance theater.

The way agent-first workflows actually spread is through one person who finds something that works and won’t stop talking about it.

One data analyst finds that an agent workflow cuts their weekly reporting cycle from two days to four hours. They show their manager. They show the person at the adjacent desk. They present it at the next team retrospective, unprompted. Within ninety days, variations of the same workflow are running in three other teams. No mandate. No training program. No steering committee. Just a visible result that was easy to evaluate and easy to copy.

This is how new practices spread in organizations too large for top-down mandates to create real behavior change. Directives travel down the org chart. Enthusiasm travels sideways. Peer influence is faster and it sticks.

Find the person who’s already experimenting. Give them a small amount of support and visibility — a place to demo what they’ve built, some air cover when compliance asks questions, a budget line for what they need. Then let the result do the work. One person, one workflow, one win that people can see for themselves. Everything else follows.

The question is when, not whether

Every data team will work this way eventually. That’s not a prediction requiring much conviction — it’s an observation about where the trajectory goes. The only real variable is timing.

The organizations that develop agent-first workflows in 2025 and 2026 have something the ones arriving in 2028 won’t have: two or three years of institutional learning. They know which workflow types translate well and which ones don’t. They know where iteration cycles break down. They know how to write skills that produce consistent output across different team members. That knowledge doesn’t come from a vendor. It builds as people use the tools, fail in small ways, and refine their approach. It compounds.

Gartner projects 40% of enterprise applications will include embedded AI agents by end of 2026. Recent research shows 96% of organizations are already using AI agents in some capacity — but only 23% have moved beyond a single function. The gap between “using” and “transformed by” is where structural advantage gets built.

Agent-first isn’t a transformation project. There’s no kickoff meeting, no eighteen-month roadmap. It’s a habit that compounds. One workflow at a time, one team at a time, until the way the organization does analytical work looks fundamentally different from how it looked three years earlier.

The enterprises that understand this earliest aren’t waiting for strategy to solidify. They’re already in iteration two.