The engineer, reimagined: AI-driven development at Parloa

12 March 2026
Masashi Beheim, VP Engineering

Nuno Marques, Staff Software Engineer

A year ago, our engineers wrote code. Today, they spend most of their time orchestrating swarms of agents that produce software. 

In less than 12 months, the center of gravity shifted from “how do I implement this?” to “how do I design and supervise a system of agents that will implement this for me?” 

The role didn't change gradually. It leaped. And it's still leaping. Every quarter, the balance between human work and agent work keeps tilting further toward the agents — and our practices, guardrails, and team structure keep evolving to keep up. 

For us at Parloa, this is more than a thought experiment. It’s how we ship production systems today. Our contact-center AI runs on code that was largely generated, reviewed, and hardened inside this agent ecosystem, under the supervision of engineers who now think in workflows, constraints, and risks rather than individual lines of code.

Here's the journey as we lived it. From autocomplete, to AI pair programmers, to fully instrumented “kitchen” environments and emerging autonomous agents, each step forced us to renegotiate what engineering ownership means, and how you build safely at AI speed.

How the engineer’s job changed

It started with code completion. GitHub Copilot, autocomplete, fill-in-the-blanks. The engineer was still firmly in the driver's seat. AI was basically a faster keyboard.

Then the driver's seat shifted. AI started writing whole functions, then whole files. The human moved from author to reviewer — still in the loop, but no longer holding the pen.

Next, AI broke out of the editor. It started showing up in standups, backlog grooming, data collection for product decisions, prototyping, automating toil. The entire software development lifecycle became its territory, not just the coding part.

Then came parallelism. One development stream became two, then five. And that's when we hit a wall: cognitive load. Multitasking is expensive for machines. It's even more expensive for humans.

The bottleneck was no longer typing speed, code quality, or even architecture. It was the human brain's ability to hold context across multiple parallel streams of AI-generated work. At any given moment, our engineers now have over 3x more PRs in flight than a year ago. That's the parallelism in action, and the source of the cognitive load problem. You can increase throughput with agents, but you can’t increase human working memory.

That realization forced us to redefine what “being an engineer” means in an AI-native organization.

From product-minded architects to founders

Today, most engineers don't primarily write code. They orchestrate a swarm of agents to produce software.

Half a year ago, we coined a term for what we thought the engineer was becoming: product-minded architect. AI handles coding, bug fixes, and implementation details, while architects make the higher-level technical decisions that keep the architecture sound and the design coherent. Human engineers decide what gets built and guide how it gets built because, after all, outcomes matter more than outputs.

Fast forward to today. Some engineers have moved past that. They see their role as QA: a quick manual check that things work, a brief sense-check that the output matches the intent. The agents have already reviewed for security, performance, coding standards, and best practices. The human confirms that the right thing got built. In other words: engineers own intent, coherence, and risk — the parts that don’t commoditize.

Where does this lead? Eventually, we all become founders. The cost and entry barrier of software development are approaching zero. Value comes from knowing what to build and being the best at it. A plethora of new applications will emerge. Every category will have one or very few winners. Since many more people will try, there will be many more losers. The best will win.

Engineers have considerably more leverage than ever before. Engineers become guardians of quality, of ethics, of what gets built and what doesn't. It's about taste and curation. Knowing what good looks like today, or what good will look like tomorrow. The leverage comes from judgment: choosing the right problems, setting the right constraints, and saying “no” to things that don’t meet the bar.

Culture eats tooling for breakfast

The tools are commoditized and accessible to all. Everyone has access to the same models, the same APIs, the same potential. What differs is how much of that potential gets materialized.

Mindset is what separates incremental gains from exponential impact, the difference between 2x and 100x productivity.

The AI champions in your engineering organization are among the most valuable people you have right now. They unlock that potential for everyone around them. If you don't know who your AI champions are, you don't have them yet.

At Parloa, we supported and formalized this emerging culture. Our AI coding environment is called "the kitchen": a development environment built around Claude Code. Last summer, we handed out Parloa-branded chef hats and aprons to people who were contributing to the kitchen and helping push innovation in our SDLC. It was a way to bring people together on the AI transformation journey and get them excited about this new era. It already feels like a long time ago. But it was fun, and it made the shift visible and social rather than something that happened quietly at individual desks.

The metaphor is deliberate. In a real kitchen, the value is in preparation, shared techniques, and consistent standards. Our “kitchen” works the same way: engineers contribute tools, templates, and guardrails, and everyone cooks faster and more safely on top of that shared base.

We are not limited by cost or tooling; we spend weeks training people to use the tools effectively. We run accelerator weeks and “Show me how to cook” sessions where we share our best AI practices. They bring us together and acknowledge that everyone is on the same journey, just at different stages. In these sessions, senior engineers and AI champions live‑demo real workflows: spinning up services, refactoring systems, hardening security — all in the kitchen. People don’t just see features; they see how work actually changes.

The tools will keep changing. The models will keep improving. But a team that doesn't have the mindset to rethink how work gets done will plateau at autocomplete, no matter what you put in front of them. For enterprise buyers, this is the difference between “we use AI” and “AI changes our delivery model.”

Inside AI-driven development at Parloa

The first engineers at Parloa who shifted to 100% AI-generated code did so last July. They were already at 80–90% with Cursor, but Claude Code with Skills made the jump. Some engineers who started in January haven't written a single line of code manually. Not because they can't, but because they don't need to. They still understand the code deeply; they just spend their time specifying, reviewing, and refining instead of typing.

Getting here wasn't straightforward. We tried shared rules first, a repository of prompts and conventions. It broke under its own weight: distribution was manual, context windows couldn't hold everything at once, and every compaction lost nuance.

That’s why we rebuilt it from scratch. The result is what we now call the kitchen: built around Claude Code, it grew to over 500 tools, 50 skills, and 17 plugins in under a year, covering security scanning, reliability guardrails, architecture validation, coding standards, test generation, PR review, sprint ceremonies, and threat modeling. Every engineer uses it daily. New joiners learn how to work in this environment from day one; it’s not an optional side tool, it’s the default path to shipping.

An engineer shipped a complete feature in under two weeks using this ecosystem. That scope was previously a quarter's work for a team. We see the same pattern elsewhere: what used to be “project-sized” work now fits comfortably into a sprint when agents and guardrails carry most of the execution.

Code reviews also shifted completely. Pair programming happens by default with the AI. Last year, we still reviewed PRs line by line. As our kitchen and guardrails evolved, we started skimming. PRs got significantly longer, and we only gave them an overall sense-check.

Today, engineers review a much smaller surface. Through investment in verification gates — security scans, best-practice enforcement, coding-standard compliance — and a well-tuned harness with strong context engineering, agents handle the bulk of what used to be manual review. Engineers focus their attention on architecture decisions, business logic, and edge cases that automated evals can't catch. The review didn't disappear. The scope shrank. Humans still review every change that matters, but they do it with far more signal and far less noise.
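A gated pipeline like this can be sketched in a few lines. The gate names and checks below are purely illustrative assumptions (Parloa's actual kitchen tooling is internal); the point is the shape: automated gates run first, and a PR only reaches a human once every gate passes.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class GateResult:
    gate: str
    passed: bool
    findings: list[str] = field(default_factory=list)

# Each gate is a function from a diff to a GateResult.
Gate = Callable[[str], GateResult]

def security_scan(diff: str) -> GateResult:
    # Hypothetical check: flag an obviously dangerous pattern.
    findings = [line for line in diff.splitlines() if "eval(" in line]
    return GateResult("security", not findings, findings)

def standards_check(diff: str) -> GateResult:
    # Hypothetical check: enforce a simple whitespace convention.
    findings = [line for line in diff.splitlines() if "\t" in line]
    return GateResult("standards", not findings, findings)

def run_gates(diff: str, gates: list[Gate]) -> tuple[bool, list[GateResult]]:
    """Run all gates; the PR reaches human review only if every gate passes."""
    results = [gate(diff) for gate in gates]
    return all(r.passed for r in results), results

ok, results = run_gates("def f(x):\n    return x + 1\n",
                        [security_scan, standards_check])
# ok is True here: nothing for the human to triage,
# so review time goes to architecture, business logic, and edge cases.
```

The design choice worth noting: gates return structured findings rather than just pass/fail, so a failing PR comes back to the agent (or the human) with something actionable attached.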

The numbers confirm it: PR throughput has multiplied significantly. Cycle time decreased despite the increase in volume. However, time to first review rose slightly, because humans became the bottleneck. The old model of reviewing every line wasn't scaling. Engineers moved to sense-checking agent-validated output, and the metrics reflect that transition, not a decline in quality. For enterprise buyers, this is important: we traded busywork in review for higher-leverage scrutiny, not for lower standards.

That bottleneck pointed to a deeper problem: cognitive load is the real constraint now. When we refactored a system half a year ago, we optimized the pieces specifically for human reviewability. Producing across multiple development streams, each with many AI agents, can be extremely taxing on the human battery. We're learning to design work around cognitive sustainability and not just throughput. We now design systems and PRs to be “reviewable units,” not just “deployable units” — with clear narratives, smaller deltas where possible, and explicit boundaries between parallel streams.
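One way to make "reviewable units" concrete is to cap the delta any single review has to absorb. The sketch below is a toy heuristic under assumptions of our own (the 400-line budget and the greedy packing are illustrative, not Parloa's actual tooling): it packs changed files into review units that stay under a line budget.

```python
def split_into_reviewable_units(changed_files: list[tuple[str, int]],
                                max_lines: int = 400) -> list[list[str]]:
    """Greedily pack (path, delta) pairs into review units under a line budget.

    Illustrative heuristic only: largest deltas first, so oversized files
    end up isolated in their own unit instead of bloating a shared one.
    """
    units: list[list[str]] = []
    current: list[str] = []
    current_lines = 0
    for path, delta in sorted(changed_files, key=lambda f: -f[1]):
        if current and current_lines + delta > max_lines:
            units.append(current)
            current, current_lines = [], 0
        current.append(path)
        current_lines += delta
    if current:
        units.append(current)
    return units
```

Used on a change touching three files with deltas of 300, 200, and 100 lines and a 400-line budget, this yields two units instead of one 600-line review, which is the whole point: each unit fits in a reviewer's working memory.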

The bias has shifted to making a case through PoCs. When the cost of building something is reduced so much, the fastest way to convince someone is to build it. We are optimizing for more impact, not for cost. If a hypothesis can be tested cheaply with agents, we test it in code instead of in slides. That shortens decision cycles for both engineering leaders and business stakeholders.

The fundamentals still count

None of this changes the fundamentals. It amplifies them.

It's still outcome over output. If you were building the wrong things 20% of the time before and you're now 10x faster, you're producing 10x more wrong things. That’s a lot of wrong things. Having a strong product function that knows what to build and why is more important than ever, not less.

AI makes many more things possible. But less complexity is still better than more: fewer programming languages, fewer database technologies. At least for now. Simplicity was always a virtue. Speed just makes the consequences of ignoring it arrive faster.

The guardrails we've built into our kitchen aren't a response to AI being unreliable. They're a response to speed being dangerous without direction. Architecture standards, security checks, coding conventions exist because the engineering culture decided that velocity without discipline is waste. The tooling encodes the culture. Without the cultural decision first, the guardrails never get built. This is also how we build for enterprises: opinionated standards, explicit guardrails, and an internal expectation that “fast” and “safe” must coexist.

What comes next

We're working toward autonomous agents. One-shotting features from spec to production. Increasing agent runtime so they can work through problems without human intervention at every step.

The goal: agents that run continuously, not in batches you review later, but as a constant presence making small, incremental decisions around the clock. When confidence is high, they act. When it's not, they escalate in real time and you decide in minutes, not days.

Trust is earned progressively: an agent starts by recommending, graduates to acting with a window for human objection, and eventually operates autonomously but only after proving itself over hundreds of decisions. The human role shifts from doing the work to supervising the system and handling the exceptions the agents flag. We already follow this pattern internally, and it mirrors how we think about AI agents in production contact centers as well.
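That progression can be written down as a simple promotion policy. The thresholds below are assumptions for illustration, not Parloa's actual numbers; what matters is the shape: autonomy is a function of track record, and an agent can only move up after enough decisions at a high enough success rate.

```python
from enum import Enum

class Mode(Enum):
    RECOMMEND = "recommend"          # agent proposes; a human applies the change
    ACT_WITH_WINDOW = "act"          # agent acts; humans can object for a window
    AUTONOMOUS = "autonomous"        # agent acts without per-change review

def trust_mode(decisions: int, success_rate: float) -> Mode:
    """Map an agent's track record to its autonomy level.

    Illustrative promotion policy: the decision counts and success-rate
    thresholds here are hypothetical, chosen only to show the gradient.
    """
    if decisions >= 500 and success_rate >= 0.99:
        return Mode.AUTONOMOUS
    if decisions >= 100 and success_rate >= 0.95:
        return Mode.ACT_WITH_WINDOW
    return Mode.RECOMMEND
```

A useful property of this shape is that demotion is free: if an agent's success rate drops, the same function immediately routes it back to a more supervised mode on the next decision.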

AI moves beyond development into operations. KTLO (keep-the-lights-on) work trends toward zero as agents handle monitoring, incident response, and routine operational tasks. The engineering time freed up goes back into building. We are expanding this carefully, area by area, with the same emphasis on auditability, guardrails, and human override that enterprise buyers expect from production systems.

And the human side matters more, not less. We're optimizing for cognitive load and for a sustainable pace. The future of engineering work might involve more walking and thinking than sitting and typing. High dopamine from shipping fast is real, but so is the crash from context-switching across multiple parallel agent streams. We're learning what sustainable looks like in this new mode. We want engineers who can do this for years, not sprints, which is why we treat cognitive sustainability as an engineering problem.

The long arc is clear: from natural language to binaries. Programming languages were always an intermediary between human intent and machine execution — Assembly, C, Java, Python, each one a step closer to how humans think. AI is the next step. Eventually, the intermediary disappears. You describe what you want. The machine builds it.

The cost of creating software is approaching zero. The cost of knowing what to create never will. That’s where senior engineers and enterprise leaders come in: deciding what should exist, under which constraints, and with which safeguards. That is the real job of the new engineer.