Agentic software engineering at scale: Parloa’s Claude Kitchen

7 May 2026

Author(s)

Pedro Castillo

Staff Data Engineer

I’m a data engineer, and I haven’t written a single line of code in over a year. With Parloa’s Claude Kitchen, a marketplace of plugins, skills, and tools, I don’t need to.

In April 2025, Parloa’s tech teams hosted an Accelerator Week dedicated to the CursorMeister challenge, in which participants built microservices from a set of requirements, with automated verification via contract testing. To my surprise, the fastest submissions of fully viable microservices came from a data scientist and a product analyst, not from engineers. This was the moment I realized that human and AI collaboration was the future of coding, and it became the impetus for an org-wide transition to agentic engineering.

Following Accelerator Week, Parloa’s product and engineering teams worked towards three significant milestones that turned us into an organization that now leads and scales with agentic software engineering:

May 2025: A curated set of “.cursorrules” files and Model Context Protocol (MCP) servers is shared across teams, establishing the Parloa AI-SDLC.

July 2025: Claude Code is rolled out across product and engineering teams.

October 2025: Parloa Claude Kitchen is created.

The three components to successful agentic engineering

At Parloa, we believe that good software is created from the perfect combination of tools, processes, and people.

Tools

We leverage a mix of agentic and classic tools in our software production. For agentic coding, we rely on Cursor and Claude. For project management and collaboration, we rely on our tried-and-true classics: Notion, Miro, Figma, and others. We also build tools for our tools.

Claude Kitchen

Claude Kitchen is a plugin marketplace that includes a set of plugins and skills (we call them recipes) that enhance the core Claude Code experience. It was created to establish best practices that ensure all agents are operating the Parloa way. In this kitchen, human engineers have a single source of truth from which they can access the skills and plugins they need to make their agents successful within the context of Parloa.

By leveraging skills from the kitchen, coding agents transition from generic executors to opinionated experts. If we have a special condition for how we code microservices, we put it in the kitchen. If we have a certain way we want to define APIs or determine how we do testing, we add it as a skill. Rather than listing guidelines that rarely get followed by humans, engineers add guardrail skills to agents in one click, streamlining processes and mitigating headaches.

To date, Claude Kitchen has 20+ plugins, 80+ skills, and 500+ tools. At that scale, MCP must evolve from a connector to infrastructure. Otherwise, putting hundreds of tools directly into every local agent context would bloat their context windows, duplicate authentication work, spin up redundant MCP servers across parallel sessions, and force every engineer to reinstall whenever a tool changes. To avoid all of these challenges, Parloa turned MCP into a gateway. With the gateway approach, engineers connect once, updates land centrally, and the organization can manage access, authentication, and security controls in one place.
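The gateway idea can be illustrated with a minimal sketch. This is not Parloa's implementation and does not use any real MCP library: `Tool`, `ToolGateway`, the team names, and the two example tools are all hypothetical, chosen only to show tools being registered once, centrally, and exposed to each caller according to an access rule.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Tool:
    name: str
    handler: Callable[[dict], str]
    allowed_teams: Set[str]  # hypothetical access rule, not from the article


@dataclass
class ToolGateway:
    """Hypothetical central registry standing in for an MCP gateway."""

    tools: Dict[str, Tool] = field(default_factory=dict)

    def register(self, tool: Tool) -> None:
        # Updates land centrally: re-registering replaces the tool for everyone.
        self.tools[tool.name] = tool

    def list_tools(self, team: str) -> List[str]:
        # Expose only the tools a team may use, keeping agent context small.
        return sorted(
            name for name, tool in self.tools.items() if team in tool.allowed_teams
        )

    def call(self, team: str, name: str, args: dict) -> str:
        tool = self.tools.get(name)
        if tool is None:
            raise KeyError(f"unknown tool: {name}")
        if team not in tool.allowed_teams:
            raise PermissionError(f"{team} may not call {name}")
        return tool.handler(args)


# Example tools and teams are invented for illustration.
gateway = ToolGateway()
gateway.register(
    Tool("search_tickets", lambda a: f"tickets matching {a['q']}", {"data", "platform"})
)
gateway.register(
    Tool("deploy_service", lambda a: f"deployed {a['service']}", {"platform"})
)

print(gateway.list_tools("data"))
print(gateway.call("data", "search_tickets", {"q": "latency"}))
```

In this sketch, access control lives in one place (`call`), so changing who may use a tool means changing one registry entry, with nothing to reinstall in each local agent.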

Processes

There are four key processes we keep top of mind with every engineering initiative. We believe every process should answer a question, and if the question cannot be answered, the initiative stops.

Prototype

At the prototype stage, engineers must ask themselves: could this work? If the answer is no, the initiative to build ends.

Prototypes start by leveraging AI to combine product requirement documents (PRDs), service catalogs, and meeting notes into a potential concept. Because a prototype only represents the product and is not the product itself, vibe coding is allowed at this stage. With each prototype come discovery notes that capture the successes and failures of each iteration, guiding the agent towards a viable product.

Diagram of the processes Parloa's Claude Kitchen follows.

Planning

After vibe coding a prototype, a new question emerges: how will we actually do it? Here, the PRD and discovery notes guide what Parloa calls “pair planning.” As a team, the engineers collaborate with Claude to iterate on how this product will be executed, finding the right process to produce a request for comments (RFC) and an epic. Through this collaborative approach, we ensure all options are explored and the best plan is put into place.

The planning phase of Parloa's Claude Kitchen

Implementation

The implementation phase includes four questions that all lead to the overarching question: did we make this a viable product?

What are we building?

Create a spec from the ticket, PRD, and RFC. To avoid slop, we aim to be as specific as possible with our spec.

Did we make it real?

Take the idea and build an actual product from it.

Does it work?

Test the product for functionality.

Is what we built what we asked for?

After creating and testing the product, go back to the PRD, RFC, and ticket and compare what the agent built to the original goal in mind.

Parloa designs the implementation phase as agentic handoffs, so the system operates in a loop rather than one long-running chat. Each implementation phase (spec, build, test, and review) has at least one context window, and a main development-flow agent orchestrates each Claude Code subagent as a phase. The first subagent classifies the request as a feature, bug, or chore; the second leverages a stronger model for planning; the third utilizes a different model for building; and the fourth uses a cheaper or faster model for test loops.
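The handoff pattern described above can be sketched as follows. This is a simplified illustration and makes no claim about Parloa's actual orchestration code. The model tier names in `PHASE_MODELS`, the keyword-based `classify` stand-in (the article uses a subagent for this step), and the `run_subagent` callback are all assumptions. The point it shows is that each phase receives only the previous phase's summary, never the full chat history.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical model tiers; the article does not name specific models.
PHASE_MODELS: Dict[str, str] = {
    "classify": "fast-model",
    "plan": "strong-model",
    "build": "build-model",
    "test": "cheap-model",
}


@dataclass
class Handoff:
    phase: str
    model: str
    summary: str


def classify(request: str) -> str:
    # Keyword stand-in for the classifier subagent (assumption for this sketch).
    text = request.lower()
    if "fix" in text or "bug" in text:
        return "bug"
    if "upgrade" in text or "cleanup" in text:
        return "chore"
    return "feature"


def run_flow(
    request: str, run_subagent: Callable[[str, str, str], str]
) -> List[Handoff]:
    kind = classify(request)
    handoffs = [Handoff("classify", PHASE_MODELS["classify"], kind)]
    context = f"{kind}: {request}"
    for phase in ("plan", "build", "test"):
        model = PHASE_MODELS[phase]
        summary = run_subagent(phase, model, context)
        handoffs.append(Handoff(phase, model, summary))
        # Fresh context per phase: only the summary is handed to the next one.
        context = summary
    return handoffs


def fake_subagent(phase: str, model: str, context: str) -> str:
    # Placeholder for a real subagent call.
    return f"{phase} done via {model} <- {context}"


for handoff in run_flow("Fix bug in login retry", fake_subagent):
    print(handoff.phase, handoff.model)
```

Routing each phase through a lookup table keeps the choice of model per phase in one place, which makes it cheap to swap a tier when a new model is worth testing.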

The test loop mechanics are explicit: run the test suite, inspect failures, spawn another agent to fix them, and repeat until everything passes.
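A minimal sketch of that loop, under stated assumptions: `run_tests` and `spawn_fixer` are hypothetical callbacks standing in for the real test runner and fixer agent, and the `max_rounds` cap is an added safety limit that the article does not mention.

```python
from typing import Callable, List


def run_test_loop(
    run_tests: Callable[[], List[str]],
    spawn_fixer: Callable[[List[str]], None],
    max_rounds: int = 5,  # assumed safety cap, not from the article
) -> int:
    """Run tests, hand failures to a fixer, and repeat until the suite passes.

    Returns the round number in which the suite first passed.
    """
    for round_number in range(1, max_rounds + 1):
        failures = run_tests()
        if not failures:
            return round_number
        # Each round spawns a new fixer that sees only the current failures.
        spawn_fixer(failures)
    raise RuntimeError(f"tests still failing after {max_rounds} rounds")


# Fake suite for illustration: the fixer repairs one failing test per round.
failing = ["test_auth", "test_billing"]


def fake_run_tests() -> List[str]:
    return list(failing)


def fake_fixer(failures: List[str]) -> None:
    failing.remove(failures[0])


rounds = run_test_loop(fake_run_tests, fake_fixer)
print(f"suite passed in round {rounds}")
```

Without a cap, a fixer that never converges would loop forever, so a bounded loop that raises an error gives a human a clear point to step in.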

For the UI of a product, the agent creates an end-to-end plan, clicks through the interface, captures screenshots, and verifies that the behavior matches the request. If the review does not match the spec or the screenshots do not prove the expected behavior, the workflow automatically loops back to the implementation phase to correct the problem without human intervention.

The output of the implementation phase is not only code. Each phase comes with review artifacts that make large changes easier to inspect: screenshots of the initial and final state, evidence that the new behavior was exercised, a technical implementation summary, new files and changed files, usage notes, configuration details, test results, and any caveats the human reviewer should know. Through this process, humans are able to review intent, proof, and impact of the product as a whole before diving into line-by-line code review.
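One way to picture these review artifacts is as a structured record that can be checked for completeness before a human looks at it. The sketch below is illustrative only. The field names mirror the list above, but the `ReviewArtifact` class, the choice of which fields count as required, and the example values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReviewArtifact:
    """Hypothetical bundle of evidence attached to one implementation phase."""

    summary: str
    screenshots: List[str] = field(default_factory=list)
    new_files: List[str] = field(default_factory=list)
    changed_files: List[str] = field(default_factory=list)
    usage_notes: str = ""
    configuration: str = ""
    test_results: str = ""
    caveats: List[str] = field(default_factory=list)

    def missing_evidence(self) -> List[str]:
        # Which items are mandatory is an assumption made for this sketch.
        gaps: List[str] = []
        if not self.summary.strip():
            gaps.append("technical implementation summary")
        if len(self.screenshots) < 2:
            gaps.append("initial and final screenshots")
        if not self.test_results.strip():
            gaps.append("test results")
        return gaps


# Example values are invented for illustration.
artifact = ReviewArtifact(
    summary="Adds retry to login client",
    screenshots=["before.png", "after.png"],
    changed_files=["client.py"],
    test_results="42 passed",
)
incomplete = ReviewArtifact(summary="Draft change")

print(artifact.missing_evidence())
print(incomplete.missing_evidence())
```

A check like `missing_evidence` lets the workflow send incomplete work back to the agent before it reaches a reviewer, so human attention goes to intent and proof rather than to chasing missing screenshots.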

Parloa's Claude Kitchen Implementation Phase

Maintenance

In the maintenance phase, we ask the question: how do we keep it good? At Parloa, we’ve found that effective maintenance comes not just from a set of reactive skills, but from a feedback loop. If agents make mistakes in plans, human teams create or improve plan-review and general-review skills. If mean time to repair (MTTR) increases, the human team connects agents to operational data and traces so that investigation becomes repeatable. Finally, we created postmortem skills to turn incidents into better delivery practices rather than one-off documents that rarely get referenced.

The same maintenance loop applies to Claude Kitchen itself. Maintenance includes recipe (skill) creation and improvement: encode the best practices, update them when new model guidance arrives, compare vanilla Claude against the Kitchen version, and require verification steps. When a recipe fails, that failure becomes context for another iteration. Maintenance is how production lessons become reusable agent behavior.

Parloa's Claude Kitchen Maintenance Phase

People

With the rise of agents in software engineering comes a transition in the role of software engineers. At Parloa, we believe that engineers must operate as product-minded architects who manage the production of code. This transition from execution to management brings a change in bottlenecks, as time and effort are no longer the issues. Rather, because everything happens so fast, humans need to manage parallel workstreams and be diligent in ensuring nothing slips through the cracks. Humans must focus on working smarter, not harder, and building tools that will help them do so.

Furthermore, collaboration is evolving. Rather than focusing on pair programming, engineering teams should focus on pair conceptualizing. We are building systems and full products, not just tasks, and we’re not just working alongside humans. We need to consider agents as our coworkers, and we need to build the scaffolding to support them.

Finally, as the agentic coding environment evolves rapidly, we need to give ourselves, and our teams, time to experiment and understand each new development. We should position ourselves as researchers, testing the innovations, comparing model capabilities, and ultimately, making the best decisions for our business.

Lessons against the slop

Through our transition to agentic engineering, five key lessons emerged:

Empower experimentation - From the top down, everyone should be testing. Patterns are being discovered in real time. Establish your own benchmarks.
Master the primitives - Don’t overcomplicate development. Focus on the five primitives: model, context, tools, prompt, harness. Consider your agent’s perspective: is what you’re asking them to do possible with the resources you’ve given them?
Service templates - Templates are the foundation of successful agents, as agents tend to replicate what already exists. Establish strict guardrails from the start to ensure security and consistency.
Feedback loops - Create closed-loop systems for agents to validate their work. Focus on workflows before anything else.
Code reviews need to change - Always start with a spec review. If there is disagreement from the start, reject the initiative. Review the proof, and leverage codebots to do the line-by-line security reviews for you.

What’s next

The move towards agentic software engineering has been tremendous over the past year, but it’s not done. We envision a world where these systems can run with even less human oversight, self-correcting in real time. By continuing to iterate on Claude Kitchen and allowing ourselves time to explore the next frontiers of AI innovation, we believe Parloa's systems will soon get there.
