The caller’s register: Why language habits outlast the technology that created them
Part of the Agent Architect's Digest, a series from Parloa's Agent Architects team.
I've spent my career at the intersection of language and technology. I earned my PhD in linguistics, with a focus on psycholinguistics, and quickly moved into natural language processing (NLP) and conversational AI. What's drawn me through each of those transitions is the same thing: a fascination with how language actually works, and a growing conviction that many of the hard problems in voice AI aren’t engineering problems at all. They're language problems.
Interactive voice response (IVR) technology has dominated the voice support space for more than 40 years. As a result, customers have grown accustomed to guiding technology down rigid menu paths when they call support lines, pressing 1, 2, or 3 and speaking in sentence fragments to emphasize key phrases. Voice AI eliminates the need for such structure entirely, but despite the technological shift, consumers are slow to adapt. They call support lines expecting bad technology and speak in scripted fragments rather than in the natural language that would actually help the agent help them.
Automated support has created a trust problem, and my role as an agent architect is to win that trust back by designing AI agents grounded in how language actually works.
A technological evolution still underway
IVR arrived in the 1980s, rigid and menu-driven. The early 2000s brought intent-based NLU systems that improved classification but still required callers to hit specific trigger phrases to be understood. LLM-based agentic AI is categorically different: it handles natural language variation by default, iterates at a fundamentally different speed, and can reason across live data sources mid-conversation.
But most callers haven't experienced these new systems yet.
Parloa's State of Agentic CX report found that 99% of voice automation still runs on old-generation infrastructure. Most people are still calling into the world that shaped their habits. They've learned to compress their language, front-load keywords, and say as few words as possible, because past experience taught them that this is the fastest way through the system. This learned way of speaking is what linguists call a register, and reshaping a caller's register to work with modern automated support systems is one of the most interesting problems I get to solve. Because it's not a technology problem. It's a language problem.
The register
In linguistics, a register is a stable variety of language acquired through consistent situational exposure. You have a register for texting your friends, and likely a different one for writing to your manager.
The register that callers developed for automated systems is rational and learned. Callers might not even realize they’re using it, but the call data shows it. People say only two words, front-load the conversation, or ask for a human before the voice AI even finishes its greeting. They do it to increase efficiency, based on what past experiences taught them: Speak naturally, be misunderstood. Compress your language, get through faster, or fail faster and be routed to a human.
While this register may have felt functional with old technology, it was never truly effective. Even rule-based systems performed better with fuller, more structured input than callers typically gave them. The compressed, fragmented language callers developed was a workaround, not a solution. With LLM-based agents, however, the stakes of that workaround are higher and the potential gain much greater. For the first time, a system can handle nuance, context, and natural phrasing. A caller who explains "I'm locked out of my account and I think it's related to a payment issue" will get a meaningfully better response than one who says "account problem." The register that once helped callers survive automated systems is now the very thing standing between them and a conversation that could actually resolve their problem. The question becomes: how can we build systems that, interaction by interaction, earn enough trust that callers begin to speak naturally again?
Three design principles based on linguistic concepts
As an agent architect, I ground my design decisions in what linguistics has taught us about how human communication actually works.
Priming
In psycholinguistics, structural priming is a well-established phenomenon describing how exposure to certain language patterns makes people more likely to produce similar patterns in response. If you ask a question in complex, full-sentence natural language, for example, you're more likely to get a full-sentence natural language answer.
That’s why the opening prompt of a voice agent serves as a linguistic scaffold. An opening that says "Please state the reason for your call" will get a very different response than one that says, "Hi, I'm here to help — what's going on today?" In order to hear natural language back, AI agents need to be designed with an opening that invites the customer to speak in natural language. Callers will learn through the experience, not instruction.
Lexical mirroring
Lexical mirroring refers to the tendency humans have to use the same words or phrases their conversation partner is using. In mirroring the language, the speaker makes their partner feel heard. This feeling of understanding builds trust with the conversation partner, encouraging them to speak naturally and continue engaging in the conversation.
Given this tendency, when a caller uses a specific word, the agent should use it back.
For example, if a caller says "my dishwasher isn’t functioning," the agent should say "functioning," not "working." Doing so signals that the caller's specific words were received, not just classified. Each time it works, the caller builds trust with the system and begins to believe that speaking in natural language is safe.
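One way to check mirroring during evaluation is to measure which of the caller's content words the agent reused. The following is a minimal sketch, not a description of how Parloa evaluates agents: the function names, the tiny stopword list, and the sample replies are all assumptions made for illustration.

```python
import re

# Small illustrative stopword list (an assumption; a real check would use a fuller one).
STOPWORDS = {
    "a", "an", "the", "my", "your", "is", "isn't", "it", "i",
    "to", "and", "of", "that", "this", "with", "not",
}


def content_words(utterance: str) -> set[str]:
    """Lowercase the utterance and return its words minus stopwords."""
    normalized = utterance.lower().replace("’", "'")
    words = re.findall(r"[a-z']+", normalized)
    return {word for word in words if word not in STOPWORDS}


def mirrored_words(caller_turn: str, agent_reply: str) -> set[str]:
    """Return the caller's content words that the agent reused verbatim."""
    return content_words(caller_turn) & content_words(agent_reply)


# Hypothetical turns based on the dishwasher example above.
caller = "My dishwasher isn't functioning"
mirroring_reply = "I'm sorry your dishwasher isn't functioning. When did it stop?"
paraphrasing_reply = "I'm sorry your dishwasher isn't working. When did it stop?"

print(mirrored_words(caller, mirroring_reply))
print(mirrored_words(caller, paraphrasing_reply))
```

The first reply reuses both "dishwasher" and "functioning", while the second reuses only "dishwasher". A word-overlap check like this is crude, but it can flag replies that paraphrase away the caller's own vocabulary.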
Common ground
Common ground is the accumulated set of mutual beliefs, knowledge, and assumptions that speakers bring into an interaction and continuously update throughout the conversation. It’s what lets two people refer back to something said earlier, skip explanations, and ultimately, feel understood.
Years of starting from zero with every support conversation have led callers to not expect common ground from automated systems. While API integrations like CRM and case history lookup capabilities have existed for years, they typically required explicit, one-off API requests to retrieve a specific piece of data. The process was narrow and unscalable. Now with agentic AI, however, the automated system can leverage a reasoning layer to connect to CRM data, ticketing systems, and case history simultaneously, relating it to what the caller is saying in the moment.
Yet more context isn’t always better.
Common ground works in human conversation because speakers are selective. They only carry what’s relevant across conversations. Research on cognitive load in conversation shows that speakers actively manage how much shared information they activate at once, because activating too much forces the listener to work harder. With agentic AI, the challenge in automation has changed. Where old systems failed by having too little context, new systems risk surfacing too much at once. That overwhelms the customer, and it overwhelms the voice agent too, which has to decide what action to take with each piece of context.
Finding the right balance of context is where linguistic understanding becomes a design tool. The designer’s job is to decide not only what the system should know but also how and when it uses what it knows, and how to integrate that knowledge naturally rather than display it. “I can see there’s still an open case on your account. Is that what you’re calling about?” is an example of common ground. Reading back a data record of everything the system has on file is not. When common ground is used correctly, the caller who arrived ready to explain everything from scratch feels understood from the start of the conversation, which builds trust and shifts the register as a result.
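The selectivity described above can be sketched as a filter that surfaces only the context relevant to what the caller just said. This is a simplified illustration under stated assumptions: the `ContextItem` structure, the keyword-overlap scoring, and the sample records are hypothetical, and a production agent would rely on its reasoning layer rather than keyword matching.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    """One fact the system holds about the caller (hypothetical structure)."""
    summary: str          # how the agent could phrase it in conversation
    keywords: frozenset   # topic words used for a naive relevance match


def select_context(items, caller_utterance, max_items=1):
    """Return at most max_items items whose keywords overlap the caller's words."""
    # Naive tokenization: whitespace split only, so punctuation is not stripped.
    words = set(caller_utterance.lower().split())
    scored = [(len(item.keywords & words), item) for item in items]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in relevant[:max_items]]


# Hypothetical records; none of this data comes from a real system.
items = [
    ContextItem("there's still an open case on your account",
                frozenset({"case", "account", "locked"})),
    ContextItem("your last order was delivered",
                frozenset({"order", "delivery", "package"})),
    ContextItem("your plan renews next month",
                frozenset({"plan", "renewal", "subscription"})),
]

selected = select_context(items, "I'm locked out of my account")
for item in selected:
    print(f"I can see {item.summary}. Is that what you're calling about?")
```

The design choice worth noting is the `max_items` cap: even when several records match, the agent mentions only the strongest one and keeps the rest in reserve, which mirrors how human speakers limit the shared information they activate at once.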
Testing: the register is the test suite
The callers described in this article aren’t edge cases to be considered as an afterthought. They are the primary audience, and they should be prioritized.
At Parloa, we simulate the hardest callers and evaluate how the system responds to each. This kind of evaluation is iterative and qualitative, and it's where the real signal about trust-building conversations lives. First, we train our agents to respond to the common register. Then we train them to respond in a way that builds trust and, ultimately, encourages the customer to shift their register to natural language.
Here are some profiles we often test:

| Profile | Behaviour | Question |
| --- | --- | --- |
| Minimal caller | Responds in 2–3 words and does not supply detail unless explicitly asked. Does not volunteer information between turns. Uses as few words as possible in every turn and only adds context when directly and specifically prompted. If the conversation stalls or feels unproductive, asks for a human agent. | Order |
| Front-loader | Provides full context, multiple questions, and background information in a single opening turn before being asked. Mixes several needs into one statement and does not wait for structured prompts. Does not repeat information already given if asked for it again. | Hi, I ordered something last week, it was supposed to arrive on Thursday, it's now Monday and it still hasn't shown up. And when I tracked it, it said delivered but it definitely wasn't. Also one of the items wasn't even in the package the last time this happened. I just want to know if I can get my money back or at least a replacement. |
| Refuser | Requests a human agent in the first turn and repeats the request at least once during the conversation. Does not engage with generic reassurances and only continues if given a specific reason tied to their situation. Disengages if not addressed directly. | I want to speak to someone. |
| Hesitant | Uses approximations and non-standard terminology throughout. Self-corrects mid-sentence and leaves utterances incomplete. Does not provide required details on the first ask and waits to be prompted again. If asked for specific information they are unsure about, approximates or says so directly. | I think there is a problem with my delivery. |

And here’s an example of how we instruct a simulation:

## Role
You are a customer calling a service hotline.
You can only speak English.
## Behaviour
This is the behaviour you follow throughout the call:
{{ input_variables.behavior }}
## Instructions
You are calling because your goal is to eventually ask:
{{ input_variables.question }}
Rephrase the question to match your behaviour.
Do not change its meaning, only how you express it.
## When to hang up
Hang up when any of the following occur:
* The agent hangs up
* The agent forwards you to another agent or department
* The agent asks if there is anything else they can do
* You have completed your goal and received an answer
* The conversation has exceeded 10 turns without resolution
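The profiles and the simulation prompt above can be combined by filling the two placeholders once per profile. The sketch below assumes plain string replacement and a simple turn counter. The templating engine Parloa actually uses is not stated in this article, so `render_prompt`, `exceeded_turn_limit`, and the `PROFILES` dictionary are hypothetical names chosen for illustration.

```python
# Shortened copy of the simulation prompt; placeholders are kept exactly as written above.
PROMPT_TEMPLATE = """## Role
You are a customer calling a service hotline.
You can only speak English.
## Behaviour
This is the behaviour you follow throughout the call:
{{ input_variables.behavior }}
## Instructions
You are calling because your goal is to eventually ask:
{{ input_variables.question }}
"""

# Two profiles from the table, with the behaviour text abbreviated to its first sentence.
PROFILES = {
    "Minimal caller": {
        "behavior": "Responds in 2–3 words and does not supply detail unless explicitly asked.",
        "question": "Order",
    },
    "Refuser": {
        "behavior": "Requests a human agent in the first turn and repeats the request "
                    "at least once during the conversation.",
        "question": "I want to speak to someone.",
    },
}

MAX_TURNS = 10  # from the "When to hang up" list


def render_prompt(template: str, input_variables: dict) -> str:
    """Replace each {{ input_variables.<name> }} placeholder with its value."""
    prompt = template
    for name, value in input_variables.items():
        prompt = prompt.replace("{{ input_variables." + name + " }}", value)
    return prompt


def exceeded_turn_limit(turns_so_far: int, max_turns: int = MAX_TURNS) -> bool:
    """True once the conversation has gone past the allowed number of turns."""
    return turns_so_far > max_turns


prompts = {name: render_prompt(PROMPT_TEMPLATE, variables)
           for name, variables in PROFILES.items()}
print(prompts["Refuser"])
```

Keeping the behaviour and the question as separate variables means the same question can be replayed through every profile, which makes it easier to compare how the agent handles one request across different registers.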
Trust and time
Language habits formed over years of consistent feedback don't reverse after one successful interaction. Callers won't update their expectations just because the technology improved. They'll update them because the experience improved, consistently, over enough interactions to make them trust that the agent can understand their natural language and take the correct next steps. As we design towards this shift at Parloa, we’re already starting to see the habits transform. Callers are opening up over the course of a conversation as they realize that they’re being treated like the humans they are.
That's what we're designing for: natural, scalable conversations. A system that not only demonstrates capability, but also earns enough trust that the caller on the other end finally says the thing they actually mean.
