When many enterprises weren’t even thinking about agentic behavior and infrastructure, Booking.com had already “stumbled upon” them through its own conversational recommendation system. Those early experiments allowed the company to step back and avoid getting swept up in the hype around AI agents. Instead, it takes a disciplined, layered and modular approach to model development: small models built for speed, enabling low-cost, rapid inference; larger large language models (LLMs) for reasoning and understanding; and domain-specific evaluations, built in-house when precision is critical. Thanks to this hybrid strategy, combined with selective collaboration with OpenAI, Booking.com has doubled its accuracy on key search, ranking and customer-interaction tasks. Pranav Pathak, head of AI product development at Booking.com, told VentureBeat in a recent podcast: “Do you build a system that’s very, very specialized and bespoke and then you have an army of a hundred agents? Or do you keep it general enough and have five agents that are good at general tasks, but then you have to organize a lot around them? I think we’re still trying to find that balance, like the rest of the industry.” Listen to the full podcast episode here, and read on for the key takeaways.
Go from guesswork to deep personalization without being “creepy”
Recommendation systems are the backbone of Booking.com’s customer-facing platforms; however, conventional recommendation tools rely less on genuine recommendations and more on guesswork, Pathak admitted. That’s why, from the beginning, he and his team vowed to avoid generic tools: as he put it, pricing and recommendations should be grounded in the customer’s context. Booking.com’s initial pre-generative-AI tooling for detecting intent and topics was a small language model that Pathak described as “BERT scale and size.” The model extracted information from the customer about their problem to determine whether it could be resolved through self-service or human assistance. “We started with an architecture where you have to invoke the tool if that’s the intent you detect, and that’s how you parsed the structure,” Pathak explained. “It was very, very similar to the first few agent architectures that came out in terms of reasoning and defining a tool call.” His team has since expanded this architecture to include an LLM orchestrator that classifies queries, triggers retrieval-augmented generation (RAG), and calls APIs or smaller, specialized language models. “We were able to scale this system quite well because the architecture was so close that with a few tweaks we now have a full stack of agents,” Pathak said. As a result, Booking.com has seen a 2x increase in topic detection, which in turn frees up human-agent bandwidth by 1.5 to 1.7 times. More and more topics, even complex ones previously classified as “other” and requiring escalation, are being automated. Ultimately, this supports greater self-service, allowing human agents to focus on customers with highly specific problems for which the platform has no dedicated tool flow, for example a family who cannot access their hotel room at 2 a.m. when the reception is closed. Pathak noted that not only is this “really starting to pick up,” but it is also having a direct, long-term impact on customer retention.
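The orchestration pattern Pathak describes, an LLM-based router that classifies an incoming query and hands it off to RAG, an API call, or a human agent, can be sketched roughly as follows. This is an illustrative sketch only: the function names are hypothetical, and the keyword-based classifier is a toy stand-in for what would, in a real system, be a small “BERT-scale” intent model backed by real API and retrieval layers.

```python
# Minimal sketch of an intent-routing orchestrator. All names are
# hypothetical; real handlers would call actual model/API backends.

def classify_intent(query: str) -> str:
    """Toy stand-in for a small intent-detection model."""
    q = query.lower()
    if "cancel" in q or "refund" in q:
        return "api_call"   # structured, transactional request
    if "policy" in q or "how do i" in q:
        return "rag"        # likely answerable from documentation
    return "escalate"       # everything else goes to a human agent

def run_booking_api(query: str) -> str:
    # Placeholder for a structured API action (cancel, refund, etc.)
    return f"[api] executed structured action for: {query}"

def answer_with_rag(query: str) -> str:
    # Placeholder for retrieve-then-generate over help-center docs
    return f"[rag] retrieved docs and generated answer for: {query}"

def handle_query(query: str) -> str:
    """Route a customer query based on the detected intent."""
    intent = classify_intent(query)
    if intent == "api_call":
        return run_booking_api(query)
    if intent == "rag":
        return answer_with_rag(query)
    return "Routing to a human agent."
```

The key design point, as in the architecture Pathak describes, is that adding a new capability means adding one more branch (or agent) behind the classifier rather than rebuilding the stack.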
“One of the things we’ve seen is that the better we serve the customer, the more loyal our customers are.” Another recent implementation is personalized filtering. Booking.com has between 200 and 250 search filters on its site, an unrealistic number for any human to sift through, Pathak noted. So his team introduced a free-text box that users can type into to instantly receive customized filters. “This becomes an important signal for personalization in terms of what you’re looking for in your own words rather than in a stream of clicks,” Pathak said. In turn, it tells Booking.com what customers really want. Hot tubs, for example: when filter customization was first introduced, hot tubs were one of the most popular requests. This hadn’t even been considered before; there wasn’t even a filter for it. Now that filter is live. “I had no idea,” Pathak noted. “Honestly, I never looked for a hot tub in my room.” However, when it comes to personalization, there is a fine line; memory remains complicated, Pathak emphasized. While it’s important to build long-term memory and develop relationships with customers, retaining information such as their typical budget, preferred hotel star ratings or whether they need accessible accommodations, this must be done on their terms and with their privacy protected. Booking.com is extremely careful about memory and makes a point of obtaining consent so as not to be “creepy” when collecting information about customers. “Managing memory is much more difficult than actually building it,” Pathak said. “The technology is already available; we have the technical resources to build it. We want to make sure that we don’t release a memory feature that doesn’t respect the customer’s consent, because that doesn’t feel very natural.”
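The free-text-to-filters feature can be pictured as a mapping from a user’s own words onto the site’s structured filter catalog. A minimal sketch, assuming a hypothetical keyword catalog in place of the LLM or embedding model a production system would use:

```python
# Toy mapping from free text to structured filter IDs. The catalog and
# keywords are invented for illustration; in practice an LLM would do
# the matching against the site's 200-250 real filters.

FILTER_CATALOG = {
    "hot_tub": ["hot tub", "jacuzzi"],
    "pet_friendly": ["pet", "dog", "cat"],
    "pool": ["pool", "swimming"],
}

def filters_from_text(text: str) -> list[str]:
    """Return the filter IDs whose keywords appear in the user's free text."""
    t = text.lower()
    return [fid for fid, keywords in FILTER_CATALOG.items()
            if any(kw in t for kw in keywords)]
```

Beyond returning filters, the matched (and unmatched) phrases become a personalization signal: a flood of queries for a term with no corresponding filter, as happened with hot tubs, is evidence that a new filter is worth building.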
Finding the balance between building and buying
As agents mature, Booking.com faces a central question confronting the entire industry: how narrow should agents be? Rather than committing to a swarm of highly specialized agents or a handful of generalized ones, the company aims to make reversible decisions and avoids the “one-way doors” that lock its architecture into long-term, costly paths. Pathak’s strategy is as follows: generalize where possible, specialize where necessary, and keep agent design flexible to ensure resilience. Pathak and his team are “very attentive” to use cases, assessing where to build more generalized, reusable agents and where to build more task-specific ones. They strive to use the smallest possible model, with the highest accuracy and output quality, for each use case. Whatever can be generalized, is. Latency is another significant consideration. When factual accuracy and avoiding hallucinations are paramount, his team will use a larger, much slower model; but in search and recommendations, user expectations dictate speed. (“Nobody is patient,” Pathak noted.) “We would never use something as heavy as GPT-5 for topic detection or entity extraction, for example,” he said. Booking.com takes a similarly flexible approach to observability and evaluations: if it’s general-purpose observability that someone else builds better as a horizontal capability, they’ll buy it. But if brand guidelines need to be enforced, they’ll build their own evals. Ultimately, this approach keeps Booking.com agile and flexible. “At this point, with everything that’s going on with artificial intelligence, we’re a little reluctant to go through a one-way door,” Pathak said. “We want as many of our decisions as possible to be reversible. We don’t want to get stuck in a decision that we won’t be able to reverse in two years.”
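The “smallest possible model that still clears the accuracy bar” policy can be sketched as a simple selection over per-task eval scores and latency budgets. The model names and numbers below are invented for illustration; the point is the selection logic, not the figures.

```python
# Sketch of smallest-viable-model routing: pick the fastest model that
# meets both an accuracy floor and a latency budget. Values are invented.

from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelOption:
    name: str
    accuracy: float    # offline eval score for this task, 0..1
    latency_ms: float  # typical p50 latency

MODELS = [
    ModelOption("small-classifier", accuracy=0.92, latency_ms=30),
    ModelOption("mid-llm", accuracy=0.96, latency_ms=400),
    ModelOption("frontier-llm", accuracy=0.99, latency_ms=3000),
]

def pick_model(options: list[ModelOption],
               min_accuracy: float,
               latency_budget_ms: float) -> Optional[ModelOption]:
    """Fastest model that clears both bars, or None if nothing qualifies."""
    viable = [m for m in options
              if m.accuracy >= min_accuracy and m.latency_ms <= latency_budget_ms]
    return min(viable, key=lambda m: m.latency_ms, default=None)
```

Under this scheme, a latency-sensitive task like topic detection resolves to the small classifier, while a hallucination-sensitive task with a generous budget resolves to a larger model; if no model clears both bars, the decision escalates rather than silently degrading.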
What other builders can learn from Booking.com’s AI journey
Booking.com’s AI journey can serve as an instructive model for other companies. Looking back, Pathak admitted that they started with a “pretty complicated” technology stack. They’re getting the hang of it now, “but we probably could have started with something much simpler and seen how customers interacted with that.” That said, he offered some valuable advice: if you’re just starting out with LLMs or agents, off-the-shelf APIs will be fine. “APIs are customizable enough that you might already have a big impact before you decide you want to do more.” On the other hand, if your use case requires customization that can’t be achieved with a standard API call, that’s an argument for internal tooling. Still, he stressed: don’t start with complicated things. Address “the simplest, most painful problem you can find and the simplest, most obvious solution.” Identify product-market fit and then explore ecosystems, he advised, but don’t rip out old infrastructure just because a new use case requires something specific (e.g. moving your entire cloud strategy from AWS to Azure just to use an OpenAI endpoint). Ultimately: “Don’t close doors too early,” Pathak noted. “Don’t make one-way decisions until you are absolutely sure this is the solution you want to pursue.”
