# Introduction
If you have AI agents that work great in your notebook but break down once you go to production, you're in good company. API calls time out, large language model (LLM) responses come back garbled, and rate limits kick in at the worst possible time.
The reality of agent deployment is messy, and most problems come down to how failures are handled. The thing is, you don't need a huge platform to solve this. These five Python decorators have saved me from countless headaches, and they'll probably save you, too.
# 1. Automatic retries with exponential backoff
Every AI agent communicates with external APIs, and every external API will eventually fail. Maybe it's OpenAI returning a 429 because you hit the rate limit, or maybe it's a transient network outage. Either way, your agent shouldn't give up after the first failure.
The @retry decorator wraps any function so that when it raises a particular exception, it waits a moment and tries again. The exponential backoff part is crucial, because you want the wait time to grow with each attempt: the first retry waits one second, the second waits two, the third waits four, and so on. That way you avoid hammering an API that is already struggling.
You can build it yourself with a plain wrapper using time.sleep() and a loop, or reach for the Tenacity library, which gives you a battle-tested @retry decorator out of the box. The key is configuring the right exception types. You don't want to retry an invalid prompt (which will fail every time), but you definitely want to retry connection errors and rate limit responses.
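The hand-rolled version can be sketched in a few lines. This is a minimal stdlib-only example; the function name `flaky_api` and the failure pattern are made up for illustration, and a real deployment would likely use Tenacity instead.

```python
import functools
import random
import time


def retry(max_attempts=5, base_delay=1.0, exceptions=(ConnectionError, TimeoutError)):
    """Retry a function with exponential backoff on the given exception types."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: propagate the last error
                    # waits 1s, 2s, 4s, ... plus jitter so clients don't retry in lockstep
                    time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
        return wrapper
    return decorator


# Hypothetical flaky call for the sketch: fails twice, then succeeds.
calls = {"n": 0}

@retry(max_attempts=4, base_delay=0.01)
def flaky_api():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated outage")
    return "ok"
```

Note that the `exceptions` tuple is exactly where you encode "retry rate limits, not invalid prompts": only the listed exception types trigger another attempt.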
# 2. Applying timeout guards
LLM calls can hang. It doesn't happen often, but when it does, your agent sits there doing nothing while the user stares at a spinner. Worse still, if you're running multiple agents in parallel, one hung call can clog the entire pipeline.
The @timeout decorator sets a hard limit on how long any function can run. If the function does not return within, say, 30 seconds, the decorator raises a TimeoutError that you can catch and handle gracefully. A typical implementation uses Python's signal module for synchronous code, or asyncio.wait_for() if you are working in an async context.
Combine this with the retry decorator and you have a powerful combination: if a call hangs, the timeout kills it and the retry logic kicks in with a fresh attempt. This alone eliminates a huge category of production failures.
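For the async case, the decorator is a thin wrapper around asyncio.wait_for(). A minimal sketch, with `hung_call` standing in for an LLM request that never returns:

```python
import asyncio
import functools


def timeout(seconds):
    """Cancel an async function that runs longer than `seconds`."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            # wait_for cancels the task and raises TimeoutError past the deadline
            return await asyncio.wait_for(func(*args, **kwargs), timeout=seconds)
        return wrapper
    return decorator


@timeout(0.1)
async def hung_call():
    await asyncio.sleep(10)  # stands in for a connection that never returns
    return "never reached"


async def main():
    try:
        return await hung_call()
    except asyncio.TimeoutError:
        return "timed out"  # handle gracefully instead of hanging forever
```

A usage note: asyncio.wait_for() cancels the underlying task, so the hung request is actually torn down rather than leaked in the background.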
# 3. Response caching
Here's something that will dramatically reduce your API costs. If your agent makes the same call with the same parameters more than once (and it often does, especially in multi-step reasoning loops), there's no reason to pay for that response twice.
The @cache decorator stores the result of a function call keyed on its input arguments. The next time the function is called with the same arguments, the decorator returns the stored result immediately. Python's built-in functools.lru_cache works great for simple cases, but for agent workflows you'll want something with time-to-live (TTL) support, so cached responses expire after a reasonable interval.
This matters more than you might think. Agents using tool-calling patterns often re-verify previous results or re-fetch context they have already retrieved. Caching those calls means faster runs and a lower bill at the end of the month.
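Since functools.lru_cache has no expiry, a TTL variant is easy to sketch by hand. This is a minimal illustration (no eviction, not thread-safe); `fetch_context` and the `hits` counter are hypothetical stand-ins for a paid API call.

```python
import functools
import time


def ttl_cache(ttl_seconds=300):
    """Memoize results per argument tuple, expiring entries after ttl_seconds."""
    def decorator(func):
        store = {}  # key -> (timestamp, result)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            now = time.monotonic()
            if key in store and now - store[key][0] < ttl_seconds:
                return store[key][1]  # fresh cache hit: no API call made
            result = func(*args, **kwargs)
            store[key] = (now, result)
            return result
        return wrapper
    return decorator


# Hypothetical expensive call; `hits` counts real invocations.
hits = {"n": 0}

@ttl_cache(ttl_seconds=60)
def fetch_context(doc_id):
    hits["n"] += 1
    return f"context for {doc_id}"
```

Calling `fetch_context("a")` twice performs only one real invocation; after 60 seconds the entry expires and the next call pays for a fresh response. Libraries such as cachetools provide a production-grade version of this pattern.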
# 4. Input and output validation
Large language models are inherently unpredictable. You send a carefully crafted prompt asking for JSON, and sometimes you get a markdown code block with a trailing comma that breaks the parser. The @validate decorator catches these problems at the boundary, before bad data flows deeper into your agent's logic.
On the input side, the decorator checks that the arguments the function receives match the expected types and constraints. On the output side, it checks that the return value conforms to a schema, and Pydantic makes this incredibly clean. You define the expected response as a Pydantic model, and the decorator tries to parse the LLM output into that model. If validation fails, you can retry the call, apply a recovery routine, or fall back to defaults.
The real win is that validation decorators turn silent data corruption into loud, catchable errors. You'll debug problems in minutes, not hours.
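To keep the sketch dependency-free, here is the same idea with a plain dict-based schema instead of a Pydantic model; `ask_model` and its canned response are hypothetical. In a real project the Pydantic version would replace the manual type checks with `Model.model_validate_json(raw)`.

```python
import functools
import json


def validate_output(schema):
    """Parse a function's string return value as JSON and check required field types."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            raw = func(*args, **kwargs)
            try:
                data = json.loads(raw)
            except json.JSONDecodeError as exc:
                # silent garbage becomes a loud, catchable error at the boundary
                raise ValueError(f"LLM returned non-JSON output: {raw!r}") from exc
            for field, expected_type in schema.items():
                if not isinstance(data.get(field), expected_type):
                    raise ValueError(f"field {field!r} missing or not {expected_type.__name__}")
            return data
        return wrapper
    return decorator


# Hypothetical LLM call that should return {"answer": str, "confidence": float}.
@validate_output({"answer": str, "confidence": float})
def ask_model(prompt):
    return '{"answer": "42", "confidence": 0.9}'  # canned response for the sketch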
# 5. Building fallback chains
Production agents need a plan B. If your primary model is down, if your vector database is unreachable, if your tool's API returns garbage, your agent should degrade gracefully, not crash.
The @fallback decorator lets you define a chain of alternative functions. The decorator tries the primary function first, and if it throws an exception, it moves on to the next function in the chain. You can configure a fallback from GPT to Claude to a local Llama model. Or from a live database query to a cached snapshot to a hard-coded default value.
The implementation is simple: the decorator accepts a list of fallback callables and iterates through them on failure. It's worth adding logging at each fallback level so you know exactly where and why your system degraded. This pattern appears everywhere in production machine learning systems, and having it as a decorator keeps the degradation logic out of your business code.
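The chain-walking logic can be sketched like this. The three model functions are hypothetical stand-ins for real clients; a production version would log each failure instead of just recording it.

```python
import functools


def fallback(*alternatives):
    """Try the decorated function first; on any exception, walk the alternatives in order."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for candidate in (func, *alternatives):
                try:
                    return candidate(*args, **kwargs)
                except Exception as exc:
                    last_error = exc  # log the degradation level here in a real system
            raise last_error  # every level of the chain failed
        return wrapper
    return decorator


# Hypothetical model tiers for the sketch.
def local_model(prompt):
    return f"local answer to {prompt!r}"


def secondary_model(prompt):
    raise ConnectionError("secondary model unreachable")


@fallback(secondary_model, local_model)
def primary_model(prompt):
    raise TimeoutError("primary model timed out")
```

Here a call to `primary_model` silently degrades through the unreachable secondary tier down to the local model, and only raises if all three tiers fail.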
# Wrapping up
Decorators are one of Python's most underrated features for building reliable AI agents. The five patterns described here cover the most common failure modes you'll hit once your agent leaves the safety of a Jupyter notebook.
And they compose beautifully. Stack @retry on top of @timeout on top of @validate, and you have a function that won't hang, won't give up after one failure, and won't silently pass along bad data. Start by adding retry logic to your API calls today. Once you see how much cleaner your error handling becomes, you'll want decorators everywhere.
Nahla Davies is a programmer and technical writer. Before devoting herself to technical writing full time, she served, among other things, as lead programmer at a 5,000-person experiential branding organization whose clients include Samsung, Time Warner, Netflix, and Sony.
