# Introduction
Python decorators can be extremely useful in projects involving the development of AI and machine learning systems. They excel at separating core logic, such as modeling and data pipelines, from cross-cutting concerns like testing and validation, synchronization, logging, and so on.
This article discusses five particularly useful Python decorators that, based on developer experience, have proven effective at cleaning up AI code.
The following code examples rely on plain, basic logic built from standard Python libraries and best practices, e.g. functools.wraps. Their main purpose is to illustrate the use of each specific decorator, so you only have to worry about adapting the decorator logic to your own AI coding project.
# 1. Concurrency limiter
A very useful decorator when dealing with the (often annoying) rate limits on free tiers of third-party large language model (LLM) APIs. Those limits are hit when too many asynchronous requests are sent at once; this pattern introduces a throttling mechanism that makes such calls safer. Thanks to a semaphore, the number of concurrent executions of an asynchronous function is capped:
```python
import asyncio
from functools import wraps

def limit_concurrency(limit=5):
    sem = asyncio.Semaphore(limit)
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            async with sem:
                return await func(*args, **kwargs)
        return wrapper
    return decorator

# Application
@limit_concurrency(5)
async def fetch_llm_batch(prompt):
    return await async_api_client.generate(prompt)
```
# 2. Structured machine learning logger
It is no surprise that in software as sophisticated as machine learning systems, plain print() statements get lost easily, especially once deployed to a production environment.
Using the logging decorator below, you can capture executions and errors and format them into structured JSON logs that are easily searchable for quick debugging. The following sample code can be used as a template to decorate, for example, a function defining a training epoch in a neural network model:
```python
import logging, json, time
from functools import wraps

def json_log(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            res = func(*args, **kwargs)
            logging.info(json.dumps({"step": func.__name__, "status": "success", "time": time.time() - start}))
            return res
        except Exception as e:
            logging.error(json.dumps({"step": func.__name__, "error": str(e)}))
            raise
    return wrapper

# Application
@json_log
def train_epoch(model, training_data):
    return model.fit(training_data)
```
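To see the structured logs in action without a real model, the sketch below decorates two hypothetical stand-in functions (`train_step` and `broken_step`) and captures the emitted JSON payloads in a list, showing that both successes and failures produce searchable records:

```python
import logging, json, time
from functools import wraps

def json_log(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            res = func(*args, **kwargs)
            logging.info(json.dumps({"step": func.__name__, "status": "success", "time": time.time() - start}))
            return res
        except Exception as e:
            logging.error(json.dumps({"step": func.__name__, "error": str(e)}))
            raise
    return wrapper

# Collect emitted records in a list so the JSON payloads can be inspected
records = []

class ListHandler(logging.Handler):
    def emit(self, record):
        records.append(json.loads(record.getMessage()))

logging.getLogger().addHandler(ListHandler())
logging.getLogger().setLevel(logging.INFO)

@json_log
def train_step(x):  # hypothetical stand-in for a real training step
    return x * 2

@json_log
def broken_step():  # hypothetical failing step
    raise ValueError("corrupted batch")

train_step(21)
try:
    broken_step()
except ValueError:
    pass  # the decorator logged the error and re-raised it

print(records)
```

Because each record is a JSON object with a consistent `step` key, you can filter production logs by function name or status instead of grepping free-form text.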
# 3. Feature injector
Here is a particularly useful decorator for the deployment and inference stages! Let’s say you’re moving a machine learning model from a Jupyter notebook to a lightweight production environment, e.g. a FastAPI endpoint. Manually ensuring that raw data coming from end users goes through the same transformations as the original training data can become tedious. The feature injector ensures that these features are generated consistently, under the hood, before the data reaches your model.
The following example simplifies the process of adding a feature called 'is_weekend', depending on whether the date column of the existing DataFrame contains a date falling on a Saturday or Sunday:
```python
from functools import wraps

def add_weekend_feature(func):
    @wraps(func)
    def wrapper(df, *args, **kwargs):
        df = df.copy()  # Prevents Pandas mutation warnings
        df['is_weekend'] = df['date'].dt.dayofweek.isin([5, 6]).astype(int)
        return func(df, *args, **kwargs)
    return wrapper

# Application
@add_weekend_feature
def process_data(df):
    # 'is_weekend' is guaranteed to exist here
    return df.dropna()
```
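A minimal end-to-end sketch (the three-row DataFrame is made-up sample data) shows the injected column appearing for a Friday, a Saturday, and a Sunday, while the caller's original DataFrame is left untouched thanks to the `copy()`:

```python
import pandas as pd
from functools import wraps

def add_weekend_feature(func):
    @wraps(func)
    def wrapper(df, *args, **kwargs):
        df = df.copy()  # Prevents Pandas mutation warnings
        df['is_weekend'] = df['date'].dt.dayofweek.isin([5, 6]).astype(int)
        return func(df, *args, **kwargs)
    return wrapper

@add_weekend_feature
def process_data(df):
    return df.dropna()

# 2024-01-05 is a Friday, 2024-01-06 a Saturday, 2024-01-07 a Sunday
raw = pd.DataFrame({"date": pd.to_datetime(["2024-01-05", "2024-01-06", "2024-01-07"])})
out = process_data(raw)
print(out["is_weekend"].tolist())  # [0, 1, 1]
```

Note that `raw` still has only its original `date` column afterwards; the feature exists only on the copy handed to the model-facing function.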
# 4. Deterministic seed setter
This one shines in two specific stages of the AI/ML lifecycle: experimentation and hyperparameter tuning. These processes typically involve a source of randomness while tuning key hyperparameters such as the model’s learning rate. Let’s say you just adjusted its value and the model accuracy suddenly drops. In such a situation, you need to check whether the cause of the performance drop is the new hyperparameter setting or simply a bad random initialization of the weights. By locking the seed, we isolate variables, making the results of experiments such as A/B tests more reliable.
```python
import random, numpy as np
from functools import wraps

def lock_seed(seed=42):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            random.seed(seed)
            np.random.seed(seed)
            return func(*args, **kwargs)
        return wrapper
    return decorator

# Application
@lock_seed(42)
def initialize_weights():
    return np.random.randn(10, 10)
```
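The point of locking the seed is reproducibility, which is easy to demonstrate: calling the decorated function twice yields bit-identical weight matrices, because the decorator resets both random number generators before each call.

```python
import random
import numpy as np
from functools import wraps

def lock_seed(seed=42):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            random.seed(seed)
            np.random.seed(seed)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@lock_seed(42)
def initialize_weights():
    return np.random.randn(10, 10)

w1 = initialize_weights()
w2 = initialize_weights()
print(np.array_equal(w1, w2))  # True: identical despite the "random" draws
```

With initialization fixed, any change in model accuracy between runs can be attributed to the hyperparameter you actually changed.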
# 5. Mock data fallback
A life-saving decorator, especially in local development environments and CI/CD tests. Let’s say you’re building an application layer on top of an LLM, for example, a retrieval-augmented generation (RAG) system. If a decorated function fails due to external factors, such as a connection timeout or API rate limits, instead of throwing an exception, the decorator catches the error and returns a predefined set of mock test data.
Why life-saving? Because this mechanism ensures that the application does not stop completely if the external service goes down temporarily.
```python
from functools import wraps

def fallback_mock(mock_data):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:  # Catches timeouts and rate limits
                return mock_data
        return wrapper
    return decorator

# Application
@fallback_mock(mock_data=[0.01, -0.05, 0.02])
def get_text_embeddings(text):
    return external_api.embed(text)
```
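To simulate an outage locally, the sketch below replaces the external call with a deliberately raised `TimeoutError` (a hypothetical failure mode standing in for a real provider error) and shows the decorator quietly substituting the mock embedding:

```python
from functools import wraps

def fallback_mock(mock_data):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:  # Catches timeouts and rate limits
                return mock_data
        return wrapper
    return decorator

@fallback_mock(mock_data=[0.01, -0.05, 0.02])
def get_text_embeddings(text):
    # Stand-in for a real API call that is currently failing
    raise TimeoutError("simulated provider outage")

print(get_text_embeddings("hello"))  # [0.01, -0.05, 0.02]
```

A word of caution: because the bare `except Exception` also swallows genuine bugs, this pattern is best kept behind a development or testing flag rather than enabled in production.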
# Summary
This article examined five effective Python decorators that help make AI and ML code cleaner in a variety of specific situations: from structured, searchable logging to controlled random initialization in aspects such as data sampling, testing, and more.
Ivan Palomares Carrascosa is a thought leader, writer, speaker, and advisor in the fields of Artificial Intelligence, Machine Learning, Deep Learning, and LLMs. He trains and advises others on the use of artificial intelligence in the real world.
