Thursday, May 8, 2025

3 Questions: Improving Last Mile Logistics with Machine Learning

Share

Q: What is the vehicle routing problem and how do time-honored operations research (OR) methods solve it?

AND: Almost every logistics and delivery company, such as USPS, Amazon, UPS, FedEx, DHL, struggles with the problem of vehicle routing every day. Simply put, it is finding an proficient route connecting a group of customers to whom goods should be delivered or collected. It decides which customers each of these vehicles – that you see on the road – should visit on a given day and in what order. Typically, the goal is to find routes that lead to the shortest, fastest or cheapest route. However, very often they are also guided by customer-specific constraints. For example, if you have a customer who has a specific delivery window or a customer on the 15th floor of a high-rise building versus the ground floor. This makes it hard for these customers to integrate into an proficient delivery route.

To solve the problem of vehicle routing, we obviously cannot do our modeling without the right information about demand and, ideally, customer characteristics. For example, we need to know the size or weight of the packages ordered by a given customer, or how many units of a given product need to be shipped to a specific location. All of this determines the time you need to serve a given stop. In realistic problems, you also want to know where the driver can safely park the vehicle. Traditionally, the route planner has had to come up with good estimates for these parameters, so it is very common to find models and planning tools that make general assumptions because data on specific stops was not available.

Machine learning could be really fascinating in this because most drivers these days have smartphones or GPS trackers, so there’s a ton of information about how long it takes to deliver a package. You can now extract that information on a enormous scale, in a somewhat automated way, and calibrate every single stop so that you can model it in a realistic way.

Using the time-honored OR approach means writing an optimization model that starts with defining an objective function. In most cases, this is some kind of cost function. Then there are a few other equations that define the inner workings of the routing problem. For example, you need to tell the model that if a vehicle visits a customer, it must also leave the customer again. In academic terms, this is usually called the flow conservation principle. Similarly, you need to ensure that each customer is visited exactly once on a given route. These and many other real-world constraints together define a viable route. This may seem obvious to us, but it needs to be explicitly coded.

Once the optimization problem is formulated, there are algorithms that facilitate us find the best possible solution; we call them solvers. Over time, they find solutions that satisfy all the constraints. Then they try to find routes that are better and better, and therefore cheaper and cheaper, until you say, “OK, that’s good enough for me,” or until they prove mathematically that they’ve found the optimal solution. The average delivery vehicle in a U.S. city makes about 120 stops. Solving this problem explicitly can take a long time, so companies don’t usually do it because it’s just too computationally high-priced. So they apply what are called heuristics, which are algorithms that are very proficient at finding reasonably good solutions, but they usually can’t quantify how far away those solutions are from the theoretical optimum.

Q: You are currently applying machine learning to solve the vehicle routing problem. How can we apply it to leverage and possibly outperform time-honored OR methods?

AND: That’s what we’re working on right now with the folks at MIT-IBM Watson AI Lab. The general idea here is that you train a model on a enormous set of existing routing solutions that you’ve either observed in real-world company operations or generated using one of these powerful heuristics. In most machine learning models, you no longer have an explicit objective function. Instead, you need to make the model understand what kind of problem it’s actually solving and what a good solution to the problem looks like. For example, just like training a enormous language model on the words in a given language, you need to train a routing model on the concept of different delivery stops and their demand characteristics. Just like understanding the innate grammar of natural language, your model needs to understand how to connect those delivery stops in a way that results in a good solution—in our case, a low-cost or brisk solution. If you then present the system with a completely recent set of customer requirements, it can still connect the dots in a very literal way, just as you would if you were trying to figure out a good way to reach those customers.

To do this, we apply model architectures that most people are familiar with from the language processing space. This seems a little counterintuitive, because what does language processing have to do with routing? But in fact, the properties of these models, especially the transformer models, allow you to find structure in language—to combine words in such a way that they form sentences. For example, in a given language, you have a vocabulary, and it’s fixed. It’s a distinct set of possible words that you can apply, and the challenge is to combine them in a meaningful way. Routing is similar. There are about 40,000 addresses in Cambridge that you can visit. Typically, you have to visit a subset of those addresses, and the challenge is: how do you combine that subset—those “words”—into a meaningful sequence?

This is the novelty of our approach—taking this structure that has proven to be extremely effective in the language space and applying it to combinatorial optimization. Routing is simply a great testing ground for us, because it is the most fundamental problem in the logistics industry.

Of course, there are already very good routing algorithms that have emerged from decades of operational research. In this project, we try to show that with a completely different, purely machine-based methodological approach, we can predict routes that are almost as good as or better than the routes that can be obtained by running the state-of-the-art routing optimization heuristics.

Q: What are the advantages of the method you apply compared to other state-of-the-art surgical techniques?

AND: At the moment, the best methods are still very demanding in terms of the computational resources required to train these models, but some of that effort can be put into the front end. Then, the trained model is relatively proficient at creating a recent solution when needed.

Another aspect to consider is that the operating environment of the route, especially in cities, is constantly changing. The available road infrastructure or traffic regulations and speed limits may change, the ideal parking lot may be occupied by something else or a construction site may block the road. With an OR-only approach, you may run into trouble because you would have to solve the entire problem immediately as soon as recent information about the problem comes in. Since the operating environment changes dynamically, you would have to do it over and over again. Whereas if you have a well-trained model that has seen similar problems before, it can potentially suggest the best route, almost immediately. It is more of a tool that would facilitate companies adapt to increasingly unpredictable changes in the environment.

Moreover, optimization algorithms are often hand-crafted to solve a specific business problem. The quality of solutions obtained from such explicit algorithms is confined by the level of detail and sophistication that went into the algorithm design. On the other hand, a learning-based model continuously learns a routing policy from data. Once the model structure is defined, a well-designed learning-route model will extract potential improvements to the routing policy from the expansive number of routes it is trained on. In simpler terms, a learning-based routing tool will continue to find route improvements without having to invest in explicitly designing these improvements into the algorithm.

Finally, optimization-based methods are usually confined to optimizing against a very clearly defined objective function, which often aims to minimize costs or maximize profits. In reality, the goals facing companies and drivers are much more complicated and often somewhat contradictory. For example, a company wants to find proficient routes, but at the same time wants to have a low emission footprint. The driver also wants to feel unthreatening and have a convenient way to serve these customers. In addition, companies also care about consistency. A well-designed route learning model can ultimately capture these multi-dimensional goals on its own, something that could never be achieved in the same way using time-honored optimization approaches.

So, this is the kind of application of machine learning that can have a concrete, real-world impact on industry, society, and the environment. The logistics industry has problems that are much more complicated than this. For example, if you want to optimize an entire supply chain—let’s say, the flow of a product from a manufacturer in China through a network of different ports around the world, through the distribution network of a major retailer in North America, to your store where you actually buy it—there are so many decisions involved, which of course makes it a much more hard task than optimizing a single vehicle route. We hope that with this initial work, we can lay the groundwork for research and also private sector development efforts to build tools that will ultimately enable better optimization of the supply chain from end to end.

Latest Posts

More News