NORAD, a North American military organisation which has been tracking Santa Claus for 65 years, claim that he departs the North Pole at 11am GMT on Christmas Eve and returns 23 hours later, having visited every child in the world. Even taking into account the magical powers of Father Christmas, it’s very easy to underestimate how hard planning and making this journey actually is. There are, according to the world bank, 2.2 billion children on planet earth. Luckily, two children living together only need one delivery, but even when we account for this there are almost 900 million houses Santa needs to visit in one night. Imagine you are the elf who is tasked with planning the trip. How do you go about finding the fastest way to visit every child in the world and return to the North Pole?
This problem is commonly known as the travelling salesman problem (TSP): if you have n points, where n is some whole number, what is the shortest loop you can draw which passes through every point? (A loop is a path which eventually returns to its starting position). Though Santa’s journey is an exceptional case, his problem is one which humans have studied extensively.
On the surface, this does not look difficult. Because Santa can fly, it is easy to find the shortest distance between any two places he needs to land. Up to the curvature of the earth, it is a straight line. This means that Santa can find the shortest path to visit all the children by taking a list of all their houses (plus the North Pole) and writing down every possible order he could visit them in, called a route. Since each route has an associated length, Santa needs to then search his lists for the route with the shortest length and he will have his answer. We call this approach the brute force algorithm.
There is, however, a problem with this method. If Santa had only four children to visit, he would have to devise the shortest route between five points. In this case, there are only 12 possible paths, so searching for the fastest does not take too long. You could certainly do it on the back of an envelope. However, if we look at a table where the number of destinations is compared to the number of paths we see the following pattern:
|Number of destinations, n||Number of Paths, P|
The formula for an unspecified number of destinations, n, may seem to have come out of nowhere, but allow me to convince you that it has not. Let us consider every possible path Santa can take from the North Pole. If there are n total destinations, n-1 of these are childrens’ houses since one is the North Pole. Therefore, at the start of the journey, Santa has n-1 choices for his first house to visit. He picks one at random. He now has n-2 houses he has not visited and again picks one at random. He now has n-3 he hasn’t visited and this process repeats until he has only one child’s house left to visit. At that point, he has no further choices. He goes to the final child’s house and then returns to the North Pole. We can calculate the total number of paths by multiplying his options at every stage, giving the number of paths
P’ = (n-1) x (n-2) x (n-3) x … x 2 x 1 x 1 = (n-1)!
The ! means factorial, where a factorial is worked out by multiplying together all numbers less than or equal to the number you’re taking the factorial of. For example, 4! = 4 x 3 x 2 x 1 = 24. You will notice that the number of paths calculated here is not the number of paths in the table, and for good reason. The method we employed here counts each route twice, once forwards and once backwards. For the purposes of finding the distance Santa has to fly (which is what we are interested in), we don’t care which direction he goes around the loop and therefore we divide the number by two to give the equation in the table P = (n-1)!/2.
It takes time to do maths. Even modern computers, which can do maths very quickly indeed, take a finite amount of time. If we ask a computer to look at a list of y numbers and find the largest, the computer will take time proportional to y, the length of the list. This means that for the brute force algorithm described above, as we increase the number of destinations, n, the computation time increases proportionally to (n-1)!/2. The shorthand for this is to say that the algorithm is of order n! written O(n!). The problem for Santa is that n! grows very quickly. For example, 22! = 1,124,000,727,777,607,680,000. If it took Santa a nanosecond to figure out the best path between five destinations, it would take him a time equal to the age of the universe to calculate the best path for 22. Keep in mind that Santa has nearly 1 billion households he needs to visit, and it becomes clear that computing the best route using this brute force method is nigh on impossible.
Now, obviously, Santa Claus is magic. If he wanted, I am sure he could use that magic to make a computer fast enough to use the brute force algorithm. However, that would be a little premature. Humans (most of whom do not have magic) also need to solve problems like this and we’ve come up with some pretty useful tools.
The first thing you may think of doing to save time is to not bother comparing routes at all. The elf in charge tells Santa to always go to the closest house he has not already visited. When he has visited every child’s house, he is to return to the North Pole. This is called the Nearest Neighbour (NN) algorithm, and appears to be quite promising. If we consider four destinations arranged in a rectangle, we see that this algorithm gives us the best (shortest) path.
Is it possible that simply going to the nearest unvisited house at every opportunity gives the shortest possible route? While such a statement would be hard to prove conclusively, all we need to disprove it is a single example where NN gives us a bad path. Unfortunately such an example is not hard to find. If we again take four destinations, but this time arrange them in a parallelogram, we can see that NN does not always give us the best option (the first one shown below).
In fact, it is even worse than that. If he uses this method, Santa might save on computation time but he could end up following the worst route possible – I doubt the reindeer would be happy with this arrangement! But, luckily for them there is a better way which only requires a little calculation.
In the NN approach, we always run the algorithm starting at the North Pole, because that is where Father Christmas lives and it made sense to start there. However, it’s important to realise that we are using the algorithm to construct loops, and loops have no real starting point. Therefore, if we run the NN algorithm starting at a child’s house, it will still produce a path which Santa can follow, even though he starts from the North Pole (to do this we just need to remember that the North Pole is a destination which must be visited before returning to the starting house).
It is easy to demonstrate that the starting position of the NN algorithm changes the length of the route the algorithm produces, so we propose a new method: the Recursive Nearest Neighbour (RNN) algorithm. This runs the nearest neighbour algorithm for every possible starting position and puts them into a list, and then sorts the list to find the fastest loop. Using the example of the parallelogram from earlier, we would have the algorithm calculate the nearest neighbour paths for all possible starting locations as shown below:
The algorithm would then sort through all these different paths to find the shortest. In this case this is the path starting from the top left, which also happens to be the shortest possible path.
RNN is much faster than the brute force algorithm and, as you can see from the example, it will rarely produce the absolute worst route. Generally, it is much better than the NN algorithm at producing a path close to the shortest one, since it is less likely to fall into traps which result in bad outcomes. However, though it can result in the shortest solution sometimes, it does not do so reliably.
It is unlikely that Santa uses either of these algorithms. They are decent enough and certainly don’t require much computing power compared to the brute force approach. However, non-magical logistics companies use a variety of more complicated approaches to increase the accuracy while keeping down computation costs. One approach is to break down a large number of destinations into smaller problems which can be easily solved, before recombining into a single route. Other algorithms look at nature, mimicking ants or certain types of fungi to find good guesses at the shortest route. Though nothing human has picked a route on the scale of Santa’s journey, we have found the optimal path for visiting every town in Sweden – a problem containing 24,978 destinations – and even one involving 85,900 destinations. As computing power increases and work continues to be done on the problem, I have no doubt the number of destinations we have solutions for will increase. Until then, this Christmas eve, spare a thought for the army of mathematicians working to get you your presents on time.
Find out how fast Santa must travel to reach all 900 million houses in the next article!