Not so smooth criminals: how to use maths to catch a serial killer

The year is 1888, and the infamous serial killer Jack the Ripper is haunting the streets of Whitechapel. As a detective in Victorian London, your mission is to track down this notorious criminal – but you have a problem. The only information that you have to go on is the map below, which shows the locations of crimes attributed to Jack. Based on this information alone, where on earth should you start looking?

Picture1

The fact that Jack the Ripper was never caught suggests that the real Victorian detectives didn’t know the answer to this question any more than you do, and modern detectives are faced with the same problem when they are trying to track down serial offenders. Fortunately for us, there is a fascinating way in which we can apply maths to help us to catch these criminals – a technique known as geospatial profiling.

Geospatial profiling is the use of statistics to find patterns in the geographical locations of certain events. If we know the locations of the crimes committed by a serial offender, we can use geospatial profiling to work out their likely base location, or anchor point. This may be their home, place of work, or any other location of importance to them – meaning it’s a good place to start looking for clues!

Perhaps the simplest approach is to find the centre of minimum distance to the crime locations. That is, find the place which gives the overall shortest distance for the criminal to travel to commit their crimes. However, there are a couple of problems with this approach. Firstly, it doesn’t tend to consider criminal psychology and other important factors. For example, it might not be very sensible to assume that a criminal will commit crimes as close to home as they can! In fact, it is often the case that an offender will only commit crimes outside of a buffer zone around their base location. Secondly, this technique will provide us with a single point location, which is highly unlikely to exactly match the true anchor point. We would prefer to end up with a distribution of possible locations which we can use to identify the areas that have the highest probability of containing the anchor point, and are therefore the best places to search.
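To make the idea of a centre of minimum distance concrete, here is a minimal sketch. It assumes straight-line distances and uses Weiszfeld's iteration, which is one standard way to find such a point and not necessarily what any police software does. The function name and the coordinates are made up for illustration and are not the real Whitechapel locations.

```python
import math

def centre_of_minimum_distance(points, iterations=200, tol=1e-9):
    """Approximate the point with the smallest total straight-line
    distance to all `points`, using Weiszfeld's iteration."""
    # Start from the centroid (the average of the coordinates).
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < tol:
                # Simple safeguard: the estimate sits on a crime site,
                # so stop here rather than divide by zero.
                return x, y
            weight = 1.0 / d  # nearer sites pull harder on the estimate
            num_x += weight * px
            num_y += weight * py
            denom += weight
        new_x, new_y = num_x / denom, num_y / denom
        if math.hypot(new_x - x, new_y - y) < tol:
            return new_x, new_y
        x, y = new_x, new_y
    return x, y

# Hypothetical crime sites at the corners of a square.
crime_sites = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
centre = centre_of_minimum_distance(crime_sites)
```

Note that the output is a single point, which is exactly the limitation described above: it says nothing about how confident we should be in the surrounding area.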

With this in mind, let’s call the anchor point of the criminal z. Our aim is then to find a probability distribution for z, which takes into account the locations of the crime scenes, so that we can work out where our criminal is most likely to be. In order to do this, we will need two things.

  1. A prior distribution for z. This is just a function which defines our best guess at what z might be, before we have used any of our information about the crime locations. The prior distribution is usually based on data from previous offenders whose location was successfully determined, but it doesn’t matter too much if we’re a bit wrong – this just gives us a place to start.
  2. A probability density function (PDF) for the locations of the crime sites. This is a function which describes how the criminal chooses the crime site, and therefore how the criminal is influenced by z. If we have a number of crimes committed at known locations, then the PDF describes the probability that a criminal with anchor point z commits crimes at these locations. Working out what we should choose for this is a little trickier…

We’ll see why we need these in a minute, but first, how do we choose our PDF? The answer is that it depends on the type of criminal, because different criminals behave in different ways. There are two main categories of offenders – resident offenders and non-resident offenders.

Resident offenders are those who commit crimes near to their anchor point, so their criminal region (the zone in which they commit crimes) and anchor region (a zone around their anchor point where they are often likely to be) largely overlap, as shown in the diagram:

Picture2

If we think that we may have this type of criminal, then we can use the famous normal distribution for our density function. Because we’re working in two dimensions, it looks like a little hill, with the peak at the anchor point:

Picture3
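As a rough sketch of that little hill, the function below evaluates a circular two-dimensional normal density centred on the anchor point. The function name and the spread parameter `sigma` are assumptions made for illustration; in practice the spread would have to be estimated from data.

```python
import math

def resident_pdf(crime, anchor, sigma=1.0):
    """Density of a crime at `crime` for an offender based at `anchor`,
    using a circular 2-D normal distribution with spread `sigma`."""
    dx = crime[0] - anchor[0]
    dy = crime[1] - anchor[1]
    squared_distance = dx * dx + dy * dy
    return math.exp(-squared_distance / (2 * sigma**2)) / (2 * math.pi * sigma**2)

# The density is highest at the anchor point and falls away with distance.
peak = resident_pdf((0.0, 0.0), (0.0, 0.0))
far_away = resident_pdf((3.0, 0.0), (0.0, 0.0))
```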

Alternatively, if we think the criminal has a buffer zone, meaning that they only commit crimes at least a certain distance from home, then we can adjust our distribution slightly to reflect this. In this case, we use something that looks like a hollowed-out hill, where the most likely region is in a ring around the centre as shown below:

Picture4
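One simple way to get a hollowed-out hill is to make the density depend on how far the crime is from a preferred distance, rather than from the anchor point itself. The sketch below is an assumed functional form, not the one used by any particular profiling tool; `buffer_radius` and `sigma` are hypothetical parameters, and the result is left unnormalised for simplicity.

```python
import math

def buffer_zone_pdf(crime, anchor, buffer_radius=1.0, sigma=0.3):
    """Unnormalised ring-shaped density: largest when the crime is about
    `buffer_radius` away from `anchor`, small at the anchor itself."""
    distance = math.hypot(crime[0] - anchor[0], crime[1] - anchor[1])
    return math.exp(-((distance - buffer_radius) ** 2) / (2 * sigma**2))

anchor = (0.0, 0.0)
at_home = buffer_zone_pdf((0.0, 0.0), anchor)   # inside the buffer zone
on_ring = buffer_zone_pdf((1.0, 0.0), anchor)   # on the most likely ring
beyond = buffer_zone_pdf((3.0, 0.0), anchor)    # well outside the ring
```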

The second type of offender is the non-resident offender. They commit crimes relatively far from their anchor point, so that their criminal region and anchor region do not overlap, as shown in the diagram:

Picture5

If we think that we have this type of criminal, then for our PDF we can pick something that looks a little like the normal distribution used above, but shifted away from the centre:

Picture6

Now, the million-dollar question is which model should we pick? Distinguishing between resident and non-resident offenders in advance is often difficult. Some information can be deduced from the geography of the region, but often assumptions are made based on the crime itself – for example, more complex or clever crimes have a higher likelihood of being committed by non-residents.

Once we’ve decided on our type of offender, selected the prior distribution (1) and the PDF (2), how do we actually use the model to help us to find our criminal? This is where the mathematical magic happens in the form of Bayesian statistics (named after statistician and philosopher Thomas Bayes).

Bayes’ theorem tells us that if we multiply together our prior distribution and our PDF, evaluated at each of the known crime locations, and then rescale the result so that the probabilities add up to one, we’ll end up with a new probability distribution for the anchor point z, which now takes into account the locations of the crime scenes! We call this the posterior distribution, and it tells us the most likely locations for the criminal’s anchor point given the locations of the crime scenes, and therefore the best places to begin our search.
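Here is a minimal sketch of that calculation on a grid of candidate anchor points. It assumes a flat prior (every grid point equally likely beforehand), a circular normal PDF for a resident offender, and crimes chosen independently of one another. All names and coordinates are hypothetical, and real systems such as Rigel will differ in the details.

```python
import math

def likelihood(crime, anchor, sigma=1.0):
    """Assumed PDF: circular 2-D normal centred on the anchor point."""
    dx, dy = crime[0] - anchor[0], crime[1] - anchor[1]
    return math.exp(-(dx * dx + dy * dy) / (2 * sigma**2)) / (2 * math.pi * sigma**2)

def posterior_on_grid(crimes, grid, prior=None, sigma=1.0):
    """Return a dict mapping each candidate anchor point to its posterior probability."""
    if prior is None:
        prior = {z: 1.0 for z in grid}  # flat prior: no initial preference
    unnormalised = {}
    for z in grid:
        weight = prior[z]
        for crime in crimes:
            weight *= likelihood(crime, z, sigma)  # prior times PDF for each crime
        unnormalised[z] = weight
    total = sum(unnormalised.values())
    # Rescale so the probabilities over the grid add up to one.
    return {z: w / total for z, w in unnormalised.items()}

# Made-up crime sites and a 7 x 7 grid covering 0 to 3 in steps of 0.5.
crimes = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
grid = [(x * 0.5, y * 0.5) for x in range(7) for y in range(7)]
posterior = posterior_on_grid(crimes, grid)
best = max(posterior, key=posterior.get)  # most probable anchor point on the grid
```

Unlike the centre of minimum distance, the result is a probability for every grid point, so a search can start with the most probable cells and work outwards.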

This fascinating technique is actually used today by police detectives when trying to locate serial offenders. They implement the same steps described above using an extremely sophisticated computer algorithm called Rigel, which has a very high success rate in locating criminals.

So, what about Jack?

If we apply this geospatial profiling technique to the locations of the crimes attributed to Jack the Ripper, then we can predict that it is most likely that his base location was in a road called Flower and Deane Street. This is marked on the map below, along with the five crime locations used to work it out.

Picture7

Unfortunately, we’re a little too late to know whether this prediction is accurate, because Flower and Deane Street no longer exists, so any evidence is certainly long gone! However, if the detectives in Victorian London had known about geospatial profiling and the mathematics behind catching criminals, then it’s possible that the most infamous serial killer in British history might never have become quite so famous…

Francesca Lovell-Read

England v Colombia Penalty Shootout

I was asked by the Daily Mirror to analyse the England football team’s penalty kicks against Colombia in the World Cup second round. You can find the key insights below and the full article online here.

unsaveable-zone

Image: Dr Ken Bray, University of Bath

Screen Shot 2018-07-09 at 13.32.28

Harry Kane – Kane’s very calm and confident in his walk up to the penalty spot showing that he has prepared well mentally. He carefully places the ball and adjusts his socks before firing low and hard into the bottom left-hand corner of the net. The keeper goes the right way but it’s too accurate and right in the corner of the ‘unsaveable zone’.

Marcus Rashford – A different approach on the walk up as he keeps his head down to make sure he doesn’t give anything away to the Colombia keeper. He curves his run-up to add extra disguise to the shot and puts it in almost exactly the same place as Harry Kane. Again, the Colombia keeper goes the right way but it’s too fast, too accurate and right in the bottom corner of the ‘unsaveable zone’.

Jordan Henderson – The ‘kick-ups’ on the walk to the penalty area show he’s nervous and the look on his face also hints at a lack of confidence. The placement of the shot is actually very good as he hits the ‘unsaveable zone’ to the left of the keeper, but his shot is a little higher than the previous two making it a more comfortable height for the goalie, and his wide run-up gives the game away as he opens his body to go to the right. If you look closely you’ll see that Ospina moves before Henderson kicks the ball which is why he’s able to reach beyond the ‘diving envelope’ and make the save.

Kieran Trippier – He has his head down and a look of complete focus on his face as he approaches the penalty spot. After a little glance up to make sure he knows where he’s going, he buries it in the top left corner in the perfect spot. Comparing Trippier’s penalty to that of the fourth Colombian taker, Uribe, who missed, it’s the use of the inside of his foot that makes all of the difference. Despite them both aiming for the top corner of the ‘unsaveable zone’, Uribe leant back and went with his laces, making his shot less controlled than Trippier’s side foot. It’s also interesting that England’s nominated set piece taker went fourth in the line-up, no doubt because Gareth Southgate knew that the fourth penalty would be key to victory, as it is one that goalkeepers are likely to save.

Eric Dier – Positionally, this was probably the worst of the five England penalties, as it was the closest to the centre of the goal and only on the edge of the ‘diving envelope’, which is within reach of Ospina. The key aspect of Dier’s penalty that allowed him to score was the fact that it was along the ground. Ospina dives the correct way, but can’t get down to a ball that close to his body to make the save. Compare this to Jordan Henderson’s penalty, which was much closer to the corner, but at a more comfortable height for the save.

Summary:

  • 4 of the 5 penalties went to the left of the goalkeeper and were all scored, whereas the one that went to the right of the keeper was saved.
  • All of England’s penalty takers were right-footed.
  • 2 of the 5 penalty takers were substitutes, likely brought on to take a penalty in the shootout.
  • All of England’s penalties hit the ‘unsaveable zone’, maximising the chances of scoring. For Colombia only 2 of the 5 penalties hit the ‘unsaveable zone’.
  • Jordan Pickford saved the fifth and final penalty, demonstrating how it is more likely for a goalkeeper to make a save later in the shootout.

England benefitted from three things: good preparation from the manager in selecting his line-up months in advance, consistently aiming for the ‘unsaveable zone’, which is the most difficult area for the goalkeeper to reach, and preparing well mentally and taking their time with each shot. Ultimately, these three things were key to the victory.

World Cup 2018: The Perfect Penalty Kick

The 2018 World Cup in Russia kicks off today and so I bring you a special double-edition of Throwback Thursday looking at the science behind the perfect penalty kick… Fingers crossed the England players listen to the interview or read my website and we don’t lose to Germany in a penalty shootout (though let’s be honest we probably will).

Live interview with BBC Radio Cambridgeshire looking at the ‘unsaveable zone’ and the best way to mentally prepare for a penalty.

And if that wasn’t enough, here’s a full description of the ‘Penalty Kick Equation’…

For all of the footballers out there who have missed penalties recently, I thought I would explain the idea of the science behind the perfect penalty a little further, and in particular the maths equation that describes the movement of the ball. On the radio of course I couldn’t really describe the equation, so here it is:

Screen Shot 2017-06-05 at 10.09.22

If you’re not a mathematician it might look a little scary, but it’s really not too bad. The term on the left-hand side, D, gives the movement of the ball in the direction perpendicular to the direction in which the ball is kicked. In other words, how much the ball curves either left or right. This is what we want to know when a player is lining up to take a penalty, because knowing how much the ball will curl will tell us where it will end up. To work this out we need to input the variables of the system – basically use the information that we have about the kick and input it into the equation to get the result. It’s like one of those ‘function machines’ that teachers used to talk about at school: I input 4 into the ‘machine’ and it gives me 8, then I put in 5 and I get 10, what will happen if I input 6? The equation above works on the same idea, except we input a few different things and the result tells us how much the ball will curl.

So, what are the inputs on the right-hand side? The symbol π just represents the number 3.141… and it appears in the equation because footballs are round. Anytime we are using circles or spheres in maths, you can bet that π will pop up in the equations – it’s sort of its job. The ball itself is represented by R which gives the ball’s radius, i.e. how big it is, and the ball’s mass is given by m. We might expect that for a smaller ball or a lighter ball the amount it will curl will be different, so it is good to see these things are represented in the equation – sort of a sanity test if you will. The air that the ball is moving through is also important and this is represented by ρ, which is the density of the air. It will be pretty constant unless it’s a particularly humid or dry day.

Now, what else do you think might have an effect on how much the ball will curl? Well, surely it will depend on how hard the ball is kicked… correct. The velocity of the ball is given by v. The distance the ball has moved in the direction it is kicked is given by x, which is important as the ball will curl more over a long distance than it will if kicked only 1 metre from the goal. For a penalty this distance will be fixed at 12 yards or about 11 m. The final variable is ω – the angular velocity of the ball. This represents how fast the ball is spinning and you can think of it as how much ‘whip’ has been put on the ball by the player. Cristiano Ronaldo loves to hit them straight so ω will be small, but for Beckham – aka the king of curl – ω will be much larger. He did of course smash that one straight down the middle versus Argentina in 2002 though…
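The equation itself appears only as an image in the post, so the sketch below is not a transcription of it. It is a rough illustration of how the same inputs could be combined under a simple textbook-style model, in which every modelling choice is an assumption: the sideways (Magnus) force is taken to be constant during the flight, the lift coefficient is crudely approximated as Rω/v, and the ball does not slow down. The default radius, mass and air density are typical illustrative values, and the spin rates are made up.

```python
import math

def sideways_curl(x, v, omega, R=0.11, m=0.43, rho=1.2):
    """Sideways drift D (metres) after the ball has travelled x metres forwards.

    Assumed model, not the equation from the post:
      - sideways force F = 0.5 * rho * A * v**2 * C_L, with A = pi * R**2
      - lift coefficient C_L = R * omega / v (a crude approximation)
      - F is constant and the forward speed v does not change
    """
    if v <= 0:
        raise ValueError("v must be positive")
    area = math.pi * R**2                  # cross-section of the ball
    lift_coefficient = R * omega / v       # more spin means more sideways lift
    force = 0.5 * rho * area * v**2 * lift_coefficient
    flight_time = x / v                    # time taken to cover distance x
    return 0.5 * (force / m) * flight_time**2  # drift under constant acceleration

penalty_distance = 11.0  # metres, as in the text
gentle = sideways_curl(penalty_distance, v=25.0, omega=20.0)   # little 'whip'
whipped = sideways_curl(penalty_distance, v=25.0, omega=60.0)  # three times the spin
```

In this simplified model the drift is proportional to ω and to x squared, and inversely proportional to m and v, which fits the sanity checks described above: a lighter ball, more spin or a longer distance all give more curl.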

So there you have it. The maths equation that tells you how much a football will curl based on how hard you hit it and how much ‘whip’ you give it. Footballers often get a bad reputation for perhaps not being the brightest bunch, but every time they step up to take a free kick or a penalty they are pretty much doing this calculation in their head. Maybe they’re not quite so bad after all…
