Substitution Ciphers

Georgie Bumpus

Cryptography in its most simple form has existed since antiquity: it is perhaps natural to want to conceal sensitive information from those who could abuse it. Hebrew scholars used a simple cipher in occasional biblical verses and the Kama Sutra references ciphers as a way for lovers to communicate. Even the notorious Roman Emperor Julius Caesar used a cipher – now named for him, as we shall see – in private correspondence. All of these examples fall under the umbrella of what is known today as a monoalphabetic, monographic substitution cipher. All this means is that we take the (English, for our purposes) alphabet, muddle it up so that each letter corresponds to a different letter, and replace each letter in the original text – the plaintext – with the new letter it corresponds to. See an example below:

Fig. 1: This kind of alphabet reversal cipher is called an Atbash cipher, from the Hebrew letters aleph, tav, bet and shin.

The plaintext “Tom Rocks Maths” then becomes “Gln Ilxph Nzgsh”. To make it harder to break, we can split the text up into blocks of equal size to disguise word length (think how easy it would be to decrypt a one-letter word: you’ve only got two choices!) and in addition, remove grammatical markers such as capital letters and punctuation. So our sample text becomes “GLN ILX PHN ZGS H” – far less obvious. In this article I’m going to talk about these substitution ciphers and how to break them.

Most cryptographic methods – even modern-day ones, as you’ll see in later articles – are underpinned mathematically by number theory, the study of integers (whole numbers) or integer-valued functions and their properties. One of the most fundamental topics in number theory is modular arithmetic. We’ll need it to properly understand any of the cryptographic methods I’m going to talk about in this series. If you already know a lot about modular arithmetic, feel free to skip this section.

Modular arithmetic might sound scary, but actually you’ve been doing it multiple times a day since you were young. The idea is that instead of working with all the (infinitely many) integers, we only work with finitely many. When we run out of numbers, we just wrap around and start at the beginning again. Doesn’t sound familiar yet? Look at the time. Right now, for me, it’s 11.37. In half an hour it will be 12.07 – we wrapped around and began counting from 0 minutes again once we got to 60. In general, when we tell the time, we don’t measure in minutes or hours since the beginning of time – that would be daft and impractical. What we do instead is count minutes since the last hour and hours since the last midnight or midday. In the language of modular arithmetic, we measure minutes modulo 60 and hours modulo 12 (or 24, if you’re into the 24 hour clock). For this reason, modular arithmetic is sometimes known as clock arithmetic.

Let’s explore this a bit further and generalise. When the hour hand of a clock reaches number 12, it doesn’t suddenly turn around, go backwards to 0 and start again. It keeps moving forward but we still start counting from 0 again. Essentially what we’re doing is setting 12 to 0 each time we reach it, so that every multiple of 12 “becomes” 0 in this system. 

What about numbers that aren’t multiples of 12? Well, the great thing about multiples of numbers is that there are infinitely many of them, and they occur exactly regularly by definition. Therefore, we can write every single whole number as a multiple of 12 plus some other number between 0 and 11. For example, 13 in 24-hour time is 12+1, and so is 1pm in 12-hour time, or equivalently: 13 = 1 modulo 12.

In general, for any two integers m, n, we can write n = km + r for some integer k and some integer 0 ≤ r < m. For example, take m = 7, n = 33: 33 = 4 x 7 + 5 so 33 5 modulo 7. The triple equals sign here means is congruent to, which I tend to think of as “is the same as in this system”. Working modulo m, we call the set {0,1,…,m-1} the set of remainders modulo m.

Now let’s think about what it means to add and multiply in this system. If I asked you what time it is 9 hours after 8pm, how would you work it out? We’re so used to working with time that you’ll do it quite quickly, but we’re interested in the process here. One way you might do it is work out how many hours it is until 12 (4), subtract this from 9 (5) and get the answer 5am. Alternatively you could add up 9 and 8 (17) and subtract 12, giving 5am again. These processes are actually identical mathematically: in the first one we’re doing 8 – 12 + 9 = 5 and in the second it’s 8 + 9 – 12 = 5. In essence, we’re adding the numbers as we would normally and getting rid of the highest multiple of 12 less than this number. For another example, let’s work modulo 4: 6 + 3 = 9 = 8 + 1, so 6 + 3 1 mod 4. If we apply the modulo to the 6 before we add it to 3, we get 6 = 4 + 2 2 mod 4, and so 6 + 3 2 + 3 5 1 mod 4. This is an illustration of how addition modulo m is well defined, i.e. it doesn’t give us lots of different answers to the same question. 

Subtraction works the same way as addition, because we’re working in the integers: you can just add on a negative number and this is equivalent to subtraction.

You can do an almost identical thing with multiplication, though this doesn’t have a time-related analogue (badum-tsh), so we’ll have to rely on some examples. Working modulo 5: 6 x 7 = 42 = 40 + 2 2 mod 5. Note that multiplying by a multiple of 5 will send the whole thing to 0: 5x 0 mod 5 for any integer x. Just like with addition, we can show you get the same answer regardless of the order of the calculation, so multiplication is also well-defined. 

You may have noticed I strategically left out division. This is because it doesn’t quite work as you’d expect. Let’s take the true statement 6 2 mod 4. Suppose we can just divide by 2. This would give us 3 1 mod 4, which is definitely not true! So what’s gone wrong here? The problem is that the thing we’re dividing by and the thing we’re modulo-ing by have a significant factor in common (by significant, I just mean that it’s not 1, since all numbers have a factor of 1). A number theorist would say that 2 and 4 are not coprime. In fact, division works just fine if the modulus and the thing you’re dividing by are coprime (that is, their highest common factor is 1). In particular, if your modulus is a prime, division by any number will work.

It’s now worth taking a moment to think about what we actually mean by division. When I divide 6 by 7, what I’m actually doing is multiplying 6 by 1/7, where 1/7 x 7 = 1. So 1/7 is a kind of anti-7: a number that will turn 7 into the multiplicative unit 1 (i.e. the special number that does not have an effect on other numbers with multiplication) when they’re multiplied together. We can call 1/7 the multiplicative inverse of 7. A fun consequence of this whole “division works just fine with a prime modulus” thing is that every number modulo a prime p has a multiplicative inverse mod p, and – even better – this inverse is unique. Stated more precisely: for each m between 1 and p – 1, we can find an n between 1 and p – 1 such that m x n 1 mod p. For example, 2 x 3 6 1 mod 5. This is a really useful fact that number theorists use a lot, and we’ll need it later too.

Phew, that was a lot. If you didn’t get it the first time round, don’t worry – read it again later. Alternatively, here’s another article explaining the same concept completely differently, leading into something really cool called Fermat’s Little Theorem. We’ll be using a variant of this later in the series.

Right, let’s get back to the ciphers. First stop is the Caesar Cipher I mentioned back in the introduction. This is a simple substitution cipher too – each letter is replaced with the letter 3 places after it:

Fig. 2: the Caesar Cipher

Encrypting “Tom Rocks Maths” with this cipher gives us “WRP URF HVP DWK V”. Obviously, we don’t actually need the number 3 here to make the concept of this cipher work, we could replace it with any other number. For example, suppose Alice wants to send Bob a message, but doesn’t want her evil twin Eve to be able to read it. Alice’s favourite number is 7, so she shifts the alphabet 7 letters along, sending A to H, B to I and so on. But if Eve finds out that she’s using 7, she’ll have to use a different number to make it secure. How many ciphers of this “alphabet shifting” type can she make?

This is a question that demands a mathematical approach, so the first thing we should do is turn the message into numbers somehow. Let’s say she numbers each letter of the alphabet: A = 1, B = 2, and so on up to Z = 26. Now we’re going to shift this by some number, let’s say 3 to start with (for Caesar). So now we’re sending A to D, B to E and so on, corresponding to 1 to 4, 2 to 5… so far, so good – we just add 3 to the letter we started with. But what about when we get to X, Y and Z? 24 + 3 = 27, which doesn’t correspond to a letter. So we need to wrap around and go back to the beginning of the alphabet. Hopefully this sounds familiar! We’ve translated this from a “shifting” process into a modular arithmetic process, working modulo 26. For some shifting number n, we’re calculating each enciphered letter C by taking the plaintext letter L and adding n, all modulo 26:

C L + n (modulo 26)

Now we can figure out how many ciphers there are! We know that there are only finitely many of these ciphers, since there are only finitely many simple substitution ciphers, so the process must repeat at some point. You can see this if you consider shifting the alphabet by 26: you’ll just get the same thing, and it won’t be encrypted at all. So that means we are looking for some other number m such that:

L + m L + n (modulo 26) for all L

Which implies that:

m – n 0 (modulo 26)

This says m-n must be a multiple of 26, and so m = 26k + n for any integer k. But since this is true for any integer, the number of ciphers must be the same as the number of unique residues modulo 26, so the size of the set {0,1,…,25}. Therefore, we have 26 ciphers of this form, but this includes the 0 cipher which doesn’t change the text at all, so we actually have 25 which aren’t boring and trivial to crack. We call them additive ciphers because…well…you add stuff together to encipher things.

This is all well and good, but maybe you want more ciphers than just 25, and Alice is bored of just shifting things anyway. Worse, once Eve finds out Alice is using an additive cipher, she only needs to try 25 different ciphers to find out which one Alice is using. It’s time consuming, but not that time consuming – Eve could probably figure it out in half an hour. So Alice needs something else. She’s got a lovely representation of her message in modulo 6 – maybe she could try multiplying instead of adding. This turns out to give a perfectly good cipher: below I’ve multiplied each letter by 5.

Fig 3: multiplication cipher given by 5M mod 26

So now we have a new simple substitution cipher of the form C = kM mod 26 for some integer k. How many more does this give us? As before, let’s try to work out when the cipher starts repeating itself: we need to find restraints on k and m such that

kL mL mod 26 for all L

This gives us:

(k – m)L 0 mod 26

Since this must be true for each L between 1 and 26 inclusive, we must have k – m 0 mod 26 (note this wouldn’t necessarily follow if it only needed to be true for one value of L – why not?). This means the cipher using a multiplier of 3 will give the same result as a multiplier of 29, just like in the additive ciphers! So we have at most 26 unique ciphers. Unfortunately, this is where the similarities to additive ciphers end: multiplicative ciphers aren’t so well-behaved. For example, if we used an additive cipher of shift 26, it would map each letter to themselves. Not secure, certainly, but not the end of the world either: Bob would still be able to read the message. However, if we used a multiplicative cipher with multiplier 26, it would map every letter to Z! That’s worse than useless – Eve might not be able to read it, but neither can Bob. 26 isn’t the only number that will have this annoying effect: multiplying by 13 will cause similar problems by mapping every letter to M and Z alternately. The reason multiplicative ciphers have this annoying problem built in is that division modulo something that isn’t prime doesn’t behave well at all, as we discussed earlier. In fact, any number from 1 to 26 sharing a factor greater than 1 with 26 will mess up Alice’s message beyond repair. This means that we only have 12 multiplicative ciphers, including the boring multiplier-1 cipher. This isn’t great for Alice. She only has 25 + 11 = 36 ciphers so far, and Eve could break anything she writes with pretty simple trial and error. Luckily, Alice has another trick up her sleeve…

So far we’ve discussed adding and multiplying modulo 26. But what’s to stop you doing both? Pick a multiplier k and a shift n, and now we have C kM + n mod 26. This is far harder to break with trial and error – we now have 26 x 12 = 312 ciphers total. Eve wouldn’t want to try a pen and paper attack on this. She can still break it, though, and in a minute we’ll discuss how she might go about that. Anyway, this kind of multiply then add cipher is called an affine cipher.

Perhaps you’re wondering what happens if we change the order of the operations – add and then multiply – could we get more ciphers? Unfortunately, that doesn’t get us anywhere new. To see why, take k = 3, n = 5. Now we have C 3(M + 5) mod 26, which gives us C 3M +15 mod 26. But this is just another affine cipher, so no cigar. 

Before you read this section, take a couple of minutes to think about how you might break the above kind of cipher. Obviously there’s a sure-fire way of finding out the original message once you know it’s a simple substitution cipher: try every cipher of this type and see if you get something coherent. This isn’t the most efficient though – in fact, there are *26! = 403291461126605635584000000 different substitution ciphers. Notice, though, that each letter is replaced with exactly one other letter, and that this rule is consistent throughout the ciphertext. This enables an approach known as frequency analysis: counting up how many times each letter occurs in the ciphertext and comparing these numbers with how many times each letter occurs across text in the target language. This is actually a pretty intuitive method – in English the letter e occurs most frequently, so it makes sense to guess that the letter occurring most frequently in the ciphertext corresponds to this (if you’ve ever done a codeword puzzle, you’ll recognise this). 

Let’s try and decode the following message:

fcmvt cunmu fcnav llfcc aqccf ogtxq mxjlv zivgv fovos xqqif ogvod fqetx jvlqe cvtfc mnocv tfbfc uvsxq tctqr ncufo gvost ununv asfcr eccna fogcq fctnx bcuns ezunt tcuns ezunt tqurj snvay vmtqu rjbea vosmu ftina ttunx xgncr nndnz ecnsv ttean vtbna anctv anbna anctm unanz vofuv pnsaq yynsc unrfm qosna vxfzn gentt nsfov rqrno ccuvc fcmvt xqqif ogbqa cunbv ovosc unyvf aqbmu fcnif sgxqp ntvos tunpn ajgqq sovce ansxj lngvo ueocf ogvlq ecbqa cunrl eccun jmnan oqmun ancql ntnno npnaj cufog tnnrn scquv pnzuv ognst foznu natmf rfocu nyqqx voscu nganv cuvxx mfcuc ungxv ttcvl xnvos cunxf ccxns qqauv spvof tunsz qryxn cnxj  

It might look daunting because it’s so long, but actually the longer the text, the better! Statistical methods like frequency analysis have higher accuracy for longer samples, because the proportion of anomalies will tend to decrease. In the above text, the letter n has the highest frequency, followed by c, v and u. We can guess from this that these letters correspond to some selection of e, t, a, i, o, r: these are the most common letters used in English. Here is a full frequency table of letters in English. As an exercise, try using it to decrypt the above puzzle.

This method works better on long texts without too many technical words: frequency analysis on an enciphered version of “a dizzy x-ray fish snoozed” probably wouldn’t get you very far initially because generally uncommon letters like z and x are overrepresented. Something you could try instead on this text is looking for doubled letters. Using the cipher in Fig. 1, we see this becomes zwraa bcizb urhsh mllav w. Here we know that the letter a in the ciphertext is unlikely to correspond to a h in the plaintext as “hh” occurs nowhere in English. It would only be possible if one word ended in h and the next began with h (since we standardised word length in the ciphertext). So a potential approach would be to guess initially that the double letters are “oo”, “ll”, “ss”, and so on. 

The last thing to note about this kind of cipher is that if the attacker has a sample of plaintext and has access to the corresponding ciphertext, then the whole text – and any future texts using this cipher is blown open. This is known as a ciphertext only attack, and the reason that this particular cipher is so vulnerable is that once the encryptor and decryptor have agreed a version of the cipher to use, they have no way of varying it without exchanging some extra bit of information. We’ll talk about this a bit more in the next article, so see you there!

*26! = 1 x 2 x 3 x … x 25 x 26. You can see there are exactly this many ciphers by first noting that no two letters can correspond to the same letter under the cipher, because this would be totally unhelpful when it came to decoding (for an extreme example, imagine if every letter mapped to Z!). So if we start at A, we can map A to any letter of the alphabet: there are 26 choices for A will go. But for B we only have 25 choices (26 minus 1 for the letter A was mapped to). And so on: 24 choices for C, 23 for D until there is only one choice for Z. Multiplying all these choices together gives us 26! substitution ciphers. Note also that not every substitution cipher is affine. Can you think of one that isn’t?

Article 2: RSA Cryptography

Article 3: Elliptic Curve Cryptography


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s