Finding the prime factors of a large number efficiently with the help of Shor’s algorithm

Rajiv Krishnakumar
23 min readJul 29, 2021

--

Image courtesy of Kathleen Martsch

Hello! In this page, we will be looking at how to use Shor’s algorithm (which is an algorithm that can only be run with the help of a quantum computer) to find the prime factors of a large integer. If you’ve stumbled across this page, there’s a good chance you’ve already heard that prime numbers are important for cryptography (if you haven’t I highly recommend this 3 minute youtube video.) There’s also a good chance that you already know that this problem can’t be solved using only regular or classical computers for any reasonably sized number (and if not, I would highly recommend this other 3 minute youtube video). So with that context established, let’s get to it!

Most of this material has been taken from chapter 5 of Ronald de Wolf’s lecture notes on quantum computing, with other bits and pieces taken from a few other sources, notably Umesh Vazirani’s lecture notes on Shor’s algorithm and chapter 10 of Robert Sutor’s textbook. A full list of references can be found at the end of this article. Also any comments/feedback/suggestions are always welcome :) feel free to hit me up on twitter @rajkk1.

The Problem

Let’s first start by clearly defining the problem we want to solve: given a positive integer a, can we find all its prime factors, which we will call x₁, x₂, x₃ etc. Prime factors of a are defined as the set prime numbers x₁, x₂, x₃ etc. where x₁ × x₂ × x₃ ×… = a.

A high level overview of the algorithm to find a single prime factor: Shor’s algorithm

Before delving into the whole end-to-end algorithm, it is useful to fully understand the algorithm to find a single prime factor of a. This is what is typically called Shor’s algorithm, which was first published by Peter Shor in 1994. To start with, we will state the steps and the name of the technique used in each step without going into the details. The details of each technique will be discussed in the next sections. So the steps for finding a prime factor of a are:

  1. Is a even? If yes, 2 is a prime factor and you’re done. If not, go to the next step.
  2. Check if a is a prime number by using any of the efficient classical primality test algorithms. If yes, you’re done. If not, then go to the next step.
  3. Pick a random integer k between 2 and a-1. Find the greatest common divisor g of k and a using Euclid’s algorithm (see [1] in footnotes for the definition of the greatest common divisor). Is g=1? If not, go step 4. If g=1, then go to step 5.
  4. So at this stage, you have found a factor g of a. Check if g is a prime number (again by using any classical primality test algorithm). If yes then g is a prime factor and you’re done! If not you have a non-prime factor. However we can now find the greatest common divisor of a and g (again using Euclid’s algorithm), which we call g’. Because g’ is a factor of a, a prime factor of g’ is also a prime factor of a. So now we find a prime factor of g’ by going back to step 1 of the algorithm but replacing a with g’.
  5. If you’ve gotten to this stage, you have selected a random integer k < a that is coprime with a. i.e. there is no number apart from 1 that divides both k and a without a remainder. In this scenario, we can use this as an input to a separate algorithm (which involves using a quantum computer to solve a related period finding problem) to find a factor of a which we will call g’. We will get into the details of this step later, but right now think of it as a black-box algorithm which takes in k and a and outputs g’. We check if g’ is prime (once again using any classical primality test algorithm). If it is, you’re done. If not, then once again we know that since g’ is a factor of a, a prime factor of g’ is also a prime factor of N. So now we find a prime factor of g’ by going back to step 1 of the algorithm but replacing a with g’.

And that’s it! As you can see, I have left out a lot of details in the above steps. In particular I have not described Euclid’s algorithm, nor have I defined what a period finding problem is, let alone explained which period finding problem to solve, how to solve it using a quantum computer or how solving it can ultimately be used to obtain a factor of a. This is what we will get to in the next sections.

A quick side note before we move on: after reading all this, it might seem to you that the use of a quantum computer, although a crucial part, is only one part of the complete algorithm, and that all the other steps can be solved using just a classical computer. Well that’s absolutely correct, so well done you for realizing that! (And if that didn’t cross your mind, well done you for learning it now!)

OK now on to Euclid’s algorithm and the period finding algorithm.

Euclid’s algorithm

Euclid’s algorithm is a purely classical algorithm used to find the greatest common divisor of two positive integers a and b where a>b. It does this by first dividing a by b and finding the remainder r₀. It then divides b by r₀ and finds the remainder r₁. It then divides r₀ by r₁ and finds the remainder r₂. It continues this procedure until it reaches a remainder rₙ= 0. When this stage is reached, we know that the previous remainder rₙis the greatest common divisor of a and b. Because there are only a integers between 1 and a, and because every remainder rᵢ is strictly smaller than the previous remainder rᵢ(usually by a significant amount), this algorithm is efficient even if a is large. In particular, it is of order log(a). Below is an example showing how Euclid’s algorithm is used for finding the greatest common divisor of 78 and 66.

Image courtesy of inchcalculator.com

This algorithm was first described in Euclid’s book titled Elements in 300 B.C. For rigorous proofs, history and generally more complete information on Euclid’s algorithm, I would highly recommend looking at the algorithm’s wikipedia page.

The period finding problem

Here we will first start by defining what a period of a sequence is, and then in the next section discuss how finding it relates to finding prime factors of a number.

Let us use the same notation as we have been using in step 5 of our algorithm: we have integers k and a which are coprimes, where k < a. We can now define the following sequence:

f(0) = k⁰ mod a

f(1) = k¹ mod a

f(2) = k² mod a

where mod a stands for “modulo a” and means “divide by a and take the remainder”. We know that f(x) < a since we are always dividing by a before taking the remainder. It also turns out that f(x) is periodic, by which we mean the values of f(x) will have a pattern of repeating itself. Let’s take a simple example of k = 2 and a = 5. In this case we have

f(0) = 2⁰ mod 5 = 1

f(1) = 2¹ mod 5 = 2

f(2) = 2² mod 5 = 4

f(3) = 2³ mod 5 = 3

f(4) = 2⁴ mod 5 = 1

f(5) = 2⁵ mod 5 = 2

f(6) = 2⁶ mod 5 = 4

As you can see, this pattern repeats itself in chunks of 4 (with the values 1, 2, 4, 3). We say that this sequence has a period of r = 4. More formally we can write this as f(x) = f(x+4). Now at first it might seem pretty straightforward to find the period r for any sequence: just calculate f(x) for increasing values of x until you come back to f(x)=1. However the example we showed was particularly simple because our value for a was quite small. Typically we would have large numbers for a (e.g. a 2048-bit number, so a number as large as 2²⁰⁴⁸) and r is typically of order a, so using this method to find an x where f(x)=1 (other than x=0) could take a while! In fact, there is no efficient classical algorithm for finding a period of any generic discrete periodic function f(x). However, using a quantum computer, there is an efficient algorithm. But before discussing that algorithm, let us first look at how the period finding problem relates to finding factors of a number.

How the period finding algorithm relates to prime factors

Again let’s go back to step 5 of our algorithm where we have integers k and a which are coprimes and where k < a. Let’s also use the periodic function f(x) from the previous section defined as

f(x) = kˣ mod a

and assume we can find the period r of this function (using a quantum algorithm, which I promise we will get to in the next section!). We also assume that r is an even number. In the actual end-to-end algorithm, if we find an r that is odd, we have to go back to step 3 and choose a different random integer k. This is all addressed in the end-to-end algorithm section below. But for now, we don’t have to worry about it and can assume r is even. We know that

f(0) = k⁰ mod a = 1

for any k and a. Since r is the period, we also know that f(0) = f(0+r) = f(r), so

f(r) = kʳ mod a = 1

Let us now define s = r/2. We can use this to go through the following steps:

kʳ mod a = 1 (as shown above)

(kˢ)² mod a = 1

((kˢ)²-1) mod a= 0

(kˢ+1)(kˢ-1) mod a = 0

(kˢ+1)(kˢ-1) = ca

where c is some positive integer. Let us now define g₁ = (kˢ+1) and g₂ = (kˢ-1). If we take the greatest common divisor (again using Euclid’s algorithm) of g₁ and a, which we will call g’, we have now found a factor of a. If g’ is a prime number, then you’re done! If it isn’t we can go back to step 1 of our original five step algorithm but replace a with g’. In the end-to-end algorithm, if we obtain a g₁ and g₂ that are multiples of N, we will get that g’ = N which isn’t very useful. In this case we again would have to go back to step 3 to pick another random integer k.

The main message here is that it just so happens that we can easily obtain a factor of a if we know the period r of the sequence f(x) = kˣ mod a. This insight was originally discovered by Gary Miller which he described in his paper published in 1976. Now onto the details of the quantum algorithm that solves this problem!

Solving the period finding problem using a quantum algorithm

We finally get to the quantum part! This part was the biggest insight by Shor in his paper which led to the whole algorithm being possible. The general problem that this quantum algorithm solves is the following:

Find the period r of a periodic discrete function f(x)

In the context of finding prime factors, the periodic function we are interested in is f(x) = kˣ mod a, but this algorithm will generally work for any discrete periodic function, so we will look at the general case. This is where we now pick up our quantum computer.

To start with, let us initialize two registers: register 1 of p qubits and register 2 of q qubits, where p and q are some decent sized number (see [2] in the footnotes if you would like to know the conditions of p and q being ‘decently sized’):

We then put register 1 in a superposition state by acting on each qubit with a Hadamard gate:

where here P = 2ᵖ. We now use a black-box unitary U that computes the function f(x) on a quantum computer. From a theoretical perspective, it is OK to just state that we have such a unitary, because any function that can be implemented efficiently in a classical computer can also be implemented efficiently in a quantum computer (see sections 2 and 3 of these lecture notes for the proof). However, just because we know it is doable efficiently, it does not mean it is trivial. For example, implementing the modulo exponentiation function f(x) = kˣ mod a is quite involved (see [3] in the footnotes). However for the purposes of this blog, we will assume we have figured out the quantum circuit that implements the unitary U such that

So now we apply U to our superposition state to obtain:

Now we make a measurement on register 2, which causes it to collapses to a single value of f(x) which we will call l. This will cause register 1 to collapse into a superposition state which will only contain the values x for which f(x)=l (so if we look at our particular periodic function, this would be the values of x for which mod a = l). Let’s define the lowest value of x for which this equation holds as s. Then all values of x for which this equation holds are s, s+r, s+2r, s+3r,…, s+(P/r-1)r. Here we have assumed that we were lucky and happend to choose a value of P which is exactly divisible by r. We have made this assumption in order to make the rest of the procedure easier to go through. In practice, this is often not the case, and we would have to add a few subtleties to handle the fact that P may not be exactly divisible by r, as well as take approximation errors into account (see [4] in the footnotes for more details). But the overall procedure of the algorithm remains the same in either case, so for the purposes of this blog, we will continue with this assumption. Therefore our state after the measurement becomes

Now we will take the quantum Fourier transform of register 1. The quantum Fourier transform unitary QFT is defined as

where

We can think of the quantum Fourier transform as changing a register from its binary representation to its frequency representation. A more in-depth and visual description of the quantum Fourier transform and the circuit to implement it can be found in chapter 3.5 of the Qiskit textbook. Taking the Fourier transform of register 1 gives us

The biggest subtlety in the calculation above is going from the second to last step to the last step. To understand it, let’s take a look at the term

If rn/P is an integer, then the term above = 1. This will be true if n is a multiple of P/r. We will start by focusing on the cases of these special n’s. Now let’s look at the whole second to last line the calculation above. In the first sum, n goes from 0 to P-1 so we know that there are exactly r terms which fullfill this special n condition (n=0, n=P/r, n=2P/r,…, n=(r-1)P/r). For each of these special n’s, there are P/r repeats of these terms. This is because the term in the second sum

will always be 1 regardless of j for these special n’s, and j goes from 0 to P/r-1. Therefore, if we combine the P/r terms for each of the r special terms for which n is a multiple of P/r, we will end up with each of the r terms having an amplitude of

So right now we have computed the amplitudes for the r terms of the special n’s, and we have found that each of these terms has an amplitude of 1/√r. So the probability of collapsing to a specific one of those terms when making a measurement is 1/r. And since there are r such terms, the probability of collapsing to any of those terms is r⋅1/r = 1. So there cannot be any other terms in our final state! So even though we have not explicitly calculated any of the terms for which n is not a multiple of P/r, we can already conclude that all those terms must have 0 amplitude, and hence don’t appear in the final expression. So in the end we are left with the state

Once we have reached this stage in the algorithm, we then measure register 1, which gives us a value of b ⋅ P/r for some b<r. Now we just have a couple of more steps left, but we are already done with our quantum computer, and can put it away! Below is the circuit to implement the whole quantum part of Shor’s algorithm:

At this stage in the algorithm, we have found a value for b ⋅ P/r. We then find the greatest common divisor of b ⋅ P/r and P (using, you guessed it, Euclid’s algorithm!). This will result in a value of P/r because there is a high likelihood that b and P/r are coprimes (see [4] in the footnotes for the proof). Given that we have a value for P/r (with a high probability), and that we know P (because we chose it initially), we can now infer r. Finally, we have to check that that our inferred solution is in fact the period r (by checking f(x+jr) for a few different values of j). If it is, then we’re done! If it isn’t, it means we were unlucky and our b was not a coprime with P/r. In this case we just repeat this period finding algorithm from the beginning and hope that after the final measurement step, we obtain a different value for b ⋅ P/r where b and P/r are coprimes.

And that’s it! Using this algorithm to find the period r is much more efficient than any classical way. It is of order (log₂(N))², where most of the complexity comes from implementing the unitary U to compute the periodic function f(x) (see [5] in the footnotes for more details on the complexity).

So with that, we have fully described Shor’s algorithm and discussed all its (quantum and classical) steps! Now we can use it to go from finding a single prime factor of a to find all the prime factors of a. But before that, I thought it may be useful to discuss a bit about the intuition behind the quantum part of the algorithm.

Quantum intuition (or lack thereof)

Let’s start by summarizing the quantum part of the algorithm. We first created a state that was in superposition of different values of x ranging from 1 to P-1. Then we computed f(x) for each of these values at the same time in superposition. Then we made a measurement which only kept the x’s for which f(x) = l. These x’s were s, s+r, s+2r,… At this point, if we could read out any two consecutive elements of the superposition (e.g. s+r and s+2r) then we would already be able to compute r and there would be no need for the quantum Fourier transform! However we cannot simply read out two elements because if we make a measurement to try and read them out, the superposition state that we have collapses to a single state

for just a single value of j. This situation of having a solution in a quantum state but not being able to read it out directly is a common problem in many quantum algorithms. It’s also unique to quantum algorithms. In classical algorithms, if you have your solution stored in classical bits, you can read out or measure your bits without changing them, because they are never in superposition. So what we need is an efficient method to extract useful information from this state. For the period finding algorithm, this efficient method is the quantum Fourier transform. Although in the previous section we have rigorously computed the way it works, to be completely honest I do not have a great intuition for why the Fourier transform performs the task so well for this algorithm. I would maybe recommend Scott Aaronson’s lecture notes on Shor and the quantum Fourier transform as a good place to start answering that question.

Another perspective on the period finding algorithm

It turns out that this algorithm is also the quantum phase estimation algorithm in disguise! A detailed description of this algorithm can be found in [6], but the general idea of is that given an operator U defined as

you can find the value for θ using controlled U operators along with some other basic gates. But what does this have to do with the period finding algorithm? Well let’s now define U as

where k and a come from the problem that we are trying to solve, which is finding the period of the function f(x) = kˣ mod a (notice that the value of y does not play an important part). Then it turns out it also has the equivalent definition of

for some specific uₛ, s and the period r of f(x). This equivalency is described in more detail in [7]. So we can find the value of s/r using the phase estimation algorithm, and then use the method in [4] to extract the value of r (because this can be thought of as the case where we did not choose a convenient P). This (although not immediately obvious, but eventually becomes clear if worked through with a lot of effort!) is effectively what is being done in Shor’s algorithm.

I have purposefully kept this section brief and brushed many details under the rug since I just wanted to give you an outline of how Shor’s algorithm can be thought of as an instance of the phase estimation algorithm without going into too much detail. However an interesting fact is that the phase estimation algorithm was introduced by Alexei Kitaev in 1995, which was one year after Shor discovered the prime factorization algorithm! And Shor does not explicitly refer to or describe the phase estimation algorithm in his original paper. So although Shor did effectively use the phase estimation algorithm in his algorithm, seemingly he did it through a different intuition without being aware that he was using the phase estimation algorithm!

OK so now let’s get on to finishing this problem of finding prime factors: given a number a, let’s go on to finding all the prime factors!

Going from finding a single prime factor to all the prime factors

Now that we have understood all the parts to finding a single prime factor of an integer a, let us elaborate on how to find all the prime factors of an integer a. The general intution is that if we find a prime factor x₁, we now know that a = x₁ × b, where b is some integer. Therefore we now find a prime factor of b (say x₂) since a prime factor of b will also be a prime factor of a. We now have a = x₁ × x₂ × c, where c is some other integer. Again we can now find a prime factor of c. We keep repeating this process until we get to a stage where the number we want to find a prime factor of is itself prime. This procedure will give us all the prime factors of a.

The end-to-end algorithm to find all the prime factors

In this section we formalize the whole end-to-end algorithm to find all the prime factors. This is essentially the same as the initial 5 step algorithm to find a single prime factor, but also efficiently makes use of the factors that we already obtained to obtain subsequent factors, by adding a couple of extra steps.

Given a positive integer a, we want to find all its prime factors x₁, x₂, x₃ etc.

Set N := a

  1. If N is an even number, you straight away know that 2 is a prime factor. So we set an xᵢ = 2. Then we set N := N/2 and start again from step 1 to find the next prime factor.
  2. Check if N is prime using any of the efficient classical primality test algorithms. If this is the case, then set an xᵢ = N and you’re done and can end (this branch of) the algorithm! If not, move on to step 3.
  3. At this stage, we have the number N that is odd and not a prime number. Now pick a random integer k between 2 and N-1. Then use Euclid’s algorithm to find the greatest common divisor g. If g > 1, then go to step 4. If g = 1, then k and N are coprimes. If that’s the case, then go to step 5.
  4. At this stage we have a factor g of N. Check if g is prime by again using any of the efficient classical primality test algorithms. If it is, we have found another prime factor! So we set xᵢ = g. Then we set N := N/g and start again from step 1 to find the next prime factor. If g is not prime, then move on to step 6.
  5. At this stage we know that k and N are coprimes and that k < N. Run the period finding algorithm (with the help of a quantum computer) to find the period r of the sequence k⁰ mod N, k¹ mod N, k² mod N, etc. If r is an odd number, then you cannot go further and have to go back to step 3 and pick another random integer k. If on the other hand r is an even number, compute s = r/2. Then we can show that g₁ = kˢ+1 and g₂= kˢ -1 are both factors. So we have found two more factors g₁ and g₂! We then go to step 7.
  6. At this stage, we have a non-prime factor g of N. We know that N and g must share a common factor. So using Euclid’s algorithm again, we can find the greatest common divisor of g and N, which we will call g’. We then go back to step 1 and set N := g’ to continue finding a prime factor. In addition, we also know that N/g’ is also a factor of N. Therefore, we start another algorithm from step 1 again but this time setting N :=N/g to find an additional prime factor in a parallel branch.
  7. In this step, we have two factors g₁ and g₂ of N. Following a similar logic as in step 6, we find the greatest common divisors of g₁ and N (which we call g₁’) and g and N (which we call g’ ). Note that if g₁ and g₂ are multiples of N, then we will get that g₁’ = g = N which isn’t very useful, in which case we have scrap the whole thing and go back to step 3 to pick another random integer k. If this is not the case, we then start two algorithms again from step 1, where in one we set N := g₁’ and in the other we set N := gto find two prime factors in a parallel branch. We also know that N/( g₁’ g) is a factor of N. Therefore again we start a third algorithm from step 1 setting N :=N/(g₁’ g) to find a third prime factor in yet another parallel branch. Quick note: you might have observed by this stage that it is possible to get through all the steps and fail at obtaining a factor, and having to go back to step 3 to pick another random integer k. This occurs if (1) r is odd or (2) g₁’ = g = N. However the probability of that happening at each iteration is ≤ 1/2 (see [8] for the proof).

So when can we actually use this algorithm practically?

The main issue with implementing Shor’s algorithm is that we require a very powerful quantum computer for the period finding problem step. A practical use-case of finding prime factors of a large number is to break RSA-2048 encryption, where the prime number a used in the encryption is represented by 2048 bits. In order to break this, we could do it with a quantum computer that had 4099 perfectly noiseless qubits. However, given that qubits always have some intrinsic noise and the fact that we need additional qubits to implement state-of-the-art error correction code to counteract this noise, a realistic quantum computer that could run the algorithm would require roughly 20 million qubits (see [9] in the footnotes for more detail on where these qubit numbers come from). In addition, this computer would need to be able to run very deep circuits. Currently in 2021, state-of-the-art quantum computers have about 60 qubits and can only run relatively shallow depth circuits. So yeah, we’re still a while away. But who knows, given that there has been so much progress recently in the hardware, it may be less far away than we think! I will leave the conclusion of this section intentionally very vague :)

Thanks for reading, and as always feel free to reach me @rajkk1 on twitter with any questions!

Thank you to Jaya Krishnakumar, Raghu Mahajan, Raga Krishnakumar & Greg Bronevetsky for proof-reading and providing feedback. And thank you Dave Clader for pointing me to the resources concerning the equivalency of Shor’s algorithm and the phase estimation algorithm. Y’all rock!

Footnotes

[1] From the wikipedia article on greatest common divisor: In mathematics, the greatest common divisor (GCD) of two or more integers, which are not all zero, is the largest positive integer that divides each of the integers. For example, the GCD of 8 and 12 is 4.

[2] A discussion on the size of the qubit register as well as the general case for when P is not exactly divisible by r is written in sections 3.2 and 3.3 of Umesh Vazirani’s lecture notes on Shor’s algorithm.

[3] This paper by Rhines & Chuang describes how to implement modulo exponentiation on a quantum computer.

[4] See “Easy case” and “Hard case” on page 40 of Ronald de Wolf’s lecture notes on quantum computing.

[5] A discussion of the complexity can be found in the first few paragraph of the wikipedia page for Shor’s algorithm.

[6] A great description of the phase estimation can be found starting on slide 20/39 of the Quantum Algorithms Lecture by Joran van Apeldoorn and Arjan Cornelissen.

[7] The details for the equivalency of the operator U can be found in Jonathan Hui’s blog post.

[8] A proof is described in the footnote on page 36 of Ronald de Wolf’s lecture notes on quantum computing.

[9] This paper by Stephane Beauregard and this paper by Gidney & Ekerå calculate how many qubits are required to implement Shor’s algorithm with noiseless and noisy qubits respectively.

References

  1. Ronald de Wolf’s lecture notes on quantum computing
  2. Umesh Vazirani’s lecture notes on Shor’s algorithm
  3. Robert Sutor’s textbook on quantum computing
  4. Peter Shor’s original publication of his algorithm
  5. Gary Miller’s paper containing the derivation that the factoring problem can be mapped onto the period finding problem
  6. Scott Aaronson’s lecture notes on Shor and the quantum fourier transform
  7. The wikipedia page on Euclid’s algorithm
  8. The quantum Fourier transform chapter of the Qiskit textbook
  9. The wikipedia page on Shor’s algorithm
  10. Stephane Beauregard work on number of noiseless qubits required to run Shor’s algorithm
  11. Gidney & Ekerå’s work on using a real quantum computer to break RSA-2048.
  12. Clip from ABC1's Catalyst on prime numbers & public key cryptography
  13. Clip from Udacity on the hardness of factoring
  14. Thibaut Devereux’s Medium article on typesetting in Medium
  15. Jonathan Hui’s Medium article on phase estimation and Shor’s algorithm

--

--

No responses yet