Cryptography

RSA

GCD

The GCD of two integers is the largest integer that divides both of them without leaving a remainder. For RSA, the GCD is crucial for ensuring that the chosen public and private keys are valid. The GCD of two numbers a and b is denoted as gcd(a,b). AN efficient way to compute the GCD is using the Euclidean algorithm.

Euclidean Algorithm: 1. Given two integers a and b, where \(a \geq b\): 2. Compute the remainder r when a is divided by b: r = a mod  b. 3. Replace a with b and b with r. 4. Repeat steps 2 and 3 until r=0. The non-zero remainder just before r=0 is the GCD.

This can also be represented as: \[gcd(a,b)=gcd(b,a\:mod  b)\] Which is equivalent to:

def gcd(a, b): 
    if b == 0: 
        return a 
    else: 
        return gcd(b, a % b)

Modular Inverse

In modular arithmetic, the modular inverse of an integer a mod m is an integer b such that:

\[a \cdot b≡1(mod\:m)\]

For RSA, the modular inverse is used to find the private key from the public key. Specifically, if e is the public exponent and \(\phi(n)\) is Euler’s totient function of n (where n is the product of two large prime numbers), the private exponent d is the modular inverse of e modulo \(\phi(n)\).

Finding the Modular Inverse:

  1. Use the Extended Euclidean Algorithm explained above to find integers x and y such that:

\[e \cdot x+ \phi(n) \cdot y=1\]

  1. The value of x is the modular inverse of e modulo \(\phi(n)\).

Or mathematically: \[d≡e^{−1}(mod\: \phi(n))\]

Example

Let’s say we have e=7 and \(\phi(n)=40\). We need to find d such that: \[7 \cdot d≡1(mod\:40)\]

Using the Extended Euclidean Algorithm:

  1. 40 = 7 ⋅ 5 + 5
  2. 7 = 5 ⋅ 1 + 2
  3. 5 = 2 ⋅ 2 + 1
  4. 2 = 1 ⋅ 2 + 0

Back-substitute to find x and y:

  1. 1 = 5 − 2 ⋅ 2
  2. 2 = 7 − 5 ⋅1
  3. Substitute 2 in the first equation: 1 = 5 − 2 ⋅ (7 − 5) = 3 ⋅ 5 − 2 ⋅ 7
  4. Substitute 5 from the first step: 1 = 3 ⋅ (40 − 7⋅5) − 2 ⋅ 7 = 3 ⋅ 40 − 17 ⋅ 7

And so, x = −17 and since we need a positive d, we take d = −17 + 40 = 23.

So, d=23 is the modular inverse of 77 modulo 40.

Code

def gcd(a, b):
    if b == 0:
        return a
    else:
        return gcd(b, a % b)

def modInverse(e, phi):
    m0 = phi
    y = 0
    x = 1
    if phi == 1:
        return 0
    while e > 1:
        q = e // phi
        t = phi
        phi = e % phi
        e = t
        t = y
        y = x - q * y
        x = t
    if x < 0:
        x = x + m0
    return x

The recursive GCD code is self explanatory. As you can see modInverse uses the extended euclidian algorithm as explained above. The loop follows the same steps in repeat until e is 1. q is short for the quotient of e divided by phi. t temporarily holds phi in the calculation while phi is updated to \(e\: mod\: \phi\). Similarly t temporarily holds y while y is updated to x - q * y. And just like the last step of the algorithm explained before, if x is negative, it is adjusted to be positive by adding m0.

Putting it all together

def generate_keypair(p, q):
    if not (is_prime(p) and is_prime(q)):
        raise ValueError("Both numbers must be prime")
    n = p * q
    phi = (p - 1) * (q - 1)
    e = random.randrange(1, phi)
    g = gcd(e, phi)
    while g != 1:
        e = random.randrange(1, phi)
        g = gcd(e, phi)

    d = modInverse(e, phi)

    return ((e, n), (d, n))

All of the above steps, GCD, Modular Inverse as well as primality which I haven’t bothered to explain are all put together for the ultimate step which is to generate keypairs for the RSA algorithm. It first ensures p and q are prime as well as computing their product n. Similarly phi is always the product of n-1 and q-1. e has to be a random number in the range \(1 < e < \phi\), and the GCD of this random number with phi must be 1(meaning e and phi are coprime). Finally ((e, n), (d, n)) are the keypairs themselves where the first tuple is the public used for encryption and the second tuple is the private key used for decryption.

Modular exponentiation

Modular exponentiation is taking a base raised to an exponent, and then taking the remainder when divided by another number. For example: \(3^4 mod 19 = 5\)

But using very large numbers like 5628496 mod 7643 makes this task computationally expensive.

How to do it efficiently?

perform the modulo operation at each step 31 mod 19 = 3 mod 19 = 3 32 mod 19 = 3 * 3 mod 19 = 9 33 mod 19 = 9 * 3 mod 19 = 8 34 mod 19 = 8 * 3 mod 19 = 5 etc..

This makes it easy to calculate a modular arithmetic but it is significantly harder to do the reverse and go backwards.

Discrete logarithm problem

3x mod 19 = 5 If you were asked to take the result 5 and the base 3 and calculate x, it would be very difficult to calculate the value of x since there are so many cycles to choose from using modular arithmetic.

This is the discrete logarithm problem.

Primitive root modulo n

To ensure there are as many distinct remainders as possible, you need to choose a primitive root modulo n. A primitive root modulo n generates all the integers from 1 to n−1 that are relatively prime to n when raised to successive powers.

For example, consider modulo 19.

This property is more secure in cryptographic applications because using a primitive root ensures a larger set of potential shared secrets. If the generator doesn’t produce many different remainders, the possible values for the shared secret are limited, making it easier to guess or compute.

Diffie-Hellman

Explanation

The following is Diffie-hellman key exchange algorithm but at different levels of difficulty, the first example uses the swapping of paint, the second example uses simple arithmetic. ### Paint example - First you and your friend agree on a set of colors (lets say red and blue) - You each secretely pick a color from the set (lets say you choose yellow and your friend chooses green) - You both mix your secret color with the agreed upon colors. So your red and blue becomes a yellowish red and a yellowish blue. - You swap these mixtures with your friend. So now you have their greenish red and greenish blue. - Next, you mix your secret color (yellow) with the your friends mixture, and your friend does the same with your mixture. - Finally you both end up with a new mixture that both match eachother. This shared secrete mixture can be used as the code or the public key for your communication.

Diffie-helman key exchange does the above, but the real algorithm uses large numbers and Modular-exponentiation instead of colors and mixing.

Addition

This is still extremely weak though because addition is not a secure operation, since it can be inverted easily.

Modular-exponentiation

This is only secure if: - the modulus is a big number(thousands of bits long). - the modulus is a prime number(harder to factor primes) - the base should be big enough such that there are as many remainder values as possible(cycles)

Parameters

Public parameters

p and \(\alpha\) are the main paremters that are known to everyone p is a prime and \(\alpha\) is an integer ## Private key a = KprA \(\in\) {2, 3, … , p-2} b = KprB \(\in\) {2, 3, … , p-2} these are chosen randomly ## Public key A \(\equiv\) \(\alpha\)a mod p = KpubA B \(\equiv\) \(\alpha\)b mod p = KpubB

KAB \(\equiv\) Ba mod p KAB \(\equiv\) Ab mod p

y = AES(x) -> AES-1(y) = x

Elliptic Curve

Intro

  1. An elliptic curve is chosen and made public, this is defined with a math equation
  2. A base point on the curve is selected and also made public
  3. To generate a private key, a random number is chosen
  4. The public key is created by multiplying the base point by the private key number (point multiplication)

bits needed

ECC RSA
256 3072
384 7680

Cofactor

The cofactor of a curve is the ratio of the number of points on the curve and the order of the base point.

Elliptic curves are defined over finite fields ### Points on the curve These are the points that satisfy the equation of the curve, the number of points is finite since we are in a finite field.

Base point

A base point(or generator point) is a specific point on the curve. The base point generates a cyclic subgroup of the elliptic curve group.

Order of the base point

The order of the base point is the number of times the base point can be added to itself (using point addition) until it reaches the point at infinity on the curve.

Visual

This graph is created with y2 = x3 + ax + b or y2 = x3 +ax2 +bx ECC-graph

Point addition

A dot B -> C(negative C specifically) Also A dot B dot C = 0 point doubling is when you add two points that are the same, so A dot A -> B(negative B) Point doubling A n number of times -> Z

The max line is set arbitrarily on the x axis which is a very large number. The max value is simply the key size

Algorithm (explained here)

You keep repeating this process of doing A dot X n times. nA = E Where nA is A dot A dot A...

It’s difficult to find the number of times you used the dot operation on itself, or the number n, which sets the basis for the trapdoor function. And so the private key is the value n. This is the same complexity as the discrete logarithm problem in Modular Exponentiation

Order independence

In diffie helman it has order (Gn)c = (Gc)n

But in Elliptic curves c(nG) = n(cG) is valid

Example:

Nick: nG = Hn Connie: cG = Hc

Swap results.

Nick: nHc = S Connie: cHn = S

Where S is the public key.

Downsides

Code

def generate_key_pair(params):

    # p (prime field), a (coefficient), b (coefficient).
    p, a, b = params
    # find a valid generator point on the elliptic curve
    generator = find_point_on_curve(p, a, b)
    # create elliptic curve object using a, b, p, generator
    curve = ec.Curve(a, b, p, generator)
    # generate a random private key as a random integer
    private_key = secrets.randbelow(curve.field.n)
    # public key = private_key * generator
    public_key = private_key * curve.g

    return private_key, public_key

The complete code for ECC would be too long for this section, however the most important part of the algorithm happens at this stage here. The function unwraps the elliptic curve parameters which should be generated by the overall algorithm. These parameters are valid if a successful generator point can be found on the elliptic curve. All four parameters together are required to create an Elliptic Curve Object which allows you to correctly perform elliptic curve operations. Now that we have an object, we can easily get n which is the total number of points on the elliptic curve. The private key is simply any random number up to n, and the public key is simply the product of that random number with the generator point. And thats it. Once these values are selected, they are thrown into a larger pipeline of security steps which uses AES or DES or any other Symmetric encryption method.