# Cryptography

Cryptography refers to secure information and communication techniques derived from algorithms that transform messages through various means such as modular arithmetic or bitwise operations, in ways that are difficult to decipher.

Note: General cryptography may refer to other things, but modern cryptography is typically heavily based on mathematical theory, and most commonly refers to plain-text encryption and decryption.

## Historical ciphers

### Rail fence cipher

• Shared secret: key $k\in\N$

#### Encryption

Plaintext written in columns of size $k$. The cipher text is the concatenation of the resulting rows.

Example:

• $k=4$
• $m=\text{THIS IS SOME RANDOM TEXT}$
• $c=\text{T S DTHIOROEISMAMXS EN T}$

#### Decryption

Ciphertext written in rows of size $\f{|c|}{k}$.

#### Analysis

• Key space size $k<|c|$: open to brute force attack due to small key space.

### Substitution cipher

• Shared secret: a permutation $\beta$ of the set of characters.

#### Encryption

Apply $\beta$ to each character of the plaintext.

#### Decryption

Apply $\beta^{-1}$ to each character of the ciphertext.

#### Analysis

• Key space size $|\mathcal{K}|=26! (\approx2^{88})$: brute force infeasible!

### Vigenère cipher

• Shared secret: key - a word $w$ over the English alphabet

#### Encryption

Break the plaintext $m=m_1m_2\ldots m_n$ in $\f{|m|}{|w|}$ blocks, and encrypt each block as follows:

Concatenate the resulting blocks to obtain the ciphertext.

#### Decryption

Break the ciphertext $c=c_1c_2\ldots c_n$ in $\f{|m|}{|w|}$ blocks, and decrypt each block as follows:

## Symmetric Encryption

### Symmetric encryption schemes

A symmetric cipher consists of two algorithms:

• Encryption algorithm: $E:\mathcal{K}\times\mathcal{M}\to\mathcal{C}$
• Decryption algorithm: $D:\mathcal{K}\times\mathcal{C}\to\mathcal{M}$

Such that $\fa k\in\mathcal{K}$, and $\fa m\in\mathcal{M}$:

That is, decrypting an encrypted message $E(k,m)$ with the decryption algorithm $D$, using the same key $k$ that was used to encrypt the original message, will produce the original message $m$.

The key $k$ is intended to be kept secret, between the sender and receiver of the message $m$.

### Good encryption schemes

An encryption scheme is secure against a given attacker if this attacker cannot:

• Recover the secret key $k$.
• Recover the plaintext $m$ underlying a ciphertext $c$.
• Recover any bits of the plaintext $m$ underlying a ciphertext $c$.

### Kerchoff's principle

Kerchoff's principle states that a cryptosystem should be secure even if everything about the system, except the key, is public knowledge.

• The encryption $E$ and decryption $D$ algorithms are public.
• The security relies entirely on the secrecy of the key.

### Attacker's capabilities

• A cryptographic scheme is secure under some assumptions, or against a certain type of attacker.
• A cryptographic scheme may be vulnerable to certain types of attacks, but not others.

• Known ciphertext attack: Some ciphertexts $c_1,\ldots,c_n$, all of which were encrypted using the same key.

The goal is to determine the plaintext for one or more of these ciphertexts or, better yet, to discover the key.

• Known plaintext attack: Some plaintext/ciphertext pairs $(m_1,c_1),\ldots,(m_n,c_n)$ such that $c_i=E(k,m_i)$—each plaintext was encrypted using the same key $k$.

The goal is to determine the key $k$.

• Chosen ciphertext attack: Access to a decryption oracle (can maybe trick a user to decrypt ciphertexts $c_1,\ldots,c_n$ of his choice)—can choose one or more ciphertext messages and get the plaintext that is associated with each one, based on the use of the same key $k$.

• Chosen plaintext attack: Access to an encryption oracle (can maybe trick a user to encrypt messages $m_1,\ldots,m_n$ of his choice)—can choose one or more plaintext messages and get the ciphertext that is associated with each one, based on the use of the same key $k$.

• Computational power: Unlimited, polynomial, or realistic ($\leq 2^{80}$) computational power.

### Pseudo-random number generators

A pseudo-random number generator is a method for generating a sequence of numbers that approximates the properties of a random sequence.

A desired property of a random sequence is that the numbers it generates are uniformly distributed.

### Brute-force attack

In a brute-force attack, the attacker tries all possible keys $k\in\mathcal{K}$. This requires some knowledge about the structure of plaintext.

#### Prevention

The main form of prevention for brute-force attacks is to make exhaustive searches infeasible:

• $\mathcal{K}$ should be sufficiently large - keys should be sufficiently long.
• Keys should be sampled uniformly at random from $\mathcal{K}$

### Perfect secrecy

A cipher $(E, D)$ over $(\mathcal{M}, \mathcal{C}, \mathcal{K})$ satisfies perfect secrecy if for all messages $m_1, m_2\in\mathcal{M}$ of the same length $(|m_1|=|m_2|)$, and for all ciphertexts $c\in\mathcal{C}$:

Where:

• $k\xleftarrow[]{r}\mathcal{K}$ ($k$ is a key sampled uniformly at random from the key-space $\mathcal{K}$)
• $\epsilon$ is some negligible quantity

• Shared secret: key $k\in\mathcal{K}$, where $\mathcal{M}=\mathcal{C}=\mathcal{K}=\set{0,1}^n$ (bit-string of length $n$)

#### Encryption

Perform a logical XOR ($\xor$) operation with the key $k$ and the message:

#### Decryption

Perform a logical XOR operation with the key $k$ and the ciphertext:

#### Proof that OTP satisfies perfect secrecy

First note that $\fa m\in\mathcal{M}$ and $\fa c\in\mathcal{C}$:

Where: $k\xleftarrow[]{r}\mathcal{K}$

Thus, $\fa m_1,m_2\in\mathcal{M}$, and $\fa c\in C$:

#### Limitations of OTP

• Key length: The key must be as long as the plaintext.

• Getting true randomness: The key should not be guessable from an attacker:

Example: 00000000 or 11111111 would be bad keys in $\mathcal{K}=\set{0,1}^8$.

• Perfect secrecy does not capture all possible attacks.

• Malleable.

### Stream ciphers

The motivation behind stream ciphers is to make the OTP practical.

In a stream cipher, each plaintext digit is encrypted one at a time with the corresponding digit of the key-stream, to give a digit of the ciphertext stream.

• Idea: Use a pseudo-random key rather than a really random key.

• The key will not really be random, but will look random.

• The key will be generated from a key seed using a Pseudo-Random Generator (PRG):

With: $s \ll n$ ($s$ much less than $n$)

• Encryption: Using a PRG:

• Decryption: Using a PRG:

Example: Vigenère cipher, RC4.

#### RC4

RC4 is a stream cipher invented by Ron Rivest in 1987.

• Consists of 2 phases: Initialisation and key-stream generation

• Main data structure: Array $S$ of 256 bytes.

• Used in HTTPS and WEP

• Weaknesses:

• First bytes are biased
Solution: Drop the first 256 generated bytes.
• Subject to related keys attacks
Solution: Choose randomly generated keys as seeds.
##### Initialisation
​x1for i = 0 to 255 do2  S[i] = i 3end4​5j = 06​7for i = 0 to 255 do8    j = j + S[i] + K[i (mod |K|)] (mod 256)9    swap(S[i], S[j])10end
##### Key-stream generation
xxxxxxxxxx61while generatingOutput2    i = i + 1 (mod 256)3    j = j + S[i] (mod 256)4    swap(S[i], S[j])5    output(S[S[i] + S[j] (mod 256)])6end

### Block ciphers

In a block cipher, the plaintext is divided into blocks, and each block (group of plaintext symbols) is encrypted one by one - rather than encrypting one symbol at a time like stream ciphers do.

• Most modern symmetric encryption algorithms are block ciphers.
• Block sizes vary (e.g. 64 bits for DES, 128 bits for AES).

A block cipher with parameters $k$ and $l$ is a pair of deterministic algorithms $(E,D)$ such that:

• Encryption $E:\set{0,1}^k\times\set{0,1}^l\to\set{0,1}^l$
• Decryption $D:\set{0,1}^k\times\set{0,1}^l\to\set{0,1}^l$

Example: 3DES ($l=64, k=168$) and AES ($l=128,k=128,192,256$).

#### DES

The Data Encryption Standard (DES) is a symmetric-key block cipher

• 64 bit key (8 bits parity, 56 bits actual data - every 8th bit is a parity bit)
• 64 bit plaintext, 64 bit ciphertext
• 16 rounds, each accepting a 64-bit input and giving a 64-bit output.
##### Encryption algorithm
1. Construct the key $k$ by inserting a parity bit after every 7th bit of the actual data.
This expands $k$ from 56-bit to 64-bit.

2. Generate 16 48-bit sub-keys, $\seq{k}{1}{16}$ from this new 64-bit key.

3. Permute the plaintext.

4. Divide the plaintext into two 32-bit halves, $L_i$ and $R_i$.

5. For each of the 16 rounds, starting from $i=1$:

Round $i$

• $L_{i+1}=R_i$: The left half of the input for the next round, $L_{i+1}$, becomes the right half of the input for the current round, $R_i$.
• $R_{i+1}=L_i\xor M(k_i,R_i)$: The right half of the input for the next round, $R_{i+1}$, becomes the result of taking the logical XOR between the left half of the input for the current round, $L_i$, and the output of the mangler function accepting the sub-key for the current round, $k_i$, and the right half of the input for the current round, $R_i$.
The mangler function $M$ scrambles half a block together with the 48-bit sub-key, $k_i$.
6. Swap the left and right halves of the output from the final round.

7. Permute the resulting text, which produces the ciphertext.

##### Decryption algorithm

Perform the encryption algorithm again, but using the sub-keys $\seq{k}{1}{16}$ in reverse.

##### Problems with DES
• Vulnerable to exhaustive key search (brute-force) attacks.
• 3DES solves this by increasing key space to 168 bits.

#### 3DES

3DES encryption chains 3 DES operations with 3 different DES keys, $K_1,K_2, K_3$.

• 3 different 64-bit (56-bit data + 8-bit parity) DES keys.
• 64 bit plaintext $M$, 64 bit ciphertext $C$

Note: If $K_1=K_2=K_3$, then 3DES works the same as single DES.

##### Encryption algorithm

Where $E$ and $D$ are the encryption and decryption algorithms for DES.

##### Problems with 3DES
• 3 times as slow as DES.

#### AES

Advanced Encryption Standard (AES) is a symmetric-key block cipher that operates on:

• 128-bit blocks (128-bit plaintext and ciphertext)
• 128, 192 or 256 bit keys (AES-128, AES-192 or AES-256)

AES is stronger and faster than 3DES.

##### 128-bit example

AES-128 proceeds in ten rounds. Each round performs an invertible transformation on a 128-bit array, called state.

1. The initial state $X_0$ is the XOR of the plaintext $M$ with the key $K$.

2. Round $i$ (where $i\in\set{1,\ldots,10}$) receives state $X_{i-1}$ as input and produces state $X_i$.

3. The ciphertext $C$ is the output of the final round.

Where each round is built from four basic steps:

1. SubBytes step: An S-box substitution step.
2. ShiftRows step: A permutation step.
3. MixColumns step: A matrix multiplication (Hill cipher) step.
4. AddRoundKey step: An XOR step with a round key derived from the 128-bit encryption key.

## Cryptographic hash functions and MACs

### One-way functions (OWFs)

A function $f$ is a one-way function if for all $y$, there is no efficient algorithm which can compute $x$ such that $f(x)=y$.

#### Examples

• Constant functions are not OWFs,
since any $x$ is such that $f(x)=c$.
• The successor function in $\N$ is not a OWF,
given $\mathrm{succ}(n)$, it is easy to retrieve $n=\mathrm{succ}(n)-1$.
• Multiplication of large primes is a OWF:
integer factorisation is a hard problem - given $pq$ where $p$ and $q$ are primes, it is hard to retrieve $p$ and $q$.

### Collision-resistant functions (CRFs)

A function $f$ is collision-resistant if there is no efficient algorithm that can find two messages $m_1$ and $m_2$ such that $f(m_1)=f(m_2)$.

#### Examples

• Constant functions are not CRFs,
for all $m_1$ and $m_2$, $f(m_1)=f(m_2)$.
• The successor function in $\N$ is a CRF,
the predecessor of a positive integer is unique.
• Multiplication of large primes is a CRF:
every positive integer has a unique prime factorisation.

### Cryptographic hash functions

A cryptographic hash function $H:\cal{M \to T}$ is a function that satisfies the following four properties:

• $\cal{|M|\gg|T|}$
• it is easy to compute the hash value for any given message
• it is hard to retrieve a message from its hashed value (OWF)
• it is hard to find two different messages with the same hash value (CRF)

Examples of cryptographic hash functions include MD5, SHA-256, Whirlpool and bcrypt.

#### Applications

• File integrity: Hashes are sometimes posted along with files on read-only spaces to allow verification of integrity of the files.
• Password verification: Instead of storing passwords in cleartext, only the hash digest of each password is stored. To authenticate a user, the password presented by the user is hashed and compared with the stored hash.

### Message authentication codes (MACs)

A message authentication code (MAC) is a short piece of information used to authenticate a message—in other words, to confirm that the message came from the stated sender and has not been changed.

Essentially, a MAC is an encrypted checksum generated on the underlying message that is sent along with a message to ensure message authentication.

#### Functionality Figure 1: The process of MAC authentication.
Note: This example assumes a cleartext message. (source)

As shown in Figure 1:

1. The sender uses some publicly known MAC algorithm—an algorithm $E$ which accepts the input message $m$ and the secret key $k$ and produces a MAC value.

2. Similarly to a hash function, a MAC function also compresses an arbitrary long input into a fixed length output.

The major difference between hash and MAC is that MAC uses a secret key during the compression.

3. The sender forwards the message along with the MAC.

Note: In Figure 1 this is simply a cleartext message, but in reality (if confidentiality) is necessary, this would be encrypted. If this is the case, then the MAC should always be computed on the ciphertext (two different keys should be used in this case).

4. Once the message and MAC is received, the receiver feeds the received message and the shared secret key $k$ into the same MAC algorithm that was used by the sender, and re-computes the MAC value.

5. The receiver now checks equality of the freshly-computed MAC with the MAC received from the sender.

• If they match, then the receiver accepts the message and assures themself that the message has been sent by the intended sender.

• If the computed MAC does not match the MAC sent by the sender, the receiver cannot determine whether it is the message that has been altered, or the origin that has been falsified.

As a bottom-line, a receiver safely assumes that the message is not genuine.

#### Block ciphers and message integrity

Let $(E,D)$ be a block cipher. We build a MAC $(S,V)$ using $(E,D)$ as follows:

• $S(k,m)=E(k,m)$
• $V(k,m,t)=\begin{cases}\top\quad\text{if m=D(k,t)}\\\bot\quad\text{otherwise}\end{cases}$

However, block ciphers can usually only process 128 or 256 bits. To overcome this limitation, it is possible to split plaintext and chain encryption operations together, often by using logical operators such as XOR. Figure 14: Construction of a MAC with CBC mode encryption (ECBC-MAC).
Note: A zero initialisation vector is always used in this mode of operation. (source)

##### Initialisation vector

An initialisation vector (IV) is a block of bits that is used by several modes to randomise the encryption and hence produce distinct ciphertexts even if the same plaintext is encrypted multiple times.

##### Modes of operation

A mode of operation is an algorithm that uses a block cipher to provide information security such as confidentiality or authenticity.

Example: CBC, ECB and CTR.

## Asymmetric encryption

Asymmetric encryption, also known as public-key cryptography, is a cryptographic system that encrypts and decrypts data using two separate yet mathematically connected cryptographic keys—these keys are known as a public key and a private key.

### Online Trusted Third Party (TTP)

Online Trusted Third Party (TTP) is a method of establishing keys through the use of a trusted third party.

• Users $\seq[,]{U}{1}{n}$
• TTP, $T$
• Encryption and decryption algorithms $E(k,m)$ and $D(k,m)$
• Each user $U_i$ has a shared secret key $K_i$ with $T$
• $U_i$ and $U_j$ can establish a key $K_{i,j}$ with the help of $T$

#### Functionality

Suppose $U_i$ wishes to communicate with $U_j$ securely, but needs to establish a key $K_{i,j}$ first.

1. $U_i$ sends a message to $T$, saying that they wish to communicate with $U_j$.

2. $T$ generates a new secret key $K_{i,j}$.

3. $T$ sends the new key $K_{i,j}$ to:

• $U_i$ in the form of an encrypted message $E(K_i,K_{i,j})$.
• $U_j$ in the form of an encrypted message $E(K_j,K_{i,j})$.
4. To acquire the shared secret key $K_{i,j}$,

• $U_i$ decrypts the message using $K_i$: $D(K_i,E(K_i,K_{i,j}))=K_{i,j}$
• $U_j$ decrypts the message using $K_j$: $D(K_j,E(K_j,K_{i,j}))=K_{i,j}$
5. $U_i$ and $U_j$ can now communicate using $K_{i,j}$.

#### Problems

• The TTP can read all messages between $U_i$ and $U_j$.

Since TTP generated $K_{i,j}$, it can intercept and read any messages went between $U_i$ and $U_j$, since they are also using $K_{i,j}$.

• The TTP can impersonate any user.

Since TTP has a key shared with every user, it can generate a fake channel between $U_i$ and $U_j$.

### Public-key cryptography

• Key generation algorithm: $G:\to\cal{K\times K}$

Generates $(K_\hat{s},K_\hat{p})$ where:

• $K_\hat{s}$ is a secret (private) key for decryption
• $K_\hat{p}$ is a shared public key for encryption
• Encryption algorithm $E:\cal{K\times M\to C}$

• Decryption algorithm $D:\cal{K\times C\to M}$

Such that $\forall(K_\hat{s},K_\hat{p})\in G$ and $\forall m\in\mathcal{M}$, $D(K_\hat{s},E(K_\hat{p},m))=m$.

#### Functionality

Suppose user $U_i$ wishes to send a message to $U_j$.

1. $U_i$ encrypts the message $m$, using a shared public key $K_\hat{p}^j$ from $U_j$, and sends the resulting ciphertext $c=E(K_\hat{p}^j,m)$ to $U_j$.
2. $U_j$ uses their own secret private key $K_\hat{s}^j$ to decrypt the ciphertext $D(K_\hat{s}^j,E(K_\hat{p}^j,m))=m$.

It is important that $K_\hat{s}^j$ is never shared, and that $K_\hat{p}^j\neq K_\hat{s}^j$.

### Number theory

#### Primes

1. $p\in\N$ is a prime if its only divisors are $1$ and $p$
2. Every $n\in\N$ has a unique factorisation as a product of prime numbers (prime factors)

#### Relative primes

1. $a,b\in\Z$ are relative primes if they have no common factors

2. Euler's totient function $\phi(n)$ is the number of integers that are relatively prime with $n$:

Example:

• For $p$ prime: $\phi(p)=p-1$
• For $p$ and $q$ primes: $\phi(pq)=(p-1)(q-1)$

Note: Euler's totient function is multiplicative—if $\gcd(m,n)=1$, then $\phi(mn)=\phi(m)\phi(n)$

#### $\Z_n$

Let $n\in\N$. We define $\Z_n=\set{0,\ldots,n-1}$.

##### Modular inversion

The inverse of $x\in\Z_n$ is $y\in\Z_n$ s.t. $xy\equiv1\pmod{n}$. We denote $x^{-1}$ the inverse of $x\pmod{n}$.

Example:

• $7^{-1}$ in $\Z_{12}$: $7$
• $4^{-1}$ in $\Z_{12}$: $4$ has no inverse in $\Z_{12}$
##### Theorem

Let $n\in\N$, $x\in\Z_n$. $x$ has an inverse in $\Z_n$ iff $\gcd(x,n)=1$.

#### $\Z_n^*$

Let $n\in\N$. We define $Z_n^*=\setb{x\in\Z_n}{\gcd(x,n)=1}$.

Example:

Note: $|\Z_n^*|=\phi(n)$

##### Theorem (Euler)

For all $p$ prime, $\Z_p^*$ is a cyclic group, i.e.

Where $g$ is a generator of $\Z_p^*$. See here.

#### Intractable problems

##### Integer factorisation (into prime factors)
• Input: $n\in\N$
• Output: $\seq[,]{p}{1}{m}$ such that $n=\prod_{i=1}^mp_i$
##### RSAP
• Input:

• $n$ such that $n=pq$ with $p,q$ prime
• $e$ such that $\gcd(e,\phi(n))=1$
• $m^e\pmod n$
• Output: $m$

##### Discrete logarithmic problem (DLOG)
• Input:

• $p$ prime
• generator $g$ of $\Z_p^*$
• $y\in\Z_p^*$
• Output: $x$ such that $y=g^x\pmod p$

##### Diffie-Hellman problem (DHP)
• Input:

• $p$ prime

• generator $g$ of