# One-Liners

## Number Theory

Chinese Remainder Theorem. Given a factorization $M=m_1m_2$ of an integer into pairwise coprime factors, there is a bijection between $\mathbb{Z}_M$ and the Cartesian product $\mathbb{Z}_{m_1} \times \mathbb{Z}_{m_2}$. This bijection restricts to a bijection between the units: $U_M \cong U_1 \times U_2$, where $U_i$ is the set of integers coprime to $m_i$, i.e., the invertible elements modulo $m_i$.

More. This bijection takes an element $m \bmod M$ to the tuple $(m \bmod m_1, m\bmod m_2)$. More generally, for a factorization $M = m_1 \cdots m_n$ into pairwise coprime factors, the element of $\mathbb{Z}_M$ corresponding to a tuple $(a_1 \bmod m_1, \cdots, a_n \bmod m_n)$ is $\sum{a_i N_i H_i} \bmod M$ where $N_i = M/m_i$ and $H_i$ is the multiplicative inverse of $N_i$ modulo $m_i$.
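A minimal Python sketch of this reconstruction (the function name `crt` and the toy moduli are mine):

```python
from math import prod

def crt(residues, moduli):
    """Reconstruct x mod M from residues a_i mod m_i via x = sum a_i * N_i * H_i,
    where N_i = M / m_i and H_i = N_i^{-1} mod m_i (moduli pairwise coprime)."""
    M = prod(moduli)
    x = 0
    for a, m in zip(residues, moduli):
        N = M // m
        H = pow(N, -1, m)   # the inverse exists since gcd(N, m) = 1
        x += a * N * H
    return x % M

# 23 mod 35 corresponds to the tuple (23 mod 5, 23 mod 7) = (3, 2)
assert crt([3, 2], [5, 7]) == 23
```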

Fermat’s Integer Factorization of $N$. Try to find two integers $a, b$ such that $a \not\equiv \pm b \pmod N$ and $a^2 \equiv b^2 \pmod N$. Then $N$ divides the product $(a+b)(a-b)$ but divides neither factor. Consequently, $\gcd(N, a-b)$ (and likewise $\gcd(N, a+b)$) is a non-trivial proper divisor of $N$.
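In the classical special case one searches for an $a$ with $a^2 - N$ a perfect square $b^2$, so that $N = (a-b)(a+b)$ exactly. A Python sketch (function name mine; fast only when the two factors are close together):

```python
from math import isqrt

def fermat_factor(N):
    """Find a, b with a^2 - b^2 = N for odd composite N; then N = (a-b)(a+b)."""
    a = isqrt(N)
    if a * a < N:
        a += 1
    while True:
        b2 = a * a - N
        b = isqrt(b2)
        if b * b == b2:
            return a - b, a + b
        a += 1

assert fermat_factor(5959) == (59, 101)   # found at a = 80, b = 21
```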

## Coding Theory

Reed-Muller Code. The order-$r$ Reed-Muller code $R(r,m)$ is the set of evaluation vectors of Boolean polynomials in $m$ variables of degree at most $r$.

Why Reed-Solomon Code is MDS. In a Reed-Solomon code, a degree-$(k-1)$ polynomial $m(x)$ (the message) is encoded into a codeword $(c_0, \cdots, c_{n-1})$ obtained by evaluating $m(x)$ at $n$ distinct field elements $\alpha_0, \cdots, \alpha_{n-1}$. Since a nonzero $m(x)$ can have at most $k-1$ zeros in any field, at most $k-1$ of the $c_j$s can be zero. This means the minimum distance $d$ is at least $n - (k-1)$, which meets the Singleton bound.
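A small Python illustration over the prime field $GF(929)$ (the prime and parameters are mine): two distinct messages, encoded by evaluation, disagree in at least $n-k+1$ positions.

```python
p = 929   # a prime; all arithmetic below is in GF(p)

def rs_encode(msg, alphas):
    """Evaluate the message polynomial (coefficient list `msg`) at each alpha."""
    return [sum(c * pow(a, k, p) for k, c in enumerate(msg)) % p for a in alphas]

n, k = 7, 3
alphas = list(range(n))            # n distinct evaluation points
c1 = rs_encode([3, 1, 4], alphas)
c2 = rs_encode([3, 1, 5], alphas)
# the difference polynomial has degree <= k-1, hence at most k-1 roots
assert sum(a != b for a, b in zip(c1, c2)) >= n - k + 1
```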

Singleton Bound. For a linear $[n,k,d]$ code, the distance $d$ is at most $n-k+1$.

Plotkin Bound. If the distance $d$ of a binary code is strictly larger than half the block length $n$, say $d = n/2 + \alpha$ with $\alpha > 0$, then the number of codewords is at most $\displaystyle \lfloor d/ \alpha \rfloor$.

Hadamard Matrix. Let $H_1 = \left( \begin{array}{cc} 1 & 1 \\ 1 & -1 \end{array}\right)$ . Then

$H := H_{2^n} = \left( \begin{array}{cc} H_{2^{n-1}} & H_{2^{n-1}} \\ H_{2^{n-1}} & -H_{2^{n-1}} \end{array}\right)$

is the Hadamard matrix of order $2^n$. Let us interpret $x, y \in \{0,1\}^n$ both as strings and as the corresponding integers. Then $H(x,y) = (-1)^{x \odot y}$ where $\odot$ is the inner product modulo $2$. The map $x \mapsto \big( x \odot y \big)_{y \in \{0,1\}^n}$ is a $[2^n, n, 2^{n-1}]$ linear code (the Hadamard code). The volume enclosed by the columns of $H$ (i.e., $\vert \det H \vert$) is the largest possible for any real matrix of the same order with entries in $[-1,1]$.
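A quick Python check of the recursive construction and the identity $H(x,y) = (-1)^{x \odot y}$ (helper names mine):

```python
def hadamard(n):
    """Build H_{2^n} by the [[H, H], [H, -H]] doubling."""
    H = [[1]]
    for _ in range(n):
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def ip2(x, y):
    """Inner product of the bit strings of x and y, modulo 2."""
    return bin(x & y).count("1") % 2

H = hadamard(3)
assert all(H[x][y] == (-1) ** ip2(x, y) for x in range(8) for y in range(8))
```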

## Probability/Combinatorics

Birthday Paradox. Suppose $k = \Omega(\sqrt{n})$. When you randomly draw $k$ times from an $n$-set with replacement, the resulting $k$-subset will contain a repeated element with constant probability.

Explanation. Suppose we choose $k$ independent samples $X=\{ x_1, \cdots, x_k \}$ from the numbers $1, \cdots, n$. For any $i\neq j$, the probability that $x_i=x_j$ is $1/n$, so the expected number of colliding pairs in $X$ is exactly ${k \choose 2}/n$. This expectation is $\Omega(1)$ whenever $k = \Omega(\sqrt{n})$, and a second-moment argument upgrades this to a collision occurring with constant probability.
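A deterministic Python check (function name mine): with $n = 365$, the repeat probability passes $1/2$ already at $k = 23 \approx \sqrt{n}$.

```python
from math import comb

def collision_prob(k, n):
    """Exact probability that k uniform draws from an n-set contain a repeat."""
    p_distinct = 1.0
    for i in range(k):
        p_distinct *= (n - i) / n
    return 1 - p_distinct

assert collision_prob(23, 365) > 0.5
assert abs(comb(23, 2) / 365 - 0.693) < 0.001   # expected number of colliding pairs
```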

Coupon Collector’s Problem. Suppose you want to color all items in a set $S$ of size $n$ as follows: at rounds $1, 2, 3, \cdots$, you pick a random element from $S$; if it is still uncolored, you color it red; either way, you put it back and repeat. This process requires $\Theta(n \log n)$ rounds in expectation to color all items.
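The $\Theta(n \log n)$ count comes from summing geometric waiting times: with $j$ items still uncolored, a round makes progress with probability $j/n$. A Python sketch (names mine):

```python
import math

def expected_rounds(n):
    """E[rounds] = sum over j of n/j = n * H_n, where H_n is the harmonic number."""
    return sum(n / j for j in range(1, n + 1))

n = 1000
harmonic_estimate = math.log(n) + 0.5772   # ln n + Euler-Mascheroni constant
assert abs(expected_rounds(n) - n * harmonic_estimate) < 2.0
```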

Poisson Distribution. Let $X \in \{0, 1, 2, \cdots\}$ be a discrete random variable. If it has the Poisson distribution $P(\mu)$, then $\mathbf{E}\,X = \mu$ and its probability mass function at $k$ is the probability that $X$ equals $k$. That is,

$\displaystyle f(k;\mu) := \frac{\mu^k }{k!\, e^\mu}.$

The expectation and variance of $X$ are both $\mu$. The mean absolute deviation of $X$ from its mean is $2\mu f( \lfloor \mu \rfloor; \mu)$. A binomial distribution $B(n,p)$ tends to the Poisson distribution $P(\mu)$ as $n \rightarrow \infty$ and $p \rightarrow 0$ with $np = \mu$ held fixed.

Note: In a Poisson process, the numbers of events falling in disjoint intervals are independent.

Probability Generating Functions. Let $\mathsf A(X) = \sum_{t=0}^\infty{a_t X^t}$ be a formal power series with $a_t \geq 0$. If $\mathsf{A}(1) = \sum_t{a_t} = 1$ then $\mathsf{A}$ “generates” the probability distribution $A(t) := \Pr[A = t] = a_t$. Suppose $\mathsf{B}$ generates the distribution $B(t)$. It is easy to see that the product $\mathsf{A}(X)\mathsf{B}(X)$, whose coefficient sequence is the convolution $a \ast b$, generates the distribution of the sum $C = A + B$ of independent random variables, whose mass function is

$\displaystyle C(t) := \sum_{t = x + y}{A(x)\, B(y)}$.
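A Python sketch (names mine): multiplying two PGFs, i.e., convolving their coefficient lists, yields the distribution of the sum of two independent dice.

```python
def convolve(a, b):
    """Coefficients of A(X) * B(X): the mass function of the sum A + B."""
    c = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

die = [0] + [1 / 6] * 6          # PGF coefficients of one fair die
two_dice = convolve(die, die)    # distribution of the sum of two dice
assert abs(two_dice[7] - 6 / 36) < 1e-12   # six ways to roll a 7
assert abs(sum(two_dice) - 1.0) < 1e-12    # still a probability distribution
```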

Composition of Generating Functions. Let $a_n$ and $b_n$ denote the number of ways to build two structures, respectively of types 1 and 2, on an $n$-element set, where $a_0 = 0$ and $b_0 = 1$. Let $c_n$ be the number of ways to divide the set $[n]$ into an unspecified number of intervals, build a type-1 structure on each interval, then build a type-2 structure on the set of intervals. Set $c_0 = 1$. Denote by $\mathsf{A}(x), \mathsf{B}(x),\mathsf{C}(x)$ the generating functions of the sequences $\{a_n\}, \{b_n\}, \{c_n\}$. Then $\mathsf{C}(x) = \mathsf{B}\left(\mathsf{A}(x) \right).$
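A Python sketch of the formula (names and example mine): with $a_n = 1$ for $n \geq 1$ (one type-1 structure per nonempty interval) and $b_n = 1$ (one way to line the intervals up), $c_n$ counts the compositions of $n$, which is $2^{n-1}$.

```python
def compose(B, A, N):
    """Coefficients of C(x) = B(A(x)) up to x^N; requires A[0] == 0."""
    assert A[0] == 0
    C = [0.0] * (N + 1)
    P = [1.0] + [0.0] * N                      # running power A(x)^k, truncated
    for bk in B[: N + 1]:
        C = [c + bk * p for c, p in zip(C, P)]
        # multiply P by A, truncating at degree N
        P = [sum(P[i] * A[n - i] for i in range(n + 1)) for n in range(N + 1)]
    return C

N = 8
A = [0] + [1] * N          # a_n = 1 for n >= 1
B = [1] * (N + 1)          # b_n = 1 for all n
C = compose(B, A, N)
# c_n = number of compositions of n = 2^(n-1)
assert [round(c) for c in C[1:]] == [2 ** (n - 1) for n in range(1, N + 1)]
```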

The Gambler’s Ruin Problem. See here.

The Unbiased Random Walk. See here.

Sum of first binomial coefficients. Suppose $k \leq n/2$. Then,

$\displaystyle \sum_{i=0}^k {n \choose i} \leq {n \choose k} \frac{n - k+1}{n - 2k + 1}$ .

This bound is due to Michael Lugo. (Aside: This implies that if $k \leq n/3$, the sum is at most $2{n \choose k}$ .)

For $n \approx 5000$, the above bound stays within a factor of $1.5$ of the true sum as long as $k \leq 0.49 n$. As $k \rightarrow n/2$, this multiplicative factor blows up.

The following bound is due to Lovász. It works better when $k \rightarrow n/2$; for $n \approx 5000$ it stays within a factor of $2$.

$\displaystyle \sum_{i=0}^{k-1} {n \choose i} \leq 2^{n-1} \exp\left(-\frac{(n - 2k - 2)^2}{4(n - k - 1)} \right)$ .
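A numerical sanity check of both bounds in Python (function names mine):

```python
from math import comb, exp

def partial_sum(n, k):
    return sum(comb(n, i) for i in range(k + 1))

def lugo_bound(n, k):                 # bounds sum_{i=0}^{k} C(n, i)
    return comb(n, k) * (n - k + 1) / (n - 2 * k + 1)

def lovasz_bound(n, k):               # bounds sum_{i=0}^{k-1} C(n, i)
    return 2 ** (n - 1) * exp(-((n - 2 * k - 2) ** 2) / (4 * (n - k - 1)))

n = 100
for k in (10, 25, 40):
    assert partial_sum(n, k) <= lugo_bound(n, k)
    assert partial_sum(n, k - 1) <= lovasz_bound(n, k)
```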

Shearer’s Inequality. Let $S_1, \cdots, S_M \subseteq [N]$ be coordinate-sets such that each coordinate $i \in [N]$ is covered by at least $r$ different $S_j$s. Let $X = (X_1, \cdots, X_N)$ be jointly distributed random variables. The entropy of $X$ is at most $1/r$ times the sum of the entropies of its restrictions to the coordinate sets. That is,

$\displaystyle H(X_1, \cdots, X_N) \leq \frac{1}{r} \sum_{j=1}^M{ H\left(X\rvert_{S_j}\right)}$ .

## Algebraic Graph Theory

Expander Mixing Lemma. Let $G$ be a $d$-regular expander graph on $n$ vertices and $m$ edges. Suppose the second-largest eigenvalue of its adjacency matrix is at most $\lambda$ in absolute value. For any pair of vertex-sets $A,B$, let $E_{A,B}$ be the set of “cross edges” between $A$ and $B$. Let $\#A$ denote the cardinality of the set $A$. Then,

$\displaystyle \left \vert \frac{\# E_{A,B}}{m} - \frac{\# A}{n} \cdot \frac{\#B}{n} \right \vert \leq \frac{\lambda \sqrt{\# A \cdot \# B}}{dn} < \frac{\lambda}{d}\,.$

Random Walk on an Expander. Suppose $G$ is a $d$-regular expander whose second-largest eigenvalue is at most $\lambda$ in magnitude. Let $W$ be a set containing a $\rho$ fraction of vertices. The fraction of random walks of length $\ell$, on the vertices of $G$, that stay entirely within $W$, is small; in particular, it is at most

$\displaystyle \rho \left( \rho + (1 - \rho) \frac{\lambda}{d} \right)^\ell\,$.

## Linear Algebra

Cramer’s Rule. Suppose we want to solve a matrix-vector equation $Ax = b$ where $A$ is invertible. Let $A_i$ denote the matrix obtained by replacing the $i$th column of the matrix $A$ by $b$. Then,

$\displaystyle x_i = \frac{\text{det}(A_i)}{\text{det} (A)}\, .$
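A small Python implementation (cofactor determinant; names mine), fine for tiny systems:

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def cramer_solve(A, b):
    """Solve Ax = b via x_i = det(A_i) / det(A); requires det(A) != 0."""
    d = det(A)
    return [det([row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(A)]) / d
            for i in range(len(A))]

# 2x + y = 5, x + 3y = 10  =>  x = 1, y = 3
assert cramer_solve([[2, 1], [1, 3]], [5, 10]) == [1.0, 3.0]
```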

## Randomized Algorithms

Generating Pairwise Independent Random Variables. Choose a field $F$ of size $2^n$. Sample two uniformly random elements $x_1, x_2 \in F$ using $2n$ uniformly random bits. Then the following $2^n$ random variables are uniform over $F$ and pairwise independent:

$\displaystyle Y_i := x_1 + i \cdot x_2 \qquad \forall i \in \{0, \cdots, 2^n - 1\}$

See here for details.
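An exhaustive Python check over a small prime field standing in for $GF(2^n)$ (parameters mine): for any $i \neq j$ and any target pair $(u,v)$, exactly one seed $(x_1, x_2)$ maps to it, so each pair of variables is uniform over the $p^2$ outcomes.

```python
from itertools import product

p = 5   # a small prime field stands in for the field of size 2^n

def Y(i, x1, x2):
    return (x1 + i * x2) % p

for i, j in [(0, 1), (1, 3), (2, 4)]:
    for u, v in product(range(p), repeat=2):
        hits = sum(1 for x1, x2 in product(range(p), repeat=2)
                   if Y(i, x1, x2) == u and Y(j, x1, x2) == v)
        assert hits == 1   # Pr[Y_i = u and Y_j = v] = 1 / p^2
```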

Pairwise Independent Hash Functions. Let $H$ be a family of functions, each having $n$ input bits and $k$ output bits. Suppose the following holds: if we pick $h \in H$ at random, the probability that it maps two arbitrary but distinct input strings to any two specified output strings is exactly $1/2^{2k}$. Then $H$ is a family of pairwise independent hash functions. In other words,

$\displaystyle \forall x \neq x^\prime \in \{0,1\}^n,\ \forall y, y^\prime \in \{0,1\}^k: \quad \Pr_{h \in_R H}[h(x)=y \wedge h(x^\prime) = y^\prime] = \Pr[h(x)=y]\, \Pr[h(x^\prime) = y^\prime] = 2^{-k} \cdot 2^{-k} = 2^{-2k}$.

Equivalently, the probability that two arbitrary distinct strings map to any specified pair of outputs under a random $h \in H$ is the square of the uniform probability on the output space. Intuitively, if we pick a random $h \in H$ then for arbitrary distinct $x, x^\prime$ in the domain, the strings $h(x), h(x^\prime)$ are independent and uniformly distributed. (In particular, the collision probability $\Pr[h(x) = h(x^\prime)]$ is $2^{-k}$, just as for a truly random function.)

Hitters. Fix a Boolean function $\sigma$ on $n$-bit inputs. Let $S$ be the set of inputs on which $\sigma$ outputs $1$, and suppose $\vert S \vert = \epsilon 2^n$. An efficient, randomized algorithm $A$, given only oracle access to $\sigma$, is a hitter for $\sigma$ if with probability at least $1 - \delta$, it needs to sample at most $m$ points from $\{0, 1\}^n$ before it “hits $S$,” that is, it outputs an element from $S$. We can construct hitters, using random walks on expanders, which take $m = O(\epsilon^{-1} \log(1/\delta))$ samples and use only $n + O(m)$ random bits (a constant number per step of the walk). In contrast, sampling uniformly and independently requires $mn$ random bits. See the appendix of Goldreich’s survey for details.

## Deviation Inequalities

Markov’s Inequality. Any nonnegative $X$ with mean $\mu$ satisfies $\Pr[X \geq \alpha \mu ] \leq 1/\alpha$ for any positive $\alpha$.

Chebyshev’s Inequality. Any random variable $X$ with mean $\mu$ and variance $\sigma^2$ satisfies $\Pr[ \vert X -\mu \vert > \alpha \sigma] \leq 1/\alpha^2$ for any positive $\alpha$.

Note: While the Chernoff-Hoeffding inequality requires mutual independence of all random variables, pairwise independence suffices for Chebyshev’s inequality. See Nick Harvey’s notes here.

Hoeffding’s Tail Inequality, Additive Deviation, Symmetric. The probability that the sum of $n$ independent and identically distributed $[a,b]$-bounded random variables deviates at least $\lambda$ above (or, symmetrically, below) its expectation is no more than $\displaystyle \exp\left( - \frac{2 \lambda^2}{n(b-a)^2}\right)$.

Chernoff Bound, Multiplicative Deviation, Asymmetric. Let $S$ be the sum of many independent Bernoulli trials. Let $\mu = E[S]$ and let $\delta$ be any number between zero and one. The probability that $S$ is $\delta \mu$ below the mean is at most $\exp(-\delta^2\mu/2)$, while the probability that $S$ is $\delta\mu$ above the mean is at most $\exp(-\delta^2\mu/3)$.

(Hoeffding-like statement.) Equivalently, the probability that $S$ is $\lambda$ below the mean is at most $\exp(-\lambda^2/2\mu)$, while the probability that $S$ is $\lambda$ above the mean is at most $\exp(-\lambda^2/3\mu)$.

Note. When $\lambda$ is large, the normal approximation of the binomial distribution is less accurate for the positive tail. See the appendix of Alon-Spencer for a commentary.

(Figure omitted: plots of the Chernoff tail and the Hoeffding tail for both $0$-$1$ and $\pm 1$ random variables. The plotted quantity is the bound on the probability that the sum deviates $t n$ away from the expectation, for $t \in [-0.1, 0.1]$ and $n=1000$.)

Rudelson-Vershynin Theorem on the Operator Norm of Random Projections. Let $x_1, \cdots, x_n$ be $n$ independent copies of a random vector $x \in \mathbf{R}^d$ with $\mathbf{E}\, xx^T = I$ and magnitude $\Vert x \Vert_2$ at most $R$. Let $\mu := (\sum_i{x_ix_i^T})/n$. For any $\lambda \in(0,1)$,

$\displaystyle \Pr\left[ \left\Vert \mu - I \right\Vert_2 > \lambda \right] < 2d \exp \left( -n\lambda^2/4R^2 \right).$

Note: We can think of $xx^T$ as the orthogonal projection onto the direction of $x$, scaled by $\Vert x \Vert_2^2$; then $\mu$ is the empirical mean of these scaled projections. The theorem says that, with high probability, the distortion $\mu$ inflicts on any unit vector is within $\lambda$ of what the identity does. Notice also the similarity with Hoeffding’s tail inequality, which in the one-dimensional case would give the exponent $-2(n\lambda)^2/nR^2=-2n\lambda^2/R^2$. We lost a factor of 8 somewhere.

Azuma-Hoeffding Inequality for Martingales. Let $X_0, \cdots, X_n$ be real-valued random variables with the martingale property, i.e., $\mathbf{E} \left[ X_{t+1} \mid X_0, \cdots, X_t \right] = X_t$, and the bounded-difference property, i.e., $\vert X_t-X_{t-1} \vert \leq c$ for some constant $c$. Then, the probability that the difference $X_n - X_0$ is more than $\lambda$ is at most $\displaystyle \exp \left( -\frac{\lambda^2}{2nc^2} \right)$.

## Complexity Theory

Goldreich-Levin Theorem. Every injective one-way function $f$ has a corresponding Boolean predicate $b$ which is hard to predict given $f$’s output; predicting $b$ is effectively inverting $f$, which is infeasible when $f$ is an injective one-way function.

Lipton’s Random Self-Reducibility Theorem. A problem $P$ is called “random self-reducible” if you can solve $P$ on an arbitrary input by evaluating $P$ on a sequence of uniformly random inputs. Suppose you have a black-box to solve $P$ on a random input with high probability. Then you can construct a randomized algorithm that can solve $P$ on every input with high probability.

Yao’s Min-Max Lemma. You and I are playing a two-player game. You are the input player and I am the algorithm player. Your move is to select a distribution $\mathcal{D}$ over inputs $\mathcal{X} = \{x\}$. My move is to select a distribution $\mathcal{R}$ over algorithms $\mathcal{A} = \{A\}$. Both $\mathcal{A}, \mathcal{X}$ have finite size. Your goal is to make $cost(A,x)$ high, while my goal is to make it as low as possible. The best value for you is

$\displaystyle \max_{\mathcal{D}} \min_{A \in \mathcal{A}} \underset{x \sim \mathcal{D}}{\mathbb{E} } cost(A, x)$ ,

while the best value for me is

$\displaystyle \min_{ \mathcal{R} } \max_{x \in \mathcal{X}} \underset{A \sim \mathcal{R} }{\mathbb{E} } cost(A, x)$.

Yao’s min-max lemma says that the two values are equal.

BPP $\subseteq$ P/poly.

Proof: Suppose there is an algorithm $A^\prime$ which decides a language in BPP using a random string $r$ of length $poly(n)$. We can construct a majority-vote algorithm $A$ which uses a random string $R$ of length $poly(n)$ but errs on a given input string of length $n$ with probability at most $e^{-n}$, thanks to the Chernoff bound. For a random $R$, the probability that $A(\cdot, R)$ is incorrect on some input is at most $2^n/e^n < 1$. Then by the probabilistic method, there exists a string $R^*$ so that $A(x, R^*)$ is correct on every input $x$; this $R^*$ serves as the polynomial-length advice.

## Cryptography

Shamir’s $(t,n)$-Threshold Secret Sharing. Suppose you have a secret $\sigma$. Choose a degree-$(t-1)$ polynomial $P(x) = \sum_i{a_i x^i}$ over a field $F$ with more than $n$ elements: set $a_0=\sigma$ and pick the remaining coefficients at random. Create a set of $n$ shares by evaluating $P$ at $n$ distinct nonzero points of $F$. Any $t$ of these evaluations can reconstruct $P$ via Lagrange interpolation and thus recover $\sigma = P(0)$. Any smaller subset of evaluations is consistent with every possible value of the constant term, and thus gives no information about the secret $\sigma$.

Note. See the similarity with the Reed-Solomon code.
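A Python sketch of Shamir’s scheme over a prime field (the prime, names, and parameters are mine):

```python
import random

P = 2 ** 31 - 1   # a prime comfortably larger than the number of shares

def make_shares(secret, t, n):
    """Degree-(t-1) polynomial with constant term = secret; share i is (i, poly(i))."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    poly = lambda x: sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P
    return [(i, poly(i)) for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the constant term (the secret)."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = make_shares(secret=424242, t=3, n=5)
assert reconstruct(shares[:3]) == 424242   # any 3 shares suffice
assert reconstruct(shares[2:]) == 424242
```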

Feldman’s Verifiable $(t,n)$-Secret Sharing. Suppose you want to share a secret $\sigma \in \mathbb{Z}_p^*$. Initialization: Make $p, g$ public where $g$ generates the group $\mathbb{Z}_p^*$. Share generation: As in Shamir’s secret sharing scheme, choose a (random) polynomial $P(x) = \sum_i{a_i x^i}$ of degree $t-1$, set $a_0=\sigma$, the secret, and distribute shares $\sigma_i = P(i) \bmod p$ to each user $i \in [n]$. Commitments: Create commitments $c_i = g^{a_i}$ to each coefficient of $P(x)$, and make them public. Verification: The user $i$ can verify whether

$g^{P(i)} \overset{?}{=}\prod_j{c_j^{i^j}}\,$ because

$g^{P(i)} = g^{\sum_j{a_j i^j}} = \prod_j{g^{a_j i^j}} = \prod_j{c_j^{i^j}}.$

Reconstruction: Lagrange interpolation from any $t$ shares.
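A toy Python check of the verification identity (parameters mine; to make the exponent arithmetic consistent, the polynomial here is taken over $\mathbb{Z}_q$ where $q$ is the order of $g$, a detail glossed over above):

```python
q, p, g = 11, 23, 4          # p = 2q + 1; g = 2^2 has order q in Z_23^*

coeffs = [7, 3, 5]           # secret a_0 = 7, polynomial degree t - 1 = 2
poly = lambda x: sum(c * x ** k for k, c in enumerate(coeffs)) % q

commitments = [pow(g, c, p) for c in coeffs]   # public c_j = g^{a_j}
shares = {i: poly(i) for i in range(1, 6)}     # private sigma_i = P(i)

# user i checks g^{sigma_i} == prod over j of c_j^{i^j}  (mod p)
for i, sigma in shares.items():
    rhs = 1
    for j, cj in enumerate(commitments):
        rhs = rhs * pow(cj, i ** j, p) % p
    assert pow(g, sigma, p) == rhs
```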

Bit Commitment from One-Way Permutations. Suppose you want to commit to a bit $b$. Initialization: Let $h$ be a hard-core predicate for a strong one-way permutation $f$. Commit: You sample a random string $r$ and send $\big( f(r), h(r) \oplus b \big)$ to the receiver. Reveal: You send $b$ and $r$.

This scheme is “computationally hiding” because to predict $b$ one has to predict the hard-core bit $h(r)$ from $f(r)$, which is “difficult” by definition. It is “perfectly binding” because $f$ is a permutation: there is no $r^\prime \neq r$ with $f(r) = f(r^\prime)$.

Bit Commitment from One-Way Functions (Naor’s scheme). Suppose you want to commit to a bit $b$. Initialization: Let $G$ be a PRG stretching $n$ bits to $3n$ bits. The receiver sends you a random $3n$-bit string $r$. Commit: You sample a random $n$-bit seed $s$. If $b = 0$ you send $G(s)$ to the receiver; otherwise, you send $G(s) \oplus r$. Reveal: You send $b$ and $s$ to the receiver.

Recall that one-way functions imply PRGs. The definition of a PRG implies that this scheme is “computationally hiding.” To cheat, you would have to find a pair of seeds $s, s^\prime$ with $G(s) \oplus G(s^\prime) = r$. There are $2^{2n}$ pairs $(s, s^\prime)$ while $r$ ranges over a space of size $2^{3n}$, so the probability, over the receiver’s choice of $r$, that any such pair exists is at most $2^{-n}$. Hence this scheme is “statistically binding.”

Pedersen’s Bit Commitment Scheme. Suppose a sender wants to commit to a bit $b$. Initialization: Everyone agrees on a prime $q$ and another prime $p = 2q + 1$. Let $g \in \mathbb{Z}_p^*$ be any element with $ord(g) = q$. The receiver sends a random $y$ from the order-$q$ subgroup of $\mathbb{Z}_p^*$. Commit: The sender selects a random $r \in \mathbb{Z}_q$. The commitment is $c = g^b y^r \bmod p$. Open: The sender sends $(r,b)$ to the receiver, who verifies whether $c \overset{?}{=} g^b y^r \bmod p$.

The scheme is “hiding”: since $y^r$ is uniform in the subgroup, $c$ reveals nothing about $b$. The scheme is binding: to cheat, one has to find two different opening pairs, say $(r_0, 0)$ and $(r_1, 1)$, which open the same commitment $c = y^{r_0} = g\, y^{r_1} \bmod p$. If one can find such $r_0, r_1$, one has also solved the discrete log problem $g = y^{r_0 - r_1} \bmod p$. See here for more schemes. Here’s the original paper.

Chaum-Pedersen Discrete Log Proof-of-Knowledge Scheme. Setup: Fix a cyclic group of prime order $q$, written multiplicatively. Suppose you know an exponent $\alpha$ and two elements $h_1, h_2$ such that $h_1 = g_1^\alpha$ and $h_2 = g_2^\alpha$ for two independently chosen generators $g_1, g_2$. Publish $g_1, h_1, g_2, h_2$. Goal: you want to convince someone of your knowledge of $\alpha$ without revealing what $\alpha$ is.

Here’s what you do. Interaction: You select a random exponent $w \in \mathbb{Z}_q$ and give the verifier the two elements $a_i = g_i^w$ for $i \in [2]$. Then the verifier sends you a random element $c \in \mathbb{Z}_q$ as a “challenge”. You then reply with a “response” $r = w - \alpha c \bmod q$. Verification: The verifier checks whether $a_i = g_i^r h_i^c\ (= g_i^{r+\alpha c} = g_i^w)$ for both $i$.
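A toy Python run of the protocol (parameters mine, and far too small to be secure):

```python
import random

q = 1019                              # prime group order
p = 2 * q + 1                         # 2039, also prime
g1, g2 = pow(2, 2, p), pow(3, 2, p)   # squares mod p, hence of order q

alpha = 123                                   # the prover's secret
h1, h2 = pow(g1, alpha, p), pow(g2, alpha, p)

w = random.randrange(q)                       # prover's commitment
a1, a2 = pow(g1, w, p), pow(g2, w, p)
c = random.randrange(q)                       # verifier's challenge
r = (w - alpha * c) % q                       # prover's response

# verifier checks a_i == g_i^r * h_i^c  (mod p)
assert a1 == pow(g1, r, p) * pow(h1, c, p) % p
assert a2 == pow(g2, r, p) * pow(h2, c, p) % p
```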

RSA Encryption Scheme. Select two large primes $p,q$ and compute $N = pq$. Let $\phi(N) = \vert \mathbb{Z}_N^* \vert = (p-1)(q-1)$. Pick an element $e$ coprime to $\phi(N)$, i.e., an element of the multiplicative group $\mathbb{Z}_{\phi(N)}^*$; being coprime to $\phi(N)$, it is invertible modulo $\phi(N)$. Let $d$ be this inverse. $(N,e)$ is the public key, $d$ is the private key. Encryption: message $m \bmod N$ is encrypted as $c = m^e \bmod N$. Decryption: $c \mapsto c^d \bmod N = m^{ed} \bmod N = m \bmod N$ since $ed \equiv 1 \bmod \phi(N)$.
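A toy Python example with the textbook primes $p = 61$, $q = 53$ (far too small for real use):

```python
p, q = 61, 53
N = p * q                    # 3233
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # coprime to phi
d = pow(e, -1, phi)          # private exponent: e * d = 1 (mod phi)

m = 65                       # a message in Z_N
c = pow(m, e, N)             # encrypt
assert pow(c, d, N) == m     # decrypt
```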

Decisional Diffie-Hellman (DDH) Problem. Given $g, g^a, g^b$, distinguish $g^{ab}$ from a uniformly random group element.

Computational Diffie-Hellman (CDH) Problem. Given $g, g^a, g^b$, compute $g^{ab}$.

Diffie-Hellman Key Exchange. You and I publicly agree on a prime $p$ and a generator $g$ of the group $G=\mathbb{Z}_p^*$. I randomly pick a private exponent $a$ and you randomly pick your private exponent $b$, both from $\{1, \cdots, p-2\}$. I send you $A = g^a \bmod p$ in public, while you send me $B = g^b \bmod p$ in public. I construct $S = B^a = g^{ab}$ while you construct $A^b = g^{ab}=S$. This $S$ is our shared secret, assuming computing discrete logs (more precisely, the CDH problem) is hard.
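A toy Python run (parameters mine, insecurely small):

```python
import random

p, g = 2039, 7                   # public prime and base

a = random.randrange(1, p - 1)   # my private exponent
b = random.randrange(1, p - 1)   # your private exponent
A = pow(g, a, p)                 # exchanged in public
B = pow(g, b, p)

# both sides derive the same shared secret g^(ab) mod p
assert pow(B, a, p) == pow(A, b, p)
```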

ElGamal Encryption Scheme. First, you and I publicly agree on a prime $p$ and a generator $g$ of the group $G=\mathbb{Z}_p^*$. Then we construct a shared secret $s$ via the Diffie-Hellman key exchange protocol. You encrypt a message $m \in G$ as $c = m\cdot s$. Since $G$ is a group, $s$ has an inverse $s^{-1}$, which I can compute. Finally, I decrypt your ciphertext as $c \mapsto cs^{-1} = mss^{-1} = m$.

ElGamal Digital Signature Scheme. Initialization: Select a large prime $p$ and a generator $g$ of the group $G = \mathbb{Z}_p^*$. Select a random secret element $x \in \{2, \cdots, p-3 \}$ and set $y := g^x$. Let $H$ be a cryptographically secure hash function. Signing a message $m$: randomly select $k \in \{2,\cdots, p-2\}$ such that $k$ is invertible modulo $p-1$. Let $r:=g^k \bmod p$. Compute an $s$ such that $H(m) \equiv xr + ks \bmod p-1$. If $s=0$, start over. The tuple $(r,s)$ is the signature of $m$. Verification: Check that $0 < r < p$, $0 < s < p-1$, and $r^sy^r \equiv g^{H(m)} \bmod p$. For a correct signature, the LHS is $r^sy^r = g^{ks}g^{xr} = g^{H(m)}$.
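A toy Python sketch (small prime, and a stand-in “hash” that just reduces $m$ modulo $p-1$; real use needs a cryptographic hash and large parameters):

```python
import random
from math import gcd

p, g = 467, 2
x = 127                          # private key
y = pow(g, x, p)                 # public key

def H(m):                        # toy stand-in for a secure hash
    return m % (p - 1)

def sign(m):
    while True:
        k = random.randrange(2, p - 1)
        if gcd(k, p - 1) != 1:
            continue             # k must be invertible mod p - 1
        r = pow(g, k, p)
        s = (H(m) - x * r) * pow(k, -1, p - 1) % (p - 1)
        if s != 0:
            return r, s

def verify(m, r, s):
    return (0 < r < p and 0 < s < p - 1
            and pow(r, s, p) * pow(y, r, p) % p == pow(g, H(m), p))

r, s = sign(321)
assert verify(321, r, s)
```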

Hard-Core Predicate. An easy-to-compute Boolean predicate $b : \{0,1\}^n \rightarrow \{0,1\}$ is a hard-core for a function $f$ if no efficient algorithm, given $f(x)$, can predict $b(x)$ with probability non-negligibly better than $1/2$.

Weak One-Way Functions Imply Strong Ones. If $f$ is a weak one-way function, then $g(x_1, \cdots, x_k) = \big( f(x_1), \cdots, f(x_k) \big)$, for a suitably large polynomial number $k$ of independent inputs, is a strong one-way function.

Every Strong One-Way Function has a Hard-Core Predicate. The Boolean inner product $x \odot r$ is a hard-core predicate of the function $g(x,r) := \big( f(x), r \big)$ where $f$ is a strong one-way function and $r$ is a uniformly random string of the same length as $x$.

Intuitively, you have to invert $f$ to predict the inner product.

## Optimization

Newton’s Method. Suppose you want to find a root of a differentiable function $f(x)$ in the vicinity of an initial point $x_0$. The root is approached by a sequence of points $x_0, x_1, x_2, \cdots$. Suppose you already know $x_n$. You make a linear approximation of $f(x)$ at $x = x_n$ as follows: $y = f(x_n) + (x_{n+1} - x_n) f^\prime(x_n)$. Because you want a root of $f$, you set $y = 0$ and solve for

$\displaystyle x_{n+1} = x_{n} - \frac{f(x_n)}{f^\prime(x_n)}$ .

Because you are doing a linear approximation, the error decreases quadratically as you approach the true root $x = \alpha$. In particular, the Taylor expansion of $f$ around $x_n$ shows that the error of the linear approximation is $(\alpha - x_n)^2 f^{\prime \prime}(x_n)/2! + O((\alpha - x_n)^3)$.
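A Python sketch (names mine): a handful of iterations on $f(x) = x^2 - 2$ already pin down $\sqrt{2}$ to machine precision.

```python
def newton(f, fprime, x0, iters=8):
    """Iterate x <- x - f(x) / f'(x); the error roughly squares each step."""
    x = x0
    for _ in range(iters):
        x = x - f(x) / fprime(x)
    return x

root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
assert abs(root - 2 ** 0.5) < 1e-12
```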

## Series

Taylor Series of a real function $f(x)$ around the point $a$.

$\displaystyle f(x) = \sum_{k=0}^{\infty} f^{(k)}(a) (x-a)^k / k! .$

$\displaystyle \log (1-x) = -x -x^2/2-x^3/3-\cdots$
$\displaystyle \log(1+x) = x - x^2/2 + x^3/3 - \cdots$

## Inequalities

$(1-x)^{1-x} > \exp(-x+x^2/2)$ when $x \in (0, 1)$

The exponential function is steep above zero and flat below zero.

$e^x \geq 1+x$ for $x\in \mathbb{R}$.

Concave functions are subadditive. Suppose $f(x)$ is concave with $f(0) \geq 0$, let $\alpha \in [0,1]$, and let $a, b \geq 0$. Then

$f(\alpha x) \geq \alpha f(x)$ ,
$f(a)+f(b) \geq f(a+b)$ .