*This page is always under construction.*

Number Theory

Coding Theory

Probability/Combinatorics

Algebraic Graph Theory

Linear Algebra

Randomized Algorithms

Deviation Inequalities

Complexity Theory

Cryptography

Optimization

Series

Inequalities/Identities

## Number Theory

Chinese Remainder Theorem.Given a (pairwise co)prime factorization of an integer , there is a bijection between and the Cartesian product where is the set of units modulo i.e., the set of integers that are coprime to . These are also the invertible elements modulo .

*More.* This bijection takes an element to the tuple . Moreover, given a tuple , the corresponding element in is where and is the multiplicative inverse of modulo .

Fermat’s Integer Factorization of .Try to find two integers $latex a, b$ such that and . Then must divide the product . Consequently, the gcd of and at least one of these factors must be non-trivial. This gcd divides .

**Coding Theory**

Reed-Muller Code.The order- Reed-Muller code is the set of Boolean polynomials on variables and degree at most .

Why Reed-Solomon Code is MDS.In Reed-Solomon code, a degree- polynomial (the message) is encoded into a degree- polynomial (the codeword) where the coefficients of the codeword are obtained by evaluating at . Since can have at most zeros at any field, at most of the s can be zero. This means the minimum distance is at least which matches the singleton bound.

Singleton Bound.For a linear code, the distance is at most .

Plotkin Bound.If the distance of a code is strictly larger than half the block length , that is, if , the number of codewords is at most .

Hadamard Matrix.Let . Thenis the Hadamard matrix of order . Let us interprete as both strings and (corresponding) integers. Then where is the inner-product modulo . The map is an linear code. The volume enclosed by the (unit-length) columns of (i.e., the determinant) is the largest possible for any real matrix of the same order.

## Probability/Combinatorics

Birthday Paradox.Suppose . When you randomly draw times from an -set with replacement, the resulting -subset will contain a repeated element with constant probability.

*Explanation.* Suppose we choose independent samples from the numbers . For any , the probability that is . The expected number of collisions in is at most . This is a constant whenever .

Coupon Collector’s Problem.Suppose you want to color all items in a set of size as follows: at round , you pick a random element from ; if it is still uncolored, you color it red, put it back, and repeat. This process will require rounds to color all items.

Poisson Distribution.Let be a discrete random variable. If it has the Poisson distribution , its probability mass function at point is the probability that is given . That is,The expectation and variance of is . The expected deviation of from its mean is . A binomial distribution tends to a Poisson distribution when .

Note: The events and must be independent.

Probability Generating Functions.Let be a formal power series with . If then “generates” the probability distribution . Suppose generates the distribution . It is easy to see that the convolution generates the probability distribution whose mass function is.

Composition of Generating Functions.Let and denote the number of ways to build two structures, respectively of types 1 and 2, on a -element set where and . Let be the number of ways to divide the set into an unspecified number of intervals, build a type-1 structure on each interval, then build a type-2 structure on theset of intervals. Set . Denote by the generating functions of the sequences . Then

The Gambler’s Ruin Problem.See here.

The Unbiased Random Walk.See here.

Sum of first binomial coefficients.Suppose . Then,.

This bound is due to Michael Lugo. (Aside: This implies, if , the sum is at most .)

The above bound is within a factor of as long as and . When , this multiplicative factor increases.

The following bound is due to Lovasz. It works better when ; it stays within a factor of for .

.

Shearer’s Inequality.Let be coordinate-sets such that each coordinate is covered by at least different s. Let be a collection of independent random variables. The entropy of is at most times the sum of the entropies of its restriction to each coordinate set. That is,.

## Algebraic Graph Theory

Expander Mixing Lemma.Let be a -regular expander graph on vertices and edges. Suppose the second-largest eigenvalue of its adjacency matrix is at most . For any pair of vertex-sets , let be the set of “cross edges” between and . Let denote the cardinality of the set . Then,

Random Walk on an Expander.Suppose is a -regular expander whose second-largest eigenvalue is at most in magnitude. Let be a set containing a fraction of vertices. The fraction of random walks of length , on the vertices of , that stay entirely within , is small; in particular, it is at most.

## Linear Algebra

Cramer’s Rule.Suppose we want to solve a matrix-vector equation . Let denotes the matrix obtained by replacing the th column of the matrix by the . Then,

## Randomized Algorithms

Generating Pairwise Independent Random Variables.Choose a field of size . Sample two uniformly random elements using uniformly random bits. Then the following random variables are uniform over and pairwise independent:See here for details.

Pairwise Independent Hash Functions.Let be a family of functions, each having input bits and output bits. Suppose the following holds: that if we pick at random, the probability that it maps two arbitrary but distinct input strings to the same output string is exactly . Then is a family of pairwise independent hash functions. In other words,.

Equivalently,the probability that two arbitrary strings will collide under a random is the square of the uniform probability on the output space.Intuitively,if we pick a random then for arbitrary in the domain, the strings are independent and uniformly distributed.

Hitters.Fix a Boolean function on -bit inputs. Let be the set on inputs on which outputs , and suppose . An efficient, randomized algorithm , given only oracle access to , is ahitterfor if with probability at least , it needs to sample at most $m$ points from before it “hits ,” that is, it outputs an element from . We can construct hitters, using random walks on expanders, which takes samples and uses only random bits. In contrast, if we sample uniformly at random, it requires $mn$ random bits. See the appendix of Goldreich’s survey for details.

**Deviation Inequalities**

Markov’s Inequality.Any nonnegative with mean satisfies for any positive .

Chebyshev’s Inequality.Any random variable with mean and variance satisfies for any positive .

Note: While Chernoff-Hoeffding inequality requires **mutual independence** of all random variables, it suffices to have **pairwise independence** for Chebyshev’s inequality. See Nick Harvey’s notes here.

Hoeffding’s Tail Inequality, Additive Deviation, Symmetric.The probability that the sum of independent and identically distributed -bounded random variables deviates at least away from its expectation is no more than .

Chernoff Bound, Multiplicative Deviation, Asymmetric.Let be the sum of many independent Bernoulli trials. Let and let be any positive number between zero and one. The probability that is below the mean is at most . However, The probability that is above the mean is at most .

(Hoeffding-like statement.)Equivalently, the probability that is below the mean is at most . However, The probability that is above the mean is at most .

*Note*. When is large, the normal approximation of the binomial distribution is less accurate for the positive tail. See the appendix of Alon-Spencer for a commentary.

Above are the plots of the Chernoff-tail and the Hoeffding-tail for both 0-1 and random variables. The plotted quantity is the bound for the probability that the sum deviates $t n$ away from the expectation, for and .

Rudelson-Vershynin Theorem on the Operator Norm of Random Projections.Let be identical copies of a random vector with and magnitude at most . Let . For any ,

Note: We can think as a “scaled” projection operator along multiplied by . Then acts as the mean of these projection operators. The theorem says that the largest distortion inflicted by on any vector is no more than with high probability. Notice also the similarity with Hoeffding’s tail inequality, which would give the exponent as for one-dimensional case. We lost a factor of 8 somewhere.

Azuma-Hoeffding Inequality for Martingales.Let be real-valued random variables with the martingale property i.e., and the bounded-difference property i.e., for some absolute constant . Then, the probability that the difference is more than is at most .

## Complexity Theory

Goldreich-Levin Theorem.Every injective one-way function has a corresponding boolean function which is hard to predict; predicting is effectively inverting , which is unlikely when is an injective one-way function.

Lipton’s Random Self-Reducibility Theorem.A problem is called “random self-reducible” if you can solve on an arbitrary input by evaluating on a sequence of uniformly random inputs. Suppose you have a black-box to solve on a random input with high probability. Then you can construct a randomized algorithm that can solve on every input with high probability.

Yao’s Min-Max Lemma.You and I are playing a two-player game. You are the input player and I am the algorithm player. Your move is to select a distribution over inputs . My move is to select a distribution over algorithms . Both have finite size. Your goal is to make high, while my goal is to make it as low as possible. The best value for you is,

while the best value for me is

.

Yao’s min-max lemma says that the two values are equal.

BPP P/poly.

*Proof:* Suppose there is an algorithm which decides a language in BPP using a random string of length . We can construct a majority-vote algorithm which uses a random string of length $latex poly(n)$ but the probability that it errs on a given input string of length is at most , thanks to the Chernoff bound. For a fixed , the probability that is incorrect on some input is at most . Then by the probabilistic method, there exists a random string so that is correct on every input.

## Cryptography

Shamir’s -Threshold Secret Sharing.Suppose you have a secret . If you have a degree- polynomial over any field , you can set and create a set of arbitrary evaluations of this polynomial in . Any of these evaluations can reconstruct via Lagrange interpolation and thus recover . Any smaller subset of evaluations cannot reconstruct , and thus get no information about the secret .

*Note.* See the similarity with the Reed-Solomon code.

Feldman’s Verifiable Secret Sharing.Suppose you want to share a secret . Initialization: Make public where generates the field . Share generation: Like Shamir’s secrete sharing scheme, choose a (random) polynomial of degree , set , the secret, and distribute shares to each user . Commitments: Create commitments to each coefficient of , and make them public. Verification: The user can verify whetherbecause

Reconstruction: Lagrange interpolation from any shares.

Bit Commitment from One-Way Permutations.Suppose you want to commit to a bit . Initialization: Let be a hard-core predicate for a strong one-way permutation . Commit: You sample a random string and send to the receiver. Reveal: You send and .

This scheme is “computationally hiding” because to predict one has to predict the hard-core using , which is “difficult” by definition. This is “perfectly binding” because there can be no such that .

Bit Commitment from One-Way Functions.Suppose you want to commit to a bit . Initialization: Let be PRG. The receiver sends you a random string . Commit: You sample a random -bit string and a random -bit string . If you send to the receiver; otherwise, you send . Reveal: You send to the receiver.

Recall that one-way functions imply PRG. The definition of a PRG implies that this scheme is “computationally hiding.” To cheat, you will have to find a such that . The number of pairs is while the size of the PRG’s output space is . Hence the probability that you’ll succeed is at most . Hence this scheme is “perfectly binding.”

Pedersen’s Bit Commitment Scheme.Suppose a sender wants to commit to a bit . Initialization: Everyone agrees on a prime , another prime . Let be any element with . The receiver sends a random . Commit: The sender selects a random . The commitment is . Open: The sender sends to the receiver, and he verifies whether .

The scheme is “hiding”; since is random, is also random regardless of . The scheme is binding: to cheat, one has to find two different opening pairs, say and , which open the same commitment . If one can find such , one has also solved the discrete log problem . See here for more schemes. Here’s the original paper.

Chaum-Pedersen Discrete Log Proof-of-Knowledge Scheme.Setup: Fix a group for some prime . Suppose you know some element and two elements such that and for two independently chosen generators . Publish . Goal: you wanto to convince someone about your knowledge of without revealing what is.Here’s what you do. Interaction: You select a random element from and give the verifier two elements from for . Then the verifier sends you a random element as a “challenge”. You then reply with a “response” . Verification: The verifier then verifies whether .

RSA Encryption Scheme.Select two large primes and compute . Let . Pick an arbitrary element in the multiplicative group . This means, is coprime to and invertible modulo . Let be this inverse. is the public key, is the private key.Encryption:message is encrypted as .Decryption:since .

Decisional Diffie-Hellman (DDH) Problem .Given , distinguish from random.

Computational Diffie-Hellman (CDH) Problem.Given , compute .

Diffie-Hellman Key Exchange.You and I publicly agree on a prime and a generator of the group . I randomly pick a private value and you randomly pick your private value . I send you in public, while you send me in public. I construct while you construct . This is our shared secret assuming the discrete log problem is hard.

ElGamal Encryption Scheme.First, you and I publicly agree on a prime and a generator of the group . Then we construct a shared secret via the Diffie-Hellman key exchange protocol. You encrypt a message as . Since is cyclic, I can compute the inverse of , call it . Finally, I can decrypt your cyphertext as .

ElGamal Digital Signature Scheme.Initialization: Select a large prime and a generator of the group . Select a random secret element and set . Let be a cryptographically secure hash function. Signning a message : randomly select such that is invertible modulo . Let . Compute an such that . If , start over. The tuple is the signature of . Verification: Check that and . For a correct signature, the LHS is .

Hard-Core Predicate.An easy-to-compute boolean predicate is a hard-core for a function if no efficient algorithm can predict with non-negligible success even if they know .

Weak One-Way Functions Imply Storng Ones.If is a weak one-way function, then is a strong one-way function.

Every Strong One-Way Function has a Hard-Core Predicate.The Boolean inner product is a hard-core predicate of the function where is a strong one-way function and is arbitrary.

Intuitively, you have to invert to predict the inner product.

## Optimization

**Newton’s Method.** Suppose you want to find a root of a locally-convex function in the vicinity of the initial point . The root will be approached in a sequence of points . Suppose you already know . You make a linear approximation of at as follows: . Because you want to find a root of , you set and solve for

.

Because you are doing a linear approximation, the error term decreases quadratically as you approach the true root . In particular, Taylor expansion of around tells us that the approximation error is

## Series

Taylor Seriesof a real function around the point .

## Inequalities

when

The exponential function is steep above zero and flat below zero.for .

Concave functions are subadditive.Suppose is concave and . Then,

.