A strength of these codes is that although the algebra is described over extension fields/rings over , the encoding/decoding process uses only Boolean addition/rotation operations and no finite field operations. These codes are also MDS (Maximum Distance Separable), which means they have the largest possible minimum distance for a fixed message length and codeword length.

(Recall that if a code has data components and parity components in its generator matrix in standard form, its distance is at most by the Singleton bound. Hence the code is MDS if and only if it can tolerate arbitrary disk failures.)

The BASIC code does the following in relation to the BBV code:

- Adds a virtual parity bit to each disk, giving each disk an even parity
- Does polynomial arithmetic modulo instead of , as in the BBV code
- Shows equivalence to the BBV code by making a nice observation via the Chinese Remainder Theorem
- Proves the MDS property for any number of coding disks when is “large enough” and has a certain structure

Open Question: What is the least disk size for which these codes are MDS with arbitrary distance?

Let be an odd prime. (Bounds on in connection to the distance of the code will come later.) Although there can be at most data disks, we will assume there are exactly of them. There are coding disks. Each data disk is represented as a Boolean polynomial of degree , that is, a polynomial in the ring

where

.

Let be the data disks. Let denote a cyclic shift of the polynomial modulo . If , the shifted polynomial is the usual shift (modulo ) followed by a flip in all (Boolean) coefficients, since modulo and an addition modulo is a flip for Boolean coefficients.
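To make the shift concrete, here is a minimal Python sketch. It assumes (as in the BBV construction) that a disk is a Boolean polynomial of degree less than p-1, stored as a list of p-1 coefficients and reduced modulo the all-ones polynomial 1 + x + ... + x^(p-1); the function name `shift` is mine:

```python
def shift(a, p):
    """x * a(x) reduced modulo M_p(x) = 1 + x + ... + x^(p-1) over GF(2).
    a is a list of p-1 Boolean coefficients; a[0] is the constant term."""
    top = a[-1]                 # coefficient of x^(p-2) before the shift
    b = [0] + a[:-1]            # multiply by x: every degree goes up by one
    if top:                     # x^(p-1) = 1 + x + ... + x^(p-2) (mod M_p),
        b = [c ^ 1 for c in b]  # so the overflow bit flips every coefficient
    return b

# shifting p times multiplies by x^p, which equals 1 modulo M_p(x)
a = [1, 0, 0, 1]                # 1 + x^3, with p = 5
for _ in range(5):
    a = shift(a, 5)
assert a == [1, 0, 0, 1]
```

The flip in the body of `shift` is exactly the “shift followed by a flip in all Boolean coefficients” described above.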

With this definition, the th coding disk , for , is the (Boolean) sum

Given the data disks , the collection

is treated as a *systematic* codeword of length with information symbols and coding symbols. The rate of this code is . Since we want high-rate codes, we are interested in small values of .

Suppose is the minimum distance of this code. The Singleton bound says is at most . When a code achieves this distance, it is said to be an MDS code. For this code, this happens when

- for all
- when modulo and excludes some small subsets (depending on )

**A lower-bound on .** Suppose the BBV code has data disks and coding disks. According to this analysis, when is a primitive root modulo , the BBV code is MDS for all if is strictly greater than . (See the above link for an exact bound.) This is a sufficient condition. A lower-bound on for this code to be MDS is still an open problem (as of October 2017).

**Parameters/Conditions:** For a prime , the disk size is . There are data disks, , and coding disks. **Fault tolerance:** The code is MDS (that is, a -linear code) for up to disk failures for all ; up to failures when is a prime for which is a primitive element; up to any number of failures if . **Alphabet:** . **Encoding process:** See above. **Decoding process:** Skipped.

**Strengths:**

- Coding disks are independent of each other
- The shifts are modulo which is an irreducible polynomial
- The polynomials have Boolean coefficients
- Only Boolean additions and rotations are performed. No finite field operations needed for encoding/decoding

**Limitations:** Not sure.

The Boolean Addition and Shift Implementable Cyclic-convolutional code, or BASIC code for short, is like the BBV code. The difference is that the polynomial arithmetic is performed modulo instead of .

**Connections with the BBV code.** When is an odd prime number such that is a primitive root modulo , the polynomial factors into two irreducible factors over as . This post talks about the factoring, and Artin’s conjecture says that there are infinitely many such primes.
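The factorization can be spot-checked with carry-less multiplication. The bitmask representation (bit i holds the coefficient of x^i) and the choice p = 11, for which 2 is a primitive root, are assumptions for this sketch:

```python
def gf2_mul(a, b):
    """Carry-less product of two GF(2) polynomials stored as bitmasks
    (bit i holds the coefficient of x^i)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

p = 11                      # 2 is a primitive root modulo 11
M_p = (1 << p) - 1          # 1 + x + ... + x^(p-1): the p low bits set
assert gf2_mul(M_p, 0b11) == (1 << p) | 1   # (x + 1) * M_p(x) = x^p + 1
```

The product telescopes over GF(2), which is why only the bits of 1 and x^p survive.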

The data disks are given an even weight by adding a (virtual) parity bit, and hence both the data and coding disks are Boolean polynomials of an even weight in the ring

.

Let be the subring of containing only even-weight polynomials. By the Chinese Remainder Theorem, there is a bijection taking every in to the tuple . However, since has an even weight, is an even number which is zero (modulo 2) for all . Thus the ring is isomorphic to the field used in the BBV code.

**Data Disks.** Let be an odd number. The disks contain bits each. By (virtually) appending a parity bit, the th data disk is treated as an element of the ring

That is, the polynomial for the th disk has degree and an even number of nonzero coefficients.

The ring can be described in two ways. First, it is the set of polynomials of the form for every . Notice that the dimension of over is because, well, it contains all (and only) the even-weight words. Second, since the th coefficient is the parity bit, it is one (resp. zero) when the Boolean sum of all other coefficients is one (resp. zero). Hence the total weight is always zero.

*Why does the polynomial have an even weight?* Let denote the th coefficient of any polynomial, and let . Notice that is whenever differs from . The number of such 0-1/1-0 transitions is always even because modulo , the first and last coefficients are adjacent. Hence will have even weight.

The **check polynomial** for any element is the all-ones polynomial

because . Why? Because if we imagine a matrix whose first column is the coefficient-vector , whose second column is , whose third column is , and so on, and recall that is a rotation, it becomes clear that the sum along each row will be zero, since the first column contains an even number of ones to begin with.
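Both facts about the even-weight ring, that multiples of 1 + x have even weight and that the all-ones check polynomial annihilates every even-weight element, can be verified exhaustively for a small odd n. The bitmask representation and the function name `cyc_mul` are mine:

```python
def cyc_mul(a, b, n):
    """(a * b) mod (x^n + 1) over GF(2); polynomials stored as n-bit masks."""
    mask = (1 << n) - 1
    r = 0
    for i in range(n):
        if (a >> i) & 1:
            # x^i * b with exponents wrapped modulo n (a cyclic rotation)
            r ^= ((b << i) | (b >> (n - i))) & mask
    return r

n = 7
ones = (1 << n) - 1                 # the check polynomial 1 + x + ... + x^(n-1)
for q in range(1 << n):
    e = cyc_mul(0b11, q, n)         # (1 + x) * q(x) always has even weight
    assert bin(e).count("1") % 2 == 0
    if bin(q).count("1") % 2 == 0:  # every even-weight element is annihilated
        assert cyc_mul(ones, q, n) == 0
```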

**Coding disks.** Suppose we have data disks

We treat the -tuple as the *message*. Recall that for each of these polynomials, the th coefficient, which is not part of the original data, is the sum of all other coefficients. The th coding disk , then, is defined as the Boolean sum of the columns , each rotated-down by times. That is,

for . In other words, is the vector inner product of the vectors and .

**Generator matrix.** The generator matrix for this code, in standard form, looks like , and a systematic codeword is where

is a Vandermonde matrix with entries in . Hence a codeword is a -linear combination of row-vectors over .
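A hedged sketch of the encoding just described (not the paper's exact algorithm): coding disk j is the XOR of the data disks, with disk i rotated i*j positions. The helper names `rot` and `encode` are mine:

```python
from functools import reduce

def rot(b, s, n):
    """Multiply the bitmask polynomial b by x^s modulo x^n + 1 (a rotation)."""
    s %= n
    mask = (1 << n) - 1
    return ((b << s) | (b >> (n - s))) & mask

def encode(data, r, n):
    """Coding disk j is the Boolean sum of data disk i rotated i*j positions,
    i.e., the inner product of the data tuple with the j-th Vandermonde row."""
    return [reduce(lambda x, y: x ^ y,
                   (rot(d, i * j, n) for i, d in enumerate(data)), 0)
            for j in range(r)]

n = 7
data = [0b0000011, 0b0010100, 0b1100000]    # three even-weight data disks
for c in encode(data, 3, n):
    assert bin(c).count("1") % 2 == 0        # encoding preserves even parity
```

Note that the j = 0 coding disk is simply the XOR of all data disks, the familiar RAID-style parity.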

**Invertibility in the ring .** Since is odd, the polynomial is separable over . (Why?) Hence the check polynomial splits into linear factors in its splitting field. It is not hard to see that a polynomial has if and only if is a unit modulo . By the Chinese Remainder Theorem, this happens if and only if is a unit modulo each of the linear factors of . Such a polynomial is called -invertible.

**MDS-ness and Vandermonde minors.** For this code to be MDS, it has to be able to decode with up to disk failures. The structure of the generator matrix tells us that if any data disks fail, any submatrix of involving any coding columns and the remaining data columns must be invertible. Since the columns corresponding to the data disks are standard basis vectors, this means any order- submatrix of the Vandermonde matrix must be invertible in the ring . Put differently, *every -minor of must be -invertible.* This invertibility condition is, in fact, equivalent to the MDS condition.

**MDS-ness and a lower-bound on .** The analysis in the BASIC code paper mentioned above and this paper shows that for this code to be MDS, it is sufficient that is an odd prime number such that is a primitive element modulo .

The above bound is derived as follows. To prove that all minors of a generalized Vandermonde matrix are invertible modulo , they computed the maximum degree of any minor modulo using this result. They observed that is irreducible over when is a primitive root modulo . Therefore, any polynomial over with degree at most must be coprime to , and hence invertible modulo . This gives the bound .

What could be the largest degree of a minor of order ? Consider an order- submatrix given by the rows and columns of the Vandermonde matrix . Assume that the index sets and are in increasing order. If we look into the Leibniz expansion of the determinant of , the term having the largest degree must be the product of the diagonal entries. The degree of this term will be , which is no more than . Since , the bound follows.

**Parameters/Conditions:** For a prime , the disk size is . There are data disks and coding disks. **Fault tolerance:** The code is MDS (that is, a -linear code) for up to disk failures for all ; up to failures when is a prime for which is a primitive element; up to any number of failures if . **Alphabet:** . **Encoding process:** See above. **Decoding process:** Skipped.

**Strengths:**

- All of the BBV code, but taking remainders modulo is computationally and conceptually simpler than doing the same modulo

**Limitations:** Not sure

Filed under: Algebra, Coding Theory, Computer Science, Expository, Mathematics Tagged: Chinese Remainder Theorem, disk erasure code, generalized vandermonde matrix, MDS array code, Vandermonde matrix

An equivalent description of is the largest row-power and the set of *missing rows* from : that is, the items that are in but not in . Let be this set. Define the *punctured* Generalized Vandermonde matrix . Let be the th elementary symmetric polynomial.

Now we are ready to present some interesting results without proof, from the paper “Lower bound on Sequence Complexity” by Kolokotronis, Limniotis, and Kalouptsidis.

I will add some applications later on.

Filed under: Algebra, Computer Science, Mathematics, Matrix Analysis Tagged: determinant, generalized vandermonde matrix, Vandermonde matrix

Currently, we are asking whether all submatrices of the order- Vandermonde matrix over a finite extension of are invertible where is prime. The answer is “no” in general: there are examples of fields where the Vandermonde matrix has a singular submatrix.

We can ask an easier(?) question, though. What happens if we randomly sample a set of columns and look at the submatrices formed by subsets of the sampled columns? With a touch of beautiful insight, Professor Russell has connected Szemerédi’s theorem on arithmetic progressions with this question.

Let denote an arithmetic progression of length k. Let for .

Szemerédi’s theorem says that any “sufficiently dense” subset contains infinitely many for all . A finitary version says: fix your favourite . Then there exists a natural such that if you look at any subset of size at least , you will find an . Yet another version says:

Szemerédi’s Theorem. The size of the largest subset without an cannot be too large; in particular, it is .

Recall that a function is if it grows too slowly compared to , so that .

On the other hand, let be a set of distinct nonnegative integers. Let be a principal th root of unity in the field . Suppose we have a generalized Vandermonde matrix over such that for some , every *column* has the form

for some . Since the exponents of along any row of form an arithmetic progression, it is easy to see that is invertible.
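The invertibility claim for AP-indexed columns can be spot-checked numerically over the complex numbers. This is an illustrative sketch; the function name and the parameter choices (p = 11, k = 4, first term 2, common difference 3) are mine:

```python
import cmath

def ap_vandermonde_det(p, k, a, d):
    """Determinant (via Gaussian elimination) of the k x k matrix whose (i, j)
    entry is w^(i*(a + j*d)), w = exp(2*pi*i/p): rows 0..k-1 of the p-point
    Vandermonde matrix, columns indexed by the AP a, a+d, ..., a+(k-1)d."""
    w = cmath.exp(2j * cmath.pi / p)
    m = [[w ** (i * (a + j * d)) for j in range(k)] for i in range(k)]
    det = 1.0
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(m[r][c]))  # partial pivoting
        if abs(m[piv][c]) < 1e-12:
            return 0.0
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, k):
            f = m[r][c] / m[c][c]
            for cc in range(c, k):
                m[r][cc] -= f * m[c][cc]
    return det

assert abs(ap_vandermonde_det(11, 4, 2, 3)) > 1e-6   # AP columns: invertible
```

Factoring a power of w out of each row leaves an ordinary Vandermonde matrix with distinct nodes, which is why the determinant cannot vanish.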

What can we say about the set above? Let be a proper subspace of so that . Suppose . By our observation above, cannot contain an arithmetic progression of length , because otherwise, the matrix would be invertible and its columns would span ; since by assumption, it would imply , a contradiction.

Observation: If contains an arithmetic progression of length , then .

Now let us ask the following question.

Question: Let be an arbitrary strict subspace of . If we pick a at random, what is the probability that ?

Suppose there is a set such that the vectors . Since is a strict subspace of , does not contain an arithmetic progression of length , and by Szemerédi’s theorem. The probability that equals the probability that , which is , and which tends to zero as goes to infinity.

Lemma (Due to Alex Russell). Let be an arbitrary strict subspace of . If we pick a at random, as goes to infinity.

We are ready to prove the following nontrivial statement about the minors of the Vandermonde matrix. (This theorem is due to Alex Russell and Ravi Sundaram.)

Let be a prime and be an order- Vandermonde matrix over the field containing a principal th root of unity. Let be the submatrix of indexed by the rows in and columns in .

Theorem A (Due to Alex Russell). Let be a constant. Select two size- subsets , where is arbitrary and is uniformly random. The probability that the submatrix is invertible is , i.e., the probability tends to one as goes to infinity.

The outline of the proof is as follows. We select distinct random integers one by one and add each to a set which is initially empty. At any time, let . Every time we pick a random , we invoke the lemma above to show that as goes to infinity. We repeat this times. Because the samples are independent, the probability that *some* (out of the samples) is bad is at most . If this happens, we say that the subset is bad. Otherwise, , and we have got ourselves an invertible -submatrix given by the columns for all .

**A Chebotarev-like Statement.** Suppose the field contains all th roots of unity, and let be the corresponding Vandermonde matrix. A matrix is *totally invertible* if every submatrix is invertible. Chebotarev’s theorem states that the matrix is totally invertible.

*Theorem A* allows us to make a probabilistic statement which is similar to Chebotarev’s theorem in spirit. Continuing the proof-outline of *Theorem A*, suppose is an arbitrary subset of of a constant size. Similarly, let be a uniformly random subset of of a constant size. Let . Let .

The probability that *some* order- submatrix of is singular is no more than

This probability is as long as the sets have constant sizes. Also notice that the same argument holds if we use the transpose for . This leads us to the following corollary.

Corollary. Fix . Select two constant-size subsets , where is arbitrary and is uniformly random. Then, the submatrices and are totally invertible with probability .

**An attempt via Schwartz-Zippel.** Let be the set of th roots of unity contained in a field . Select a set and a set of distinct th roots of unity . Let be the determinant of the order- submatrix of the Vandermonde matrix obtained from the rows in and the columns in .

Notice that in the -variable polynomial is . It can be shown that the number of zeros of is at most . Consequently, if the elements of are selected uniformly at random from all size- subsets, the probability that will be at most where .

For our coding theory problem, is at most the number of data disks, . Set , and let be a random subset of size . Let . The probability that *some* submatrix of is singular is no more than

This probability is at most . It will be at most if . Not great, but as Don Knuth said, a bound is a bound.

Filed under: Algebra, Computer Science, Fourier Transform, Mathematics, Randomized Algorithms, Theoretical Computer Science Tagged: arithmetic progression, Szemeredi's theorem, uncertainty principle, Vandermonde matrix

Open Question: Is there a prime and a positive integer such that all submatrices of the Discrete Fourier Transform matrix over the field are nonsingular?

Currently, I have only counterexamples: Let be the degree of the smallest extension over which contains a nontrivial th root of unity. Then, I know a lot of primes for which the matrix has a singular submatrix.

In this post, I am going to show a failed attempt to answer this question using the results in this paper by Evra, Kowalski, and Lubotzky.

Let me introduce some more context.

Let be the ring of polynomials of degree at most with coefficients in . Let be the group . Every function can be identified with the polynomial . We will abuse the notation and write to represent both the function and the polynomial, where which-is-which will be clear from the context.

Let be the matrix representation of the Discrete Fourier Transform , where is a vector in containing at its th coordinate, . We know that the th coordinate of contains . The entries of the matrix are . Such a matrix is called the Vandermonde matrix.

For a polynomial , its support is the number of nonzero coefficients.

Satisfying the uncertainty principle. If the inequality is satisfied for every polynomial and a prime , we say that the field satisfies the uncertainty principle for .

If the above inequality is satisfied for all primes, then satisfies the uncertainty principle. For example, satisfies the uncertainty principle for all primes .

There is a theorem concerning the determinants of the Vandermonde-submatrices (see the appendix of the EKL paper).

Chebotarev’s Theorem. Let be a principal th root of unity in a field which satisfies the uncertainty principle for . Then every submatrix of the Vandermonde matrix is nonsingular.

For example, can be , the complex numbers where .

With this view in mind, here are several rephrasings of my first question:

Question: Is there an and a (family of) primes such that satisfies the additive uncertainty principle for ?

Question: Is there a prime and an such that in the field , all (square) submatrices of the Vandermonde matrix are nonsingular? Here, we assume that contains a th principal root of unity .

However, the EKL paper shows that the question about which fields satisfy the uncertainty principle is equivalent to the following:

Question (EKL): Is there an asymptotically good cyclic code for an infinite family of primes ?

**My failed attempt.** Proposition 4.3 in this paper (by Evra, Kowalski, and Lubotzky) says that if is a primitive root modulo a prime , then satisfies the uncertainty principle for . Although the statement says has to be a prime, the argument carries through even when is a prime power.

Proposition 4.3 in EKL. Suppose is a field such that is a primitive root modulo a prime . Then satisfies the uncertainty principle.

Although intriguing, this proposition doesn’t lead us to Chebotarev’s theorem.

Let . The proof of the above proposition requires that be irreducible over (which is the case indeed).

In Proposition 4.3, set . Since 2 is a primitive root modulo 11, the implication is that all polynomials in the ring satisfy the uncertainty principle, i.e., . However, Chebotarev’s theorem does not apply here because the field does not contain . On the other hand, if we start from Chebotarev’s theorem and require to contain , then the proof of the above proposition breaks, since it requires to contain no nontrivial th root of unity.

Therefore, it seems like going as-it-is from Proposition 4.3 to Chebotarev’s theorem is a dead end. We’ve got to do something clever.

**Afterthoughts.** *I have code to exhaustively search every submatrix of a given Vandermonde matrix. Although my hope was that one could go from Proposition 4.3 to Chebotarev’s theorem, my code gave me counterexamples. Only while writing this post did I understand why.*

Filed under: Algebra, Computer Science, Expository, Fourier Transform, Theoretical Computer Science Tagged: chebotarev's theorem, Fourier Transform, Vandermonde matrix

**HBO charged me $15 via Amazon for my subscription that I forgot to cancel** after the Game of Thrones was over. Upon writing to Amazon, the representative said:

As per policy we are unable to issue refund for the previous month charge (September) for your HBO subscription.

However, I am able to make an exception for you and have issued you a promotional credit of $15.14 in your account.

**Walmart manager said he would not accept my returns (worth $107) without a receipt.** Although the first guy said he would give me a gift card, the system didn’t let him proceed. The manager came in and told me they wouldn’t take these returns. Why? Because they have recently changed the rule: without a receipt, I can return items worth only up to $25.

A rule is a rule, he said, but I could try other Walmarts if I wanted. He said he wouldn’t make multiple $25 transactions because that would be called “structuring.” He called *his* manager, who wasn’t picking up. There’s nothing more for me here, he said.

**If it were a week ago, I would have given up at this stage.** But I was listening to the audiobook of “Never Split the Difference”, and it says that **“No is the beginning of a negotiation.”** So I persisted. I said, isn’t there anything *he* can do? Maybe his manager? I could wait till his manager came. After a few minutes of waiting, the higher manager dropped by. She was not happy, but she made it quick. We don’t take returns like this, and we will not in the future. But just this time, we’ll make an exception. You cannot have cash, but we’ll put it on a gift card.

My $15 Audible subscription fee for this book paid off over $100 in the first week. I loved this book.

Without further ado here are my takeaways from the book:

- **People, including you, are emotional**, not rational
- **Listen**. Then listen. Then listen some more. Show that you are listening
- **No is the beginning** of a negotiation
- **Don’t try to defeat your counterpart**. Strive to understand him
- **Ask open-ended questions** leading with “what” and “how”, but not “why”
- **When he answers, summarize/rephrase his viewpoint** in a few words. Your goal is to make him say “That’s right.” This is the magic moment when he would think that you “get” his situation
- When you are listening, **give a label to his feelings/situation**: “It looks like you feel hurt,” or “It sounds like you are frustrated.” This will open him up
- When asked to give a number to a monetary demand, **always give a non-rounded number such as $4,123 instead of $4,000** or $5,000. These numbers give a sense of precision and reason behind them, regardless of whether such a reason exists
- **People are more willing to avoid losing** something (of whatever value) than to gain the same amount of value
- When asked to name a price (when you are a buyer), **start with a very low offer** (called an extreme anchor), which will lower their expectations; then move up. For example, if your target buying price is $100, start with $65, then raise it to $85, then $95, and then $100. This is the **65-20-10 rule**. This gives the counterpart a feeling that you are really squeezed
- Similarly, when asked to name your desired salary, **give a range** whose upper end is too high (an extreme anchor) and whose lower end is at (or slightly higher than) your target. This way, **the low end will look much more favorable to them**. Make sure to make these numbers non-rounded as well
- Try to **figure out what makes your counterpart want what he wants**
- **Say no without saying no. Ask “How am I supposed to do that?”** or “How am I supposed to accept such a low offer?” Have them work out the solution for you. When they respond, ask another “How am I supposed to …” question. Against immediate/improbable demands, this question works magic. It buys time, and it shows the counterpart how “unreasonable” his demand is
- **Nonverbal communication** rules. Remember the 7-38-55 rule: when we speak, only 7% is communicated via the words themselves, 38% from the tone, and 55% from the body language
- **Your counterpart is not stupid or crazy.** They are either lacking information, or have mistrust towards you (or something else), have constraints that they are not willing to admit, or have their own worldview. Try to understand them. Make them feel you are on the “same side,” whatever that is, or at least that you understand which beliefs they identify with.

Filed under: Book Review, Books, Self Improvement, Uncategorized Tagged: negotiation

*Every time I thought I understood the Fourier Transform, I was wrong.*

Doing the Fourier Transform of a function is just seeing it from “another” point of view. The “usual” view of a function is in the standard basis . For example, can be seen as a vector (in the basis given by the elements in the domain) whose coordinates are the evaluations of on the elements in the domain. It can also be seen as a polynomial (in the monomial basis) whose coefficients are these evaluations. Let us call this vector .

The same function can also be seen from the “Fourier basis”, which is just another orthogonal basis, formed by the basis vectors . The th coordinate in the new basis will be given by inner product between and the th basis vector . We call these inner products the Fourier coefficients. The Discrete Fourier Transform matrix (the DFT matrix) “projects” a function from the standard basis to the Fourier basis in the usual sense of projection: taking the inner product along a given direction.

In this post, I am going to use elementary group theoretic notions, polynomials, matrices, and vectors. The ideas in this post will be similar to this Wikipedia article on Discrete Fourier Transform.

*Please let me know in the comments if you find any error or confusion.*

**Roots of Unity.** Let be the multiplicative group of th roots of unity. That is, . is a primitive th root if the smallest natural exponent that makes is . The root is principal if . In what follows, we assume that is a principal and primitive th root of unity.

Let be a field containing . For example, can be the set of complex numbers, with where is the imaginary unit. It is easy to check that the roots lie on the unit circle around the origin.
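For instance, over the complex numbers with n = 8 (an illustrative choice):

```python
import cmath

n = 8
w = cmath.exp(2j * cmath.pi / n)          # a primitive (and principal) n-th root
roots = [w ** k for k in range(n)]

assert all(abs(abs(z) - 1) < 1e-12 for z in roots)       # all lie on the unit circle
assert all(abs(w ** k - 1) > 1e-9 for k in range(1, n))  # the order of w is exactly n
assert abs(sum(roots)) < 1e-9                            # the n roots sum to zero
```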

**Representing Functions as Polynomials or Vectors.** Let be the set of integers . To identify a function , it suffices to write down its evaluations on all . Thus we can treat as a dimensional vector, whose th entry is just for . This shows that .

We can also represent a function (from above) as a polynomial of degree at most , the representation being . This shows , where is the ring of polynomials with coefficients in and indeterminate , and contains polynomials with degree at most . Why? Because when we take , we set every in , turning every into . Neat. (Why is F[x] a ring?)

**The Chinese Remainder Theorem.** The good news is, factors completely in because by assumption, contains all th roots of unity. This means, . Since the factors are relatively prime to each other, by the Chinese Remainder Theorem, there is a bijection which maps to a -tuple whose th entry is .

**From Coefficients to Evaluations.** Let us write . We can express where the degree of the remainder, , is strictly less than the degree of the divisor . (This means is just a scalar.) It is easy to check that and , giving us . Therefore, the transformed polynomial can be completely described by the evaluations of over the roots of unity , the evaluations being equal to for each .
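The remainder-equals-evaluation step can be checked directly: dividing f by a linear factor x - a leaves the scalar f(a) (Horner's rule), so the transform is just the list of evaluations at the roots of unity. A small sketch, with function names mine:

```python
import cmath

def poly_rem_linear(coeffs, a):
    """Remainder of f(x) divided by (x - a): synthetic division leaves f(a)."""
    r = 0
    for c in reversed(coeffs):
        r = r * a + c
    return r

def dft(coeffs, w):
    """The transform of f is its list of remainders modulo x - w^i,
    i.e., the evaluations f(1), f(w), f(w^2), ..."""
    n = len(coeffs)
    return [poly_rem_linear(coeffs, w ** i) for i in range(n)]

n = 4
w = cmath.exp(2j * cmath.pi / n)       # here w = i, the imaginary unit
f = [3, 1, 0, 2]                       # 3 + x + 2x^3
fhat = dft(f, w)
assert abs(fhat[0] - sum(f)) < 1e-9    # remainder mod (x - 1) is f(1)
```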

**Connections with Polynomial Interpolation.** This way, the transformed polynomial can be completely described by the evaluations of over the roots of unity . Notice how it sounds like an interpolation problem: every polynomial of degree at most can be reconstructed from evaluations in any field. Indeed, the Lagrange interpolation theorem is a restatement of the Chinese Remainder Theorem.

**Enter DFT.** The transform is the Discrete Fourier Transform. The th coefficient of the transformed polynomial is called the th Fourier coefficient of . A root of unity , when treated as a function from defined as for any , is called a character of . The Fourier Transform maps a function from the standard (monomial) basis to the (Fourier) basis of characters.

**The Matrix of the Transform.** So, what does the transform look like? Let , where both are treated as dimensional vectors over , and where the th coordinate of is . Recall that in polynomial representation, . We want to force every coordinate to be equal to . Notice that this equals , where is the vector mentioned above, is the vector whose th entry is , and is the usual inner product. Then we can readily see that $\hat{f} = Tf$ makes sense as a matrix-vector multiplication, where the th row of the matrix is the transpose of the vector . All told, the matrix , containing at its row and column , is the Discrete Fourier Transform matrix. The above construction works even when is not a prime. (Notice that we have abused the notation and used to denote both a linear transform and its matrix.)

*Actually, I think I missed a scale-factor. The coefficients of the Discrete Fourier Transform of a function are actually scaled by a factor . I will correct my derivation later. The above discussion at least gives the structural insight behind the Fourier Transform.*

**Afterthoughts.** The transform is discrete because the multiplicative group of the roots of unity, , is a finite group. It is isomorphic to under a map which sends the generator of (which is when is a prime) to the generator of , which is . The transform is also linear, because the inner product formulation is linear in both the coefficients of and the entries of the matrix .

**A Glimpse of the Uncertainty Principle.** The weight of a polynomial is the number of its nonzero coefficients. Loosely speaking, the uncertainty principle says that the weights of and cannot both be small. An extreme example: if has only one nonzero coefficient, then none of its Fourier coefficients is zero. Similarly, if is the uniform (constant) function, then its Fourier transform has weight one. Let’s examine below why that is true.

A (Dirac) delta function can be represented as a vector with all zero entries except at coordinate , which contains . A uniform function for all will be the vector , where is the all-ones vector.

Everyone tells us that the Fourier transform of is , and the Fourier transform of is . If we see the Fourier Transform as multiplying a vector by the matrix , it is easy to see why.

All entries in the first row and column of the matrix are , because their values are where either or is zero. Multiplying with gives us the th column of , which is when . Otherwise, all Fourier coefficients of are nonzero. In particular, .

In contrast, multiplying with the vector adds up all columns of (times the scale factor ). This makes each Fourier coefficient (except the th coefficient) equal to the sum of all roots of unity, and this sum is zero. To see this, note that where . Since satisfies , this implies . Since , it follows that , which is the sum of all th roots of unity, must be zero. On the other hand, the th coefficient is just the sum of ones. Hence, .
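Both computations can be checked numerically; the unscaled transform and the function name `dft` are my choices for this sketch:

```python
import cmath

def dft(v):
    """Multiply by the n x n matrix with entries w^(i*j) (no scale factor)."""
    n = len(v)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(w ** (i * j) * v[j] for j in range(n)) for i in range(n)]

n = 8
delta = [1] + [0] * (n - 1)
uniform = [1] * n

assert all(abs(c - 1) < 1e-9 for c in dft(delta))   # delta -> the all-ones vector
spike = dft(uniform)                                 # all-ones -> n times a delta
assert abs(spike[0] - n) < 1e-9
assert all(abs(c) < 1e-9 for c in spike[1:])
```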

Filed under: Algebra, Computer Science, Expository, Fourier Transform, Mathematics Tagged: Chinese Remainder Theorem, Fourier Transform, roots of unity

None of the factors above can be factored any further; hence they are irreducible over . Let us call the trivial factor, since the root already belongs to . But why do we have two nontrivial irreducible factors of , each of degree 3, whereas has only one nontrivial irreducible factor of degree 12? It appears that either there is only one nontrivial factor, or the degrees of all nontrivial factors are the same. Why?

More formally, let be the number of nontrivial irreducible polynomials of degree appearing in the factorization of over GF(2). We would like to understand this quantity.

In another post, I discussed the structure of the quotient ring . There, we have shown that

where is the smallest natural that makes , and . is called the multiplicative order of modulo . Since Fermat’s little theorem tells us that , it follows that must divide and will always be an integer.

So, what? (A little head scratching.) Okay. A root of the polynomial $x^p-1$ is called a th root of unity. Our base field, , contains 1, the trivial root of unity. It turns out, the smallest extension over that contains a non-trivial th root of unity is . (Why?) This extension , once again, is isomorphic to the quotient field where is an irreducible polynomial of degree over . Each copy of in the decomposition of the quotient ring corresponds to an irreducible polynomial of degree . These polynomials are the factors of over . Therefore,

This means, in the factorization of over GF(2),

- The degree of all nontrivial irreducible factors is the same, and it equals the order of modulo
- The number of these factors times their degree equals
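These bullet points can be checked for the small primes discussed in this post, since the factor degree is just the multiplicative order of 2; the function name is mine:

```python
def multiplicative_order(a, p):
    """Smallest d >= 1 with a^d = 1 (mod p)."""
    d, x = 1, a % p
    while x != 1:
        x = x * a % p
        d += 1
    return d

# degree of every nontrivial irreducible factor of x^p - 1 over GF(2),
# and the number of such factors, for a few odd primes:
for p, (deg, count) in {5: (4, 1), 7: (3, 2), 11: (10, 1), 17: (8, 2)}.items():
    d = multiplicative_order(2, p)
    assert (d, (p - 1) // d) == (deg, count)
```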

Now that is awesome. Moreover, this argument is valid over any base field of prime order , which was in the above discussion.

For primes and , ; hence there is only one nontrivial irreducible factor, of degree , in the expansion of over GF(2). This is true in general for primes for which 2 is a primitive root.

When , the order of 2 modulo 7 is . This is why there are exactly 2 irreducible factors, each of degree 3, in the expansion of over GF(2). When , the order of 2 modulo 17 is . This is why there are only 2 irreducible factors, each of degree 8, in the expansion of over GF(2). A similar analysis holds for .

Let be the number of monic degree- irreducible polynomials over $latex F_2$. (See this sequence and this Wikipedia article.) The splitting fields of two irreducible polynomials of the same degree are isomorphic. (Why? See this post again.) Hence the polynomials of a given degree appearing in the factorization of are not uniquely determined; unless, of course, all of them appear together, which makes the factorization unique. This is indeed the case for , where all monic irreducible polynomials of degree appear as its factors.

Filed under: Algebra, Computer Science, Expository, Number Theory Tagged: polynomial factorization, quotient ring

where each is an irreducible polynomial. What are the degrees of ? What is ?

We claim that where is the multiplicative order of mod .

This structure has been used in a beautifully instructive paper by Evra, Kowalski, and Lubotzky. (Yes, the Lubotzky in Lubotzky-Phillips-Sarnak.) In this paper, they have established connections among the Fourier transform, the uncertainty principle, and the dimension of ideals generated by polynomials in the ring above. We’ll talk about these connections in another post. The content of this post is Proposition 2.6(3) from their paper.

Let be the order of mod , which means is the smallest natural number satisfying , or equivalently, . If contains any non-trivial th root of unity, then would indeed contain all the th roots. Then would be separable in , resulting in .

Let be a non-trivial root of unity which does not already belong to . Then the cyclotomic extension is the smallest extension containing all the th roots of unity; it is a splitting field of , and , where is an irreducible polynomial of degree with the property that , and hence every root of is a root of unity. Thus in the equation (*).

In a fixed algebraic closure , is unique (up to isomorphism) as “the” splitting field of any irreducible polynomial (in ) of degree . Thus is isomorphic to the cyclotomic extension for any via the isomorphism of the roots . Moreover, where is an irreducible factor of (over ) of degree . Thus we can set in equation (*).

Continuing in this manner, we find (the existence of) the irreducible polynomials , each of degree , so that . This shows us

Notice that this factorization is not unique. Let be the number of irreducible polynomials of degree over . This number is given by the Necklace Polynomial. For example, fix . When but , so we could use any two of the 30 available irreducible polynomials.

Filed under: Algebra, Computer Science, Elementary, Expository Tagged: quotient ring

Let be a finite metric space. Let be a Gaussian process where each is a zero-mean Gaussian random variable. The distance between two points is the square root of the expected squared difference of the corresponding variables (the canonical metric of the process). In this post, we are interested in upper-bounding .

**Question:** How large can the quantity be?

In this post we are going to prove the following fact:

where is the distance between the point from the set , and is a specific sequence of sets with . Constructions of these sets will be discussed in a subsequent post.

Notice that the metric structure (i.e., the pairwise distances) does not change if we translate all the variables by a constant. Hence, let us pick an arbitrary point and observe that equals by the linearity of expectation. This formulation has the advantage that the random quantity is clearly non-negative (it is zero only in the degenerate case where all the variables coincide).

Let . Recall that, by the definition of expectation, . It looks like we need to upper-bound the quantity first. An immediate thought is to use a union bound, which bounds the max from above by the sum of the individual tail contributions:
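In standard notation (the labels $latex X_t$ for the process and $latex \sigma^2$ for the largest variance are mine), the union-bound step reads:

```latex
\Pr\Big[\max_{t \in T} X_t \ge u\Big]
  \;\le\; \sum_{t \in T} \Pr[X_t \ge u]
  \;\le\; |T| \, e^{-u^2/(2\sigma^2)},
\qquad \sigma^2 = \max_{t \in T} \operatorname{Var}(X_t).
```

Integrating this tail yields the familiar bound $latex \mathbb{E}\max_{t\in T} X_t \le \sigma\sqrt{2\ln |T|}$.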

However, this could potentially burn us if either there are “too many” points (albeit each with a small contribution), or if the points are “too close” to each other (so that each point makes a large contribution). What can we do to prevent this?

Maybe we should try to group nearby points together in clusters. Here is an idea: we will do a “gradual clustering”, as described below.

At steps we will maintain a set containing some “representative points”. We will try to spread these points across the entire space as best we can. At step , every point will use its nearest point in as its representative. Let us call this point . Let us initialize with the (arbitrary) point we selected in the above discussion.

When , note that . The first two points are close (as we had planned), and hence the supremum, when taken across all points , should be small and more controllable. As for the last two terms, note that the number of representative points should be small compared to the total number of points (we need to carefully control it, though), and hence we hope that the union bound on this part will not hurt us much.

After steps of choosing representative sets , we can write the telescoping sum provided we can make sure that when is sufficiently large. (That is, each point would form its own cluster.)

We control the size of the sets ; in particular, we ensure that where . (We’ll show later how we do it.) Because are Gaussian random variables, it follows that for , we have

.

Let us denote by the event that for all and all points ,

Think of as a good event, which means successive approximations of points by representatives are “close”, and hence possibly converging. By combining the two previous equations, we have

.

Let be the complement of the event ; we think of as a bad event, which means that for some point , a pair of successive approximations to it is “far apart”. We can bound the probability of this bad event by a union bound over all and all point-pairs between and .

At the th step, there are

representative pairs between and . Therefore,

Because the high-order terms in the sum in fall off quickly when , we can write where is a constant.

When occurs, it means for any point, all its successive approximations are “close”. In this case, we see (by telescoping ) that where .

For any point , using the triangle inequality, since between and , is farther from by construction. Thus, .

However, when does not occur, the following holds: . The probability of this is exactly . In other words,

By integrating, we get that = since does not depend on , and the integral of a Gaussian is bounded by a constant.

This completes our proof of the following fact:
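In the standard notation of generic chaining (writing $latex T_n$ for the representative sets with $latex |T_0| = 1$ and $latex |T_n| \le 2^{2^n}$, and $latex C$ for an absolute constant; these labels are mine), the fact reads:

```latex
\mathbb{E}\,\sup_{t \in T} X_t
  \;\le\; C \,\sup_{t \in T}\, \sum_{n \ge 0} 2^{n/2}\, d(t, T_n).
```

This is Talagrand's generic chaining upper bound; the right-hand side is his $latex \gamma_2$ functional.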

Filed under: Computer Science, Expository, Mathematics, Probability, Theoretical Computer Science Tagged: gaussian process, generic chaining, hierarchical clustering, supremum, talagrand

*To this day, no method of finding a generator of is known to be more efficient than essentially trying 2, then 3, and so on. Who cares? Well, the difficulty of breaking a certain public key cryptosystem (due to El Gamal) depends on the difficulty of working with generators of .* — Keith Conrad

An th root of unity in a finite field is an element satisfying , where is an integer. If is the smallest positive integer with this property, is called a *primitive* th root of unity. If is a primitive th root of unity, then all elements in the set are also roots of unity. In fact, the set forms a cyclic group of order under multiplication, with generator .

**Problem:** Suppose you are given a finite field of degree , and you are promised that there indeed exists a primitive th root of unity for prime. Find , and in particular, produce C++ code that finds it.

In what follows, we talk about how to find such a root and provide my C++ code; the code uses the awesome NTL library.

Let be the multiplicative group of the finite field . Let be the order of . For a th root of unity to exist, must divide . Why? Because by Lagrange’s theorem, the order of a subgroup (here, the group of the th roots of unity under multiplication) must divide the order of the group, which is .

So, how does one find ? The short answer is: via a generator of , in three steps. (1) Randomly pick an element . (2) Check that for all prime factors of . (3) The element will be a th root of unity in .

**About the code**

The following C++ code is written using Victor Shoup’s super awesome library NTL. The code is part of another project I am working on: I need to compute the rank of submatrices of a Vandermonde matrix generated by a th root of unity in the finite field , given divides . The field is represented as the set of polynomials of degree strictly less than over , which is isomorphic to the quotient ring where is an irreducible polynomial of degree exactly . Since is irreducible over , this ring is indeed a field.

The following code implicitly uses namespace NTL. The type NTL::Vec has been typedef’d to Vector.

**Part 1: Finding a generator of a cyclic group **

The multiplicative group of a finite field is cyclic. (Why?) Moreover, it is a direct product of many cyclic subgroups.

There is no algebraic method to directly pinpoint a generator of a multiplicative group. The number of generators of equals the number of integers in that are coprime to . This quantity is , where is Euler’s totient function defined as and . It is known that the ratio is at least . (Source?) Hence a random element has a decent probability of being a generator.

Now, given an element , how do we test whether it generates the entire multiplicative group and not a proper subgroup? For to be a generator of , we need . Obviously, we need to check that . Second, we need to check that for all positive integers between and .

However, there is a better way of checking this, provided we are also given the list of prime factors of . If generates a proper subgroup of , then would be a proper divisor of , since by Lagrange’s theorem the order of a subgroup divides the order of the group. Since we have already seen $latex g^p=1$, we can be sure that for all other prime factors of . We don’t need to test all divisors of ; checking that for all primes dividing suffices, since they cover the candidate divisors.

The following code uses the function factorInt() to extract unique prime factors of ; this function is discussed later on.

```cpp
GF2X findGenerator() {
    GF2X gen;
    ZZ otherFactor(groupOrder / params.p);
    factors.kill();
    factorInt(factors, otherFactor);
    unique<ZZ>(factors);
    factors.append(ZZ(params.p));

    bool genGood = false;
    while (not genGood) {
        // try a random element as a generator
        gen = GF2X::zero();
        while (IsZero(gen))
            random(gen, params.degree);   // gen != zero
        genGood = true;                   // temporarily
        for (auto factor : factors) {
            ZZ orderOverFactor(groupOrder / factor);
            if (unity == PowerMod(gen, orderOverFactor, modpoly)) {
                genGood = false;
                break;
            }
        }
    }
    // gen^(groupOrder/factor) != unity for all factors, including p
    return gen;
}
```

**Part 2: Finding a primitive root of unity**

Next, let us find a primitive th root of unity using the generator . If we set and where is a generator of , then . Note that must be a positive integer since divides . It follows that is a th root of unity in the finite field . Moreover, since is prime, the cyclic group has no non-trivial subgroups, meaning there cannot be any integer such that . Hence is a primitive th root of unity.

(To do: how would one proceed if was not a prime and we insisted on a *primitive* root instead of any root?)

```cpp
// Set the unity and the monomial X elements
SetCoeff(unity, 0, 1);
SetCoeff(X, 1, 1);

// Multiplicative group order: 2^degree - 1
ZZ one(1L);
groupOrder = LeftShift(one, params.degree) - 1;

// unknown degree, got to build irred and the modulus
BuildIrred(irred, params.degree);
build(modpoly, irred);
GF2E::init(irred);

// p must divide the (multiplicative) group order
if (groupOrder == (ZZ) params.p) {   // prime order
    gen = rootP = X;
    factors.kill();
    factors.append((ZZ) params.p);
} else {
    gen = findGenerator();           // gen is a generator
    // compute r = gen^(ord/p) so that r^p = gen^ord = 1;
    // since gen is a generator, gen^x = 1 for no x < ord
    rootP = PowerMod(gen, groupOrder / params.p, modpoly);
}
```

**Part 3: Finding the unique prime factors of any integer**

The above code uses a factorization function to factorize . Since we don’t need the exponents of individual primes for our application, only a list of unique factors suffices. Here is a code which implements Pollard’s rho algorithm, combined with Miller’s probabilistic primality test. It’s not supersonic, but good enough.

Given an integer , the algorithm basically finds one (possibly composite) factor and then proceeds recursively. First, it checks whether is even (or a power of 2), in which case 2 is a factor. (We could have done this for the first few primes, but I didn’t.)

Next, the obvious: it (probabilistically) checks whether is a prime, and if so, it gives up. The check is performed using the celebrated Miller-Rabin probabilistic primality test. (Digression: I once had a chance to attend a lecture by Gary Miller; he is awesome.)

Next, we randomly seed Pollard’s rho algorithm and try to find a factor. If it fails, we retry several times; if it still fails, we conclude that is probably prime. I tried 10 times, but you could retry as many times as you please. The function is used to imitate a pseudo-random sequence.

Once Pollard’s rho algorithm finds a factor of , the next step is obvious: we remove all powers of from . Then we recurse on and .

```cpp
void factorInt(Vec<ZZ>& factors, ZZ n) {
    if (n <= 1) return;

    ZZ x, y, gcd, seed;
    ZZ numTrials(10L);
    ZZ trial(0L);
    ZZ q;

    do {
        // handle even n (and powers of two)
        if (not IsOdd(n)) {
            factors.append(ZZ(2));
            while (n > 2 && divide(q, n, 2)) n = q;
            if (n <= 2) return;   // n was a power of two
        }

        if (ProbPrime(n)) break;

        // n is not a prime: seed Pollard's rho
        RandomBnd(seed, numTrials);
        x = y = seed + 2;
        gcd = ZZ(1L);
        while (gcd == 1) {
            g(x, x, n);           // tortoise: one step
            g(y, y, n);           // hare: two steps
            g(y, y, n);
            gcd = GCD(abs(x - y), n);
        }

        if (gcd == n) {
            trial++;
            continue;             // retry with a new seed
        } else {
            // remove all powers of gcd from n
            while (divide(q, n, gcd)) n = q;
            // recurse on the remaining part and on the factor found
            Vec<ZZ> childFactors;
            if (n > 1) factorInt(childFactors, n);
            if (gcd > 1) factorInt(childFactors, gcd);
            factors.append(childFactors);
            return;
        }
    } while (n > 1 && trial < numTrials);

    // probably prime
    factors.append(n);
}
```

**A brief remark on Pollard’s rho algorithm.** *(Please let me know if I have a gap in my argument.)* Imagine the group as a cycle of size . Then, the function samples a (sub)cycle of this big cycle, where divides . If we are unlucky, the sampled cycle is the original one. Otherwise, the sampled cycle is pretty small (of size ). This means the set corresponding to the seed , together with the identity element, forms a subgroup of $latex \mathbb{Z}/n\mathbb{Z}$. By Lagrange’s theorem, must divide $latex n = \mathrm{ord}(\mathbb{Z}/n\mathbb{Z})$.

Note that Pollard’s rho algorithm actually detects two elements which are congruent modulo . Then the difference must be a multiple of . The algorithm outputs , which is strictly less than if indeed .

As mentioned in the Wikipedia article linked above, this algorithm can be improved by replacing multiple GCD computations with a single computation (in batch, you could say).

**Bonus material: Unique elements from an NTL vector**

Finally, the following code keeps only the unique elements of an NTL vector. I used a brute-force loop. Ideally, one should use a better search method (possibly a binary search tree or a hash table). However, GCC complained when I tried to use a hashmap with NTL::ZZ keys/values, so I gave up to save some development time.

```cpp
template <typename T>
void unique(Vec<T>& src) {
    std::vector<T> arr;
    toStdVector<T>(arr, src);
    src.kill();
    while (arr.size() != 0) {
        // keep the first remaining element, then erase all its copies
        T elem = arr[0];
        src.append(elem);
        auto it = arr.begin();
        while (it != arr.end()) {
            if (*it == elem) it = arr.erase(it);
            else ++it;
        }
    }
}
```

I’d be glad to hear what you think. How would *you* have done it?

Filed under: Code, Computer Science, Mathematics, Number Theory, Randomized Algorithms Tagged: C++, finite field, generator, NTL, prime factorization, primitive roots of unity