In this post, I am going to review two erasure codes: the Blaum-Bruck-Vardy code and the BASIC code (also here). These are erasure codes, which means, their purpose is to encode a number of data disks into a number of coding disks so that when one or more data/coding disks fail, the failed disk can be reconstructed using the existing data and coding disks.
A strength of these codes is that although the algebra is described on extension fields/rings over , the encoding/decoding process uses only Boolean addition/rotation operation and no finite field operation. These codes are also MDS (Maximum Distance Separable), which means they have the largest possible (minimum) distance for a fixed message-length and codeword-length.
(Recall that if a code has data components and parity components in its generator matrix in standard form, its distance is at most by the Singleton bound. Hence the code is MDS if and only if it can tolerate arbitrary disk failures.)
The BASIC code does the following things in relations to the BBV code:
- Adds a virtual parity bit after each disk, giving each disk an even parity
- Does polynomial arithmetic modulo instead of as in the case of BBV code
- Shows equivalence to the BBV code by making a nice observation via Chinese Remainder Theorem
- Proves MDS property for any number of coding disks when is “large enough” and has a certain structure
Open Question: What is the least disk size for which these codes are MDS with arbitrary distance?
The Blaum-Bruck-Vardy code
Let be an odd prime. (Bounds on in connection to the distance of the code will come later.) Although there can be at most data disks, we will assume there are exactly of them. There are coding disks. Each data disk is represented as a Boolean polynomial of degree , that is, a polynomial in the ring
Let be the data disks. Let denote a cyclic shift of the polynomial modulo . If , the shifted polynomial is the usual shift (modulo ) followed by a flip in all (Boolean) coefficients, since modulo and an addition modulo is a flip for Boolean coefficients.
With this definition, the th coding disk , for , is is the (Boolean) sum
Given the data disks , the collection
is treated as a systematic codeword of length with information symbols and coding symbols. The rate of this code is . Since we want high-rate codes, we are interested in small values of .
Suppose is the minimum distance of this code. The singleton bound says is at most . When code achieves this distance it is said to be an MDS code. For this code, it happens when
- for all
- when modulo and excludes some small subsets (depending on
A lower-bound on . Suppose the BBV code has data disks and coding disks. According to this analysis, when is a primitive root modulo , the BBV code is MDS for all if is strictly greater than . (See the above link for an exact bound.) This is a sufficient condition. A lower-bound on for this code to be MDS is still an open problem (as of October 2017).
Summary of the BBV code
Parameters/Conditions: For a prime , the disk size is ,. There are data disks, , and coding disks. Fault tolerance: The code is MDS (that is, a -linear code) for up to disk failures for all ; up to failures when is a prime for which is a primitive element; up to any number of failures if . Alphabet: . Encoding process: See above. Decoding process: Skipped.
- Coding disks are independent of each other
- The shifts are modulo which is an irreducible polynomial
- The polynomials have Boolean coefficients
- Only Boolean additions and rotations are performed. No finite field operations needed for encoding/decoding
Limitations: Not sure.
The BASIC code
The Boolean Addition and Shift Implementable Cyclic-convolutional code, or the BASIC code in short, is like the BBV code. The difference is that the polynomial arithmetic is performed modulo instead of .
Connections with the BBV code. When is an odd prime number such that is a primitive root modulo , the polynomial factors into two irreducible factors over as . This post talks about the factoring, and Artin’s conjecture says that there are infinitely many such primes.
The data disks are given an even weight by adding a (virtual) parity bit, and hence both the data and coding disks are Boolean polynomials of an even weight in the ring
Let be the subring of containing only even-weight polynomials. By the Chinese Remainder Theorem, there is a bijection taking every in to the tuple . However, since has an even weight, is an even number which is zero (modulo 2) for all . Thus the ring is isomorphic to the field used in the BBV code.
Data Disks. Let be an odd number. The disks contain bits each. By (virtually) appending a parity bit, the th data disk is treated as an element of the ring
That is, the polynomial for the th disk has degree and an even number of nonzero coefficients.
The ring can be described in two ways. First, it is the set of polynomials of the form for every . Notice that the dimension of over is because, well, it contains all (and only) the even-weight words. Second, since the th coefficient is the parity bit, it is one (resp. zero) when the Boolean sum of all other coefficients is one (resp. zero). Hence the total weight is always zero.
Why does the polynomial have an even weight? Let denote the th coefficient of any polynomial, and let . Notice that is whenever differs from . The number of such 0-1/1-0 transitions is always even because modulo , the first and last coefficients are adjacent. Hence will have even weight.
The check polynomial for any element is the all-ones polynomial
because .Why? Because if we imagine a matrix whose first column is the coefficient-vector , second column is , third column is and so on, and recalling that is a rotation, it becomes clear that the sum along each row will be zero since the first column contains an even number of ones to begin with.
Coding disks. Suppose we have data disks
We treat the -tuple as the message. Recall that for each of these polynomials, the th coefficient, which is not part of the original data, is the sum of all other coefficients. The th coding disk , then, is defined as the Boolean sum of the columns , each rotated-down by times. That is,
for . In other words, is the vector inner product of the vectors and .
Generator matrix. The generator matrix for this code, in standard form, looks like , and a systematic codeword is where
is a Vandermonde matrix with entries in . Hence a codeword is a -linear combination of row-vectors over .
Invertibility in the ring . Since is odd, the polynomial is separable over . (Why?) Hence the check polynomial splits into linear factors in its splitting field. It is not hard to see that a polynomial has if and only if is a unit modulo . By the Chinese Remainder Theorem, this happens if and only if is a unit modulo each of the linear factors of . Such a polynomial is called -invertible.
MDS-ness and Vandermonde minors. For this code to be MDS, it has to be able to decode with up to disk failures. The structure of the generator matrix tells us that if any data disks fail, any submatrix of involving any coding columns and the remaining data columns must be invertible. Since the columns corresponding to the data disks are standard basis vectors, this means any order- submatrix of the Vandermonde matrix must be invertible in the ring . Put differently, every -minor of must be -invertible. This invertibility condition is, in fact, equivalent to the MDS condition.
MDS-ness and a lower-bound on . The analysis in the BASIC code paper mentioned above and this paper shows that for this code to be MDS, it is sufficient that is an odd prime number such that is a primitive element modulo .
The intuition behind the above bound is as follows. To prove that all minors of a generalized Vandermonde matrix are invertible modulo , they computed the maximum degree of any minor modulo using this result. They observed that is irreducible over when is a primitive root modulo . Therefore, any polynomial over with degree at most must be coprime to , and hence invertible modulo . This gives the sufficient bound . This bound on can be turned on its head and used as a bound on if we know an upper bound on .
What could be the largest degree of a minor of order ? Consider an order- submatrix given by the rows and columns of the Vandermonde matrix . Assume that the index sets and are in increasing order. If we look into the Leibniz expansion of the determinant of , the term having the largest degree must be the product of the diagonal entries. (Why? We can prove it by induction. Use the case as the base case and use the fact that for all real .) The degree of this term will be , which is no more than . Since , the bound follows.
Summary of the BASIC code
Parameters/Conditions: For a prime , the disk size is ,. There are data disks and coding disks. Fault tolerance: The code is MDS (that is, a -linear code) for up to disk failures for all ; up to failures when is a prime for which is a primitive element; up to any number of failures if . Alphabet: . Encoding process: See above. Decoding process: Skipped.
- All of the BBV code, but taking remainders modulo is computationally and conceptually simpler than doing the same modulo
Limitations: Not sure