In this post, I am going to review two erasure codes: the Blaum-Bruck-Vardy code and the BASIC code (also here). These are erasure codes, which means, their purpose is to encode a number of data disks into a number of coding disks so that when one or more data/coding disks fail, the failed disk can be reconstructed using the existing data and coding disks.
A strength of these codes is that although the algebra is described on extension fields/rings over , the encoding/decoding process uses only Boolean addition/rotation operation and no finite field operation. These codes are also MDS (Maximum Distance Separable), which means they have the largest possible (minimum) distance for a fixed message-length and codeword-length.
(Recall that if a code has data components and
parity components in its generator matrix in standard form, its distance is at most
by the Singleton bound. Hence the code is MDS if and only if it can tolerate
arbitrary disk failures.)
The BASIC code does the following things in relations to the BBV code:
- Adds a virtual parity bit after each disk, giving each disk an even parity
- Does polynomial arithmetic modulo
instead of
as in the case of BBV code
- Shows equivalence to the BBV code by making a nice observation via Chinese Remainder Theorem
- Proves MDS property for any number of coding disks when
is “large enough” and has a certain structure
Open Question: What is the least disk size for which these codes are MDS with arbitrary distance?
The Blaum-Bruck-Vardy code
Let be an odd prime. (Bounds on
in connection to the distance of the code will come later.) Although there can be at most
data disks, we will assume there are exactly
of them. There are
coding disks. Each data disk is represented as a Boolean polynomial of degree
, that is, a polynomial in the ring
where
.
Let be the data disks. Let
denote a cyclic shift of the polynomial
modulo
. If
, the shifted polynomial is the usual shift (modulo
) followed by a flip in all (Boolean) coefficients, since modulo
and an addition modulo
is a flip for Boolean coefficients.
With this definition, the th coding disk
, for
, is is the (Boolean) sum
Given the data disks , the collection
is treated as a systematic codeword of length with
information symbols and
coding symbols. The rate of this code is
. Since we want high-rate codes, we are interested in small values of
.
Suppose is the minimum distance of this code. The singleton bound says
is at most
. When code achieves this distance it is said to be an MDS code. For this code, it happens when
for all
-
when
modulo
and
excludes some small subsets (depending on
A lower-bound on . Suppose the BBV code has
data disks and
coding disks. According to this analysis, when
is a primitive root modulo
, the BBV code is MDS for all
if
is strictly greater than
. (See the above link for an exact bound.) This is a sufficient condition. A lower-bound on
for this code to be MDS is still an open problem (as of October 2017).
Summary of the BBV code
Parameters/Conditions: For a prime , the disk size is
,. There are
data disks,
, and
coding disks. Fault tolerance: The code is MDS (that is, a
-linear code) for up to
disk failures for all
; up to
failures when
is a prime for which
is a primitive element; up to any number of failures if
. Alphabet:
. Encoding process: See above. Decoding process: Skipped.
Strengths:
- Coding disks are independent of each other
- The shifts are modulo
which is an irreducible polynomial
- The polynomials have Boolean coefficients
- Only Boolean additions and rotations are performed. No finite field operations needed for encoding/decoding
Limitations: Not sure.
The BASIC code
The Boolean Addition and Shift Implementable Cyclic-convolutional code, or the BASIC code in short, is like the BBV code. The difference is that the polynomial arithmetic is performed modulo instead of
.
Connections with the BBV code. When is an odd prime number such that
is a primitive root modulo
, the polynomial
factors into two irreducible factors over
as
. This post talks about the factoring, and Artin’s conjecture says that there are infinitely many such primes.
The data disks are given an even weight by adding a (virtual) parity bit, and hence both the data and coding disks are Boolean polynomials of an even weight in the ring
.
Let be the subring of
containing only even-weight polynomials. By the Chinese Remainder Theorem, there is a bijection taking every
in
to the tuple
. However, since
has an even weight,
is an even number which is zero (modulo 2) for all
. Thus the ring
is isomorphic to the field
used in the BBV code.
Data Disks. Let be an odd number. The disks contain
bits each. By (virtually) appending a parity bit, the
th data disk is treated as an element
of the ring
That is, the polynomial for the
th disk has degree
and an even number of nonzero coefficients.
The ring can be described in two ways. First, it is the set of polynomials of the form
for every
. Notice that the dimension of
over
is
because, well, it contains all (and only) the even-weight words. Second, since the
th coefficient is the parity bit, it is one (resp. zero) when the Boolean sum of all other coefficients is one (resp. zero). Hence the total weight is always zero.
Why does the polynomial have an even weight? Let
denote the
th coefficient of any polynomial, and let
. Notice that
is
whenever
differs from
. The number of such 0-1/1-0 transitions is always even because modulo
, the first and last coefficients are adjacent. Hence
will have even weight.
The check polynomial for any element is the all-ones polynomial
because .Why? Because if we imagine a matrix whose first column is the coefficient-vector
, second column is
, third column is
and so on, and recalling that
is a rotation, it becomes clear that the sum along each row will be zero since the first column contains an even number of ones to begin with.
Coding disks. Suppose we have data disks
We treat the -tuple
as the message. Recall that for each of these polynomials, the
th coefficient, which is not part of the original data, is the sum of all other coefficients. The
th coding disk
, then, is defined as the Boolean sum of the columns
, each rotated-down by
times. That is,
for . In other words,
is the vector inner product of the vectors
and
.
Generator matrix. The generator matrix for this code, in standard form, looks like , and a systematic codeword is
where
is a Vandermonde matrix with entries in . Hence a codeword is a
-linear combination of
row-vectors over
.
Invertibility in the ring . Since
is odd, the polynomial
is separable over
. (Why?) Hence the check polynomial
splits into linear factors in its splitting field. It is not hard to see that a polynomial
has
if and only if
is a unit modulo
. By the Chinese Remainder Theorem, this happens if and only if
is a unit modulo each of the linear factors of
. Such a polynomial
is called
-invertible.
MDS-ness and Vandermonde minors. For this code to be MDS, it has to be able to decode with up to disk failures. The structure of the generator matrix tells us that if any
data disks fail, any
submatrix of
involving any
coding columns and the remaining
data columns must be invertible. Since the columns corresponding to the data disks are standard basis vectors, this means any order-
submatrix of the Vandermonde matrix
must be invertible in the ring
. Put differently, every
-minor of
must be
-invertible. This invertibility condition is, in fact, equivalent to the MDS condition.
MDS-ness and a lower-bound on . The analysis in the BASIC code paper mentioned above and this paper shows that for this code to be MDS, it is sufficient that
is an odd prime number such that
is a primitive element modulo
.
The intuition behind the above bound is as follows. To prove that all minors of a generalized Vandermonde matrix are invertible modulo , they computed the maximum degree
of any minor modulo
using this result. They observed that
is irreducible over
when
is a primitive root modulo
. Therefore, any polynomial
over
with degree at most
must be coprime to
, and hence invertible modulo
. This gives the sufficient bound
. This bound on
can be turned on its head and used as a bound on
if we know an upper bound on
.
What could be the largest degree of a minor of order
? Consider an order-
submatrix
given by the rows
and columns
of the Vandermonde matrix
. Assume that the index sets
and
are in increasing order. If we look into the Leibniz expansion of the determinant of
, the term having the largest degree must be the product of the diagonal entries. (Why? We can prove it by induction. Use the
case as the base case and use the fact that
for all real
.) The degree of this term will be
, which is no more than
. Since
, the bound
follows.
Summary of the BASIC code
Parameters/Conditions: For a prime , the disk size is
,. There are
data disks and
coding disks. Fault tolerance: The code is MDS (that is, a
-linear code) for up to
disk failures for all
; up to
failures when
is a prime for which
is a primitive element; up to any number of failures if
. Alphabet:
. Encoding process: See above. Decoding process: Skipped.
Strengths:
- All of the BBV code, but taking remainders modulo
is computationally and conceptually simpler than doing the same modulo
Limitations: Not sure