Reed-Solomon Codes

March 14, 2025

The subject of this post is an introduction to error-correcting codes, with the main goal of defining a class of codes called Reed-Solomon codes. I eventually want to discuss another class of codes called Algebraic Geometry codes in the next post, so this will serve as an optional post giving a quick and dirty background on codes for those unfamiliar with the general theory.

The notion of an algebraic error correcting code over an alphabet $\bbf_q$ of length $n$ and dimension $k$ is a $k$-dimensional linear subspace $\calc \subseteq \bbf_q^n$. Codes are a way of efficiently introducing redundancy into messages in order to protect against interference by noisy channels.

For example, suppose Alice and Bob are standing on opposite sides of a highway and Alice wants to tell Bob the message "Hello". If she just yells it once, Bob might not hear, so instead she can yell "HelloHelloHello" so that even if parts of the message are lost, Bob is still likely to understand.

This is not very efficient, however. Consider Alice sending a binary message of length $k$, say $101100100$, and suppose that our noisy channel randomly flips bits, so Bob receives $111000101$, which has $3$ bits flipped. How can we encode Alice's message such that Bob can always correct at most $e$ errors?

Instead of binary, we can send messages in an alphabet of size $q$, say $\bbf_q$. Then one way to encode the message is to linearly map the message space into a higher-dimensional space, i.e. if Alice is sending one of $q^k$ possible messages of length $k$, we can construct an injective linear map $G \colon \bbf_q^k \to \bbf_q^n$.

The vectors $c = Gx$ are called codewords, and the image $\calc = \{Gx \mid x \in \bbf_q^k\}$ is called the code. $G$ is called the generator matrix for the code. The integer $k$ is the dimension of the code $\calc$, and $n$ is the message length of $\calc$. The Hamming distance between two vectors, $d(v, w) = \#\{i \colon v_i \neq w_i\}$, is the number of indices at which $v_i$ and $w_i$ differ.
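The Hamming distance is simple to compute; here is a minimal Python sketch (the function name is my own) applied to the noisy-channel example above:

```python
def hamming_distance(v, w):
    """d(v, w): the number of indices at which v and w differ."""
    assert len(v) == len(w)
    return sum(1 for a, b in zip(v, w) if a != b)

# The sent and received bit strings from the example above.
print(hamming_distance("101100100", "111000101"))  # → 3
```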

Definition The distance of the code $\calc$ is defined as the minimum distance between any two distinct codewords in $\calc$,

$$d = \min_{c \neq c' \in \calc} d(c, c')$$

We usually call a code $\calc$ with dimension $k$, message length $n$, distance $d$, and alphabet size $q$ a linear $[n,k,d]_q$ code.

Suppose Alice encodes her message $x$ as $c = Gx$, and the noisy channel introduces an error vector $e$ such that Bob receives $y = Gx + e = c + e$. Then the idea is for Bob to decode $y$ by taking the codeword $c' = \argmin_{c \in \calc} d(y, c)$ closest to the received $y$ in Hamming distance. Then Bob can recover the $x' \in \bbf_q^k$ such that $c' = Gx'$. The hope is that $c = c'$, i.e. Bob recovers Alice's message $x = x'$.
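This minimum-distance decoding rule can be sketched by brute force over all codewords (hopelessly slow for real codes, but it makes the definition concrete; the toy code below is the length-3 binary repetition code, not anything from the text):

```python
def hamming_distance(v, w):
    return sum(1 for a, b in zip(v, w) if a != b)

def decode(y, codewords):
    """Minimum-distance decoding: return the codeword closest to y."""
    return min(codewords, key=lambda c: hamming_distance(c, y))

# Toy example: the binary repetition code {000, 111}.
code = [(0, 0, 0), (1, 1, 1)]
print(decode((1, 0, 1), code))  # → (1, 1, 1): the single flipped bit is corrected
```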

How many errors can the channel introduce before Bob decodes the wrong codeword? Observe that the Hamming ball of radius $\lfloor (d-1)/2 \rfloor$ centered at a codeword $c$, defined as

$$B_n(c, \lfloor (d-1)/2 \rfloor) = \{y \in \bbf_q^n \mid d(y,c) \leq \lfloor (d-1)/2 \rfloor\}$$

is the largest such ball of received messages $y$ such that Bob corrects $y$ to $c$, i.e. $c = \argmin_{c' \in \calc} d(y, c')$. This is because any two distinct codewords $c, c'$ are at Hamming distance at least $d$, so the balls of radius $\lfloor (d-1)/2 \rfloor$ around them are disjoint; if $y$ exceeds this distance from $c$, it might be decoded to another codeword.

Thus we can correct any error $e$ with at most $\lfloor (d-1)/2 \rfloor$ nonzero entries. This is the reason why we usually want to construct codes with good distance, since the larger the distance the more errors we can correct. Of course, we also want good rate, which measures the amount of redundancy we have to introduce.

Definition For a family of linear $[n,k,d]_q$ codes $\calc$, the rate $R$ and relative distance $\delta$ are defined as

$$R = \lim_{n \to \infty} \frac{k}{n} \quad \text{and} \quad \delta = \lim_{n \to \infty} \frac{d}{n}$$

Our goal is to construct families of codes with asymptotically good rate and distance tradeoff.

We define the weight $\wt(v)$ of a vector as its number of nonzero entries, so in particular we can correct $e$ as long as $\wt(e) \leq \lfloor (d-1)/2 \rfloor$.

A very useful alternative definition for distance is the minimum weight of a nonzero codeword, $d = \min_{0 \neq c \in \calc} \wt(c)$. Note that $0 = (0, \ldots, 0)$ is always a codeword, and $\wt(c) = d(c, 0)$ is the Hamming distance between $c$ and the $0$ vector. Then one can check with not too much difficulty that $d(c, c') = d(c - c', 0) = \wt(c - c')$. Since $\calc$ is a linear subspace, $c - c' \in \calc$, which means

$$d = \min_{c \neq c' \in \calc} d(c, c') = \min_{0 \neq c \in \calc} \wt(c)$$
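The minimum-weight characterization makes the distance of a small code easy to compute by brute force. A sketch, assuming $q$ is prime so that arithmetic mod $q$ is field arithmetic (the function name and matrix convention are my own; $G$ is stored as its $n$ rows, matching the $n \times k$ generator matrix above):

```python
from itertools import product

def code_distance(G, q):
    """Distance of the code {Gx : x in F_q^k}, with G given as n rows of
    length k, computed as the minimum weight of a nonzero codeword.
    Brute force over all q^k messages, for illustration only."""
    n, k = len(G), len(G[0])
    best = n
    for x in product(range(q), repeat=k):
        if any(x):  # skip the zero message
            c = [sum(G[i][j] * x[j] for j in range(k)) % q for i in range(n)]
            best = min(best, sum(1 for s in c if s != 0))
    return best

# The binary [3, 1, 3] repetition code: G maps x to (x, x, x).
print(code_distance([[1], [1], [1]], 2))  # → 3
```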

We'll now jump into the main example of a linear code that we're interested in, called the Reed-Solomon code.

Definition Let $q \geq n$ and let $\alpha_1, \ldots, \alpha_n \in \bbf_q$ be distinct. The Reed-Solomon code is defined as

$$RS_q(n,k) := \{(f(\alpha_1), \ldots, f(\alpha_n)) \colon f \in \bbf_q[x],\ \deg(f) < k\} \subseteq \bbf_q^n$$

This is indeed a linear code. If $f, g \in \bbf_q[x]$ with $\deg(f), \deg(g) < k$ and $a, b \in \bbf_q$, then

$$\begin{align*} &a(f(\alpha_1), \ldots, f(\alpha_n)) + b(g(\alpha_1), \ldots, g(\alpha_n)) \\ &= ((af+bg)(\alpha_1), \ldots, (af+bg)(\alpha_n)) \end{align*}$$

The generator matrix $G$ is given by

$$G = \begin{pmatrix} 1 & \alpha_1 & \cdots & \alpha_1^{k-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \alpha_n & \cdots & \alpha_n^{k-1} \end{pmatrix}$$

since if $f = a_0 + a_1 x + \cdots + a_{k-1} x^{k-1}$ and $a = (a_0, \ldots, a_{k-1})$, then

$$Ga = \begin{pmatrix} f(\alpha_1) \\ \vdots \\ f(\alpha_n) \end{pmatrix}$$
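Concretely, encoding is just polynomial evaluation at the $\alpha_i$. A minimal sketch over a prime field $\bbf_p$ (the function name and the particular choices of $p$, $\alpha_i$, and message are my own):

```python
def rs_encode(msg, alphas, p):
    """Encode msg = (a_0, ..., a_{k-1}) as (f(alpha_1), ..., f(alpha_n)),
    where f = a_0 + a_1 x + ... + a_{k-1} x^{k-1}, over the prime field F_p."""
    return [sum(a * pow(x, i, p) for i, a in enumerate(msg)) % p for x in alphas]

# n = 5 evaluation points in F_7, message of length k = 2, i.e. f = 3 + 2x.
print(rs_encode([3, 2], [0, 1, 2, 3, 4], 7))  # → [3, 5, 0, 2, 4]
```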

Here's the interesting bit: computing the distance $d$ of $RS_q(n,k)$. Note that a nonzero codeword $c = (f(\alpha_1), \ldots, f(\alpha_n))$ has weight $d$ if and only if $f$ vanishes at $n - d$ of the points, $\alpha_{i_1}, \ldots, \alpha_{i_{n-d}}$. Since $\deg(f) < k$, a nonzero $f$ has at most $k - 1$ roots, which implies $n - d \leq k - 1$, or

$$d \geq n - k + 1$$

There is a theorem that establishes the reverse inequality for any code.

Theorem (Singleton Bound) For any $[n,k,d]_q$ linear code $\calc$, i.e. a linear code over $\bbf_q$ with dimension $k$, message length $n$, and distance $d$,

$$d \leq n - k + 1$$

Any code that meets this bound is called maximum distance separable, or MDS. As we have just seen, Reed-Solomon codes are MDS, i.e. $d = n - k + 1$, which is a very nice property.

In particular, we note that

$$\frac{d}{n} = 1 - \frac{k}{n} + \frac{1}{n}$$

and therefore for MDS codes (and in particular RS codes),

$$R + \delta = 1$$