Toggle lights
How can a CD still be read even if some scratches adorn its back? How can a rover that is to conduct investigations on the Moon transmit its data to the base station on Earth, even though there will certainly be some errors in the transmission? How can I efficiently store my data on three hard disks so that if one of them fails, all the data can still be reconstructed?
Coding theory deals with these and similar questions. The basic idea is the following: Given a finite alphabet (e.g.
in the binary case) and a code length
. The data to be encoded is now always stored/transmitted in blocks of length
. However, such a block
actually contains only the information of a smaller number
of
-bits. That is, during encoding, one word of length
above the alphabet
is always converted into one word of length
above the same alphabet. Thereby
bits are redundant. Mathematically speaking, this encoding is nothing more than an injective mapping
. Your image
is called a
code (
is called the length of
and
the dimension).
Now the following can happen: In the block of length
some symbols are changed (by noise in the channel or similar). Let us say these are
symbols. Thus, a new disturbed block
is created. Can the original information still be reconstructed? To do this, we define the Hamming distance of two codewords
as
(where
and
with
). The idea now is to modify the disturbed block
to the codeword in
that is closest to this block measured in Hamming distance. But is this code word even unique? Generally not!
In order to recognize “too heavily disturbed” blocks as such, and to identify reconstructable blocks can always be reconstructed unambiguously, a ball
of radius
is therefore drawn at Hamming distance around each admissible codeword
, so that no two of these balls overlap:
for
different. Now, if
lies in one of these balls, it is reconstructed to the codeword
. In this way
is always unique due to the above condition.
But how to choose the radius here? We want
and
not to intersect for
from
. But this is exactly right if
. This condition must now be satisfied for all
and
. Thus, if we set
(the minimum distance from
),
is the best possible radius. The parameter
thus determines the correction radius. In this situation,
is also called a
code.