Wednesday, 25 January 2012

IMO Problem-Linear Code-Probabilistic Method

Yesterday on some studying session, we solved the following Combinatorics Problem which is an IMO 1998 problem 2 day 1.


IMO 1998/2

In a contest, there are $m$ candidates and \(n\) judges, where \(n\geq 3\) is an odd integer. Each candidate is evaluated by each judge as either pass or fail. Suppose that each pair of judges agrees on at most \(k\) candidates. Prove that

\[\frac{k}{m} \geq \frac{n-1}{2n}\]



On my way home from the session, I tried to transform the problem into following statement which might be related to Coding Theory


Given $v_1,v_2,\cdots,v_n$ are all vectors in $\mathbb{F}_2^m$ where $n\geq 3$ is an odd integer. If $k$ is an integer such that

\[k \leq \left\lfloor \frac{(n-1)m}{2n} \right\rfloor\]

then there exist vectors $v_i$ and $v_j$ with $1\leq i < j \leq n$, such that $d_H(v_i,v_j) \leq n-k-1$, where $d_H$ denotes the Hamming Distance on $\mathbb{F}_2^m$
Recall that Hamming Distance gives us the number of different components on two vectors.

To see how the second form is derived, we create an $m \times n$ matrix $A$ where the rows represent the candidates and the columns represent the judges. We assign to this matrix, the $ij$-component as follows:

\[[A]_{ij}= \begin{cases} 1 &\text{If candidate $i$ is accepted by judge $j$} \\\ 0 &\text{If candidate $i$ is rejected by judge $i$}\end{cases}\]

Now denote the $i$-th column vectors of $A$ by $v_i$ which is clearly in $\mathbb{F}_2^m$. The Hamming Distance $d_H(v_j,v_i)$ represents the number of different result obtained by judge $i$ compared to judge $j$. Since $v_i \in \mathbb{F}_2^n$, the number of the same results obtained by the pair of judges $i$ and judges $j$ is $n-d_H(v_j,v_i)$

By IMO 1998 problem, if the above quantity is known to be at most $k$ for every pairs $(v_i,v_j)$ then the inequality $\frac{k}{m} \geq \frac{n-1}{2n}$ holds. The contraposition of this statement asserts that:

If the inequality $\frac{k}{m} \geq \frac{n-1}{2n}$ failed to hold then there exist at least one pairs $(v_i,v_j)$ such that $n-d_H(v_j,v_i)$ is greater than $k$. Which is simplified to

If $k \leq \left\lfloor \frac{(n-1)m}{2n} \right\rfloor$ then $d_H(v_j,v_i) \leq n-k-1$.

Next we will use the standard notation of Linear Code in Coding Theory which is $(n,k,d)$ and immediately deduce the following Corollary

Corollary

For any odd integers $k\geq 3$ and integers $n$, there is no linear code $(n,k,d)$ whenever $d > \frac{2k^2-kn+n}{2k}$.

Proof : If such a code exists, then $d > \frac{2k^2-kn+n}{2k}$ implies $k-d < \frac{-n+kn-2k^2 + 2k^2}{2k} = \frac{n(k-1)}{2k}$ thus there exists two vectors $v_j$ and $v_i$ on $\mathbb{F}_2^n$ with $ d_H (v_i , v_j )< k-( k- d) = d$. Contradiction, since $d$ is the minimum distance.

The Official Solution can be found on IMO Compendium Book.

We can also prove this by Probabilistic Method which is a powerful method for proving existence (see The Probabilistic Method by Noga Alon & Joel H. Spencer)

Solution:

Take two judges randomly and uniformly, let $X$ be a random variable which count the number of times the random pairs of judges agree. We wish to prove the contraposition, that is whenever $\frac{k}{m} < \frac{n-1}{2n}$ then $X$ is greater than $k$, with positive probability (i.e $\text{Prob}(X > k) > 0$).

For, $i=1,2,\cdots,m$, let $X_i$ be the $\{0,1\}$-valued random variable which represent wether or not the candidates $i$ being agreed by the pair of judges. Here $X_i=1$ if the pair of judges agree and $0$ otherwise. Thus we have
\[X=X_1+\cdots+X_m\]
taking the Expected Value of the both sides and using the linearity of Expected Value, we have
\[E[X]=E[X_1]+\cdots+E[X_m]\]
Next, we calculate $E[X_i]=\sum_{X_i \in \{0,1\} } X_i \cdot \text{Prob}(X_i)$, since $X_i$ only has value $0$ or $1$, we have $E[X_i]=\text{Prob}(X_i=1)$. Let's calculate $\text{Prob}(X_i=1)$, which is the probability that the candidate $i$ being agreed by the pairs of judges. There are ${n \choose 2}$ possible pairs. Suppose that $t_i$ is the number of judges who accept candidate $i$, then $n-t_i$ is the number of judges who reject candidate $i$. The number of pairs of judges who accept $i$ is ${t_i \choose 2}$, and the number of pairs of judges who reject $i$ is ${n-t_i \choose 2}$. The total number of pairs of judges who agree on $i$ is ${t_i \choose 2}+{n-t_i \choose 2}$. And so

$E[X_i]=\text{Prob}(X_i=1) = \frac{{t_i \choose 2}+{n-t_i \choose 2}}{{n\choose 2}}$
Thus
\[E[X]=\sum_{i=1}^m \frac{{t_i \choose 2}+{n-t_i \choose 2}}{{n\choose 2}}\]

We prove that for odd $n$, the inequality ${t_i\choose 2}+{n-t_i \choose 2} \geq \frac{(n-1)^2}{4}$. Indeed, the inequality is equivalent to
\[(n-2t_i)^2 \geq 1 \Rightarrow t_i \leq \frac{n-1}{2} \mbox{ or } t_i \geq \frac{n+1}{2}\]

which is true, since the candidate is either being failed/accepted by at most $\frac{n-1}{2}$ judges or being failed/accepted by at least $\frac{n+1}{2}$.

Using this inequality we have

\[E[X]\geq m \frac{\left(\frac{n-1}{2}\right)^2}{\frac{n(n-1)}{2}} = \frac{m(n-1)}{2n}\]
by hypothesis $\frac{m(n-1)}{2n} > k$, so we have $E[X]>k$, thus $\text{Prob}(X>k) >0$, and hence we are done.

The essence of proof is actually no different with the official proof, which is shorter. But what all I want to rephrase here is the probabilistic method :D

0 comments:

Post a Comment