Wednesday 25 January 2012

Yesterday on some studying session, we solved the following Combinatorics Problem which is an IMO 1998 problem 2 day 1.


IMO 1998/2
In a contest, there are $m$ candidates and $n$ judges, where $n\geq 3$ is an odd integer. Each candidate is evaluated by each judge as either pass or fail. Suppose that each pair of judges agrees on at most \(k\) candidates. Prove that
\[\frac{k}{m} \geq \frac{n-1}{2n}\]


On my way home from the session, I tried to transform the problem into following statement which might be related to Coding Theory


Given $v_1,v_2,\cdots,v_n$ are all vectors in $\mathbb{F}_2^m$ where $n\geq 3$ is an odd integer. If $k$ is an integer such that

\[k \leq \left\lfloor \frac{(n-1)m}{2n} \right\rfloor\]

then there exist vectors $v_i$ and $v_j$ with $1\leq i < j \leq n$, such that $d_H(v_i,v_j) \leq n-k-1$, where $d_H$ denotes the Hamming Distance on $\mathbb{F}_2^m$
Recall that Hamming Distance gives us the number of different components on two vectors.

To see how the second form is derived, we create an $m \times n$ matrix $A$ where the rows represent the candidates and the columns represent the judges. We assign to this matrix, the $ij$-component as follows:

\[[A]_{ij}= \begin{cases} 1 &\text{If candidate $i$ is accepted by judge $j$} \\\ 0 &\text{If candidate $i$ is rejected by judge $i$}\end{cases}\]

Now denote the $i$-th column vectors of $A$ by $v_i$ which is clearly in $\mathbb{F}_2^m$. The Hamming Distance $d_H(v_j,v_i)$ represents the number of different result obtained by judge $i$ compared to judge $j$. Since $v_i \in \mathbb{F}_2^n$, the number of the same results obtained by the pair of judges $i$ and judges $j$ is $n-d_H(v_j,v_i)$

By IMO 1998 problem, if the above quantity is known to be at most $k$ for every pairs $(v_i,v_j)$ then the inequality $\frac{k}{m} \geq \frac{n-1}{2n}$ holds. The contraposition of this statement asserts that:

If the inequality $\frac{k}{m} \geq \frac{n-1}{2n}$ failed to hold then there exist at least one pairs $(v_i,v_j)$ such that $n-d_H(v_j,v_i)$ is greater than $k$. Which is simplified to

If $k \leq \left\lfloor \frac{(n-1)m}{2n} \right\rfloor$ then $d_H(v_j,v_i) \leq n-k-1$.

Next we will use the standard notation of Linear Code in Coding Theory which is $(n,k,d)$ and immediately deduce the following Corollary

Corollary

For any odd integers $k\geq 3$ and integers $n$, there is no linear code $(n,k,d)$ whenever $d > \frac{2k^2-kn+n}{2k}$.

Proof : If such a code exists, then $d > \frac{2k^2-kn+n}{2k}$ implies $k-d < \frac{-n+kn-2k^2 + 2k^2}{2k} = \frac{n(k-1)}{2k}$ thus there exists two vectors $v_j$ and $v_i$ on $\mathbb{F}_2^n$ with $ d_H (v_i , v_j )< k-( k- d) = d$. Contradiction, since $d$ is the minimum distance.

The Official Solution can be found on IMO Compendium Book.

We can also prove this by Probabilistic Method which is a powerful method for proving existence (see The Probabilistic Method by Noga Alon & Joel H. Spencer)

Solution:

Take two judges randomly and uniformly, let $X$ be a random variable which count the number of times the random pairs of judges agree. We wish to prove the contraposition, that is whenever $\frac{k}{m} < \frac{n-1}{2n}$ then $X$ is greater than $k$, with positive probability (i.e $\text{Prob}(X > k) > 0$).

For, $i=1,2,\cdots,m$, let $X_i$ be the $\{0,1\}$-valued random variable which represent wether or not the candidates $i$ being agreed by the pair of judges. Here $X_i=1$ if the pair of judges agree and $0$ otherwise. Thus we have
\[X=X_1+\cdots+X_m\]
taking the Expected Value of the both sides and using the linearity of Expected Value, we have
\[E[X]=E[X_1]+\cdots+E[X_m]\]
Next, we calculate $E[X_i]=\sum_{X_i \in \{0,1\} } X_i \cdot \text{Prob}(X_i)$, since $X_i$ only has value $0$ or $1$, we have $E[X_i]=\text{Prob}(X_i=1)$. Let's calculate $\text{Prob}(X_i=1)$, which is the probability that the candidate $i$ being agreed by the pairs of judges. There are ${n \choose 2}$ possible pairs. Suppose that $t_i$ is the number of judges who accept candidate $i$, then $n-t_i$ is the number of judges who reject candidate $i$. The number of pairs of judges who accept $i$ is ${t_i \choose 2}$, and the number of pairs of judges who reject $i$ is ${n-t_i \choose 2}$. The total number of pairs of judges who agree on $i$ is ${t_i \choose 2}+{n-t_i \choose 2}$. And so

$E[X_i]=\text{Prob}(X_i=1) = \frac{{t_i \choose 2}+{n-t_i \choose 2}}{{n\choose 2}}$
Thus
\[E[X]=\sum_{i=1}^m \frac{{t_i \choose 2}+{n-t_i \choose 2}}{{n\choose 2}}\]

We prove that for odd $n$, the inequality ${t_i\choose 2}+{n-t_i \choose 2} \geq \frac{(n-1)^2}{4}$. Indeed, the inequality is equivalent to
\[(n-2t_i)^2 \geq 1 \Rightarrow t_i \leq \frac{n-1}{2} \mbox{ or } t_i \geq \frac{n+1}{2}\]

which is true, since the candidate is either being failed/accepted by at most $\frac{n-1}{2}$ judges or being failed/accepted by at least $\frac{n+1}{2}$.

Using this inequality we have

\[E[X]\geq m \frac{\left(\frac{n-1}{2}\right)^2}{\frac{n(n-1)}{2}} = \frac{m(n-1)}{2n}\]
by hypothesis $\frac{m(n-1)}{2n} > k$, so we have $E[X]>k$, thus $\text{Prob}(X>k) >0$, and hence we are done.

The essence of proof is actually no different with the official proof, which is shorter. But what all I want to rephrase here is the probabilistic method :D

Post a Comment: