1. ## Expectation of longest increasing subsequence

Problem

Let $\displaystyle x_1, ..., x_n$
be i.i.d random variables uniformly on [0,1]. Let $\displaystyle X$ be the length of the longest increasing subsequence of $\displaystyle x_1, ..., x_n$. Show that $\displaystyle E(X) \ge (1-o(1))(1-\frac{1}{e}) \sqrt{n}$

Hi forum!

Using the Erdos' lemma I can only deduce that
$\displaystyle E(X) \ge \frac{1}{2} \sqrt{n}$, which is a weaker bound unfortunately.

I would appreciate any further ideas!

Michael

PS: Would it be more suitable to post it in the statistics forum?

2. ## Re: Longest increasing subsequence

Hey Naumberg.

I'm interested in this: is e an epsilon?

3. ## Re: Longest increasing subsequence

Hi chiro!

"e" is just Euler's number, so no epsilon included.

4. ## Re: Longest increasing subsequence

So basically you are looking at a runs problem where you want to find the distribution of biggest run and then find the expectation of the distribution?

There are run statistics where you can use a series of indicator variables to construct the runs and then get the expectation of that distribution.

I think this is a standard technique in non-parametric testing and you should find some valuable information in these links:

Mann

Wilcoxon signed-rank test - Wikipedia, the free encyclopedia

5. ## Re: Longest increasing subsequence

Hi chiro,

thank you for your answer. The problem comes from a lecture about randomized algorithms and probabilistic methods (theoretical computer science). I think the testing approach is not appropriate here as we one really has to show this true inequality.

6. ## Re: Longest increasing subsequence

Hi, I solved the problem. Please find my tex code below. Cheers, Michael

We want to determine a strategy to select an increasing subsequence of $x_1,...,x_n$. Let $Y$ be the length of our increasing subsequence. As $X$ is defined as the longest increasing subsequence we surely have $E[X] \ge E[Y]$. Depending on how good our strategy is we hope to get $E[Y] \ge (1-o(1))(1-\frac{1}{e})\sqrt{n}$ which would complete the proof.

Let us assume that $m:= \sqrt{n}$ is an integer and partition our random sequence in blocks $L_j=(x_{(j-1)m+1},...,x_{jm} )$ for $j=1,...,m$ (we can assume this as we look at asymptotics in $n$ in the end).

The strategy picks the first number $y_1$ out of $x_1,...,x_n$ that is $\le \frac{1}{m}$ and skips to the next block. It then continuous to pick a number $y_i$ in each of the remaining blocks if $y_{i-1} \le y_i \le y_{i-1} + \frac{1}{m}$ and skips this block otherwise. At the end we receive an increasing subsequence $y_1,...,y_Y$ of length $Y$.

\begin{lstlisting}
input: sequence x(1),...x(n)
output: length Y of an increasing subsequence y(1)<=...<=y(Y)

Y = 0 \\ counting the length of the subsequence
s = zero array \\ storing the subsequence here

\\ go through intervals elements of L_j
for j = 1 to m
{
\\ boolean helper to implement stopping time, i.e. breaking condition for the loop
success == 0
while (success == 0) do
{
\\ go through elements of L_j
for k = (j-1)*m+1 to j*m
{
\\ find a larger element which is still small enough
if (s(Y) <= x_k <= s(Y)+1/m)
{
Y = Y+1 \\ length of subsequence ++
s(Y) = x_k \\ store element
success == 1 \\ stop searching in L_j
\\ and go to next interval
}
}
}
}

return(Y)
\end{lstlisting}

Now let us estimate the expectation of $Y$,

\begin{alignat*}{1}
E[Y] &= E[\sum_{j=1}^{m} \mathds{1}_{\{ \text{ "found suitable number in }L_j\text{ " }\}}]\\
&= \sum_{j=1}^{m}E[ \mathds{1}_{\{ \text{ "found suitable number in }L_j\text{ " }\}}]
\end{alignat*}

As our "tolerance of increase" $\frac{1}{m}$ stays the same for all numbers we search for, all $x_i$ are independently and uniformly distributed and all parts of the sequence $L_j$ contain the same amount of numbers $m$ we get that

\begin{alignat*}{1}
E[Y] &= m P[ \text{ "found suitable number in }L_1 \text{ " }]\\
&= m (1- P[\forall i=1,...,m : x_i > \frac{1}{m}]) \\
&= m (1- (P[x_1 > \frac{1}{m}])^m) \\
&= m (1- (1- \frac{1}{m})^m) \\
&= \sqrt{n} (1- (1- \frac{1}{\sqrt{n}})^{\sqrt{n}}) \\
&= (1-o(1))(1-\frac{1}{e})\sqrt{n}.
\end{alignat*}

So our strategy gives the lower bound $E[X] \ge E[Y]$.