# Math Help - Expected distance between two random vectors in n-dimensional space

1. ## Expected distance between two random vectors in n-dimensional space

Hi, any help with this would be appreciated.

If I have two column vectors $\mathbf{a}$ and $\mathbf{b}$, where the $n$ elements of each are drawn independently from a Gaussian $\mathcal{N}( \mu, \sigma^2)$, then the distance, $d$, between them is given by

$d = \sqrt{(\mathbf{a} - \mathbf{b})^\mathsf{T}(\mathbf{a} - \mathbf{b})}$

But what is the expected value of $d$?

I reckon, in the case where $\mu = 0$, it's $\mathbb{E}[d] = \sqrt{2n\sigma^2}$

My feeling is that a non-zero $\mu$ should make no difference, as it's just shifting the origin, so to speak, and the distance is determined by the relative positions of the vectors.
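A quick way to test that feeling numerically is a small Monte Carlo sketch in Python. The function name `mean_distance` and the parameter values are my own choices for illustration; it uses only the standard library and reseeds the generator so that every value of $\mu$ sees the same underlying noise.

```python
import math
import random


def mean_distance(n, mu, sigma, trials=20000, seed=0):
    """Monte Carlo estimate of E[d] for two vectors with i.i.d. N(mu, sigma^2) entries."""
    rng = random.Random(seed)  # fixed seed: same noise for every mu
    total = 0.0
    for _trial in range(trials):
        # Squared Euclidean distance between two freshly drawn n-vectors.
        sq = 0.0
        for _i in range(n):
            diff = rng.gauss(mu, sigma) - rng.gauss(mu, sigma)
            sq += diff * diff
        total += math.sqrt(sq)
    return total / trials


if __name__ == "__main__":
    # Illustrative values only: n = 10, sigma = 2, three different means.
    for mu in (0.0, 5.0, -100.0):
        print(mu, mean_distance(n=10, mu=mu, sigma=2.0))
```

With the same seed, the three printed estimates should agree to within floating-point rounding, which is consistent with $\mu$ cancelling in $\mathbf{a} - \mathbf{b}$.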

Can anyone prove the general case (for $\mathcal{N}( \mu, \sigma^2)$ ) or show it to be wrong, and if wrong, say what it is in fact in the general case?

I notice also that my expression bears resemblance to the denominator in the normalising term in the Gaussian pdf $1/\sqrt{2 \pi \sigma^2}$, except that $n$ takes the place of $\pi$. Is this coincidence, or does it reflect something deeper?

2. ## Re: Expected distance between two random vectors in n-dimensional space

Hey Mathsdog.

What did you get for the distribution of the distance $d$? (Hint: think about the distribution of a sum of squared normals first, and then about the square root of that.)

3. ## Re: Expected distance between two random vectors in n-dimensional space

Right, sorted I reckon. Good hint Chiro. It all comes down to the $\chi^2$ distribution.

For the square of the norm of $(\mathbf{a}- \mathbf{b})$, i.e.

$(\mathbf{a}- \mathbf{b})^{\mathsf{T}}(\mathbf{a}- \mathbf{b}) = \mathbf{c}^{\mathsf{T}}\mathbf{c}= d^2$

we first note that for the elements of $\mathbf{c}$, denoted $c_1, c_2, \ldots, c_i, \ldots, c_n$,

$c_i \sim \mathcal{N}(0, \sigma^{2}_{c})$

where

$\sigma^{2}_{c}=\sigma^{2}_{a}+\sigma^{2}_{b}$

i.e. the sum of the variances of the elements of $\mathbf{a}$ and $\mathbf{b}$
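Spelling that step out (a short sketch, assuming the elements of $\mathbf{a}$ and $\mathbf{b}$ are independent of each other and share the same mean $\mu$):

$\mathbb{E}[c_i] = \mathbb{E}[a_i] - \mathbb{E}[b_i] = \mu - \mu = 0$

$\mathrm{Var}(c_i) = \mathrm{Var}(a_i) + \mathrm{Var}(b_i) = \sigma^{2}_{a}+\sigma^{2}_{b}$

and a difference of independent Gaussians is again Gaussian.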

From the fact that the elements of $\mathbf{c}/\sigma_c$ are independent and distributed as follows

$c_i/ \sigma_c \sim \mathcal{N}(0, \sigma^{2}_{c}/\sigma^{2}_{c}) = \mathcal{N}(0, 1)$

it follows then that

$1/\sigma^{2}_{c} \cdot \mathbf{c}^{\mathsf{T}}\mathbf{c} \sim \chi^{2}(n)$

So, $\mathbb{E}[1/\sigma^{2}_{c} \cdot \mathbf{c}^{\mathsf{T}}\mathbf{c}] =1/\sigma^{2}_{c} \cdot \mathbb{E}[\mathbf{c}^{\mathsf{T}}\mathbf{c}] = n$

So, $\mathbb{E}[\mathbf{c}^{\mathsf{T}}\mathbf{c}]=n \sigma^{2}_{c}= n(\sigma^{2}_{a}+\sigma^{2}_{b})$

Where $\sigma^{2}_{a}=\sigma^{2}_{b}=\sigma^{2}_{ab}$, we then have

$\mathbb{E}[\mathbf{c}^{\mathsf{T}}\mathbf{c}]= 2n\sigma^{2}_{ab}= \mathbb{E}[d^2]$
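This second-moment result is easy to sanity-check by simulation. The following is only a sketch: `mean_squared_distance` is a name I made up, the values of $n$ and $\sigma$ are arbitrary, and the mean is set to zero for simplicity.

```python
import random


def mean_squared_distance(n, sigma, trials=20000, seed=1):
    """Monte Carlo estimate of E[d^2] for two vectors with i.i.d. N(0, sigma^2) entries."""
    rng = random.Random(seed)
    total = 0.0
    for _trial in range(trials):
        # c^T c: sum of squared element-wise differences.
        total += sum(
            (rng.gauss(0.0, sigma) - rng.gauss(0.0, sigma)) ** 2
            for _i in range(n)
        )
    return total / trials


if __name__ == "__main__":
    n, sigma = 5, 1.5  # illustrative values only
    print("simulated E[d^2]:", mean_squared_distance(n, sigma))
    print("2 * n * sigma^2: ", 2 * n * sigma ** 2)
```

The two printed numbers should be close, with the gap shrinking as `trials` increases.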

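The distance between the two vectors $\mathbf{a}$ and $\mathbf{b}$ is always just the (positive) square root of this squared norm, so the previously hypothesised value is the root-mean-square distance

$\sqrt{\mathbb{E}[d^2]} = \sqrt{2n\sigma_{ab}^2}$

Strictly, though, $\mathbb{E}[d] \neq \sqrt{\mathbb{E}[d^2]}$: by Jensen's inequality $\mathbb{E}[d]$ is a little smaller. Since $d/\sigma_c$ follows a chi distribution with $n$ degrees of freedom, and $\sigma_c = \sqrt{2}\,\sigma_{ab}$,

$\mathbb{E}[d] = \sqrt{2}\,\sigma_c \, \frac{\Gamma((n+1)/2)}{\Gamma(n/2)} = 2\sigma_{ab} \, \frac{\Gamma((n+1)/2)}{\Gamma(n/2)}$

which approaches $\sqrt{2n\sigma_{ab}^2}$ as $n$ grows.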
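To see how close $\sqrt{2n\sigma_{ab}^2}$ is to the true mean distance, here is a small Python sketch. The function names are mine, and the closed form $2\sigma_{ab}\,\Gamma((n+1)/2)/\Gamma(n/2)$ is the standard mean of a chi distribution with $n$ degrees of freedom scaled by $\sigma_c = \sqrt{2}\,\sigma_{ab}$; it is assumed here rather than derived step by step.

```python
import math


def rms_distance(n, sigma):
    """sqrt(E[d^2]) = sqrt(2 * n * sigma^2)."""
    return math.sqrt(2 * n * sigma ** 2)


def expected_distance(n, sigma):
    """E[d] as the mean of a chi(n) variable scaled by sigma_c = sqrt(2) * sigma."""
    # lgamma keeps the Gamma ratio stable for large n, where Gamma itself overflows.
    log_ratio = math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
    return 2.0 * sigma * math.exp(log_ratio)


if __name__ == "__main__":
    sigma = 1.0  # illustrative value only
    for n in (1, 2, 10, 100, 1000):
        exact = expected_distance(n, sigma)
        rms = rms_distance(n, sigma)
        print(n, exact, rms, exact / rms)
```

The ratio in the last column stays below 1 for every $n$ and moves towards 1 as $n$ increases, so the square-root formula is a good approximation in high dimensions but overestimates $\mathbb{E}[d]$ for small $n$.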

Does that look right to you Chiro?

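For a non-zero mean, note that $\mathbf{a}$ and $\mathbf{b}$ share the same $\mu$, so it cancels in $\mathbf{c} = \mathbf{a} - \mathbf{b}$ and the result above holds unchanged. The noncentral chi squared distribution would only be needed if the two vectors had different means, and I haven't worked the details of that out yet.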
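One piece that can be sketched without the full distribution is the second moment. Assuming, beyond the original question, that the elements of $\mathbf{a}$ have mean $\mu_a$ and those of $\mathbf{b}$ have mean $\mu_b$ (my notation), each $c_i \sim \mathcal{N}(\mu_a - \mu_b, \sigma^{2}_{c})$ and

$\mathbb{E}[c_i^2] = \mathrm{Var}(c_i) + \mathbb{E}[c_i]^2 = \sigma^{2}_{c} + (\mu_a - \mu_b)^2$

$\mathbb{E}[d^2] = n\sigma^{2}_{c} + n(\mu_a - \mu_b)^2$

which reduces to $n\sigma^{2}_{c}$ when the two means are equal.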

Thanks again. MD