# Minimal sufficient statistics for a two sample variable with normal distribution

• Sep 9th 2011, 01:10 PM
Minimal sufficient statistics for a two sample variable with normal distribution
Suppose $X_1,X_2,\ldots,X_n$ is a random sample from $N( \mu , \sigma _x ^2 )$ and $Y_1,\ldots,Y_m$ is an independent random sample from $N( \mu, \sigma _y ^2 )$.

Let $P_ \theta$ denote the joint distribution of these $n+m$ variables, with $\theta = ( \mu , \sigma _x^2 , \sigma _y^2)$. Find a minimal sufficient statistic for this family of distributions.

My solution so far:

Now, we know that the joint density function of the $X_i$'s and $Y_j$'s is:

$f_ \theta (X_1,...,X_n,Y_1,...,Y_m)$

$= \frac {1}{ \sqrt {2 \pi \sigma _x ^2 }} \cdot e^{ - \frac { (x_1- \mu )^2}{2 \sigma _x ^2 }} \cdots \frac {1}{ \sqrt {2 \pi \sigma _x ^2 }} \cdot e^{ - \frac { (x_n- \mu )^2}{2 \sigma _x ^2 }} \cdot \frac {1}{ \sqrt {2 \pi \sigma _y ^2 }} \cdot e^{ - \frac { (y_1- \mu )^2}{2 \sigma _y ^2 }} \cdots \frac {1}{ \sqrt {2 \pi \sigma _y ^2 }} \cdot e^{ - \frac { (y_m - \mu )^2}{2 \sigma _y ^2 }}$

$= \left( \frac {1}{ \sqrt {2 \pi \sigma _x^2}} \right) ^n \cdot \left( \frac {1}{ \sqrt {2 \pi \sigma _y^2}} \right) ^m \cdot e^{ \sum ^n_{i=1} \frac {-(x_i- \mu)^2}{2 \sigma _x^2}} \cdot e ^{ \sum ^m_{j=1} \frac {-(y_j- \mu)^2}{2 \sigma _y^2}}$

$= \left( \frac {1}{ \sqrt {2 \pi \sigma _x^2}} \right) ^n \cdot \left( \frac {1}{ \sqrt {2 \pi \sigma _y^2}} \right) ^m \cdot e^{ - \frac {1}{2 \sigma _x ^2} \sum ^n_{i=1} x^2_i - \frac {1}{2 \sigma _y^2} \sum ^m_{j=1} y_j^2 } \cdot e^ { \frac { \mu }{ \sigma _x^2} \sum ^n_{i=1} x_i + \frac { \mu }{ \sigma _y^2} \sum ^m_{j=1} y_j } \cdot e^ { - \frac { n \mu ^2 }{2 \sigma _x^2} - \frac { m \mu ^2 }{2 \sigma _y ^2 }}$
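The factorization above can be sanity-checked numerically: if the exponent really depends on the data only through $\left( \sum x_i, \sum x_i^2, \sum y_j, \sum y_j^2 \right)$, then the joint log-density computed term by term must agree with one computed from those sums alone. A small pure-Python sketch (function names are my own, not from the thread):

```python
import math
import random

def loglik_full(xs, ys, mu, vx, vy):
    # Joint log-density computed term by term from the original product form.
    lx = sum(-0.5 * math.log(2 * math.pi * vx) - (x - mu) ** 2 / (2 * vx) for x in xs)
    ly = sum(-0.5 * math.log(2 * math.pi * vy) - (y - mu) ** 2 / (2 * vy) for y in ys)
    return lx + ly

def loglik_from_stats(n, m, sx, sxx, sy, syy, mu, vx, vy):
    # The same log-density, rewritten using only the candidate statistic
    # T = (sum x_i, sum x_i^2, sum y_j, sum y_j^2), matching the expanded exponent.
    return (-0.5 * n * math.log(2 * math.pi * vx)
            - 0.5 * m * math.log(2 * math.pi * vy)
            - sxx / (2 * vx) - syy / (2 * vy)
            + mu * sx / vx + mu * sy / vy
            - n * mu ** 2 / (2 * vx) - m * mu ** 2 / (2 * vy))

random.seed(0)
xs = [random.gauss(1.0, 2.0) for _ in range(8)]
ys = [random.gauss(1.0, 3.0) for _ in range(5)]
stats = (len(xs), len(ys),
         sum(xs), sum(x * x for x in xs),
         sum(ys), sum(y * y for y in ys))

# Agreement for several parameter values means the sums carry all the
# information the likelihood uses about the sample.
for mu, vx, vy in [(0.0, 1.0, 1.0), (1.5, 4.0, 9.0), (-2.0, 0.5, 2.5)]:
    assert abs(loglik_full(xs, ys, mu, vx, vy)
               - loglik_from_stats(*stats, mu, vx, vy)) < 1e-9
```

By the factorization theorem, agreement like this is exactly what lets one read a sufficient statistic off the exponent.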

First, since I'm not sure the $X_i$'s and $Y_j$'s are independent, I'm not even sure this factorization is valid. Second, I can't really pin down my statistic $T(X,Y)$, mainly how to get the $X_i$'s and $Y_j$'s together. Any hints? Thank you very much!
• Sep 10th 2011, 12:50 AM
matheagle
Re: Minimal sufficient statistics for a two sample variable with normal distribution
I would think that all $n+m$ random variables are independent.

I would expect to see that the sample variances are

$S^2_x={\sum_{i=1}^n(X_i-\hat\mu)^2\over n}$

$S^2_y={\sum_{j=1}^m(Y_j-\hat\mu)^2\over m}$

while the mean is a pooled estimator

$\hat\mu={\sum_{i=1}^n X_i+\sum_{j=1}^mY_j\over n+m}$
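The estimators written above are straightforward to compute; a minimal sketch implementing exactly those expressions (pooled mean over all $n+m$ observations, then per-sample variances about it, dividing by $n$ and $m$ respectively):

```python
import random

def pooled_estimates(xs, ys):
    # Pooled mean over all n+m observations, as in the formula above.
    n, m = len(xs), len(ys)
    mu_hat = (sum(xs) + sum(ys)) / (n + m)
    # Per-sample variances about the pooled mean, dividing by n and m.
    s2x = sum((x - mu_hat) ** 2 for x in xs) / n
    s2y = sum((y - mu_hat) ** 2 for y in ys) / m
    return mu_hat, s2x, s2y

random.seed(1)
xs = [random.gauss(0.0, 1.0) for _ in range(6)]
ys = [random.gauss(0.0, 2.0) for _ in range(4)]
mu_hat, s2x, s2y = pooled_estimates(xs, ys)
```

These estimators are all functions of $\left( \sum X_i, \sum X_i^2, \sum Y_j, \sum Y_j^2 \right)$, which is consistent with that vector being the sufficient statistic read off from the factorization.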