# Thread: Factorization Theorem for Sufficient Statistics

1. ## Factorization Theorem for Sufficient Statistics

Problem:
Let Y1,Y2,...,Yn denote a random sample from the uniform distribution over the interval (0,theta). Show that Y(n)=max(Y1,Y2,...,Yn) is a sufficient statistic for theta by the factorization theorem.
Solution:

1) While I understand that I_A (x)I_B (x)=I_A intersect B (x), I don't understand the equality circled in red above.

In the solutions, they say that I_0,theta (y1)...I_0,theta (yn)=I_0,theta (y(n)). Is this really correct?
Shouldn't the right hand side be I_0,theta (y(n))I_0,infinity (y(1)) ? I believe that the second factor is necessary, because the largest observation being greater than zero does not guarantee that the smallest observation is greater than zero.
Which one is correct?

2) Also, is I_0,theta (y(n)) a function of y(n), a function of theta, or a function of both y(n) and theta?
If it is a function of both y(n) and theta, then there is something that I don't understand. Following the definition of indicator function that I_A (x) is a function of x alone (it is a function of only the stuff in the parenthesis), shouldn't I_0,theta (y(n)) be a function of only y(n) alone?

Thank you for explaining! I've been confused with these ideas for at least a week.

2. I'm lecturing on suff stats on Monday.
The point is that the indicator function is 0 or 1.
It's 1 as long as each X_i is between 0 and theta.
NOW look at the order stats
YOU need to convince yourself that
0<X_1,..., X_n<theta (here these are the unordered data)
is the same as
0<X_(1)<X_(2)<.....<X_(n)<theta
which is the same as
0<X_(n)<theta
All we need is the largest to be less than theta
Now if theta were a lower bound for our rvs instead of an upper bound,
then the smallest order stat would be suff for theta.
And in the case of U(a,b)
the smallest and largest order stats are suff for the TWO
parameters, a and b.
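
For the U(a,b) case just mentioned, the same indicator argument can be sketched like this. The notation is the standard one and is an assumption here, not something copied from the textbook solution:

```latex
\[
f(x_1,\dots,x_n \mid a, b)
  = \prod_{i=1}^{n} \frac{1}{b-a}\, I(a < x_i < b)
  = (b-a)^{-n}\, I\bigl(a < x_{(1)}\bigr)\, I\bigl(x_{(n)} < b\bigr)
\]
```

Neither indicator can be pulled away from its parameter, so the pair (x_(1), x_(n)) carries everything about (a, b), and the leftover factor is just h = 1.
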
ARE you using Wackerly too?
I knew Dennis at Florida and Scheaffer was my chair there.

3. In reply to matheagle:
Yes, I am using Wackerly.

But I don't think
0<X_(1)<X_(2)<.....<X_(n)<theta
is EQUIVALENT to (iff)
0<X_(n)<theta
=> is true but <= is not.

So that's why I think we should have I_0,theta (y1)...I_0,theta (yn) = I_0,theta (y(n))I_0,infinity (y(1)) instead of I_0,theta (y1)...I_0,theta (yn) = I_0,theta (y(n)).

4. I DO think
0<X_(1)<X_(2)<.....<X_(n)<theta
is EQUIVALENT to 0<X_(n)<theta
Or better yet
it's equivalent to
I(0<X_(1)) times I(X_(n)<theta)
and who needs the rest
as long as the smallest order stat is greater than zero
and the largest order stat is less than theta
that's the same
as all the data (ordered or not) are between
0 and theta
In the factorization theorem you can toss the I(0<X_(1)) into the other term, h(.), but you cannot separate theta from our largest order stat.
HENCE the largest order stat is suff.
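
Written out, the factorization described here can be sketched as follows. The labels g and h are the usual names from the factorization theorem; the underbraces are only an illustration of which piece goes where:

```latex
\begin{align*}
f(x_1,\dots,x_n \mid \theta)
  &= \prod_{i=1}^{n} \frac{1}{\theta}\, I(0 < x_i < \theta) \\
  &= \underbrace{\theta^{-n}\, I\bigl(x_{(n)} < \theta\bigr)}_{g(x_{(n)},\,\theta)}
     \cdot
     \underbrace{I\bigl(0 < x_{(1)}\bigr)}_{h(x_1,\dots,x_n)}
\end{align*}
```

Note that g depends on both x_(n) and theta, which is exactly what the theorem allows, while h contains no theta at all.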

OR get the conditional density of your vector given the largest order stat.
That's a lot more work.
I always have a student ask me to do that.
And it's harder, but a lot of fun.

-------------------------------------------------------------------------

Likewise as I said earlier
Look at problem 9.51, here we need the variables to be greater than theta in our underlying distribution
In that case the smallest order stat is suff for theta.
In 9.52 theta is larger than our y's, so the largest order stat is suff for theta...
I may just assign all of these on Monday.

---------------------------------------------------------------------------

I(0<X_1<theta) times I(0<X_2<theta) ... I(0<X_n<theta)
is 1 iff all the rvs are between 0 and theta.
And
I(0<X_(1)<X_(2)<.....<X_(n)<theta)
is equal to 1 iff all the rvs are between 0 and theta.
Likewise
I(0<X_(1)) times I(X_(n)<theta)
is 1 iff all the rvs are between 0 and theta.
So throw I(0<X_(1)) into h(X_1,..., X_n).
BUT that shows that the largest order stat is suff for our parameter.
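
If it helps to see the indicator identities numerically, here is a minimal Python sketch that brute-forces them on a small grid. The function names, the grid, and theta = 2.0 are all made up for illustration. It compares the product of the individual indicators with I(0<X_(1)) times I(X_(n)<theta), and also with the single indicator I(0<X_(n)<theta).

```python
from itertools import product


def ind(cond):
    """Indicator: 1 if the condition holds, else 0."""
    return 1 if cond else 0


def product_form(xs, theta):
    """I(0<x_1<theta) * ... * I(0<x_n<theta)."""
    out = 1
    for x in xs:
        out *= ind(0 < x < theta)
    return out


def order_stat_form(xs, theta):
    """I(0 < x_(1)) * I(x_(n) < theta)."""
    return ind(0 < min(xs)) * ind(max(xs) < theta)


def max_only_form(xs, theta):
    """I(0 < x_(n) < theta), with no condition on the minimum."""
    return ind(0 < max(xs) < theta)


# Hypothetical test grid: includes a negative value and a value above theta.
grid = [-1.0, 0.5, 1.5, 2.5]
theta = 2.0
samples = list(product(grid, repeat=3))

# The two-indicator form matches the product form at every grid point.
agree = all(product_form(xs, theta) == order_stat_form(xs, theta) for xs in samples)

# The max-only form can differ, but only where some coordinate is <= 0.
mismatches = [xs for xs in samples if product_form(xs, theta) != max_only_form(xs, theta)]

print("two-indicator form agrees everywhere:", agree)
print("number of max-only mismatches:", len(mismatches))
```

On this grid, the max-only form disagrees with the product form only at points where some coordinate is not positive, for example (-1.0, 0.5, 1.5). Those points have probability zero under U(0,theta), so the extra factor I(0<X_(1)) can sit in h without changing the conclusion.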