• Oct 26th 2012, 04:10 AM
statsman1
Hi all

I have been given some data and I need to do some significance testing on it.

The data revolves around food types. First, I have been given data in Excel, about 1000 rows, for a food type, with the date the food was bought and the date it went off. I need to test whether, after X days, a significant amount of the food has expired.

So simply put, I have data on about 1000 types of a food, with the number of days each stayed edible, and I need to check whether at day X=a the food becomes significantly bad.

I also have further data on the food; location grown, conditions grown, food used etc. So after this I would like to calculate similar tests to see whether these conditions are also significant.

I hope this all makes sense and I welcome any help you guys can give!

Thanks!
• Oct 26th 2012, 05:12 AM
chiro
Hey statsman1.

The first thing in all of these kinds of analyses is to decide what your criterion for food going off is: is it a probability? Maybe the mean number of days for something to go off? Perhaps a rate at which the food goes off?

This is the first and most critical step in statistical analyses and this will help you decide how to form your hypotheses.

You say you want to check at what time the food goes bad but how do you quantify what bad is? Is it more than 50%? 75%? 90%?

Once you answer this, you will have actually solved half the problem since the rest is setting up the analyses, doing the computation and interpreting the results.
• Oct 26th 2012, 09:49 AM
statsman1
Hi Chiro!

Thanks for your reply. Quantifying what bad is isn't really part of the analysis. The only data I have is the number of days each item lasted before it went bad, and from this I need to test whether after X=a days a significant amount of food is lost. For instance, I can easily calculate that the (mean) average number of days the food lasts is 2000 days, and that 90% of the food lasts beyond X=100 days. Is there a test I can perform to see whether X=100 days is significant to the food expiring?

How would I calculate the rate at which the food goes off?

Thanks!
• Oct 26th 2012, 05:59 PM
chiro
You still need to say what significant means: What does expire mean quantitatively? Does it mean > 75%?

You need to have something to compare to, and this means having a specific number. I have mentioned proportions (percentages), rates, and means, but you need to give a specific thing that you are comparing against, because without this you don't have anything.
• Oct 27th 2012, 03:41 AM
statsman1
OK, so since this data is from a sample, could I say that x bar = 2000 and standard deviation = 1500, and from this calculate the probability that a randomly chosen food item goes off before 90 days?
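A minimal sketch of that calculation, done two ways: the empirical proportion straight from the data, and a normal approximation built from the quoted x bar = 2000 and standard deviation = 1500. The `lifetimes` list is made-up illustration data, not the actual spreadsheet, and the normal model is only an assumption (lifetimes are often skewed, so the empirical proportion is usually the safer figure).

```python
from statistics import NormalDist

def empirical_prob_before(lifetimes, day):
    """Fraction of items whose lifetime (in days) is below `day`."""
    return sum(1 for t in lifetimes if t < day) / len(lifetimes)

def normal_prob_before(mean, sd, day):
    """P(lifetime < day) IF lifetimes were normal(mean, sd). An assumption."""
    return NormalDist(mu=mean, sigma=sd).cdf(day)

# Hypothetical lifetimes in days, for illustration only.
lifetimes = [45, 80, 150, 900, 1800, 2500, 3100, 3900, 2200, 60]

empirical = empirical_prob_before(lifetimes, 90)
approx = normal_prob_before(2000, 1500, 90)
print(f"empirical: {empirical:.3f}, normal approximation: {approx:.3f}")
```

If the two figures disagree badly on the real data, that is a hint the normal model does not fit.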

Perhaps a test where H_0: mu > 90 days and H_1: mu < 90 days?

The problem is I have been given this data with little direction and quite vague objectives to complete. I don't have any information about the population mean or anything like that to compare it to. I am a bit lost in this respect. What would you do in my situation?
• Oct 27th 2012, 05:05 AM
chiro
It is hard when things are vague, but I might suggest somewhere to start.

One suggestion is a hypothesis test on a proportion: for a given number of days, test whether you have statistical evidence for H0: p > p0 against H1: p <= p0, where p0 is a proportion of things going out of date (for example, p0 = 0.8 means 80% of things going out of date).

To do this, create a new data field for each day of interest that holds a 1 if the item has already gone out of date by that day and 0 otherwise. You can then look at the hypothesis for each day and reject or fail to reject it.
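The indicator fields described above might be sketched like this. The function names, the `lifetimes` values and the choice p0 = 0.75 are illustrative assumptions, and the z statistic uses the usual large-sample normal approximation for a proportion; which tail you compare it against depends on how the hypotheses are set up.

```python
from math import sqrt

def expired_by(lifetimes, day):
    """Indicator field: 1 if the item had already expired by `day`, else 0."""
    return [1 if t <= day else 0 for t in lifetimes]

def z_statistic(indicator, p0):
    """Sample proportion and z statistic for testing against p0."""
    n = len(indicator)
    p_hat = sum(indicator) / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return p_hat, z

# Hypothetical lifetimes in days, for illustration only.
lifetimes = [45, 80, 150, 900, 1800, 2500, 3100, 3900, 2200, 60]
p0 = 0.75  # assumed threshold proportion

for day in (100, 1000, 3000):
    p_hat, z = z_statistic(expired_by(lifetimes, day), p0)
    print(f"day {day}: p_hat = {p_hat:.2f}, z = {z:.2f}")
```

Running the loop over a grid of days gives one test per day, which shows roughly where the expired proportion crosses the chosen threshold.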

You can also use this to look at rate of decay and get confidence intervals on the average rate of food decay on a per day basis.
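One possible reading of "rate of decay" is the number of expiries per item-day at risk within an interval. The sketch below takes that reading; the function name, the sample data and the rough 95% interval (which treats the expiry count as approximately Poisson) are all assumptions rather than anything specified in this thread.

```python
from math import sqrt

def decay_rate(lifetimes, start, end):
    """Expiries per item-day at risk in [start, end), with a rough 95% CI.

    Assumes the expiry count is approximately Poisson, so its standard
    error is sqrt(events). This is a crude large-sample approximation.
    """
    events = sum(1 for t in lifetimes if start <= t < end)
    # Days each still-edible item contributes inside the interval.
    exposure = sum(min(t, end) - start for t in lifetimes if t > start)
    if exposure == 0:
        raise ValueError("no items at risk in this interval")
    rate = events / exposure
    half_width = 1.96 * sqrt(events) / exposure
    return rate, max(0.0, rate - half_width), rate + half_width

# Hypothetical lifetimes in days, for illustration only.
lifetimes = [45, 80, 150, 900, 1800, 2500, 3100, 3900, 2200, 60]

for start, end in ((0, 1000), (1000, 4000)):
    rate, low, high = decay_rate(lifetimes, start, end)
    print(f"days {start}-{end}: rate = {rate:.5f} per day ({low:.5f}, {high:.5f})")
```

Comparing the intervals for two periods gives a first impression of whether the rates really differ.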

Doing the first one is very straightforward, and if you choose something relatively high that leaves some room for error (my suggestion would be > 75%), then this is a concrete thing you can test and build upon for other ideas (like the rates of going off, as an example).
• Oct 28th 2012, 07:45 AM
statsman1
OK, so given that a proportion p=0.21 of the food expires in the first 100 days, I've tested H0: p<=p0, H1: p>p0 where p0=0.2 and n=990, and I get a Z_calc value of about -2.43. Now I forget how to use the Stats Tables to see which hypothesis to reject! Also, what do you think of my hypothesis test?
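As a hedged sketch of the table lookup: for H1: p>p0 the test is upper-tailed, so H0 is rejected when z exceeds the critical value for the chosen significance level. The code below plugs in the figures quoted above (0.21, 0.2, 990) and uses Python's `statistics.NormalDist` in place of printed tables. The 5% significance level is an assumption, and the z it produces (roughly 0.79 with these inputs) need not match a Z_calc obtained from different inputs.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(p_hat, p0, n):
    """z statistic for a one-sample proportion test (normal approximation)."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def upper_tail_decision(z, alpha=0.05):
    """For H1: p > p0, reject H0 when z exceeds the upper critical value."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    critical = std_normal.inv_cdf(1 - alpha)
    p_value = 1 - std_normal.cdf(z)
    return critical, p_value, z > critical

z = one_proportion_z(0.21, 0.20, 990)
critical, p_value, reject = upper_tail_decision(z)  # alpha = 0.05 assumed
print(f"z = {z:.3f}, critical = {critical:.3f}, p-value = {p_value:.3f}")
print("reject H0" if reject else "fail to reject H0")
```

The critical value here is the same number a standard normal table gives for an upper-tail area of 0.05.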

I did the rate of decay as you suggested and noticed that the r.o.d in the first 1000 days is almost twice that of the remaining 3000 days, so thanks for that!
• Oct 28th 2012, 05:31 PM
chiro