I'm a biotech PhD student currently researching rice blight, a rice disease found throughout Asia. Unfortunately my knowledge of statistics is quite weak, so I would really appreciate advice from anybody who can help. I apologise for the lengthy explanation.
Background: Xanthomonas oryzae is a bacterium that infects rice leaves, causing them to wilt. We are trying to find out which of its genes are essential for infection. To do this, we remove specific genes from the bacteria using lab methods and then see whether the altered bacteria grow more slowly than those that haven't been altered. If they grow more slowly, it suggests that the gene we removed is somehow connected to the cause of disease.
Method: I have infected the rice with equal concentrations of both kinds of bacteria. I will leave them to spread for 12 days, then collect the leaves and spin the bacteria out of them. Because there are far too many cells to count directly, I need to dilute them in water by factors of 10, from 10^(-1) right up to 10^(-9). Then I drop three 10-microlitre drops of each dilution onto growth media.
If the drops from one dilution have an average of 9 cells in them, for example, and that dilution was 10^(-7), then I can assume that there were around 9 X 10^7 cells in 10 microlitres of the original.
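To make the back-calculation concrete, here is a minimal sketch of the arithmetic as I understand it. The function name and its parameters are just my own labels for illustration, not part of any protocol.

```python
def cells_in_original(mean_count, dilution_exponent):
    """Estimate the cells in one drop volume of the undiluted sample.

    mean_count: average colony count per drop at the chosen dilution.
    dilution_exponent: the n in a 10^(-n) dilution (7 for 10^(-7)).
    """
    # Undo the dilution by multiplying the count by 10^n.
    return mean_count * 10 ** dilution_exponent


# The example from the text: 9 cells per drop at the 10^(-7) dilution.
estimate = cells_in_original(9, 7)
print(f"about {estimate:.0e} cells per 10 microlitres of the original")
```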
I am trying to determine exactly how many times I should repeat each drop, for my end data to be accurate.
What happens is that when I do a test run of bacterial counting, there is quite a large difference between each repetition.
From exactly the same starting sample of bacteria, I plated out 3 individual 10-microlitre drops on each of 5 plates. The average count for each plate is as follows:
Plate 1: 272 (X 10^-6)
Plate 2: 304 (X 10^-6)
Plate 3: 195 (X 10^-6)
Plate 4: 285 (X 10^-6)
Plate 5: 316 (X 10^-6)
I want to know how many plate repetitions I should do to be confident in the final average value. For example, if I take the average of these, it becomes 274.4 (X 10^-6). But if I had only chosen to do three plates instead of five, and they happened to be plates 1, 3 and 4, then the average would have been 250.7 (X 10^-6). This difference is too great. Obviously, the more repetitions I do, the better, but each repetition is a very time-consuming process and uses a lot of materials.
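In case it helps anyone answering, this is my rough attempt at putting numbers on the spread using only Python's standard library. It is a sketch under assumptions I am not sure about: I treat the five plate averages as roughly normally distributed, and I hard-code the 95% critical value of Student's t for 4 degrees of freedom (2.776, from a standard t-table), so it only applies to exactly five plates.

```python
import math
import statistics

# Mean counts per plate, as listed above. The dilution factor is left out
# because it scales every plate equally.
counts = [272, 304, 195, 285, 316]

n = len(counts)
mean = statistics.mean(counts)
sd = statistics.stdev(counts)   # sample standard deviation (divides by n - 1)
sem = sd / math.sqrt(n)         # standard error of the mean

# Two-sided 95% critical value of Student's t for n - 1 = 4 degrees of
# freedom, taken from a t-table. Assumption: valid only for n = 5.
T_CRIT_DF4 = 2.776
half_width = T_CRIT_DF4 * sem

# Mean of plates 1, 3 and 4 only, for comparison with the full set.
subset_mean = statistics.mean([counts[0], counts[2], counts[3]])

print(f"mean = {mean:.1f}, sd = {sd:.1f}, sem = {sem:.1f}")
print(f"95% CI: {mean - half_width:.1f} to {mean + half_width:.1f}")
print(f"mean of plates 1, 3 and 4 = {subset_mean:.1f}")
```

If this approach is valid, the standard error should shrink with the square root of the number of plates, which might be a way to decide how many repetitions are worth the effort.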
Perhaps I should eliminate the outliers? I remember something about a confidence level in my first year statistics classes, but I've forgotten a lot.
Also, how do I get Microsoft Excel to produce a line graph (time on the x-axis) showing standard deviation for each value? Or is there a better (free and downloadable) program that I can use?
Thanking you in advance.