Average test and Mycostats

 

Read this first

 

Quick summary
Conclusions of the test
Simplified general information about the average test
Practical example with comments

Mycostats: first steps

Read this first

In statistics, hypothesis testing is far from an easy concept and requires a good command of statistical calculation. Yet many mycologists believe they understand it and know how to use it intuitively. They are misled by the term "confidence coefficient", a phrase that could hardly be more ill-chosen, as discussed below.

Quick summary

Problem: you have a series of measurements that you wish to compare with a reference value or with other measurements

Comparison of a sample to a value
Example: comparing a batch of measurements with the average of a candidate species
Enter:

- the number of measurements, their average and their standard deviation
- the average of the presumed reference species


Comparison of two samples
Example: comparing the measurements of two different samples

- for each sample, enter the number of measurements, their average and their standard deviation


=> the lower window immediately displays the numerical results of the test and the associated comments


Conclusions of the test


Two outcomes are possible:

The hypothesis H0 is rejected. The test is significant:
The observed differences are not due to chance alone; the values are "significantly different".

The hypothesis H0 is not rejected. The test is not significant:
The observed differences may be due to sampling fluctuations (but one must not conclude from this that the averages are the same!)


Simplified general information about the average test

For more details, see the notes "Statistics for Mycologue" posted on the Mycomètre forum.

The classic problem of the beginner mycologist is as follows:

"I have a sample of N spore-length measurements, and I have calculated the average and the standard deviation of this sample (by hand, with a calculator, etc.).

Can I say that these measurements are the same as those obtained by the author?"

Let us not dream: except in very particular cases, statistical tests never allow one to state, as is often heard, that "there is a 95% 'chance' that the dimensions are similar" or that they "lie between such and such bounds".

Statistical calculation only makes it possible to calculate a risk.

There are two kinds of risk:

a/ the risk of rejecting a hypothesis when it is true (the "alpha" risk)

b/ the risk of not rejecting that hypothesis when it is false, i.e. when the opposite hypothesis is true (the "beta" risk)

Contrary to a widespread belief, these two risks are by no means complementary: beta is not equal to 1 - alpha (we will not develop this delicate point here).

It is important to decide at the outset which risk one is concerned with. The risk that interests the mycologist is the beta risk (he does not want to accept a priori that the given species is "the right one" without investigating further), which is precisely the most difficult to calculate.

The beta risk depends on the alpha risk chosen at the outset. To calculate the beta risk, one must also know beforehand the "rules of the game", i.e. the distribution law of the population. Unfortunately, one generally does not know the distribution law of the sample (finding it is even more delicate than answering the question posed above).

On the other hand, one does know the distribution of the average: when the sample is large enough, it approximately follows a Laplace-Gauss law, known as the "Normal law".
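The behaviour of the average can be checked with a small simulation. The sketch below is only an illustration: the exponential population, the sample size of 50 and the function name sample_averages are assumptions chosen for the demonstration, not anything taken from Mycostats. It draws many samples from a strongly asymmetrical population and looks at how their averages are spread.

```python
import random
import statistics

def sample_averages(n, replications, seed=0):
    """Return the average of each of `replications` samples of size n,
    drawn from a strongly skewed (exponential, mean 1, sd 1) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(replications)
    ]

n = 50
averages = sample_averages(n, replications=2000)
centre = statistics.fmean(averages)
spread = statistics.stdev(averages)

# Share of averages within 1.96 standard deviations of their centre;
# a Normal law would give about 95 %.
share = sum(abs(a - centre) <= 1.96 * spread for a in averages) / len(averages)

print(f"sd of the averages: {spread:.3f}  (sigma / sqrt(n) = {1 / n ** 0.5:.3f})")
print(f"share within 1.96 sd: {share:.1%}")
```

Although the individual values are far from Normal, their averages cluster symmetrically, with a spread close to the population standard deviation divided by the square root of n.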

That does not, however, give one the right to compare individual measurements with averages. These are two entirely different populations (the proof being that they do not follow the same distribution law), even though the units of measurement are identical, which makes the confusion all the more tempting (it amounts to "mixing the towels with the dishcloths").

Since the samples cannot be compared directly and easily, one can instead compare their known averages with each other. Note also that the spores to be measured must be sampled "honestly", without sorting of any kind: one must systematically measure every spore present, except those that are obviously damaged.

Calculation of the beta risk

We have seen that calculating the beta risk is not elementary for a non-statistician. Mycostats provides it without any effort, under certain assumptions (Normality, for the moment). It should be noted that we know of no other software that provides this result.

Assumptions:

In the case of the average test, one needs to know, in theory, the two averages to be compared and their standard deviations. But very often, only the average of the reference population is known, or even just an indicative interval.

The cases the mycologist may encounter are:

a. only the average of the reference species is known

This is the default in Mycostats ("Echant / Type")

b. only a reference interval is known: one can take the risk (yet another one!) of using the center of the interval as the average and proceeding as in a). If the distribution of the reference species is symmetrical, the approximation may be sufficient.

This brings us back to the preceding case

Particular case: the estimated standard deviation of the average is known: tick the corresponding option in the "Echant / Type" test

c. the author of the reference collection has specified both the average and the estimated standard deviation of the sample

In Mycostats, choose the "2 samples" test.
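For case c, the comparison of two samples can be sketched as follows. This is a minimal illustration under stated assumptions: the formula actually used by the Mycostats "2 samples" test is not described here, so the sketch uses the usual large-sample statistic with separate variances and a Normal approximation. The function name and the two data sets are hypothetical, chosen only for the example.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(n1, m1, s1, n2, m2, s2):
    """Large-sample comparison of two averages (normal approximation).

    Returns (z, p_lower), where p_lower is the one-sided risk associated
    with H1: "average 1 is lower than average 2".
    """
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)  # standard error of the difference
    z = (m1 - m2) / se
    p_lower = NormalDist().cdf(z)
    return z, p_lower

# Hypothetical data: 30 spores from one collection, 40 from the reference one.
z, p_lower = two_sample_z(30, 11.1, 1.25, 40, 11.6, 1.10)
print(f"z = {z:.2f}, one-sided risk = {p_lower:.1%}")
```

With these invented figures the one-sided risk comes out below 5%, so H0 would be rejected at alpha = 5%. For small samples a Student-Fisher law would be more appropriate than the Normal law used in this sketch.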



Practical example with Mycostats[1]


NB: to begin with, keep the default options.

Open Mycomètre (demo or pro version)

Press the Mycostats button

In the menu bar, choose tests/average

An example data set is displayed automatically.

(ignore the "column" field, which is used to enter data from Mycomètre measurements automatically).

 

In this example, the sample of 20 measurements has an average me = 11.1 and a standard deviation s = 1.25.

The average of the type is mt = 11.5

The estimated standard deviation of the type is not known here; Mycostats assumes that it is roughly equal to that of the sample.

It is known that me < mt: a one-sided (unilateral) test will therefore be carried out (a one-sided test is preferable for the tests that concern us; Mycostats takes this into account automatically).

The hypotheses to be tested are:

H0: "the samples have the same average: me = mt"

H1: "me is lower than mt"

The default risk chosen for alpha is 5%

For alpha = 5%, the hypothesis H0 is not rejected.

Moreover, Mycostats gives beta = 59.74%, i.e. approximately 60%.
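The figures of this example can be approximated by hand. The sketch below is not the Mycostats algorithm, which is not described here: it assumes a Normal approximation, a one-sided test of H0: me = mt against H1: me < mt, and it takes the observed average me as the true average when computing beta. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_average_test(n, me, s, mt, alpha):
    """One-sided test of H0: me = mt against H1: me < mt (normal approximation).

    Returns (z, reject_h0, beta), where beta is the risk of not rejecting H0
    if the true average were equal to the observed sample average me.
    """
    std = NormalDist()
    se = s / sqrt(n)                  # standard error of the average
    z = (me - mt) / se                # test statistic
    z_crit = std.inv_cdf(alpha)       # lower-tail critical value (negative)
    reject_h0 = z < z_crit
    beta = 1.0 - std.cdf(z_crit - z)  # P(statistic >= z_crit) when its mean is z
    return z, reject_h0, beta

z, reject_h0, beta = one_sided_average_test(n=20, me=11.1, s=1.25, mt=11.5, alpha=0.05)
print(f"z = {z:.3f}, H0 rejected: {reject_h0}, beta = {beta:.1%}")
```

Under these assumptions H0 is not rejected and beta comes out at about 58%, close to but not identical with the 59.74% displayed by Mycostats. The gap is plausibly explained by the fact that Mycostats may use the Student-Fisher law rather than the Normal law for a sample of this size.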

Conclusion

At the 5% risk of being mistaken, one does not reject the hypothesis that "the averages are identical". The observed differences may be due to sampling fluctuations.

Moreover, with the 5% risk selected, there is about a 60% risk of being mistaken in not rejecting H0 if in reality me < mt (different averages).

Caution: not rejecting H0 does not mean that it is accepted!

(Option: if the estimated standard deviation of the average is known, tick the corresponding option.)

Another test: 

Let us choose a risk alpha = 10%.

At the risk alpha = 10%, the hypothesis H0 is rejected. This means that the hypothesis "the averages are the same" is rejected, with a 10% risk of being mistaken (a higher risk, therefore).

The value of beta falls below 50%.

But then, would it be enough to work with a different risk alpha to obtain an optimal value of beta?

No! That would be too simple: the power of the test (1 - beta) is indeed better for alpha = 10%, but only because a higher risk alpha of wrongly rejecting H0 has been accepted.
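The trade-off between the two risks can be made visible with the data of the example. The sketch below assumes a Normal approximation and takes the observed average as the true one; it is an illustration of the principle, not the calculation performed by Mycostats, and the function name beta_risk is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def beta_risk(n, me, s, mt, alpha):
    """Approximate beta for the one-sided test H0: me = mt, H1: me < mt,
    taking the observed average me as the true one (normal approximation)."""
    std = NormalDist()
    shift = (me - mt) / (s / sqrt(n))  # standardized observed difference
    z_crit = std.inv_cdf(alpha)        # H0 is rejected below this value
    return 1.0 - std.cdf(z_crit - shift)

alphas = (0.01, 0.05, 0.10, 0.20)
betas = [beta_risk(20, 11.1, 1.25, 11.5, a) for a in alphas]
for a, b in zip(alphas, betas):
    print(f"alpha = {a:.0%}   beta = {b:.1%}   power = {1 - b:.1%}")
```

For a fixed number of measurements, beta can only be lowered by raising alpha: one risk is exchanged for the other, and neither choice makes the data more conclusive.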

With the data of this example, it would be difficult to reach a firm conclusion (rejection of equality or not, at a given risk).

What should one do, then?

One can take additional (and independent!) measurements to try to confirm one of the two possibilities.


  

For example, for the same data, with alpha = 10% and a batch of 100 measurements, the power of the test rises to 97%, with beta = 3%, which would lead us to conclude that the hypothesis of equality must be rejected.
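The effect of the number of measurements can be sketched in the same way. The code below assumes a Normal approximation, and assumes that the average and the standard deviation stay at 11.1 and 1.25 whatever the number of measurements; the function name is hypothetical and the calculation is only an approximation of what Mycostats displays.

```python
from math import sqrt
from statistics import NormalDist

def power_of_test(n, me=11.1, s=1.25, mt=11.5, alpha=0.10):
    """Approximate power (1 - beta) of the one-sided test H0: me = mt,
    H1: me < mt, taking the observed average as the true one."""
    std = NormalDist()
    shift = (me - mt) / (s / sqrt(n))  # grows in size with sqrt(n)
    z_crit = std.inv_cdf(alpha)
    beta = 1.0 - std.cdf(z_crit - shift)
    return 1.0 - beta

sizes = (20, 50, 100)
powers = {n: power_of_test(n) for n in sizes}
for n in sizes:
    print(f"n = {n:3d}   power = {powers[n]:.1%}   beta = {1 - powers[n]:.1%}")
```

With 100 measurements the approximate power is about 97%, in line with the value quoted above: the same observed difference, confirmed on a larger sample, becomes much harder to attribute to sampling fluctuations.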

Caution! One should never take advantage of the ease of calculation offered by Mycostats to reason carelessly:

One must set a risk at the outset (alpha = 5% or 10% are common values), and then observe the value of beta displayed.

Depending on the value obtained for beta, it may be useful to take additional measurements. If a low value of beta is confirmed, one may have to resign oneself to admitting that the average of the sample does not coincide with that of the type.

If a high value of beta is confirmed, one may accept that the hypothesis of equality need not be rejected.

But this is not an acceptance test!

Copying the result of the test: the "copy" button copies the result of the test to the clipboard.
For the other comparison tests, only the input differs.
The conclusions are interpreted in the same way.


NB: depending on the number of measurements, Mycostats performs the calculations using the Normal law or the Student-Fisher law



[1] NB: the Mycostats software is freely available on the Internet