Performing experiments using FTFs
Chi-square, degrees of freedom and critical values
Table 1: critical values of χ²
How to use this table
The table lists critical values of χ² for degrees of freedom, df, from 1 to 25. It quotes critical values at two error levels: p = 0.05, i.e., a 1 in 20 chance of error, and p = 0.01, i.e., a 1 in 100 chance of error. Usually an error rate of 1 in 20 is quite sufficient. This means that there is a chance of one in twenty of detecting a change in the sample when there is no real change in the population. If you choose the stricter threshold (p = 0.01), you tend to make the opposite error of being over-cautious, i.e., failing to detect a change that is real.
The number of degrees of freedom for a single contingency table column is the number of cells minus one, usually written as df = r - 1, where r is the number of rows in the column. The number of degrees of freedom for an entire table, or set of columns, is df = (r - 1) × (c - 1), where r is the number of rows and c the number of columns.
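The two formulas above can be written as a small helper. This is only a sketch: the function name `degrees_of_freedom` is our own, chosen for illustration.

```python
def degrees_of_freedom(rows, cols=1):
    """Degrees of freedom for a contingency table.

    A single column (cols == 1) uses df = r - 1.
    A full table uses df = (r - 1) x (c - 1).
    """
    if rows < 2 or cols < 1:
        raise ValueError("need at least two rows and one column")
    if cols == 1:
        return rows - 1
    return (rows - 1) * (cols - 1)


print(degrees_of_freedom(3))      # a single column with 3 cells
print(degrees_of_freedom(3, 4))   # a table with 3 rows and 4 columns
```

The first call gives df = 2 and the second gives df = 6, so you would look up the corresponding rows of the critical value table.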
We should remind the reader that each cell in the expected distribution must have at least 5 cases in it; if not, you should collapse values together, as in the two-features-in-a-clause example here (see the effect of mood on ditransitive). The number of degrees of freedom will then fall.
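A minimal sketch of this check is given below. It assumes the usual expected distribution for a table (row total × column total / grand total). The counts are invented for illustration and the function names are our own.

```python
def expected_counts(observed):
    """Expected cell counts: row total x column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def rows_with_small_cells(observed, minimum=5):
    """Indices of rows containing an expected count below `minimum`."""
    return [i for i, row in enumerate(expected_counts(observed))
            if any(e < minimum for e in row)]


def merge_rows(observed, i, j):
    """Collapse rows i and j into a single row by adding their counts."""
    merged = [a + b for a, b in zip(observed[i], observed[j])]
    kept = [row for k, row in enumerate(observed) if k not in (i, j)]
    return kept + [merged]


# Invented counts: the middle row is too sparse to test on its own.
table = [[20, 30], [3, 2], [27, 18]]
print(rows_with_small_cells(table))      # the sparse row is flagged

collapsed = merge_rows(table, 1, 2)      # merge it with a neighbouring value
print(collapsed)
print(rows_with_small_cells(collapsed))  # no sparse rows remain
```

Here the table shrinks from three rows to two, so df falls from 2 to 1. Which values it makes sense to collapse together depends on your experiment, not on the arithmetic.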
If your value of df is greater than 25, you will need to refer to other tables. (NB. Sometimes error levels are quoted as the probability that the test is successful, e.g., p = 0.95 or 0.99, as below. If you see figures like this, just subtract them from 1 to obtain the error level.)
With multiple degrees of freedom, it can be difficult to determine the reason for a result being significant. In a large table, "a significant result" means that it is probable that the dependent and independent values correlate. It does not tell us how they correlate, which values change more than others, etc. To study this question it is necessary to break the table down into sub-tables.
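One possible way to break a table down is to test every pair of rows as a sub-table of its own. The sketch below illustrates this with invented counts. The pairwise strategy and the function names are our assumptions, not a prescribed procedure.

```python
from itertools import combinations


def chi_square(observed):
    """Chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for row, r in zip(observed, row_totals):
        for o, c in zip(row, col_totals):
            e = r * c / n              # expected count for this cell
            total += (o - e) ** 2 / e
    return total


def pairwise_subtables(observed):
    """Yield (i, j, chi-square) for each sub-table made of rows i and j."""
    for i, j in combinations(range(len(observed)), 2):
        yield i, j, chi_square([observed[i], observed[j]])


# Invented counts for illustration only.
table = [[20, 30], [30, 20], [20, 30]]
for i, j, score in pairwise_subtables(table):
    print(f"rows {i} and {j}: chi-square = {score:.3f}")
```

Each sub-table here has two rows and c columns, so it is compared against the critical value for df = c - 1. In this example, rows 0 and 2 do not differ at all, while each differs from row 1.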
The mathematics of chi-square
Table 2: critical values of z (two-tailed)
The chi-square test is derived from the z test, which can be thought of as another way of carrying out χ² tests for 1 degree of freedom.
- The simplest test, the 2 x 1 goodness of fit χ² test, calculates the same result as the single sample z test, only squared.
- The simplest pairwise test, the 2 x 2 χ² test for homogeneity, obtains the same result (squared) as the independent sample z test (where data is taken from the same population).
- Critical values of z (see above) are the square root of the critical values of χ² for one degree of freedom.
- Modern improvements on z tests employ the Wilson score interval to create a 'better 2 x 2 χ² test' (see also here).
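The relationship between z and χ² for one degree of freedom can be checked numerically. The sketch below uses invented figures (60 cases out of 100 against an expected proportion of 0.5) and the standard formulas for the single sample z test and the 2 x 1 goodness of fit test. The function names are our own.

```python
import math
from statistics import NormalDist


def z_single_sample(successes, n, expected_p):
    """Single sample z score for an observed proportion against expected_p."""
    p = successes / n
    return (p - expected_p) / math.sqrt(expected_p * (1 - expected_p) / n)


def chi_square_goodness_of_fit(successes, n, expected_p):
    """2 x 1 goodness of fit chi-square against the same expected_p."""
    observed = [successes, n - successes]
    expected = [n * expected_p, n * (1 - expected_p)]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


z = z_single_sample(60, 100, 0.5)
chi2 = chi_square_goodness_of_fit(60, 100, 0.5)
print(z ** 2, chi2)   # the two agree up to floating-point rounding

# Two-tailed critical value of z at p = 0.05, and its square.
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
print(z_crit, z_crit ** 2)   # approximately 1.96 and 3.84
```

The squared critical value of z matches the χ² critical value for df = 1 at the same error level.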
Critical values of chi-square may be calculated from first principles. This is useful if you need to calculate the critical value for fractional degrees of freedom or for different error levels.
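As a sketch of one way to do this, the code below computes the χ² cumulative distribution with the standard series for the regularised incomplete gamma function, then searches for the critical value by bisection. This is our own illustrative approach using only the Python standard library, and is not necessarily how the tables on this page were generated.

```python
import math


def chi_square_cdf(x, df):
    """P(chi-square <= x) for df degrees of freedom (df may be fractional)."""
    if x <= 0:
        return 0.0
    a = df / 2.0
    half_x = x / 2.0
    # Series: gamma(a, t) = t^a e^-t * sum over n of t^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total and n < 10000:
        term *= half_x / (a + n)
        total += term
        n += 1
    log_scale = -half_x + a * math.log(half_x) - math.lgamma(a)
    return min(1.0, total * math.exp(log_scale))


def chi_square_critical(p, df):
    """Value of chi-square exceeded with probability p (the error level)."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi_square_cdf(hi, df) > p:   # widen until hi is past the target
        hi *= 2.0
    for _ in range(100):                       # bisect down to the critical value
        mid = (lo + hi) / 2.0
        if 1.0 - chi_square_cdf(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(chi_square_critical(0.05, 1), 3))    # df = 1, p = 0.05
print(round(chi_square_critical(0.01, 1), 3))    # df = 1, p = 0.01
print(round(chi_square_critical(0.05, 1.5), 3))  # a fractional df
```

The first two results can be compared with the df = 1 row of a published table (3.841 and 6.635). The same function accepts df above 25 and other error levels.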
The shaded area is beyond the limit (p = 1, i.e., the probability of an error is 0). The critical value for p = 1 is finite and can be exceeded by a χ² test, especially when a large amount of data is available.
Question: in this graph does an error level of 0 mean that an experimental result is guaranteed correct? (A. No, it is due to a rounding error in the approximation.)
See also
- Statistical tables from the US National Institute of Standards and Technology (NIST) handbook
- 2 x 2 χ² (Excel)
- Further reading: z-squared: the origin and use of χ² (PDF)
- Sean Wallis' statistics publications
- corp.ling.stats blog
FTF home pages by Sean Wallis and Gerry Nelson.
Comments/questions to s.wallis@ucl.ac.uk.
This page last modified 25 April, 2013 by Survey Web Administrator.
