Chi-Square Functions in Excel

CHISQ.DIST, CHISQ.DIST.RT, CHISQ.INV, CHISQ.INV.RT, CHIDIST and CHIINV Functions

Chi-square
Example of Pearson's chi squared test.

 Joxemai/CC BY-SA 3.0/Wikimedia Commons

Statistics is a subject with a number of probability distributions and formulas. Historically many of the calculations involving these formulas were quite tedious. Tables of values were generated for some of the more commonly used distributions and most textbooks still print excerpts of these tables in appendices. Although it is important to understand the conceptual framework that works behind the scenes for a particular table of values, quick and accurate results require the use of statistical software.

There are a number of statistical software packages. One that is commonly used for calculations at the introductory is Microsoft Excel. Many distributions are programmed into Excel. One of these is the chi-square distribution. There are several Excel functions that use the chi-square distribution.

Details of Chi-square

Before seeing what Excel can do, let’s remind ourselves about some details concerning the chi-square distribution. This is a probability distribution that is asymmetric and highly skewed to the right. Values for the distribution are always nonnegative. There are actually an infinite number of chi-square distributions. The one in particular that we are interested in is determined by the number of degrees of freedom that we have in our application. The greater the number of degrees of freedom, the less skewed our chi-square distribution will be.

Use of Chi-square

A chi-square distribution is used for several applications.

These include:

  • Chi-square test—To determine if the levels of two categorical variables are independent of one another.
  • Goodness of fit test—To determine how well-observed values of a single categorical variable match with values expected by a theoretical model.
  • Multinomial Experiment—This is a specific use of a chi-square test.

    All of these applications require us to use a chi-square distribution. Software is indispensable for calculations concerning this distribution.

    CHISQ.DIST and CHISQ.DIST.RT in Excel

    There are several functions in Excel that we can use when dealing with chi-square distributions. The first of these is CHISQ.DIST( ). This function returns the left-tailed probability of the chi-squared distribution indicated. The first argument of the function is the observed value of the chi-square statistic. The second argument is the number of degrees of freedom. The third argument is used to obtain a cumulative distribution.

    Closely related to CHISQ.DIST is CHISQ.DIST.RT( ). This function returns the right-tailed probability of the selected chi-squared distribution. The first argument is the observed value of the chi-square statistic, and the second argument is the number of degrees of freedom.

    For example, entering =CHISQ.DIST(3 ,4 ,true) into a cell will output 0.442175. This means that for the chi-square distribution with four degrees of freedom, 44.2175% of the area under the curve lies to the left of 3. Entering =CHISQ.DIST.RT(3 ,4 ) into a cell will output 0.557825. This means that for the chi-square distribution with four degrees of freedom, 55.7825% of the area under the curve lies to the right of 3.

    For any values of the arguments, CHISQ.DIST.RT(x, r) = 1 – CHISQ.DIST(x, r, true). This is because the part of the distribution that does not lie to the left of a value x must lie to the right.

    CHISQ.INV

    Sometimes we start with an area for a particular chi-square distribution. We wish to know what value of a statistic we would need in order to have this area to the left or the right of the statistic. This is an inverse chi-square problem and is helpful when we want to know the critical value for a certain level of significance. Excel handles this sort of problem by using an inverse chi-square function.

    The function CHISQ.INV returns the inverse of the left tailed probability for a chi-square distribution with specified degrees of freedom. The first argument of this function is the probability to the left of the unknown value.

    The second argument is the number of degrees of freedom.

    Thus, for example, entering =CHISQ.INV(0.442175, 4) into a cell will give an output of 3. Note how this is the inverse of the calculation we looked at earlier concerning the CHISQ.DIST function. In general, if P = CHISQ.DIST(x, r), then x = CHISQ.INV( P, r).

    Closely related to this is the CHISQ.INV.RT function. This is the same as CHISQ.INV, with the exception that it deals with right-tailed probabilities. This function is particularly helpful in determining the critical value for a given chi-square test. All we need to do is to enter the level of significance as our right-tailed probability, and the number of degrees of freedom.

    Excel 2007 and Earlier

    Earlier versions of Excel use slightly different functions to work with chi-square. Previous versions of Excel only had a function to directly calculate right tailed probabilities. Thus CHIDIST corresponds with the newer CHISQ.DIST.RT, In a similar way, CHIINV corresponds to CHI.INV.RT.