What Is the F-Distribution?

Illustration: The mean lengths of the flower petals of three varieties of a species can be compared using ANOVA. ANOVA answers the question, "Is the variation in these lengths due to chance from the sample, or does it reflect a difference from the population?" (Image credit: C.K. Taylor)

Many probability distributions are used throughout statistics. The standard normal distribution, or bell curve, is probably the most widely recognized, but normal distributions are only one type of distribution. One very useful probability distribution for studying population variances is called the F-distribution. We will examine several of its properties.

Basic Properties

The probability density formula for the F-distribution is quite complicated, and in practice we do not need to be concerned with it. It can, however, be quite helpful to know some of the properties of the F-distribution. A few of the more important features of this distribution are listed below:

  • The F-distribution is a family of distributions. This means that there are an infinite number of different F-distributions. The particular F-distribution that we use for an application depends upon two numbers of degrees of freedom, one for the numerator and one for the denominator. Being indexed by degrees of freedom is a feature that the F-distribution shares with both the t-distribution and the chi-square distribution.
  • A random variable with an F-distribution is either zero or positive, so there are no negative values for F. This feature of the F-distribution is similar to the chi-square distribution.
  • The F-distribution is skewed to the right. Thus this probability distribution is not symmetric. This feature of the F-distribution is similar to the chi-square distribution.
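For readers who would like to see these properties numerically, here is a minimal sketch in Python that evaluates the standard F density using only the standard library. The function names f_pdf and trapezoid_area, and the choice of 5 and 10 degrees of freedom, are illustrative assumptions and not part of the original discussion.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F-distribution with d1 and d2 degrees of freedom."""
    if x <= 0:
        # No probability sits on negative values. Returning 0 at x = 0 is a
        # simplification that is exact only when d1 > 2.
        return 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_density = (
        (d1 / 2) * math.log(d1 / d2)
        + (d1 / 2 - 1) * math.log(x)
        - ((d1 + d2) / 2) * math.log1p(d1 * x / d2)
        - log_beta
    )
    return math.exp(log_density)

def trapezoid_area(d1, d2, upper=200.0, steps=200_000):
    """Approximate the area under the density on [0, upper] (trapezoid rule)."""
    h = upper / steps
    total = 0.5 * (f_pdf(0.0, d1, d2) + f_pdf(upper, d1, d2))
    total += sum(f_pdf(i * h, d1, d2) for i in range(1, steps))
    return total * h

d1, d2 = 5, 10                             # arbitrary illustrative choice
area = trapezoid_area(d1, d2)              # should be very close to 1
mode = ((d1 - 2) / d1) * (d2 / (d2 + 2))   # location of the peak, valid for d1 > 2
mean = d2 / (d2 - 2)                       # valid for d2 > 2

print(f_pdf(-1.0, d1, d2))   # 0.0
print(area)
print(mode, mean)
```

With these degrees of freedom the peak of the curve lies to the left of the mean, which is what we expect from a distribution with a long right tail. Changing d1 and d2 gives a different member of the family.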

These are some of the more important and easily identified features. We will look more closely at the degrees of freedom.

Degrees of Freedom

One feature shared by chi-square distributions, t-distributions and F-distributions is that each is really an infinite family of distributions. A particular distribution is singled out by knowing the number of degrees of freedom. For a t-distribution used with a single sample, the number of degrees of freedom is one less than our sample size. The number of degrees of freedom for an F-distribution is determined in a different manner than for a t-distribution or even a chi-square distribution.

We will see below exactly how an F-distribution arises. For now we will only consider enough to determine the number of degrees of freedom. The F-distribution is derived from a ratio involving two populations. There is a sample from each of these populations, and thus there are degrees of freedom for both of these samples. When we compare the variances of two populations, we subtract one from each of the sample sizes to determine our two numbers of degrees of freedom.

Statistics from these populations combine in a fraction for the F-statistic. Both the numerator and denominator have degrees of freedom. Rather than combining these two numbers into another number, we retain both of them. Therefore any use of an F-distribution table requires us to look up two different degrees of freedom.
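As a small illustration of keeping both numbers, the sketch below computes the pair of degrees of freedom from two sample sizes. The function name f_degrees_of_freedom and the sample sizes 16 and 21 are hypothetical, chosen only for the example.

```python
def f_degrees_of_freedom(n_numerator, n_denominator):
    """Return (numerator df, denominator df) for a ratio built from two samples."""
    if n_numerator < 2 or n_denominator < 2:
        raise ValueError("each sample needs at least two observations")
    # Subtract one from each sample size and keep the results as a pair.
    return n_numerator - 1, n_denominator - 1

df_pair = f_degrees_of_freedom(16, 21)
print(df_pair)  # (15, 20)
```

The order of the pair matters. An F-distribution with 15 and 20 degrees of freedom is not the same as one with 20 and 15, so a table lookup must use the numerator value first and the denominator value second.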

Uses of the F-Distribution

The F-distribution arises from inferential statistics concerning population variances. More specifically, we use an F-distribution when we are studying the ratio of the variances of two normally distributed populations.
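A minimal sketch of this ratio, assuming two independent samples from normally distributed populations, might look like the following. The function name variance_ratio_f and the measurements are made up for illustration.

```python
import statistics

def variance_ratio_f(sample_a, sample_b):
    """Return (F, df_a, df_b), where F is the ratio of the two sample variances."""
    var_a = statistics.variance(sample_a)  # sample variance, divisor n - 1
    var_b = statistics.variance(sample_b)
    return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1

# Made-up measurements, for illustration only.
sample_a = [4.1, 5.3, 6.0, 4.8, 5.9, 6.4]
sample_b = [5.0, 5.2, 4.9, 5.1, 5.3]

f_stat, df_a, df_b = variance_ratio_f(sample_a, sample_b)
print(f_stat, df_a, df_b)
```

If the two populations have equal variances, this statistic follows an F-distribution with df_a and df_b degrees of freedom, so a value far from 1 is evidence that the variances differ.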

The F-distribution is not solely used to construct confidence intervals and test hypotheses about population variances. This type of distribution is also used in one-factor analysis of variance (ANOVA). ANOVA is concerned with comparing the variation between several groups with the variation within each group. To accomplish this we utilize a ratio of variances. When the group means are in fact equal, this ratio of variances has an F-distribution. A somewhat complicated formula allows us to calculate an F-statistic as a test statistic.
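To make the ANOVA ratio concrete, here is a sketch of the usual one-factor calculation: the mean square between groups divided by the mean square within groups. The function name one_way_anova_f and the petal lengths are hypothetical, with numbers chosen so the arithmetic can be checked by hand. Note that in this setting the degrees of freedom are the number of groups minus one and the total number of observations minus the number of groups.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up petal lengths for three varieties, for illustration only.
petal_lengths = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
]

f_stat, df_between, df_within = one_way_anova_f(petal_lengths)
print(f_stat, df_between, df_within)  # 3.0 2 6
```

Here the group means are 2, 3 and 4, so the between-group sum of squares is 6 with 2 degrees of freedom, and the within-group sum of squares is 6 with 6 degrees of freedom. The resulting F-statistic of 3 would then be compared with an F-distribution with 2 and 6 degrees of freedom.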