Learning Goals

  • Gain familiarity with the interpretation of coefficients and variance components in two-predictor ANOVA models
  • Compare and contrast two-predictor ANOVA models with their linear regression equivalents

Additive Two Predictor Models (No Interaction)

The Data

Some students were interested in the effect of acidic environments on plant growth. They planted alfalfa seeds in fifteen cups, randomly assigning five to be watered with plain water, five to receive a solution with a moderate concentration of hydrochloric acid (1.5 moles per liter), and five to get a stronger acid solution (3 moles per liter).

The cups were arranged indoors near a window in five rows of three with one cup from each acid level in each row (with row a nearest the window, and row e farthest away). The response variable was average Height of the alfalfa sprouts (I assume in centimeters, though the dataset doesn’t specify) after four days.

The predictor of interest is Acid, which has three categorical levels: water, 1.5HCl and 3.0HCl.

We could write the prediction equation for a one-way ANOVA model as follows:

\[ Height_i = \mu_{Overall} + \alpha_{Acid_i} + \epsilon_i \]

  1. Given the experimental method, do you have any immediate concerns about the conditions for the residuals?

RESPONSE

The individual heights, organized by acidity condition and row, are as follows:

   Acid
Row 1.5HCl 3.0HCl water
  a 1.00   1.03   1.45 
  b 0.70   1.22   2.79 
  c 1.37   0.45   1.93 
  d 2.80   1.65   2.33 
  e 1.46   1.07   4.85

Since position relative to the window (Row) is likely to influence height, leaving it out of the model would introduce non-independence among the residuals for sprouts in the same row. So even though Row is not the target of study, it should be included in the model so that the inferences we make about Acid are valid.

A Two-Predictor ANOVA Model

A two-predictor model which is effect coded (that is, where the “intercept” represents the overall mean rather than a reference mean) would look like this:

\[ Height_i = \mu_{Overall} + \alpha_{Acid_i} + \beta_{Row_i} + \epsilon_i \]

with individual parameters \(\mu_{Overall}\), \(\alpha_{water}\), \(\alpha_{1.5HCl}\), \(\alpha_{3.0HCl}\), \(\beta_a\), \(\beta_b\), \(\beta_c\), \(\beta_d\) and \(\beta_e\).

So for example, if plant \(6\) received plain water and is in row \(c\), its prediction equation would be:

\[ Height_6 = \mu_{Overall} + \alpha_{water} + \beta_{c} + \epsilon_6 \]

because \(Acid_6 = water\) and \(Row_6 = c\).

Parameter Estimates

As before, the \(\alpha\)s represent deviations between the mean of each Acid group and the overall mean.

Similarly, the \(\beta\)s represent deviations between the mean of each Row group and the overall mean.

We can get the Row and Acid means, as well as the overall mean, as follows:
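Here is one way to compute them, sketched in Python (not the worksheet's original R code; the numbers are typed in from the data table above):

```python
# Alfalfa heights from the table above: rows a-e within each Acid level
heights = {
    "1.5HCl": [1.00, 0.70, 1.37, 2.80, 1.46],
    "3.0HCl": [1.03, 1.22, 0.45, 1.65, 1.07],
    "water":  [1.45, 2.79, 1.93, 2.33, 4.85],
}
row_labels = ["a", "b", "c", "d", "e"]

# Group means for Acid (average over the 5 cups at each level)
acid_means = {level: sum(vals) / len(vals) for level, vals in heights.items()}

# Group means for Row (average over the 3 cups in each row)
row_means = {
    row: sum(heights[level][i] for level in heights) / 3
    for i, row in enumerate(row_labels)
}

# Overall (grand) mean of all 15 cups
overall = sum(sum(vals) for vals in heights.values()) / 15

print({k: round(v, 3) for k, v in acid_means.items()})  # water mean is 2.67
print({k: round(v, 3) for k, v in row_means.items()})
print(round(overall, 3))  # 1.74
```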

  1. Calculate estimates for the \(\alpha\)s and \(\beta\)s.

SOLUTION

  1. Find the predicted value for a sprout in row c that receives the 1.5HCl solution. In the data there is only one plant with this combination — what is its residual?

SOLUTION

Variance Decomposition

With a one-predictor ANOVA model, we could write the deviation of each observation from the overall mean as

\[ Y_i - \mu_{Overall} = \alpha_{X_i} + \epsilon_i \]

We derived a decomposition of the total sum of squared deviations (total “Sum of Squares”) as follows:

\[ \sum_{i=1}^N (Y_i - \mu_{Overall})^2 = \sum_{i=1}^N \alpha_{X_i}^2 + \sum_{i=1}^N \epsilon_i^2 \]

We rewrote the sum over squared alphas as a double summation, letting \(N_k\) represent the number of observations in group \(k\) and indexing each observation with its group (\(k\)) and its case number within its group (\(j\)). So we had

\[ \sum_{i=1}^N (Y_i - \mu_{Overall})^2 = \sum_{k=1}^K \sum_{j=1}^{N_k} \alpha_{k}^2 + \sum_{i=1}^N \epsilon_i^2 \]

which, after using the fact that the \(\alpha\)s didn’t depend on \(j\), simplified to:

\[ \sum_{i=1}^N (Y_i - \mu_{Overall})^2 = \sum_{k=1}^K N_k \alpha_{k}^2 + \sum_{i=1}^N \epsilon_i^2 \]

We can do something similar with the two-predictor model:

\[ \sum_{i=1}^N (Y_i - \mu_{Overall})^2 = \sum_{i=1}^N \alpha_{Acid_i}^2 + \sum_{i=1}^N \beta_{Row_i}^2 + \sum_{i=1}^N \epsilon_i^2 \]

defining the left-hand side as \(SS_{Total}\) as before, and breaking apart \(SS_{Model}\) into two pieces:

\[ SS_{Acid} = \sum_{i=1}^N \alpha_{Acid_i}^2 \] and

\[ SS_{Row} = \sum_{i=1}^N \beta_{Row_i}^2 \]

Then, as before, we have

\[ SS_{Residuals} = \sum_{i=1}^N \epsilon_i^2 \]

  1. How many times will each \(\alpha\) value be repeated in the summation for \(SS_{Acid}\)? How many times will each \(\beta\) value be repeated in the summation for \(SS_{Row}\)? How many times will each \(\epsilon\) be repeated in the summation for \(SS_{Residuals}\)?

SOLUTION

  1. Calculate \(SS_{Acid}\) and \(SS_{Row}\) for this data.

SOLUTION
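As a check on the hand calculations, here is a Python sketch (again recomputing the group means from the data table, rather than using the worksheet's R objects):

```python
# Alfalfa heights from the data table: rows a-e within each Acid level
heights = {
    "1.5HCl": [1.00, 0.70, 1.37, 2.80, 1.46],
    "3.0HCl": [1.03, 1.22, 0.45, 1.65, 1.07],
    "water":  [1.45, 2.79, 1.93, 2.33, 4.85],
}
overall = sum(sum(v) for v in heights.values()) / 15

# alphas: Acid-group means minus the overall mean; each appears 5 times in SS_Acid
alphas = [sum(v) / 5 - overall for v in heights.values()]
ss_acid = 5 * sum(a**2 for a in alphas)

# betas: Row means minus the overall mean; each appears 3 times in SS_Row
row_sums = [sum(heights[level][i] for level in heights) for i in range(5)]
betas = [s / 3 - overall for s in row_sums]
ss_row = 3 * sum(b**2 for b in betas)

print(round(ss_acid, 3), round(ss_row, 3))  # 6.852 4.183
```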

We can find \(SS_{Total}\) as follows:

## [1] 17.1066

  1. Using this value and the values of \(SS_{Acid}\) and \(SS_{Row}\) you found above, find \(SS_{Residuals}\).

SOLUTION

To check that our decomposition is valid, let’s find the sum of squared residuals directly.

Here are the actual data values (since there is only one observation at each combination of Acid and Row, the cell mean is just that data value):

##    Acid
## Row 1.5HCl 3.0HCl water
##   a   1.00   1.03  1.45
##   b   0.70   1.22  2.79
##   c   1.37   0.45  1.93
##   d   2.80   1.65  2.33
##   e   1.46   1.07  4.85

and here are the fitted values from the additive model:

##    Acid
## Row 1.5HCl 3.0HCl water
##   a  0.886  0.504  2.09
##   b  1.296  0.914  2.50
##   c  0.976  0.594  2.18
##   d  1.986  1.604  3.19
##   e  2.186  1.804  3.39

And here are the residuals (the data values minus the fitted values):

##    Acid
## Row 1.5HCl 3.0HCl water
##   a  0.114  0.526 -0.64
##   b -0.596  0.306  0.29
##   c  0.394 -0.144 -0.25
##   d  0.814  0.046 -0.86
##   e -0.726 -0.734  1.46

So then \(SS_{Residuals}\) should be the sum of the squares of the values in this table.

Does this match the value you found using the variance decomposition?
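One way to run this check, as a Python sketch (the residuals are typed in from the table above):

```python
# Residuals from the table above (rows a-e; columns 1.5HCl, 3.0HCl, water)
residuals = [
    0.114,  0.526, -0.64,
   -0.596,  0.306,  0.29,
    0.394, -0.144, -0.25,
    0.814,  0.046, -0.86,
   -0.726, -0.734,  1.46,
]
ss_residuals = sum(r**2 for r in residuals)
print(round(ss_residuals, 3))  # 6.072

# The decomposition says SS_Total = SS_Acid + SS_Row + SS_Residuals,
# so this should agree (up to rounding) with 17.1066 - 6.852 - 4.183
print(round(17.1066 - 6.852 - 4.183, 3))  # 6.072
```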

Degrees of freedom

With a one-way model, we noted that the alphas, once weighted by their group sample sizes, always sum to zero: the overall mean is a weighted average of the group means, so the weighted average deviation between the group means and the overall mean is zero.

The same thing holds true here, for both the alphas and betas.

  1. Verify this fact by summing your estimates for the alphas and summing your estimates for the betas.

SOLUTION


  1. Given these sum to zero constraints, how many alphas and how many betas do we actually need to know to be able to find the rest? That is, how many degrees of freedom are there for the Acid and Row variance components? How many are there in total for the full set of deviation scores (which also sum to zero)? The degrees of freedom for the residuals is just the number of df remaining from the total after we take out those for the modeled components.

SOLUTION

  1. Fill out the ANOVA table below using everything you have calculated so far. There will now be two F ratios; both use \(MS_{Residuals}\) in the denominator.

SOLUTION
Source      df    SS    MS    F
---------  ----  ----  ----  ----
Acid
Row
Residuals                    N/A
Total                        N/A

We can get a P-value for an F-statistic as follows, provided the residuals satisfy the standard independence, Normality and constant variance conditions (plugging in the F value and the degrees of freedom):
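For example, in Python the upper-tail probability can be found with scipy's F distribution (assuming scipy is available; the F statistics and degrees of freedom here are taken from the completed ANOVA table):

```python
from scipy.stats import f

# P(F > observed) for each effect, with the residual df = 8 in the denominator
p_acid = f.sf(4.513, 2, 8)  # F for Acid, df = (2, 8); about 0.049
p_row = f.sf(1.378, 4, 8)   # F for Row, df = (4, 8); about 0.32
print(round(p_acid, 4), round(p_row, 4))
```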

  1. What conclusions would you draw in context based on these results?

SOLUTION

You can check your ANOVA table by running the following code:

##             Df Sum Sq Mean Sq F value Pr(>F)  
## Acid         2  6.852   3.426   4.513 0.0487 *
## Row          4  4.183   1.046   1.378 0.3235  
## Residuals    8  6.072   0.759                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interactions in ANOVA

An experiment recorded the amount of force in Newtons (the response) that it takes to separate two pieces of plastic that have been glued together.

The plastic had one of three different thicknesses (thin, moderate, thick) and was held together using one of two types of glue (wood vs. plastic).

There are two cases at each combination of factors.

A two-predictor ANOVA model would look as follows:

\[ Force_i = \mu_{Overall} + \alpha_{Thickness_i} + \beta_{Glue_i} + \epsilon_i \]

Here is a plot of the Force values by Thickness and Glue type, with the means at each combination shown with large squares.

  1. The model says to adjust the predicted force if the material is thin by \(\alpha_{Thin}\), regardless of the Glue type, and to adjust the predicted force if the glue is Plastic glue by \(\beta_{Plastic}\), regardless of the Thickness. Looking at the plot, why is this a questionable way of modeling the means?

SOLUTION

If we don’t necessarily expect the impact of one predictor to be the same across all levels of another, then we should allow for that in the model by introducing an interaction term. The model equation looks as follows:

\[ Force_i = \mu_{Overall} + \alpha_{Thickness_i} + \beta_{Glue_i} + \gamma_{{Thickness}_i,{Glue}_i} + \epsilon_i \]

where the \(\alpha\)s and \(\beta\)s have the same meaning as they do in the additive model, but the \(\gamma\)s allow for the adjustment due to Glue to itself be adjusted according to the Thickness (or vice versa).
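For balanced data like this, each \(\gamma\) can be read as the cell mean minus what the additive part of the model predicts for that cell:

\[ \gamma_{t,g} = \mu_{t,g} - \left( \mu_{Overall} + \alpha_{t} + \beta_{g} \right) \]

where \(\mu_{t,g}\) is the mean Force at Thickness \(t\) with Glue type \(g\).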

The average adjustment due to Thin material (averaging across glue types) should be \(\alpha_{Thin}\), and the average adjustment due to using Wood glue (averaging across thicknesses) should be \(\beta_{Wood}\). Therefore the average “adjusted adjustment” should be zero within each Thickness and within each Glue type. That is, if we make a table:

Thickness/Glue Wood Plastic
Thin
Moderate
Thick

filling in the cells with the gamma values, then each row and each column will sum to zero.
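To see how these constraints work, here is a small Python sketch; the two starting values (0.5 and -0.2) are made up purely for illustration:

```python
# Rows: Thin, Moderate, Thick; columns: Wood, Plastic.
# Each row must sum to zero, so the Plastic column is determined by Wood;
# each column must sum to zero, so the Thick row is determined by the others.
g_thin_wood = 0.5   # chosen freely (illustrative value)
g_mod_wood = -0.2   # chosen freely (illustrative value)

g_thin_plastic = -g_thin_wood               # row constraint for Thin
g_mod_plastic = -g_mod_wood                 # row constraint for Moderate
g_thick_wood = -(g_thin_wood + g_mod_wood)  # column constraint for Wood
g_thick_plastic = -g_thick_wood             # row constraint for Thick

table = [
    [g_thin_wood, g_thin_plastic],
    [g_mod_wood, g_mod_plastic],
    [g_thick_wood, g_thick_plastic],
]
# Every row and every column sums to zero
assert all(abs(sum(row)) < 1e-9 for row in table)
assert all(abs(sum(col)) < 1e-9 for col in zip(*table))
```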

  1. How many nonredundant gammas are there, then? That is, how many Interaction degrees of freedom will we have?

SOLUTION

We’ll look closer at interaction models later.