Fixed vs. Random Variables

This is a topic that you will need to think about when you’re developing your models for ANOVAs.  We’ll define what fixed and random variables are, then we’ll talk about how to incorporate them into PROC GLM and PROC MIXED.

What is a FIXED variable?

A FIXED variable is one that you have set in your experimental design.  Think of this as your treatment effect(s):

  • diet
  • dilution levels
  • breed or variety
  • age group
  • fertilizer level

Think of it as a source of variation or an effect that is clearly defined.  Another common example is SEX.

Another way of thinking about a FIXED variable is as one where you are only interested in examining the specific levels you chose.  So I’m only interested in fertilizers of 10, 100, and 1000.  I’m only interested in diets with 2% corn, 5% corn and 8% corn.  I am NOT interested in what happens between 10, 100 and 1000 OR between 2%, 5% and 8%.  I am ONLY interested in the levels I set out in my experimental design.

What is a RANDOM variable?

A RANDOM variable is one that you expect, or know, to be a source of variation in your experimental design, but one whose levels you cannot set.  For example:

  • year
  • location
  • sire / dam

Think of it as a source of variation or an effect that you cannot clearly define, one that you know plays a role in your experimental design, or one that you are NOT interested in examining but are interested in accounting for.

As an example, you have conducted a field experiment in two separate locations, one in the southern part of the province and the second in the northern part of the province.  You recognize that there are possible differences between the two locations due to the soil type, climate, and management practices that are associated with the location of the trial.  You are interested only in the treatment effect that you set out: the variety of corn planted.  To account for the variation due to location, you would include location as a random variable and treatment (corn variety) as your fixed effect.  Ask yourself, do I want to report differences between the locations?  If the answer is NO, then keep it as a random variable, but if the answer is YES then by all means include it as a fixed variable.  In this case though, please recognize that location would include ALL effects due to location (climate, soil type, etc.).  You would not be able to pull those effects apart, because they are confounded.

Let’s look at a different way to decide whether you have a FIXED or RANDOM effect.  The following example is from a mentor of mine.

I am interested in studying soft drink consumption of 1st year students at University X.  I collect data that includes amount of pop students drink during the semester and the brand of pop they drink.

My hypothesis is:
1st year students drink more Coke than Pepsi.

Is the brand of pop in this case FIXED or RANDOM?

Next I redo my analysis with the following hypothesis:
1st year students drink a variety of pop brands.

Is the brand of pop in this case FIXED or RANDOM?

FIXED and RANDOM in SAS

Once you decide whether you have a FIXED or RANDOM variable, the next question is how to incorporate this into SAS.  The 2 most common PROCs we use in SAS for ANOVAs are PROC GLM and PROC MIXED, and each treats the RANDOM variable a little differently.

Both FIXED and RANDOM variables are listed in the CLASS statement of any PROC in SAS.  That includes GLM, MIXED, MEANS, and any other PROC where you need SAS to recognize that there are different levels within your variables.

RANDOM variables, on the other hand, are treated a bit differently in GLM and MIXED.  In both procedures you add a RANDOM statement to identify which variables should be treated as RANDOM.  However, if you are using GLM to analyze your ANOVA, you need to include both your FIXED and RANDOM effects in your MODEL statement, whereas in MIXED you include only your FIXED effects in your MODEL statement.

Proc GLM;
   class  variety location;
   model yield = variety location;   /* GLM: FIXED and RANDOM effects both go in the model */
   random location;                  /* identifies location as RANDOM */
Run;

Proc MIXED;
   class  variety location;
   model yield = variety;            /* MIXED: FIXED effects only in the model */
   random location;                  /* identifies location as RANDOM */
Run;
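
To try the two procedures above end to end, here is a minimal sketch.  The dataset name corn, the variety and location codes, and every yield value are made up purely for illustration; they are not from a real trial.  The TEST option on the GLM RANDOM statement and the LSMEANS statements are optional additions that were not part of the code above.

```sas
/* Hypothetical data: 3 corn varieties at 4 locations, 2 plots each. */
/* All yield values are invented for illustration only.              */
data corn;
   input variety $ location $ yield;
   datalines;
A L1 10.2
A L1 10.8
A L2 11.5
A L2 11.1
A L3  9.7
A L3  9.9
A L4 10.4
A L4 10.9
B L1 12.1
B L1 11.7
B L2 12.9
B L2 13.2
B L3 11.0
B L3 11.4
B L4 12.3
B L4 12.0
C L1  9.5
C L1  9.9
C L2 10.6
C L2 10.3
C L3  8.8
C L3  9.1
C L4  9.6
C L4 10.0
;
run;

/* PROC GLM: location appears in BOTH the MODEL and RANDOM statements. */
proc glm data=corn;
   class variety location;
   model yield = variety location;
   random location / test;   /* optional: request tests that use the RANDOM information */
   lsmeans variety;          /* optional: least-squares means for the FIXED effect */
run;
quit;

/* PROC MIXED: only the FIXED effect goes in MODEL; location goes in RANDOM. */
proc mixed data=corn;
   class variety location;
   model yield = variety;
   random location;
   lsmeans variety / pdiff;  /* optional: pairwise differences between varieties */
run;
```

Naming the dataset with data= in each PROC is a small safeguard: without it, SAS uses whichever dataset was created most recently, which may not be the one you intended.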
