Wednesday, May 25, 2016

Analysis of Variance (ANOVA) Explained

Surveys are effective at collecting data. Insight, however, comes after the fact, from the analysis we subject that data to. One analytical technique currently on my list of favorites is the tried-and-true analysis of variance (ANOVA).

If we are collecting metric data with our surveys, perhaps in the form of responses to a Likert scale, the amount spent on a product, customer satisfaction scores, or the number of purchases made, then we open the door to analyzing differences in average score between respondent groups. If we are comparing two groups at a time (e.g. men vs. women, new vs. existing customers, employees vs. managers), then it is appropriate to use a t-test to assess the significance of any differences. However, if there are more than two groups, it becomes necessary to look to another technique.

ANOVA, or one of its non-parametric counterparts, allows you to determine whether differences in mean values between three or more groups are due to chance or are indeed statistically significant. ANOVA is particularly useful when analyzing the multi-item scales common in market research. In the table below, respondents in a restaurant survey rated three diners on overall satisfaction. The null hypothesis is that there is no difference in satisfaction between the three restaurants. However, the data seem to imply otherwise.

Restaurant        Average satisfaction
Larry’s Diner     6.28
Curly’s Diner     6.05
Moe’s Diner       5.33
Overall           5.65

ANOVA makes use of the F-test, which compares the variation between the group averages to the variation within the groups, to determine whether the differences in response to the satisfaction questions are large enough to be considered statistically significant.

In this example, the F statistic for satisfaction is 51.19, which is statistically significant and indicates there is a real difference between average satisfaction scores. ANOVA indicates whether or not there is a significant difference somewhere among the groups; it does not, however, tell us which groups differ from one another. Statistical packages, such as SPSS and SAS, give the survey researcher the option of selecting a post hoc test, which compares the groups pair by pair. In regard to satisfaction, Larry’s Diner was the clear winner, with an average score significantly greater than either Curly’s or Moe’s. The difference between Curly’s and Moe’s was not large enough, given the number of respondents, to be significant.
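
For readers who want to see what sits behind the F statistic, here is a minimal sketch in Python using only the standard library. The ratings are made up for illustration (they are not the survey data summarized above), and the function name `one_way_anova_f` is my own, not part of any package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)

    # Variation of the group averages around the overall average
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group average
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ratings on a 1-7 scale, NOT the actual survey responses
ratings = {
    "Larry's Diner": [7, 6, 7, 6, 6],
    "Curly's Diner": [6, 6, 5, 7, 6],
    "Moe's Diner": [5, 6, 5, 4, 5],
}

f_stat, df_b, df_w = one_way_anova_f(list(ratings.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The larger F is, the more the group averages differ relative to the noise within each group. Turning F into a p-value requires the F distribution, which is what a statistics package handles for you; in Python, `scipy.stats.f_oneway` runs the same test, and `scipy.stats.tukey_hsd` is one option for the post hoc comparisons.
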
The proper use of ANOVA in analyzing survey data requires that a few assumptions be met, including normal distribution of the data, independence of cases, and equality of variance (each group’s variance is roughly equal). If the distributional assumptions cannot be met, then there are non-parametric tests available which do not depend on them.
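
As a rough illustration of these checks, the sketch below prints each group's variance and then computes the Kruskal-Wallis H statistic, one common non-parametric counterpart to one-way ANOVA that works on ranks rather than raw scores. Again, the ratings are made up and the helper names (`average_ranks`, `kruskal_wallis_h`) are my own; this is a sketch, not a replacement for a statistics package.

```python
from collections import Counter
from statistics import variance


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)

    # Standard correction for ties (undefined if every score is identical)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))


# Hypothetical ratings on a 1-7 scale, NOT the actual survey responses
ratings = {
    "Larry's Diner": [7, 6, 7, 6, 6],
    "Curly's Diner": [6, 6, 5, 7, 6],
    "Moe's Diner": [5, 6, 5, 4, 5],
}

for name, scores in ratings.items():
    print(f"{name}: variance = {variance(scores):.2f}")

print(f"Kruskal-Wallis H = {kruskal_wallis_h(list(ratings.values())):.2f}")
```

With three groups, H is usually compared against a chi-square distribution with two degrees of freedom, where the 5% cutoff is about 5.99, although that approximation is rough with samples this small. Eyeballing the variances is only a first look; formal tests such as `scipy.stats.levene` (equal variances) and `scipy.stats.kruskal` (the test above, with a p-value) are available in SciPy.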

Data by itself is just that. However, when we judiciously employ statistical tests we can create insight that can have a positive impact on our marketing efforts.