Skip to contents

Data from an experiment by William D. Rohwer on kindergarten children designed to examine how well performance on a set of paired-associate (PA) tasks can predict performance on some measures of aptitude and achievement.

Format

A data frame with 69 observations on the following 10 variables.

group

a numeric vector, corresponding to SES

SES

Socioeconomic status, a factor with levels Hi Lo

SAT

a numeric vector: score on a Student Achievement Test

PPVT

a numeric vector: score on the Peabody Picture Vocabulary Test

Raven

a numeric vector: score on the Raven Progressive Matrices Test

n

a numeric vector: performance on a 'named' PA task

s

a numeric vector: performance on a 'still' PA task

ns

a numeric vector: performance on a 'named still' PA task

na

a numeric vector: performance on a 'named action' PA task

ss

a numeric vector: performance on a 'sentence still' PA task

Source

Timm, N.H. 1975). Multivariate Analysis with Applications in Education and Psychology. Wadsworth (Brooks/Cole), Examples 4.3 (p. 281), 4.7 (p. 313), 4.13 (p. 344).

Details

The variables SAT, PPVT and Raven are responses to be potentially explained by performance on the paired-associate (PA) learning taskn, s, ns, na, and ss.

References

Friendly, M. (2007). HE plots for Multivariate General Linear Models. Journal of Computational and Graphical Statistics, 16(2) 421–444. http://datavis.ca/papers/jcgs-heplots.pdf

Examples


str(Rohwer)
#> 'data.frame':	69 obs. of  10 variables:
#>  $ group: int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ SES  : Factor w/ 2 levels "Hi","Lo": 2 2 2 2 2 2 2 2 2 2 ...
#>  $ SAT  : int  49 47 11 9 69 35 6 8 49 8 ...
#>  $ PPVT : int  48 76 40 52 63 82 71 68 74 70 ...
#>  $ Raven: int  8 13 13 9 15 14 21 8 11 15 ...
#>  $ n    : int  1 5 0 0 2 2 0 0 0 3 ...
#>  $ s    : int  2 14 10 2 7 15 1 0 0 2 ...
#>  $ ns   : int  6 14 21 5 11 21 20 10 7 21 ...
#>  $ na   : int  12 30 16 17 26 34 23 19 16 26 ...
#>  $ ss   : int  16 27 16 8 17 25 18 14 13 25 ...

# Plot responses against each predictor
library(tidyr)
library(dplyr)
library(ggplot2)

yvars <- c("SAT", "PPVT", "Raven" )
xvars <- c("n", "s", "ns", "na", "ss")
Rohwer_long <- Rohwer %>%
  pivot_longer(cols = all_of(xvars), names_to = "xvar", values_to = "x") |>
  pivot_longer(cols = all_of(yvars), names_to = "yvar", values_to = "y") |>
  mutate(xvar = factor(xvar, xvars), yvar = factor(yvar, yvars))

ggplot(Rohwer_long, aes(x, y, color = SES, shape = SES, fill = SES)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  stat_ellipse(geom = "polygon", level = 0.68, alpha = 0.1) +
  facet_grid(yvar ~ xvar, scales = "free") +
  labs(x = "predictor", y = "response") +
  theme_bw(base_size = 14)



## ANCOVA, assuming equal slopes
rohwer.mod <- lm(cbind(SAT, PPVT, Raven) ~ SES + n + s + ns + na + ss, data=Rohwer)
car::Anova(rohwer.mod)
#> 
#> Type II MANOVA Tests: Pillai test statistic
#>     Df test stat approx F num Df den Df    Pr(>F)    
#> SES  1   0.37853  12.1818      3     60 2.507e-06 ***
#> n    1   0.04030   0.8400      3     60  0.477330    
#> s    1   0.09271   2.0437      3     60  0.117307    
#> ns   1   0.19283   4.7779      3     60  0.004729 ** 
#> na   1   0.23134   6.0194      3     60  0.001181 ** 
#> ss   1   0.04990   1.0504      3     60  0.376988    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Visualize the ANCOVA model
heplot(rohwer.mod)

# Add ellipse to test all 5 regressors
heplot(rohwer.mod, hypotheses=list("Regr" = c("n", "s", "ns", "na", "ss")))

# View all pairs
pairs(rohwer.mod, hypotheses=list("Regr" = c("n", "s", "ns", "na", "ss")))


# or 3D plot
if (FALSE) { # \dontrun{
col <- c("red", "green3", "blue", "cyan", "magenta", "brown", "gray")
heplot3d(rohwer.mod, hypotheses=list("Regr" = c("n", "s", "ns", "na", "ss")), 
                     col=col, wire=FALSE)
} # }

## fit separate, independent models for Lo/Hi SES
rohwer.ses1 <- lm(cbind(SAT, PPVT, Raven) ~ n + s + ns + na + ss, data=Rohwer, subset=SES=="Hi")
rohwer.ses2 <- lm(cbind(SAT, PPVT, Raven) ~ n + s + ns + na + ss, data=Rohwer, subset=SES=="Lo")

# overlay the separate HE plots
heplot(rohwer.ses1, ylim=c(40,110),col=c("red", "black"))
heplot(rohwer.ses2, add=TRUE, col=c("blue", "black"), grand.mean=TRUE, error.ellipse=TRUE)