11  Visualizing Multivariate Models

Tests of multivariate models, including multivariate analysis of variance (MANOVA) for group differences and multivariate multiple regression (MMRA) can be easily visualized by plots of a hypothesis (“H”) data ellipse for the fitted values relative to the corresponding plot of the error ellipse (“E”) of the residuals, which I call the HE plot framework.

For more than a few response variables, these result can be projected onto a lower-dimensional “canonical discriminant” space providing an even simpler description.

Packages

In this chapter we use the following packages. Load them now

11.1 HE plot framework

Chapter 9 illustrated the basic ideas of the framework for visualizing multivariate linear models in the context of a simple two group design, using Hotelling’s \(T^2\). The main ideas were illustrated in Figure 9.9.

Having described the statistical ideas behind the MLM in Chapter 10, we can proceed to extend this framework to larger designs. Figure 11.1 illustrates these ideas using the simple one-way MANOVA design of the dogfood data from Section 10.2.1.

Figure 11.1: Dogfood quartet: Illustration of the conceptual ideas of the HE plot framework for the dogfood data. (a) Scatterplot of the data; (b) Summary using data ellipses; (c) HE plot shows the variation in the means in relation to pooled within group variance; (d) Transformation from data space to canonical space
  • In data space (a), each group is summarized by its data ellipse (b), representing the means and covariances.

  • Variation against the hypothesis of equal means can be seen by the \(\mathbf{H}\) ellipse in the HE plot (c), representing the data ellipse of the fitted values. Error variance is shown in the \(\mathbf{E}\) ellipse, representing the pooled within-group covariance matrix, \(\mathbf{S}_p\) and the data ellipse of the residuals from the model. For the dogfood data, the group means have a negative relation: longer time to start eating is associated with a smaller amount eaten.

  • The MANOVA (or Hotelling’s \(T^2\)) is formally equivalent to a discriminant analysis, predicting group membership from the response variables which can be seen in data space. (The main difference is emphasis and goals: MANOVA seeks to test differences among group means, while discriminant analysis aims at classification of the observations into groups.)

  • This effectively projects the \(p\)-dimensional space of the predictors into the smaller canonical space (d) that shows the greatest differences among the groups. As in a biplot, vectors show the relations of the response variables with the canonical dimensions.

For more complex models such as MANOVA with multiple factors or multivariate multivariate regression, there is one sum of squares and products matrix (SSP), and therefore one \(\mathbf{H}\) ellipse for each term in the model. For example, in a two-way MANOVA design with the model formula (y1, y2) ~ A + B + A*B and equal sample sizes in the groups, the total sum of squares accounted for by the model is \[\begin{aligned} \mathbf{SSP}_{\text{Model}} & = \mathbf{SSP}_{A} + \mathbf{SSP}_{B} + \mathbf{SSP}_{AB} \\ & = \mathbf{H}_A + \mathbf{H}_B + \mathbf{H}_{AB} \:\: . \end{aligned}\]

11.2 HE plot construction

The HE plot is constructed to allow a direct visualization of the “size” of hypothesized terms in a multivariate linear model in relation to unexplained error variation. These can be displayed in 2D or 3D plots, so I use the term “ellipsoid” below to cover all cases.

Error variation is represented by a standard 68% data ellipsoid of the \(\mathbf{E}\) matrix of the residuals in \(\boldsymbol{\Large\varepsilon}\). This is divided by the residual degrees of freedom, so the size of \(\mathbf{E} / \text{df}_e\) is analogous to a mean square error in univariate tests. The choice of 68% coverage allows you to ``read’’ the residual standard deviation as the half-length of the shadow of the \(\mathbf{E}\) ellipsoid on any axis (see Figure 3.10). The \(\mathbf{E}\) ellipsoid is then translated to the overall (grand) means \(\bar{\mathbf{y}}\) of the variables plotted, which allows us to show the means for factor levels on the same scale, facilitating interpretation. In the notation of Equation 3.2, the error ellipsoid is given by \[ \mathcal{E}_c (\bar{\mathbf{y}}, \mathbf{E}) = \bar{\mathbf{y}} \; \oplus \; c\,\mathbf{E}^{1/2} \:\: , \] where \(c = \sqrt{2 F_{2, n-2}^{0.68}}\) for 2D plots and \(c = \sqrt{3 F_{3, n-3}^{0.68}}\) for 3D.

An ellipsoid representing variation in the means of a factor (or any other term reflected in a general linear hypothesis test, Equation 10.6) in the \(\mathbf{H}\) matrix is simply the data ellipse of the fitted values for that term. Dividing the hypothesis matrix by the error degrees of freedom, giving \(\mathbf{H} / \text{df}_e\), puts this on the same scale as the ellipse. I refer to this as effect size scaling, because it is similar to an effect size index used in univariate models, e.g., \(ES = (\bar{y}_1 - \bar{y}_2) / s_e\) in a two-group, univariate design.

11.2.1 Example: Iris data

TODO: Introduce the iris data earlier. Where???

Figure 11.2 shows two views of the relationship between sepal length and sepal width for the iris data. …

knitr::include_graphics("images/iris-HE1.png")
Figure 11.2: Iris data: Data ellipses and HE plot

11.2.2 Significance scaling

The geometry of ellipsoids and multivariate tests allow us to go further with another re-scaling of the \(\mathbf{H}\) ellipsoid that gives a visual test of significance for any term in a MLM. This is done simply by dividing \(\mathbf{H} / df_e\) further by the \(\alpha\)-critical value of the corresponding test statistic to show the strength of evidence against the null hypothesis. Among the various multivariate test statistics, Roy’s maximum root test, based on the largest eigenvalue \(\lambda_1\) of \(\mathbf{H} \mathbf{E}^{-1}\), gives \(\mathbf{H} / (\lambda_\alpha df_e)\) which has the visual property that the scaled \(\mathbf{H}\) ellipsoid will protrude somewhere outside the standard \(\mathbf{E}\) ellipsoid if and only if Roy’s test is significant at significance level \(\alpha\). The critical value \(\lambda_\alpha\) for Roy’s test is \[ \lambda_\alpha = \left(\frac{\text{df}_1}{\text{df}_2}\right) \; F_{\text{df}_1, \text{df}_2}^{1-\alpha} \:\: , \] where \(\text{df}_1 = \max(p, \text{df}_h)\) and \(\text{df}_2 = \text{df}_h + \text{df}_e - \text{df}_1\).

For these data, the HE plot using significance scaling is shown in the right panel of .

knitr::include_graphics("images/iris-HE2.png")
Figure 11.3: HE plots for sepal width and sepal length in the iris dataset. Left: effect scaling of the \(\mathbf{H}\) magtrix; right: significance scaling

11.2.3 Contrasts

11.2.4 HE plot matrices

11.3 Canonical discriminant analysis

#> Writing packages to  C:/R/projects/Vis-MLM-book/bib/pkgs.txt
#> 8  packages used here:
#>  broom, car, carData, dplyr, ggplot2, heplots, knitr, tidyr