
Computes pseudo-R² and related fit measures for a "nestedLogit" object and related models for a polytomous response. For the "nestedLogit" case, the result shows one row per binary logit sub-model (dichotomy) and an additional "Combined" row for the overall polytomous model.

Usage

RSQ(x, ...)

# S3 method for class 'nestedLogit'
RSQ(
  x,
  which = c("McFadden", "CoxSnell", "Nagelkerke"),
  include = "AIC",
  digits = 3L,
  ...
)

# S3 method for class 'RSQ.nestedLogit'
print(x, digits = attr(x, "digits"), ...)

# S3 method for class 'multinom'
RSQ(
  x,
  which = c("McFadden", "CoxSnell", "Nagelkerke"),
  include = "AIC",
  digits = 3L,
  ...
)

# S3 method for class 'RSQ.multinom'
print(x, digits = attr(x, "digits"), ...)

# S3 method for class 'polr'
RSQ(
  x,
  which = c("McFadden", "CoxSnell", "Nagelkerke"),
  include = "AIC",
  digits = 3L,
  ...
)

# S3 method for class 'RSQ.polr'
print(x, digits = attr(x, "digits"), ...)

Arguments

x

a "nestedLogit" object, or a "multinom" or "polr" object for the corresponding methods.

...

currently unused.

which

character vector naming the pseudo-R² measures to compute. Any subset of c("McFadden", "McFaddenAdj", "CoxSnell", "Nagelkerke", "Tjur"), or "ALL" to include all of them. Default: c("McFadden", "CoxSnell", "Nagelkerke").

include

character vector of additional columns to append to the result. Any subset of c("AIC", "BIC", "n"), where "n" adds the number of observations used for each row, or "ALL" to include all of them. Default: "AIC".

digits

integer; number of decimal places used when printing (default 3L).

Value

An object of class c("RSQ.nestedLogit", "data.frame") with one row per dichotomy plus a final "Combined" row, and columns response (the sub-model name), the requested pseudo-R² measures, and any additional statistics requested via include. The formula, object name, and digits are stored as attributes and used by the print method.

Details

RSQ is an S3 generic with methods for "nestedLogit" objects and for objects returned by nnet::multinom() and MASS::polr(), which are alternative models for a polytomous response variable.

In standard Gaussian linear models, \(R^2\) has a simple interpretation as the proportion of variance accounted for by the model, and its different computational formulas are equivalent. No single measure is commonly accepted for logistic regression models of a binary response or of a dichotomy among outcomes.

The following measures are available via the which argument:

"McFadden"

1 - L/L\(_0\), where L is the fitted model log-likelihood and L\(_0\) that of the null (intercept-only) model (McFadden, 1979). Values of 0.1–0.3 indicate a reasonable fit in logistic regression.

"McFaddenAdj"

1 - (L - k)/L\(_0\), where k is the number of non-intercept parameters; penalises model complexity (Hosmer & Lemeshow, 2000).

"CoxSnell"

1 - exp(2(L\(_0\) - L)/n); bounded strictly below 1 for discrete outcomes (Cox & Snell, 1989).

"Nagelkerke"

Cox-Snell divided by its theoretical maximum, rescaling to \([0, 1]\) (Nagelkerke, 1991).

"Tjur"

Mean fitted value for \(y = 1\) minus mean fitted value for \(y = 0\); the coefficient of discrimination (Tjur, 2009). Per-dichotomy only (NA in the Combined row).
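The formulas above can be checked by hand for a single binary logit. The following is a minimal sketch in base R on simulated data, not the package's internal code; the variable names (`mcfadden`, `cox_snell`, and so on) are illustrative, and the Nagelkerke denominator assumes the usual Cox-Snell maximum \(1 - \exp(2 L_0 / n)\).

```r
# Simulated binary response; all names here are illustrative assumptions.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))

fit  <- glm(y ~ x, family = binomial)   # fitted model
null <- glm(y ~ 1, family = binomial)   # intercept-only model

L  <- as.numeric(logLik(fit))           # fitted log-likelihood
L0 <- as.numeric(logLik(null))          # null log-likelihood
k  <- length(coef(fit)) - 1             # number of non-intercept parameters

mcfadden     <- 1 - L / L0
mcfadden_adj <- 1 - (L - k) / L0
cox_snell    <- 1 - exp(2 * (L0 - L) / n)
nagelkerke   <- cox_snell / (1 - exp(2 * L0 / n))  # divide by the Cox-Snell maximum

# Tjur: difference in mean fitted probability between y = 1 and y = 0
p    <- fitted(fit)
tjur <- mean(p[y == 1]) - mean(p[y == 0])

round(c(McFadden = mcfadden, McFaddenAdj = mcfadden_adj,
        CoxSnell = cox_snell, Nagelkerke = nagelkerke, Tjur = tjur), 3)
```

Because the Cox-Snell maximum is below 1, the Nagelkerke value is never smaller than the Cox-Snell value, and the adjusted McFadden value is always smaller than the unadjusted one when k > 0.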

For the Combined row the log-likelihood is the sum of the sub-model log-likelihoods (exploiting the independence of the nested dichotomies), and \(n\) is nrow(x$data) — the full sample size of the polytomous model — not the sum of per-dichotomy observation counts, which would double-count observations that appear in more than one sub-model.
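The Combined-row calculation can be illustrated with two hand-fitted dichotomies. This is a sketch under stated assumptions rather than the package's implementation: the data are simulated, the names `works`, `full`, `n_full`, and `n_sum` are hypothetical, and the Nagelkerke rescaling is assumed to use the same full-sample \(n\).

```r
# Simulated three-category response built from two nested dichotomies.
set.seed(2)
n <- 300
x <- rnorm(n)
works <- rbinom(n, 1, plogis(0.3 + 0.8 * x))        # dichotomy 1: all cases
full  <- ifelse(works == 1,
                rbinom(n, 1, plogis(-0.2 - x)),     # dichotomy 2: workers only
                NA)
d <- data.frame(x, works, full)

# One binary logit per dichotomy; rows with NA response are dropped by glm()
fit1  <- glm(works ~ x, family = binomial, data = d)
fit2  <- glm(full  ~ x, family = binomial, data = d)
null1 <- glm(works ~ 1, family = binomial, data = d)
null2 <- glm(full  ~ 1, family = binomial, data = d)

# Combined log-likelihoods are sums over the independent sub-models
L  <- as.numeric(logLik(fit1))  + as.numeric(logLik(fit2))
L0 <- as.numeric(logLik(null1)) + as.numeric(logLik(null2))

n_full <- nrow(d)                   # full sample size, used for the Combined row
n_sum  <- nobs(fit1) + nobs(fit2)   # double-counts cases that enter both sub-models

cox_snell <- 1 - exp(2 * (L0 - L) / n_full)
combined  <- c(McFadden   = 1 - L / L0,
               CoxSnell   = cox_snell,
               Nagelkerke = cox_snell / (1 - exp(2 * L0 / n_full)))
round(combined, 3)
c(n_full = n_full, n_sum = n_sum)
```

Here `n_sum` exceeds `n_full` because every case with `works == 1` contributes to both sub-models, which is why the full sample size is the appropriate \(n\) for the combined measures.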

A wider range of pseudo-R² measures for logistic-type models (glm, polr, multinom, vglm) is available in DescTools::PseudoR2(), including the Efron (1978) and McKelvey & Zavoina (1975) measures not implemented here. For an accessible overview see https://statisticalhorizons.com/r2logistic/.

References

Cox, D. R., & Snell, E. J. (1989). The Analysis of Binary Data (2nd ed.). Chapman and Hall.

Efron, B. (1978). Regression and ANOVA with zero-one data: Measures of residual variation. Journal of the American Statistical Association, 73(361), 113–121. https://doi.org/10.2307/2286498

Hosmer, D. W., & Lemeshow, S. (2000). Applied Logistic Regression (2nd ed.). Wiley. https://doi.org/10.1002/0471722146

McFadden, D. (1979). Quantitative methods for analysing travel behaviour of individuals: Some recent developments. In D. A. Hensher & P. R. Stopher (Eds.), Behavioural Travel Modelling (pp. 279–318). Croom Helm.

McKelvey, R. D., & Zavoina, W. (1975). A statistical model for the analysis of ordinal level dependent variables. Journal of Mathematical Sociology, 4(1), 103–120. https://doi.org/10.1080/0022250X.1975.9989847

Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78(3), 691–692. https://doi.org/10.1093/biomet/78.3.691

Tjur, T. (2009). Coefficients of determination in logistic regression models — a new proposal: The coefficient of discrimination. The American Statistician, 63(4), 366–372. https://doi.org/10.1198/tast.2009.08210

Author

Michael Friendly

Examples

data("Womenlf", package = "carData")
wlf.nested <- nestedLogit(partic ~ hincome + children,
  logits(work = dichotomy("not.work", c("parttime", "fulltime")),
         full = dichotomy("parttime", "fulltime")),
  data = Womenlf)

# Default: McFadden, CoxSnell, Nagelkerke + AIC
RSQ(wlf.nested)
#> Pseudo R² measures for nestedLogit model:
#>   partic ~ hincome + children 
#> 
#>     model McFadden CoxSnell Nagelkerke      AIC 
#>      work   0.1023   0.1293     0.1743 325.7325 
#>      full   0.2761   0.3085     0.4185 110.4948 
#> ----------------------------------------------- 
#>  Combined   0.1524   0.2517     0.2958 436.2274 

# All measures and all extra columns
RSQ(wlf.nested, which = "ALL", include = "ALL")
#> Error in match.arg(which, choices = c("McFadden", "McFaddenAdj", "CoxSnell",     "Nagelkerke", "Tjur"), several.ok = TRUE): 'arg' should be one of "McFadden", "McFaddenAdj", "CoxSnell", "Nagelkerke", "Tjur"

# Multinomial logit for comparison
if (requireNamespace("nnet", quietly = TRUE)) {
  wlf.multi <- nnet::multinom(partic ~ hincome + children, data = Womenlf,
                              trace = FALSE)
  RSQ(wlf.multi)
}
#> Error in UseMethod("RSQ"): no applicable method for 'RSQ' applied to an object of class "c('multinom', 'nnet')"

# Proportional-odds model for comparison
if (requireNamespace("MASS", quietly = TRUE)) {
  wlf.polr <- MASS::polr(partic ~ hincome + children, data = Womenlf)
  RSQ(wlf.polr)
}
#> Error in UseMethod("RSQ"): no applicable method for 'RSQ' applied to an object of class "polr"