The hospital manpower data, taken from Myers (1990), Table 3.8, are a well-known example of highly collinear data to which ridge regression and various shrinkage and selection methods are often applied.
The data consist of measurements taken at 17 U.S. Naval hospitals; the goal is to predict the monthly man hours required for staffing.
Format
A data frame with 17 observations on the following 6 variables.
Hours
monthly man hours (response variable)
Load
average daily patient load
Xray
monthly X-ray exposures
BedDays
monthly occupied bed days
AreaPop
eligible population in the area in thousands
Stay
average length of patient's stay in days
Source
Raymond H. Myers (1990). Classical and Modern Regression with Applications, 2nd ed., PWS-Kent, pp. 130-133.
Details
Myers (1990) indicates his source was "Procedures and Analysis for Staffing Standards Development: Data/Regression Analysis Handbook", Navy Manpower and Material Analysis Center, San Diego, 1979.
References
Donald R. Jensen and Donald E. Ramirez (2012). Variations on Ridge Traces in Regression, Communications in Statistics - Simulation and Computation, 41 (2), 265-278.
See also
manpower for the same data, and other analyses
Examples
data(Manpower)
mmod <- lm(Hours ~ ., data=Manpower)
vif(mmod)
#>        Load        Xray     BedDays     AreaPop        Stay
#> 9597.570761    7.940593 8933.086501   23.293856    4.279835
# ridge regression models, specified in terms of equivalent df
mridge <- ridge(Hours ~ ., data=Manpower, df=seq(5, 3.75, -.25))
vif(mridge)
#>                     Load     Xray     BedDays   AreaPop     Stay
#> 0.0000000000 9597.570762 7.940593 8933.086501 23.293856 4.279835
#> 0.0002836352 5602.507260 7.927390 5215.176404 19.045017 3.923247
#> 0.0009043689 2438.603928 7.912946 2270.761610 15.667945 3.638481
#> 0.0026667276  634.529233 7.890568  591.831974 13.699280 3.467754
#> 0.0203643899   23.623930 7.715258   23.249737 12.530582 3.312552
#> 0.1364900421    7.336948 6.733109    8.136369  9.838466 2.795880
# univariate ridge trace plots
traceplot(mridge)
traceplot(mridge, X="df")
# bivariate ridge trace plots
plot(mridge, radius=0.25, labels=mridge$df)
pairs(mridge, radius=0.25)
# \donttest{
# 3D views
# ellipsoids for Load, Xray & BedDays are nearly 2D
plot3d(mridge, radius=0.2, labels=mridge$df)
# variables in model selected by AIC & BIC
plot3d(mridge, variables=c(2,3,5), radius=0.2, labels=mridge$df)
# plots in PCA/SVD space
mpridge <- pca(mridge)
traceplot(mpridge, X="df")
biplot(mpridge, radius=0.25)
#> Vector scale factor set to 8774.365
# }
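The variance inflation factors that vif() reports for the least squares fit can also be computed by hand, which makes the source of the collinearity easier to see. The sketch below is illustrative and not part of the original examples. It assumes the data are provided by the genridge package, and it relies on the standard identity that the VIF of predictor j is the j-th diagonal element of the inverse of the predictor correlation matrix. The names predictors, R and vif_manual are introduced here only for illustration.

```r
# Sketch: reproduce the OLS variance inflation factors by hand.
# Assumption: the Manpower data come from the genridge package.
library(genridge)
data(Manpower)

# Predictors are every column except the response, Hours
predictors <- setdiff(names(Manpower), "Hours")

# VIF_j is the j-th diagonal element of the inverse
# of the predictor correlation matrix
R <- cor(Manpower[, predictors])
vif_manual <- diag(solve(R))
vif_manual

# Pairwise correlation between the two predictors with the largest VIFs
R["Load", "BedDays"]
```

If the assumptions hold, vif_manual should agree with vif(mmod) up to numerical error. The Load and BedDays correlation is worth inspecting because those two predictors carry by far the largest VIFs in the output above.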