
The hospital manpower data, taken from Myers (1990), Table 3.8, are a well-known example of highly collinear data to which ridge regression and various shrinkage and selection methods are often applied.

The data consist of measurements taken at 17 U.S. Naval Hospitals; the goal is to predict the monthly man hours required for staffing purposes.

Format

A data frame with 17 observations on the following 6 variables.

Hours

monthly man hours (response variable)

Load

average daily patient load

Xray

monthly X-ray exposures

BedDays

monthly occupied bed days

AreaPop

eligible population in the area, in thousands

Stay

average length of patient's stay in days

Source

Raymond H. Myers (1990). Classical and Modern Regression with Applications, 2nd ed., PWS-Kent, pp. 130-133.

Details

Myers (1990) indicates his source was "Procedures and Analysis for Staffing Standards Development: Data/Regression Analysis Handbook", Navy Manpower and Material Analysis Center, San Diego, 1979.

References

Donald R. Jensen and Donald E. Ramirez (2012). Variations on Ridge Traces in Regression, Communications in Statistics - Simulation and Computation, 41 (2), 265-278.

See also

manpower for the same data, and other analyses

Examples


library(genridge)  # provides the Manpower data and ridge()
data(Manpower)
mmod <- lm(Hours ~ ., data=Manpower)
vif(mmod)
#>        Load        Xray     BedDays     AreaPop        Stay 
#> 9597.570761    7.940593 8933.086501   23.293856    4.279835 
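# A minimal sketch (not part of the package): the same VIFs can be read off
# the diagonal of the inverse correlation matrix of the predictors.
# vif_from_cor() is a hypothetical helper name introduced here for illustration.
vif_from_cor <- function(X) diag(solve(cor(X)))
# drop the response, then compare with vif(mmod) above
vif_from_cor(Manpower[, names(Manpower) != "Hours"])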
# ridge regression models, specified in terms of equivalent df
mridge <- ridge(Hours ~ ., data=Manpower, df=seq(5, 3.75, -.25))
vif(mridge)
#>                     Load     Xray     BedDays   AreaPop     Stay
#> 0.0000000000 9597.570762 7.940593 8933.086501 23.293856 4.279835
#> 0.0002836352 5602.507260 7.927390 5215.176404 19.045017 3.923247
#> 0.0009043689 2438.603928 7.912946 2270.761610 15.667945 3.638481
#> 0.0026667276  634.529233 7.890568  591.831974 13.699280 3.467754
#> 0.0203643899   23.623930 7.715258   23.249737 12.530582 3.312552
#> 0.1364900421    7.336948 6.733109    8.136369  9.838466 2.795880
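
# A rough sketch of what "equivalent df" means, assuming the usual definition
# df(k) = sum_j d_j^2 / (d_j^2 + k), with d_j the singular values of the
# centered and scaled predictor matrix. ridge_df() is a hypothetical helper,
# and its scaling convention is an assumption that may differ from the one
# ridge() uses, so the values need not reproduce mridge$df exactly.
ridge_df <- function(X, lambda) {
  d2 <- svd(scale(as.matrix(X)))$d^2   # squared singular values
  sapply(lambda, function(k) sum(d2 / (d2 + k)))
}
# k = 0 gives the full p = 5 df; larger k shrinks df below 5
ridge_df(Manpower[, names(Manpower) != "Hours"], lambda = c(0, 0.01, 0.1))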

# univariate ridge trace plots
traceplot(mridge)

traceplot(mridge, X="df")


# bivariate ridge trace plots
plot(mridge, radius=0.25, labels=mridge$df)

pairs(mridge, radius=0.25)


# \donttest{
# 3D views
# ellipsoids for Load, Xray & BedDays are nearly 2D
plot3d(mridge, radius=0.2, labels=mridge$df)
# variables in model selected by AIC & BIC
plot3d(mridge, variables=c(2,3,5), radius=0.2, labels=mridge$df)

# plots in PCA/SVD space
mpridge <- pca(mridge)
traceplot(mpridge, X="df")

biplot(mpridge, radius=0.25)

#> Vector scale factor set to  8774.365 
# }