
Data on amyotrophic lateral sclerosis (ALS, also known as Lou Gehrig's disease) from Section 17.2 of Efron and Hastie (2016). There are 1822 observations on individuals with ALS. The goal is to predict dFRS, the rate of progression of a functional rating score, using 369 predictors based on measurements (and derivatives of these) obtained from patient visits.

Format

A data frame with 1822 rows and 371 variables. The key variables are testset (logical indicator for training/test split) and dFRS (response: rate of progression of the ALS functional rating score). The 369 predictor variables include:

  • Demographics: Age, Sex.Male, Sex.Female, and race indicators (Race...Caucasian, Race...Asian, etc.)

  • Family history of neurological diseases in relatives (e.g., Father, Mother, Brother, Sister)

  • Neurological disease indicators (e.g., Neurological.Disease.ALS, Neurological.Disease.PARKINSON.S.DISEASE)

  • Site of onset (Site.of.Onset.Onset..Bulbar, Site.of.Onset.Onset..Limb)

  • Symptoms (Symptom.Atrophy, Symptom.Cramps, Symptom.Fasciculations, Symptom.Speech, etc.)

  • Study arm indicators (Study.Arm.ACTIVE, Study.Arm.PLACEBO)

  • Clinical measurements with summary statistics (first, last, min, max, mean, sd, slope): ALSFRS scores, blood pressure, forced/slow vital capacity (fvc.liters, svc.liters), respiratory rate, weight, height

  • ALSFRS subscale items: climbing.stairs, cutting, dressing, handwriting, salivation, speech, swallowing, turning, walking
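
As a rough illustration of how the testset indicator and the predictor columns might be used, the sketch below splits a data frame into training and test parts. It runs on a small made-up stand-in (`toy`, a hypothetical name) that borrows a few of the column names listed above; the values are invented. It also assumes that testset = TRUE marks test rows, as the name suggests. With the package loaded, `als` would take the place of `toy`.

```r
# Tiny stand-in with the same layout as als (values invented for illustration).
toy <- data.frame(
  testset  = c(TRUE, FALSE, FALSE, TRUE, FALSE),
  dFRS     = c(-0.9, -0.1, -0.6, -0.3, -1.1),
  Age      = c(38L, 72L, 46L, 66L, 70L),
  Sex.Male = c(0L, 0L, 1L, 0L, 0L)
)

# Assumption: rows with testset == TRUE form the test set; the rest are training rows.
train <- toy[!toy$testset, ]
test  <- toy[toy$testset, ]

# Predictors are all columns other than the split indicator and the response.
predictors <- setdiff(names(toy), c("testset", "dFRS"))

# Numeric design matrix and response for the training rows.
x_train <- as.matrix(train[, predictors, drop = FALSE])
y_train <- train$dFRS
```

Selecting predictors with `setdiff()` rather than by position keeps the split independent of column order.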

Details

These data were kindly provided by Lester Mackey and Lilly Fang, who won the DREAM challenge prediction prize in 2012 (Kuffner et al., 2015). The data include some additional variables that they created. Their winning entry used Bayesian trees, a method not too different from random forests.

References

Efron, B. and Hastie, T. (2016). Computer Age Statistical Inference. Cambridge University Press, Section 17.2.

Examples

data(als)
str(als)
#> 'data.frame':	1822 obs. of  371 variables:
#>  $ testset                                 : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
#>  $ dFRS                                    : num  -0.915 -0.108 -0.557 -0.296 -1.087 ...
#>  $ Onset.Delta                             : int  -1181 -1324 -1061 -1736 -354 -500 -1091 -217 -820 -1037 ...
#>  $ Symptom.Speech                          : int  1 0 0 0 1 1 0 0 0 0 ...
#>  $ Symptom.WEAKNESS                        : int  0 1 0 1 0 1 0 1 1 1 ...
#>  $ Symptom.OTHER                           : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Symptom.Swallowing                      : int  1 0 0 0 0 0 0 0 0 0 ...
#>  $ Symptom.GAIT_CHANGES                    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Symptom.Atrophy                         : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Symptom.Cramps                          : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Symptom.Fasciculations                  : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Symptom.SENSORY_CHANGES                 : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Symptom.Stiffness                       : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Symptom..                               : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Site.of.Onset.Onset..Bulbar             : int  1 0 0 0 1 1 0 0 1 1 ...
#>  $ Site.of.Onset.Onset..Limb               : int  0 1 1 1 0 0 1 1 0 0 ...
#>  $ Site.of.Onset.Onset..Limb.and.Bulbar    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Race...Asian                            : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Race...Black.African.American           : int  0 0 0 0 0 0 1 0 0 0 ...
#>  $ Race...Caucasian                        : int  1 1 1 1 1 1 0 0 1 1 ...
#>  $ Race...Other                            : int  0 0 0 0 0 0 0 1 0 0 ...
#>  $ Age                                     : int  38 72 46 66 70 37 41 70 67 71 ...
#>  $ Sex.Female                              : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Sex.Male                                : int  0 0 1 0 0 0 1 1 0 0 ...
#>  $ Aunt                                    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Aunt..Maternal.                         : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Cousin                                  : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Father                                  : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Grandfather..Maternal.                  : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Grandmother                             : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Grandmother..Maternal.                  : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Mother                                  : int  0 0 0 1 0 0 0 0 0 0 ...
#>  $ Uncle                                   : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Uncle..Maternal.                        : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Uncle..Paternal.                        : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Son                                     : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Daughter                                : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Sister                                  : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Brother                                 : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Family                                  : int  0 0 0 1 0 0 0 0 0 0 ...
#>  $ Neurological.Disease.OTHER              : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Neurological.Disease.STROKE.NOS         : int  0 0 0 1 0 0 0 0 0 0 ...
#>  $ Neurological.Disease.DEMENTIA.NOS       : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Neurological.Disease.PARKINSON.S.DISEASE: int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Neurological.Disease.DAT                : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Neurological.Disease.ALS                : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Neurological.Disease.BRAIN.TUMOR        : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Neurological.Disease.STROKE.ISCHEMIC    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Neurological.Disease.STROKE.HEMORRHAGIC : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ Study.Arm.PLACEBO                       : int  0 0 1 0 0 0 1 0 0 0 ...
#>  $ Study.Arm.ACTIVE                        : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ max.alsfrs.score                        : int  24 28 35 30 33 23 26 29 35 33 ...
#>  $ min.alsfrs.score                        : int  19 26 30 29 29 14 23 28 33 31 ...
#>  $ last.alsfrs.score                       : int  21 26 30 29 33 14 25 28 33 31 ...
#>  $ mean.alsfrs.score                       : num  21.2 27.3 31.8 29.5 31 ...
#>  $ num.alsfrs.score.visits                 : int  4 3 4 4 4 4 4 3 4 4 ...
#>  $ sum.alsfrs.score                        : int  85 82 127 118 124 78 100 86 138 128 ...
#>  $ first.alsfrs.score.date                 : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ last.alsfrs.score.date                  : int  70 67 78 90 84 91 85 35 91 84 ...
#>  $ meansquares.alsfrs.score                : num  455 748 1012 870 963 ...
#>  $ sd.alsfrs.score                         : num  1.785 0.943 1.92 0.5 1.414 ...
#>  $ alsfrs.score.slope                      : num  0 -0.909 -1.951 -0.338 0.725 ...
#>  $ lessthan2.alsfrs.score                  : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ no.alsfrs.score.data                    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ max.speech                              : int  2 4 4 4 3 2 4 4 2 1 ...
#>  $ min.speech                              : int  2 4 4 4 2 1 4 4 2 0 ...
#>  $ last.speech                             : int  2 4 4 4 2 1 4 4 2 0 ...
#>  $ mean.speech                             : num  2 4 4 4 2.25 1.75 4 4 2 0.75 ...
#>  $ sum.speech                              : int  8 12 16 16 9 7 16 12 8 3 ...
#>  $ meansquares.speech                      : num  4 16 16 16 5.25 3.25 16 16 4 0.75 ...
#>  $ sd.speech                               : num  0 0 0 0 0.433 ...
#>  $ speech.slope                            : num  0 0 0 0 -0.362 ...
#>  $ max.salivation                          : int  4 4 4 4 3 3 4 4 3 2 ...
#>  $ min.salivation                          : int  2 4 3 4 2 1 4 3 3 1 ...
#>  $ last.salivation                         : int  3 4 3 4 3 2 4 4 3 1 ...
#>  $ mean.salivation                         : num  3 4 3.75 4 2.5 ...
#>  $ sum.salivation                          : int  12 12 15 16 10 8 16 10 12 6 ...
#>  $ meansquares.salivation                  : num  9.5 16 14.2 16 6.5 ...
#>  $ sd.salivation                           : num  0.707 0 0.433 0 0.5 ...
#>  $ salivation.slope                        : num  0 0 -0.39 0 0.362 ...
#>  $ max.swallowing                          : int  4 4 4 4 3 3 4 4 3 3 ...
#>  $ min.swallowing                          : int  4 4 4 4 2 2 4 3 3 2 ...
#>  $ last.swallowing                         : int  4 4 4 4 3 2 4 4 3 2 ...
#>  $ mean.swallowing                         : num  4 4 4 4 2.75 ...
#>  $ sum.swallowing                          : int  16 12 16 16 11 11 16 10 12 9 ...
#>  $ meansquares.swallowing                  : num  16 16 16 16 7.75 ...
#>  $ sd.swallowing                           : num  0 0 0 0 0.433 ...
#>  $ swallowing.slope                        : num  0 0 0 0 0 ...
#>  $ max.handwriting                         : int  0 4 3 3 4 2 0 3 4 4 ...
#>  $ min.handwriting                         : int  0 4 2 3 4 1 0 3 4 4 ...
#>  $ last.handwriting                        : int  0 4 2 3 4 1 0 3 4 4 ...
#>  $ mean.handwriting                        : num  0 4 2.25 3 4 1.25 0 3 4 4 ...
#>  $ sum.handwriting                         : int  0 12 9 12 16 5 0 9 16 16 ...
#>  $ meansquares.handwriting                 : num  0 16 5.25 9 16 1.75 0 9 16 16 ...
#>  $ sd.handwriting                          : num  0 0 0.433 0 0 ...
#>  $ handwriting.slope                       : num  0 0 -0.39 0 0 ...
#>  $ max.cutting                             : int  1 4 3 3 4 2 0 3 4 4 ...
#>  $ min.cutting                             : int  1 3 2 3 4 1 0 2 3 4 ...
#>  $ last.cutting                            : int  1 3 2 3 4 1 0 2 3 4 ...
#>   [list output truncated]
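
The summary columns for each measurement appear to be related in a simple way. The sketch below checks this on the first five ALSFRS values copied from the `str()` output above; it does not use the full data. The variable names are illustrative, and reading sd as a population (divide-by-n) standard deviation is an assumption inferred from these printed values only.

```r
# First five values of the ALSFRS summary columns, copied from the str() output.
sum_score  <- c(85, 82, 127, 118, 124)           # sum.alsfrs.score
num_visits <- c(4, 3, 4, 4, 4)                   # num.alsfrs.score.visits
mean_score <- c(21.2, 27.3, 31.8, 29.5, 31)      # mean.alsfrs.score (rounded by str)
sd_score   <- c(1.785, 0.943, 1.92, 0.5, 1.414)  # sd.alsfrs.score
ms_score   <- c(455, 748, 1012, 870, 963)        # meansquares.alsfrs.score (rounded)

# Mean recomputed from the sum and the number of visits.
mean_check <- sum_score / num_visits

# Mean of squares recomputed as variance plus squared mean
# (assumption: sd is the population standard deviation).
ms_check <- sd_score^2 + mean_check^2

# Both agree with the printed columns up to the rounding used by str().
mean_ok <- all(abs(mean_check - mean_score) < 0.06)
ms_ok   <- all(abs(ms_check - ms_score) < 0.6)
```

If both checks hold on the full data as well, some of these columns are redundant given the others, which may matter for methods that are sensitive to collinear predictors.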