Course Description

This course is a broad, applied introduction to the statistical analysis of categorical (or discrete) data, such as counts, proportions, nominal variables, ordinal variables, discrete variables with few values, and continuous variables grouped into a small number of categories.

  • The course begins with methods designed for cross-classified tables of counts (i.e., contingency tables), using simple chi-square-based methods.

  • It progresses to generalized linear models, for which log-linear models provide a natural extension of simple chi-square-based methods.

  • This framework is then extended to include logit and logistic regression models for binary responses, and generalizations of these models for polytomous (multicategory) outcomes.

Throughout, there is a strong emphasis on associated graphical methods for visualizing categorical data and checking model assumptions. Lab sessions will familiarize students with using R to carry out these analyses.
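As a small illustration of the chi-square-based methods mentioned above, the following sketch computes the Pearson chi-square and likelihood-ratio statistics for independence in a two-way table. It is written in plain Python so that it is self-contained (the course labs use R); the table counts and function names are made up for illustration and are not taken from the course materials.

```python
import math

def expected_counts(table):
    """Expected counts under independence: (row total * column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chisq(table):
    """Pearson chi-square: sum of (observed - expected)^2 / expected."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))

def likelihood_ratio_g2(table):
    """Likelihood-ratio statistic G^2 = 2 * sum(observed * log(observed / expected))."""
    exp = expected_counts(table)
    return 2 * sum(o * math.log(o / e)
                   for obs_row, exp_row in zip(table, exp)
                   for o, e in zip(obs_row, exp_row)
                   if o > 0)  # cells with zero counts contribute nothing

# Hypothetical 2 x 2 table of counts (illustrative values only)
table = [[20, 30],
         [30, 20]]

df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1) degrees of freedom
x2 = pearson_chisq(table)
g2 = likelihood_ratio_g2(table)
print(f"Pearson X^2 = {x2:.3f}, G^2 = {g2:.3f}, df = {df}")
```

Both statistics compare observed counts with the counts expected under independence. The same expected counts are the fitted values of the independence log-linear model, which is one way to see log-linear models as an extension of these simple tests.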

Course and lecture topics are listed below.

Overview & Introduction

Topics:

  • Course outline, books, R
  • What is categorical data?
  • Categorical data analysis: methods & models
  • Graphical methods

Lecture notes

Discrete Distributions

Topics:

  • Discrete distributions: Basic ideas
  • Fitting discrete distributions
  • Graphical methods: Rootograms, Ord plots
  • Robust distribution plots
  • Looking ahead

Lecture notes

Two-way Tables

Topics:

  • Overview: \(2 \times 2\), \(r \times c\), ordered tables
  • Independence
  • Visualizing association
  • Ordinal factors
  • Square tables: Observer agreement
  • Looking ahead: models

Lecture notes

Loglinear models & mosaic displays

Topics:

  • Mosaic displays: Basic ideas
  • Loglinear models
  • Model-based methods: Fitting & graphing
  • Mosaic displays: Visual fitting
  • Survival on the Titanic
  • Sequential plots & models

Lecture notes

Correspondence Analysis

Topics:

  • CA: Basic ideas
  • Singular value decomposition (SVD)
  • Optimal category scores
  • Multiway tables: MCA

Lecture notes

Logistic regression

Topics:

  • Model-based methods: Overview
  • Logistic regression: one predictor, multiple predictors, fitting
  • Visualizing logistic regression
  • Effect plots
  • Case study: Racial profiling
  • Model diagnostics

Lecture notes

Logistic regression: Extensions

Topics:

  • Case study: Survival in the Donner party
  • Polytomous response models
    • Proportional odds model
    • Nested dichotomies
    • Multinomial models

Lecture notes

Extending loglinear models

Topics:

  • Logit models for response variables
  • Models for ordinal factors
  • RC models, estimating row/column scores
  • Models for square tables
  • More complex models

Lecture notes

GLMs for count data

Topics:

  • Generalized linear models: Families & links
  • GLMs for count data
  • Model diagnostics
  • Overdispersion
  • Excess zeros

Lecture notes

 

Copyright © 2018 Michael Friendly. All rights reserved.

friendly AT yorku DOT ca

ORCID iD: orcid.org/0000-0002-3237-0941