This course is designed as a broad, applied introduction to the statistical analysis of categorical (or discrete) data, such as counts, proportions, nominal variables, ordinal variables, discrete variables with few values, continuous variables grouped into a small number of categories, etc.

The course begins with methods designed for cross-classified tables of counts (i.e., contingency tables), using simple chi-square-based methods.
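As a rough illustration of what a "simple chi-square-based method" computes, the following minimal sketch evaluates the Pearson chi-square statistic for independence in a two-way table. It is written in plain Python purely for illustration (the course labs use R). The function name `pearson_chi_square` and the counts are made up for this example and do not come from the course materials.

```python
def pearson_chi_square(table):
    """Pearson chi-square test of independence for an r x c table of counts.

    Returns (statistic, degrees of freedom, expected counts).
    Hypothetical helper for illustration only.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: (row total * column total) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over all cells
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Made-up 2 x 2 table of counts (not real data)
counts = [[20, 30],
          [30, 20]]

stat, df, expected = pearson_chi_square(counts)
print(f"chi-square = {stat:.2f} on {df} df")
```

A large statistic relative to its degrees of freedom suggests that the row and column variables are associated. In practice one would compare it with the chi-square reference distribution using statistical software.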

It progresses to generalized linear models, within which log-linear models provide a natural extension of simple chi-square-based methods.

This framework is then extended to include logit and logistic regression models for binary responses and generalizations of these models for polytomous (multicategory) outcomes.
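For orientation, the logistic regression model for a binary response can be written as follows. This is the usual textbook notation, not notation taken from the course materials, with \(\pi = \Pr(Y = 1 \mid x_1, \dots, x_p)\):

\[
\operatorname{logit}(\pi) = \log\frac{\pi}{1 - \pi} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p
\]

Each coefficient \(\beta_j\) is the change in the log odds of the response for a one-unit increase in \(x_j\), holding the other predictors fixed.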

Throughout, there is a strong emphasis on associated
**graphical methods** for visualizing categorical data,
checking model assumptions, etc. Lab sessions will familiarize the
student with the use of R for carrying out these analyses.

Course and lecture topics are listed below, in a visual overview.

- See the Course schedule for details of readings, lecture notes, R scripts, etc.
- For students, see Assignments and Evaluation

- Course outline, books, R
- What is categorical data?
- Categorical data analysis: methods & models
- Graphical methods

- Discrete distributions: Basic ideas
- Fitting discrete distributions
- Graphical methods: Rootograms, Ord plots
- Robust distribution plots
- Looking ahead

- Overview: \(2 \times 2\), \(r \times c\), ordered tables
- Independence
- Visualizing association
- Ordinal factors
- Square tables: Observer agreement
- Looking ahead: models

- Mosaic displays: Basic ideas
- Loglinear models
- Model-based methods: Fitting & graphing
- Mosaic displays: Visual fitting
- Survival on the *Titanic*
- Sequential plots & models

- CA: Basic ideas
- Singular value decomposition (SVD)
- Optimal category scores
- Multiway tables: MCA

- Model-based methods: Overview
- Logistic regression: one predictor, multiple predictors, fitting
- Visualizing logistic regression
- Effect plots
- Case study: Racial profiling
- Model diagnostics

- Case study: Survival in the Donner party
- Polytomous response models
- Proportional odds model
- Nested dichotomies
- Multinomial models

- Logit models for response variables
- Models for ordinal factors
- RC models, estimating row/col scores
- Models for square tables
- More complex models

- Generalized linear models: Families & links
- GLMs for count data
- Model diagnostics
- Overdispersion
- Excess zeros

Copyright © 2018 Michael Friendly. All rights reserved.

*friendly AT yorku DOT ca*