## Course Description

This course is a broad, applied introduction to the statistical analysis of categorical (or discrete) data, such as counts, proportions, nominal variables, ordinal variables, discrete variables with few values, and continuous variables grouped into a small number of categories.

• The course begins with methods designed for cross-classified tables of counts (i.e., contingency tables), using simple chi-square-based methods.

• It progresses to generalized linear models, for which log-linear models provide a natural extension of simple chi-square-based methods.

• This framework is then extended to include logit and logistic regression models for binary responses, and generalizations of these models for polytomous (multicategory) outcomes.
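
As a brief sketch of how these three steps connect (the notation is an assumption made for illustration, not taken from the course materials): let $$n_{ij}$$ be the observed counts in a two-way table of factors $$A$$ and $$B$$, $$m_{ij}$$ the expected counts under independence, and $$\pi$$ the probability of a binary response given a predictor $$x$$. The Pearson chi-square statistic, the equivalent loglinear model of independence, and the logit model can then be written as

$$
X^2 = \sum_{i,j} \frac{(n_{ij} - m_{ij})^2}{m_{ij}}, \qquad m_{ij} = \frac{n_{i+}\, n_{+j}}{n_{++}}
$$

$$
\log m_{ij} = \mu + \lambda_i^{A} + \lambda_j^{B}
$$

$$
\operatorname{logit}(\pi) = \log \frac{\pi}{1 - \pi} = \beta_0 + \beta_1 x
$$

Taking logs of the independence formula for $$m_{ij}$$ gives a sum of a constant, a row term, and a column term, which is why the chi-square test of independence can be viewed as the simplest loglinear model.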

Throughout, there is a strong emphasis on associated graphical methods for visualizing categorical data and checking model assumptions. Lab sessions will familiarize students with using R to carry out these analyses.

Course and lecture topics are listed below.

## Overview & Introduction

#### Topics:

• Course outline, books, R
• What is categorical data?
• Categorical data analysis: methods & models
• Graphical methods

## Discrete Distributions

#### Topics:

• Discrete distributions: Basic ideas
• Fitting discrete distributions
• Graphical methods: Rootograms, Ord plots
• Robust distribution plots

## Two-way Tables

#### Topics:

• Overview: $$2 \times 2$$, $$r \times c$$, ordered tables
• Independence
• Visualizing association
• Ordinal factors
• Square tables: Observer agreement

## Loglinear models & mosaic displays

#### Topics:

• Mosaic displays: Basic ideas
• Loglinear models
• Model-based methods: Fitting & graphing
• Mosaic displays: Visual fitting
• Survival on the Titanic
• Sequential plots & models

## Correspondence Analysis

#### Topics:

• CA: Basic ideas
• Singular value decomposition (SVD)
• Optimal category scores
• Multiway tables: MCA

## Logistic regression

#### Topics:

• Model-based methods: Overview
• Logistic regression: one predictor, multiple predictors, fitting
• Visualizing logistic regression
• Effect plots
• Case study: Racial profiling
• Model diagnostics

## Logistic regression: Extensions

#### Topics:

• Case study: Survival in the Donner party
• Polytomous response models
• Proportional odds model
• Nested dichotomies
• Multinomial models

## Extending loglinear models

#### Topics:

• Logit models for response variables
• Models for ordinal factors
• RC models, estimating row/col scores
• Models for square tables
• More complex models

## GLMs for count data

#### Topics:

• Generalized linear models: Families & links
• GLMs for count data
• Model diagnostics
• Overdispersion
• Excess zeros