Syllabus

0.1 Course Description

A continuation of Statistics 113 intended for students in the physical, social, or behavioral sciences. Topics include simple and multiple linear regression, model diagnostics and testing, residual analysis, transformations, indicator variables, variable selection techniques, logistic regression, and analysis of variance.

Much of this course will focus on the Model and Communicate portions of the “Data Analysis Life Cycle” with an emphasis on becoming familiar with different types of statistical models.

Those interested in learning more about the other parts of the cycle (especially the Tidy part) are encouraged to take STAT 234 - Introduction to Data Science.

0.2 Software

We will mainly be using the R Programming Language. Written by and for statisticians, R is an excellent tool for improving your knowledge of statistical modeling. (FYI, other popular languages that are used for data analysis include Python and SQL.)

0.2.1 Accessing R

SLU has an RStudio server available at http://rstudio.stlawu.local:8787, accessible with your SLU credentials. This server can also be accessed from off campus through VPN. You are encouraged to set up VPN on your personal machines if you tend to do a lot of work off campus.

0.3 Time and Location

  • MWF 10:30 am - 11:30 am
  • Valentine Hall 124

0.4 Instructor

0.5 Course Materials

  • Text (Required): Stat2: Modeling with Regression and ANOVA (2nd ed.) by Cannon et al.

    • The PQRC and Science Library both have copies available to borrow.
    • Solutions to odd-numbered exercises are also available.
  • Canvas: Canvas will be used mainly for displaying aggregate grades, submitting work (e.g., exercise sets), and as a repository for a few useful links.

  • T drive: Most course materials will be posted on the T drive with the intent that we will import them into RStudio.

  • Calculator, Three-ring binder, and hole punch (recommended): You will receive a lot of handouts in this class. It will be very beneficial for you to keep these well organized. While you will typically have access to the computer, you may find having a calculator nearby useful. A graphing calculator is not necessary for this course (but if you have one that’s fine). While a cell phone is an acceptable substitute for a calculator for class examples, you will not be able to use one for quizzes or exams.

  • ChatGPT: Ha ha…just kidding! ChatGPT has been shown to be quite bad at generating correct solutions for programming (even though they look good). In fact, the leading source of programming help, Stack Overflow, has already banned the use of ChatGPT-generated text on their site. (In other words, don’t trust it to provide much help. Instead, learn to sift through online examples on your own.)

0.5.1 Prerequisite

Although R is a statistical programming language (which for some people can initially be intimidating), no prior programming experience is expected. However, knowledge of basic statistics (such as that covered in STAT 113) is required.

0.5.2 Attendance

Unless ill or otherwise instructed by a health official, students are expected to attend class. The material for each class builds on the previous day’s material, and it becomes increasingly difficult to catch up as you fall further behind. If you are ill (for whatever reason) and cannot attend class, I appreciate a brief email letting me know beforehand. Regardless of your reason, if you do miss a class, it is your responsibility to get the information you missed before the next class.

0.5.2.1 About attendance and timeliness in a computation-based course

For many of you, this will be the first computation-based course you have taken. Being in class and on time is crucial for your learning in this course. We will spend the vast majority of our time in RStudio, and many of your course notes will be in the form of R Markdown files. Absences and tardiness will leave holes in your notes, as well as in your understanding of course concepts. If you must miss a class, you are expected to get the day’s R Markdown files from a classmate before the next class.

0.6 Assignments

0.6.1 Homework Exercises

Homework exercises will be assigned each week and are intended to help you practice the material. I will normally assign two types of homework problems: Core exercises and Extra exercises.

Core problems will be turned in electronically via Canvas. Details on deadlines for exercises will be forthcoming. These exercises are graded for completion only. While you do not have to get a perfect score on these problems, I expect that you have attempted, and provided an answer for, each part of the core homework questions.

Extra problems are provided for those who would like more practice (e.g., when preparing for quizzes and/or exams). Although they are assigned at the same time as the core problems, you do not need to turn them in for credit.

You are strongly encouraged to attempt the homework as soon as it is assigned. A solutions manual to the odd-numbered book problems is available in the PQRC and Science Library. Since homework is for you to do outside of class, I will not answer any questions about it during class. Instead, feel free to ask me outside of class (such as during office hours).

0.6.2 Quizzes

Most Mondays we will have a quiz, which will be announced during the preceding class.

Make-up quizzes will not be given unless your absence is due to a documentable University-related reason (e.g., isolation due to COVID, traveling with a sports team, or a class field trip). At the end of the semester, your lowest quiz score will be dropped.

While most quizzes will be in-class, I will occasionally assign a take-home quiz. If I do, it will be due at the beginning of the next class period.

0.6.3 Projects

There will be several projects of varying sizes assigned throughout the semester. More details will be forthcoming; however, I will expect that these assignments be typed in R Markdown and submitted through Canvas, unless otherwise specified. Some of the topics required for projects may not be directly covered in lecture.

0.6.4 Exams

We will have two in-class exams and a final. Each exam will have an in-class portion and a (smaller) take-home portion. Further details on the topics will be given closer to the exam dates.

In-class portions of the first two exams will be held in the evening.

The first exam will be Monday, February 27 from 7pm - 9pm. The take-home portion will be due by the start of class on Wednesday, March 1.

The second exam will be Monday, April 17 from 7pm - 9pm. The take-home portion will be due by the start of class on Wednesday, April 19.

Make-ups will not be given for exams unless your absence was cleared with me in advance, and only documentable University-related absences will be excused. After exams are returned, you have two weeks to raise any questions about the grading; after that, the grades are final.

The final exam will be cumulative and is scheduled by the registrar for Tuesday, May 9 from 8:30 – 11:30am. You can expect the take-home portion to be due by approximately noon on Wednesday, May 10.

0.6.5 Late Assignments

  • Exercises

You have a 24-hour grace period after the deadline during which exercises may still be turned in for up to half credit. After that, no credit will be given.

  • Projects

You are allowed one no-penalty 24-hour extension on a project this semester. You must notify me before the due date that you plan to use your extension. Once you have used your extension, any other late projects will receive a 25-percentage-point penalty if submitted up to 24 hours past the original deadline. No additional extensions beyond 24 hours will be given.

  • Quizzes

As noted earlier, make-up quizzes will not be given unless your absence is due to a documentable University-related reason. However, recall that you do get to drop your lowest quiz at the end of the semester. Further, no extensions will be granted for take-home quizzes.

  • Exams

Make-up exams will not be given unless your absence was cleared with me in advance. Further, only documentable University-related absences will be excused. No extensions will be given on take-home portions, so plan accordingly.

0.7 Grading

Your percentage grade for the course will be determined by your performance in each assignment category, using the following weighted average.

  • Quizzes: 15%
  • Exam 1: 20%
  • Exam 2: 20%
  • Exercises: 5%
  • Projects: 15%
  • Final Exam: 25%
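
As a rough illustration of how the weighted average works, the sketch below combines one score per category using the weights listed above. It is written in Python purely for illustration (the course itself uses R), and the function name, category keys, and example scores are all hypothetical, not part of any official grade calculation.

```python
# Category weights copied from the list above; they sum to 1.0.
WEIGHTS = {
    "quizzes": 0.15,
    "exam_1": 0.20,
    "exam_2": 0.20,
    "exercises": 0.05,
    "projects": 0.15,
    "final_exam": 0.25,
}


def course_percentage(scores):
    """Weighted average of category scores, each a fraction between 0 and 1."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for an imaginary student (illustration only).
example_scores = {
    "quizzes": 0.90,
    "exam_1": 0.80,
    "exam_2": 0.85,
    "exercises": 1.00,
    "projects": 0.90,
    "final_exam": 0.80,
}

print(round(course_percentage(example_scores), 4))
```

For these made-up scores the terms are 0.135 + 0.16 + 0.17 + 0.05 + 0.135 + 0.20, so the overall percentage is about 0.85. Note how little the 5% exercise category moves the total compared with the exams.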

0.7.1 Tentative Grade Scale

A rough grade scale for the class is as follows. I reserve the right to lower this scale without informing the class if I deem it necessary.

Score Grade
0.95 4.00
0.92 3.75
0.89 3.50
0.86 3.25
0.83 3.00
0.80 2.75
0.77 2.50
0.74 2.25
0.71 2.00
0.68 1.75
0.65 1.50
0.62 1.25
0.60 1.00
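
One way to read the table is as a set of minimum cutoffs: your grade is the one beside the highest score you meet or exceed. The Python sketch below encodes that reading. Both the cutoff interpretation and the 0.0 returned below 0.60 are assumptions, since the table does not list a grade for scores under 0.60, and the scale itself is tentative.

```python
# (minimum score, grade) pairs copied from the tentative scale, highest first.
GRADE_SCALE = [
    (0.95, 4.00), (0.92, 3.75), (0.89, 3.50), (0.86, 3.25),
    (0.83, 3.00), (0.80, 2.75), (0.77, 2.50), (0.74, 2.25),
    (0.71, 2.00), (0.68, 1.75), (0.65, 1.50), (0.62, 1.25),
    (0.60, 1.00),
]


def grade_for(score):
    """Return the grade for the first (highest) cutoff that the score meets."""
    for cutoff, grade in GRADE_SCALE:
        if score >= cutoff:
            return grade
    # Assumption: the table lists nothing below 0.60, so treat it as 0.0.
    return 0.0


print(grade_for(0.85))  # meets the 0.83 cutoff but not the 0.86 cutoff
```

Under this reading, a course percentage of 0.85 falls between the 0.83 and 0.86 rows and so maps to a 3.00.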

0.7.2 Pass/Fail

Pass/Fail is available to eligible students in this course. A passing grade is equivalent to a 1.0 or higher. According to University policy, to be considered eligible, you must not be a declared major or minor in a field where STAT 213 is either required (e.g., Statistics, Data Science, Math-Envs, or Math-Econ) or will be counted as an elective (e.g., Mathematics).

0.8 Tentative List of Topics

Date DoW Meeting Tentative Topic
18-Jan Wed 1 Stat 113 Review, Intro to R/RStudio
20-Jan Fri 2 Intro to Modeling
23-Jan Mon 3 Intro to Simple Linear Regression
25-Jan Wed 4 Intro to Simple Linear Regression
27-Jan Fri 5 SLR Model Assumptions, Assessing Model Fit
30-Jan Mon 6 Transformations
1-Feb Wed 7 Outliers and Influence
3-Feb Fri 8 Outliers and Influence
6-Feb Mon 9 Inference for Regression Parameters
8-Feb Wed 10 Inference for Regression Parameters
10-Feb Fri No Class - Mid Winter Break
13-Feb Mon 11 ANOVA for Regression
15-Feb Wed 12 ANOVA for Regression
17-Feb Fri 13 Confidence Intervals for Mean Response
20-Feb Mon 14 Prediction Intervals
22-Feb Wed 15 SLR Wrap-up
24-Feb Fri 16 Catch-up, Start MLR
27-Feb Mon 17 Exam 1
1-Mar Wed 18 Multiple Linear Regression
3-Mar Fri 19 Multiple Linear Regression
6-Mar Mon 20 MLR Assessing Model Fit; Estimation and Prediction
8-Mar Wed 21 Multicollinearity
10-Mar Fri 22 Indicator Variables
13-Mar Mon 23 Indicator Variables
15-Mar Wed 24 Indicator Variables
17-Mar Fri 25 Nested Models
20-Mar Mon Late winter break!
22-Mar Wed Late winter break!
24-Mar Fri Late winter break!
27-Mar Mon 26 Nested Models
29-Mar Wed 27 Polynomial Regression
31-Mar Fri 28 Interactions
3-Apr Mon 29 Complete Second Order Models
5-Apr Wed 30 Choosing Predictors
7-Apr Fri 31 Cross Validation
10-Apr Mon 32 Odds/Odds Ratio
12-Apr Wed 33 Logistic Regression
14-Apr Fri 34 Logistic Regression
17-Apr Mon 35 Exam 2
19-Apr Wed 36 Logistic Regression: Inference and Interpretations
21-Apr Fri 37 Logistic Regression: Predictions and Multiple Predictors
24-Apr Mon 38 Logistic Regression: Nested LRT and Classification
26-Apr Wed 39 Logistic Regression: Model Selection and wrap-up
28-Apr Fri No Class - Festival of Science, Scholarship, and Creativity
1-May Mon 40 One-way ANOVA
3-May Wed 41 Two-way ANOVA (additive model)
5-May Fri 42 Two-way ANOVA (interaction model)
9-May Tue Final Exam