Read in the Stat 113 data from http://myslu.stlawu.edu/~iramler/stat201/datasets/Stat113Fall2016.csv.
Find the mean GPA and number of students by section. Try using piping.
Select the SAT columns using the select function (see “Helper” part on sheet). Print rows with the top 10 Math SAT scores.
Count how many are missing Math SAT scores.
Count how many are missing at least one of the SAT scores. (Hint: Use | for “or”.)
Calculate BMI for each student, append this new variable to the Stat 113 data, and store the result in a new object called statBMI.
Keep only the BMI and sport-question columns. Call this new object sportBMI.
Keep only rows where people answered the sport question and replace sportBMI with this cleaned data.
table function that is not part of the dplyr package.)
head function.
If you have not done so already, combine parts e, f, and g into one long string of piped commands. (Be sure to drop the NA BMIs as well.)
Assuming that the Stat 113 students in this data represent a random (or at least representative) sample of SLU students, is there a statistically significant difference in average BMI values between athletes and non-athletes?
Using the five-number summary, compare athletes vs. non-athletes by gender. Hint: You’ll need to “remake” the data and possibly clean it first.
Using ggplot2, graphically compare athletes vs. non-athletes by gender. Hint: You have two factors here; if googling for help, be sure to include that in your search.
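
The piping patterns these exercises call for (a grouped summary, counting missing values with | for “or”, and a single mutate/select/filter chain) can be sketched on a small made-up data frame. This is only an illustration, not a solution to the exercises: every column name below (Section, GPA, Hgt, Wgt, Sport) is hypothetical and may not match the Stat 113 file, and the BMI line assumes height in inches and weight in pounds.

```r
library(dplyr)

# Toy data standing in for the Stat 113 file.
# All column names and values here are hypothetical.
toy <- tibble(
  Section = c("A", "A", "B", "B", "B"),
  GPA     = c(3.2, 3.6, 2.9, NA, 3.5),
  Hgt     = c(70, 64, 68, 72, 66),      # inches (assumed unit)
  Wgt     = c(160, 130, NA, 190, 140),  # pounds (assumed unit)
  Sport   = c("Yes", "No", "Yes", NA, "No")
)

# Grouped summary with a pipe: mean GPA and number of students per section.
by_section <- toy %>%
  group_by(Section) %>%
  summarise(mean_gpa = mean(GPA, na.rm = TRUE), n_students = n())

# Count rows where at least one of two columns is missing, using | for "or".
n_missing_either <- toy %>%
  filter(is.na(Hgt) | is.na(Wgt)) %>%
  nrow()

# One chain: add BMI, keep two columns, then drop rows with missing values.
# 703 * pounds / inches^2 is the imperial-unit form of the BMI formula.
toy_bmi <- toy %>%
  mutate(BMI = 703 * Wgt / Hgt^2) %>%
  select(BMI, Sport) %>%
  filter(!is.na(BMI), !is.na(Sport))
```

Printing by_section, n_missing_either, and toy_bmi after each step is a quick way to check that each link in the chain does what you expect before applying the same pattern to the real data.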