---
output: word_document
---

Project #2
========================================================

```{r echo=FALSE}
library(readr)
# slunova() provides the modified ANOVA output; set the path to it here
source("../../RScripts/slunova.R")
```

```{r echo=FALSE}
# path to the folder where student data files are stored
path="../../Data/Project1/"
# edit "Mystudent.csv" in the chunk below for a specific student's data
```

```{r message=FALSE}
cars=read_csv(paste(path,"Mystudent.csv",sep=""),show_col_types = FALSE)
```

**Model #1: Age & Mileage**

```{r}
mod1=lm(price~age+mileage,data=cars)
```

```{r echo=FALSE}
x=summary(mod1)
x$coefficients
a=anova(mod1)
```

(1) R-sq=`r round(x$r.squared,3)` , adjR-sq=`r round(x$adj.r.squared,3)`, $\hat{\sigma}_\epsilon=$ `r round(x$sigma,3)`, F=`r round(x$fstatistic[1],2)` with 2 and `r x$df[2]` df, p-value=`r noquote(format(1-pf(x$fstatistic[1],2,x$df[2]),digits=2))`

```{r echo=FALSE, results='hide'}
# values for the first car in the data set
aage=cars$age[1]
apred=round(fitted(mod1)[1],2)
aprice=cars$price[1]
amiles=cars$mileage[1]
ares=round(residuals(mod1)[1],2)
```

(2) For age=`r aage` and mileage=`r amiles` we have $\widehat{price}=$ `r apred` and residual=`r ares`

```{r echo=FALSE}
source("../../RScripts/VIF.R")
v=vif(mod1)
```

(5) VIF=`r round(v[1],2)`

**Model #2: Polynomial**

```{r}
mod2=lm(price~age+I(age^2),data=cars)
```

```{r echo=FALSE}
m=coef(mod2)
```

(6-7) $\widehat{price}=`r m[1]`+`r m[2]`age+`r m[3]`age^2$

```{r fig.width=4.3, fig.height=4.3, echo=FALSE}
plot(price~age,main="Quadratic",xlim=c(0,30),ylim=c(0,max(cars$price)+2),data=cars)
# overlay the fitted quadratic and a horizontal line at price=0
curve(m[1]+m[2]*x+m[3]*x^2,add=TRUE)
abline(0,0)
```

```{r}
newx=data.frame(age=3)
forecast=predict(mod2,newx,interval="prediction")
```

(7) The predicted price for a 3-year-old car is \$`r round(forecast[1],3)` thousand, and we are 95\% confident that the price will be between \$`r round(forecast[2],3)` thousand and \$`r round(forecast[3],3)` thousand.

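
The interval reported in (7) can be cross-checked by hand. The sketch below is only an illustration on simulated data: the data frame `demo`, its coefficients, and the names `fit` and `new_car` are made up here and do not come from any student file. It fits the same quadratic form and checks that `predict()` with `interval="prediction"` agrees with $\widehat{price} \pm t_{0.975,\,df_E}\sqrt{\hat{\sigma}_\epsilon^2 + SE_{fit}^2}$.

```r
# Simulated ages and prices (hypothetical values, for illustration only)
set.seed(1)
demo <- data.frame(age = runif(40, 0, 15))
demo$price <- 30 - 2.5 * demo$age + 0.08 * demo$age^2 + rnorm(40, sd = 2)

# Same quadratic form as Model #2
fit <- lm(price ~ age + I(age^2), data = demo)
new_car <- data.frame(age = 3)

# Built-in 95% prediction interval: columns fit, lwr, upr
pi_builtin <- predict(fit, new_car, interval = "prediction", level = 0.95)

# Same interval by hand: fit +/- t * sqrt(sigma^2 + se.fit^2)
p <- predict(fit, new_car, se.fit = TRUE)
half_width <- qt(0.975, df = p$df) * sqrt(p$residual.scale^2 + p$se.fit^2)
pi_manual <- c(fit = unname(p$fit),
               lwr = unname(p$fit - half_width),
               upr = unname(p$fit + half_width))

print(pi_builtin)
print(pi_manual)
```

The extra $\hat{\sigma}_\epsilon^2$ term under the square root is what separates a prediction interval for one car from a confidence interval for the mean price at that age.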
(10) Cubic model

```{r}
modCube=lm(price~age+I(age^2)+I(age^3),data=cars)
```

```{r echo=FALSE}
x=summary(modCube)
x$coefficients
```

Cubic: R-sq=`r round(x$r.squared,3)` , adjR-sq=`r round(x$adj.r.squared,3)`, $\hat{\sigma}_\epsilon=$ `r round(x$sigma,3)`, F=`r round(x$fstatistic[1],2)` with 3 and `r x$df[2]` df, p-value=`r noquote(format(1-pf(x$fstatistic[1],3,x$df[2]),digits=2))`

```{r echo=FALSE}
x=summary(mod2)
```

Quadratic: R-sq=`r round(x$r.squared,3)` , adjR-sq=`r round(x$adj.r.squared,3)`, $\hat{\sigma}_\epsilon=$ `r round(x$sigma,3)`, F=`r round(x$fstatistic[1],2)` with 2 and `r x$df[2]` df, p-value=`r noquote(format(1-pf(x$fstatistic[1],2,x$df[2]),digits=2))`

**Model #3: Complete Second Order**

(11) $price=\beta_0+\beta_1age+\beta_2miles+\beta_3age^2+\beta_4miles^2+\beta_5age*miles+\epsilon$

```{r}
mod3=lm(price~age+mileage+I(age^2)+I(mileage^2)+I(age*mileage),data=cars)
```

(12) Fit the second-order model

```{r echo=FALSE}
x=summary(mod3)
noquote(format(coef(mod3),scientific=FALSE,digits=4))
```

```{r echo=FALSE}
xa=slunova(mod3)
xa[1:5]
```

(13) $H_0:\beta_3=\beta_5=0$ vs. $H_a: \beta_3 \ne 0$ or $\beta_5 \ne 0$

```{r echo=FALSE}
# reduced model: drops the age^2 and age*mileage terms (beta_3 and beta_5)
mod4=lm(price~age+mileage+I(mileage^2),data=cars)
xb=slunova(mod4)
xb[1:5]
```

```{r echo=FALSE}
# partial F statistic for the two dropped terms
SSMFull=xa[1,2]
SSMRed=xb[1,2]
SSE=xa[2,2]
dfE=xa[2,1]
MSEFull=xa[2,3]
Delta=SSMFull-SSMRed
Fstat=(Delta/2)/MSEFull
pv=1-pf(Fstat,2,dfE)
```

$F=\frac{(`r round(SSMFull,1)`-`r round(SSMRed,1)`)/2}{`r round(SSE,1)`/`r dfE`}=\frac{(`r round(Delta,1)`)/2}{`r round(SSE,1)`/`r dfE`}=\frac{`r round(Delta/2,1)`}{`r round(MSEFull,1)`}=`r round(Fstat,2)`$

p-value with 2 and `r dfE` df is `r round(pv,5)`.
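
The nested F test in (13) can also be cross-checked with base R's `anova()` on the reduced and full models. The sketch below is a hedged illustration on simulated data: the data frame `demo` and its coefficients are invented for the example, and it works from residual sums of squares via `deviance()` rather than from `slunova()` output, whose layout is not assumed here. The difference in model sums of squares equals the drop in SSE, so both routes give the same F.

```r
# Simulated data (hypothetical values, for illustration only)
set.seed(2)
n <- 60
demo <- data.frame(age = runif(n, 0, 15), mileage = runif(n, 5, 150))
demo$price <- 35 - 1.8 * demo$age - 0.08 * demo$mileage +
  0.05 * demo$age^2 + rnorm(n, sd = 2)

# Full second-order model and the reduced model under H0: beta_3 = beta_5 = 0
full <- lm(price ~ age + mileage + I(age^2) + I(mileage^2) + I(age * mileage),
           data = demo)
reduced <- lm(price ~ age + mileage + I(mileage^2), data = demo)

# Partial F by hand: ((SSE_reduced - SSE_full) / q) / MSE_full
sse_full <- deviance(full)
sse_red <- deviance(reduced)
df_e <- df.residual(full)
q <- df.residual(reduced) - df_e   # number of terms tested
f_manual <- ((sse_red - sse_full) / q) / (sse_full / df_e)
p_manual <- pf(f_manual, q, df_e, lower.tail = FALSE)

# Same test from anova(); row 2 holds the comparison
cmp <- anova(reduced, full)
f_builtin <- cmp$F[2]
p_builtin <- cmp$`Pr(>F)`[2]

print(c(manual = f_manual, builtin = f_builtin))
print(c(manual = p_manual, builtin = p_builtin))
```

Using `pf(..., lower.tail = FALSE)` instead of `1 - pf(...)` avoids rounding to exactly zero when the p-value is very small.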