Assignment 11
Learning Objectives
- assess normality through visual inspection
- apply an ANOVA to determine significance of a factor
- construct and evaluate a multiple regression model
- interpret principal components and build a minimal model that avoids overfitting
- calculate confidence intervals
- judge and present results
Data Files & Tools
Tasks
After reading the case study background information, use R and the UFFI data set above to answer these questions.
- Are there outliers in the data set? If so, what is the appropriate action and how are they discovered?
- Using visual analysis of the sales price with a histogram, is the data normally distributed and thus amenable to parametric statistical analysis?
- Using a z-test, is the presence or absence of UFFI alone enough to predict the value of a residential property?
- Is UFFI a significant predictor variable of selling price when taken with the full set of variables available?
- What is the ideal multiple regression model for predicting home prices in this data set? Provide a detailed analysis of the model, including Adjusted R-Squared, MAD, and p-values of principal components.
- On average, how do we expect UFFI will change the value of a property?
- If the home in question is more than 45 years old, does not have a finished basement, has a lot area of 5,000 square feet, a brick exterior, 2 enclosed parking spaces, 1,700 square feet of living space, central air, and no pool, what is its predicted value, and what are the 95% confidence intervals for that value with UFFI and without UFFI?
- If $215,000 was paid for this home, by how much, if any, did the client overpay, and how much compensation is justified due to overpayment?
- Build predictive models for forecasting prices for the next year based on average historical sales prices per year. You must build a weighted moving average model with weights of 5, 3, 2 where 5 is the weight for the most recent year, an exponential smoothing model with an alpha of 0.8, and a linear regression time series model. Evaluate the models based on their MSEs. Calculate the forecast for the next year and provide a 95% confidence interval for the linear regression time series model forecast using standard error.
- What is the ideal set of weights for the moving average model?
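The arithmetic behind the weighted moving average and exponential smoothing tasks can be sketched as follows. This is only an illustration in Python; the assignment itself must be completed in R. The price series is hypothetical and is not taken from the UFFI data set, the function names are made up for this sketch, and seeding the exponential smoothing forecast with the first observation is an assumption (other seeding conventions exist).

```python
# Hypothetical average sale prices per year, oldest first (NOT the UFFI data).
avg_price = [100.0, 110.0, 120.0, 130.0, 140.0]

# Weights 5, 3, 2 with 5 on the most recent year, listed oldest -> newest.
WEIGHTS = [2, 3, 5]


def wma_forecast(history, weights=WEIGHTS):
    """Weighted moving average of the last len(weights) observations."""
    window = history[-len(weights):]
    return sum(w * x for w, x in zip(weights, window)) / sum(weights)


def exp_smooth(history, alpha=0.8):
    """Return forecasts F[0..n]; F[t] is the forecast for period t.

    Assumption: F[0] is seeded with the first observation.
    Recurrence: F[t+1] = alpha * A[t] + (1 - alpha) * F[t].
    """
    forecasts = [history[0]]
    for actual in history:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


def mse(actuals, forecasts):
    """Mean squared error over paired actual/forecast values."""
    errors = [(a - f) ** 2 for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)


k = len(WEIGHTS)

# In-sample WMA forecasts exist only from period k onward.
wma_in_sample = [wma_forecast(avg_price[:t]) for t in range(k, len(avg_price))]
wma_mse = mse(avg_price[k:], wma_in_sample)

# Skip period 0 for smoothing, where the seeded forecast equals the actual.
es = exp_smooth(avg_price)
es_mse = mse(avg_price[1:], es[1:len(avg_price)])

wma_next = wma_forecast(avg_price)  # forecast for the next year
es_next = es[-1]                    # forecast for the next year

print(f"WMA: next={wma_next:.3f}, MSE={wma_mse:.3f}")
print(f"ES:  next={es_next:.3f}, MSE={es_mse:.3f}")
```

Note that the two MSEs here are computed over different numbers of periods, because the moving average needs three prior years before it can produce a forecast; a fair comparison may restrict both models to the same periods.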
Deliverables & Submission Instructions
Submit your report together with your .Rmd file and the .nb.html file generated by R Notebooks, combined into a single zip file; include any Excel workbooks you created. Upload the zip file to Blackboard. Spelling, grammar, formatting, and presentation count. Add graphs, tables, and charts as needed to support your arguments. This must be a report that can be presented in court!
Scoring
Total Number of Earnable Points: 100
Approximate Time to Complete: 6-8 hours
Due Date: see Calendar or Blackboard