Term Project
You are required to complete a term project that demonstrates your mastery of the data analytics pipeline. Your project must therefore collect data from one or more sources via files, a web API, or web scraping. The data must then be explored, shaped into analyzable form, and stored in a database of your choice (relational or non-relational). The data must be selectively retrieved and used to construct one or more predictive models that are properly tuned, evaluated, and compared. Finally, the models must be used to make a prediction.
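The pipeline stages described above can be sketched end to end in a few lines of R. This is only a minimal illustration under stated assumptions, not a template for the project: the built-in `mtcars` data stands in for a collected data set, an in-memory SQLite database (via the `DBI` and `RSQLite` packages, assumed to be installed) stands in for the database of your choice, and two plain linear models stand in for properly tuned predictive models. The table name `cars`, the helper `rmse`, and the values in `new_car` are hypothetical.

```r
# Minimal pipeline sketch. Assumes the DBI and RSQLite packages are installed;
# SQLite is an illustrative choice, not a requirement of the project.
library(DBI)

set.seed(42)  # make the train/test split reproducible

# 1. Collect: built-in mtcars stands in for a file, web API, or scraped source.
raw <- mtcars
raw$car <- rownames(raw)
rownames(raw) <- NULL

# 2. Explore and shape: inspect the target, keep complete rows and needed columns.
print(summary(raw$mpg))
shaped <- raw[complete.cases(raw), c("car", "mpg", "wt", "hp", "cyl")]

# 3. Store: write the shaped data to an in-memory SQLite database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "cars", shaped)

# 4. Selectively retrieve: pull only the columns and rows needed for modeling.
cars <- dbGetQuery(con, "SELECT mpg, wt, hp FROM cars WHERE cyl >= 4")
dbDisconnect(con)

# 5. Split into training and hold-out sets (70/30).
train_idx <- sample(nrow(cars), size = floor(0.7 * nrow(cars)))
train <- cars[train_idx, ]
test <- cars[-train_idx, ]

# 6. Fit two candidate models and compare them on hold-out RMSE.
#    (No hyperparameter tuning is shown; a real project would add it.)
fit_simple <- lm(mpg ~ wt, data = train)
fit_full <- lm(mpg ~ wt + hp, data = train)

rmse <- function(fit, data) {
  sqrt(mean((data$mpg - predict(fit, newdata = data))^2))
}
results <- c(simple = rmse(fit_simple, test), full = rmse(fit_full, test))
print(results)

# 7. Predict: use the model with the lower hold-out error on a new observation.
best <- if (results[["full"]] <= results[["simple"]]) fit_full else fit_simple
new_car <- data.frame(wt = 3.0, hp = 120)  # hypothetical new car
prediction <- predict(best, newdata = new_car)
print(prediction)
```

In an R Notebook, each numbered step would typically sit in its own code chunk, with narrative and visualizations between chunks.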
The analysis process and the results must be documented in an R Notebook that contains a combination of narratives, visualizations, and code fragments and follows the CRISP-DM framework. The results and the analysis process must be presented.
Note that this project should be a signature piece and should be usable for interviews and as part of your professional portfolio.
Rubric & Submission
Use this rubric to guide your project:
- create a slide deck summarizing your business problem, data set used (and source), approach, CRISP-DM steps, and results
- be sure to put your name on the slide deck, notebook, and rubric
- upload the .nb.html file, the R Notebook, slide deck, data set, link to video presentation, and public clickable link to completed rubric to BB -- add the links in the comments section on Blackboard so they are easily accessible
- if the file(s) are too large, you can zip them and then upload a link to an externally stored file (e.g., Google Drive, OneDrive)
- record a narrated presentation in which you walk through the slide deck and run select pieces of R code from your Notebook
- limit the presentation to 10 minutes
- upload the presentation to a video sharing site such as YouTube
- post a link to the presentation in your group on the BB Discussion Forum
- thoughtfully comment on all presentations of your colleagues in your group
- the presentation and comments are graded as an item in the rubric and are required to pass the project and course
- there is no in-class presentation
Sources for Data Sets
Submission Details
Total Number of Earnable Points: 100+ (60% minimum required to pass course)
Approximate Time to Complete: 20-25 hours
Due Date: see Calendar or Blackboard
Required Submissions: report (as PDF or HTML), all code, data set, rubric, video