Milestone Due Dates

Abstract: Friday, Sep. 27, on Google Colab; submit your URL to Piazza (2 pts)
Progress Checks: Oct. 15, Nov. 01, Nov. 19 (3 pts)
Presentation/Demo: Nov. 26 in class, 3-minute talk presentation (5 pts)
Final Report: Dec. 06, complete report in Colab (5 pts)

Introduction

The data project is an opportunity to tackle a more challenging data science activity. For the project, you are required to work individually on a dataset of your choosing that is interesting, significant, and relevant to Data Science. The ultimate goal of your data project is to apply the techniques learned in each week of the class to your dataset (exploration, wrangling, machine learning, visualization). We are going to use Google Colab (Colaboratory) (https://colab.research.google.com/), a free Jupyter notebook environment that requires no setup and runs entirely in the cloud. With Colaboratory you can write and execute code, save and share your analyses, and access powerful computing resources, all for free from your browser.

Project Abstract

The abstract (in Google Colab) should include the following information:

     Data Source

     Your end goal with this dataset (build a recommender system or prediction model/classifier, evaluate models, visualize something, infer something, or something else)

     Any secondary datasets you are planning to use to augment your primary dataset (clearly specify that each one is a secondary dataset)

     You can take as much space as you need for the project abstract, but I would guess that most will be in the two-to-three-page range.

     You need to have an acceptable abstract submitted by the deadline.

Project Presentation (3 Minute Talk)

Your presentation/demo should briefly and succinctly tell us *why* we should care and *what* interesting insight you have about the chosen dataset. Give us some insight into the tough / cool / interesting aspects of your project. This is your time to shine, so carefully prepare exactly what you want to show off to impress us in this summary. View the audience as potential upper management in your company, so convince us that your problem is important and that you have the appropriate insight about the dataset.

During the 3 Minute Talk session, the author of each dataset should be prepared to present a preview talk of less than three minutes (180 seconds) about the main idea(s) of the project. The 3-minute time limit will be strictly enforced by a timer and buzzer. You'll be stopped right at the 3-minute mark whether or not you have finished your summary talk! Practice, practice, practice, and time yourself before the presentation.

Follow the guidelines when preparing your Summary section for the talk (this should be at the very end of your Colab).

Project Final Report

The final report is a comprehensive report describing the project. This should be a "complete" document, so it should include front matter (title page, abstract, table of contents, chapters) or a sidebar index that connects to your report elements. These should include the problem statement, an explanation of your design and implementation, and your results and evaluation. This report should stand by itself as the archival description of the project.

 


Data sources for projects