Project Proposal (Due April 20th)
Formally write up your proposed project. Your write-up should address each point below individually. It should be single-spaced, grammatically correct, and submitted to Blackboard by the deadline. Include the following in your write-up:
Project name: Descriptive and concise.
Significance of the project.
Dataset description: Describe the contents of the dataset and provide a link to where it can be located.
Dataset format: Provide a description of the attributes and the target variable.
Implementation: What types of pre-processing, EDA, and modeling do you anticipate using?
Results: Why are the results useful? Who would be interested in the results?
Technical Report (Integrated in Jupyter Notebook).
You need to write a technical report describing your approach and findings. Your report must be written in Jupyter Notebook and interleaved with your Python code. The report should be organized, clear, concise, and easy to follow. Your notebook should have the following sections at a minimum (in the order given below):
Introduction: This section must briefly describe the dataset you used and the data mining task you implemented. Briefly describe your findings.
Data Analysis: This section must provide details about the dataset. You must include:
  - Information about the dataset itself, e.g., the attributes and attribute types, the number of instances, and the attribute being used as the label.
  - Relevant summary statistics about the dataset.
  - Data visualizations highlighting important or interesting aspects of your dataset. Visualizations may include frequency distributions, comparisons of attributes (scatterplots, multiple frequency diagrams), box-and-whisker plots, etc. The goal is not to include all possible diagrams, but to select and highlight diagrams that provide insight into the dataset itself.
Note that this section must describe the above in paragraph form and not just provide diagrams and statistics. Also, each figure included must have a figure caption (figure number and textual description) that is referenced from the text (e.g., “Figure 2 shows a frequency diagram for …”). You should provide your source code as a Jupyter Notebook along with any supporting files.
Modeling Results: This section should describe the modeling approach you developed and its performance. Explain what techniques you used, briefly how you designed and implemented the model, how you tested its predictive ability, and how well it performs.
Conclusion: Provide a conclusion for your project, including a short summary of the dataset you used and any of its inherent challenges, the modeling approach you developed, and any ideas you have on ways to improve its performance.
Project Submission
Submit your project to Blackboard by the due date. No late submissions will be accepted.
You should submit a well-documented Jupyter Notebook and dataset files. Submit both .ipynb and .pdf files. Name your files First_Lastname_FinalProject.ipynb.
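
As an illustration only, the Data Analysis and Modeling Results requirements above (number of instances, summary statistics, the label attribute, and a test of predictive ability) could be approached with a notebook cell along the following lines. This is a minimal sketch, not a required method. The toy dataset, the attribute names (hours_studied, attendance_pct, passed), and all function names are hypothetical placeholders, and the "model" is just a majority-class baseline against which a real model could be compared. It uses only the Python standard library; pandas and scikit-learn provide equivalent tools.

```python
import random
import statistics
from collections import Counter

# Hypothetical toy dataset: (hours_studied, attendance_pct, passed).
# Replace these rows with instances loaded from your own dataset files.
ROWS = [
    (1.5, 55, 0), (2.0, 60, 0), (3.0, 70, 0), (4.0, 65, 0),
    (4.5, 80, 1), (5.0, 75, 0), (6.0, 85, 1), (6.5, 90, 1),
    (7.0, 80, 1), (8.0, 95, 1), (9.0, 90, 1), (9.5, 100, 1),
]
ATTRIBUTES = ["hours_studied", "attendance_pct"]  # assumed attribute names
LABEL = "passed"                                  # assumed label attribute


def summarize(rows):
    """Return the instance count, per-attribute statistics, and label counts."""
    summary = {"n_instances": len(rows), "attributes": {}}
    for i, name in enumerate(ATTRIBUTES):
        values = [row[i] for row in rows]
        summary["attributes"][name] = {
            "min": min(values),
            "max": max(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
    # The label is stored in the last position of each row.
    summary["label_counts"] = Counter(row[-1] for row in rows)
    return summary


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def majority_baseline(train_rows):
    """Return the most frequent label seen in the training rows."""
    return Counter(row[-1] for row in train_rows).most_common(1)[0][0]


def accuracy(predicted_label, test_rows):
    """Fraction of held-out rows whose label equals the constant prediction."""
    correct = sum(1 for row in test_rows if row[-1] == predicted_label)
    return correct / len(test_rows)


summary = summarize(ROWS)
train, test = train_test_split(ROWS)
baseline_label = majority_baseline(train)
baseline_accuracy = accuracy(baseline_label, test)

print(f"Instances: {summary['n_instances']}, label attribute: {LABEL}")
for name, stats in summary["attributes"].items():
    print(f"{name}: {stats}")
print(f"Label counts: {dict(summary['label_counts'])}")
print(f"Baseline accuracy on {len(test)} held-out rows: {baseline_accuracy:.2f}")
```

In a report, the printed numbers would be accompanied by a paragraph interpreting them, and any model you build would be evaluated on the same held-out rows so that its accuracy can be compared against the baseline.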