Assignment_7_Fall2023(1)
School: Rensselaer Polytechnic Institute
Course: 4960
Subject: Information Systems
Date: Dec 6, 2023
Assignment 7: Data Analytics (Fall 2023) (20% written)
Due: Tuesday, November 28th, 2023 by 11:59 pm EST (11/28/2023 by 11:59 pm ET).
Submission method: written, by LMS. Please use the following file naming for electronic
submission: DataAnalytics2023Fall_A7_YOURFIRSTNAME_YOURLASTNAME.xxx, etc.
Level: _________ (4000 or 6000)
Late submission policy: If you are more than 5 days late, it is likely that your grade for
this assignment will not be included in your final grade before final grades need to be
submitted. First time with valid reason: no penalty; otherwise 20% of score deducted each
late day.
Note: Your assignment should be the result of your own individual work. Take care to avoid
plagiarism (“copying”), and include references to all web resources, texts, and class
presentations. You may discuss the project with other students, but do not take written notes
during these discussions, and do not share your written assignment or presentation before the
class they are presented in.
General assignment: Predictive and Prescriptive data analytics. You should develop and validate
predictive models (regression, classification, clustering, using one or more of the methods
covered in class to date or one of your choosing) for two of the ten datasets below and apply
them for decision purposes. Use the section numbering below for your written submission for
this assignment. Include references (websites, papers, packages, data refs, ...).
http://archive.ics.uci.edu/ml/datasets/News+Popularity+in+Multiple+Social+Media+Platforms
http://archive.ics.uci.edu/ml/datasets/detection_of_IoT_botnet_attacks_N_BaIoT
http://archive.ics.uci.edu/ml/datasets/Absenteeism+at+work
http://archive.ics.uci.edu/ml/datasets/Bank+Marketing
http://archive.ics.uci.edu/ml/datasets/Communities+and+Crime
https://archive.ics.uci.edu/ml/datasets/Cervical+Cancer+Behavior+Risk
https://archive.ics.uci.edu/ml/datasets/Estimation+of+obesity+levels+based+on+eating+habits+and+physical+condition+
https://archive.ics.uci.edu/dataset/890/aids+clinical+trials+group+study+175
https://archive.ics.uci.edu/dataset/856/higher+education+students+performance+evaluation
https://archive.ics.uci.edu/dataset/20/census+income
Conduct the following analysis for both datasets.
1. Exploratory Data Analysis (3%) Explore the statistical aspects of both datasets. Analyze the
distributions and provide summaries of the relevant statistics. Perform any cleaning,
transformations, interpolations, smoothing, outlier detection/removal, etc. required on the
data. Include figures and descriptions of this exploration and a short description of what you
concluded (e.g. nature of distribution, indication of suitable model approaches you would try,
etc.). Min. 1 page of text + graphics (required).
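As one possible starting point for the exploration in section 1, the sketch below computes summary statistics and flags outliers with Tukey's IQR rule using only the Python standard library. The column name `hours_absent` and its values are made up for illustration and are not taken from any of the listed datasets; the helper names `summarize` and `iqr_outliers` are likewise hypothetical. Any method covered in the course could be used instead.

```python
import statistics

def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical column of values, for illustration only
hours_absent = [2, 3, 4, 4, 8, 8, 8, 8, 16, 24, 120]

summary = summarize(hours_absent)
outliers = iqr_outliers(hours_absent)

for name, value in summary.items():
    print(f"{name}: {value}")
print("flagged as outliers:", outliers)
```

A mean far above the median, as in this made-up column, is the kind of observation that would suggest a skewed distribution and possibly a transformation before modeling.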
2. Model Development, Validation, Optimization and Tuning (14%) Choose two (4000-level*)
or three (6000-level) or more different models (e.g. a model with a different set/number of
variables/features in a regression, or classification, etc. does NOT count as a different model).
Explain why you chose them. Construct the models, test/validate them. Explain the validation
approach. You can use any method(s) covered in the course. Include your code in your
submission. Compare model results if applicable. Report the results of the model (fits,
coefficients, graphs, trees, other measures of fit/importance, etc.), predictors, and summary
statistics. Min. 4 pages of text + graphics (required). * 4000-level will receive extra credit for
6000-level responses.
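To illustrate what "two different models" plus a validation approach could look like for section 2, here is a minimal sketch that compares a one-predictor linear regression with a k-nearest-neighbour regression using 5-fold cross-validation. It assumes a regression task with a single numeric predictor; the data are synthetic, and the function names (`fit_linear`, `fit_knn`, `cross_val_rmse`) are hypothetical helpers written for this example rather than part of any library. A real submission would more likely rely on an established package and one of the listed datasets.

```python
import random

def fit_linear(xs, ys):
    """Ordinary least squares for one predictor; returns a predict function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour regression: average the targets of the k closest x."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict

def cross_val_rmse(fit, xs, ys, folds=5):
    """Mean RMSE over the folds; each fold is held out once for testing."""
    idx = list(range(len(xs)))
    scores = []
    for f in range(folds):
        test = idx[f::folds]  # strided folds; shuffle first if rows are ordered
        test_set = set(test)
        train = [i for i in idx if i not in test_set]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        sq_err = [(model(xs[i]) - ys[i]) ** 2 for i in test]
        scores.append((sum(sq_err) / len(sq_err)) ** 0.5)
    return sum(scores) / len(scores)

# Synthetic data (made up for illustration): y = 2x + 1 plus Gaussian noise
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

rmse_linear = cross_val_rmse(fit_linear, xs, ys)
rmse_knn = cross_val_rmse(fit_knn, xs, ys)
print(f"linear regression CV RMSE: {rmse_linear:.3f}")
print(f"3-NN regression  CV RMSE: {rmse_knn:.3f}")
```

Reporting the same held-out metric for every model is what makes the comparison meaningful; the write-up should still explain why these folds and this metric were chosen.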
3. Decisions (3%) Describe your conclusions in regard to the model fit, predictions and how well
(or not) it could be used for decisions and why. Min. 1 page of text + graphics.
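One way to connect model output to a decision in section 3 is to choose a classification threshold that minimizes an assumed cost of errors. The sketch below is illustrative only: the labels, scores, candidate thresholds, and the costs (a false negative counted as five times as expensive as a false positive) are all invented for the example, and the helper names are hypothetical.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y:
            tp += 1
        elif predicted_positive and not y:
            fp += 1
        elif not predicted_positive and y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def total_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total cost of the errors made at a given threshold."""
    _, fp, _, fn = confusion_counts(y_true, scores, threshold)
    return fp * cost_fp + fn * cost_fn

def best_threshold(y_true, scores, candidates, cost_fp, cost_fn):
    """Return the candidate threshold with the lowest total cost."""
    return min(
        candidates,
        key=lambda t: total_cost(y_true, scores, t, cost_fp, cost_fn),
    )

# Made-up labels and model scores, for illustration only
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.35, 0.4, 0.55, 0.6, 0.8, 0.9]
candidates = [0.3, 0.5, 0.7]

# Assumed costs: missing a positive is 5x as costly as a false alarm
chosen = best_threshold(y_true, scores, candidates, cost_fp=1, cost_fn=5)
print("chosen threshold:", chosen)
```

With these assumed costs the lowest candidate threshold wins because it avoids all false negatives; different costs would lead to a different decision, which is the kind of reasoning the section asks for.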