American University of Sharjah
College of Engineering
NGN 112 Introduction to Artificial Intelligence and Data Science
Section 09 | Fall 2023
Project announcement date: 5/Nov/2023
Project submission date: 19/Nov/2023, Sunday, end of day
Start date of student presentations: the last week of the semester, starting 20/Nov/2023
Project rules:
1. This is a team-based project; each team consists of 3 or 4 students. It is your responsibility to form a team. Please email the list of team members and the selected dataset by Wednesday, November 8th.
2. The project entails working on a classification problem and submitting a detailed report (details below) and a PowerPoint presentation.
3. Your professor will email you the presentation order. However, you need to be ready to present starting on the 20th of Nov.
Submission:
You are required to submit the following:
1. A report that contains all project requirements as specified below, including:
a. Description of the dataset used
b. Python code
c. Graphical summaries
d. Numerical summaries
e. Classification results
2. PowerPoint slides to present your work to the class
3. Only one student out of each team is required to make a submission on ilearn
Your report must contain a cover page with course information, semester, professor and names
and IDs of the team members.
The Python code:
The Python code you include in your report must be similar to what was introduced in class.
If you include code from other sources, then you must cite the source as a comment in your code.
Project details:
Your professor will provide you with sample datasets that you can use for your project. These will
be classification datasets.
Once you have access to the dataset, you are required to perform the following:
1. Descriptive tasks:
a. Describe the dataset in your own words.
b. Provide numerical summaries for the feature variables. This includes, but is not limited to, measures of center and spread.
c. Provide graphical summaries of the feature variables. This includes, but is not limited to, boxplots, histograms, pair plots and heat maps.
d. Comment on the numerical and graphical summaries generated. In other words, what are your observations?
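As an illustration of task 1b, the sketch below computes a few measures of center and spread for a single feature using only the Python standard library. The values in `feature` are made up for the example and do not come from any course dataset; with pandas, `DataFrame.describe()` reports a similar set of statistics for every numeric column in one call.

```python
import statistics

# Hypothetical values of one feature variable (illustration only).
feature = [4.9, 5.1, 5.4, 5.8, 6.0, 6.3, 6.7, 7.1]

# Measures of center.
summary = {
    "mean": statistics.mean(feature),
    "median": statistics.median(feature),
}

# Measures of spread.
summary["stdev"] = statistics.stdev(feature)      # sample standard deviation
summary["min"] = min(feature)
summary["max"] = max(feature)
summary["range"] = summary["max"] - summary["min"]
q1, q2, q3 = statistics.quantiles(feature, n=4)   # quartile cut points
summary["IQR"] = q3 - q1

for name, value in summary.items():
    print(f"{name:>6}: {value:.3f}")
```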
2. Preprocessing tasks:
You are requested to repeat all the experiments in Section 4 below using the following normalization techniques:
a. Normalize the feature variables using z-scores.
b. Normalize the feature variables using min-max.
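The two normalization techniques can be sketched for a single feature column as follows. This is a minimal standard-library illustration, not the code introduced in class: the function names `z_score` and `min_max` are hypothetical, the sample values are made up, and both functions assume the column is not constant (otherwise the denominator is zero). The z-score here uses the population standard deviation, which is also what scikit-learn's `StandardScaler` uses.

```python
import statistics

def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical feature column (illustration only).
column = [2.0, 4.0, 6.0, 8.0]
z = z_score(column)
mm = min_max(column)
print("z-scores:", [round(v, 3) for v in z])
print("min-max: ", [round(v, 3) for v in mm])
```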
3. Data split into train and test sets:
You are requested to repeat all the experiments in Section 4 below using three splits of the data. You can do that by fixing the random_state parameter to 1, 20 and 40. This will generate 3 different train and test sets. Then:
a. Report the classification accuracies for each test split, as described in Section 4.
b. Report the average classification accuracies for all test splits, as described in Section 4.
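One possible way to produce the three splits is scikit-learn's `train_test_split`, called once per `random_state` value. The sketch below makes several assumptions that the project text does not specify: the Iris dataset stands in for the dataset provided by your professor, the test size is set to 30%, and a k-NN classifier with k=5 is used as the single example model. Normalization is left out here to keep the focus on the split.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in dataset (assumption); replace with the project dataset.
X, y = load_iris(return_X_y=True)

accuracies = {}
for seed in (1, 20, 40):
    # A fixed random_state makes each split reproducible.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed  # test_size is an assumption
    )
    model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
    accuracies[seed] = model.score(X_test, y_test)  # accuracy on the test set

# Average accuracy over the three test splits.
average_accuracy = sum(accuracies.values()) / len(accuracies)

for seed, acc in accuracies.items():
    print(f"random_state={seed}: accuracy={acc:.3f}")
print(f"average: {average_accuracy:.3f}")
```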
4. Classification:
You need to use all of the following classifiers:
a. Decision trees
b. k-NN with k=5
c. Naïve Bayes
d. SVM with polynomial kernel
e. SVM with RBF kernel
f. Neural Networks
In summary, you need to carry out the experiments in the following manner:
Loop for both normalization techniques of Section 2 of the project details
    Loop for each of the three data splits of Section 3 of the project details
        Loop for each classifier of Section 4 of the project details
So the total number of experiments is:
If your professor chose classification: 2 (normalizations) x 3 (data splits) x 6 (classifiers) = 36 experiments
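The nested loops above could be organized along the following lines with scikit-learn. This is only a sketch under stated assumptions, not the required implementation: the Iris dataset is a stand-in, the 30% test size and the classifier hyperparameters other than k=5 and the two SVM kernels are arbitrary choices, and the names `scalers`, `classifiers`, `results` and `averages` are made up for the example. It follows the order of the project sections by normalizing the whole dataset before splitting; fitting the scaler on the training portion only is a common alternative.

```python
from itertools import product

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # stand-in dataset (assumption)

scalers = {"z-score": StandardScaler, "min-max": MinMaxScaler}
seeds = [1, 20, 40]
# Each entry builds a fresh, untrained classifier.
classifiers = {
    "Decision tree": lambda: DecisionTreeClassifier(random_state=0),
    "k-NN (k=5)": lambda: KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": lambda: GaussianNB(),
    "SVM (poly)": lambda: SVC(kernel="poly"),
    "SVM (RBF)": lambda: SVC(kernel="rbf"),
    "Neural network": lambda: MLPClassifier(max_iter=2000, random_state=0),
}

results = {}
# product() iterates normalization (outer), split, classifier (inner).
for norm_name, seed, clf_name in product(scalers, seeds, classifiers):
    X_norm = scalers[norm_name]().fit_transform(X)
    X_train, X_test, y_train, y_test = train_test_split(
        X_norm, y, test_size=0.3, random_state=seed  # test_size is an assumption
    )
    model = classifiers[clf_name]().fit(X_train, y_train)
    results[(clf_name, norm_name, seed)] = model.score(X_test, y_test)

# Average over the three splits for each classifier/normalization pair.
averages = {
    (clf_name, norm_name): sum(results[(clf_name, norm_name, s)] for s in seeds) / len(seeds)
    for clf_name in classifiers
    for norm_name in scalers
}

print(len(results), "experiments")
for (clf_name, norm_name), acc in averages.items():
    print(f"{clf_name:<15} {norm_name:<8} average accuracy = {acc:.3f}")
```

Keeping the accuracies in a dictionary keyed by classifier, normalization and split makes it straightforward to fill in one table cell per experiment.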
Report the results of these experiments as one big table. Each cell in the table will hold the corresponding classification accuracy. An example table is given below (Norm 1 and Norm 2 stand for the corresponding normalization schemes).
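                 | Test Split 1    | Test Split 2    | Test Split 3    | Average
Classifier       | Norm 1 | Norm 2 | Norm 1 | Norm 2 | Norm 1 | Norm 2 | Norm 1 | Norm 2
k-NN             |        |        |        |        |        |        |        |
Decision Trees   |        |        |        |        |        |        |        |
Naïve Bayes      |        |        |        |        |        |        |        |
SVM (polyn.)     |        |        |        |        |        |        |        |
SVM (RBF)        |        |        |        |        |        |        |        |
Neural Networks  |        |        |        |        |        |        |        |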
Note on academic dishonesty:
AUS is strict about plagiarism and academic dishonesty. The work that you submit must be developed by you and your team only. Otherwise, an academic dishonesty case will be reported to the dean's office and the penalty will range from receiving a zero in the project to getting an XF in the course.
Your professor will examine the code that you submitted and will ask you to explain it.