Assignment 7: Meethi Sharma
School: Rutgers University
Course: 101 (Mathematics)
Date: Jan 9, 2024
Project Overview
Education is a multifaceted field where various factors influence a student's
academic performance. Among these factors, the completion of test
preparation courses has garnered significant attention. These courses are
designed to enhance a student's readiness for exams, but their effectiveness
remains a subject of interest and debate. In this data science project, we dive
into a simulated dataset to investigate whether completing a test preparation
course is associated with higher math scores among students.
Hello! I'm Meethi. I attend Rutgers University-New Brunswick, and I'm majoring in
Computer Science and Cognitive Science, with a possible minor in Data Science. As
a student, I recognise the significance of education and its impact on individuals and
society.
● In this project, we aim to uncover potential connections between test
preparation and math performance by leveraging statistical analysis and data
visualization.
● We will explore a fictional dataset that includes information about students,
their parental education levels, lunch options, test preparation course
completion, and their math scores.
● The core of our analysis revolves around testing a hypothesis:
Does completing the test preparation course have a significant impact on
the probability of students achieving a math score above 70?
To answer this question, we'll employ statistical tests, including
hypothesis testing and Bayesian analysis, to evaluate the strength of the
relationship between test preparation course completion and math scores.
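As a rough sketch of what the Bayesian side of this comparison could look like, the snippet below places a uniform Beta(1, 1) prior on each group's probability of scoring above 70 and estimates the posterior probability that the "completed" group's rate is higher. The function name, the prior, and the counts (60 of 100 versus 45 of 100) are assumptions made up for illustration; they are not results from the dataset.

```python
import random

def prob_completed_is_higher(k_c, n_c, k_n, n_n, draws=20000, seed=0):
    """Monte Carlo estimate of P(p_completed > p_not_completed | data).

    Assumes independent uniform Beta(1, 1) priors, so each posterior is
    Beta(1 + successes, 1 + failures).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_c = rng.betavariate(1 + k_c, 1 + n_c - k_c)
        p_n = rng.betavariate(1 + k_n, 1 + n_n - k_n)
        wins += p_c > p_n  # True counts as 1
    return wins / draws

# Hypothetical counts of students scoring above 70 (illustration only).
posterior = prob_completed_is_higher(k_c=60, n_c=100, k_n=45, n_n=100)
print(f"P(completed rate > not-completed rate) is about {posterior:.3f}")
```

A value close to 1 would mean the data strongly favour a higher rate in the "completed" group, under the stated prior.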
Data Set Used
➔ The "Student Performance in Exams" dataset, which is available on Kaggle, is a
comprehensive dataset that covers information about high school students'
academic performance.
➔ The dataset includes student demographic information such as gender,
race/ethnicity, and parental education, as well as their performance on maths,
reading, and writing tests. The dataset contains 1,000 observations. It is
fictional: the records were generated for teaching purposes (see
Acknowledgements) rather than collected from real students.
➔ Overall, the "Student Performance in Exams" dataset is a valuable resource for
anyone interested in education research and analysis.
● The dataset has been cleaned and prepped for study, making it appropriate
for researchers and data analysts interested in investigating the relationship
between student demographics and academic performance.
● It can be used to answer a variety of research questions, such as whether
academic performance differs by gender or race/ethnicity, or whether parental
education levels influence a student's academic success. The information also
includes categorical variables such as lunch and exam preparation courses,
which provide further insight into factors that may influence academic
achievement.
● Researchers and educators can use this information to obtain a better
understanding of the elements that contribute to academic success and
develop ways to improve student performance.
● The dataset can be viewed on Kaggle under the title "Students Performance in
Exams" (the link is listed under Sources).
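To make the dataset description concrete, here is a minimal sketch of how the math scores could be split by test preparation status using only Python's standard library. The column names ("test preparation course", "math score") and the values "completed" / "none" are our assumption about the layout of the Kaggle file, and the four rows below are made up for illustration.

```python
import csv
import io

# Made-up rows in the layout we ASSUME the Kaggle CSV uses.
SAMPLE_CSV = """gender,race/ethnicity,parental level of education,lunch,test preparation course,math score,reading score,writing score
female,group B,some college,standard,none,61,70,68
male,group C,high school,free/reduced,completed,74,71,69
female,group A,associate's degree,standard,completed,83,88,90
male,group D,some high school,standard,none,58,52,50
"""

def load_math_scores(handle):
    """Group math scores by the value of the test preparation column."""
    groups = {}
    for row in csv.DictReader(handle):
        status = row["test preparation course"]
        groups.setdefault(status, []).append(int(row["math score"]))
    return groups

scores = load_math_scores(io.StringIO(SAMPLE_CSV))
print(scores)
```

With the real file, the same function could be called on an opened file handle instead of the in-memory sample.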
Analysis of Data
Hypothesis Testing:
● Hypothesis: Does the completion of a test preparation course have a significant
impact on math scores compared to those who did not complete the course?
● Null Hypothesis (H0): There is no significant difference in math scores between
students who completed the test preparation course and those who did not.
● Alternative Hypothesis (H1): Students who completed the test preparation course
have higher math scores than those who did not.
We will use a z-test to test this hypothesis, assuming that the math scores are normally
distributed.
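A one-sided two-sample z-test of this kind can be sketched as follows. This is only an illustration under stated assumptions: the function name is ours, the two score lists are hypothetical, and sample variances stand in for the population variances (reasonable with roughly 1,000 rows, less so for the tiny lists used here).

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def two_sample_z_test(completed, not_completed):
    """One-sided z-test of H1: mean(completed) > mean(not_completed)."""
    standard_error = sqrt(
        variance(completed) / len(completed)
        + variance(not_completed) / len(not_completed)
    )
    z = (mean(completed) - mean(not_completed)) / standard_error
    p_value = 1.0 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value

# Hypothetical scores, for illustration only (not taken from the dataset).
completed = [78, 85, 69, 91, 74, 88, 80, 72]
not_completed = [65, 70, 58, 77, 62, 81, 55, 68]

z, p = two_sample_z_test(completed, not_completed)
print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```

We would reject H0 in favour of H1 when the p-value falls below the chosen significance level (for example 0.05).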
Major Hypothesis
Hypothesis:
There is a significant difference in the probability of students achieving a math score
above 70 based on whether they completed the test preparation course or not.
Null Hypothesis (H0):
There is no significant difference in the probability of students achieving a math score above
70 between those who completed the test preparation course and those who did not.
In this hypothesis, we are testing whether completing the test preparation course has a
significant impact on the probability of students achieving a math score above 70.
The null hypothesis assumes that there is no difference in this probability between the two
groups, and the alternative hypothesis suggests that there is a significant difference.
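Because this hypothesis is about a probability (the share of students scoring above 70) rather than a mean, one common choice is a two-proportion z-test. The sketch below is our assumption of how it could be set up; the function name and the counts (60 of 100 versus 45 of 100) are hypothetical and are not results from the dataset.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test of H0: both groups share the same success probability."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)  # estimate under H0
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts of students scoring above 70 (illustration only).
z, p = two_proportion_z_test(success_a=60, n_a=100, success_b=45, n_b=100)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

The test is two-sided here because the alternative hypothesis above speaks of a difference, not a direction.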
Evidence of Data
In the box plot, the box itself represents the interquartile range (IQR), which
contains the middle 50% of the data. The line inside the box represents the
median. The "whiskers" extend from the box to the lowest and highest values that
lie within 1.5 times the IQR of the box edges. Any data points beyond the
whiskers are considered outliers and are plotted individually as dots.
The plot provides a visual comparison of the distribution of math scores between
students who completed the test preparation course and those who did not.
You can see the spread of scores, the median value, and the presence of any
outliers in each group.
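The quantities a box plot draws can also be computed directly. The sketch below follows the description above (quartiles, whiskers limited to 1.5 times the IQR, outliers beyond them); the function name and the scores are hypothetical, and the "inclusive" quartile method is our choice, since plotting libraries differ slightly in how they interpolate quartiles.

```python
from statistics import quantiles

def box_plot_summary(scores):
    """Return the values a box plot would display for one group."""
    q1, median, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [s for s in scores if low_fence <= s <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),   # lowest score still inside the fence
        "whisker_high": max(inside),  # highest score still inside the fence
        "outliers": sorted(s for s in scores if s < low_fence or s > high_fence),
    }

# Hypothetical math scores for one group (illustration only).
scores = [18, 55, 60, 62, 65, 68, 70, 72, 75, 80, 84]
summary = box_plot_summary(scores)
print(summary)
```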
The bar chart compares the probabilities of getting a math score above 70 for
two groups: students who completed the test preparation course ("Completed")
and students who did not complete the course ("Not Completed").
This visualization helps you understand and compare the likelihood of achieving
a high math score in each group.
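The bar heights in such a chart are simply the share of students in each group who scored above 70. Here is a small sketch of that calculation with a text rendering of the bars; the helper name and the scores are hypothetical and are not taken from the dataset.

```python
def share_above(scores, threshold=70):
    """Fraction of scores strictly above the threshold."""
    return sum(score > threshold for score in scores) / len(scores)

# Hypothetical scores per group (illustration only).
groups = {
    "Completed": [78, 85, 69, 91, 74, 88, 80, 72],
    "Not Completed": [65, 70, 58, 77, 62, 81, 55, 68],
}

bar_heights = {label: share_above(scores) for label, scores in groups.items()}
for label, height in bar_heights.items():
    # One '#' per 5 percentage points, as a stand-in for a plotted bar.
    print(f"{label:<14} {height:.2f} {'#' * round(height * 20)}")
```

Note that a score of exactly 70 does not count as "above 70" in this sketch.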
Evidence of Data
To further support the claim that completing a test preparation course is associated with
higher math scores, we can add a scatter plot and a bell curve (normal distribution) graph
for the math scores of both groups.
Results:
The scatter plot visualizes the relationship between completing the test preparation course
and math scores. Each point on the plot represents a student.
- The x-axis shows the numeric label, where 1 represents students who completed the
course, and 0 represents those who didn't.
- The y-axis shows the math scores.
Here's what you can observe from the scatter plot:
● Students who completed the test preparation course (labeled as 1) are
concentrated in a vertical band at x = 1.
● Students who did not complete the course (labeled as 0) are concentrated in a
vertical band at x = 0.
This indicates a visual distinction between the two groups.
The scatter plot helps you see the distribution of math scores for both groups and
understand how completion of the test preparation course might be associated with math
scores.
In this case, it suggests that completing the course may be related to higher math scores,
as students with a label of 1 tend to have higher scores.
Results:
The bell curve graphs visualize the distribution of math scores for students who
completed the test preparation course (left) and those who did not (right).
Here's what you can observe from these graphs:
The histograms represent the distribution of math scores for each group, with the
x-axis showing the math score range and the y-axis showing the probability of
observing a score within each bin.
The bell curves (normal distribution curves) overlay the histograms, representing
the expected distribution of scores if they were normally distributed. In each
graph, the mean and standard deviation of the math scores are used to generate
the corresponding bell curve.
The overlap between the histogram and the bell curve indicates how closely the data
matches a normal distribution.
The bell curve graphs allow you to visually compare the distribution of math scores
for students who completed and those who did not complete the test preparation
course. You can assess the shape of the distribution, its central tendency, and spread.
In the context of our analysis, these graphs provide insight into how math scores are
distributed in each group and help us understand the underlying distribution of scores.
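The two ingredients of such a graph can be sketched without a plotting library: histogram heights scaled so that the bar areas sum to 1, and a normal curve built from the group's mean and standard deviation. The function name, bin edges, and scores below are hypothetical, and scaling the histogram to a density is our assumption about how the overlay was made comparable to the curve.

```python
from statistics import NormalDist, mean, stdev

def density_histogram(scores, bin_edges):
    """Bar heights scaled so the bar areas sum to 1.

    Assumes every score lies within the outer bin edges; the last bin
    includes its right edge.
    """
    counts = [0] * (len(bin_edges) - 1)
    for score in scores:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            in_bin = bin_edges[i] <= score < bin_edges[i + 1]
            if in_bin or (is_last and score == bin_edges[-1]):
                counts[i] += 1
                break
    total = len(scores)
    return [
        count / (total * (bin_edges[i + 1] - bin_edges[i]))
        for i, count in enumerate(counts)
    ]

# Hypothetical math scores for one group (illustration only).
scores = [52, 58, 61, 64, 66, 69, 71, 73, 75, 78, 82, 91]
edges = [50, 60, 70, 80, 90, 100]

heights = density_histogram(scores, edges)
fitted = NormalDist(mean(scores), stdev(scores))  # the "bell curve"
centers = [(a + b) / 2 for a, b in zip(edges, edges[1:])]
curve = [fitted.pdf(center) for center in centers]

for center, height, expected in zip(centers, heights, curve):
    print(f"bin centre {center:5.1f}: histogram {height:.4f}, normal {expected:.4f}")
```

Bins where the two columns are close are where the data follow the normal curve well.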
What about data sources - are they trustworthy?
So, has the data fooled us?
The data itself hasn't "fooled" us, but it's important to interpret the data carefully and
consider the limitations of the analysis. The analysis I have performed suggests an
association between completing the test preparation course and higher math scores, and the
visualizations support this association.
However, there are a few important points to keep in mind:
1. Correlation vs. Causation: While the data shows an association, it doesn't prove
causation. Completing the test preparation course may be correlated with higher math
scores, but it doesn't necessarily mean that completing the course causes higher scores.
Other factors, such as the motivation of the students who choose to complete the course,
may also be at play.
2. Data Limitations: The dataset is limited in terms of the variables it includes.
There may be other confounding variables that are not accounted for in the analysis. For a
more robust analysis, we would need a more comprehensive dataset with a broader range
of factors.
3. Sample Size: The analysis is based on a specific sample of students. The results may not
be generalizable to all students, and the sample size can impact the statistical significance of
the findings.
4. Statistical Tests: The results of hypothesis testing and Bayesian analysis provide
statistical evidence, but the significance level and effect size are crucial for interpreting the
results. A smaller p-value and a larger effect size would provide stronger evidence of an
association.
5. Educational Context: To draw meaningful conclusions for educational practice, it's
important to consider the real-world context. Educational outcomes are influenced by a wide
range of factors, and interventions like test preparation courses should be considered as part
of a broader strategy.
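Point 4 above mentions effect size. One common measure for a difference in means is Cohen's d, sketched below. Choosing Cohen's d is our assumption (the analysis does not name a specific measure), and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

# Hypothetical scores, for illustration only (not taken from the dataset).
completed = [78, 85, 69, 91, 74, 88, 80, 72]
not_completed = [65, 70, 58, 77, 62, 81, 55, 68]

d = cohens_d(completed, not_completed)
print(f"Cohen's d = {d:.2f}")
```

By the usual rule of thumb, values around 0.2 are considered small, 0.5 medium, and 0.8 large.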
In summary, the data analysis doesn't fool us, but it should be interpreted with caution.
It suggests an association between completing the test preparation course and higher math
scores, but it's essential to be aware of the limitations and consider the broader context
when making any educational decisions or policy recommendations.
Further research, ideally controlled experiments or observational studies that adjust for
confounding factors, may be needed to establish causation and gain a deeper understanding
of the relationship between test preparation and academic performance.
Conclusion
Based on our data, we can conclude that there is a significant difference in the
probability of students achieving a math score above 70 based on whether they completed
the test preparation course or not.
Students who completed the test preparation course exhibit a higher likelihood of
obtaining a math score above 70 compared to students who did not complete the
course. However, it must be kept in mind that this is a fictional dataset, created for the
purpose of a data science exercise, so these results might not hold for real students.
Other factors in the data (for example, Parental Education or Lunch) might also have
affected our results and the probability of students achieving a math score above 70, so
the comparison between the two groups might be biased.
This conclusion emphasizes the significant difference in the probability of achieving a math
score above 70 based on test preparation completion and acknowledges the limitations and
potential biases in the data.
Sources
Kaggle:
https://www.kaggle.com/datasets/spscientist/students-performance-in-exams
Acknowledgements
Royce Kimmons:
http://roycekimmons.com/tools/generated_data/exams