Assignment7_MeethiSharma (1)

School

Rutgers University

Course

101

Subject

Mathematics

Date

Jan 9, 2024

Type

pdf

Pages

9

Uploaded by Meethi.exe

Project Overview

Education is a multifaceted field in which many factors influence a student's academic performance. Among these factors, the completion of test preparation courses has drawn significant attention. These courses are designed to improve a student's readiness for exams, but their effectiveness remains a subject of debate. In this data science project, we use a simulated dataset to investigate whether completing a test preparation course is associated with higher math scores among students.

Hello! I’m Meethi. I attend Rutgers University-New Brunswick, where I’m majoring in Computer Science and Cognitive Science, with a possible minor in Data Science. As a student, I recognise the significance of education and its impact on individuals and society.

In this project, we aim to uncover potential connections between test preparation and math performance using statistical analysis and data visualization. We explore a fictional dataset that includes information about students: their parental education levels, lunch options, test preparation course completion, and math scores.

The core of our analysis is a hypothesis test: does completing the test preparation course have a significant impact on the probability of a student achieving a math score above 70? To answer this question, we use statistical methods, including hypothesis testing and Bayesian analysis, to evaluate the strength of the relationship between test preparation course completion and math scores.
Data Set Used

The "Students Performance in Exams" dataset, available on Kaggle, describes the academic performance of high school students. It includes demographic information such as gender, race/ethnicity, and parental level of education, as well as scores on math, reading, and writing tests. It also includes two categorical variables, lunch type and test preparation course, which give further insight into factors that may influence academic achievement.

The dataset contains 1,000 observations. Although it is presented as survey data, it is a simulated dataset, as noted in the project overview. It has been cleaned and prepared for study, which makes it suitable for investigating the relationship between student demographics and academic performance. It can be used to explore questions such as whether academic performance differs by gender or race/ethnicity, or whether parental education level is related to a student's academic success. Researchers and educators can use this kind of information to better understand the elements that contribute to academic success and to develop ways to improve student performance.

The dataset is listed on Kaggle under the title "Students Performance in Exams".
Analysis of Data

Hypothesis Testing

Hypothesis: Do students who completed the test preparation course have significantly higher math scores than students who did not complete the course?

Null Hypothesis (H0): There is no significant difference in math scores between students who completed the test preparation course and those who did not.

Alternative Hypothesis (H1): Students who completed the test preparation course have higher math scores than those who did not.

We will use a two-sample z-test to test this hypothesis, assuming that the math scores are normally distributed.

Major Hypothesis

Hypothesis: There is a significant difference in the probability of students achieving a math score above 70 based on whether they completed the test preparation course or not.

Null Hypothesis (H0): There is no significant difference in the probability of students achieving a math score above 70 between those who completed the test preparation course and those who did not.

Alternative Hypothesis (H1): There is a significant difference in the probability of students achieving a math score above 70 between those who completed the test preparation course and those who did not.

In this hypothesis, we test whether completing the test preparation course has a significant impact on the probability of a student achieving a math score above 70. The null hypothesis assumes that this probability is the same in the two groups, and the alternative hypothesis states that it differs.
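The two tests described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in the original analysis. The function names and the small score lists are hypothetical, invented for the example; the report's own results come from the full 1,000-row dataset.

```python
import math
from statistics import NormalDist, mean, variance


def two_sample_z_test(a, b):
    """One-sided z-test for H1: mean(a) > mean(b)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    p_value = 1.0 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


def count_above(scores, threshold=70):
    """Number of scores strictly above the threshold."""
    return sum(s > threshold for s in scores)


# Hypothetical scores, invented for illustration only (not from the dataset).
# Samples this small are too few for a z-test in practice.
completed = [72, 85, 66, 90, 78, 81, 69, 74]
not_completed = [60, 71, 55, 68, 73, 62, 58, 65]

z_mean, p_mean = two_sample_z_test(completed, not_completed)
z_prop, p_prop = two_proportion_z_test(
    count_above(completed), len(completed),
    count_above(not_completed), len(not_completed),
)
print(f"means:       z = {z_mean:.2f}, p = {p_mean:.4f}")
print(f"proportions: z = {z_prop:.2f}, p = {p_prop:.4f}")
```

The first function is one-sided because H1 of the first hypothesis is directional (higher scores), while the second is two-sided because the major hypothesis only claims a difference.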
Evidence of Data

In the box plot of math scores by group, the box itself represents the interquartile range (IQR), which contains the middle 50% of the data. The vertical line inside the box represents the median. The "whiskers" extend from the box to the smallest and largest values that lie within 1.5 times the IQR of the box edges. Any data points beyond the whiskers are considered outliers and are plotted individually as dots. The plot provides a visual comparison of the distribution of math scores between students who completed the test preparation course and those who did not: it shows the spread of scores, the median, and any outliers in each group.

The bar chart compares the probability of getting a math score above 70 for two groups: students who completed the test preparation course ("Completed") and students who did not ("Not Completed"). This visualization makes it easy to compare the likelihood of achieving a high math score in each group.
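The quantities behind both charts can be computed directly. The sketch below is an illustration under stated assumptions, not the plotting code from the original analysis: the helper names and score lists are hypothetical, and quartiles are computed with linear interpolation (other quartile conventions give slightly different values).

```python
from statistics import median, quantiles


def box_plot_summary(scores):
    """Return the quantities a Tukey-style box plot draws for one group."""
    # method="inclusive" interpolates linearly between the sorted data points.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [s for s in scores if low_fence <= s <= high_fence]
    return {
        "q1": q1,
        "median": median(scores),
        "q3": q3,
        # Whiskers end at the most extreme data points inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(s for s in scores if s < low_fence or s > high_fence),
    }


def share_above(scores, threshold=70):
    """Fraction of scores strictly above the threshold (the bar height)."""
    return sum(s > threshold for s in scores) / len(scores)


# Hypothetical scores, invented for illustration only (not from the dataset).
completed = [72, 85, 66, 90, 78, 81, 69, 74]
not_completed = [60, 71, 55, 68, 73, 62, 58, 65, 20]

summary = box_plot_summary(not_completed)
print(summary)
print("Completed:    ", share_above(completed))
print("Not Completed:", share_above(not_completed))
```

In this made-up sample, the score of 20 falls below the lower fence, so it would be drawn as an individual dot rather than as part of the whisker.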
Evidence of Data

To further support the claim that completing a test preparation course is associated with higher math scores, we add a scatter plot and a bell curve (normal distribution) graph for the math scores of both groups.

Results: The scatter plot visualizes the relationship between completing the test preparation course and math scores. Each point on the plot represents a student.
- The x-axis shows the numeric label, where 1 represents students who completed the course and 0 represents those who did not.
- The y-axis shows the math scores.

Because the x-axis takes only two values, the points form two vertical bands: students who did not complete the course (label 0) appear at x = 0, on the left side of the plot, and students who completed it (label 1) appear at x = 1, on the right side. This gives a clear visual separation between the two groups. The scatter plot shows the distribution of math scores within each group and how completion of the test preparation course might be associated with math scores. In this case, it suggests that completing the course may be related to higher math scores, as students with a label of 1 tend to have higher scores.
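The numeric labelling used on the x-axis can be sketched as below. This is an illustrative example rather than the original code: the records are hypothetical, and the category strings "completed" and "none" are an assumption about how the column is coded in the dataset.

```python
from statistics import mean

# Hypothetical (status, math score) records, invented for illustration only.
records = [
    ("completed", 78), ("none", 61), ("completed", 85),
    ("none", 70), ("none", 52), ("completed", 69),
]


def encode(status):
    """Map the course status to the numeric label used on the x-axis."""
    return 1 if status == "completed" else 0


# Each (x, y) pair is one point of the scatter plot.
points = [(encode(status), score) for status, score in records]

# The mean score per label summarizes the vertical position of each band.
group_means = {
    label: mean(score for x, score in points if x == label)
    for label in (0, 1)
}
print(points)
print(group_means)
```

Since many students share the same x value, the points in each band overlap heavily; adding a small random horizontal offset (jitter) or using transparency is a common way to keep individual points visible.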
Results: The bell curve graphs visualize the distribution of math scores for students who completed the test preparation course (left) and those who did not (right).

The histograms represent the distribution of math scores for each group, with the x-axis showing the math score range and the y-axis showing the relative frequency (density) of scores in each bin. The bell curves (normal distribution curves) overlay the histograms and represent the distribution the scores would follow if they were normally distributed. In each graph, the mean and standard deviation of the group's math scores are used to generate the corresponding bell curve. How closely the histogram follows the bell curve indicates how well the data matches a normal distribution.

The bell curve graphs allow a visual comparison of the distribution of math scores for students who completed the test preparation course and those who did not, including the shape of each distribution, its central tendency, and its spread. In the context of this analysis, they also help check the normality assumption behind the z-test.
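The numbers behind one of these graphs can be reproduced with a short sketch. It is an illustration under stated assumptions, not the original plotting code: the function names, the bin width of 10, the 0 to 100 score range, the use of the sample standard deviation, and the score list are all choices made for this example.

```python
from statistics import NormalDist, mean, stdev


def density_histogram(scores, bin_width=10, low=0, high=100):
    """Return (bin_start, density) pairs whose bar areas sum to 1."""
    n_bins = (high - low) // bin_width
    counts = [0] * n_bins
    for s in scores:
        # A score equal to `high` is placed in the last bin.
        idx = min(int((s - low) // bin_width), n_bins - 1)
        counts[idx] += 1
    total = len(scores)
    return [
        (low + i * bin_width, c / (total * bin_width))
        for i, c in enumerate(counts)
    ]


def fitted_normal(scores):
    """Normal distribution with the group's mean and sample standard deviation."""
    return NormalDist(mu=mean(scores), sigma=stdev(scores))


# Hypothetical scores, invented for illustration only (not from the dataset).
scores = [72, 85, 66, 90, 78, 81, 69, 74]
bin_width = 10

hist = density_histogram(scores, bin_width=bin_width)
curve = fitted_normal(scores)

for start, density in hist:
    midpoint = start + bin_width / 2
    print(f"{start:3d}-{start + bin_width:3d}  "
          f"hist={density:.4f}  normal={curve.pdf(midpoint):.4f}")
```

Normalizing the histogram to a density puts the bars on the same scale as the normal curve, which is what makes the visual comparison meaningful.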
What about data sources - are they trustworthy? So, has the data fooled us?

The data itself hasn't "fooled" us, but it's important to interpret it carefully and to consider the limitations of the analysis. The analysis I have performed suggests an association between completing the test preparation course and higher math scores, and the visualizations support this association. However, there are a few important points to keep in mind:

1. Correlation vs. Causation: While the data shows an association, it doesn't prove causation. Completing the test preparation course may be correlated with higher math scores, but that doesn't necessarily mean that completing the course causes higher scores. Other factors, such as the motivation of the students who choose to complete the course, may also be at play.
2. Data Limitations: The dataset is limited in the variables it includes. There may be confounding variables that are not accounted for in the analysis. A more robust analysis would need a more comprehensive dataset with a broader range of factors.
3. Sample Size: The analysis is based on a specific sample of students. The results may not generalize to all students, and the sample size affects the statistical significance of the findings.
4. Statistical Tests: The results of hypothesis testing and Bayesian analysis provide statistical evidence, but the significance level and effect size are crucial for interpreting them. A smaller p-value and a larger effect size would provide stronger evidence of an association.
5. Educational Context: To draw meaningful conclusions for educational practice, it's important to consider the real-world context. Educational outcomes are influenced by a wide range of factors, and interventions like test preparation courses should be considered as part of a broader strategy.

In summary, the data analysis doesn't fool us, but it should be interpreted with caution. It suggests an association between completing the test preparation course and higher math scores, but it's essential to be aware of the limitations and to consider the broader context when making any educational decisions or policy recommendations. Further research, including controlled experiments or carefully designed observational studies, may be needed to establish causation and to gain a deeper understanding of the relationship between test preparation and academic performance.
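Point 4 above mentions effect size alongside the p-value. Two common effect size measures for this kind of comparison are sketched below as an illustration: Cohen's d for the difference in mean scores, and the difference in the share of students scoring above 70. The report does not state which measure, if any, was used, so these are assumptions, and the score lists are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def share_above(scores, threshold=70):
    """Fraction of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


def risk_difference(a, b, threshold=70):
    """Difference between the two groups' shares above the threshold."""
    return share_above(a, threshold) - share_above(b, threshold)


# Hypothetical scores, invented for illustration only (not from the dataset).
completed = [72, 85, 66, 90, 78, 81, 69, 74]
not_completed = [60, 71, 55, 68, 73, 62, 58, 65]

d = cohens_d(completed, not_completed)
diff = risk_difference(completed, not_completed)
print(f"Cohen's d = {d:.2f}")
print(f"difference in share above 70 = {diff:.2f}")
```

Unlike a p-value, these measures do not shrink automatically as the sample grows, which is why they are usually reported together with the test result.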
Conclusion

Based on the current data, we conclude that there is a significant difference in the probability of students achieving a math score above 70 depending on whether they completed the test preparation course. Students who completed the course are more likely to obtain a math score above 70 than students who did not.

However, the dataset is presented as survey data, so the information in it might not be accurate. Other variables in the data (for example, parental level of education or lunch type) might also have affected the probability of achieving a math score above 70, so some of the results might be biased. Moreover, this is a fictional dataset created for the purpose of a data science exercise, so these results might not hold for real students. In short, the difference between the two groups is significant in this dataset, but it must be read in light of these limitations and potential biases.

Sources

Kaggle: https://www.kaggle.com/datasets/spscientist/students-performance-in-exams

Acknowledgements

Royce Kimmons: http://roycekimmons.com/tools/generated_data/exams