Phase 5 Statistics High School Dropout Rates per State Final Draft

School

University of North Alabama

Course

DA291

Subject

Economics

Date

Feb 20, 2024

Pages

9

High School Dropout Rates per State
John Heacock
May 3rd, 2023
In my paper, I explain how different variables affect students dropping out of high school. The variables I chose are median household income, divorce rates, and race. Many jobs require, at a minimum, a high school diploma. Children who drop out of high school lose a lot of opportunities compared to those who finish. There are many factors that can potentially affect a child’s decision to finish high school. This research project isolates certain variables and shows that lower household incomes, divorced parents, and even race are big factors in children dropping out of high school. Without finishing high school, you will generally not be able to attend college. College graduates are typically able to get better and higher-paying jobs than those who do not finish college. In addition, without a high school diploma, some jobs are simply unattainable. Completing high school looks more attractive on a job application and shows that a person can finish the tasks they start. In other words, they are not a quitter. There should not be much argument that it is better for someone to complete high school. This research paper helps illustrate some of the economic and societal variables that can affect that path for children.
I chose to research certain factors that can affect high school dropout rates per state. This question is of interest because identifying factors that contribute to students dropping out provides an opportunity to address the situation. It is relevant because, in today’s society, high school dropouts are typically less successful than people who receive a high school diploma. A high school diploma gives a person a much higher ceiling for potential job opportunities. The variables I chose to predict high school dropout rates per state were median household income, divorced versus married parents, and ethnicity.
I used data from 2018. The mean of the state median household incomes in my data was $62,013. The mean divorce rate was 8%. Caucasians were the largest racial group, with a mean population share of 67%. Originally, I intended to also include gender as a variable, but I decided that the other three variables were more closely related to the problem I was trying to evaluate. The data came from reliable sources, was thorough, and was readily accessible. I did not have to interpolate or otherwise adjust the data inputs.
My regression output did not require any changes to my variables or to my original model statement. I had a good R-squared, which I discuss later in this paper. My coefficients also took the signs I expected, which removed the doubts I had about the model itself. My variables fit my hypothesis well, and I did not run into problems with the model. My regression results supported my research question because the coefficients took the expected signs and produced a usable linear regression equation.
An F-test checks whether your model as a whole is statistically significant.
The value of your Significance F must be below your significance level, which is one minus your confidence level. For example, if your confidence level is 95%, your Significance F must be less than 0.05. If it is not, then the model is not significant, and something needs to be fixed. My Significance F value was well below 0.05, indicating that my model was significant.
The R-squared tells you how much of the variability in the dependent variable is explained by your model. The R-squared takes a value between zero and one. I treated an R-squared above 0.7 as a possible indicator of a data problem. My R-squared was 0.63, meaning that 63% of the variability in high school dropout rates is explained by household income, divorce rates, and race. I consider this a good R-squared because it is not over 0.7 but still explains a large share of the variability.
Similar to an F-test, the t-test also tests for significance. But instead of testing whether the model is significant, it tests whether each individual variable is significant. Each variable has a p-value. For the variable to be significant, its p-value must fall below your significance level, just like on the F-test. If the confidence level is 95%, the p-value must be below 0.05. Not all variables in a model are necessarily significant, especially when you use a variable such as race. With race, one group is normally a significantly larger percentage of the population than the others. In my model, Caucasians were the highest percentage of the population. When this occurs, the other race variables tend to be less significant because their values are so much smaller. In my model, out of Caucasian, African American, and Hispanic, the only race variable that was significant was Caucasian. The mean population percentage for Caucasians was 67%, with Hispanic following at 12% and African American last at 11%. Because the Caucasian population share was so much larger than the others, it was the most significant race variable in the analysis.
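The decision rule used for both the F-test and the t-test can be written out as a short Python sketch: convert the confidence level to a significance level (one minus the confidence level) and compare each p-value against it. The function names and the sample p-values below are hypothetical and chosen only for illustration, since the paper does not report the exact Significance F or p-values.

```python
def significance_level(confidence_level):
    """Convert a confidence level (e.g. 0.95) into a significance level (0.05)."""
    # Rounding guards against floating-point noise such as 1 - 0.95 = 0.05000000000000004.
    return round(1.0 - confidence_level, 10)


def is_significant(p_value, confidence_level=0.95):
    """Return True when the p-value falls strictly below the significance level."""
    return p_value < significance_level(confidence_level)


# Hypothetical p-values for illustration only; they are NOT the paper's results.
example_p_values = {
    "model (Significance F)": 0.001,
    "variable A": 0.03,
    "variable B": 0.40,
}

for name, p_value in example_p_values.items():
    verdict = "significant" if is_significant(p_value) else "not significant"
    print(f"{name}: p = {p_value} -> {verdict} at the 95% confidence level")
```

The same check applies to the model-level Significance F and to each variable's p-value; only the number being compared changes.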
Marginal effect measures how much y changes when one of your x variables changes by one unit. For example, in my model the coefficient on the divorce rate is 0.0020, so for every one additional divorce per 1,000 marriages, the predicted dropout rate rises by 0.0020 on average, or about 2 more dropouts per 1,000 students. This might not seem like a huge change, but with the number of marriages and divorces in each state, it adds up. Remember that this is measured per 1,000 marriages, not against the total number of divorces in each state. The marginal effect in my model tells you how much the dropout rate changes based on the coefficient of each variable.
A prediction from the model involves plugging numbers into the linear regression equation to predict the dependent variable. For example, in the state of Alabama, the median household income is $49,861, the divorce rate per 1,000 marriages is 9.8, and the population is 65% Caucasian, 26% African American, and 4% Hispanic. My linear regression formula is y = 0.1308 - 9.5765E-07*x1 + 0.0020*x2 - 0.0482*x3 - 0.0258*x4 + 0.0132*x5 + e, where x1 is median household income, x2 is the divorce rate, x3, x4, and x5 are the Caucasian, African American, and Hispanic population shares, and e is the error term. Inputting these numbers into the equation produces a predicted dropout rate of 6.5%. This is a little higher than the actual dropout rate of Alabama, which is 5.3%. That is where the error term e comes in. The number the regression formula produces is a prediction, not the actual number. So the answer would read: The predicted high school dropout rate of Alabama is 6.5%. Since more things go into a student’s choice to drop out than the model includes, the predicted value will rarely be exactly right, but with an R-squared of 0.63, 63% of the variability in high school dropout rates is explained by this model. This means the model gives a reasonably accurate description of high school dropout rates.
My model supports my hypothesis because the R-squared is good, the F-test is significant, and the coefficients that are expected to be significant are significant. When numbers are plugged into my linear regression, they give a reasonable prediction of what the high school dropout rate would be per state. If I were to try to improve my model, I might add grade point average, or GPA, as an independent variable. I would expect a student with a lower GPA to have a higher chance of dropping out of high school than a student with a higher GPA. Another idea for future research would be to determine the number of people who, instead of going to high school, go to some kind of trade school. Trade schools can also lead to very good jobs, such as welding. Some students are a better fit for trade school than for high school, which is completely fine. They may have talents in a different field than what high school can teach.
Appendix A:
The descriptive statistics analyze each independent variable separately from the other independent variables. They provide averages of each independent variable. For example, for median household income, they give the mean, which is $62,013. That means that the average household income across the 50 observations I collected is $62,013. They also give a range, which is the difference between the highest and lowest values for that independent variable. For divorce rates, the highest rate was 13%, and the lowest was 4.7%.
The difference between these two numbers is 8.3 percentage points, which gives you the range of the data. The standard error shows how far off an estimate, such as a sample mean, may be from the actual value. Descriptive statistics are helpful because they give an in-depth analysis of each independent variable, answering questions you may have about each variable.
Appendix B:
Appendix C:
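As a supplementary check, the Alabama prediction from the body of the paper can be reproduced with a short Python sketch. The coefficients and the Alabama inputs are the ones stated in the paper. The pairing of each coefficient with a variable, and the entry of race percentages as fractions (0.65 rather than 65), are assumptions made here because the equation in the paper does not label its x terms. Under those assumptions the arithmetic does come out to the reported 6.5%. The variable and function names are hypothetical.

```python
# Intercept and coefficients as reported in the paper's regression equation.
INTERCEPT = 0.1308

# ASSUMPTION: coefficients are matched to variables in the order the paper lists them.
COEFFICIENTS = {
    "median_household_income": -9.5765e-07,  # dollars
    "divorce_rate_per_1000": 0.0020,         # divorces per 1,000 marriages
    "share_caucasian": -0.0482,              # fraction of population (assumed scale)
    "share_african_american": -0.0258,       # fraction of population (assumed scale)
    "share_hispanic": 0.0132,                # fraction of population (assumed scale)
}


def predict_dropout_rate(inputs):
    """Return the predicted dropout rate as a fraction (0.065 means 6.5%)."""
    return INTERCEPT + sum(
        coefficient * inputs[name] for name, coefficient in COEFFICIENTS.items()
    )


# Alabama inputs as stated in the paper.
alabama = {
    "median_household_income": 49861,
    "divorce_rate_per_1000": 9.8,
    "share_caucasian": 0.65,
    "share_african_american": 0.26,
    "share_hispanic": 0.04,
}

predicted = predict_dropout_rate(alabama)
actual = 0.053  # actual Alabama dropout rate reported in the paper

print(f"Predicted dropout rate: {predicted:.1%}")
print(f"Actual minus predicted: {actual - predicted:+.1%}")
```

The gap between the actual and predicted rates is the part of the outcome the model does not capture for Alabama, which is what the error term e stands for in the equation.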