Final Project Summary Report OH

Date

Feb 20, 2024

BSTAT 3321 Final Project Summary Report
Omar Hernandez
Omar.hernandez3@mavs.uta.edu
The University of Texas at Arlington

This report analyzes the dataset crime.csv, which records several variables for 51 state-level observations. The dataset includes the murder rate, poverty rate, high school graduation rate, college attendance rate, percentage of single parent households, and the share of the population living in metropolitan areas. The main objective was to use linear regression models to examine how these socioeconomic factors relate to the murder rate, and so to identify factors associated with crime rates across the United States.

The analytical process involves data preparation, correlation tests, simple linear regression, and multiple linear regression. Together, these methods describe the relationships within the dataset. The intended result is to support decision-making by policymakers and law enforcement agencies by identifying the socioeconomic factors associated with crime rates in different regions.

Data preparation consists of removing the non-numerical variable “state” from crime.csv to create the dataset crime.data. Removing this variable makes the dataset suitable for statistical models such as simple and multiple linear regression. The refined dataset contains murder rates, poverty rates, high school graduation rates, college attendance rates, the percentage of single parent households, and the population distribution in metropolitan areas. It can then be used to analyze the relationships between socioeconomic variables and crime rates across states.
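The data preparation step described above can be sketched in a few lines. The report does not say which software was used, so Python is an assumption here, and the column names and the two sample rows below are hypothetical stand-ins rather than values from crime.csv.

```python
import csv
import io

# Hypothetical stand-in for crime.csv: the column names and all values below
# are made up for illustration and are NOT taken from the report's dataset.
SAMPLE_CSV = """state,murder,poverty,high_school,college,single_parent,metropolitan
AA,5.0,12.0,80.0,20.0,25.0,60.0
BB,8.0,15.0,75.0,18.0,30.0,70.0
"""


def load_numeric_rows(csv_text, drop_column="state"):
    """Parse CSV text and return one dict per row, without the non-numeric column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for record in reader:
        # Keep every column except the one being dropped, converted to float.
        rows.append({name: float(value)
                     for name, value in record.items()
                     if name != drop_column})
    return rows


# Plays the role of the report's crime.data: numeric columns only.
crime_data = load_numeric_rows(SAMPLE_CSV)
print(sorted(crime_data[0].keys()))
```

With the real file, the same function could be given the file's contents instead of SAMPLE_CSV, provided every remaining column is numeric.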
In simple linear regression, the murder rate is predicted from a single predictor variable, the percentage of single parent households. The fitted model is:

murder rate = −8.2477 + 0.5595 × single parent percentage

The P value associated with the predictor variable is below 0.001 (P < 0.001), indicating that the percentage of single parent households is a statistically significant predictor of the variation in murder rates across states.

P-Value                Value
P(T ≤ t) One Tail      9.822E-47
P(T ≤ t) Two Tail      1.964E-46

To illustrate the practical application of this model, consider a state with a single parent percentage of 29. The model predicts a murder rate of approximately 7.98 (−8.2477 + 0.5595 × 29). In contrast, a state with a single parent percentage of 25.4 is estimated to have a murder rate of approximately 5.96. These predictions show how the murder rate predicted by the model rises with the single parent percentage: each additional percentage point adds 0.5595 to the predicted rate.

In multiple linear regression, the murder rate is predicted from several predictor variables simultaneously. These variables include poverty rates, high school graduation rates, college attendance rates, the percentage of single parent households, unemployment rates, and the proportion of the population residing in metropolitan regions. The model's overall P value indicates that at least one predictor variable has statistical relevance in forecasting murder rates.
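The two simple linear regression predictions quoted above (for single parent percentages of 29 and 25.4) can be reproduced directly from the reported intercept and slope. The following is a minimal arithmetic check, not the original analysis code; the function and constant names are illustrative.

```python
# Coefficients as reported for the simple linear regression model.
INTERCEPT = -8.2477
SLOPE = 0.5595


def predict_murder_rate(single_parent_pct):
    """Predicted murder rate for a given single parent percentage."""
    return INTERCEPT + SLOPE * single_parent_pct


for pct in (29, 25.4):
    print(pct, round(predict_murder_rate(pct), 2))
# prints: 29 7.98
#         25.4 5.96
```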

Statistic      Value
P-Value        9.0139E-07

The overall P value for this multiple linear regression model is highly significant (P < 0.0001), which indicates that at least one predictor variable is statistically relevant in forecasting murder rates. Each predictor (poverty rates, high school graduation rates, college attendance rates, the percentage of single parent households, unemployment rates, and the proportion of the population residing in metropolitan regions) enters the model with its own coefficient. The coefficient of determination (R²) indicates that approximately 57.62% of the variance in murder rates can be explained by the combination of these socioeconomic variables.

The equation for the final multiple regression model is:

murder rate = 2.2427 + 0.0879 × poverty − 0.1129 × high school − 0.0195 × college + 0.3650 × single parent … 0.2326 × un…

With the given values of the predictor variables (poverty: 10, high school: 80, college: 25, single parent: 30, unemployed: 5, metropolitan: 60), the final model predicts a murder rate of 7.6652, or close to 8. Because the model provides a coefficient for each predictor variable, murder rates can be predicted for any given set of values.

In conclusion, the analysis of the crime dataset using linear regression models has provided insight into the relationship between socioeconomic factors and murder rates across the United States, along with practical input for decision-making. The analysis showed that the percentage of single parent households is significantly associated with murder rates, with higher percentages corresponding to higher predicted murder rates.
Additionally, the multiple linear regression model showed that poverty rates, high school graduation rates, college attendance rates, the percentage of single parent households, unemployment rates, and the proportion of the population residing in metropolitan regions collectively explain approximately 57.62% of the variance in murder rates. These findings can help policymakers and law enforcement agencies make more informed decisions and develop strategies to address crime rates in different regions of the U.S.
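For reference, the 57.62% figure is the coefficient of determination. Under its usual definition, and assuming the report's software computes it in the standard way, it compares the model's residual variation with the total variation in the observed murder rates, where y_i is the observed rate for observation i, ŷ_i is the model's prediction, and ȳ is the mean observed rate:

```latex
R^2 \;=\; 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}}
    \;=\; 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}
                   {\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}
    \;\approx\; 0.5762
```

Read this way, roughly 42% of the variation in murder rates is left unexplained by the model's predictors.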
