Unit 4 Homework_ Data Distributions

School

CUNY Queens College

Course

205

Subject

Economics

Date

Jan 9, 2024

Uploaded by DrComputer9650

Unit 4 Homework: Data Distributions (42 points)

Zander Guadalupe

For this homework, we will use R built-in data. R comes with several built-in data sets related to the 50 states of the United States of America. Professor XU has combined these data sets into a single CSV file named “us_states.csv”. Below is a list of variables in this data:

● name: the full state names.
● abb: 2-letter abbreviations for the state names.
● region: the geographic region (Northeast, South, North Central, West) that each state belongs to.
● division: the geographic division (New England, Middle Atlantic, South Atlantic, East South Central, West South Central, East North Central, West North Central, Mountain, and Pacific) that each state belongs to.
● population: population estimate as of July 1, 1975.
● income: per capita income (1974).
● illiteracy: illiteracy (1970, percent of population).
● life_exp: life expectancy in years (1969–71).
● murder: murder and non-negligent manslaughter rate per 100,000 population (1976).
● hs_grad: percent high-school graduates (1970).
● frost: mean number of days with minimum temperature below freezing (1931–1960) in a capital or large city.
● area: land area in square miles.

Exercise 1: Reviewing Variable Labels and Values (14 points)

Let’s start by taking a look at the structure of the U.S. states dataset and what’s included in it. To do this, we use the str() command.

Question 1. How many observations are there in this data set? How many variables? (2 points)

50 observations, 12 variables.

Question 2. Which variables are nominal? Which variables are interval or ratio? (12 points)

Nominal: name, abb, region, division. These are labels or categories with no numeric meaning.
Interval or ratio: population, income, illiteracy, life_exp, murder, hs_grad, frost, area. These are numeric measurements.
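The structure check in Exercise 1 can be sketched as follows. The course file us_states.csv is not available here, so this sketch rebuilds a stand-in from R's built-in state.name, state.abb, state.region, state.division, and state.x77 objects; the column names follow the variable list above, and the assumption that the CSV holds the same values is mine, not the assignment's.

```r
# Stand-in for us_states.csv, rebuilt from R's built-in state data.
# Assumption: the course file holds the same values under these column names.
us_states <- data.frame(
  name       = state.name,
  abb        = state.abb,
  region     = state.region,
  division   = state.division,
  population = state.x77[, "Population"],
  income     = state.x77[, "Income"],
  illiteracy = state.x77[, "Illiteracy"],
  life_exp   = state.x77[, "Life Exp"],
  murder     = state.x77[, "Murder"],
  hs_grad    = state.x77[, "HS Grad"],
  frost      = state.x77[, "Frost"],
  area       = state.x77[, "Area"],
  row.names  = NULL
)

str(us_states)  # lists each variable with its storage type
dim(us_states)  # number of observations, then number of variables

# Numeric columns are the candidates for interval/ratio measurement;
# character and factor columns are category labels (nominal).
is_num <- sapply(us_states, is.numeric)
numeric_vars     <- names(us_states)[is_num]
categorical_vars <- names(us_states)[!is_num]
numeric_vars
categorical_vars
```

With the course file on hand, a call such as read.csv("us_states.csv") would presumably take the place of the data.frame() call. Storage type is only a first hint at the level of measurement: a numeric ID code, for example, would still be nominal.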

Exercise 2: Percentages in Tables and Charts (8 points)

In class and in your readings, we’ve covered different measures of dispersion. The most basic is the percentage, which we can read from tables and from charts. In this exercise, we’ll go one step further to characterize the dispersion in a distribution. Here’s a frequency table for the variable geographic division in the U.S. states dataset, followed by one that reports on percentage and a bar chart.

Question 1. What is the mode of geographic division, and what is the percentage of states that are located in this division? (4 points)

Mountain and South Atlantic. 16%

Question 2. Which geographic division includes the fewest states? What is the percentage of states that are located in this division? (2 points)

Middle Atlantic. 6%

Question 3. How would you describe the distribution of geographic divisions in terms of dispersion? Low, medium, or high dispersion? Justify your answer. (2 points)

I would say medium because all of the percentages seem to be around the same range.

Exercise 3: Medians and Quartiles (10 points)

The summary() command provides summary statistics on continuous/numeric variables and reports on the minimum and maximum, the quartiles, and the mean and median. Here we call this command for the variable income:

We’ll pair this text output with a histogram of income so that we can visualize the shape of the distribution. Finally, we’ll also look at a boxplot for the same variable. You can do this by changing the geom_histogram argument in the command line to geom_boxplot. Because there is only one variable to examine here, R gives us a sideways rendering of the boxplot instead of an up and down one.

Use all three pieces of information (the summary output, the histogram, and the boxplot) to answer the questions in this section.
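The tables from Exercise 2 and the summary, histogram, and boxplot from Exercise 3 can be reproduced roughly as below. This is a sketch under assumptions: the built-in state.division and state.x77 objects stand in for us_states.csv, the bin count is an arbitrary choice, and the plots are drawn only if ggplot2 is installed.

```r
# Assumption: built-in state data stands in for the course's us_states.csv.
us_states <- data.frame(
  division  = state.division,
  income    = state.x77[, "Income"],
  row.names = NULL
)

# Exercise 2: counts and percentages per geographic division.
division_counts <- table(us_states$division)
division_pct    <- round(100 * prop.table(division_counts), 1)
division_counts
division_pct

# Mode(s): every division tied for the highest count.
modes <- names(division_counts)[division_counts == max(division_counts)]
modes

# Exercise 3: minimum, quartiles, median, mean, and maximum of income.
summary(us_states$income)
income_range <- diff(range(us_states$income))  # maximum minus minimum
income_iqr   <- IQR(us_states$income)          # third quartile minus first
income_range
income_iqr

# Histogram and sideways boxplot; skipped if ggplot2 is not installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p_hist <- ggplot(us_states, aes(x = income)) + geom_histogram(bins = 15)
  p_box  <- ggplot(us_states, aes(x = income)) + geom_boxplot()
  print(p_hist)
  print(p_box)
}
```

Computing the range and interquartile range directly, instead of subtracting by hand from the summary() printout, avoids transcription slips when writing up the answers.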

Question 1. Describe the distribution of per capita income. Use the following values in your discussion: range, interquartile range, mean, median, skew, and outliers. (6 points – one for each correct depiction of the keyword)

Range: 3271
Interquartile range: 443
Mean: 4436
Median: 4516

There is only one outlier, on the high end, at above 6000. Even so, the mean is below the median, so the distribution is skewed to the left.

Question 2. Compare what you learn about the distribution of per capita income from the histogram and the boxplot. Which one do you find more helpful in summarizing the information and why? (4 points; either answer is acceptable as long as they back it up)

I prefer the boxplot because it is more informative and lets me draw a conclusion faster. Even though there are states pulling the distribution to the left, there is still clearly an outlier at the far end of the boxplot.

Exercise 4: Using medians and distributions to compare states in different geographic divisions (10 points)

In this next exercise, we will again use boxplots, this time to compare the per capita income for different geographic divisions.

Question 1. Use the text output on means and standard deviations. Which geographic division had the highest mean per capita income in 1974? Looking at the standard deviation of its mean, how would you describe the dispersion of its per capita income relative to other geographic divisions? (2 points)

Pacific had the highest mean per capita income (5183) in 1974. Its standard deviation (654) is very large compared with the other divisions, so the dispersion of its per capita income is relatively high: its data points vary more widely around the mean. The shape of the distribution is right skewed.

Question 2. Now compare the income distributions using the boxplot. Which geographic division had the lowest median per capita income in 1974? How was the dispersion of the income distribution in this geographic division compared to other divisions? (3 points)

East South Central has the lowest median. The shape of the distribution is left skewed.

Question 3. Based on the text output and the boxplot, which geographic division had the highest level of income inequality in 1974? And why? (3 points)

The South Atlantic has the largest income inequality because its boxplot is the largest.

Question 4. Judging by the distribution of per capita income, which geographic division would you choose to live in? And why? (2 points for the reasoning only. Students can choose any division, but they only receive points for good reasoning.)

I would prefer to live in West North Central because the average income of this division is the second largest, which is good. It also has a lower standard deviation and a small box, which means there is less income inequality, and there are no outliers.
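The division-level comparison in Exercise 4 can be sketched as below. As an assumption, the built-in state.division and state.x77 objects again stand in for us_states.csv; the name division_stats and the choice of statistics are illustrative, and the plot is drawn only if ggplot2 is installed.

```r
# Assumption: built-in state data stands in for the course's us_states.csv.
us_states <- data.frame(
  division  = state.division,
  income    = state.x77[, "Income"],
  row.names = NULL
)

# Split income by division, then summarize each group.
income_by_div <- split(us_states$income, us_states$division)
division_stats <- data.frame(
  mean   = sapply(income_by_div, mean),
  sd     = sapply(income_by_div, sd),
  median = sapply(income_by_div, median),
  iqr    = sapply(income_by_div, IQR)
)

# Sort so the division with the highest mean comes first.
division_stats[order(-division_stats$mean), ]

highest_mean  <- rownames(division_stats)[which.max(division_stats$mean)]
lowest_median <- rownames(division_stats)[which.min(division_stats$median)]
widest_box    <- rownames(division_stats)[which.max(division_stats$iqr)]
highest_mean
lowest_median
widest_box

# One boxplot per division; skipped if ggplot2 is not installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(us_states, aes(x = income, y = division)) + geom_boxplot()
  print(p)
}
```

The standard deviation and the interquartile range can rank divisions differently, because the standard deviation is sensitive to a single extreme state while the interquartile range is not. Checking both is one way to back up an answer about income inequality.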
