3. Below is a two-way table, or cross-tabulation, of sex and marital status.
a) What evidence in the table suggests sex and marital status are not statistically independent?
Statistical independence requires that the joint probability is the product of the marginal probabilities, e.g. from the table above, the marginal probabilities are: P(Male) = .4801, P(Widowed) = .0602.
- From the table, P(A ∩ B) = P(Male and Widowed) = 0.011, while P(A)P(B) = 0.4801 × 0.0602 ≈ 0.029. For two variables to be statistically independent, P(A ∩ B) must equal P(A)P(B). Based on the calculations above, 0.011 does not equal 0.029, so sex and marital status are not independent.
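The comparison above can be checked with a few lines of arithmetic. This is only a sketch: the three probabilities are the ones quoted in the answer, and the variable names are made up for illustration.

```python
# Probabilities quoted in the answer above (read from the table).
p_male = 0.4801      # marginal P(Male)
p_widowed = 0.0602   # marginal P(Widowed)
p_joint = 0.011      # joint P(Male and Widowed)

# Under independence the joint probability would equal this product.
expected_if_independent = p_male * p_widowed

print(f"P(A)P(B)   = {expected_if_independent:.3f}")
print(f"P(A and B) = {p_joint:.3f}")
print("independent?", abs(p_joint - expected_if_independent) < 0.001)
```

The observed joint probability is well under half of the product, which is the evidence against independence.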
b) Explain this evidence in a few sentences, i.e. what may explain the differences between the male distribution and the female distribution?
- Women are more likely than men to be separated, widowed or divorced. On average, men get married later than women, so fewer men are married at young ages. Women, on the other hand, tend to get married younger and have a greater life expectancy than men, so they are more likely to outlive their spouses.
4. A linear regression is estimated using hourly wage in dollars (imphwe1) as the dependent variable. The independent variables are a constant (cons), a person's age in years (ecage26), and their years of work experience (yrxfte11).
a) R-squared
- 0.1033: roughly 10% of the variance in the hourly wage rate is explained by years of experience and age. The other 90% of the variance may be explained by other factors which are missing from the model, such as income.
b) Regression coefficients, including the constant.
- 16.12 (constant): for a person of age 0 with no experience, the predicted hourly wage would be $16.12/hour.
- 0.076 (age): for every additional year of age, 7.6 cents/hour is added to the predicted hourly wage rate, holding experience constant.
- 0.268 (experience): for every additional year of experience, 26.8 cents/hour is added to the predicted hourly wage rate, holding age constant.
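To see how the three coefficients combine, here is a minimal sketch of the fitted equation. The function name and the example person (age 40, 15 years of experience) are hypothetical and not taken from the data.

```python
# Coefficients as reported in the regression output above.
CONS = 16.12    # constant
B_AGE = 0.076   # dollars per hour per year of age (ecage26)
B_EXP = 0.268   # dollars per hour per year of experience (yrxfte11)

def predicted_wage(age_years, experience_years):
    """Predicted hourly wage in dollars from the fitted linear equation."""
    return CONS + B_AGE * age_years + B_EXP * experience_years

# Hypothetical person: 40 years old with 15 years of experience.
print(f"{predicted_wage(40, 15):.2f}")
# One more year of experience at the same age adds B_EXP dollars per hour.
print(f"{predicted_wage(40, 16) - predicted_wage(40, 15):.3f}")
```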
c) Coefficient standard errors.
- The standard error measures how precisely each coefficient is estimated. If the standard error of a coefficient is comparable in size to the coefficient itself, then the estimate is probably not a good one.
d) t statistics
- The t-statistic measures how many standard errors the coefficient is away from 0. Any t-value greater than 2 or less than -2 (i.e. greater than 2 in absolute value) is a good indicator that the coefficient differs from 0, since higher absolute t-values give greater confidence. The t-values are 7.24 (age), 23.94 (experience) and 53.19 (constant).
e) p-values (in the P>|t| column)
- The data above showcases a confident rejection of the null hypothesis that each coefficient is equal to 0.
f) 95% confidence intervals
- These intervals give the range within which the true population value of each coefficient lies with 95% confidence.
Age: .0555472 to .0967696
Experience: .2459602 to .2898193
Constant: 15.52841 to 16.71656
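The standard errors, t statistics and confidence intervals are tied together, which can be sketched as follows. Two assumptions are made here: the large-sample critical value 1.96 is used because the sample size is not given, and each interval is matched to a variable by which reported coefficient falls inside it.

```python
# 95% interval bounds as listed above; the variable labels are an assumption.
intervals = {
    "age": (0.0555472, 0.0967696),
    "experience": (0.2459602, 0.2898193),
    "constant": (15.52841, 16.71656),
}
Z_95 = 1.96  # assumed large-sample critical value for a 95% interval

def summarize(lower, upper, crit=Z_95):
    """Recover the estimate, standard error and t statistic from an interval."""
    estimate = (lower + upper) / 2           # interval is centred on the estimate
    std_err = (upper - lower) / (2 * crit)   # half-width = crit * standard error
    return estimate, std_err, estimate / std_err

results = {name: summarize(lo, hi) for name, (lo, hi) in intervals.items()}
for name, (est, se, t) in results.items():
    print(f"{name}: estimate={est:.4f}, se={se:.4f}, t={t:.2f}")
```

Under these assumptions the recovered t statistics come out close to the reported 7.24, 23.94 and 53.19, and none of the intervals contains 0, which is consistent with rejecting the null hypothesis for each coefficient.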