ST2023 EC220 Introduction to Econometrics
Exam Solutions (Summer 2023)
London School of Economics
Section A
(Answer all questions. This section carries … of the overall mark.)

Question 1 [… marks]
Mining is dangerous. We are interested in understanding the relationship between the price of minerals and the number of mining accidents. There are two hypothesized mechanisms whereby an increase in the price of a mineral may affect mining safety: (1) safety may improve because mines have more revenue and are able to put more funds into safety; (2) the opportunity cost of safety is higher, and mines may use more dangerous mining practices because they want to maximize production when prices are high. We are curious about which effect is dominant. We gather data from $N$ mines. We have data on $y_i$, the number of accidents in the mine, $price_i$, the local market price of the mineral that is mined, and $labour_i$, the number of labourers hired by the mine.
We consider a regression model
$$\ln(y_i) = \beta_0 + \beta_1 \ln(price_i) + \beta_2\, labour_i + \beta_3\, labour_i^2 + u_i \qquad (1.1)$$
After estimating regression (1.1) with OLS, we obtain estimates of the coefficients, shown in the Stata output below.
(a) Perform a t-test with the null hypothesis that the approximate elasticity of accidents with respect to price is 1, holding fixed labour and labour squared.
[… marks]
Solution: Because $price_i$ and $y_i$ are both in logs, $\beta_1$ is approximately interpreted as the elasticity of accidents with respect to price, holding fixed labour and labour squared. Thus we are performing a t-test with $H_0: \beta_1 = 1$. The t-statistic is:
$$t = \frac{0.5751439 - 1}{0.3267842} = -1.300$$
© LSE ST2023/EC220
$$|-1.300| < 1.96$$
We fail to reject $H_0$ at the 5% significance level.
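The arithmetic of this test can be sketched in a few lines; the coefficient and standard error are the values quoted above, and 1.96 is the two-sided 5% normal critical value.

```python
# Sketch of the t-test in part (a): coefficient and standard error are the
# values quoted above; 1.96 is the two-sided 5% critical value.
coef, se, h0 = 0.5751439, 0.3267842, 1.0

t_stat = (coef - h0) / se      # t-statistic for H0: beta1 = 1
reject = abs(t_stat) > 1.96    # two-sided test at the 5% level

print(round(t_stat, 3), reject)   # -1.3 False -> fail to reject H0
```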
(b) At present, the regression equation allows for a quadratic relationship between $labour_i$ and $\ln(y_i)$. We want to change this. How could we use dummy variables to write a regression equation that allows for an effect of $labour_i$ on $\ln(y_i)$ that differs in three ranges: (1) $labour_i$ taking values from 0 to 99, (2) $labour_i$ taking values 100 to 999, and (3) $labour_i$ taking values 1000 or greater?
(In case it is not clear, the regression you write should allow for the effect of $labour_i$ on $\ln(y_i)$ to change depending on which category $labour_i$ is in, but, within a category, changing $labour_i$ doesn't affect $y_i$. E.g. $labour_i$ being 100 rather than 99 should have an effect, but $labour_i$ being 101 rather than 100 should not have an effect).
[6 marks]
Solution: Let us define dummy variables $x_{1i} = 1[labour_i \le 99]$, $x_{2i} = 1[100 \le labour_i \le 999]$, and $x_{3i} = 1[1000 \le labour_i]$, which equal one when $labour_i$ takes a value in the range shown in brackets. One possible regression equation is:
$$\ln(y_i) = \beta_0 + \beta_1 \ln(price_i) + \beta_2 x_{1i} + \beta_3 x_{2i} + u_i$$
Alternatively, one of $x_1$ or $x_2$ could be excluded and $x_3$ could be included. Or instead, the constant could be excluded and $x_3$ included.
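A minimal sketch of the dummy construction, using made-up labour values:

```python
# Sketch of the dummy variables in part (b); the labour values are made up.
labour = [12, 99, 100, 500, 999, 1000, 4000]

x1 = [int(l <= 99) for l in labour]          # labour in 0-99
x2 = [int(100 <= l <= 999) for l in labour]  # labour in 100-999
x3 = [int(l >= 1000) for l in labour]        # labour 1000 or greater

# The three categories are exhaustive and mutually exclusive.
print(all(a + b + c == 1 for a, b, c in zip(x1, x2, x3)))   # True
```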
For the rest of the exam, ignore any change to the regression equation you may have written in (b).
(c) Our data on price is the local market price at the time accidents are measured. Mines often have contracts for prices that are signed many months in advance. Thus, the market price recorded in our data may not be the price the mine receives. Describe how this could affect estimates. Regardless of whether you believe this will cause bias, also describe any other potential source of bias that may be a concern in regression (1.1). Discuss the direction of the bias and the intuition of each.
[8 marks]
Solution: We observe $\ln(price_i) = \ln(price_i)^* + w_i$, where $\ln(price_i)^*$ is the true natural log of the price received by the mine and $w_i$ is measurement error. Classical measurement error is when $w_i$ is uncorrelated with all regressors and with the regression error. In this case, we have attenuation bias, and our estimate is biased toward 0: $|\hat\beta_1| < |\beta_1|$. Because our estimate is positive, the bias is negative.
If the measurement error is not classical, then measurement error will in general cause bias; however, the direction of the bias will depend on the form of the measurement error.
Are there any other sources of bias to be concerned about? Reverse causality is probably not a concern. The number of accidents in a single mine should not be causing the market price of a mineral to change. A student could conceivably argue that accidents reduce production, lower supply, and affect the market price. This is likely a minimal effect, but it would cause positive bias because the outcome increasing (accidents going up) is associated with lower supply, and thus higher prices (treatment going up). Credit is given if the discussion is logical, but students do not need to claim there is bias due to reverse causality. There may be other confounders. For example, the presence of toxic fumes or the use of explosive equipment makes a mine less safe, and causes the number of accidents to go up. Mines that use such explosive equipment might also be mining in a region where the geology requires this, and the minerals in such a region may be more (conceivably less) expensive. In this case, the confounder causes both $\ln(price_i)$ and $y_i$ to increase, causing positive bias.
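The attenuation result can be illustrated with a small simulation. All parameter values below are illustrative; with equal variances for the true regressor and the measurement error, the attenuation factor is 1/2.

```python
# Simulation of attenuation bias under classical measurement error.
# All parameter values are illustrative, not taken from the exam's data.
import numpy as np

rng = np.random.default_rng(0)
n, beta1 = 100_000, 0.5

x_true = rng.normal(0.0, 1.0, n)             # true ln(price*)
w = rng.normal(0.0, 1.0, n)                  # classical measurement error
x_obs = x_true + w                           # observed ln(price)
y = beta1 * x_true + rng.normal(0.0, 1.0, n)

# OLS slope of y on the mismeasured regressor
b1_hat = np.cov(x_obs, y)[0, 1] / np.var(x_obs)

# Theoretical attenuation factor: Var(x*)/(Var(x*)+Var(w)) = 1/2 here,
# so the slope estimate sits near 0.25, well below the true 0.5.
print(b1_hat)
```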
(d) Our price variable is the local market price for a mine. For each mineral, there are multiple mines in our data mining that mineral. We create a new variable, $averageother_i$, which is the average of the log price for all other mines that mine the same mineral as $i$. Describe intuitively and mathematically how $averageother_i$ could be used as an instrument for $\ln(price_i)$. What biases discussed in question (c) might it solve?
[… marks]
Solution: Intuition: The goal of an IV is to create "good" variation in the regressor such that we overcome bias. $averageother_i$ is probably correlated with $\ln(price_i)$ because prices of the same mineral are probably correlated across geographic markets; thus the instrument will create variation, but is it good variation? The average price in other markets is probably not correlated with measurement error (the difference between the market price and the contract price for a single mine), thus our instrument creates variation which is plausibly uncorrelated with measurement error, and will help overcome that bias.
The other major concern about bias was confounders: that factors (such as using explosive equipment) might be correlated with price and also accidents. The instrument probably does not help us overcome this bias. If one mine uses explosive equipment to mine a mineral, other mines that mine the same mineral probably also use explosive equipment, which is correlated with high prices. Thus, the variation created by the instrument would still be correlated with unsafe mining practices, and the bias would not be overcome.
Mathematically: The first stage regression is:
$$\ln(price_i) = \delta_0 + \delta_1\, averageother_i + \delta_2\, labour_i + \delta_3\, labour_i^2 + v_i$$
We create predicted values from this, $\widehat{\ln(price_i)}$. This represents the portion of $\ln(price_i)$ that is predicted by our instrument (hopefully having "good" variation that doesn't cause bias). We then use these predicted values instead of the original data for $\ln(price_i)$ in the original equation of interest, known as the second stage:
$$\ln(y_i) = \beta_0 + \beta_1 \widehat{\ln(price_i)} + \beta_2\, labour_i + \beta_3\, labour_i^2 + u_i$$
If a student wrote about bias due to reverse causality in (c), then in the answer to (d) the student should comment on this as well. Such bias is probably overcome. Accidents in a mine are probably not correlated with the price of the mineral in other regions, so the instrument creates variation that is exogenous to the variation that causes bias. A student could argue that accidents in a mine affect prices in other markets, but I think this would be hard to justify.
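The two stages can be hand-rolled on simulated data. This sketch omits the labour controls for brevity, and the way $averageother$ is generated (a noisy signal of a common mineral-level price component) is an illustrative assumption, not the exam's data.

```python
# Hand-rolled 2SLS mirroring the two stages in part (d), on simulated data.
# The labour controls are omitted for brevity; all numbers are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, beta1 = 50_000, 0.5

common = rng.normal(0, 1, n)                     # mineral-level price component
x_true = common + rng.normal(0, 0.3, n)          # true ln(price*)
averageother = common + rng.normal(0, 0.3, n)    # instrument: other mines' prices
x_obs = x_true + rng.normal(0, 1, n)             # mismeasured ln(price)
y = 1.0 + beta1 * x_true + rng.normal(0, 1, n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# First stage: regress the observed price on a constant and the instrument
Z = np.column_stack([np.ones(n), averageother])
x_hat = Z @ ols(Z, x_obs)

# Second stage: regress y on a constant and the fitted prices
beta_iv = ols(np.column_stack([np.ones(n), x_hat]), y)[1]

# Naive OLS on the mismeasured price is attenuated; IV sits near 0.5
beta_ols = ols(np.column_stack([np.ones(n), x_obs]), y)[1]
print(beta_ols, beta_iv)
```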
(e)
What assumptions are needed for the IV approach in (d) to give causal estimates? Discuss if
they seem valid in context. Which assumptions can be tested (if any)?
[8 marks]
Solution: Relevance and Exogeneity (which comprises "as good as randomly assigned" and "exclusion").
Relevance: $averageother_i$ is correlated with $\ln(price_i)$. This is certainly true because the price of a mineral will be correlated across geographic markets. Relevance can be tested by evaluating if the estimate for $\delta_1$ is significant in the first stage regression.
Exogeneity: $Cov(u_i, averageother_i) = 0$. The instrument is uncorrelated with the error term in the regression of interest.
"Exclusion" – the only channel whereby $averageother_i$ affects $y_i$ is through $\ln(price_i)$. This might fail if other regions having higher (or lower) prices causes a mine to more aggressively produce, leading to more accidents. This seems unlikely, but should be considered.
"As good as randomly assigned" – $averageother_i$ is uncorrelated with all unobserved determinants of $y_i$. The concern regarding this was described in the answer to (d) regarding safety (use of explosives) when mining certain minerals.
Students should describe potential concerns with each subassumption, but do not need to describe these specific potential concerns.
Exogeneity cannot be tested because the error term, $u_i$, is not observable.
Section B
(Answer all questions. This section carries … of the overall mark.)

Question 2 [… marks]
(a) Consider a multiple regression model with $k+1$ regressors (including intercept):
MLR.1 The population model is $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$.
MLR.2 We have a random sample, $\{(y_i, x_{i1}, \dots, x_{ik}) : i = 1, \dots, n\}$, from the model.
MLR.3 We have no perfect collinearity among any of the regressors.
MLR.Wild The regressors are "unrelated" to the error term (in a vague unspecified sense).
MLR.5 The error term $u$ is such that $Var(u \mid x_1, \dots, x_k) = Var(u) = \sigma^2 > 0$.
(i) State the first-order conditions for deriving ordinary least squares (OLS) estimators of the $\beta$ parameters. Hence, or otherwise, compare and contrast the OLS approach for estimation of the $\beta$ parameters with the method of moments (MM) approach.
[… marks]
Solution: (rescale marks from … to …)
$$\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \cdots - \hat\beta_k x_{ik}) = 0$$
$$\sum_{i=1}^{n} x_{i1}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \cdots - \hat\beta_k x_{ik}) = 0$$
$$\vdots$$
$$\sum_{i=1}^{n} x_{ik}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \cdots - \hat\beta_k x_{ik}) = 0$$
(1 mark)
For OLS, we choose $\hat\beta_0, \hat\beta_1, \dots, \hat\beta_k$ to minimise
$$\sum_{i=1}^{n} (y_i - b_0 - b_1 x_{i1} - \cdots - b_k x_{ik})^2$$
with respect to $b_0, b_1, \dots, b_k$. (1 mark)
On the other hand, in the MM approach, (i) we make assumptions about population moments (this yields equations involving our unknown parameters); (ii) we estimate population moments with their sample counterparts; (iii) we then solve the equations to obtain estimators for our parameters. (1 mark)
Within the given regression setting, the MM approach yields the same set of $k+1$ moment equations as the system of $k+1$ FOCs from the LS approach. Hence the estimators will be equivalent. (1 mark)
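The FOCs say the residual vector is orthogonal to the constant and to every regressor; a quick numerical check on made-up data:

```python
# Numerical check of the OLS first-order conditions on made-up data:
# the residuals are orthogonal to the constant and to every regressor.
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, -0.2, 0.3]) + rng.normal(size=n)

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat

foc = X.T @ resid        # one FOC per column of X; all zero up to rounding
print(np.allclose(foc, 0.0, atol=1e-8))   # True
```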
(ii) Explain how the OLS estimator of $\beta_1$, say $\hat\beta_1$, can be obtained from a simple regression of $y$ on some suitably-constructed univariate regressor rather than a multiple regression of $y$ on an intercept and the full set of regressors $x_1, x_2, \dots, x_k$.
[… marks]
Solution: (rescale marks from … to …) We have the so-called regression anatomy formula to express $\hat\beta_1$ as:
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} \hat r_{i1} y_i}{\sum_{i=1}^{n} \hat r_{i1}^2}$$
where $\hat r_{i1}$ is the OLS residual from a regression of $x_{i1}$ on all $k$ other regressors. $\hat\beta_1$ is interpreted as the estimated coefficient from a regression of $y$ on $x_1$ after the effects of all $k$ other regressors are partialled out.
(1 mark for the formula, 1 mark for explaining $\hat r_{i1}$ clearly, 1 mark for giving the partialling-out interpretation.)
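The regression anatomy formula can be verified numerically on made-up data: the coefficient on $x_1$ from the full regression matches the ratio built from the residualised regressor.

```python
# Check of the regression anatomy formula on made-up data.
import numpy as np

rng = np.random.default_rng(3)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)     # deliberately correlated with x1
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

X_full = np.column_stack([np.ones(n), x1, x2])
b_full = np.linalg.lstsq(X_full, y, rcond=None)[0][1]   # coefficient on x1

# Residual of x1 from a regression on the other regressors (constant, x2)
X_other = np.column_stack([np.ones(n), x2])
r1 = x1 - X_other @ np.linalg.lstsq(X_other, x1, rcond=None)[0]

b_anatomy = (r1 @ y) / (r1 @ r1)
print(np.isclose(b_full, b_anatomy))   # True
```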
For simplicity, let us now consider the simple regression model obtained by setting $k = 1$ in MLR.1–MLR.5 above.
(iii) Derive the sampling error of $\hat\beta_1$, providing full step-by-step algebraic detail.
[… marks]
Solution: (rescale marks from … to …) [So long as it is correct and detailed, a full-mark answer does not have to match the exact structure presented below.]
We drop the second subscript on $x_{i1}$ since there is only one non-constant regressor.
Step 1: Write down the formula for $\hat\beta_1$. It is convenient to use
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar x)y_i}{\sum_{i=1}^{n}(x_i - \bar x)^2}$$
which is one of several equivalent forms. Let $SST_x = \sum_{i=1}^{n}(x_i - \bar x)^2$ and write
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar x)y_i}{SST_x}$$
Existence of $\hat\beta_1$ is guaranteed by MLR.3. (1 mark)
Step 2: Replace $y_i$ with $\beta_0 + \beta_1 x_i + u_i$ (which uses MLR.1–2). The numerator of $\hat\beta_1$ becomes
$$\sum_{i=1}^{n}(x_i - \bar x)y_i = \sum_{i=1}^{n}(x_i - \bar x)(\beta_0 + \beta_1 x_i + u_i) = \beta_0\sum_{i=1}^{n}(x_i - \bar x) + \beta_1\sum_{i=1}^{n}(x_i - \bar x)x_i + \sum_{i=1}^{n}(x_i - \bar x)u_i = \beta_1 SST_x + \sum_{i=1}^{n}(x_i - \bar x)u_i$$
by $\sum_{i=1}^{n}(x_i - \bar x) = 0$ and $\sum_{i=1}^{n}(x_i - \bar x)x_i = \sum_{i=1}^{n}(x_i - \bar x)^2$. (… marks)
Step 3: Decompose our estimator as follows. We get
$$\hat\beta_1 = \frac{\beta_1 SST_x + \sum_{i=1}^{n}(x_i - \bar x)u_i}{SST_x} = \beta_1 + \frac{\sum_{i=1}^{n}(x_i - \bar x)u_i}{SST_x}$$
Now define
$$w_i = \frac{x_i - \bar x}{SST_x}$$
Then we have
$$\hat\beta_1 = \beta_1 + \sum_{i=1}^{n} w_i u_i$$
The sampling error $\hat\beta_1 - \beta_1$ is thus written as a linear function of the $u_i$'s; the $w_i$'s are all functions of $X = \{x_1, \dots, x_n\}$. (1 mark)
(iv) Propose the weakest form of MLR.Wild that would be needed to establish consistency of $\hat\beta_1$ for $\beta_1$, and call it MLR.Weak. Prove consistency under MLR.Weak.
[… marks]
Solution: (rescale marks from 6 to …)
We drop the second subscript on $x_{i1}$ since there is only one non-constant regressor.
Our assumption MLR.Weak is: (i) $E(u) = 0$, and (ii) $E(xu) = 0$. (… marks)
The OLS estimator is written as:
$$\hat\beta_1 = \beta_1 + \frac{n^{-1}\sum_{i=1}^{n}(x_i - \bar x)u_i}{n^{-1}\sum_{i=1}^{n}(x_i - \bar x)^2}$$
Look at the numerator of the sampling error term:
$$n^{-1}\sum_{i=1}^{n}(x_i - \bar x)u_i = \frac{1}{n}\sum_{i=1}^{n} x_i u_i - \bar x\,\bar u$$
By the LLN,
$$\operatorname{plim}(\bar x) = E(x), \qquad \operatorname{plim}(\bar u) = E(u) = 0, \qquad \operatorname{plim}\Big(\frac{1}{n}\sum_{i=1}^{n} x_i u_i\Big) = E(xu) = 0$$
(… marks)
By properties of the plim operator,
$$\operatorname{plim}(\hat\beta_1) = \beta_1 + \frac{\operatorname{plim}\big(\frac{1}{n}\sum_{i=1}^{n} x_i u_i\big) - \operatorname{plim}(\bar x\,\bar u)}{\operatorname{plim}\big(\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar x)^2\big)} = \beta_1 + \frac{E(xu) - E(x)E(u)}{Var(x)} = \beta_1$$
(… marks)
(v) Propose how to strengthen MLR.Weak to establish unbiasedness of $\hat\beta_1$ for $\beta_1$, and call the new assumption MLR.Strong. Prove unbiasedness under MLR.Strong.
[… marks]
Solution: (rescale marks from 6 to …)
We drop the second subscript on $x_{i1}$ since there is only one non-constant regressor.
Our assumption MLR.Strong is: $E(u_i \mid x_i) = 0$. Together with MLR.2, this will suffice for unbiasedness. (… marks)
We have
$$E(\hat\beta_1 \mid X) = E\Big(\beta_1 + \sum_{i=1}^{n} w_i u_i \,\Big|\, X\Big) = \beta_1 + \sum_{i=1}^{n} E(w_i u_i \mid X) = \beta_1$$
where we use linearity of the expectations operator. (… marks)
Thus, for any $X$, we get the same conditional expectation $E(\hat\beta_1 \mid X) = \beta_1$, and the law of iterated expectations implies $E(\hat\beta_1) = \beta_1$. (… marks)
(b) Suppose we have an "original" fitted regression given by
$$\log Y_i = b_1 + b_2 \log X_i + e_i,$$
where $b_1$ and $b_2$ are coefficients estimated via OLS, and $e_i$ is the OLS residual, for $i = 1, \dots, n$.
Now suppose units of measurement change so that $X_i^* = \lambda X_i$, where $\lambda$ is some non-zero finite constant, and that $\log Y_i$ is regressed on a constant and $\log X_i^*$. Our "new" fitted regression is
$$\log Y_i = b_1^* + b_2^* \log X_i^* + e_i^*,$$
where $b_1^*$ and $b_2^*$ are new coefficients estimated via OLS, and $e_i^*$ is the new OLS residual.
(i) Derive the relationship between the original and new coefficients estimated via OLS. (Hint: Start by showing that $\log X_i^* - \overline{\log X^*} = \log X_i - \overline{\log X}$, where a bar denotes the sample mean.)
[… marks]
Solution: (rescale marks from … to …) Following the hint, we have
$$\log X_i^* - \overline{\log X^*} = \log(\lambda X_i) - \frac{1}{n}\sum_{j=1}^{n}\log X_j^* = \log(\lambda X_i) - \frac{1}{n}\sum_{j=1}^{n}\log(\lambda X_j) = \log\lambda + \log X_i - \frac{1}{n}\sum_{j=1}^{n}(\log\lambda + \log X_j) = \log X_i - \frac{1}{n}\sum_{j=1}^{n}\log X_j = \log X_i - \overline{\log X}$$
Hence $b_2^* = b_2$. (… marks)
We will also need $b_1^*$:
$$b_1^* = \overline{\log Y} - b_2^*\,\overline{\log X^*} = \overline{\log Y} - b_2\frac{1}{n}\sum_{j=1}^{n}(\log\lambda + \log X_j) = \overline{\log Y} - b_2\log\lambda - b_2\,\overline{\log X} = b_1 - b_2\log\lambda.$$
(… marks)
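Both relationships can be confirmed numerically (the data and the value of $\lambda$ below are made up):

```python
# Check of part (b)(i) on made-up data: rescaling X by lambda leaves the
# slope unchanged and shifts the intercept by -b2*log(lambda).
import numpy as np

rng = np.random.default_rng(4)
n, lam = 300, 2.5
X = rng.uniform(1.0, 10.0, n)
logY = 0.7 + 1.3 * np.log(X) + rng.normal(0.0, 0.1, n)

def fit(logx):
    A = np.column_stack([np.ones(n), logx])
    return np.linalg.lstsq(A, logY, rcond=None)[0]

b1, b2 = fit(np.log(X))            # original regression
b1s, b2s = fit(np.log(lam * X))    # rescaled regression

print(np.isclose(b2s, b2), np.isclose(b1s, b1 - b2 * np.log(lam)))  # True True
```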
(ii) Derive the relationship between the original and new OLS residuals.
[1 mark]
Solution: (rescale marks from … to …) The residual $e_i^*$ is given by
$$e_i^* = \log Y_i - b_1^* - b_2^*\log X_i^* = \log Y_i - (b_1 - b_2\log\lambda) - b_2(\log X_i + \log\lambda) = e_i.$$
(iii) Explain how the $t$-statistics (for the non-intercept regressor only) and the $R^2$ for the two regressions, original and new, are related.
[… marks]
Solution: (rescale marks from … to …) Since the residuals are unchanged, the estimator of the variance of the disturbance term is unchanged, and so the standard error of $b_2^*$ is the same as that for $b_2$. As a consequence, the $t$ statistic must be the same.
$R^2$ must also be the same:
$$R^{2*} = 1 - \frac{\sum e_i^{*2}}{\sum(\log Y_i - \overline{\log Y})^2} = 1 - \frac{\sum e_i^2}{\sum(\log Y_i - \overline{\log Y})^2} = R^2.$$
[Students don't have to give the formula for $R^2$ for full marks. Intuition is enough.]
Question 3 [… marks]
(a)
Suppose that you are hired as class teacher for an undergraduate econometrics course and
that your students ask you the questions listed below. Please provide suitable answers.
(i) "Can you provide a real (or hypothetical) example of heteroscedasticity?"
[… marks]
Solution: (rescale marks from … to …) Any sensible example that reveals an understanding of heteroscedasticity should earn full marks. It can be algebra and words, or a diagram and words, or even just well-explained words alone.
(ii) "What is the Gauss-Markov theorem? Why is it useful?"
[… marks]
Solution: (rescale marks from … to …) Using the language of Question 2(a): under MLR.1–MLR.3, MLR.Strong and MLR.5, ... (1 mark)
the OLS estimator is the Best Linear Unbiased Estimator (BLUE). (1 mark)
If we take any linear unbiased estimator $\tilde\beta_1$, then it always holds that
$$Var(\hat\beta_1 \mid X) \le Var(\tilde\beta_1 \mid X)$$
(1 mark)
Implication: if we are interested in linear unbiased estimators, then we need look no further than OLS. (1 mark)
(iii) "Say I estimate a simple regression model via OLS. How do I prove that fitted values of the dependent variable are uncorrelated with the OLS residuals?"
[… marks]
Solution: (rescale marks from … to …) The numerator of the sample correlation coefficient for $\hat Y$ and $e$ can be decomposed as follows, using the fact that $\bar e = 0$:
$$\frac{1}{n}\sum_{i=1}^{n}\big(\hat Y_i - \overline{\hat Y}\big)(e_i - \bar e) = \frac{1}{n}\sum_{i=1}^{n}\big([b_1 + b_2 X_i] - [b_1 + b_2\bar X]\big)e_i = \frac{1}{n}\,b_2\sum_{i=1}^{n}(X_i - \bar X)e_i = 0$$
by the OLS first-order conditions. Hence the correlation is zero.
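The zero covariance between fitted values and residuals can be checked numerically on made-up data:

```python
# Check of (a)(iii) on made-up data: OLS fitted values and residuals have
# zero sample covariance.
import numpy as np

rng = np.random.default_rng(5)
n = 400
X = rng.normal(size=n)
Y = 2.0 + 0.8 * X + rng.normal(size=n)

A = np.column_stack([np.ones(n), X])
b = np.linalg.lstsq(A, Y, rcond=None)[0]
fitted = A @ b
resid = Y - fitted

cov_fe = np.mean((fitted - fitted.mean()) * (resid - resid.mean()))
print(np.isclose(cov_fe, 0.0, atol=1e-10))   # True
```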
(iv) "I want to test for joint significance of all regressors excluding the intercept in a multiple regression model. What assumptions will I need? What test statistic shall I use? What is the distribution of the test statistic under the null? What rejection rule should I use?"
[… marks]
Solution: (rescale marks from … to …) Using the language of Wooldridge (and lecture notes), we will need assumptions MLR1 (linearity), MLR2 (random sampling), MLR3 (no perfect collinearity), MLR4 (strict exogeneity), MLR5 (homoscedasticity), and MLR6 (normality). (1 mark)
The test statistic is
$$V = \frac{(SSR_0 - SSR_1)/k}{SSR_1/(n - k - 1)},$$
where $n$ is the number of observations, $SSR_0$ and $SSR_1$ are the restricted and unrestricted residual sums of squares respectively, and $k$ is the number of non-intercept regressors. (… marks)
The distribution under the null is $F(k, n - k - 1)$. (1 mark)
We reject if and only if $V > V_{1-\alpha}$, where $V_{1-\alpha}$ is the $100(1-\alpha)$-th percentile of the $F(k, n - k - 1)$ distribution, and $\alpha \in (0,1)$ is a given significance level. (1 mark)
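The statistic can be computed directly from the restricted and unrestricted SSRs; the data below are made up, with $k = 2$ non-intercept regressors and a genuinely non-zero slope so the null should be rejected.

```python
# Computing the joint-significance statistic V on made-up data with k = 2
# non-intercept regressors, then comparing it with the F critical value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n, k = 120, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.8, 0.0]) + rng.normal(size=n)

def ssr(A):
    b = np.linalg.lstsq(A, y, rcond=None)[0]
    e = y - A @ b
    return e @ e

ssr1 = ssr(X)          # unrestricted model
ssr0 = ssr(X[:, :1])   # restricted model: intercept only

V = ((ssr0 - ssr1) / k) / (ssr1 / (n - k - 1))
crit = stats.f.ppf(0.95, k, n - k - 1)   # 5% critical value of F(k, n-k-1)
print(V > crit)
```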
(b) Let us consider the following system of simultaneous equations
$$y_1 = \alpha_1 y_2 + \alpha_2 x_1 + \alpha_3 x_2 + u_1 \qquad (3.1)$$
$$y_2 = \beta_1 y_1 + \beta_2 x_1 + \beta_3 x_3 + \beta_4 x_4 + u_2, \qquad (3.2)$$
where $(u_1, u_2)$ are i.i.d. errors with zero mean, $Var(u_1) = \sigma_1^2$, $Var(u_2) = \sigma_2^2$, and $Cov(u_1, u_2) = \sigma_{12}$. The endogenous variables are $(y_1, y_2)$ and $(x_1, x_2, x_3, x_4)$ are exogenous variables.
(i) Explain the simultaneity issue associated with the above simultaneous equation model (SEM) intuitively (no derivations expected). Outline a real-world scenario for which the given model may be useful.
[… marks]
Solution: (rescale marks from … to …) The problem here is reverse causality. Both equations have their own ceteris paribus interpretation, but the simultaneity gives rise to a correlation between the two endogenous variables, which are jointly determined in this system. As a result, when we estimate these structural equations in isolation by OLS, we get inconsistency. One example (discussed) is police and crime.
(ii) You are told that in the above SEM there exists a perfect linear relation between $x_2$ and $x_3$; in particular, $x_3 = 2x_2$.
Obtain the reduced form equations for $y_1$ and $y_2$, recognising this exact linear relation between $x_2$ and $x_3$.
[… marks]
Solution: (rescale marks from 6 to …) Let us first rewrite the SEM accounting for this perfect linear relation:
$$y_1 = \alpha_1 y_2 + \alpha_2 x_1 + \alpha_3 x_2 + u_1 \qquad (3.1')$$
$$y_2 = \beta_1 y_1 + \beta_2 x_1 + 2\beta_3 x_2 + \beta_4 x_4 + u_2 \qquad (3.2')$$
Substituting (3.2') into (3.1'), we have:
$$y_1 = \alpha_1(\beta_1 y_1 + \beta_2 x_1 + 2\beta_3 x_2 + \beta_4 x_4 + u_2) + \alpha_2 x_1 + \alpha_3 x_2 + u_1$$
$$(1 - \alpha_1\beta_1)y_1 = (\alpha_1\beta_2 + \alpha_2)x_1 + (\alpha_3 + 2\alpha_1\beta_3)x_2 + \alpha_1\beta_4 x_4 + u_1 + \alpha_1 u_2$$
$$y_1 = \frac{(\alpha_1\beta_2 + \alpha_2)x_1 + (\alpha_3 + 2\alpha_1\beta_3)x_2 + \alpha_1\beta_4 x_4 + u_1 + \alpha_1 u_2}{1 - \alpha_1\beta_1},$$
which requires us to assume $\alpha_1\beta_1 \neq 1$. We write this reduced form as
$$y_1 = \pi_{11} x_1 + \pi_{12} x_2 + \pi_{13} x_4 + v_1$$
Similarly, for $y_2$, substituting (3.1') into (3.2'), we obtain:
$$y_2 = \beta_1(\alpha_1 y_2 + \alpha_2 x_1 + \alpha_3 x_2 + u_1) + \beta_2 x_1 + 2\beta_3 x_2 + \beta_4 x_4 + u_2$$
$$y_2 = \frac{(\beta_1\alpha_2 + \beta_2)x_1 + (\beta_1\alpha_3 + 2\beta_3)x_2 + \beta_4 x_4 + u_2 + \beta_1 u_1}{1 - \alpha_1\beta_1}.$$
We write this reduced form as
$$y_2 = \pi_{21} x_1 + \pi_{22} x_2 + \pi_{23} x_4 + v_2.$$
(iii) Use the result obtained in (b)(ii) to discuss the identification of the two structural equations. Clearly state whether the equations are overidentified, exactly (just) identified, or underidentified.
Hint: Your answer is expected to discuss what conditions need to be satisfied to ensure that we can use the observable data to estimate the parameters consistently.
[… marks]
Solution: (rescale marks from … to …) Equation (3.1) is just identified, as we have $x_4$ as the exogenous variable to work as an instrument for the only endogenous variable ($y_2$). $x_2$ and $x_3$ cannot be used as instruments as they appear in both equations. The conditions are: $\beta_4 \neq 0$ (relevance) and the exogeneity of $x_4$ holds ($Cov(x_4, u_1) = 0$).
Equation (3.2) is underidentified because we cannot use any of the regressors as an instrument to identify $\beta_1$, owing to the multicollinearity between $x_2$ and $x_3$.
(iv) Briefly comment on the importance of recognizing the existence of exact linear relations among regressors in simultaneous equation models.
[1 mark]
Solution: (rescale from … marks to …) Recognizing the existence of exact linear relationships is important: without it, we would have claimed that (3.1) was overidentified and (3.2) was exactly identified. (This is related to the rank condition for identification [students are not expected to make this statement].)
Question 4 [… marks]
Consider the following linear regression model
$$y_t = \beta_0 + \rho y_{t-1} + \beta_1 z_{t-1} + \beta_2 z_{t-2} + u_t, \quad |\rho| < 1, \quad t = 1, \dots, n \qquad (4.1)$$
where $\{y_t\}$ and $\{z_t\}$ are stationary and weakly dependent processes. We have a sample of $n$ observations we can use to estimate this equation, which we label $t = 1, \dots, n$ for convenience (this means we assume that $y_0$, $z_0$, $z_{-1}$ are available). There is no perfect multicollinearity in the regressors.
(a) Stationarity and weak dependence are important concepts that will allow us to apply the usual laws of large numbers and CLT. Explain what the concepts of stationarity and weak dependence mean.
[… marks]
Solution: (rescale marks from … to …) Textbook discussion of stationarity and weak dependence (happy to accept students discussing the concept of covariance stationarity as long as it is clearly indicated). (1 mark each)
The importance of these concepts is that they ensure that we can apply the usual laws of large numbers and CLT, which allows us to use asymptotic inference as in the cross-sectional setting (no need for normality, or strict exogeneity, which is important in time series analysis). (… marks)
(b) What (reasonable) assumptions do you need to make about the errors $u_t$ to ensure that OLS on equation (4.1) provides consistent estimators of the parameters $(\beta_0, \beta_1, \beta_2, \rho)$? Provide a clear explanation of the assumptions you make.
[… marks]
Solution: (rescale marks from … to …) To ensure consistency, we need to assume contemporaneous exogeneity:
$$E(u_t \mid y_{t-1}, z_{t-1}, z_{t-2}) = 0$$
(… marks)
This assumption will ensure $Cov(u_t, y_{t-1}) = 0$, $Cov(u_t, z_{t-1}) = 0$ and $Cov(u_t, z_{t-2}) = 0$, as needed for consistency of OLS (can be shown using the LIE), so that the errors and the regressors (the way they appear in the regression) are uncorrelated. (1 mark)
These assumptions are weaker than strict exogeneity, an assumption that we cannot make given that we have $y_{t-1}$ as a regressor anyway. This restriction does permit, e.g., $Cov(u_t, z_{t+1}) \neq 0$, so that future regressors may respond to shocks, which is important in time series analysis. (1 mark)
Given the assumptions on the stochastic processes $\{y_t\}$ and $\{z_t\}$, $u_t$ will be stationary and weakly dependent automatically.
(c) Discuss what the short run and long run effects of $z$ on $y$ are in terms of the parameters and what they tell you. Under the assumptions given in (b), discuss how you can obtain a consistent estimator of the long run impact. Prove your claim. In your answer you do not have to prove the consistency of the OLS estimator of $(\beta_0, \beta_1, \beta_2, \rho)$.
[… marks]
Solution: (rescale marks from 6 to …) The short run effect of $z$ on $y$ here is zero, as $z_t$ does not appear as a regressor; that is, there is no instantaneous effect of $z$ changing. (… marks)
To obtain the long run effect (which requires $|\rho| < 1$), let us define $y^*$ and $z^*$ as the equilibrium values, so that
$$y^* = \beta_0 + \rho y^* + \beta_1 z^* + \beta_2 z^*$$
The error term is taken to be its mean value of zero in the long run. Upon rewriting, we obtain:
$$y^* = \frac{\beta_0}{1-\rho} + \frac{\beta_1 + \beta_2}{1-\rho}\, z^*$$
The long run effect is $\frac{\beta_1 + \beta_2}{1-\rho}$. It shows the effect a permanent unit change in $z_t$ has on $y_t$ in the long run (after all lagged effects have been incorporated). (… marks)
To obtain a consistent estimator, we plug in the OLS estimators. Using the properties of the plim operator, we observe:
$$\operatorname{plim}\left(\frac{\hat\beta_1 + \hat\beta_2}{1 - \hat\rho}\right) = \frac{\operatorname{plim}\hat\beta_1 + \operatorname{plim}\hat\beta_2}{1 - \operatorname{plim}\hat\rho} = \frac{\beta_1 + \beta_2}{1 - \rho}$$
establishing its consistency. (… marks)
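The long run formula can be checked by iterating the difference equation with a permanent level of $z$ and no shocks; the parameter values below are made up.

```python
# Part (c) numerically: iterate y_t = b0 + rho*y_{t-1} + b1*z + b2*z with a
# permanent level of z and no shocks; the change in the steady state equals
# (b1 + b2)/(1 - rho). Parameter values are made up.
b0, rho, b1, b2 = 0.5, 0.8, 0.3, 0.1

def steady_state(z):
    y = 0.0
    for _ in range(2000):                 # iterate until convergence
        y = b0 + rho * y + b1 * z + b2 * z
    return y

long_run_effect = steady_state(1.0) - steady_state(0.0)
print(round(long_run_effect, 6))          # (0.3 + 0.1)/(1 - 0.8) = 2.0
```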
(d) You want to test the hypothesis that $u_t$ in (4.1) exhibits serial correlation. Discuss how you can test the hypothesis that there is no autocorrelation against the alternative that $u_t$ follows a stationary, weakly dependent AR(1) process, i.e.,
$$u_t = \phi u_{t-1} + v_t, \quad |\phi| < 1 \qquad (4.2)$$
where $v_t$ is an iid $(0, \sigma_v^2)$ error that is independent of $y_{t-1}, y_{t-2}, \dots$ and $z_s$ for all $s$.
[… marks]
Solution: (rescale marks from … to …) We want to test
$$H_0: \phi = 0 \quad \text{against} \quad H_1: \phi \neq 0$$
To conduct this test, we first obtain the OLS residuals:
$$\hat u_t = y_t - \hat\beta_0 - \hat\rho y_{t-1} - \hat\beta_1 z_{t-1} - \hat\beta_2 z_{t-2}$$
Next we estimate the following regression:
$$\hat u_t = \gamma_0 + \phi \hat u_{t-1} + \gamma_1 y_{t-1} + \gamma_2 z_{t-1} + \gamma_3 z_{t-2} + e_t$$
Under TS.1'–TS.3', our asymptotic t test is
$$t = \frac{\hat\phi}{se(\hat\phi)} \stackrel{a}{\sim} N(0, 1) \quad \text{under } H_0$$
(we should assume TS.4' as well if we do not use White robust standard errors). At the 5% level of significance we should reject if the realisation of our test statistic is larger (in absolute value) than 1.96, or equivalently if its p-value is smaller than 5%.
(e) Say your test in (d) does not find evidence of autocorrelation. Discuss how you can test whether $\beta_1 + \beta_2 = 0.1$ against the alternative that $\beta_1 + \beta_2 > 0.1$. In your answer discuss what regression you should estimate if you want to be able to obtain the test statistic directly using your regression output.
[… marks]
Solution: (rescale marks from … to …) Under TS.1'–TS.3' and TS.5' we can use the asymptotic t-test:
$$t = \frac{\hat\beta_1 + \hat\beta_2 - 0.1}{se(\hat\beta_1 + \hat\beta_2)} \stackrel{a}{\sim} N(0, 1) \quad \text{under } H_0.$$
(We should assume TS.4' as well if we do not use White robust standard errors.) At the 5% level of significance we should reject if the realisation of our test statistic is larger than 1.645 (one sided). (… marks)
We can obtain the test statistic directly if we estimate the following regression:
$$y_t = \beta_0 + \rho y_{t-1} + \delta z_{t-1} + \beta_2(z_{t-2} - z_{t-1}) + u_t$$
where $\delta = \beta_1 + \beta_2$. The test therefore becomes $H_0: \delta = 0.1$ against $H_1: \delta > 0.1$. Equally, we could estimate the regression
$$y_t - 0.1 z_{t-1} = \beta_0 + \rho y_{t-1} + \delta z_{t-1} + \beta_2(z_{t-2} - z_{t-1}) + u_t$$
where $\delta = \beta_1 + \beta_2 - 0.1$. The test therefore becomes $H_0: \delta = 0$ against $H_1: \delta > 0$. (… marks)
(f) Say your test in (d) detects the presence of autocorrelation. Discuss the following statement: "In view of the evidence of autocorrelation in (4.1), you will need to make use of HAC robust standard errors when conducting inference. This will render our test asymptotically valid." In your answer, clearly discuss what HAC robust standard errors are and what it means for a test to be asymptotically valid.
[… marks]
Solution: (rescale marks from … to …) The problem with this statement is that in the presence of autocorrelation in the errors, OLS on (4.1) no longer provides us with a consistent estimator, due to the fact that this creates a correlation between $y_{t-1}$ and $u_t$, i.e., $Cov(y_{t-1}, u_t) \neq 0$, a violation of TS.3'. To understand this, we observe that both $y_{t-1}$ and $u_t$ will then depend on $u_{t-1}$, and hence they will be correlated. We would not be interested in conducting inference with such an estimator, as we would not even be guaranteed that, under the null, the distribution of the test statistic is centered around zero to begin with. (… marks)
HAC robust standard errors are needed to make tests asymptotically valid in the presence of autocorrelation (and heteroskedasticity), provided the estimator itself is consistent in the first place. If the above model did not have $y_{t-1}$ as a regressor, then the presence of autocorrelation would not necessarily cause a violation of TS.3'. In that case, in order to perform our test, we would need to adjust the standard errors, as the usual standard errors assume TS.4' (contemporaneous homoskedasticity) and TS.5' (no serial correlation). (… marks)
(g) Assume now that $u_t$ in (4.1) follows an MA(1) process, i.e.,
$$u_t = v_t + \theta v_{t-1}, \qquad (4.3)$$
where $v_t$ is an iid $(0, \sigma_v^2)$ error independent of $y_{t-1}, y_{t-2}, \dots$; $v_t$ is also independent of $z_s$ for all $s$.
Describe in detail how you would proceed when you want to obtain a consistent estimator for the parameters $(\beta_0, \beta_1, \beta_2, \rho)$.
[… marks]
Solution: (rescale marks from … to …) Because of the endogeneity of $y_{t-1}$ we will need to propose an IV estimator to estimate the parameters. The instrument(s) we seek need to be: exogenous (uncorrelated with the error term), relevant (correlated with $y_{t-1}$), and without a direct effect on $y_t$ (exclusion). Let us lag (4.1) one period:
$$y_{t-1} = \beta_0 + \rho y_{t-2} + \beta_1 z_{t-2} + \beta_2 z_{t-3} + u_{t-1}$$
This shows that $z_{t-3}$ can be used as an instrument for $y_{t-1}$. In fact, we can also propose to use $y_{t-2}$ here as an instrument, as the MA(1) dependence structure ensures that $y_{t-2}$ will not exhibit any correlation with $u_t$ (exogeneity). (… marks)
We can proceed using 2SLS (or IV), where (… marks)
Step 1: Obtain the fitted values for $y_{t-1}$ from running the following regression:
$$y_{t-1} = \pi_0 + \pi_1 z_{t-1} + \pi_2 z_{t-2} + \pi_3 z_{t-3} + e_t$$
The fitted values are $\hat y_{t-1} = \hat\pi_0 + \hat\pi_1 z_{t-1} + \hat\pi_2 z_{t-2} + \hat\pi_3 z_{t-3}$.
Step 2: Run the following regression:
$$y_t = \beta_0 + \rho\,\hat y_{t-1} + \beta_1 z_{t-1} + \beta_2 z_{t-2} + v_t$$
END OF PAPER