Activity: Indexing (R)
School: University of Guelph
Course: 232
Subject: Statistics
Date: Apr 3, 2024
Type: docx
Pages: 4
Uploaded by: ConstableStar6714
Activity: Indexing
Packages used: psych
Data: GSS27.sav (SPSS)
Objective: Learn how to complete a reliability analysis when indexing multiple variables to create one (large) variable.

Overview: We learned about measures of association that measure how similar two (or more) variables are and how much these variables co-vary. At what point do similar variables actually measure the same underlying attitudes or phenomena? Sometimes, when variables measure the same or similar attitudes or behaviors, the easy thing to do is simply to drop one measure. However, many attitudes (especially constructs) like political knowledge, efficacy, compassion, and trust are multi-dimensional, or multi-faceted, and answers to multiple questions may give us a more precise (valid) and reliable (consistent) estimate of a person's opinions.

For example, the variables we will use for this analysis look at trust. Questions include: How much do you trust people in your family? How much do you trust people in your neighborhood? How much do you trust strangers? Some people will trust all of these people, but some may trust only people in their neighborhood, and not strangers. We would probably want to use these questions to differentiate people who are very trusting (of everyone) from those who are moderately trusting (of everyone, or of select people) and those who are very distrustful. To do that, we would need to combine answers to multiple questions. This matters because trust is an important component of social capital and may be closely related to high rates of political participation and engagement. Choosing just one of the above measures of trust may be inadequate for such explanatory tasks.

Combining multiple measures is a rather easy, mechanical task. You can combine just about any variables within the same dataset. This activity involves a reliability analysis that tests whether one ought to combine multiple measures.
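Mechanically, combining measures usually just means summing (or averaging) the items. A minimal sketch with made-up variable names and values (not from the GSS):

```r
# Hypothetical example: build an additive trust index from three 1-5 items.
# The variable names and values here are invented for illustration.
df <- data.frame(trust_family   = c(5, 4, 2, 5),
                 trust_neighbor = c(4, 4, 1, 5),
                 trust_stranger = c(3, 2, 1, 4))

df$trust_index <- rowSums(df)           # sum of the three items, range 3-15
df$trust_mean  <- rowMeans(df[, 1:3])   # or an average, keeping the 1-5 metric
```

Whether such a sum is a defensible single measure is exactly what the reliability analysis below evaluates.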
By calculating Cronbach's alpha, the computer tells us whether the variables ought to be combined, based on the pattern of the observations. If Cronbach's alpha is greater than 0.9, the measure is excellent. If the alpha is greater than 0.8, the measure is good and the variables can be combined. Greater than 0.7, and some scholars think the measure is still acceptable for combining the variables. Less than 0.7, and the index is questionable (or worse) and the measures should not be combined.

Explanation and Example(s): Reliability analysis is a way of evaluating whether multiple survey questions (or other variables) should be combined based on the distribution of answers. Remember that the computer does not know what the questions are asking, and this analysis does not speak to how
similar the questions are – the analysis looks at the responses to see if they fit a pattern consistent with a scale. What would such a pattern look like? Obviously, the answers would have to be highly associated: if the answers did not covary, they would presumably be measuring distinct things. More specifically, we would expect some questions to be "easy," with many people answering in the affirmative or at the high end of the response scale. Other questions would be "hard," with few people answering in the affirmative or at the high end of the scale, but almost everyone who answers a hard question in a certain way would also answer the easy questions in that way. Some questions, of course, would fall between easy and hard; fewer people would answer these medium questions in the affirmative than the "easy" ones, but those who did would overwhelmingly also answer the easy questions in the affirmative. Hypothetically, with three questions you would have something like:

Q1: 80% answer 1 (Yes)
Q2: 60% answer 1 (Yes), 99% of whom also answered 1 to Q1
Q3: 30% answer 1 (Yes), 99% of whom also answered 1 to Q1 and Q2

Now suppose there exists a question Q4 with the same 60% affirmative answer rate as Q2, but whose "yes" answers cut across Q1's: only about half of those who answered yes to Q1 would also answer yes to Q4. Q4 would not fit the scale as well as Q2 or Q3 because its responses are far less associated with Q1 and Q2.

In politics, one opinion that is often asked as a scale is abortion. It is far more accurate to ask people whether they support abortion in select scenarios than to ask a blanket question about supporting abortion in all/most circumstances. Most respondents tend to support a woman's right to an abortion when the mother's life is in danger, so that question would be "easy" (because so many answer in the affirmative).
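The easy/medium/hard pattern above can be simulated. The sketch below generates three yes/no items driven by one underlying trait, plus a Q4 that matches Q2's 60% "yes" rate but is unrelated to the trait (all numbers are illustrative, not from the GSS):

```r
library(psych)
set.seed(42)
n <- 1000
trait <- runif(n)                  # underlying disposition

q1 <- as.integer(trait > 0.20)     # "easy": about 80% answer yes
q2 <- as.integer(trait > 0.40)     # "medium": about 60% yes, nested within Q1's yeses
q3 <- as.integer(trait > 0.70)     # "hard": about 30% yes
q4 <- rbinom(n, 1, 0.60)           # also ~60% yes, but unrelated to the trait

a3 <- psych::alpha(data.frame(q1, q2, q3))$total$raw_alpha      # coherent scale
a4 <- psych::alpha(data.frame(q1, q2, q3, q4))$total$raw_alpha  # Q4 drags alpha down
```

Running this, the three nested items produce a respectable alpha, while adding the misfitting Q4 lowers it, which is the logic behind the "reliability if an item is dropped" table used later in this activity.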
Far fewer respondents tend to support a woman's right to an abortion when the mother discovers that the sex of the baby is not what the family desires. Almost everyone who supports abortion when the baby's sex is undesirable would be expected to also support abortion when the mother's life is in danger. This is true even though few of those who support abortion when the woman's life is in danger also support abortion when the mother does not want a boy/girl. These two questions would scale, as would a question that gauges support for abortion when the fetus has severe birth defects. A question on whether one supports abortion when the father opposes the procedure, though, often did not fit the same scale, presumably because, in addition to views about abortion rights, that question also tapped into views about family power.

Instructions:

A. Open the data and load the necessary packages.

B. Confirm that the following variables have had missing values correctly declared. The easiest way to do that is to ask the computer to summarize the descriptive statistics for each variable using psych::describe or RCPA3::describeC. Everything should be coded correctly without any need to declare missing values, but it is always wise to check. Please notice that TNP_10 is not measured the same way as the other four variables. If you have to recode, remember that it is best to create a new variable (just in case you make a mistake in your recoding).
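Steps A and B might look like the following. Reading the SPSS file with haven::read_sav is one option (adjust the path and loader to however you opened GSS27 in your course):

```r
library(psych)
library(haven)                     # one way to read SPSS .sav files

GSS27 <- haven::read_sav("GSS27.sav")

# Descriptive statistics for each trust item, to check coding and missing values
psych::describe(GSS27[, c("TIP_10", "TIP_15", "TIP_20", "TIP_25", "TNP_10")])

# TNP_10 runs 1 (Most) -> 4 (Nobody), the reverse of the TIP items.
# If recoding by hand, put the reversed values in a NEW variable:
GSS27$TNP_10_rev <- 5 - GSS27$TNP_10
```

The manual reversal is shown only for completeness; as step D notes, check.keys=TRUE will reverse misaligned items automatically.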
TIP_10: Trust – members of family (1 = Cannot be trusted -> 5 = Can be trusted a lot)
TIP_15: Trust – people in neighborhood (1 = Cannot be trusted -> 5 = Can be trusted a lot)
TIP_20: Trust – people from work or school (1 = Cannot be trusted -> 5 = Can be trusted a lot)
TIP_25: Trust – strangers (1 = Cannot be trusted -> 5 = Can be trusted a lot)
TNP_10: Trust – neighbourhood people (1 = Most -> 4 = Nobody)

C. Create a data frame that includes all five variables. You can name this data frame anything; the following example calls it 'gsstrust' (this should be all on one line in R):

gsstrust <- data.frame(GSS27$TIP_10, GSS27$TIP_15, GSS27$TIP_20, GSS27$TIP_25, GSS27$TNP_10)

D. Run the reliability analysis using the command psych::alpha – with the option check.keys=TRUE so that if any of the variables needs to be reversed, the computer will do so automatically.

alpha(gsstrust, check.keys=TRUE)

1. What alpha (raw_alpha) do you find?
a. 0.38
b. 0.50
c. 0.78
d. 0.81

E. Look below the first line of results to see lines marked "Feldt" and "Duhachek" above a line that says, "Reliability if an item is dropped." In that table, find a list of each of the variables. In the first column, raw_alpha, you will find the Cronbach's alpha that would result if the variable were taken out of the index.

2. Which variable, if deleted, will increase the alpha?
a. TIP_10
b. TIP_15
c. TIP_20
d. TIP_25
e. TNP_10

3. What does the computer predict the alpha will be after you delete that one variable?
a. 0.77
b. 0.78
c. 0.81
d. 0.83
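If you prefer to pull the relevant numbers out of the output programmatically rather than read the printed table, the alpha object can be indexed (assuming gsstrust was created as above):

```r
fit <- psych::alpha(gsstrust, check.keys = TRUE)

fit$total$raw_alpha              # overall Cronbach's alpha (question 1)
fit$alpha.drop[, "raw_alpha"]    # "Reliability if an item is dropped" column (questions 2-3)
```

The row of alpha.drop with the largest raw_alpha identifies the item whose removal most improves the index.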
F. Delete that variable from the analysis. Do this by creating a new data frame that excludes that one variable, and then run psych::alpha on the new data frame.

4. What alpha do you find?
a. 0.77
b. 0.78
c. 0.81
d. 0.83

5. Does this mean that the four-item index is better or worse at measuring people's level of trust?
a. The four-item index is much better at measuring people's level of trust than the five-item index.
b. The four-item index is slightly better at measuring people's level of trust than the five-item index.
c. The four-item index is slightly worse at measuring people's level of trust than the five-item index.
d. The four-item index is much worse at measuring people's level of trust than the five-item index.

G. Create a data frame made up of the following variables from GSS27, which record the answers to variations of the same question: "To what extent do you feel that Canadians share the following values?"

SVR_10: Human rights (1 = To a great extent -> 4 = Not at all)
SVR_25: Respect for the law (1 = To a great extent -> 4 = Not at all)
SVR_30: Gender equality (1 = To a great extent -> 4 = Not at all)
SVR_35: English and French as Canada's official languages (1 = To a great extent -> 4 = Not at all)
SVR_40: Ethnic and cultural diversity (1 = To a great extent -> 4 = Not at all)
SVR_45: Respect for Aboriginal culture (1 = To a great extent -> 4 = Not at all)

Tip: after you create the data frame, you can run psych::describe on the data frame instead of each individual variable. Then run psych::alpha on the new index.

6. What alpha (raw_alpha) did you find?
a. 0.77
b. 0.78
c. 0.81
d. 0.83
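Steps F and G can be sketched as follows; "drop_me" is a placeholder for the column name of whichever item you identified in question 2:

```r
# Step F: rebuild the data frame without the item flagged by the drop table.
# Replace "drop_me" with that variable's actual column name in gsstrust.
gsstrust4 <- gsstrust[, names(gsstrust) != "drop_me"]
psych::alpha(gsstrust4, check.keys = TRUE)

# Step G: index of perceived shared Canadian values
sharedvals <- data.frame(GSS27$SVR_10, GSS27$SVR_25, GSS27$SVR_30,
                         GSS27$SVR_35, GSS27$SVR_40, GSS27$SVR_45)

psych::describe(sharedvals)                 # one call summarizes every column
psych::alpha(sharedvals, check.keys = TRUE)
```

Note that all six SVR items share the same 1-4 direction, so check.keys=TRUE should leave them unreversed here; it is included as a safety net.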