Week 8 - Lecture


School

American Military University *


Course

300

Subject

Statistics

Date

Jan 9, 2024

Type

docx

Pages

20

Uploaded by papadogiannis

Week 8: Consumer of Scholarly Research

Overview: Welcome to Week 8. In the previous lessons, you learned the importance of writing to the audience (not addressing the audience) as you construct a research proposal. You learned how to conduct a literature review. You also learned the significance of performing ethical research. You learned about the different approaches you may use in conducting research (qualitative, quantitative, and mixed methods), the meaning and importance of validity and reliability pertaining to the instrument used to obtain data, and the differences between quasi-experimental and non-experimental research. All of this information was intended to assist you in becoming a better researcher and a better consumer of research. The purpose of statistical analysis is to take the numerical data obtained from the study and interpret it in a meaningful way (Ellis et al., 2010). This week's lesson focuses on interpreting the information before you.

Course Objective(s):
CO8: Produce an academic research proposal based on a thorough analysis of a current issue in the social sciences reflecting the need for further study and demonstrating a well thought out research design.
MO1: Understand basic statistical analysis and its interpretation.
MO2: Understand the difference between descriptive and inferential statistics.
MO3: Understand what is meant by type I and II errors.
MO4: Understand significance levels and the importance of p-values.
MO5: Understand measurements of spread and measurements of central tendency.
SSGS300 RESEARCH METHODS LESSON EIGHT

Have you ever been overwhelmed reading a research study? How many times, while reading a study, have you found yourself simply wanting the bottom line because you are lost in symbols and statistical jargon? This lesson will briefly describe what you should look for to find the answers to these questions, as well as identify what you need to include in your completed study when the time comes.

The purpose of statistical analysis is to take the numerical data obtained from the study and interpret it in a meaningful way. With this, the data collected from the study will turn into evidence that may prove or disprove a phenomenon. Most importantly, it will answer two of the most important questions: Did the introduction of a program or a treatment make a difference in the experimental group? And is the difference statistically significant?

Topics to be covered include:
Basic statistical analysis and its interpretation
Descriptive statistics
Measurements of central tendency
Measurements of spread
Inferential statistics
Common testing methods
Type I and II errors
Significance levels
P-values
Descriptive Statistics

[Figure: Population by Sex and Age, male and female. Total population: 601,723.]

DESCRIPTIVE STATISTICS PROVIDE SUMMARIES
Descriptive statistics provide summaries of the samples and measurements of a study. They deal with the graphic representation, enumeration, and organization of data. Typically, descriptive statistics involve reports of the mean (average) and frequency (how often a certain answer or number occurs). The purpose of descriptive statistics is simply to describe the findings of a study, not to draw conclusions from the statistics.

EXAMPLES
This type of statistics takes large sets of observational data and pares them down into easily understandable and meaningful numbers. An example of descriptive statistics is the national population census of the United States. All the residents in the United States were asked to provide information such as age, sex, race, and marital status. This data can then be arranged into charts, graphs, and tables that describe the characteristics of the population during a specific timeframe.

IMPORTANCE
One reason it is important to use descriptive statistics in research is that if a researcher simply presented raw data, it would be difficult for the reader to visualize what the data was showing, especially if there was a great amount of it. Descriptive statistics therefore enable the researcher to present the data in a more meaningful way, allowing for a simpler interpretation of the
data. Imagine if a police department had a set of data showing crimes committed in a certain region. The department may be interested in the overall rise or fall in crime in that region, but they would also be interested in its distribution or spread; for instance, how many crimes may be committed in a certain neighborhood as opposed to another. Descriptive statistics allow for this.

COMBINATION OF FORMATS
When we use descriptive statistics, it is useful to summarize groups of data using a combination of tabulated description (tables), graphical description (graphs and charts), and statistical commentary (textual explanation). Generally speaking, there are two types of descriptive statistics: those that measure central tendency and those that measure spread.

Measurements of Central Tendency
[Figure: A graphical representation of the mode, median, and mean]

One type of descriptive statistics is measurements of central tendency. These are numbers that describe the central position of a distribution.

Mean: Most often this central position is the mean, or the average of a data set: all numbers in the distribution are added together and the sum is divided by the number of members in that set.
Median: Other times it can be the median, or the number in the center of the data set when the numbers are arranged from least to greatest.
Mode: The mode is the number that appears the most frequently in the data set.

Measures of central tendency are sometimes called measures of central location or summary statistics.

STRENGTHS OF MEAN
The mean, while the average of all the numbers in a data set, is often not one of the actual values that you may observe in the data set. One of its important properties is that it is the one value that is closest to all the rest, in the sense that it minimizes the sum of the squared differences from the values in the set. Another important property of the mean is that it includes every value in the set as part of the calculation. In addition, the mean is the only measure of central tendency where the sum of the deviations of each value from it is always zero.

WEAKNESS OF MEAN
However, the mean of a data set is not always the best number to use for a measurement of central tendency. It is particularly susceptible to the influence of values that are especially small or large. An example may be a data set containing the salaries of a group of ten employees in a local office. The general manager at this office makes a salary of $80,000 a year, while the assistant manager's salary is $60,000. The other eight employees each make $18,000 a year. If the mean is used to determine the central tendency of this data set, the result is $28,400: over 50 percent more than four-fifths of the employees make, and less than half what the other one-fifth do.
Clearly, this is not the best measurement to use to describe the central position of this distribution; median or mode would work better in this situation.
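The salary example can be checked with a few lines of Python. This is only a minimal sketch using the standard library's statistics module; the variable name salaries is an illustrative choice, and the figures are the ones given in the example above.

```python
from statistics import mean, median, mode

# Salaries from the example: one general manager, one assistant manager,
# and eight other employees who each make $18,000.
salaries = [80_000, 60_000] + [18_000] * 8

print("mean:  ", mean(salaries))    # pulled upward by the two large salaries
print("median:", median(salaries))  # middle of the ordered list
print("mode:  ", mode(salaries))    # most frequent value
```

The median and the mode both land on $18,000, the salary most employees actually earn, while the mean does not match any salary in the office.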
Measurements of Spread

Measurements of spread, another type of descriptive statistics, describe how the data is spread out across a distribution. Usually, not all observed data is near the central position; rather, there often is outlying data. It is important to understand the relationship between the measure of the spread of data values and measures of central tendency. A measure of spread describes how well the mean or other number representing the central position represents the data. If the spread of values in the data set is large, the mean does not represent the data as well as it would were the spread smaller. A large spread indicates large differences between numbers in a data set. In research, minimal variation within a data set is considered desirable. Types of measurement of spread include the range, the standard deviation, and the variance.

Range
The range, the simplest measure of spread, is the difference between the highest and lowest numbers in a data set when the numbers are ordered from lowest to highest. Range is calculated as maximum value - minimum value. For example, assume that a study is being conducted with ten participants aged 23, 56, 45, 65, 59, 55, 62, 54, 85, and 25. The maximum value (oldest participant) is 85 and the minimum value (youngest participant) is 23. This results in a range of 62, which is 85 minus 23.

While using the range as a measure of spread is limited, it can be useful if you are measuring a variable that has either a critical low or high threshold (or both) that should not be crossed. The range will instantly inform you whether at least one value broke these critical thresholds. For example, if the study shown above was intended to study only participants forty and older, determining the range would immediately inform the researcher that they had made an error in selecting the study's sample.
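The range calculation and the threshold check can be sketched in Python as follows. This is an illustrative sketch only: the names ages, MIN_AGE, and below_threshold are assumptions made for the example, and the age limit of forty comes from the scenario described above.

```python
# Ages of the ten participants in the example.
ages = [23, 56, 45, 65, 59, 55, 62, 54, 85, 25]

# Range = maximum value - minimum value.
age_range = max(ages) - min(ages)
print("range:", age_range)  # 85 - 23

# Hypothetical lower threshold from the scenario: participants should be 40 or older.
MIN_AGE = 40
below_threshold = [age for age in ages if age < MIN_AGE]
print("participants under the threshold:", below_threshold)
```

The minimum alone shows that the threshold was crossed; listing the offending values simply tells the researcher which participants caused it.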
          Smith    Vargas
              9         3
             10         7
             11        11
             12        15
             13        19
Mean         11        11
Range         4        16

Measurements of Spread: Variance and Standard Deviation

Value    Value - Mean    Squared Difference
   65               6                    36
   59               0                     0
   55              -4                    16
   62               3                     9
   54              -5                    25

Mean = 59
Column Sum = 86

Sample variance is 86 divided by the difference of 5 and 1, which equals 21.5. Standard deviation is the square root of the variance. The square root of 21.5 is approximately 4.6.
VARIANCE
Unlike range, which is only concerned with extremes, variance looks at all the data points and then determines their distribution. Statistical variance gives a measure of how the data distributes itself around the central position. Determining variance takes more computation than determining range, but to put it plainly, it is simply the average of the squared differences from the mean. For the sake of simplicity, let's work with a smaller data set when determining variance: 65, 59, 55, 62, and 54. There are five numbers in this set and their mean is 59.

CALCULATING VARIANCE
First, we will take each number in the set and subtract the mean from it. Some of the results will be negative. In the case of these five numbers, we will end up with 6, 0, -4, 3, and -5. The next step is to square each of these numbers; the results from this step are 36, 0, 16, 9, and 25. Add together these five numbers and divide the sum by the number of items in the set: 86 divided by 5 is 17.2, which is the variance for this data set (the range in this case would be 11).

SAMPLE VARIANCE
When you are looking at just a sample instead of the whole population, you find the sample variance. The sample variance will give you an idea of how spread out your sample data is. To calculate the sample variance, simply subtract 1 from the number of items in the set: 5 minus 1 is 4. So instead of dividing 86 by 5, you will divide 86 by 4, giving you a sample variance of 21.5.

STANDARD DEVIATION
The last type of measurement of spread, standard deviation, is both the most complex and the most commonly used. Standard deviation is calculated by first determining the variance of a data set, and then by simply finding its square root. In the example above, the standard deviation would be approximately 4.1 (the square root of the variance, 17.2), or approximately 4.6 if the sample variance of 21.5 is used.

An important attribute of the standard deviation is that if the mean and standard deviation are known, it is possible to compute the percentile rank of any number in a normally distributed set. In a normal distribution, about 68 percent of the numbers are within one standard deviation of the mean, and about 95 percent of the numbers are within two standard deviations of the mean. For this reason, it is often used in situations where bell curves are employed, such as testing and polling.
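The variance and standard deviation figures above can be reproduced with Python's standard statistics module. This is a minimal sketch, not part of the original lesson; the variable names are illustrative, and NormalDist is used only to check the 68 and 95 percent figures for a normal distribution.

```python
from statistics import NormalDist, pstdev, pvariance, stdev, variance

# The five values used in the variance example.
data = [65, 59, 55, 62, 54]

pop_var = pvariance(data)   # sum of squared differences divided by n (86 / 5)
samp_var = variance(data)   # sum of squared differences divided by n - 1 (86 / 4)
pop_sd = pstdev(data)       # square root of the population variance
samp_sd = stdev(data)       # square root of the sample variance

print("variance:", pop_var, " sample variance:", samp_var)
print("standard deviation:", round(pop_sd, 1), " sample:", round(samp_sd, 1))

# Share of a normal distribution within one and two standard deviations of the mean.
standard_normal = NormalDist()  # mean 0, standard deviation 1
within_one = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_two = standard_normal.cdf(2) - standard_normal.cdf(-2)
print("within 1 SD:", round(within_one, 2), " within 2 SD:", round(within_two, 2))
```

Dividing by n - 1 always gives a slightly larger figure than dividing by n, which is why the two standard deviations differ.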
Inferential Statistics

Draw Conclusions
Inferential statistics allow researchers to draw conclusions from their data sets, and in particular to draw conclusions that extend beyond the observed data. In this way, a researcher can make more generalized conclusions based on their study's data sets. Inferential statistics are concerned with reaching conclusions from information that is not complete. In other words, they generalize from the sample studied, using information obtained from a sample of the population to say something about the whole population.

Opinion Poll Example
An example of inferential statistics is an opinion poll, such as those seen during a political election. Such a poll attempts to make inferences about the possible outcome of the election. You more than likely have observed a sampling taken from a televised poll consisting of a portion of the population in a specific county, state, or even the entire country. The results of the preferences selected are then tabulated, and inferences are drawn as to how the entire population would vote that day.

You may have also noticed in your reading of research that data comes from two common types of investigations: experiments and surveys. This is not surprising, as experiments and surveys are the two fundamental types of research investigations.

Observational Survey: An observational survey, for example, involves data obtained from observations of a phenomenon, event, or group over a period of time, wherein few or no controls are used.

Observational Example: An example of an observational study might involve the effects of tear gas used by the police on the residents of Ferguson, Missouri during the 2014 riots. In this example, the researcher obviously could not have controlled the amount of tear gas that the residents were exposed to or its duration.

Experiment: In contrast, an experiment would purposely involve the use of controls over the amount and duration of tear gas exposure given to a study population over a specified period.
Testing the Hypothesis

Researchers typically use inferential statistics to determine whether or not the data obtained from a study supports a particular hypothesis. In social research, the investigator usually attempts to determine if the null hypothesis is supported or not.

CORRELATION
For a correlation, the null hypothesis states that any observed correlation is not reliably different from zero and is simply due to chance.

COMPARISON
For a comparison of two data sets, the null hypothesis states that any observed group difference is not reliable, but due to chance.

TESTING THE HYPOTHESIS
Inferential statistical tests tell how likely it is that results at least as extreme as those observed would occur if the null hypothesis were true. If that probability is low, conventionally less than 5 percent, then researchers (usually) reject the null hypothesis. This leaves the alternate hypothesis, which states that the difference or correlation is real or reliable. If results this extreme would occur less than 5 percent of the time under the null hypothesis, the correlation or difference is said to be statistically significant.

STATISTICAL SIGNIFICANCE
“Statistically significant” does not mean “important”; it means “reliable,” and “reliable” means that the observed difference is likely to show up if the same kind of data is collected again.
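One way to see what "due to chance" means for a comparison of two groups is a permutation test, sketched below in Python. This technique is not named in the lesson and is swapped in here purely as an illustration; the function name permutation_p_value, the scores, and the number of reshuffles are all hypothetical choices. The idea is that if the null hypothesis is true, the group labels are arbitrary, so shuffling them should often produce differences as large as the one observed.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Estimate how often chance alone yields a mean difference at least
    as large as the observed one (two-sided)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # reassign scores to groups at random
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_resamples


# Hypothetical scores, invented for illustration only.
treatment = [12, 14, 15, 16, 18, 19]
control = [8, 9, 10, 11, 12, 13]

p = permutation_p_value(treatment, control)
print("estimated p-value:", p)
print("reject null" if p < 0.05 else "fail to reject null")
```

With these invented scores, very few reshuffles reproduce a gap as large as the observed one, so the estimated p-value falls well below 5 percent.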
Common Testing Methods

[Figure: Frequency plotted against score, showing variation between groups and variation within groups]

There are many methods of testing a hypothesis. The researcher must choose the method most appropriate based on the type of data obtained from their research.

T-TEST
The t-test is a statistical analysis of the means of two populations. It determines whether there is a real variation between the two distributions, and assesses whether the means of two groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means of two groups. There are three basic types of t-tests: the one-sample t-test, the independent-samples t-test, and the dependent-samples (or paired-samples) t-test. All three types of t-tests look at the difference between the means and divide that difference by some measure of variation.

ANALYSIS OF VARIANCE
An ANOVA is a statistical test that is also used to compare means. This test is used to determine if there are significant differences between more than two independent groups. The difference between a t-test and an ANOVA is that a
t-test can only compare two means at a time, whereas an ANOVA can compare multiple means at the same time. ANOVAs can also compare the effects of different factors on the same measure. They can become very complicated, and only trained statisticians should conduct their analyses. There are several types of ANOVAs, including the one-way ANOVA, within-groups (or repeated-measures) ANOVA, and factorial ANOVA.

CHI-SQUARE
The chi-square is an inferential statistical technique designed to test for significant relationships between two variables organized in a bivariate table. This test measures how well the anticipated or expected results of the study fit with the observed values. The chi-square requires no assumptions about the shape of the population distribution from which a sample is drawn. However, like all inferential techniques, it assumes random sampling.

Type I and II Errors

NEED TO PREVENT ERRORS
In order to determine if the results of a study are significant, the researcher must set initial parameters to compare their results against. This must be done to help prevent a type I error, in which a true null hypothesis is falsely rejected, or a type II error, in which a false null hypothesis is not rejected.

SIGNIFICANCE LEVEL
The process of hypothesis testing can seem to be quite varied, but regardless of the topic or discipline the general process is the same. Hypothesis testing involves the statement of a null hypothesis and the selection of a significance level. The null hypothesis is either true or false, and represents a default claim. After formulating the null hypothesis and choosing a level of significance, we acquire data through observation. Statistical calculations tell us whether or not we should reject the null hypothesis.

TYPE I ERROR
In an ideal world we would always reject the null hypothesis when it is false, and we would not reject the null hypothesis when it is indeed true. There are two other scenarios that are possible, each of which will result in an error.
The first kind of error that is possible involves the rejection of a null hypothesis that is actually true, a type I error or error of the first kind. Type I errors are equivalent to false positives.

TYPE II ERROR
The other kind of error that is possible occurs when we do not reject a null hypothesis that is false, a type II error, also referred to as an error of the second kind. Type II errors are equivalent to false negatives. The probability of a type II error is given by the Greek letter beta.

ERROR CANNOT BE ELIMINATED COMPLETELY
Type I and type II errors are part of the process of hypothesis testing and cannot be completely eliminated, but one type of error can be minimized. When researchers try to decrease the probability of one type of error, the probability of the other type increases. Many times the real-world application of a hypothesis test will determine if the public is more accepting of type I or type II errors, and the researcher can plan accordingly when he or she develops a statistical design.

Experimental result      Null hypothesis true               Null hypothesis false
Reject                   Type I error (false positive)      Correct interpretation of data
Do not reject            Correct interpretation of data     Type II error (false negative)

Significance Levels

Set Significance Level BEFORE Conducting Test
The researcher must set a significance level before conducting an inferential test. The significance level, also called the alpha value, is a value chosen before testing. This may seem confusing to a person who is not well-versed in statistical analysis, but this is the only value you need to be familiar with along with its meaning. This coupled with an explanation of a very few
basic statistical terms and symbols will make you a better consumer of research. The symbol for significance level is α, for “alpha.” The significance level sets how strong the evidence must be before a result is called significant.

Lower Alpha Means Greater Confidence
The concept of statistical significance is fundamental to hypothesis testing. In a study that involves drawing a random sample from a larger population in an effort to demonstrate some result that can be applied to the population as a whole, there is the constant potential for the study data to be a result of sampling error or simple coincidence or chance. The significance level, in the simplest of terms, is the threshold probability of incorrectly rejecting the null hypothesis when it is in fact true (a type I error). The significance level or alpha is therefore associated with the overall confidence level of the test: the confidence level is 1 minus alpha. This essentially means that the lower the alpha value, the greater the confidence in the test.

P-Values
The researcher must also determine the p-value of their statistical test to compare against the significance level. A p-value is the probability of finding the observed, or more extreme, results when the null hypothesis is true. The level of significance α is the threshold against which p is compared. If the p-value is less than or equal to alpha (p ≤ .05), the null hypothesis is rejected and the result is said to be statistically significant. In other words, more than likely something besides chance alone led to the finding.

If the p-value is greater than alpha (p > .05), the data evaluated failed to indicate the need to reject the null hypothesis and the result is not statistically significant. In other words, the results from the data evaluated more than likely can be explained by chance alone.

SIGNIFICANT OR NOT
In quantitative research, the findings often indicate the results were either statistically significant (p < .05) or not statistically significant (p > .05).
This simply means that the data from the sample population either did or did not support the null hypothesis; therefore, the null hypothesis in the studies
you read will be either supported or rejected based on the p-value. The statistical probability or confidence level for α = 0.05 is 95 percent.

ASTERISKS
You may have also observed in tables in studies or journals the use of asterisks in the interpretation of the p-values. No asterisk (as in p > 0.05) means that the result was not significant; one asterisk, such as “* if p < 0.05”, means the result was significant; and two asterisks, “** if p < 0.01”, means the result was highly significant.

Inconclusive Results
However, just because a result is found not to be significant does not mean the null hypothesis is fully supported and the alternate hypothesis should be rejected. You may, in your reading of a study, notice from time to time that while the researcher opted not to reject the null, the alternate was not fully rejected. This may be for various reasons. One reason may be simply that the researcher determined that the sample was too small to provide a sufficient indication to support or reject the null, and therefore questions the accuracy of the findings. Another reason could be that knowledge regarding the survey or the participants (or both) indicated there was a problem with the data after collection was closed. This needed to be taken into account when reporting the findings, along with an explanation of what happened, so that future researchers as well as research consumers could understand what the problem was.

Conclusion
In conclusion, in the midst of all the numerical jargon and presentation in the study or in tables and graphs, the focus is solely on the p (significance) or α (alpha, also significance) value. The statistical process by which the significance level is reached or determined is not important at this time. You do not need to worry about the equations or any other part of the statistics.
What is important is for you to be a better consumer of research by understanding the meaning and importance of the significance level as you read research. You will learn about the how and when if you take a course in statistics.
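As a reading aid, the decision rule and the asterisk convention described in this lesson can be written as two small Python functions. This is only a sketch: the function names decision and asterisks are hypothetical, the default alpha of 0.05 follows the lesson, and individual journals may use different cutoffs.

```python
def decision(p_value: float, alpha: float = 0.05) -> str:
    """Apply the lesson's rule: reject the null when p is at or below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    return "reject null" if p_value <= alpha else "fail to reject null"


def asterisks(p_value: float) -> str:
    """Return the marker described in the lesson for a reported p-value."""
    if p_value < 0.01:
        return "**"  # highly significant
    if p_value < 0.05:
        return "*"   # significant
    return ""        # not significant


# Hypothetical p-values of the kind reported in a results table.
for p in (0.20, 0.03, 0.004):
    print(f"p = {p}: {decision(p)} {asterisks(p)}")
```

When reading a results table, the same two questions apply: is the reported p at or below the stated alpha, and how many asterisks does the author attach to it?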
KEY TERMS

Descriptive statistics: Brief descriptive coefficients that summarize a given data set, which can be either a representation of the entire population or a sample of it.
Inferential statistics: Methods that make inferences about a population using data drawn from a sample or samples of it, for example by sampling from millions of residents and using the results to make inferences about the entire population.
Mean: The average of a group of numbers.
Measurement of central tendency: A single value that describes the way in which a group of data clusters around a central value.
Measurement of spread: A single value that describes how similar or varied the set of observed values is for a particular variable.
Median: The number in the center of a set when the set is arranged in order.
Mode: The number that appears most often in a set.
P-value: The probability of finding the observed, or more extreme, results when the null hypothesis of a study question is true.
Range: The area of variation between upper and lower limits on a particular scale.
Significance level: The probability of rejecting the null hypothesis in a statistical test when it is true.
Standard deviation: A quantity calculated to indicate the extent of deviation for a group as a whole.
Statistics: The practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.
Type I error: The incorrect rejection of a true null hypothesis (a “false positive”).
Type II error: Incorrectly retaining a false null hypothesis (a “false negative”).
Variance: The expectation of the squared deviation of a random variable from its mean, which informally measures how far a set of numbers is spread out from its mean.
Sources

Creswell, J. W. (2003). Research design: Qualitative, quantitative, and mixed methods approach. Thousand Oaks, CA: Sage.
Ellis, L., Hartley, R. D., & Walsh, A. (2010). Research methods in criminal justice and criminology: An interdisciplinary approach (2nd ed.). Blue Ridge Summit, PA: Rowman & Littlefield.
Frommer, G. P. (1999). Inferential statistics. Retrieved from http://www.indiana.edu/~p1013447/dictionary/inf stat.htm
Mason, M. (2010). Sample size and saturation in PhD studies using qualitative interviews. Forum: Qualitative Social Research, 11(3), article 8. Retrieved from http://www.qualitative-research.net/index.php/fqs/article/view/1428/3027
Taylor, C. (2016). What is the difference between type I and type II errors? Retrieved from http://statistics.about.com/od/Inferential-Statistics/a/Type-I-And-Type-II-Errors.htm
Trochim, W. M. K. (2006). The t-test. Retrieved from http://socialresearchmethods.net/kb/stat t.php

Images

Stock images provided by 123rf.com
Graph of U.S. population grouped by age and gender: http://www.census.gov/2010census/img/state profile dc 2010home.qif
Visualisation mode median mean, by Cmglee (own work) [CC BY-SA 3.0 or GFDL], via Wikimedia Commons
President Truman holding the infamous issue of the Chicago Daily Tribune: https://en.wikipedia.org/wiki/Dewey Defeats Truman#/media/File:Dew evtrumanlZ2.jpg
Ferguson Day 6, Picture 44, by Loavesofbread (own work) [CC BY-SA 4.0], via Wikimedia Commons