47.00 26.5 74 2.3 C+ 47.00 22.5 70 2.0 55.00 37.0 92 3.7 A- 10 52.00 22.0 74 2.3 C+
Calculate:
1- Measures of central tendency.
2- Measures of dispersion.
3- Some measures of position.
Inverse Normal Distribution
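Finding the z-critical value in a normal distribution that corresponds to a known probability is called the inverse normal calculation. It reverses the usual lookup: instead of starting from a z-value and finding the probability, it starts from the probability and finds the z-value. It should not be confused with the inverse Gaussian distribution, which is a separate two-parameter family of continuous probability distributions.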
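As a minimal sketch of the inverse normal lookup, the snippet below uses `NormalDist.inv_cdf` from Python's standard `statistics` module. The probability 0.975 is an illustrative assumption and is not taken from the question.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Inverse normal: go from a known cumulative probability to the z-value.
p = 0.975  # illustrative probability (assumption, not from the question)
z_critical = std_normal.inv_cdf(p)

# Round trip: the CDF evaluated at z_critical should return p.
p_back = std_normal.cdf(z_critical)

print(f"z with P(Z <= z) = {p}: {z_critical:.4f}")
```

For a non-standard normal distribution, the same call works with different `mu` and `sigma` values.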
Mean, Median, Mode
A measure of central tendency is a descriptive summary of a data set: a single central or typical value for a distribution. It summarizes the data set as a whole and does not provide information about individual observations. The mean, median, and mode are the usual measures.
Z-Scores
A z-score describes the position of a raw score in terms of its distance from the mean, measured in units of standard deviation: z = (raw score - mean) / standard deviation. Z-scores are useful in statistics because they allow comparison between two scores that belong to different normal distributions.
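A z-score can be sketched in a few lines of Python. The helper name `z_scores` and the sample values are illustrative assumptions, and the population SD is used here; `stdev` would be the choice for a sample estimate.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return each value's distance from the mean in units of standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD (assumption); use stdev() for a sample
    return [(x - mu) / sigma for x in values]

scores = [70, 74, 74, 92]  # illustrative values
z = z_scores(scores)

for raw, standardized in zip(scores, z):
    print(f"{raw}: z = {standardized:+.2f}")
```

A positive z-score marks a value above the mean and a negative one a value below it, which is what makes z-scores usable as a measure of position.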
A measure of central tendency is a summary statistic that represents the centre point or typical value of a dataset. These measures indicate where most values in a distribution fall and are also referred to as the central location of a distribution. You can think of it as the tendency of data to cluster around a middle value. The most common measures of central tendency are the mean, median, mode, geometric mean, and harmonic mean. Each of these measures calculates the location of the central point using a different method.
- Mean is the sum of the values of all observations divided by the number of observations.
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, with $i = 1, 2, \dots, n$, where $n$ is the number of observations.
Excel formula: =AVERAGE(array)
- Median is the middlemost value of a series of observations after arranging the data set in ascending or descending order.
When $n$ is odd:
median = value of the $\left(\frac{n+1}{2}\right)$th item
When $n$ is even:
median = [value of the $\left(\frac{n}{2}\right)$th item + value of the $\left(\frac{n}{2}+1\right)$th item] / 2
Excel formula: =MEDIAN(array)
- Mode is the most frequently occurring value in a data set.
Excel formula: =MODE(array)
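The three definitions above can be checked with Python's standard `statistics` module. This is a sketch on a small illustrative data set; the values and the helper name `median_by_rule` are assumptions for demonstration only.

```python
from statistics import mean, median, multimode

data = [74, 70, 92, 74]  # illustrative values (assumption)

data_mean = mean(data)        # sum of values divided by number of values
data_median = median(data)    # averages the two middle values when n is even
data_modes = multimode(data)  # list of all most frequent values

def median_by_rule(values):
    """Median via the odd/even position rule, using 1-based item positions."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]                 # ((n+1)/2)th item
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2   # mean of (n/2)th and (n/2+1)th

print(data_mean, data_median, data_modes, median_by_rule(data))
```

`multimode` is used instead of `mode` because a data set can have more than one most frequent value.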
The measures of central tendency alone are not adequate to describe data. Two data sets can have the same mean yet be entirely different. Thus, to describe data, one also needs to know the extent of variability, which is given by the measures of dispersion. Range, interquartile range, and standard deviation are the three commonly used measures of dispersion.
Range: The range is the difference between the largest and the smallest observation in the data. The prime advantage of this measure of dispersion is that it is easy to calculate. On the other hand, it has several disadvantages: it is very sensitive to outliers and does not use all the observations in a data set. It is more informative to report the minimum and maximum values than the range alone.
Inter-quartile range: The interquartile range is defined as the difference between the 75th and 25th percentiles (also called the third and first quartiles). Hence the interquartile range describes the middle 50% of observations. A large interquartile range means that the middle 50% of observations are spread wide apart. An important advantage of the interquartile range is that it can be used as a measure of variability even if the extreme values are not recorded exactly (as in the case of open-ended class intervals in a frequency distribution). Another advantage is that it is not affected by extreme values. Its main disadvantage as a measure of dispersion is that it is not amenable to mathematical manipulation.
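The contrast between the range and the interquartile range can be sketched as follows. The data values are hypothetical, and `method="inclusive"` is one of several quartile conventions; other conventions give slightly different quartiles.

```python
from statistics import quantiles

def range_and_iqr(values):
    """Return (range, interquartile range) for a list of numbers."""
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    return max(values) - min(values), q3 - q1

data = [2, 4, 4, 5, 7, 9, 10, 12]  # hypothetical observations
data_range, data_iqr = range_and_iqr(data)

# Adding one extreme value changes the range far more than the IQR.
with_outlier = data + [100]
outlier_range, outlier_iqr = range_and_iqr(with_outlier)

print(f"without outlier: range={data_range}, IQR={data_iqr}")
print(f"with outlier:    range={outlier_range}, IQR={outlier_iqr}")
```

In this sketch the range grows from 10 to 98 once the extreme value is added, while the interquartile range moves only from 5.25 to 6.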
Standard deviation: Standard deviation (SD) is the most commonly used measure of dispersion. It is a measure of the spread of data about the mean. SD is the square root of the sum of squared deviations from the mean divided by the number of observations. When the data are a sample rather than the whole population, the sum is commonly divided by n - 1 instead of n.
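The SD definition can be written out directly and compared with the standard library. This is a sketch on hypothetical data; `population_sd` is an illustrative helper name, not a library function.

```python
from math import sqrt
from statistics import pstdev, stdev

def population_sd(values):
    """Square root of the mean squared deviation from the mean (divisor n)."""
    n = len(values)
    mu = sum(values) / n
    return sqrt(sum((x - mu) ** 2 for x in values) / n)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

sd_by_hand = population_sd(data)  # divisor n, as in the definition
sd_population = pstdev(data)      # library equivalent, divisor n
sd_sample = stdev(data)           # sample version, divisor n - 1

print(sd_by_hand, sd_population, sd_sample)
```

The sample version is always slightly larger than the population version for the same data, because it divides by n - 1.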
SD is a very useful measure of dispersion because, if the observations are from a normal distribution, then approximately:
- 68% of observations lie between mean ± 1 SD
- 95% of observations lie between mean ± 2 SD
- 99.7% of observations lie between mean ± 3 SD
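These percentages can be reproduced from the normal cumulative distribution function. The sketch below uses the standard normal distribution, which is sufficient because the proportions within k SDs of the mean are the same for every normal distribution.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

# Proportion of a normal distribution within k SDs of the mean, for k = 1, 2, 3.
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in coverage.items():
    print(f"mean ± {k} SD: {share:.1%}")
```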
Another advantage of SD is that, together with the mean, it can be used to detect skewness. Its disadvantage is that it is an inappropriate measure of dispersion for skewed data.