Lab 4. Descriptive Statistics-1

School

San Jose State University

Course

167

Subject

Statistics

Date

Feb 20, 2024

PH167 Biostatistics
Lab 4: Descriptive Statistics

Your name: Maili Nguyen Tingle
Student ID: 016518333

Traditional Descriptive Statistics

In Lab 3, we used histograms and boxplots to explore the shape, location, and spread of the SCR variable. The median was used as a measure of central location and the inter-quartile range (IQR) was used as a measure of spread. Let us now consider other traditional measures of central location and spread: the mean and standard deviation, respectively. We will be looking at concepts from Chapter 4: Summary Statistics.

Notes: Carry at least four significant digits during calculations. Like the median, the mean is a summary measure of central location. The mean identifies the gravitational center of the distribution, while the median identifies the value that is greater than or equal to 50% of the other values. Neither the mean nor the median tells you anything about the shape or spread of the distribution.

H: Paperback page numbers, O: Online book page numbers

Q1. Sample mean, SCR variable: The arithmetic mean, $\bar{x}$, is the most common measure of central location for a set of data points (H: 77-79, O: 77-79). Let us calculate the arithmetic mean for the SCR variable by hand:

a. n = _______
b. The sum of all values for observations 1 through n, $\sum_{i=1}^{n} x_i = $ _______
c. The arithmetic mean, $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = $ _______
d. Review the notes about the mean that appear on (H: 78, O: 77). Note that the mean has three different functions. LIST these functions and EXPLAIN the meaning of each of the functions in your own words:
(1)
(2)
(3)
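The hand calculation in Q1 can be checked with a short script. This is only a sketch: it assumes the 25 SCR values are the ones listed in the "SCR value" column of the Q2 table, and the variable names (`scr`, `total`, `mean`) are illustrative rather than taken from the lab.

```python
import math

# SCR values for observations 1-25, copied from the Q2 table
# (assumption: the table lists the complete SCR variable).
scr = [0.8, 1.1, 0.7, 0.8, 1.4, 1.0, 1.0, 1.0, 0.7, 0.9,
       1.1, 0.8, 1.2, 0.8, 0.9, 1.7, 0.9, 1.0, 0.7, 1.0,
       0.9, 1.7, 0.8, 0.9, 1.0]

n = len(scr)             # Q1a: number of observations
total = math.fsum(scr)   # Q1b: sum of x_i for i = 1..n
mean = total / n         # Q1c: arithmetic mean, x-bar

# Carry at least four significant digits, as the lab notes request.
print(f"n = {n}")
print(f"sum = {total:.4f}")
print(f"mean = {mean:.4f}")
```

`math.fsum` is used instead of `sum` so that floating-point rounding does not accumulate across the 25 additions; for a data set this small either would do.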
Q2. Variance and standard deviation, SCR (H: 91-92, O: 91-92)

The standard deviation is the most common descriptive measure of spread. This statistic is based on the Sum of Squares (SS), which is the sum of squared deviations. A deviation is the signed distance a data point lies from the mean: $(x_i - \bar{x})$. Square this value to get a squared deviation: $(x_i - \bar{x})^2$.

a. Deviations and squared deviations. Fill in the shaded cells in this table. Note that (A) is the deviation of observation 9 and (B) is the deviation of observation 18. (C) and (D) are the squared deviations of these observations. Round off the values to three decimal places.

Observation   SCR value   Deviation             Squared deviation
number                    $(x_i - \bar{x})$     $(x_i - \bar{x})^2$
 1            0.8         -0.192                0.03686
 2            1.1          0.108                0.01166
 3            0.7         -0.292                0.08526
 4            0.8         -0.192                0.03686
 5            1.4          0.408                0.16646
 6            1.0          0.008                0.00006
 7            1.0          0.008                0.00006
 8            1.0          0.008                0.00006
 9            0.7          (A)                  (C)
10            0.9         -0.092                0.00846
11            1.1          0.108                0.01166
12            0.8         -0.192                0.03686
13            1.2          0.208                0.04326
14            0.8         -0.192                0.03686
15            0.9         -0.092                0.00846
16            1.7          0.708                0.50126
17            0.9         -0.092                0.00846
18            1.0          (B)                  (D)
19            0.7         -0.292                0.08526
20            1.0          0.008                0.00006
21            0.9         -0.092                0.00846
22            1.7          0.708                0.50126
23            0.8         -0.192                0.03686
24            0.9         -0.092                0.00846
25            1.0          0.008                0.00006
Sums          24.8         0.000*               SS = _______

* The sum of deviations is always 0.

b. Sum of Squares (SS). The formula for the sum of squares is

$SS = \sum_{i=1}^{n} (x_i - \bar{x})^2 = $ _______

This is the sum of the values in the right-hand column of the above table. Write the sum of squares in the area marked in yellow. Round off the value to three decimal places.

c. Variance ($s^2$). The Sum of Squares forms the basis of the variance. Calculate the variance using this formula:

$s^2 = \frac{1}{n-1} \times SS = $ _______

Round the value off to three decimal places. The variance is the mean sum of squares or “mean square error” (H: 92, O: ). The reason we multiply the SS by $\frac{1}{n-1}$ instead of $\frac{1}{n}$, as we normally would to derive a mean, is that we lose one degree of freedom by having to estimate the population mean rather than knowing its true value.

d. Standard deviation ($s$). Unfortunately, the variance carries “units squared,” which in this case is (mg/dl)$^2$. This makes it difficult to interpret the variance directly. Therefore, for descriptive purposes, we take the square root of the variance to derive the popular statistic known as the standard deviation ($s$) (H: 92, O: 91-92). Calculate the standard deviation using this formula:

$s = \sqrt{s^2} = $ _______

The standard deviation is the square root of the variance (“root mean square”). It carries the units of the original measurement scale (mg/dl in this case) and is the most common descriptive measure of spread. Additional facts about the standard deviation appear on (H: 93-94, O: 92-93). You should review these facts now so that you understand how to interpret this important statistic.
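The chain of calculations in Q2 (deviations, squared deviations, SS, variance, standard deviation) can be sketched in a few lines of Python as a check on the hand-filled table. The sketch assumes the 25 SCR values from the table above; the variable names are illustrative. The last two lines cross-check against the standard library's `statistics.variance` and `statistics.stdev`, which also use the $n - 1$ divisor.

```python
import math
import statistics

# SCR values (mg/dl) for observations 1-25, copied from the Q2 table.
scr = [0.8, 1.1, 0.7, 0.8, 1.4, 1.0, 1.0, 1.0, 0.7, 0.9,
       1.1, 0.8, 1.2, 0.8, 0.9, 1.7, 0.9, 1.0, 0.7, 1.0,
       0.9, 1.7, 0.8, 0.9, 1.0]

n = len(scr)
mean = math.fsum(scr) / n

# Q2a: deviation and squared deviation for every observation.
deviations = [x - mean for x in scr]
squared_deviations = [d ** 2 for d in deviations]

# The deviations should sum to 0 (up to floating-point rounding).
sum_of_deviations = math.fsum(deviations)

# Q2b: Sum of Squares.
ss = math.fsum(squared_deviations)

# Q2c: sample variance, dividing by n - 1 (one degree of freedom is lost
# because the mean is estimated from the same data).
variance = ss / (n - 1)

# Q2d: standard deviation, back in the original units (mg/dl).
sd = math.sqrt(variance)

print(f"sum of deviations = {sum_of_deviations:.4f}")
print(f"SS = {ss:.4f}")
print(f"s^2 = {variance:.4f} (mg/dl)^2")
print(f"s = {sd:.4f} mg/dl")

# Cross-check against the standard library (both divide by n - 1).
print(f"statistics.variance = {statistics.variance(scr):.4f}")
print(f"statistics.stdev = {statistics.stdev(scr):.4f}")
```

Observation 9 is `deviations[8]` and observation 18 is `deviations[17]`, because Python lists are indexed from 0.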
Categorical Variables

Q3. The variable TOX indicates the presence or absence of a toxic reaction; it is the main response variable in our data. Unlike SCR, this variable is categorical. The variables GENERIC, DIAG, and STAGE are likewise categorical. Descriptive statistics for categorical variables address counts (or frequencies) and proportions. For the TOX variable in our data set, determine:

a. Number of observations, n = _______
b. The number of “successes” (i.e., toxicity cases), x = _______
c. Incidence proportion of toxicity, $\hat{p} = \frac{x}{n} = $ _______

Compute frequency tables for each of the categorical variables using SPSS:

Analyze → Descriptive Statistics → Frequencies

Select all the categorical variables in the data set and click OK. Then report:

d. Proportion that is male, $\hat{p}_{male}$ = _______
e. Place your screenshot of the frequency table for the variable that includes the information about males, from the Output window (IBM SPSS Statistics Viewer), here:
f. Proportion that used the Jones (generic) product, $\hat{p}_{generic}$ = _______
g. Place your screenshot of the frequency table for the variable that includes the information about the generic product, from the Output window (IBM SPSS Statistics Viewer), here:
h. Proportion with leukemia, $\hat{p}_{leukemia}$ = _______
i. Place your screenshot of the frequency table for the variable that includes the information about leukemia, from the Output window (IBM SPSS Statistics Viewer), here:
j. Proportion in relapse, $\hat{p}_{relapse}$ = _______
k. Place your screenshot of the frequency table for the variable that includes the information about relapse, from the Output window (IBM SPSS Statistics Viewer), here:
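For readers without SPSS, the counts and proportions asked for in Q3 can be illustrated with a small Python sketch. The TOX codes below are made up for illustration (the lab's data set is not reproduced here), the 0/1 coding is an assumption, and `frequency_table` is a hypothetical helper, not an SPSS function.

```python
from collections import Counter

# HYPOTHETICAL data: 1 = toxic reaction, 0 = no toxic reaction.
# These are NOT the lab's TOX values; substitute the real column.
tox = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]


def frequency_table(values):
    """Return {level: (count, proportion)} for a categorical variable."""
    counts = Counter(values)
    n = len(values)
    return {level: (count, count / n) for level, count in sorted(counts.items())}


n = len(tox)                          # number of observations
x = sum(1 for v in tox if v == 1)     # number of "successes" (toxicity cases)
p_hat = x / n                         # incidence proportion, p-hat = x / n

table = frequency_table(tox)
for level, (count, proportion) in table.items():
    print(f"TOX = {level}: count = {count}, proportion = {proportion:.3f}")
print(f"p-hat = {x}/{n} = {p_hat:.3f}")
```

The same helper could be applied to any of the other categorical variables (GENERIC, DIAG, STAGE) once their values are loaded; each proportion is simply the count for the level of interest divided by n.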