STAT251_Assignment1_2023forupload

Faculty of Engineering and Information Sciences
School of Mathematics and Applied Statistics
http://eis.uow.edu.au/smas/index.html

STAT251 Fundamentals for Biostatistics
Spring Session 2023
Assignment 1
Due Date: August 18th, 5:00pm (Friday Week 4)

The dataset anemia_data.omv can be found on Moodle, along with a data dictionary document explaining the coding of variables (anemia_data_dictionary.docx). The data contain information on potentially anaemic women and are designed to represent the relationships between Haemoglobin measurements, prescription/recommendation of iron supplements and actions taken based on the recommendations, along with some demographic and social information. There are 200 patients in the dataset.

Haemoglobin concentration (Hb), measured in g/dL, is a measure of how healthy your red blood cells are, as they carry oxygen to the cells in your body. Low values can have serious health effects. In general, low values of Haemoglobin are diagnosed as anaemia. It is known that iron deficiency is related to low concentrations of Haemoglobin. Healthy ranges for women are considered to be between 11.6 and 16.5 g/dL (according to Lifeblood Australia) at low altitudes or sea level.

The investigators are interested in describing the distribution of Hb in the patients, how many of them have low levels, and whether there is a relationship between age intake and Hb, amongst other things. This is of interest because the study is conducted at high altitude rather than at sea level, and hence the cut-off the investigators use for recommending supplements is different.

The dataset is taken from a publicly available repository. Reference to come.

1. Use the dataset "anemia_data.omv" on Moodle to answer the following questions.
   a. What type of variable is Hb (main category and subtype)?
      Quantitative, continuous
   b. What type of variable is Wait_time (main category and subtype)?
      Quantitative, discrete
      i. Based on your answer above, is it correctly set up in Jamovi? If not, what should it be coded to?
         No. Wait_time should be set up as continuous with all values stored as integers. It is currently set up as continuous with all values stored as decimals.
   c. Use Jamovi to produce a histogram of Hb; include the relevant output.
Marks: /2 /1 /2 /2
   d. Use Jamovi to produce a box and whisker plot of Hb; include the relevant output.
      i. Describe the distribution of Hb.
         The distribution of Hb is skewed to the right (positively skewed), which is consistent with the box and whisker plot.
   e. Construct a 5 number summary of Hb, using Jamovi to find the Q1 (25th), Q2 (50th) and Q3 (75th) percentiles and other relevant output. (Report all values to 1 decimal place.)
   f. Use Jamovi to find and report the standard deviation of Hb to 1 decimal place.
   g. What is the coefficient of variation of Hb? (Show calculations to 1 decimal place.)
   h. How would you use a coefficient of variation?
   i. Is the lowest value of Hb (10.4) an outlier using the
      I. z-score method?
      II. IQR method? (Show calculations for both methods.)
      III. Jamovi boxplot? (Explain your answer.)
Marks: /1 /2 /5 /1 /1 /1 /6
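The calculations behind parts e to i can be sketched in Python as a cross-check on the Jamovi output. This is only an illustrative sketch: the Hb values below are invented and are not taken from anemia_data.omv, the quartile interpolation method is an assumption (Jamovi may interpolate differently), and the |z| > 3 cut-off is a common convention that may differ from the one given in the lecture.

```python
import statistics

# Hypothetical Hb values in g/dL, invented for illustration only.
# They are NOT the values in anemia_data.omv.
hb = [10.4, 11.8, 12.1, 12.5, 12.9, 13.0, 13.2, 13.6, 14.1, 14.8]

mean = statistics.mean(hb)
sd = statistics.stdev(hb)  # sample standard deviation (n - 1 denominator)

# Coefficient of variation: the SD expressed as a percentage of the mean.
cv = sd / mean * 100

# Quartiles. method="inclusive" is an assumption; other software may
# use a different interpolation rule and give slightly different values.
q1, q2, q3 = statistics.quantiles(hb, n=4, method="inclusive")
iqr = q3 - q1
five_number_summary = (min(hb), q1, q2, q3, max(hb))


def z_score(x):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def is_outlier_z(x, threshold=3.0):
    """z-score method; the threshold of 3 is an assumed convention."""
    return abs(z_score(x)) > threshold


def is_outlier_iqr(x):
    """IQR method: outside Q1 - 1.5*IQR or Q3 + 1.5*IQR."""
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return x < lower_fence or x > upper_fence


lowest = min(hb)
print("5 number summary:", [round(v, 1) for v in five_number_summary])
print("SD:", round(sd, 1), "CV (%):", round(cv, 1))
print("z-score of lowest value:", round(z_score(lowest), 2))
print("Outlier by z-score method:", is_outlier_z(lowest))
print("Outlier by IQR method:", is_outlier_iqr(lowest))
```

Because the coefficient of variation is unit-free, it can be used to compare the relative spread of variables measured on different scales. The two outlier rules can disagree for the same point, which is why the question asks for both.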
2. By using Exploration, then Descriptives, in Jamovi, display the categories of How serious is anemia (How_serious_anemia) using:
   1. The appropriate table, and
   2. The appropriate plot (see screenshot; click the Frequency tables option, and select the appropriate plot).
   Include the output for the table and plot.
3. Say we are interested in an exploratory analysis of Age (Age) and how serious the patients think anemia is (Howseriousisanemia).
   a. What numerical summaries would you look at to explore a relationship? Create them in Jamovi.
   b. What graphical tools would you use to explore such a relationship? Use Jamovi to create them.
4. The scatterplot, correlation coefficient table and regression output below show the relationship between level of Hb and Age. All the relevant output is provided; you do not need to use Jamovi for this question.
   a. Does the relationship look approximately linear? (Yes or no)
   b. Is the relationship negative, positive, or none?
   c. From the output below, write out the regression equation for predicting Hb from Age in the form of a straight line.
   d. Interpret the regression equation for predicting Hb from Age.
Marks: /1 /1 /2 /2 /2 /2 /2
   e. Interpret the strength of the correlation coefficient using the criteria given in the lecture.
   f. Circle any points on the scatterplot you think might be bivariate outliers.
   g. What would you do if you thought a point was a bivariate outlier?
   h. Without yet knowing anything about p-values, confidence intervals, etc., and looking at the scatterplot along with the coefficients, does it seem to you that there is a relationship between Age and Hb? That is, can you actually explain Hb based on Age alone? Justify your answer.
Marks: /1 /1 /2 /2
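To make the regression equation and correlation coefficient in question 4 concrete, the sketch below fits a least-squares line of the form Hb = intercept + slope × Age and computes Pearson's r. The (Age, Hb) pairs are invented for illustration and are not the assignment data, and the function name fit_line is hypothetical. The coefficients it prints will therefore not match the output provided for this question.

```python
import math

# Hypothetical (Age, Hb) pairs, invented for illustration only.
# They are NOT the values in anemia_data.omv.
age = [20, 25, 30, 35, 40]
hb = [13.5, 13.1, 12.9, 12.4, 12.1]


def fit_line(x, y):
    """Least-squares intercept and slope, plus Pearson's r, for y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r


intercept, slope, r = fit_line(age, hb)
print(f"Predicted Hb = {intercept:.2f} + ({slope:.3f}) x Age")
print(f"Pearson r = {r:.3f}")
```

The slope is read as the predicted change in Hb (g/dL) for each additional year of age, and the intercept as the predicted Hb at Age 0, which is usually an extrapolation outside the observed ages. The sign of r always matches the sign of the slope.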
Scatterplot