Calculating Correlation Coefficients and Creating Lines of Best Fit

School

North Carolina State University

Course

101

Subject

Biology

Date

Apr 3, 2024

Pages

8

Uploaded by ConstablePartridgePerson982

Is animal complexity correlated with miRNA diversity?

Animal phyla vary greatly in morphology, from simple sponges that lack tissues and symmetry to complex vertebrates. Members of different animal phyla have similar developmental genes, but the number of miRNAs (microRNAs) varies considerably. These small RNAs are involved in the regulation of gene expression. In this exercise, you will explore whether miRNA diversity is correlated with morphological complexity. In the analysis, miRNA diversity is represented by the average number of miRNAs in a phylum ($x$), while morphological complexity is represented by the average number of cell types in that phylum ($y$). Researchers examined the relationship between these two variables by calculating the correlation coefficient ($r$).

First, practice reading the data table.

| Animal phylum | $i$ | $x_i$ | $y_i$ |
|---|---|---|---|
| Porifera | 1 | 5.8 | 25 |
| Platyhelminthes | 2 | 35 | 30 |
| Cnidaria | 3 | 2.5 | 34 |
| Nematoda | 4 | 2.6 | 38 |
| Echinodermata | 5 | 38.6 | 45 |
| Cephalochordata | 6 | 33 | 68 |
| Arthropoda | 7 | 59.1 | 73 |
| Urochordata | 8 | 25 | 77 |
| Mollusca | 9 | 50.8 | 83 |
| Annelida | 10 | 58 | 94 |
| Vertebrata | 11 | 147.5 | 172.5 |

Data from Bradley Deline, University of West Georgia, and Kevin Peterson, Dartmouth College, 2013.

For the eighth observation ($i = 8$), identify the animal phylum, $x_i$, and $y_i$. Which animal phyla have greater complexity than Urochordata?

Calculating the correlation coefficient for these data is a multi-step process. Use the data table on the second-to-last page of this document to enter your calculations.

1. Start by calculating the mean values for number of miRNAs ($\bar{x}$) and number of cell types ($\bar{y}$). (The mean is the sum of the data values divided by $n$, the number of observations.) Round your answers to one decimal place.
2. Next calculate the values for $(x_i - \bar{x})$ and $(y_i - \bar{y})$ for Porifera. Round your answers to one decimal place.
3. Next calculate the values for $(x_i - \bar{x})^2$ and $(y_i - \bar{y})^2$ for Porifera. Round your answers to the nearest whole number.
4. Then calculate the value for $(x_i - \bar{x})(y_i - \bar{y})$ for Porifera. Round your answer to the nearest whole number.
5. Repeat steps 2-4 for all phyla. Follow the steps above and enter your answers in the table attached at the end of this document.
6. Now fill in the sum values for $\sum (x_i - \bar{x})^2$, $\sum (y_i - \bar{y})^2$, and $\sum (x_i - \bar{x})(y_i - \bar{y})$. (The "$\sum$" symbol indicates that the $n$ values are added together.)
7. Next you need to calculate the standard deviation values. First, for number of miRNAs ($s_x$), use the following formula. In this formula, $n$ is the number of observations ($n = 11$).

$$s_x = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$$

8. Then calculate the standard deviation for number of cell types ($s_y$), using a similar formula:

$$s_y = \sqrt{\frac{1}{n-1} \sum (y_i - \bar{y})^2}$$

Enter your answers in the attached table. Round all numbers to one decimal place.

Now you have everything you need to calculate the correlation coefficient $r$ for the variables $x$ and $y$, using this formula:

$$r = \frac{\frac{1}{n-1} \sum (x_i - \bar{x})(y_i - \bar{y})}{s_x \, s_y}$$

9. Give the correct value for $r$ to two decimal places. Keep in mind that the correlation coefficient can range in value between $-1$ and $1$.

The correlation coefficient indicates the extent and direction of a linear relationship between two variables ($x$ and $y$). When $r < 0$, $y$ and $x$ are negatively correlated, meaning that values of $y$ become smaller as values of $x$ become larger. When $r > 0$, $y$ and $x$ are positively correlated ($y$ becomes larger as $x$ becomes larger). When $r = 0$, the variables are not linearly correlated.
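
Steps 1 through 9 can be cross-checked with a short script. The following is a minimal Python sketch, assuming the sample standard deviation (divisor $n - 1$) and the data table above; the function name `correlation_steps` and the dictionary keys are illustrative choices, not part of the worksheet. It keeps full precision at every step, so values may differ slightly from hand calculations that round intermediate results as the worksheet instructs.

```python
import math

# (phylum, x_i = mean number of miRNAs, y_i = mean number of cell types)
DATA = [
    ("Porifera", 5.8, 25), ("Platyhelminthes", 35, 30), ("Cnidaria", 2.5, 34),
    ("Nematoda", 2.6, 38), ("Echinodermata", 38.6, 45),
    ("Cephalochordata", 33, 68), ("Arthropoda", 59.1, 73),
    ("Urochordata", 25, 77), ("Mollusca", 50.8, 83), ("Annelida", 58, 94),
    ("Vertebrata", 147.5, 172.5),
]


def correlation_steps(xs, ys):
    """Follow the worksheet steps and return every intermediate value."""
    n = len(xs)
    # Step 1: means.
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Step 2: deviations from the mean.
    dx = [x - x_bar for x in xs]
    dy = [y - y_bar for y in ys]
    # Steps 3-6: sums of squared deviations and of cross products.
    sum_dx2 = sum(d * d for d in dx)
    sum_dy2 = sum(d * d for d in dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    # Steps 7-8: sample standard deviations (assumed divisor n - 1).
    s_x = math.sqrt(sum_dx2 / (n - 1))
    s_y = math.sqrt(sum_dy2 / (n - 1))
    # Step 9: correlation coefficient.
    r = (sum_dxdy / (n - 1)) / (s_x * s_y)
    return {"x_bar": x_bar, "y_bar": y_bar, "s_x": s_x, "s_y": s_y, "r": r}


xs = [x for _, x, _ in DATA]
ys = [y for _, _, y in DATA]
result = correlation_steps(xs, ys)
print(f"x_bar = {result['x_bar']:.1f}, y_bar = {result['y_bar']:.1f}")
print(f"s_x = {result['s_x']:.1f}, s_y = {result['s_y']:.1f}")
print(f"r = {result['r']:.2f}")
```

Because the same divisor appears in the numerator and in both standard deviations, $r$ itself does not depend on whether $n$ or $n - 1$ is used; only $s_x$ and $s_y$ do.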
10. What is the correlation between miRNA diversity and animal complexity?

Explaining variation using the coefficient of determination

The coefficient of determination is the square of the correlation coefficient ($r^2$), and it represents the proportion of the variation in $y$ that is explained by the linear relationship between $x$ and $y$. In other words, if the correlation coefficient is 0.5, then the coefficient of determination is $(0.5)^2 = 0.25$, and one can conclude that 0.25, or 25%, of the variation in morphological complexity of animals can be explained by the linear relationship between morphological complexity and diversity of miRNAs. Conversely, 75% of the variation in morphological complexity must be explained by other factors.

11. Use your correlation coefficient ($r$) to determine your coefficient of determination ($r^2$). How much of the variation in your data on morphological complexity can be explained by its relationship with miRNA diversity? What other factors do you think can contribute to the remaining variation? (Remember that these are not experimental data, and therefore correlation does not imply causality. It is, however, an excellent starting point for hypothesis testing.)
12. Assume you are a researcher interested in addressing this question of correlation between miRNA diversity and animal complexity. Design an experiment to test it. What is your hypothesis?
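
The coefficient-of-determination arithmetic in item 11 can be sketched in a few lines of Python. This is only an illustration using the worksheet's own example value of $r = 0.5$; the function name `variation_explained` is a hypothetical helper, not something the worksheet defines.

```python
def variation_explained(r):
    """Return (explained, unexplained) fractions of the variation in y."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    r_squared = r * r  # coefficient of determination
    return r_squared, 1.0 - r_squared


# Worked example from the text: r = 0.5 gives r^2 = 0.25.
explained, unexplained = variation_explained(0.5)
print(f"{explained:.0%} explained, {unexplained:.0%} unexplained")
# prints "25% explained, 75% unexplained"
```

Substituting your own value of $r$ from item 9 gives the answer to the first part of item 11.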

Generating a regression line equation

The equation for a straight line between two variables, $x$ and $y$, is $y = mx + b$. In this equation, $m$ is the slope of the line and $b$ is the $y$-intercept (the point at which the straight line crosses the $y$-axis). When $m < 0$, the line has a negative slope. When $m > 0$, the line has a positive slope. The correlation coefficient, $r$, can be used to calculate the values of $m$ and $b$ in a linear regression.

13. First, use the values you have already calculated to calculate $m$ in this equation: $m = r \, (s_y / s_x)$. Give your answer to the nearest whole number. ($s_y$ and $s_x$ are the standard deviations of variables $y$ and $x$, respectively.)
14. Now use your values to calculate $b$ (the $y$-intercept) in this equation: $b = \bar{y} - m\bar{x}$. Give your answer to the nearest whole number. ($\bar{y}$ and $\bar{x}$ are the means of these two variables.)

Graphing the data with a regression line

Now that you know the equation for the regression line that best fits this data set, graph your data points and the line of best fit on the graph attached to the last page of this document. For the purposes of this activity, you are to hand-draw the graph rather than creating it on a computer. If you do not have access to a printer or graph paper, you may draw your graph on a regular sheet of paper. Be careful to plot your data accurately. You already know the $y$-intercept, so you will need to solve the equation for one more data point and draw a straight line between those two points. Also, remember to label your axes and to give your graph an appropriate title.

15.
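
| Animal phylum | $i$ | $x_i$ | $(x_i - \bar{x})$ | $(x_i - \bar{x})^2$ | $y_i$ | $(y_i - \bar{y})$ | $(y_i - \bar{y})^2$ | $(x_i - \bar{x})(y_i - \bar{y})$ |
|---|---|---|---|---|---|---|---|---|
| Porifera | 1 | 5.8 | | | 25 | | | |
| Platyhelminthes | 2 | 35 | | | 30 | | | |
| Cnidaria | 3 | 2.5 | | | 34 | | | |
| Nematoda | 4 | 2.6 | | | 38 | | | |
| Echinodermata | 5 | 38.6 | | | 45 | | | |
| Cephalochordata | 6 | 33 | | | 68 | | | |
| Arthropoda | 7 | 59.1 | | | 73 | | | |
| Urochordata | 8 | 25 | | | 77 | | | |
| Mollusca | 9 | 50.8 | | | 83 | | | |
| Annelida | 10 | 58 | | | 94 | | | |
| Vertebrata | 11 | 147.5 | | | 172.5 | | | |
| | | $\bar{x} =$ | | $\sum (x_i - \bar{x})^2 =$ | $\bar{y} =$ | | $\sum (y_i - \bar{y})^2 =$ | $\sum (x_i - \bar{x})(y_i - \bar{y}) =$ |
| | | $s_x =$ | | | $s_y =$ | | | |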
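
The regression-line calculation in items 13 and 14 can be checked with the sketch below. It assumes sample standard deviations (Python's `statistics.stdev`, divisor $n - 1$) and uses the data from the table; the function name `regression_line` and the second plotting point at $x = 150$ are illustrative assumptions. The worksheet asks for $m$ and $b$ rounded to whole numbers, whereas this sketch keeps them unrounded, so a hand-drawn line may sit slightly differently.

```python
import statistics

# x_i = mean number of miRNAs, y_i = mean number of cell types, in order i = 1..11
xs = [5.8, 35, 2.5, 2.6, 38.6, 33, 59.1, 25, 50.8, 58, 147.5]
ys = [25, 30, 34, 38, 45, 68, 73, 77, 83, 94, 172.5]


def regression_line(xs, ys):
    """Return slope m and intercept b using m = r * (s_y / s_x), b = y_bar - m * x_bar."""
    n = len(xs)
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    s_x = statistics.stdev(xs)  # sample standard deviation (divisor n - 1)
    s_y = statistics.stdev(ys)
    # Correlation coefficient from the sum of cross products.
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = (cross / (n - 1)) / (s_x * s_y)
    m = r * (s_y / s_x)
    b = y_bar - m * x_bar
    return m, b


m, b = regression_line(xs, ys)

# A second point for drawing the line; x = 150 is an arbitrary choice
# near the right-hand edge of the data range.
x_second = 150.0
y_second = m * x_second + b

print(f"y = {m:.2f}x + {b:.1f}")
print(f"Line passes through (0, {b:.1f}) and ({x_second:.0f}, {y_second:.1f})")
```

Drawing a straight line through the two printed points gives the line of best fit for the scatter plot.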