Calculating Correlation Coefficients and Creating Lines of Best Fit

School: Caldwell Community College and Technical Institute
Course: 152
Subject: Biology
Date: Feb 20, 2024
Pages: 6
Uploaded by vetteralexise

Is animal complexity correlated with miRNA diversity?

Animal phyla vary greatly in morphology, from simple sponges that lack tissues and symmetry to complex vertebrates. Members of different animal phyla have similar developmental genes, but the number of miRNAs (microRNAs) varies considerably. These small RNAs are involved in the regulation of gene expression. In this exercise, you will explore whether miRNA diversity is correlated with morphological complexity.

In the analysis, miRNA diversity is represented by the average number of miRNAs in a phylum ($x$), while morphological complexity is represented by the average number of cell types in that phylum ($y$). Researchers examined the relationship between these two variables by calculating the correlation coefficient ($r$).

First, practice reading the data table.

Animal phylum      $i$    $x_i$    $y_i$
Porifera             1      5.8     25
Platyhelminthes      2     35       30
Cnidaria             3      2.5     34
Nematoda             4      2.6     38
Echinodermata        5     38.6     45
Cephalochordata      6     33       68
Arthropoda           7     59.1     73
Urochordata          8     25       77
Mollusca             9     50.8     83
Annelida            10     58       94
Vertebrata          11    147.5    172.5

Data from Bradley Deline, University of West Georgia, and Kevin Peterson, Dartmouth College, 2013.

For the eighth observation ($i = 8$), identify the animal phylum, $x_i$, and $y_i$.
Urochordata, $x_i = 25$, $y_i = 77$

Which animal phyla have greater complexity than Urochordata?
Mollusca, Annelida, and Vertebrata

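The two table-reading questions can be cross-checked with a short Python sketch. The names `DATA`, `observation`, and `more_complex_than` are illustrative only and are not part of the original exercise; "complexity" is taken to mean the cell-type count $y_i$, as defined above.

```python
# Data table from the exercise: (phylum, mean miRNA count x_i, mean cell-type count y_i).
DATA = [
    ("Porifera", 5.8, 25),
    ("Platyhelminthes", 35, 30),
    ("Cnidaria", 2.5, 34),
    ("Nematoda", 2.6, 38),
    ("Echinodermata", 38.6, 45),
    ("Cephalochordata", 33, 68),
    ("Arthropoda", 59.1, 73),
    ("Urochordata", 25, 77),
    ("Mollusca", 50.8, 83),
    ("Annelida", 58, 94),
    ("Vertebrata", 147.5, 172.5),
]


def observation(i):
    """Return (phylum, x_i, y_i) for the 1-based observation number i."""
    return DATA[i - 1]


def more_complex_than(phylum):
    """List the phyla whose cell-type count y exceeds that of the given phylum."""
    y_ref = next(y for name, _, y in DATA if name == phylum)
    return [name for name, _, y in DATA if y > y_ref]


print(observation(8))
print(more_complex_than("Urochordata"))
```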
Calculating the correlation coefficient for this data set is a multi-step process. Use the data table on the second to last page of this document to enter your calculations.

1. Start by calculating the mean values for number of miRNAs ($\bar{x}$) and number of cell types ($\bar{y}$). (The mean is the sum of the data values divided by $n$, the number of observations.) Round your answers to one decimal place.
2. Next calculate the values for $(x_i - \bar{x})$ and $(y_i - \bar{y})$ for Porifera. Round your answers to one decimal place.
3. Next calculate the values for $(x_i - \bar{x})^2$ and $(y_i - \bar{y})^2$ for Porifera. Round your answers to the nearest whole number.
4. Then calculate the value for $(x_i - \bar{x})(y_i - \bar{y})$ for Porifera. Round your answer to the nearest whole number.
5. Repeat steps 2-4 for all phyla. Follow the steps above and enter your answers in the table attached at the end of this document.
6. Now fill in the sum values for $\sum (x_i - \bar{x})^2$, $\sum (y_i - \bar{y})^2$, and $\sum (x_i - \bar{x})(y_i - \bar{y})$. (The "$\sum$" symbol indicates that the $n$ values are added together.)
7. Next you need to calculate the standard deviation values. First, for number of miRNAs ($s_x$), use the following formula. In this formula, $n$ is the number of observations ($n = 11$).

$$s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

8. Then calculate the standard deviation for number of cell types ($s_y$), using a similar formula:

$$s_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}}$$

Enter your answers in the attached table. Round all numbers to one decimal place.

Now you have everything you need to calculate the correlation coefficient $r$ for the variables $x$ and $y$, using this formula:

$$r = \frac{\frac{1}{n - 1} \sum (x_i - \bar{x})(y_i - \bar{y})}{s_x s_y}$$

9. Give the correct value for $r$ to two decimal places. Keep in mind that the correlation coefficient can range in value between $-1$ and $1$.
$r = 0.93$

The correlation coefficient indicates the extent and direction of a linear relationship between two variables ($x$ and $y$) and ranges in value between $-1$ and $1$. When $r < 0$, $y$ and $x$ are negatively correlated, meaning that values of $y$ become smaller as values of $x$ become larger. When $r > 0$, $y$ and $x$ are positively correlated ($y$ becomes larger as $x$ becomes larger). When $r = 0$, the variables are not correlated.

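Steps 1 through 9 can be sketched in Python as a check on the hand calculation. This is a minimal sketch, not the worksheet's prescribed method: it assumes the sample standard deviation (dividing by $n - 1$), which is the form that agrees with the values in the attached table, and the function name `correlation_steps` is hypothetical. Intermediate values are kept unrounded, so sums may differ by a few units from a table that rounds each entry first.

```python
import math

# Mean miRNA counts (x) and mean cell-type counts (y) for the 11 phyla,
# in the order given in the data table (Porifera ... Vertebrata).
x = [5.8, 35, 2.5, 2.6, 38.6, 33, 59.1, 25, 50.8, 58, 147.5]
y = [25, 30, 34, 38, 45, 68, 73, 77, 83, 94, 172.5]


def correlation_steps(x, y):
    """Follow the worksheet steps and return the intermediate values and r."""
    n = len(x)
    # Step 1: means.
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Step 2: deviations from the mean.
    dx = [xi - x_bar for xi in x]
    dy = [yi - y_bar for yi in y]
    # Steps 3-6: squared deviations, cross products, and their sums.
    sum_dx2 = sum(d * d for d in dx)
    sum_dy2 = sum(d * d for d in dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    # Steps 7-8: sample standard deviations (assumed n - 1 denominator).
    s_x = math.sqrt(sum_dx2 / (n - 1))
    s_y = math.sqrt(sum_dy2 / (n - 1))
    # Step 9: correlation coefficient.
    r = (sum_dxdy / (n - 1)) / (s_x * s_y)
    return {"x_bar": x_bar, "y_bar": y_bar, "s_x": s_x, "s_y": s_y, "r": r}


result = correlation_steps(x, y)
for name, value in result.items():
    print(f"{name} = {value:.2f}")
```

Rounded as the worksheet asks, the printed values should agree with the hand-calculated means, standard deviations, and $r$.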
10. What is the correlation between miRNA diversity and animal complexity?
Positive correlation

Explaining variation using the coefficient of determination

The coefficient of determination is the square of the correlation coefficient ($r^2$), and it represents the proportion of the variation in $y$ that is explained by the relationship between $x$ and $y$. In other words, if our correlation coefficient is 0.5, then the coefficient of determination is $(0.5)^2 = 0.25$, and one can conclude that 0.25, or 25%, of the variation in morphological complexity of animals can be explained by the linear relationship between morphological complexity and diversity of miRNAs. Conversely, 75% of the variation in morphological complexity must be explained by other factors.

11. Use your correlation coefficient ($r$) to determine your coefficient of determination ($r^2$). How much of the variation in your data on morphological complexity can be explained by its relationship with miRNA diversity? What other factors do you think can contribute to the remaining variation? (Remember that this is not experimental data and therefore correlation does not actually imply causality. It is, however, an excellent starting point for hypothesis testing.)
My coefficient of determination is 0.86. That is, 0.86, or 86%, of the variation in morphological complexity can be explained by its relationship with miRNA diversity. The remaining 0.14, or 14%, must be explained by other factors.

12. Assume you are a researcher interested in addressing this question of correlation between miRNA diversity and animal complexity. Design an experiment to test it. What is your hypothesis?
I would repeat the analysis we just did, but within a single phylum, to see whether the correlation holds there. If the number of miRNAs is correlated with complexity, then the number of cell types will increase with the number of miRNAs.
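The arithmetic behind the coefficient of determination in question 11 is small enough to check in a few lines of Python. This is only an illustrative sketch; the helper name `explained_and_unexplained` is an assumption and does not come from the worksheet.

```python
def explained_and_unexplained(r):
    """Return (r squared, 1 - r squared) for a correlation coefficient r."""
    r_squared = r ** 2
    return r_squared, 1 - r_squared


# Worked example from the text: r = 0.5 gives 25% explained, 75% unexplained.
example = explained_and_unexplained(0.5)

# Worksheet value: r = 0.93, rounded to two decimal places as in question 9.
explained, unexplained = explained_and_unexplained(0.93)
print(f"explained: {explained:.2f}, unexplained: {unexplained:.2f}")
```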
Generating a regression line equation

The equation for a straight line between two variables, $x$ and $y$, is $y = mx + b$. In this equation, $m$ is the slope of the line and $b$ is the $y$-intercept (the point at which the straight line crosses the $y$-axis). When $m < 0$, the line has a negative slope. When $m > 0$, the line has a positive slope. The correlation coefficient, $r$, can be used to calculate the values of $m$ and $b$ in a linear regression.

13. First, use the values you have already calculated to calculate $m$ in this equation: $m = r(s_y / s_x)$. Give your answer to the nearest whole number. ($s_x$ and $s_y$ are the standard deviations of variables $x$ and $y$.)
$m = 1$

14. Now use your values to calculate $b$ (the $y$-intercept) in this equation: $b = \bar{y} - m\bar{x}$. Give your answer to the nearest whole number. ($\bar{x}$ and $\bar{y}$ are the means of these two variables.)
$b = 26$

Graphing the data with a regression line

Now that you know the equation for the regression line that best fits this data set, graph your data points and the line of best fit on the graph attached to the last page of this document. For the purposes of this activity, you are to hand draw the graph rather than creating it on a computer. If you do not have access to a printer or graph paper, you may draw your graph on a regular sheet of paper. Be careful to plot your data accurately. You already know the $y$-intercept, so you will need to solve the equation for one more data point and draw a straight line between those points. Also, remember to label your axes and to give your graph an appropriate title.

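The slope and intercept from questions 13 and 14, and the second point needed to draw the line, can be reproduced with the rounded worksheet values. This is a sketch under stated assumptions: the variable names are illustrative, and $x = 100$ is an arbitrary choice for the second point, not a value given in the exercise.

```python
# Rounded values taken from the worksheet.
r = 0.93
s_x, s_y = 40.8, 42.2
x_bar, y_bar = 41.6, 67.2

# Question 13: slope m = r * (s_y / s_x), then rounded to a whole number.
m = r * (s_y / s_x)
m_rounded = round(m)

# Question 14: intercept b = y_bar - m * x_bar, using the rounded slope
# as the worksheet does.
b_rounded = round(y_bar - m_rounded * x_bar)

# For comparison only: carrying the unrounded slope through gives an
# intercept of about 27.2 rather than 26.
b_unrounded = y_bar - m * x_bar


def predict(x):
    """Point on the rounded regression line y = m*x + b."""
    return m_rounded * x + b_rounded


print(f"m = {m:.2f} (rounds to {m_rounded}), b = {b_rounded}")
print(f"second point for drawing the line: (100, {predict(100)})")
```

Rounding $m$ before computing $b$ shifts the intercept by roughly one unit, which is small on the scale of the graph but worth knowing about when comparing against software output.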
Animal phylum      $i$    $x_i$    $(x_i-\bar{x})$    $(x_i-\bar{x})^2$    $y_i$    $(y_i-\bar{y})$    $(y_i-\bar{y})^2$    $(x_i-\bar{x})(y_i-\bar{y})$
Porifera             1      5.8             -35.8                 1282     25               -42.2                 1781                            1511
Platyhelminthes      2     35                -6.6                   44     30               -37.2                 1384                             246
Cnidaria             3      2.5             -39.1                 1529     34               -33.2                 1102                            1298
Nematoda             4      2.6             -39.0                 1521     38               -29.2                  853                            1139
Echinodermata        5     38.6              -3.0                    9     45               -22.2                  493                              67
Cephalochordata      6     33                -8.6                   74     68                 0.8                    1                              -7
Arthropoda           7     59.1              17.5                  306     73                 5.8                   34                             102
Urochordata          8     25               -16.6                  276     77                 9.8                   96                            -163
Mollusca             9     50.8               9.2                   85     83                15.8                  250                             145
Annelida            10     58                16.4                  269     94                26.8                  718                             440
Vertebrata          11    147.5             105.9                11215    172.5             105.3                11088                           11151

$\bar{x} = 41.6$, $s_x = 40.8$, $\sum (x_i-\bar{x})^2 = 16610$
$\bar{y} = 67.2$, $s_y = 42.2$, $\sum (y_i-\bar{y})^2 = 17800$
$\sum (x_i-\bar{x})(y_i-\bar{y}) = 15929$