lab quiz 5.2

School: University of Guelph
Course: 2400
Subject: Statistics
Date: Jan 9, 2024
Uploaded by ColonelSeaLion3327

Virtual Lab Quiz 5.2 Mapping a Qualitative and a Quant…
Attempt 1 of 3
Written Dec 7, 2023 5:28 PM - Dec 7, 2023 5:55 PM
Released Dec 4, 2021 8:00 AM
Attempt Score 8 / 8 - 100 %
Overall Grade (Highest Attempt) 8 / 8 - 100 %

Question 1 1 / 1 point

In Virtual Lab 5.1, you investigated the pairwise independence of the marker genotypes in the dataset of 461 corn plants genotyped for 10 markers. The markers separated into two groups on the basis of those pairwise tests of independence that you did. We won't expect you to remember the results so let's define marker group 1 as markers zmUG002, zmUG007, zmUG009 and zmUG010 and marker group 2 as zmUG003, zmUG004 and zmUG008.

By the way, if you've been wondering about the names of the markers, the convention is zm = Zea mays (Latin for corn), UG = U of G marker numbering and then a sequential number starting at 001 with enough zeros for the size of the study planned.

For these two marker groups, using the Virtual Lab 5.2 Spreadsheet to help you including the columns that have been pre-calculated to guide you, calculate the distance between the markers and arrange them in the correct order for each group. For group 1, use marker zmUG010 as the starting marker and for group 2 use zmUG004 as the starting marker. Then for each marker in group 1, pick the marker with the shortest distance from zmUG010 as the next marker in the order. From that marker, the next marker in the order will be the one with the shortest distance and so on. Repeat the same for group 2 starting with zmUG004.

Answer options for Question 1:
  • Marker group 1 order and distances: zmUG010 - 12 cM - zmUG007 - 20 cM - zmUG002 - 19 cM - zmUG009
    Marker group 2 order and distances: zmUG004 - 21 cM - zmUG003 - 15 cM - zmUG008
  • Marker group 1 order and distances: zmUG010 - 12 cM - zmUG007 - 20 cM - zmUG002 - 19 cM - zmUG009
    Marker group 2 order and distances: zmUG004 - 28 cM - zmUG008 - 15 cM - zmUG003
  • Marker group 1 order and distances: zmUG010 - 26 cM - zmUG002 - 12 cM - zmUG007 - 32 cM - zmUG009
    Marker group 2 order and distances: zmUG004 - 21 cM - zmUG003 - 15 cM - zmUG008

Question 2 1 / 1 point

One of the two marker sets will segregate independently from the Normal - Dwarf phenotype while the other marker set is linked to (i.e. does not segregate independently from) the Normal - Dwarf phenotype. Using the Chi-square tests that have been set up for you in the Question 2 section of the Virtual Lab 5.2 spreadsheet, determine which marker set is linked to the Normal - Dwarf phenotype. Linkage is determined by all of the markers in the set failing the test for independent segregation. Marker set 1 is zmUG002, 007, 009 and 010. Marker set 2 is zmUG003, 004 and 008.

Answer options for Question 2:
  • Marker set 1
  • Marker set 2

Question 3 1 / 1 point

Now that we know one of the marker sets is linked to the Normal - Dwarf phenotype, we can map the Normal - Dwarf phenotype as a qualitative trait. This process is known as linkage mapping: finding regions within the genome that are linked to specific traits.

Using the marker group that does not assort independently from the dwarf mutant, determine where the mutant causing the Normal - Dwarf phenotype falls relative to the markers. From Question 1, you know the order of the markers in the set. So now you need to figure out which pair of markers the Normal - Dwarf mutation falls in between. To do this, use the Question 3 section of the Virtual Lab 5.2 spreadsheet to calculate which two markers are on either side of the Normal - Dwarf mutation (i.e. which two are the smallest distance away) and then fill in the remaining distances between markers using some of the marker distances you calculated in Question 1.

Note that when you place the Normal - Dwarf mutation between the 2 markers, the sum of the new distances from marker <-> Normal - Dwarf <-> marker may not add up to the total distance marker <-> marker you originally had in Question 1. This is part of the fun and challenge with linkage mapping - it is a statistical map and so distances can vary a bit depending on the data you analyze. More observations = more accurate map.

Answer options for Question 3:
  • zmUG010 - 13 cM - zmUG007 - 20 cM - zmUG002 - 19 cM - zmUG009 - 37.5 cM - Normal-Dwarf
  • zmUG010 - 13 cM - Normal-Dwarf - 0.5 cM - zmUG007 - 20 cM - zmUG002 - 19 cM - zmUG009
  • zmUG010 - 12 cM - zmUG007 - 20 cM - zmUG002 - 24 cM - Normal-Dwarf - 37.5 cM - zmUG009

Question 4 1 / 1 point

In questions 1 to 3 we have mapped a qualitative trait - the Normal - Dwarf mutation. Now we will do the analysis a different way using the plant heights as the phenotype and look for a Quantitative Trait Locus affecting plant height. To do this, we will continue with a statistical analysis. Since many of you are just taking or will be taking stats, we will guide you through this process. The entire statistical output is available under "Content" on Courselink but we will cut out the specific pieces we need and present them here for this question.

To determine the presence of a QTL, we are looking for a statistically significant association between the marker genotypes and the plant heights. To determine statistical significance we are going to look at a probability value and an R-squared.

The probability (p-value) is a measure of the statistical confidence in the connection between marker genotypes and plant heights: it is the chance of seeing an association this strong if marker genotype and plant height were actually unrelated. To have a lot of confidence, we want to see a p-value less than 0.05 (i.e. less than a 5% chance of error) and hopefully really small, like less than 0.0001 (i.e. less than 0.01%, or almost no chance of error).

R-squared is a measure of how well the marker genotypes explain the data: it is the proportion of the variation in plant height accounted for by marker genotype. R-squared (or R^2) ranges from 0 to 1. An R-squared of 0 means no match with the data and an R-squared of 1 means a perfect match with the data. A perfect match never happens but the higher the R-squared, the better.

So here are the results we need to focus on out of that entire R output PDF. If this output looks messy and nothing lines up, drag the width of your window out until the tables below don't change shape.

Marker zmUG002
Analysis of Variance Table
Response: height
           Df Sum Sq Mean Sq F value    Pr(>F)
zmUG002     2  17977  8988.6  37.402 9.003e-16
Residuals 458 110070   240.3
Multiple R-squared: 0.1404

Marker zmUG007
Analysis of Variance Table
Response: height
           Df Sum Sq Mean Sq F value    Pr(>F)
zmUG007     2  87291   43645  490.46 < 2.2e-16
Residuals 458  40757      89
Multiple R-squared: 0.6817

Marker zmUG009
Analysis of Variance Table
Response: height
           Df Sum Sq Mean Sq F value    Pr(>F)
zmUG009     2   3517  1758.5  6.4675  0.001699
Residuals 458 124530   271.9
Multiple R-squared: 0.02747

Marker zmUG010
Analysis of Variance Table
Response: height
           Df Sum Sq Mean Sq F value    Pr(>F)
zmUG010     2  44762 22380.8  123.08 < 2.2e-16
Residuals 458  83286   181.8
Multiple R-squared: 0.3496

From these results, we can see that there are 2 markers tied for smallest p-value (i.e. smallest Pr(>F)). In case you didn't recognize it, the notation < 2.2e-16 means < 2.2 x 10^-16, or in words a number less than 2.2 times 10 to the power -16, which is a really, really, really, really small number. That is good: it gives us lots of confidence in the results. They are tied on probability, so to decide which of these two markers is better we need to look for which one has the highest R-squared value to determine which marker genotypes are a better fit for the data and therefore decide which marker is the most closely linked to the QTL.

On the basis of p-value and R-squared showing us a strong statistical association, which marker is most closely linked to the QTL controlling plant height?

(Note - if you look really closely, you may notice that the p-values and R-squared values get better for markers with smaller map distances from the marker to the QTL so question 3 may give you some hints with this question)
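
As a cross-check on the tables above, the Multiple R-squared for a one-factor ANOVA like these can be recomputed as the marker sum of squares divided by the total sum of squares (marker plus residual). The short Python sketch below does that with the Sum Sq values copied from the R output. The dictionary and function names are made up for illustration and are not part of the lab materials.

```python
# Sums of squares copied from the ANOVA tables above: (marker Sum Sq, residual Sum Sq).
anova_ss = {
    "zmUG002": (17977, 110070),
    "zmUG007": (87291, 40757),
    "zmUG009": (3517, 124530),
    "zmUG010": (44762, 83286),
}


def r_squared(ss_marker, ss_residual):
    """Share of the total sum of squares explained by marker genotype."""
    return ss_marker / (ss_marker + ss_residual)


# Recompute R-squared for every marker and find the largest one.
r2 = {marker: r_squared(*ss) for marker, ss in anova_ss.items()}
best_marker = max(r2, key=r2.get)

for marker, value in r2.items():
    print(f"{marker}: R-squared = {value:.4f}")
print("Highest R-squared:", best_marker)
```

The recomputed values agree with the reported Multiple R-squared lines after rounding, which is a quick way to confirm you are reading the right rows of each table.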
Answer options for Question 4:
  • Marker zmUG002 is most closely linked to a QTL controlling plant height with the smallest p-value (aka Pr(>F) value) of 9.003e-16 and the largest R-squared value of 0.1404
  • Marker zmUG007 is most closely linked to a QTL controlling plant height with the smallest p-value (aka Pr(>F) value) of <2.2e-16 and the largest R-squared value of 0.6817
  • Marker zmUG009 is most closely linked to a QTL controlling plant height with the smallest p-value (aka Pr(>F) value) of 0.001699 and the largest R-squared value of 0.02747
  • Marker zmUG010 is most closely linked to a QTL controlling plant height with the smallest p-value (aka Pr(>F) value) of <2.2e-16 and the largest R-squared value of 0.3496

Question 5 1 / 1 point

Now that we have decided which of the markers is most closely linked to the QTL controlling plant height, what is the average plant height associated with each marker genotype? Again, the relevant pieces of the R output are provided below for all markers. So the first step is to look back at Question 4 to decide which marker you are focusing on and then record the average plant height for each genotype from the R output below.

In the tables below, each genotype is listed and for each genotype you have "lsmean", SE (standard error), df, lower.CL (lower confidence limit) and upper.CL (upper confidence limit). The term "lsmean" refers to Least Squares Mean or in our case the mean or average plant height in centimetres for the corn plants with that marker genotype. We don't need any of the other data, but the SE is a measure of how precisely the mean is estimated, df is the degrees of freedom and the lower.CL and upper.CL define the + / - for the values of the mean. For example, for zmUG002, the mean height for plants with the A (i.e. AA) genotype was 30.9cm, with a 95% confidence interval for the average plant height of 27.5cm up to 34.3cm.

Marker zmUG002
zmUG002 lsmean    SE  df lower.CL upper.CL
A         30.9  1.41 458     27.5     34.3
G         46.9  1.45 458     43.5     50.4
GA        43.6  1.03 458     41.1     46.1

Marker zmUG007
zmUG007 lsmean    SE  df lower.CL upper.CL
C         17.4 0.876 458     15.3     19.5
T         49.3 0.844 458     47.3     51.4
TC        48.9 0.636 458     47.4     50.4

Marker zmUG009
zmUG009 lsmean    SE  df lower.CL upper.CL
A         36.5  1.51 458     32.9     40.1
G         43.6  1.59 458     39.7     47.4
GA        42.3  1.08 458     39.7     44.9

Marker zmUG010
zmUG010 lsmean    SE  df lower.CL upper.CL
A         48.3 1.241 458     45.3     51.2
AG        45.7 0.883 458     43.6     47.8
G         23.6 1.286 458     20.5     26.7

So for the marker you picked in Question 4 above as being the closest statistical association with a QTL for plant height (which you can double check by map distance in Question 3), what is the average plant height for each genotype?
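
One common way to summarise genotype means like those in the lsmeans tables above is the degree of dominance, d/a. Here a is half the difference between the two homozygote means and d is how far the heterozygote mean sits from the midpoint of the homozygotes. A ratio near 0 suggests no dominance, and a ratio near 1 suggests the heterozygote looks like the taller homozygote. The Python sketch below applies this to the lsmeans copied from the tables. The function name and data layout are illustrative assumptions, not something the lab spreadsheet or R output provides.

```python
# lsmeans (cm) copied from the tables above:
# (taller homozygote, shorter homozygote, heterozygote) for each marker.
lsmeans = {
    "zmUG002": (46.9, 30.9, 43.6),  # G, A, GA
    "zmUG007": (49.3, 17.4, 48.9),  # T, C, TC
    "zmUG009": (43.6, 36.5, 42.3),  # G, A, GA
    "zmUG010": (48.3, 23.6, 45.7),  # A, G, AG
}


def degree_of_dominance(hom_tall, hom_short, het):
    """Return d/a: heterozygote deviation from the midpoint, scaled by a."""
    midpoint = (hom_tall + hom_short) / 2
    a = (hom_tall - hom_short) / 2  # half the homozygote difference
    d = het - midpoint              # heterozygote deviation from midpoint
    return d / a


ratios = {marker: degree_of_dominance(*means) for marker, means in lsmeans.items()}

for marker, ratio in ratios.items():
    print(f"{marker}: d/a = {ratio:.2f}")
```

These ratios are only a descriptive summary of the reported means. They ignore the standard errors, so the confidence limits in the tables are still worth checking before drawing a conclusion.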
Answer options for Question 5:
  • Parent 1 Genotype G - height = 46.9cm
    Parent 2 Genotype A - height = 30.9cm
    Heterozygous F1 GA - height = 43.6cm
  • Parent 1 Genotype T - height = 49.3cm
    Parent 2 Genotype C - height = 17.4cm
    Heterozygous F1 TC - height = 48.9cm
  • Parent 1 Genotype G - height = 43.6cm
    Parent 2 Genotype A - height = 36.5cm
    Heterozygous F1 GA - height = 42.3cm
  • Parent 1 Genotype A - height = 48.3cm
    Parent 2 Genotype G - height = 26.7cm
    Heterozygous F1 AG - height = 45.7cm

Question 6 1 / 1 point

Based on the results for all of the markers in questions 4 and 5 but most obviously for the marker you picked in question 5 above, what is the most likely form of expression for this QTL?

Answer options for Question 6:
  • Complete dominance of the Parent 1 allele
  • Incomplete dominance of the Parent 2 allele
  • No dominance being expressed by either parental allele

Question 7 1 / 1 point

One way to visually check a dataset for the presence of a QTL is to plot the phenotypic observations and see if they appear to be continuous or if there is a tendency for the observations to cluster into two "clumps". The scientific term for "two clumps of data" is a bimodal distribution. When looking at plant or animal phenotypes, if the phenotype shows a bimodal distribution, then there is a good chance there is a QTL affecting the phenotype. Looking at the relevant piece extracted from the R output PDF below, what do you conclude from the distribution of the heights of the 461 plants in the dataset you've been working with?
Answer options for Question 7:
  • The observations for plant height show a normal distribution which is additional evidence for the presence of a QTL for plant height consistent with our conclusions in Questions 3 and 4
  • A picture can be photoshopped to create fake news so a picture can't tell us anything about genetics
  • The observations for plant height show a bimodal distribution which is additional evidence for the presence of a QTL for plant height consistent with our conclusions in Questions 3 and 4
  • The observations for plant height show a bimodal distribution which contradicts the evidence for the presence of a QTL for plant height in Questions 3 and 4

Question 8 1 / 1 point

Quantitative Geneticists go to the effort to find QTL in order to improve selection response. As we know from the swine PSE / PSS example in class, it is possible to select individuals with the best genotype and completely eliminate a less favourable allele in one generation.

You have graduated and been hired by a plant breeding company. In starting your job, you want to impress your boss with all of your knowledge about corn from MBG-2400 and you realize you can use some of the results from Virtual Lab 5.2 to help move your career ahead. Your boss wants you to quickly increase the height of the corn plants. You know from studying selection in MBG-2400 that you can do the same as the pig industry and eliminate the least favourable plant height allele for your corn breeding program in one generation.

Based on the results of the analysis of plant height for the presence of a QTL and the conclusions in Question 4 (backed up by Question 3), to eliminate the dwarf allele of the plant height QTL in one generation, you should assign the following fitness levels to these three marker genotypes . . .

Answer options for Question 8:
  • Using the zmUG002 marker only, assign fitness of one to the Parent 1 genotype (G) and a fitness of zero to both the F1 (GA) and Parent 2 (A) genotypes.
  • Using the zmUG007 marker only, assign fitness of one to the Parent 1 genotype (T) and a fitness of zero to both the F1 (TC) and Parent 2 (C) genotypes.
  • Using the zmUG009 marker only, assign fitness of one to the Parent 1 genotype (G) and a fitness of zero to both the F1 (GA) and Parent 2 (A) genotypes.
  • Using the zmUG010 marker only, assign fitness of one to the Parent 1 genotype (A) and a fitness of zero to both the F1 (AG) and Parent 2 (G) genotypes.
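
To see why every option above gives the heterozygote a fitness of zero, it can help to work out the allele frequency among the selected parents. The Python sketch below uses the usual single-locus calculation: weight each genotype frequency by its fitness, then count the favourable allele among the survivors. The starting genotype frequencies (0.25, 0.50, 0.25) are hypothetical values chosen for illustration and are not taken from the lab dataset. The function and key names are also made up, and the sketch treats the marker as if it were the QTL itself, ignoring recombination between them.

```python
def favourable_allele_freq(geno_freq, fitness):
    """Frequency of the favourable allele among parents kept after selection.

    Both arguments are dicts keyed by 'hom_fav', 'het' and 'hom_unfav'.
    """
    mean_fitness = sum(geno_freq[g] * fitness[g] for g in geno_freq)
    if mean_fitness == 0:
        raise ValueError("no genotype is kept under these fitness values")
    # Homozygotes carry two favourable alleles, heterozygotes carry one.
    kept_favourable = (geno_freq["hom_fav"] * fitness["hom_fav"]
                       + 0.5 * geno_freq["het"] * fitness["het"])
    return kept_favourable / mean_fitness


# Hypothetical starting genotype frequencies (assumption, not lab data).
geno_freq = {"hom_fav": 0.25, "het": 0.50, "hom_unfav": 0.25}

# Marker-based selection: keep only the favourable homozygote.
keep_hom_only = {"hom_fav": 1.0, "het": 0.0, "hom_unfav": 0.0}
# Selection that also keeps heterozygotes, e.g. culling only the short plants.
keep_hom_and_het = {"hom_fav": 1.0, "het": 1.0, "hom_unfav": 0.0}

p_marker = favourable_allele_freq(geno_freq, keep_hom_only)
p_pheno = favourable_allele_freq(geno_freq, keep_hom_and_het)
print(f"Keep homozygotes only:       p = {p_marker:.3f}")
print(f"Keep homozygotes and hets:   p = {p_pheno:.3f}")
```

Under these assumed frequencies, keeping only the favourable homozygote leaves no copies of the unfavourable allele among the parents, while keeping heterozygotes as well lets it persist. In practice the result also depends on how tightly the chosen marker is linked to the QTL.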