Past exam Qus

School

University of New South Wales

Course

2041

Subject

Geography

Date

May 3, 2024

Question 1. Answer all parts 1A to 1E.

Prompted by the devastating fire season this year in Australia, a climate scientist decided to analyse whether climate change played a significant role in this particular event. The scientist first plotted the annual average of monthly maximum temperature anomalies in southern Australia since 1910:

The scientist then performed a statistical analysis on her computer to analyse the data above and saw the following on her screen:

1A) What kind of analysis did she perform? Write the equation, explain the information you can gather from this output in your own words, and conclude.
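A trend of a variable over time, as in the plot described above, is commonly summarised with a straight-line fit of the form y = b0 + b1 * x. The sketch below is only an illustration of how such a fit could be computed by ordinary least squares; the exam's data and on-screen output are not reproduced in this preview, so the `years` and `anomaly` values and the function name `fit_line` are made up for the example.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x.

    Returns (b0, b1, r_squared).
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope: change in y per unit of x
    b0 = y_bar - b1 * x_bar             # intercept: fitted y at x = 0
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return b0, b1, 1 - ss_res / ss_tot  # R^2: share of variance explained


# Hypothetical values for illustration only (NOT the exam's data).
years = [1910, 1940, 1970, 2000, 2019]
anomaly = [-0.5, -0.3, 0.0, 0.4, 1.1]
intercept, slope, r_squared = fit_line(years, anomaly)
```

Here `slope` would be read as the average change in temperature anomaly per year, and `r_squared` as the fraction of year-to-year variation accounted for by the linear trend.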
Dangerous fire conditions depend not only on maximum temperatures but also on fuel load (vegetation biomass) and on whether the fuel load is dry enough to burn. She therefore performed a similar analysis for annual mean precipitation (in mm per month) over time and got the following result:

1B) Write the equation, explain the information you can gather from this output in your own words, and conclude.

The scientist then divided her data into three groups, years with large fires (> 1M acres burnt), years with medium fires (between 0.5M and 1M acres burnt) and years with small fires (< 0.5M acres burnt), and produced the following box plot:
She then ran a statistical analysis and saw the following on her screen:

    Source    SS         df     MS         F       Prob>F
    Groups    4.7245     2      2.36225    5.38    0.0059
    Error     46.9945    107    0.4392
    Total     51.719     109

1C) What kind of analysis did she perform? Explain the information you can gather from this output in your own words, and conclude.

Here is a similar plot for precipitation and the result of a similar analysis for precipitation:

    Source    SS         df     MS         F       Prob>F
    Groups    317.33     2      158.666    4.72    0.0109
    Error     3596.84    107    33.615
    Total     3914.18    109
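The columns of the two tables above are linked by simple arithmetic: each mean square (MS) is SS divided by df, and F is the ratio of the Groups MS to the Error MS. The sketch below rechecks those numbers from the printed SS and df values. The function names are illustrative, and the p-value helper uses a closed form that holds only when the numerator has 2 degrees of freedom, which happens to be the case for both tables.

```python
def anova_from_sums(ss_groups, df_groups, ss_error, df_error):
    """Return (MS_groups, MS_error, F) from sums of squares and degrees of freedom."""
    ms_groups = ss_groups / df_groups
    ms_error = ss_error / df_error
    return ms_groups, ms_error, ms_groups / ms_error


def f_upper_tail_df1_2(f_stat, df_error):
    """P(F > f_stat) for an F distribution with 2 numerator df.

    Closed form valid ONLY for 2 numerator degrees of freedom.
    """
    return (1 + 2 * f_stat / df_error) ** (-df_error / 2)


# Values copied from the temperature and precipitation tables above.
ms_g_temp, ms_e_temp, f_temp = anova_from_sums(4.7245, 2, 46.9945, 107)
ms_g_rain, ms_e_rain, f_rain = anova_from_sums(317.33, 2, 3596.84, 107)

p_temp = f_upper_tail_df1_2(f_temp, 107)
p_rain = f_upper_tail_df1_2(f_rain, 107)

print(f"temperature:   F = {f_temp:.2f}, p = {p_temp:.4f}")
print(f"precipitation: F = {f_rain:.2f}, p = {p_rain:.4f}")
```

Both recomputed F ratios and tail probabilities agree with the printed tables to the precision shown there.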
1D) Explain the information you can gather from this output in your own words and conclude.

Below is a different way to visualize the data with the black diamonds marking the years with large fires (exceeding 1M acres of burnt land).

1E) Most of the fires are sitting in the upper left quadrant. What does this mean? Would there have been a smarter way to analyse the data? How?
Question 2. Answer all parts 2A to 2D.

A team of conservation biologists was monitoring the reproduction of an endangered plant in a national park south of Sydney. They set up 15 plots in the heathland in Spring 2019 and measured the number of seedlings that had recently emerged. Over summer 2019/2020, the group actively removed weeds from all the plots. In Spring 2020, they revisited the plots and again measured the number of seedlings that had recently emerged. They collected the following data and plotted the differences between seedling recruitment in the two years.

    Plot    Number of seedlings in 2019    Number of seedlings in 2020
    1       5                              18
    2       3                              24
    3       7                              45
    4       9                              23
    5       10                             76
    6       23                             30
    7       2                              15
    8       8                              8
    9       12                             26
    10      15                             34
    11      31                             41
    12      10                             32
    13      8                              19
    14      2                              15
    15      9                              24

A paired t-test was run to formally test whether the two years were different.

        Paired t-test

    data:  Seedlings$Seedlings by Seedlings$Year
    t = -4.5718, df = 14, p-value = 0.0004352
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -27.032092  -9.767908
    sample estimates:
    mean of the differences
                      -18.4
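The paired t-test output above can be reproduced by hand from the plot-level data: take the difference within each plot (2019 minus 2020), then divide the mean difference by its standard error. The sketch below does this with the Python standard library. The critical value 2.1448 is the two-sided 95% value for 14 degrees of freedom taken from a standard t table rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Seedling counts per plot, copied from the table above (plots 1 to 15).
seedlings_2019 = [5, 3, 7, 9, 10, 23, 2, 8, 12, 15, 31, 10, 8, 2, 9]
seedlings_2020 = [18, 24, 45, 23, 76, 30, 15, 8, 26, 34, 41, 32, 19, 15, 24]

# Within-plot differences: the pairing removes plot-to-plot variation.
diffs = [a - b for a, b in zip(seedlings_2019, seedlings_2020)]

n = len(diffs)
mean_diff = mean(diffs)
std_error = stdev(diffs) / sqrt(n)   # sample SD of differences / sqrt(n)
t_stat = mean_diff / std_error
deg_freedom = n - 1

# Assumption: critical value read from a t table (two-sided 95%, 14 df).
T_CRIT_95_DF14 = 2.1448
ci_95 = (mean_diff - T_CRIT_95_DF14 * std_error,
         mean_diff + T_CRIT_95_DF14 * std_error)

print(f"mean difference = {mean_diff:.1f}, t = {t_stat:.4f}, df = {deg_freedom}")
print(f"95% CI: {ci_95[0]:.3f} to {ci_95[1]:.3f}")
```

The recomputed mean difference, t statistic, degrees of freedom and confidence interval match the printed output to the precision shown.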
2A) Why did the scientists decide to use a paired t-test?

2B) Write a paragraph that could be given to the national park managers to explain the result.

2C) Can you see any problems with the analysis that they ran?

2D) How confident are you that the results can be explained by the weed removal work conducted between the two sampling events? Describe a sampling design that would allow a more effective test of whether removing weeds could help the regeneration of the endangered plant.
Question 3. Answer all parts 3A to 3D.

Reef Life Survey is a citizen science program where SCUBA divers record the abundance of fish and other organisms from reefs all around the world. By April 2020, over 13000 surveys had been conducted in 53 countries, recording over 18 million individuals from almost 5000 species. From the 432325 observations currently available from Australia, I have extracted a sample of these data. The sample has 1000 of the surveys conducted, with data on the abundance of 1759 species of fish. These come from the Temperate Australasia and Indo-West Pacific biogeographic realms (the light green and dark purple regions to the south and north of Australia). Within those realms, there are smaller ecoregions.
Here is a multidimensional scaling (MDS) plot that visualises those surveys with each point colour-coded by each of the 20 ecoregions around Australia, with a different symbol for the two biogeographic realms.

3A) What do each of the points on the plot and the distance between them represent?

3B) The stress value for that MDS analysis was 0.105. What does this mean?

3C) What do you conclude about the variation in fish communities across Australia from the MDS plot?

3D) "The plot has strong evidence that tropical fish are more abundant than those in temperate regions." True or false? Discuss with reasons.
1/4/2022, 5:15 pm Final - Term 1 2022 - BEES2041-BEES5041 - Data Analysis: Life & Earth Sc Page 1 of 26 https://unsw.inspera.com/static/player?viewMedia=print&printPar…%7D&locale=en_us&context=preview&assessmentRunId=108331043#/all

2022 - BEES2041 Cover sheet

BEES2041-BEES5041 – Data Analysis for Life and Earth Scientists
Final Exam – Term 1 2022

Instructions:
1. Time allowed – 2 hours, plus 15 minutes.
2. Total number of questions to be answered – 16
3. Total marks available – 100 marks, worth 35% of the total marks for the course.
4. Marks available for each question are shown in the exam.
5. Students are advised to read all of the examination questions before attempting to answer the questions.
6. This exam cannot be copied, forwarded, or shared in any way
7. Students are reminded of the UNSW rules regarding academic integrity and plagiarism
8. Your work will be saved periodically throughout the exam and will be automatically submitted when the test ends provided you are connected to the internet
9. You must upload all of your work within the exam time. There is no extra time to upload. No late submissions will be accepted.

If you have a question or concern during today’s exam, you should contact the exams team for support at Phone +61 2 8936 7007 or the online form.
NSW is home to around 7200 native plant species. Of these, 574 are listed as either Vulnerable, Endangered or Critically Endangered under the NSW Biodiversity Conservation Act 2016. From here on we'll refer to species as being listed or un-listed.

A scientist was interested to understand whether there are particular characteristics that make species more likely to be listed under the act. They hypothesised that some growth forms (trees) would be more likely to be listed, as they have longer life cycles and smaller populations in a given area of habitat.

To test this, they first compiled a data set of the entire NSW flora, including species name, status (listed or not listed) and the species growth form (tree, herb, shrub, climber). They then calculated the following by growth form:

n_species_listed: the number of listed species in each growth form
n_species_total: the number of species in each growth form
percent_listed: the percentage of species listed in each growth form (= n_species_listed / n_species_total * 100)

The resulting dataset looked like this:
When first analysing the data, they noticed a large number of shrubs recorded as listed, and fewer trees (panel a). But there are also big differences in the number of species present (panel b). They therefore calculated the fraction of species in each growth form that were listed (panel c), so that they could standardise the number of species listed by the total number of species. This panel seemed to support the researchers' hypothesis that more trees were listed, but also shrubs.

Before getting too excited, the researchers ran a statistical test comparing the number of listed species in each group to the proportion of total species in each growth form.
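One plausible reading of the comparison described above is a chi-squared goodness-of-fit calculation, in which the observed counts of listed species are set against the counts expected if listing were spread across growth forms in proportion to their share of all species. The sketch below shows only that arithmetic. The exam's dataset and test output are not reproduced in this preview, so every count here is invented for illustration, and the way the expected proportions are built is an assumption rather than the exam's stated method.

```python
def chi_square_gof(observed, p_expected):
    """Chi-squared goodness-of-fit statistic.

    observed   : list of observed counts per category
    p_expected : list of expected proportions per category (should sum to 1)
    Returns (statistic, degrees_of_freedom, expected_counts).
    """
    total = sum(observed)
    expected = [total * p for p in p_expected]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1, expected


# Hypothetical counts for illustration only (NOT the NSW flora data).
growth_forms = ["tree", "herb", "shrub", "climber"]
n_species_total = [100, 400, 300, 200]
n_species_listed = [15, 25, 35, 5]

# Assumption: expected share of listed species = share of all species.
grand_total = sum(n_species_total)
p_expected = [n / grand_total for n in n_species_total]

stat, dof, expected = chi_square_gof(n_species_listed, p_expected)
for form, obs, exp in zip(growth_forms, n_species_listed, expected):
    print(f"{form:8s} observed = {obs:3d}, expected = {exp:5.1f}")
print(f"chi-squared = {stat:.2f} on {dof} df")
```

Growth forms with observed counts well above their expected counts contribute most to the statistic, which is what a reader would look for when judging which groups are over-represented among listed species.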
1 Hypotheses
Explain what test was run. What are the null and alternate hypotheses?
Maximum marks: 5
2 Expected
Explain how the variable `p_expected` was calculated and the purpose of including it in the test.
Maximum marks: 5

3 Conclusions
Write a short paragraph for National Parks, describing what you concluded from the test.
Maximum marks: 5
4 Design
Based on the results of this study, the researcher sought to understand why some tree species were listed and others weren't. Suggest a hypothesis for why tree and shrub species may be more likely to be listed, and outline a sampling design to test the hypothesis by comparing listed and un-listed species.
Maximum marks: 10
Baiting with poison has been used to control populations of dingoes (often referred to as wild dogs) in many agricultural areas in Australia, because they sometimes kill livestock. Larger dingoes have been observed to be less susceptible to baiting, leading researchers to hypothesise that baiting would cause the size of dingoes to increase where baiting had been present for long periods.

The ideal design for testing this would involve sampling weights of adults before and after baiting was introduced. As weights cannot be collected for locations where baiting is already present, the researchers wondered if they could use the size of dingo skulls, stored as museum samples, as an indicator of body mass. Fortunately, a dataset had been collected in which mass at the time of death had been recorded for many of the skulls that the researchers measured. The researchers therefore first tested the hypothesis that body mass increased significantly with skull length.

The dataset had the following variables:

They first visualised the data. While there was overall a good relationship, there were some obvious outliers in the data.
5 Outliers
Outline what steps you would take to decide whether the outliers should remain in the analysis. What factors could justify removing outliers?
Maximum marks: 5

6 Test 1
With the outliers removed, the researchers ran a test relating body mass to skull length. The results of the test are shown below.
What kind of analysis did they perform? Write the equation, explain the information you can gather from this output in your own words, and conclude.
Maximum marks: 5

7 Test 2
The researchers subsequently realised that it could matter whether the samples came from male or female animals, as they may have different morphology. They made a plot which suggested some differences, so they decided to test whether the predicted equation differed between males and females. They ran the following test, with results as shown:
What kind of analysis did they perform? Explain the information you can gather from this output in your own words, and conclude.
Maximum marks: 7.5

8 Test 3
Finally, the researchers ran an ANOVA to see whether skull size changed in three different geographic zones after the introduction of baiting. The plot shows the distribution of skull sizes recorded before ("pre") and after ("post") baiting was initiated.
Results of the test are as follows.
Was the researchers' original hypothesis supported? What can you conclude from this test?
Maximum marks: 7.5
BEES2041-BEES5041 - Final Exam - T1 2023 - Data Analysis for Life and Earth Sciences-Data Analysis: Environmental Science & Management 1/27

BEES2041-BEES5041 – Data Analysis for Life and Earth Scientists
Final Exam – Term 1 2023

Instructions:
1. Time allowed – 2 hours, plus 15 minutes.
2. Total number of questions to be answered – 15
3. Total marks available – 100 marks, worth 35% of the total marks for the course.
4. Marks available for each question are shown in the exam.
5. Students are advised to read all of the examination questions before attempting to answer the questions.
6. This exam cannot be copied, forwarded, or shared in any way
7. Students are reminded of the UNSW rules regarding academic integrity and plagiarism
8. Your work will be saved periodically throughout the exam and will be automatically submitted when the test ends provided you are connected to the internet
9. You must upload all of your work within the exam time. There is no extra time to upload. No late submissions will be accepted.
Gap-filling with predictive models

Note: This question logically follows Questions 4-7 above (on leaf area vs temperature). We suggest completing those questions before attempting this section.

As noted earlier, the size of a plant's leaves (called "leaf area") is a trait that affects where species are found. Species with larger leaves tend to be found in warmer and wetter areas. Many people therefore consider leaf area a useful indicator of a species' preferred climate.

There are over 22,000 plant species in Australia. If we had traits like "leaf area" measured for all species, we could use them as an easy indicator of species ecology. However, we currently only have data on leaf area for around 20% of known species. There is therefore a strong need to increase the number of species with records of leaf area.

To increase coverage of the dataset in Australia, Isaac's team decided to see whether they could predict a species' leaf area from other traits, for which they have more data. This is called gap-filling. The table below shows the number of species for which we have data on a range of variables:

    Variable        Number of species with records in Australia
    species name    24,472
    family          24,435
    growth form     15,010
    leaf_width      14,149
    leaf_length     14,619
    leaf_area       4,841

As you can see, we have lots more data on characters like growth form, and quite a lot on other leaf dimensions, such as "leaf width" and "leaf length". The researchers therefore wanted to test how well they could predict leaf area from these other traits. If it worked, they could greatly increase coverage of this important trait.
3(a)
The team decided to use a random forest model to make the predictions, using the function `ranger` from the R package `ranger`. Following standard techniques for predictive modelling, they first assembled a "labelled" dataset, where the desired outcome variable (leaf area) is known. The dataset had the following columns. Note that the 3 numeric traits were log transformed first.

They then split the dataset into two parts using the following code.

Explain the role of training and testing datasets, created above, for development of the predictive model. If the goal is to predict the variable leaf area, why is it included in the labelled dataset?
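The exam's own splitting code (written for R) is not reproduced in this preview. As a rough illustration of the idea, the sketch below shuffles a labelled dataset and holds out a fraction of rows for testing. It is written in Python rather than R, and the 80/20 proportion, the seed, the function name `split_train_test` and the toy rows are all assumptions made for the example.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Randomly partition rows into (training, testing) lists.

    The model is fitted on the training rows only; the testing rows are
    held back so predictions can be checked against known outcomes.
    """
    rng = random.Random(seed)      # fixed seed makes the split repeatable
    shuffled = list(rows)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy labelled dataset (hypothetical): the outcome column is present
# because it is needed both to fit the model and to score predictions.
labelled = [{"species": f"sp_{i}", "log_leaf_area": float(i)} for i in range(10)]
training, testing = split_train_test(labelled)
print(len(training), "training rows;", len(testing), "testing rows")
```

Every row ends up in exactly one of the two sets, so the testing rows play the role of species the model has never seen.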
Maximum marks: 5
3(b)
The researchers ran their model with all covariates included, then calculated the Root Mean Square Error (RMSE) between observed and predicted values from the model, in both the training and testing datasets.

The following plot shows observed Y vs predicted Y, with RMSE and 1:1 line, in both the training and testing datasets.

Describe the result. How well does this model predict leaf area?
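RMSE, as used above, is the square root of the mean squared difference between observed and predicted values, so it is expressed in the same units as the outcome (here, the log-transformed leaf area). The sketch below is a minimal implementation with made-up numbers; the function name `rmse` and the example values are illustrative and are not taken from the exam.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values for illustration only.
observed_y = [2.0, 3.5, 1.0, 4.2]
predicted_y = [2.3, 3.1, 1.4, 4.0]
print(f"RMSE = {rmse(observed_y, predicted_y):.3f}")
```

A perfect model would give an RMSE of 0, with all points on the 1:1 line. Comparing the value from the training data with the value from the testing data indicates how well the model generalises to data it was not fitted on.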
Maximum marks: 5
3(c)
There's a lot more data available on simple categorical variables like growth form. So we could potentially predict leaf area for many more species if our model only used the categorical variables to make predictions. But would such a model be skilful enough?

The researchers therefore compared the predictive skill of 3 models using different covariates:
1. All covariates (as in the previous page)
2. A model using only one of the other numeric variables along with categorical data (leaf length, so excluding leaf width)
3. A model using only the categorical variables.

The following plot shows observed Y vs predicted Y, with RMSE and 1:1 line, in both the training and testing datasets, for model 3 (Only categorical variables).

The following table shows the RMSE in the testing dataset for the 3 models:

    Model                                        RMSE
    1. All covariates                            0.40
    2. leaf length plus categorical variables    0.59
    3. Only categorical variables                0.84

Based on these results, and those shown on the previous page, write a paragraph to summarise your findings on our ability to use predictive models to estimate a species' leaf area.
Maximum marks: 10