In-class Worksheet- Data Science and Social Interactions Week 1-1-1


School: University of Cincinnati, Main Campus
Course: 1082L
Subject: Statistics
Date: Apr 3, 2024

Names: Becca Rose, Avery Maple, Diya Patel, Sydney Raleigh
Section: 001

In-class Worksheet: Data Science and Social Interactions Week 1
Biology 1082L

This worksheet will take you through a series of steps to teach you the basics of coding using R Studio. Specifically, you will examine a dataset that you would be familiar with from the Behavior module and analyze the data using both Excel and R Studio. The goals of this exercise are to help you learn some of the differences between the two statistical programs, understand the benefits of R, and learn the basics of R before examining some more complicated biological data during Week 2.

First, visit https://posit.cloud/ and create an account to use the free version of R Studio. While you can download R or R Studio to your personal hard drive, the cloud-based version of this through Posit Cloud will help ensure consistency across all PC and Mac devices, making it easier for us to assist you.

1. Conduct an ANOVA in Excel. Download the Excel spreadsheet from Canvas under the Data Science and Social Interactions Week 1 Module. The spreadsheet contains raw data that is in a familiar format that you might have utilized for the Behavior Module. Using the Data Analysis Toolpak, conduct an ANOVA analysis on this data.
   a. Copy/paste below the entire output from the ANOVA analysis that the Toolpak provides. (1 pt)
   b. Write a sentence that summarizes the results from the ANOVA analysis, following the format in the General Scientific Writing Expectations document for ANOVA analyses. (Using the incorrect format will earn an automatic 0 for this question). (2 pts)
      The ANOVA showed that we reject the null hypothesis because the p-value is < 0.05.

2. Make a bar graph with error bars in Excel. Using either the formulas in Excel or the Descriptive Statistics function in Excel’s Data Analysis Toolpak, find the mean values and standard error values for each of the three groups and create a bar graph. Do not forget that when you add error bars in Excel, you need to “customize”, “specify the values”, and highlight the three standard error values you need to use. Copy/paste the bar graph below. Be sure to include a figure legend. (2 pts)
   [Bar graph: x-axis Water pH (pH = 6, pH = 7, pH = 8); y-axis Time Spent Moving (s), scale 0 to 250]

3. Format your data set to be imported into R Studio. There are two main steps you will need to follow in order to format your Excel file so that it can be properly imported into R Studio.
   a. Re-organize the data into two columns and “clean up” the spreadsheet itself.
   b. Once complete, save the file to make sure you have a .xls or .xlsx version of the file. Then, “Save As” in order to convert the file to a .csv file.
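The two-column layout asked for in step 3a can be illustrated with a short R sketch. The worksheet does this re-organization by hand in Excel, so the code below is only an assumption about what the result should look like: the column names pH and TimeSpentMoving are taken from the code later in this worksheet, while the values, the wide column names (pH6, pH7, pH8), and the file name anova_data.csv are made up for illustration.

```r
# Hypothetical wide layout: one column per pH treatment (values are made up).
wide <- data.frame(
  pH6 = c(120, 135, 110),
  pH7 = c(180, 175, 190),
  pH8 = c(150, 160, 145)
)

# stack() collapses the three columns into two: 'values' and 'ind'.
stacked <- stack(wide)

# Rename to the two-column layout: the group in one column,
# the measurement in the other. The "pH" prefix is stripped from the labels.
long <- data.frame(
  pH = factor(sub("^pH", "", as.character(stacked$ind))),
  TimeSpentMoving = stacked$values
)
print(long)

# Save as .csv (written to a temporary folder so the sketch has no side effects).
csv_path <- file.path(tempdir(), "anova_data.csv")
write.csv(long, csv_path, row.names = FALSE)
```

Each row of the result is one observation, which is the shape that read.csv and the plotting and ANOVA functions used later expect.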
At this point in time, your group should watch the Code Along Video 1 – Week 1. During these Code Along videos, you will want to have this worksheet nearby, watch the Code Along Video, and have all of your group members follow the same steps as Dr. Hobson demonstrates.

4. Import your .csv file into R Studio. Utilize the videos on Canvas and the Code Along Video 1 to assist you. This essentially involves three steps:
   a. Upload your document into the R Studio platform.
   b. Import the document into your Console using the read.csv command.
   c. Attach the data set to ensure that R Studio knows which data set you are utilizing.

5. The Code Along Video will lead you through the following:
   a. Downloading and activating the “tidyverse” package.
   b. Loading your data set.
   c. Checking the data set.
   d. Summarizing the data set to find the means and standard error values.
   e. Plotting a bar plot using the means, adjusting the colors of the bars, adding error bars, and adding axis labels. **Note: please ignore the instructions to add a title to your graph**

6. Copy/paste your code here that produced your graph: (1 pt)
      ggplot(data=summary.stats, aes(x=pH, y=mean)) +
        geom_bar(stat='identity', fill="darkred") +
        labs(title="Effect of pH on movement",
             y="Mean time spent moving") +
        geom_errorbar(aes(ymin=mean-se, ymax=mean+se),
                      width=0.25, linewidth=1)

7. Copy/paste your graph itself here: (2 pts)

At this point in time, your group should watch the Code Along Video 2 – Week 1.
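The import and summary steps in items 4 and 5 can be sketched as follows. The Code Along Video uses the tidyverse package; this sketch uses base R instead, so it is an alternative under stated assumptions rather than the video's method. The object names anova_data and summary.stats and the columns pH, TimeSpentMoving, mean, and se come from the code in this worksheet; the data values, the helper se_fun, and the file location are made up so the example runs on its own.

```r
# Made-up stand-in for the uploaded .csv so the sketch is self-contained.
csv_path <- file.path(tempdir(), "anova_data.csv")
write.csv(
  data.frame(
    pH = rep(c(6, 7, 8), each = 4),
    TimeSpentMoving = c(100, 110, 120, 130,
                        180, 190, 200, 210,
                        140, 150, 160, 170)
  ),
  csv_path, row.names = FALSE
)

# Step 4b: import the file with read.csv.
anova_data <- read.csv(csv_path)

# Treat pH as a category (three treatment groups), not a continuous number.
anova_data$pH <- factor(anova_data$pH)

# Step 5c: check the data set.
str(anova_data)

# Step 5d: mean and standard error for each group.
# se_fun is a hypothetical helper: standard deviation / sqrt(sample size).
se_fun <- function(x) sd(x) / sqrt(length(x))

summary.stats <- data.frame(
  pH   = levels(anova_data$pH),
  mean = as.numeric(tapply(anova_data$TimeSpentMoving, anova_data$pH, mean)),
  se   = as.numeric(tapply(anova_data$TimeSpentMoving, anova_data$pH, se_fun))
)
print(summary.stats)
```

A table shaped like summary.stats, with one row per group and columns named mean and se, is what the ggplot code in item 6 reads from.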
8. Conduct an ANOVA in R Studio.
   a. Copy/paste below the code you used to conduct the ANOVA and to provide your summary results. (1 pt)
      results.anova <- aov(TimeSpentMoving ~ pH, data=anova_data)
   b. Write a sentence that summarizes the results from the ANOVA analysis, following the format in the General Scientific Writing Expectations document for ANOVA analyses. (Using the incorrect format will earn an automatic 0 for this question). (2 pts)
      (F = 13.88; df1 = 2; df2 = 42; p = 2.35 x 10^-5)

9. Make a box plot in R Studio following the instructions provided in the video.
   a. Copy/paste the coding below: (1 pt)
      ggplot(anova_data, aes(x=pH, y=TimeSpentMoving)) +
        geom_boxplot(fill="green") +
        labs(title="Effect of pH on movement")
   b. Copy/paste the box plot itself below: (2 pts)

10. Add the raw data to your boxplot using the “jitter” command as shown in the video.
   a. Copy/paste the coding below: (1 pt)
      ggplot(anova_data, aes(x=pH, y=TimeSpentMoving)) +
        geom_boxplot(fill="green") +
        labs(title="Effect of pH on movement") +
        geom_jitter(shape=16, position=position_jitter(0.1))
   b. Copy/paste the box plot itself below: (2 pts)
   c. Are there any outliers revealed in the box plot with raw data? If so, describe them. (0.5 pt)
      Yes, a few outliers became apparent in the box plot. They sat just above the whiskers, so although we did have a few outliers, they were close to the rest of our data and had no substantial impact on it.
   d. Are there any clusters of data within a treatment group that have been revealed in the box plot with raw data? If so, describe them. (0.5 pt)
      No, we do not believe we have any clusters of data.

11. Now, go into Excel and make up some data of your own that would require an ANOVA analysis and use a bar graph for data visualization. Utilize three treatment groups and make sure your dependent variable is quantitative. Following the same prompts as above, but now without a video to show you how, import your data set into R Studio and make a box plot with raw data showing on the graph. This time, each individual group member needs to provide a unique graph. Each graph should have different axis labels, different mean values, different color schemes, etc. (2 pts)
   1. Becca Rose
   2. Avery
   3. Sydney
   4. Diya Patel
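For item 11, a minimal sketch of the whole workflow on made-up data might look like the following. Everything here is hypothetical: the data are generated directly in R rather than typed into Excel, and the names my_data, Treatment, and Response, the group labels, and the color are placeholders each group member would change. The aov call and the boxplot with jitter follow the pattern of the code in items 8 and 10. The plot is only drawn if the ggplot2 package is installed.

```r
set.seed(1)  # make the made-up numbers reproducible

# Hypothetical experiment: three treatment groups, quantitative response.
my_data <- data.frame(
  Treatment = factor(rep(c("Low", "Medium", "High"), each = 10),
                     levels = c("Low", "Medium", "High")),
  Response = c(rnorm(10, mean = 20, sd = 3),
               rnorm(10, mean = 25, sd = 3),
               rnorm(10, mean = 32, sd = 3))
)

# One-way ANOVA, same pattern as item 8.
results.anova <- aov(Response ~ Treatment, data = my_data)
anova_table <- summary(results.anova)[[1]]

# Pull out the values needed for the results sentence.
f_value <- anova_table[["F value"]][1]
df1 <- as.integer(anova_table[["Df"]][1])   # between-groups df
df2 <- as.integer(anova_table[["Df"]][2])   # residual df
p_value <- anova_table[["Pr(>F)"]][1]
cat(sprintf("F = %.2f; df1 = %d; df2 = %d; p = %.3g\n",
            f_value, df1, df2, p_value))

# Box plot with raw data points, same pattern as item 10.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(my_data, aes(x = Treatment, y = Response)) +
    geom_boxplot(fill = "skyblue") +
    geom_jitter(shape = 16, position = position_jitter(0.1)) +
    labs(x = "Treatment", y = "Response (made-up units)")
  print(p)
}
```

With three groups of ten observations, df1 is 2 and df2 is 27; the F and p values depend on the generated numbers.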