HW1

School

University of Texas

Course

322E

Subject

Industrial Engineering

Date

Dec 6, 2023

HW 1

Enter your name and EID here: Riley Tran; rdt942

You will submit this homework assignment as a pdf file on Gradescope.

For all questions, include the R commands/functions that you used to find your answer (show R chunk). Answers without supporting code will not receive credit. Write full sentences to describe your findings.

Part 1: (11 pts)

The dataset mtcars was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and other aspects of automobile design and performance for different cars (1973-74 models). Look up the documentation for this data frame with a description of the variables by typing ?mtcars in the console pane.

Question 1: (2 pt) Take a look at the first 6 rows of the dataset by using an R function in the code chunk below. Do you know about any (or all) of these cars?

# show the first 6 rows of the dataset
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Your answer goes here. Write sentences in bold.

I have heard of the Mazda RX4, but I have not heard of any of the other models.

Question 2: (2 pts) How many rows and columns are there in this data frame in total?
# dim() returns the number of rows, then the number of columns
dim(mtcars)
## [1] 32 11

Your answer goes here. Write sentences in bold.

There are 32 rows and 11 columns in the data frame in total.

Question 3: (1 pt) Save mtcars in your environment and name it as your eid. From now on, use this new object instead of the built-in dataset.

# save the built-in dataset under the EID
rdt942 <- mtcars

Your answer goes here. Write sentences in bold.

Question 4: (2 pts) When is your birthday? Using indexing, grab the value of mpg that corresponds to the day of your birthday (should be a number between 1 and 31).

# your code goes below (make sure to edit comment)

Your answer goes here. Write sentences in bold.

My birthday is August 9th.

Question 5: (2 pts) Using logical indexing, count the number of rows in the dataset where the variable mpg takes on values greater than 30.

# your code goes below (make sure to edit comment)

Your answer goes here. Write sentences in bold.
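
The code chunks for Questions 4 and 5 above were left empty. The following is a minimal sketch of what they might contain, not the author's submission. It assumes the dataset has been saved under the EID `rdt942`, as Question 3 asks, and that the birthday falls on the 9th, as stated in the answer to Question 4. The names `birthday_day`, `birthday_mpg`, and `n_over_30` are chosen here for illustration.

```r
# Save the built-in dataset under the EID, as Question 3 asks
rdt942 <- mtcars

# Question 4: index the mpg column by the day of the birthday (assumed: the 9th)
birthday_day <- 9
birthday_mpg <- rdt942$mpg[birthday_day]
birthday_mpg

# Question 5: the comparison builds a logical vector, which selects the
# matching rows; nrow() then counts them
n_over_30 <- nrow(rdt942[rdt942$mpg > 30, ])
n_over_30
```

With the standard mtcars data, row 9 is the Merc 230 at 22.8 mpg, and four cars have an mpg above 30.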
Question 6: (2 pts) Let's create a new variable called kpl which converts the fuel efficiency mpg to kilometers per liter. Knowing that 1 mpg corresponds to 0.425 kpl, complete the following code and calculate the max kpl:

# Add a new variable to the dataset

Your answer goes here. Write sentences in bold.

Part 2: (6 pts)

Let's quickly explore another built-in dataset: airquality, which contains information about daily air quality measurements in New York, May to September 1973.

Question 7: (2 pts) Calculate the mean Ozone (in ppb). Why does it make sense to get this answer? Hint: take a look at the column Ozone in the dataset.

# mean of Ozone without handling missing values
mean(airquality$Ozone)
## [1] NA

Your answer goes here. Write sentences in bold.

It makes sense to get this answer because many of the entries in the Ozone column are NA, and by default mean() returns NA when any value is missing.

Question 8: (2 pts) Look at the documentation for the function mean() by running ?mean in the console. What argument should be used to find the mean value that we were not able to get in the previous question? What type of values does that argument take?

Your answer goes here. Write sentences in bold.

The argument na.rm should be used to find the mean value of Ozone. The argument na.rm takes a logical value (TRUE or FALSE); when it is TRUE, the NA values are removed before the mean is computed.

Question 9: (2 pts) Sometimes the R documentation does not feel complete. We wish we had more information or more examples. Find a post online (include the link) that can help you use that argument in the mean() function. Then finally find the mean ozone!
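
Two of the points above can be checked directly. The sketch below fills in the empty chunk for Question 6 using the factor given in the prompt (0.425 kpl per mpg), then counts the missing Ozone values to show why mean() returns NA in Question 7. It is an illustration under those assumptions rather than the author's submission; `max_kpl` and `n_missing` are names chosen here.

```r
# Work on a copy of the built-in dataset, named after the EID as in Question 3
rdt942 <- mtcars

# Question 6: add kpl using the factor from the prompt (1 mpg = 0.425 kpl)
rdt942$kpl <- rdt942$mpg * 0.425
max_kpl <- max(rdt942$kpl)
max_kpl

# Question 7: count the missing Ozone values; a single NA is enough
# to make mean() return NA when na.rm is left at its default
n_missing <- sum(is.na(airquality$Ozone))
n_missing
is.na(mean(airquality$Ozone))
```

The largest mpg in mtcars is 33.9, so the maximum kpl is 33.9 × 0.425 = 14.4075. The Ozone column contains 37 missing values.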

# mean of Ozone after removing missing values
print(mean(airquality$Ozone, na.rm = TRUE))
## [1] 42.12931

Your answer goes here. Write sentences in bold.

The link to the article

The mean for Ozone is 42.12931 ppb.

------------------------

Part 3: (5 pts)

The Internet clothing retailer Stitch Fix has developed a new model for selling clothes to people online. Their basic approach is to send people a box of 5–6 items of clothing and allow them to try the clothes on. Customers keep (and pay for) what they like while mailing back the remaining clothes. Stitch Fix then sends customers a new box of clothes typically a month later.

A critical question for Stitch Fix to consider is "Which clothes should they send to each customer?" Since customers do not request specific clothes, Stitch Fix has to come up with 5–6 items on its own that it thinks the customers will like (and therefore buy). In order to learn something about each customer, they administer an intake survey when a customer first signs up for the service. The survey has about 20 questions and the data is then used to predict what kinds of clothes customers will like. In order to use the data from the intake survey, a statistical algorithm must be built in order to process the customer data and make clothing selections.

Suppose you are in charge of building the intake survey and the algorithm for choosing clothes based on the intake survey data.

Question 10: (2 pts) What kinds of questions do you think might be useful to ask of a customer in an intake survey in order to better choose clothes for them? What kinds of data would be most valuable? See if you can come up with at least 5 items.

Your answer goes here. Write sentences in bold.

The customer should be asked whether they want men's or women's clothing, what sizes they wear, whether they are an active person, whether they work in a corporate setting, how they would describe their style, and what their color profile is.
There should also be options to pick which clothing piece the buyer likes better. Binary data would be the most useful since most of the questions would have a yes or no answer. For example, a person decides if they like an article of clothing. They either choose yes or no, which can be represented by a 0 or a 1.

Question 11: (3 pts) In addition to the technical challenges of collecting the data and building this algorithm, you must also consider the impact the algorithm may have on the people involved. What potential negative impact might the algorithm have on the customers who are submitting their data? Consider both the data being submitted as well as the way in which the algorithm will be used when answering this question.

Your answer goes here. Write sentences in bold.

The algorithm might overlook body concerns that the customer does not want on display, or how a certain item will fit the customer. The algorithm might come up with a result that doesn't appeal to the customer's style preferences. There might not be enough questions asked for the algorithm to determine an accurate style profile for a customer, leading the customer to cancel the service.

Formatting: (3 pts)

Knit your file! Into pdf directly or into html. Is it working? If not, try to decipher the error message (look up the error message, consult websites such as stackoverflow or crossvalidated). Once it knits in html, click on Open in Browser at the top left of the window that pops up. Print your html file into pdf from your browser. Any issue? Ask your classmates or TA!

## R version 4.3.1 (2023-06-16)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Big Sur 11.3.1
##
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK ve
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/Chicago
## tzcode source: internal
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## loaded via a namespace (and not attached):
##  [1] compiler_4.3.1  fastmap_1.1.1   cli_3.6.1       tools_4.3.1
##  [5] htmltools_0.5.6 yaml_2.3.7      rmarkdown_2.24  knitr_1.43
##  [9] xfun_0.40       digest_0.6.33   rlang_1.1.1     evaluate_0.21