School: University of Texas
Course: 322E
Subject: Industrial Engineering
Date: Dec 6, 2023
HW 1
Enter your name and EID here:
Riley Tran; rdt942
You will submit this homework assignment as a pdf file on Gradescope.
For all questions, include the R commands/functions that you used to find your answer (show R chunk).
Answers without supporting code will not receive credit. Write full sentences to describe your findings.
Part 1: (11 pts)
The dataset mtcars was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption
and other aspects of automobile design and performance for different cars (1973-74 models). Look up the
documentation for this data frame with a description of the variables by typing ?mtcars in the console pane.
Question 1: (2 pt)
Take a look at the first 6 rows of the dataset by using an R function in the code chunk below. Do you know
about any (or all) of these cars?
# Show the first 6 rows of mtcars
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
Your answer goes here. Write sentences in bold.
I have heard of the Mazda RX4, but I have not heard of any of the other models.
Question 2: (2 pts)
How many rows and columns are there in this data frame in total?
# dim() returns the number of rows and columns; row() only returns a matrix of row indices
dim(mtcars)
## [1] 32 11
Your answer goes here. Write sentences in bold.
There are 32 rows and 11 columns in the data frame in total.
Question 3: (1 pt)
Save mtcars in your environment and name it as your eid. From now on, use this new object instead of the
built-in dataset.
# Save a copy of mtcars under my EID (the name goes on the left of the arrow)
rdt942 <- mtcars
Your answer goes here. Write sentences in bold.
Question 4: (2 pts)
When is your birthday? Using indexing, grab the value of mpg that corresponds to the day of your birthday
(should be a number between 1 and 31).
# your code goes below (make sure to edit comment)
Your answer goes here. Write sentences in bold.
My birthday is August 9th.
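One possible way to do the indexing, assuming the birthday falls on the 9th as stated above, so the day number is used as a row index. The name my_cars is only a stand-in for the EID-named copy from Question 3.

```r
# Hypothetical copy of the built-in data (stand-in for the EID-named object)
my_cars <- mtcars

# Day of the month taken from the answer above
birthday_day <- 9

# Index the mpg column by position to get the value in row 9
mpg_on_birthday <- my_cars$mpg[birthday_day]

# Equivalent bracket form: row number first, column name second
same_value <- my_cars[birthday_day, "mpg"]

print(mpg_on_birthday)
```

Both forms pick the same cell; the first indexes a single column as a vector, the second indexes the data frame by row and column.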
Question 5: (2 pts)
Using logical indexing, count the number of rows in the dataset where the variable mpg takes on values
greater than 30.
# your code goes below (make sure to edit comment)
Your answer goes here. Write sentences in bold.
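A minimal sketch of logical indexing for this count. The name my_cars is again a stand-in for the EID-named copy of mtcars.

```r
# Hypothetical copy of the built-in data (stand-in for the EID-named object)
my_cars <- mtcars

# Comparing a column to a number gives one TRUE/FALSE per row
high_mpg <- my_cars$mpg > 30

# TRUE counts as 1 and FALSE as 0, so sum() counts the matching rows
n_high <- sum(high_mpg)

# Same count, obtained by subsetting the rows first and then counting them
n_high_rows <- nrow(my_cars[high_mpg, ])

print(n_high)
```

Summing the logical vector avoids building the subset at all, while the nrow() form is handy when the matching rows are needed afterwards.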
Question 6: (2 pts)
Let’s create a new variable called kpl which converts the fuel efficiency mpg to kilometers per liter. Knowing
that 1 mpg corresponds to 0.425 kpl, complete the following code and calculate the max kpl:
# Add a new variable to the dataset
Your answer goes here. Write sentences in bold.
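A sketch of the conversion, using the factor 0.425 given in the question. The name my_cars is a stand-in for the EID-named copy of mtcars.

```r
# Hypothetical copy of the built-in data (stand-in for the EID-named object)
my_cars <- mtcars

# Add a new variable to the dataset: multiply every mpg value by 0.425
my_cars$kpl <- my_cars$mpg * 0.425

# Largest fuel efficiency in kilometers per liter
max_kpl <- max(my_cars$kpl)

print(max_kpl)
```

Because the conversion is a multiplication by a positive constant, the car with the highest mpg is also the one with the highest kpl.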
Part 2: (6 pts)
Let’s quickly explore another built-in dataset: airquality, which contains information about daily air quality
measurements in New York, May to September 1973.
Question 7: (2 pts)
Calculate the mean Ozone (in ppb). Why does it make sense to get this answer? Hint: take a look at the
column Ozone in the dataset.
# Compute the mean of the Ozone column
mean(airquality$Ozone)
## [1] NA
Your answer goes here. Write sentences in bold.
It makes sense to get this answer because many of the entries in the Ozone column
read ‘NA’, and by default mean() returns NA when any value is missing.
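One way to check the explanation above is to count the missing entries directly. This is only a sketch; the variable names are made up for illustration.

```r
# Pull out the Ozone column from the built-in airquality data
ozone <- airquality$Ozone

# is.na() flags each missing entry; sum() counts the flags
n_missing <- sum(is.na(ozone))
n_present <- sum(!is.na(ozone))

# With the default settings, any missing value makes the mean NA
mean_with_na <- mean(ozone)

print(c(missing = n_missing, present = n_present))
```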
Question 8: (2 pts)
Look at the documentation for the function mean() by running ?mean in the console. What argument
should be used to find the mean value that we were not able to get in the previous question? What type of
values does that argument take?
Your answer goes here. Write sentences in bold.
The argument ‘na.rm’ should be used to find the mean value of Ozone. It takes a
logical value, TRUE or FALSE. With na.rm = TRUE, missing values (NA) are removed
before the mean is computed; with the default, FALSE, they are kept and the result is NA.
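The effect of the argument is easiest to see on a tiny vector. The sketch below uses made-up values rather than the airquality data.

```r
# A small made-up vector with one missing value
x <- c(1, NA, 3)

# Default behaviour (na.rm = FALSE): the missing value propagates
default_mean <- mean(x)

# na.rm = TRUE drops the NA first, leaving (1 + 3) / 2
clean_mean <- mean(x, na.rm = TRUE)

print(default_mean)
print(clean_mean)
```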
Question 9: (2 pts)
Sometimes the R documentation does not feel complete. We wish we had more information or more examples.
Find a post online (include the link) that can help you use that argument in the mean() function. Then
finally find the mean ozone!
# Compute the mean Ozone after dropping the missing values
print(mean(airquality$Ozone, na.rm = TRUE))
## [1] 42.12931
Your answer goes here. Write sentences in bold.
The link to the article: (link missing). The mean for Ozone is 42.12931 ppb.
------------------------
Part 3: (5 pts)
The Internet clothing retailer Stitch Fix has developed a new model for selling clothes to people online.
Their basic approach is to send people a box of 5–6 items of clothing and allow them to try the clothes on.
Customers keep (and pay for) what they like while mailing back the remaining clothes. Stitch Fix then sends
customers a new box of clothes typically a month later.
A critical question for Stitch Fix to consider is “Which clothes should they send to each customer?” Since
customers do not request specific clothes, Stitch Fix has to come up with 5–6 items on its own that it thinks
the customers will like (and therefore buy). In order to learn something about each customer, they administer
an intake survey when a customer first signs up for the service. The survey has about 20 questions and
the data is then used to predict what kinds of clothes customers will like. In order to use the data from the
intake survey, a statistical algorithm must be built in order to process the customer data and make clothing
selections.
Suppose you are in charge of building the intake survey and the algorithm for choosing clothes based on the
intake survey data.
Question 10: (2 pts)
What kinds of questions do you think might be useful to ask of a customer in an intake survey in order to
better choose clothes for them? What kinds of data would be most valuable? See if you can come up with
at least 5 items.
Your answer goes here. Write sentences in bold.
The customer should be asked whether they want men’s or women’s clothing, what sizes they wear,
whether they are an active person, whether they work in a corporate setting, how they would
describe their style, and what color palette they prefer. There should also be options to pick
which of two clothing pieces the buyer likes better. Binary data would be the most useful since
most of the questions would have a yes or no answer. For example, a person decides if they like
an article of clothing. They either choose yes or no, which can be represented by a 0 or a 1.
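The yes/no encoding described in the answer could look like the sketch below. The data frame, its column names, and the responses are all invented for illustration; they are not Stitch Fix data.

```r
# Hypothetical survey responses (made-up customers and answers)
responses <- data.frame(
  customer_id = c(101, 102, 103),
  likes_item = c("yes", "no", "yes"),
  works_corporate = c("no", "no", "yes")
)

# Turn each yes/no answer into 1/0: the comparison gives TRUE/FALSE,
# and as.integer() maps TRUE to 1 and FALSE to 0
responses$likes_item_bin <- as.integer(responses$likes_item == "yes")
responses$works_corporate_bin <- as.integer(responses$works_corporate == "yes")

# With 0/1 coding, the mean is simply the share of "yes" answers
share_liked <- mean(responses$likes_item_bin)

print(responses)
```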
Question 11: (3 pts)
In addition to the technical challenges of collecting the data and building this algorithm, you must also
consider the impact the algorithm may have on the people involved. What potential negative impact might
the algorithm have on the customers who are submitting their data? Consider both the data being submitted
as well as the way in which the algorithm will be used when answering this question.
Your answer goes here. Write sentences in bold.
The algorithm might overlook body concerns that the customer does not want to put on display,
or how a certain item will fit the customer. The algorithm might come up with a result that
doesn’t match the customer’s style preferences. There might not be enough questions asked
for the algorithm to determine an accurate style profile for a customer, leading the customer
to cancel the service.
Formatting: (3 pts)
Knit your file, either directly into pdf or into html.
Is it working? If not, try to decipher the error message (look up the error message, consult websites such as
stackoverflow or crossvalidated).
Once it knits in html, click on Open in Browser at the top left of the window that pops up. Print your html
file into pdf from your browser. Any issue? Ask your classmates or TA!
## R version 4.3.1 (2023-06-16)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Big Sur 11.3.1
##
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK ve
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/Chicago
## tzcode source: internal
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## loaded via a namespace (and not attached):
##  [1] compiler_4.3.1  fastmap_1.1.1   cli_3.6.1       tools_4.3.1
##  [5] htmltools_0.5.6 yaml_2.3.7      rmarkdown_2.24  knitr_1.43
##  [9] xfun_0.40       digest_0.6.33   rlang_1.1.1     evaluate_0.21