2023-01-30_dataframes
Notes 3: Data Frames and Control
STAT 404: Statistical Computing
Recap
1. What attribute defines an S3 object on a base type? What are some (one main one) S3 objects we have discussed?
If it has a class attribute, it's S3. A data frame has class data.frame.
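The recap answer can be checked directly at the console. The following is a minimal sketch; the object names df and plain are made up for illustration and do not appear in the notes.

```r
# A data frame is built on the base type "list" plus a class attribute.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

typeof(df)          # the underlying base type
attr(df, "class")   # the attribute that makes it an S3 object
is.object(df)       # TRUE whenever a class attribute is set

# Stripping the class attribute leaves an ordinary list behind.
plain <- unclass(df)
is.object(plain)
typeof(plain)
```

unclass() only removes the class attribute; the names and row names of the original data frame stay attached to the list.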
2. .Rmd demo: Open a new .Rmd file and practice the following three ways to insert a new code chunk:
• The keyboard shortcut Cmd + Option + I / Ctrl + Alt + I.
• The “Insert” button icon in the editor toolbar.
• By manually typing the chunk delimiters ```{r} and ```.
3. What do the chunk options eval and echo control?
Outline
• Making and working with data frames
– Subsetting
– Adding new variables (columns)
– Removing variables (columns)
In our last thrilling episode
• Atomic vectors: series of values all of the same type, e.g., v[5], v["name"]
• Arrays: multi-dimensional generalization of atomic vectors, e.g., a[5,6,2], a[,6,], a["rowname", "colname", "layername"]
• Matrices: special 2D arrays with matrix math, e.g., m[5,6], m[,6], m[,"colname"]
• Lists: vector of values of mixed types, e.g., l[[3]], l$name
• Data frames: list with data.frame class attribute; matrix and list indexing work
Data frames, encore
• 2D tables of data
• Each case/observation is a row
• Each variable/feature is a column
• Variables can be of any type (numbers, text, Booleans, ...)
• Both rows and columns can get names
Creating an example data frame
Use data.frame(), similar to how we create lists with list()
my.df = data.frame(nums=seq(0.1, 0.6, by=0.1), chars=letters[1:6],
                   bools=sample(c(TRUE,FALSE), 6, replace=TRUE))
my.df
##   nums chars bools
## 1  0.1     a FALSE
## 2  0.2     b FALSE
## 3  0.3     c FALSE
## 4  0.4     d FALSE
## 5  0.5     e FALSE
## 6  0.6     f  TRUE
attributes(my.df)
## $names
## [1] "nums"  "chars" "bools"
##
## $class
## [1] "data.frame"
##
## $row.names
## [1] 1 2 3 4 5 6
# Note, a list can have different lengths for different elements!
my.list = list(nums=seq(0.1, 0.6, by=0.1), chars=letters[1:12],
               bools=sample(c(TRUE,FALSE), 6, replace=TRUE))
my.list
## $nums
## [1] 0.1 0.2 0.3 0.4 0.5 0.6
##
## $chars
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l"
##
## $bools
## [1] FALSE FALSE FALSE FALSE FALSE  TRUE
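For contrast, here is a sketch of what data.frame() does when the columns have different lengths. The column values are invented for illustration. Note one subtlety: data.frame() recycles a shorter atomic vector when the longer length is a whole multiple of it, so lengths 6 and 12 do not fail; lengths 6 and 4 do.

```r
# Lengths 6 and 12: the shorter column is recycled twice, giving 12 rows.
recycled <- data.frame(nums = seq(0.1, 0.6, by = 0.1), chars = letters[1:12])
nrow(recycled)

# Lengths 6 and 4: 6 is not a multiple of 4, so data.frame() signals an error.
# tryCatch() captures the error object instead of stopping the script.
res <- tryCatch(
  data.frame(nums = seq(0.1, 0.6, by = 0.1), chars = letters[1:4]),
  error = function(e) e
)
inherits(res, "error")
conditionMessage(res)
```

The silent recycling in the first call is easy to miss, so checking nrow() after building a data frame from vectors of unknown length is a reasonable habit.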
Indexing a data frame
• By rows/columns: similar to how we index matrices
• By columns only: similar to how we index lists
my.df[,1] # Also works for a matrix
## [1] 0.1 0.2 0.3 0.4 0.5 0.6
my.df[,"nums"] # Also works for a matrix
## [1] 0.1 0.2 0.3 0.4 0.5 0.6
my.df$nums # Doesn't work for a matrix, but works for a list
## [1] 0.1 0.2 0.3 0.4 0.5 0.6
my.df$chars # Note: before R 4.0.0 this column would have been converted into a factor
## [1] "a" "b" "c" "d" "e" "f"
as.character(my.df$chars) # Converts a factor back to character; as of R 4.0.0, data.frame() no longer converts strings to factors by default
## [1] "a" "b" "c" "d" "e" "f"
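The factor-versus-character behavior is controlled by the stringsAsFactors argument of data.frame(). A small sketch, using made-up object names df.chr and df.fac, shows both settings side by side so the result does not depend on the R version in use:

```r
# Explicitly keep strings as character (the default since R 4.0.0).
df.chr <- data.frame(chars = letters[1:6], stringsAsFactors = FALSE)

# Explicitly request the old behavior: strings become factors.
df.fac <- data.frame(chars = letters[1:6], stringsAsFactors = TRUE)

class(df.chr$chars)
class(df.fac$chars)

# as.character() recovers the original strings from the factor.
as.character(df.fac$chars)
```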
Creating a data frame from a matrix
Often it’s helpful to start with a matrix and add columns (of different data types) to make it a data frame
class(state.x77) # Built-in matrix of states data, 50 states x 8 variables
## [1] "matrix" "array"
head(state.x77)
##            Population Income Illiteracy Life Exp Murder HS Grad Frost   Area
## Alabama          3615   3624        2.1    69.05   15.1    41.3    20  50708
## Alaska            365   6315        1.5    69.31   11.3    66.7   152 566432
## Arizona          2212   4530        1.8    70.55    7.8    58.1    15 113417
## Arkansas         2110   3378        1.9    70.66   10.1    39.9    65  51945
## California      21198   5114        1.1    71.71   10.3    62.6    20 156361
## Colorado         2541   4884        0.7    72.06    6.8    63.9   166 103766
class(state.region) # Factor of regions for the 50 states
## [1] "factor"
head(state.region)
## [1] South West  West  South West  West
## Levels: Northeast South North Central West
class(state.division) # Factor of divisions for the 50 states
## [1] "factor"
head(state.division)
## [1] East South Central Pacific            Mountain           West South Central
## [5] Pacific            Mountain
## 9 Levels: New England Middle Atlantic South Atlantic ... Pacific
levels(state.division)
## [1] "New England"        "Middle Atlantic"    "South Atlantic"
## [4] "East South Central" "West South Central" "East North Central"
## [7] "West North Central" "Mountain"           "Pacific"
is.object(state.division)
## [1] TRUE
typeof(state.division)
## [1] "integer"
# Combine these into a data frame with 50 rows and 10 columns
state.df = data.frame(state.x77, Region=state.region, Division=state.division)
class(state.df)
## [1] "data.frame"
head(state.df) # Note that the first 8 column names carried over from state.x77
##            Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area
## Alabama          3615   3624        2.1    69.05   15.1    41.3    20  50708
## Alaska            365   6315        1.5    69.31   11.3    66.7   152 566432
## Arizona          2212   4530        1.8    70.55    7.8    58.1    15 113417
## Arkansas         2110   3378        1.9    70.66   10.1    39.9    65  51945
## California      21198   5114        1.1    71.71   10.3    62.6    20 156361
## Colorado         2541   4884        0.7    72.06    6.8    63.9   166 103766
##            Region           Division
## Alabama     South East South Central
## Alaska       West            Pacific
## Arizona      West           Mountain
## Arkansas    South West South Central
## California   West            Pacific
## Colorado     West           Mountain
data.frame(unname(state.x77), Region=state.region, Division=state.division)
##       X1   X2  X3    X4   X5   X6  X7     X8        Region           Division
## 1   3615 3624 2.1 69.05 15.1 41.3  20  50708         South East South Central
## 2    365 6315 1.5 69.31 11.3 66.7 152 566432          West            Pacific
## 3   2212 4530 1.8 70.55  7.8 58.1  15 113417          West           Mountain
## 4   2110 3378 1.9 70.66 10.1 39.9  65  51945         South West South Central
## 5  21198 5114 1.1 71.71 10.3 62.6  20 156361          West            Pacific
## 6   2541 4884 0.7 72.06  6.8 63.9 166 103766          West           Mountain
## 7   3100 5348 1.1 72.48  3.1 56.0 139   4862     Northeast        New England
## 8    579 4809 0.9 70.06  6.2 54.6 103   1982         South     South Atlantic
## 9   8277 4815 1.3 70.66 10.7 52.6  11  54090         South     South Atlantic
## 10  4931 4091 2.0 68.54 13.9 40.6  60  58073         South     South Atlantic
## 11   868 4963 1.9 73.60  6.2 61.9   0   6425          West            Pacific
## 12   813 4119 0.6 71.87  5.3 59.5 126  82677          West           Mountain
## 13 11197 5107 0.9 70.14 10.3 52.6 127  55748 North Central East North Central
## 14  5313 4458 0.7 70.88  7.1 52.9 122  36097 North Central East North Central
## 15  2861 4628 0.5 72.56  2.3 59.0 140  55941 North Central West North Central
## 16  2280 4669 0.6 72.58  4.5 59.9 114  81787 North Central West North Central
## 17  3387 3712 1.6 70.10 10.6 38.5  95  39650         South East South Central
## 18  3806 3545 2.8 68.76 13.2 42.2  12  44930         South West South Central
## 19  1058 3694 0.7 70.39  2.7 54.7 161  30920     Northeast        New England
## 20  4122 5299 0.9 70.22  8.5 52.3 101   9891         South     South Atlantic
## 21  5814 4755 1.1 71.83  3.3 58.5 103   7826     Northeast        New England
## 22  9111 4751 0.9 70.63 11.1 52.8 125  56817 North Central East North Central
## 23  3921 4675 0.6 72.96  2.3 57.6 160  79289 North Central West North Central
## 24  2341 3098 2.4 68.09 12.5 41.0  50  47296         South East South Central
## 25  4767 4254 0.8 70.69  9.3 48.8 108  68995 North Central West North Central
## 26   746 4347 0.6 70.56  5.0 59.2 155 145587          West           Mountain
## 27  1544 4508 0.6 72.60  2.9 59.3 139  76483 North Central West North Central
## 28   590 5149 0.5 69.03 11.5 65.2 188 109889          West           Mountain
## 29   812 4281 0.7 71.23  3.3 57.6 174   9027     Northeast        New England
## 30  7333 5237 1.1 70.93  5.2 52.5 115   7521     Northeast    Middle Atlantic
## 31  1144 3601 2.2 70.32  9.7 55.2 120 121412          West           Mountain
## 32 18076 4903 1.4 70.55 10.9 52.7  82  47831     Northeast    Middle Atlantic
## 33  5441 3875 1.8 69.21 11.1 38.5  80  48798         South     South Atlantic
## 34   637 5087 0.8 72.78  1.4 50.3 186  69273 North Central West North Central
## 35 10735 4561 0.8 70.82  7.4 53.2 124  40975 North Central East North Central
## 36  2715 3983 1.1 71.42  6.4 51.6  82  68782         South West South Central
## 37  2284 4660 0.6 72.13  4.2 60.0  44  96184          West            Pacific
## 38 11860 4449 1.0 70.43  6.1 50.2 126  44966     Northeast    Middle Atlantic
## 39   931 4558 1.3 71.90  2.4 46.4 127   1049     Northeast        New England
## 40  2816 3635 2.3 67.96 11.6 37.8  65  30225         South     South Atlantic
## 41   681 4167 0.5 72.08  1.7 53.3 172  75955 North Central West North Central
## 42  4173 3821 1.7 70.11 11.0 41.8  70  41328         South East South Central
## 43 12237 4188 2.2 70.90 12.2 47.4  35 262134         South West South Central
## 44  1203 4022 0.6 72.90  4.5 67.3 137  82096          West           Mountain
## 45   472 3907 0.6 71.64  5.5 57.1 168   9267     Northeast        New England
## 46  4981 4701 1.4 70.08  9.5 47.8  85  39780         South     South Atlantic
## 47  3559 4864 0.6 71.72  4.3 63.5  32  66570          West            Pacific
## 48  1799 3617 1.4 69.48  6.7 41.6 100  24070         South     South Atlantic
## 49  4589 4468 0.7 72.48  3.0 54.5 149  54464 North Central East North Central
## 50   376 4566 0.6 70.29  6.9 62.9 173  97203          West           Mountain
data.frame() combines a pre-existing matrix (state.x77) with two vectors of qualitative categorical variables (called factors): state.region and state.division.
Column names are preserved, or guessed if not explicitly set
colnames(state.df)
##  [1] "Population" "Income"     "Illiteracy" "Life.Exp"   "Murder"
##  [6] "HS.Grad"    "Frost"      "Area"       "Region"     "Division"
state.df[1,]
##         Population Income Illiteracy Life.Exp Murder HS.Grad Frost  Area Region
## Alabama       3615   3624        2.1    69.05   15.1    41.3    20 50708  South
##                   Division
## Alabama East South Central
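One detail visible in the output above: the matrix columns "Life Exp" and "HS Grad" became Life.Exp and HS.Grad. That comes from the check.names argument of data.frame(), which by default passes names through make.names(). A short sketch, with the hypothetical names df.safe and df.raw, compares the two settings:

```r
# Default: column names are made syntactically valid (spaces become dots).
df.safe <- data.frame(state.x77)

# check.names = FALSE keeps the original matrix column names as they are.
df.raw <- data.frame(state.x77, check.names = FALSE)

colnames(df.safe)[4]
colnames(df.raw)[4]

# Names containing spaces need backticks with $, or quotes with [ ].
head(df.raw$`Life Exp`)
head(df.raw[, "Life Exp"])

# make.names() is the function that performs the conversion.
make.names(colnames(state.x77))
```

Keeping the default is usually more convenient, since syntactic names work with $ and in formulas without any quoting.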
Data frame access
By row and column index
state.df[49,3]
## [1] 0.7
By row and column names
state.df["Wisconsin","Illiteracy"]
## [1] 0.7
rownames(state.df)
##  [1] "Alabama"        "Alaska"         "Arizona"        "Arkansas"
##  [5] "California"     "Colorado"       "Connecticut"    "Delaware"
##  [9] "Florida"        "Georgia"        "Hawaii"         "Idaho"
## [13] "Illinois"       "Indiana"        "Iowa"           "Kansas"
## [17] "Kentucky"       "Louisiana"      "Maine"          "Maryland"
## [21] "Massachusetts"  "Michigan"       "Minnesota"      "Mississippi"
## [25] "Missouri"       "Montana"        "Nebraska"       "Nevada"
## [29] "New Hampshire"  "New Jersey"     "New Mexico"     "New York"
## [33] "North Carolina" "North Dakota"   "Ohio"           "Oklahoma"
## [37] "Oregon"         "Pennsylvania"   "Rhode Island"   "South Carolina"
## [41] "South Dakota"   "Tennessee"      "Texas"          "Utah"
## [45] "Vermont"        "Virginia"       "Washington"     "West Virginia"
## [49] "Wisconsin"      "Wyoming"
state.df["Wisconsin",3]
## [1] 0.7
class(state.df["Wisconsin",3])
## [1] "numeric"
Data frame access (cont’d.)
All of a row:
state.df["Wisconsin",]
##           Population Income Illiteracy Life.Exp Murder HS.Grad Frost  Area
## Wisconsin       4589   4468        0.7    72.48      3    54.5   149 54464
##                  Region           Division
## Wisconsin North Central East North Central
class(state.df["Wisconsin",])
## [1] "data.frame"
Exercise: what class is state.df["Wisconsin",]?
Data frame
Data frame access (cont’d.)
All of a column:
head(state.df[,3])
## [1] 2.1 1.5 1.8 1.9 1.1 0.7
head(state.df[,"Illiteracy"])
## [1] 2.1 1.5 1.8 1.9 1.1 0.7
head(state.df$Illiteracy)
## [1] 2.1 1.5 1.8 1.9 1.1 0.7
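A single row comes back as a data frame, but a single column comes back as a plain vector. The sketch below shows how to keep a one-column result as a data frame when that is wanted; the result names (col.vec, col.df, col.lst, col.dbl) are made up for illustration.

```r
# Rebuild the data frame so this example is self-contained.
state.df <- data.frame(state.x77, Region = state.region, Division = state.division)

col.vec <- state.df[, "Illiteracy"]                # matrix-style: numeric vector
col.df  <- state.df[, "Illiteracy", drop = FALSE]  # matrix-style, kept as data frame
col.lst <- state.df["Illiteracy"]                  # list-style single bracket: data frame
col.dbl <- state.df[["Illiteracy"]]                # list-style double bracket: vector

class(col.vec)
class(col.df)
class(col.lst)
class(col.dbl)
```

The drop = FALSE form is useful in code that must handle both one and several selected columns in the same way.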
Data frame access (cont’d.)
Rows matching a condition:
state.df[state.df$Division=="New England", "Illiteracy"]
## [1] 1.1 0.7 1.1 0.7 1.3 0.6
state.df[state.df$Region=="South", "Illiteracy"]
##  [1] 2.1 1.9 0.9 1.3 2.0 1.6 2.8 0.9 2.4 1.8 1.1 2.3 1.7 2.2 1.4 1.4
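It can help to look at the condition on its own: it is a logical vector with one entry per row, and several such vectors can be combined with & or |. A minimal sketch follows; the names is.south and low.illit and the 1.5 cutoff are arbitrary choices for illustration.

```r
# Rebuild the data frame so this example is self-contained.
state.df <- data.frame(state.x77, Region = state.region, Division = state.division)

# The condition is a logical vector, one TRUE/FALSE per state.
is.south <- state.df$Region == "South"
length(is.south)
sum(is.south)        # number of states in the South region

# Combine two conditions element-wise with &.
low.illit <- state.df$Illiteracy < 1.5
rownames(state.df)[is.south & low.illit]

# The same combined condition used as a row index.
state.df[is.south & low.illit, c("Illiteracy", "Division")]
```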
Adding columns to a data frame
To add columns: we can either use data.frame(), or directly define a new named column
# First way: use data.frame() to concatenate on a new column
state.df = data.frame(state.df, Cool=sample(c(TRUE,FALSE), nrow(state.df), replace=TRUE))
head(state.df, 4)
##          Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area
## Alabama        3615   3624        2.1    69.05   15.1    41.3    20  50708
## Alaska          365   6315        1.5    69.31   11.3    66.7   152 566432
## Arizona        2212   4530        1.8    70.55    7.8    58.1    15 113417
## Arkansas       2110   3378        1.9    70.66   10.1    39.9    65  51945
##          Region           Division  Cool
## Alabama   South East South Central FALSE
## Alaska     West            Pacific  TRUE
## Arizona    West           Mountain FALSE
## Arkansas  South West South Central FALSE
# Second way: just directly define a new named column
state.df$Score = sample(1:100, nrow(state.df), replace=TRUE)
head(state.df, 4)
##          Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area
## Alabama        3615   3624        2.1    69.05   15.1    41.3    20  50708
## Alaska          365   6315        1.5    69.31   11.3    66.7   152 566432
## Arizona        2212   4530        1.8    70.55    7.8    58.1    15 113417
## Arkansas       2110   3378        1.9    70.66   10.1    39.9    65  51945
##          Region           Division  Cool Score
## Alabama   South East South Central FALSE    42
## Alaska     West            Pacific  TRUE    49
## Arizona    West           Mountain FALSE    84
## Arkansas  South West South Central FALSE    75
ncol(state.df)
## [1] 12
state.df[,13] <- NA
head(state.df)
##            Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area
## Alabama          3615   3624        2.1    69.05   15.1    41.3    20  50708
## Alaska            365   6315        1.5    69.31   11.3    66.7   152 566432
## Arizona          2212   4530        1.8    70.55    7.8    58.1    15 113417
## Arkansas         2110   3378        1.9    70.66   10.1    39.9    65  51945
## California      21198   5114        1.1    71.71   10.3    62.6    20 156361
## Colorado         2541   4884        0.7    72.06    6.8    63.9   166 103766
##            Region           Division  Cool Score V13
## Alabama     South East South Central FALSE    42  NA
## Alaska       West            Pacific  TRUE    49  NA
## Arizona      West           Mountain FALSE    84  NA
## Arkansas    South West South Central FALSE    75  NA
## California   West            Pacific FALSE    89  NA
## Colorado     West           Mountain  TRUE     4  NA
state.df <- state.df[,-13]
Deleting columns from a data frame
To delete columns: we can either use negative integer indexing, or set a column to NULL
# First way: use negative integer indexing
state.df = state.df[,-ncol(state.df)]
head(state.df, 4)
##          Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area
## Alabama        3615   3624        2.1    69.05   15.1    41.3    20  50708
## Alaska          365   6315        1.5    69.31   11.3    66.7   152 566432
## Arizona        2212   4530        1.8    70.55    7.8    58.1    15 113417
## Arkansas       2110   3378        1.9    70.66   10.1    39.9    65  51945
##          Region           Division  Cool
## Alabama   South East South Central FALSE
## Alaska     West            Pacific  TRUE
## Arizona    West           Mountain FALSE
## Arkansas  South West South Central FALSE
# Second way: just directly set a column to NULL
state.df$Cool = NULL
head(state.df, 4)
##          Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area
## Alabama        3615   3624        2.1    69.05   15.1    41.3    20  50708
## Alaska          365   6315        1.5    69.31   11.3    66.7   152 566432
## Arizona        2212   4530        1.8    70.55    7.8    58.1    15 113417
## Arkansas       2110   3378        1.9    70.66   10.1    39.9    65  51945
##          Region           Division
## Alabama   South East South Central
## Alaska     West            Pacific
## Arizona    West           Mountain
## Arkansas  South West South Central
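Negative integer indexing depends on column positions, which shift as columns are added and removed. A possible alternative, sketched here and not taken from the notes, is to drop columns by name; the Cool and Score values and the names drop.cols and trimmed are placeholders for illustration.

```r
# Rebuild the data frame and add two throwaway columns.
state.df <- data.frame(state.x77, Region = state.region, Division = state.division)
state.df$Cool  <- rep(c(TRUE, FALSE), length.out = nrow(state.df))
state.df$Score <- seq_len(nrow(state.df))

# Keep every column whose name is not in drop.cols.
drop.cols <- c("Cool", "Score")
trimmed <- state.df[, setdiff(colnames(state.df), drop.cols)]

ncol(state.df)
ncol(trimmed)
colnames(trimmed)
```

Because setdiff() ignores names that are not present, this version also runs without error if one of the columns has already been removed.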
Reminder: Boolean indexing
With matrices or data frames, we’ll often want to access a subset of the rows corresponding to some condition. You already know how to do this, with Boolean indexing
# Compare the averages of the Frost column between states in New England and
# Pacific divisions
mean(state.df[(state.df$Division == "New England"), "Frost"])
## [1] 145.3333
mean(state.df[(state.df$Division == "Pacific"), "Frost"])
## [1] 49.6
What is the average of Frost for the division that contains Texas?
# Which division contains Texas?
tex_div <- state.df["Texas", "Division"]
state.df["Texas",]
##       Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area Region
## Texas      12237   4188        2.2     70.9   12.2    47.4    35 262134  South
##                 Division
## Texas West South Central
mean(state.df[(state.df$Division == "West South Central"), "Frost"])
## [1] 48.5
mean(state.df[(state.df$Division == tex_div), "Frost"])
## [1] 48.5
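Repeating the Boolean-indexing call once per division gets tedious. One way to get every division's average at once, shown here as a sketch that goes beyond what the notes cover, is base R's tapply(); the name frost.by.div is made up.

```r
# Rebuild the data frame so this example is self-contained.
state.df <- data.frame(state.x77, Region = state.region, Division = state.division)

# Mean of Frost within each level of Division: a named vector of 9 values.
frost.by.div <- tapply(state.df$Frost, state.df$Division, mean)
frost.by.div

# Look up the division that contains Texas by name.
tex_div <- as.character(state.df["Texas", "Division"])
frost.by.div[tex_div]
```

The as.character() call matters: indexing with the factor itself would use its integer codes as positions, which only happens to work because the codes line up with the level order.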
subset()
The subset() function provides a convenient alternative way of accessing rows for data frames
# Using subset(), we can just use the column names directly (i.e., no need for
# using $)
state.df.ne.1 = subset(state.df, Division == "New England")
# Get same thing by extracting the appropriate rows manually
state.df.ne.2 = state.df[state.df$Division == "New England", ]
all(state.df.ne.1 == state.df.ne.2)
## [1] TRUE
# Same calculation as in the last slide, using subset()
mean(subset(state.df, Division == "New England")$Frost)
## [1] 145.3333
mean(subset(state.df, Division == "Pacific")$Frost) # Wimps
## [1] 49.6
mean(subset(state.df, Division == "New England", Frost, drop = TRUE))
## [1] 145.3333
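The third argument used in the last call is subset()'s select argument, which picks columns by bare name. A short sketch combining a two-part row condition with select follows; the population cutoff of 1000 and the name ne.small are arbitrary illustrative choices.

```r
# Rebuild the data frame so this example is self-contained.
state.df <- data.frame(state.x77, Region = state.region, Division = state.division)

# Rows: New England states with Population below 1000 (in thousands).
# Columns: only Population and Frost.
ne.small <- subset(state.df,
                   Division == "New England" & Population < 1000,
                   select = c(Population, Frost))
ne.small

# The equivalent manual indexing, for comparison.
rows <- state.df$Division == "New England" & state.df$Population < 1000
ne.manual <- state.df[rows, c("Population", "Frost")]
```

The help page for subset() describes it as a convenience function for interactive use, so the manual form is the safer choice inside functions.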
Replacing values
Parts or all of the data frame can be assigned to:
summary(state.df$HS.Grad)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   37.80   48.05   53.25   53.11   59.15   67.30
state.df$HS.Grad <- state.df$HS.Grad/100
summary(state.df$HS.Grad)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##  0.3780  0.4805  0.5325  0.5311  0.5915  0.6730
state.df$HS.Grad <- 100*state.df$HS.Grad
# state.df$HS.Grad[1:5] <- NA
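The commented-out line above assigns to part of a column. A sketch of what happens if it is run, done on a copy (the name df.copy is made up) so that state.df itself stays intact:

```r
# Rebuild the data frame and work on a copy.
state.df <- data.frame(state.x77, Region = state.region, Division = state.division)
df.copy <- state.df

# Assign to the first five entries of one column.
df.copy$HS.Grad[1:5] <- NA
head(df.copy$HS.Grad, 8)

# Missing values propagate through mean() unless they are removed.
mean(df.copy$HS.Grad)
mean(df.copy$HS.Grad, na.rm = TRUE)

# Assignment can also target the rows matching a condition.
df.copy$Frost[df.copy$Frost == 0] <- NA
sum(is.na(df.copy$Frost))
```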
with()
The with() function provides a way of expressing operations by column names only.
What percentage of literate adults graduated high school?
head(100*(state.df$HS.Grad/(100-state.df$Illiteracy)))
## [1] 42.18590 67.71574 59.16497 40.67278 63.29626 64.35045
with() takes a data frame and evaluates an expression “inside” it:
with(state.df, head(100*(HS.Grad/(100-Illiteracy))))
## [1] 42.18590 67.71574 59.16497 40.67278 63.29626 64.35045
(so you don’t have to type state.df$xyz)
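with() returns the computed values but does not store them. If the result should become a new column, base R's transform() is one option; this goes slightly beyond the notes, and the column name HS.Literate and the object state.df2 are made up for illustration.

```r
# Rebuild the data frame so this example is self-contained.
state.df <- data.frame(state.x77, Region = state.region, Division = state.division)

# with(): evaluate the expression using column names, get a vector back.
pct <- with(state.df, 100 * (HS.Grad / (100 - Illiteracy)))

# transform(): same expression, but the result is attached as a new column
# of a copy of the data frame.
state.df2 <- transform(state.df, HS.Literate = 100 * (HS.Grad / (100 - Illiteracy)))

head(pct)
head(state.df2$HS.Literate)
ncol(state.df2)
```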
Data arguments
Lots of functions take data arguments, and look variables up in that data frame:
plot(Illiteracy~Frost, data = state.df)
[Figure: scatterplot of Illiteracy (vertical axis, 0.5 to 2.5) against Frost (horizontal axis, 0 to 150)]
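plot() is only one of many functions with a data argument. As a sketch of the same pattern, lm() and aggregate() both accept a formula plus data; neither call appears in the notes, and the names fit and frost.by.region are made up.

```r
# Rebuild the data frame so this example is self-contained.
state.df <- data.frame(state.x77, Region = state.region, Division = state.division)

# Linear regression: Illiteracy and Frost are looked up inside state.df.
fit <- lm(Illiteracy ~ Frost, data = state.df)
coef(fit)

# Grouped summary: mean Frost for each Region, returned as a data frame.
frost.by.region <- aggregate(Frost ~ Region, data = state.df, FUN = mean)
frost.by.region
```

In both calls the variable names in the formula are resolved against the columns of state.df first, which is why no $ is needed.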
Summary
• Data frames are a representation of the “classic” data table in R: rows are observations/cases, columns are variables/features
• Each column can be a different data type (but must be the same length)
• subset(): function for extracting rows of a data frame meeting a condition
• with(): function for operating on data frame columns without indexing