Stats Unit3 assignment
School
West Texas A&M University
Course
6388
Subject
Statistics
Date
Feb 20, 2024
Uploaded by georgiaestrada
Dr. J. Rausch
Stat/Soc/Crim Rerch (POSC-6388)
February 5, 2024
Georgia Estrada
Unit 3 and Problem Sets
> getwd()
[1] "C:/Users/georgia/OneDrive/Documents"
> BES <- read.csv("C:/Users/georgia/Downloads/DSS/DSS/BES.csv")
> head(BES)
vote leave education age
1 leave 1 3 60
2 leave 1 NA 56
3 stay 0 5 73
4 leave 1 4 64
5 don't know NA 2 68
6 stay 0 4 85
> dim(BES)
[1] 30895 4
> table(BES$vote)
don't know      leave       stay won't vote
      2314      13692      14352        537
> freq_table <- table(BES$vote)
> prop.table(freq_table)
don't know      leave       stay won't vote
0.07489885 0.44317851 0.46454119 0.01738145
> prop.table(table(BES$vote))
don't know      leave       stay won't vote
0.07489885 0.44317851 0.46454119 0.01738145
> table(BES$education, exclude=NULL)
    1     2     3     4     5  <NA>
 2045  5781  6272 10676  2696  3425
> mean(BES$leave)
[1] NA
> mean(BES$leave, na.rm=TRUE)
[1] 0.4882328
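The NA result above comes from missing values in leave. A minimal sketch, assuming a small made-up vector (toy_leave is hypothetical and is not the BES data), shows how na.rm=TRUE changes the calculation:

```r
# Hypothetical toy vector standing in for a 0/1 variable with one missing value
toy_leave <- c(1, 1, 0, NA, 0, 1)

# A single NA makes the plain mean NA
mean_with_na <- mean(toy_leave)

# na.rm = TRUE drops the NA first, so this is (1 + 1 + 0 + 0 + 1) / 5
mean_without_na <- mean(toy_leave, na.rm = TRUE)

print(mean_with_na)
print(mean_without_na)
```

Note that na.rm=TRUE only skips the missing values for that one calculation, while na.omit() drops every row that has an NA in any column.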
> BES1<-na.omit(BES)
> head(BES)
vote leave education age
1 leave 1 3 60
2 leave 1 NA 56
3 stay 0 5 73
4 leave 1 4 64
5 don't know NA 2 68
6 stay 0 4 85
> head(BES1)
vote leave education age
1 leave 1 3 60
3 stay 0 5 73
4 leave 1 4 64
6 stay 0 4 85
7 leave 1 3 78
8 leave 1 2 51
> dim(BES)
[1] 30895 4
> dim(BES1)
[1] 25097 4
> table(BES1$leave, BES1$education)
1 2 3 4 5
0 498 1763 3014 6081 1898
1 1356 3388 2685 3783 631
> prop.table(table(BES$leave, BES1$education))
Error in table(BES$leave, BES1$education) : all arguments must have the same length
> prop.table(table(BES1$leave, BES1$education))
1 2 3 4 5
0 0.01984301 0.07024744 0.12009404 0.24229988 0.07562657
1 0.05403036 0.13499621 0.10698490 0.15073515 0.02514245
> prop.table(table(BES1$leave, BES1$education), margin=1)
1 2 3 4 5
0 0.03757356 0.13301645 0.22740305 0.45880489 0.14320205
1 0.11449802 0.28607616 0.22671620 0.31942920 0.05328042
> prop.table(table(BES1$leave, BES1$education), margin=2)
1 2 3 4 5
0 0.2686084 0.3422636 0.5288647 0.6164842 0.7504943
1 0.7313916 0.6577364 0.4711353 0.3835158 0.2495057
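The three prop.table() calls above differ only in what the proportions add up to. A rough sketch with made-up vectors (toy_leave and toy_educ are hypothetical, not the BES data) illustrates the difference:

```r
# Hypothetical toy data: a 0/1 outcome and a two-level grouping variable
toy_leave <- c(0, 0, 1, 1, 1, 0, 1, 1)
toy_educ  <- c(1, 2, 1, 2, 2, 2, 1, 1)
tab <- table(toy_leave, toy_educ)

overall <- prop.table(tab)              # all cells together sum to 1
by_row  <- prop.table(tab, margin = 1)  # each row sums to 1
by_col  <- prop.table(tab, margin = 2)  # each column sums to 1

print(sum(overall))
print(rowSums(by_row))
print(colSums(by_col))
```

With margin=2, each column can be read as "among respondents at this education level, what share fall in each row", which is the comparison used above.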
> hist(BES1$age)
> hist(BES1$age[BES1leave==0])
Error: object 'BES1leave' not found
> hist(BES1$age[BES1$leave==0])
> hist(BES1$age[BES1$leave==0])#For non-supporters
> hist(BES1$age[BES1$leave==1]) #for supporters
> hist(BES1$age[BES1$education==1]) #W/o qualifications
> hist(BES1$age[BES1$education==4]) #w/ undergraduate degree
> hist(BES1$age[BES1$education==1], freq=FALSE) #w/o qualifications
> hist(BES1$age[BES1$education==4], freq=FALSE) # w/ undergraduate degree
> hist(BES1$age[BES1$leave==0]), freq=FALSE) #For non-supporters
Error: unexpected ',' in "hist(BES1$age[BES1$leave==0]),"
> hist(BES1$age[BES1$leave==0], freq=FALSE) #For non-supporters
> hist(BES1$age[BES1$leave==1], freq=FALSE)#for supporters
> mean(BES1$age[BES1leave==0])#For non-supporters
Error: object 'BES1leave' not found
> mean(BES1$age[BES1$leave==0])#For non-supporters
[1] 46.89
> mean(BES1$age[BES1$leave==1])#for supporters
[1] 55.06823
> median(BES1$age[BES1$leave==0])#For non-supporters
[1] 48
> median(BES1$age[BES1$leave==1])#for supporters
[1] 58
> sd(BES1$age[BES1$leave==0])#For non-supporters
[1] 17.3464
> sd(BES1$age[BES1$leave==1])#for supporters
[1] 14.96106
> var(BES1$age[BES1$leave==1])
[1] 223.8334
> sd(BES1$age[BES1$leave==1])^2
[1] 223.8334
> sqrt(var(BeS1$age[BES1$leave==1]))
Error: object 'BeS1' not found
> sqrt(var(BES1$age[BES1$leave==1]))
[1] 14.96106
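The checks above show that the variance is the square of the standard deviation. As a sketch under the assumption of a small made-up vector (toy_age is hypothetical), the same relationship can be computed by hand using the n - 1 denominator that var() and sd() use:

```r
# Hypothetical toy ages, not taken from the BES data
toy_age <- c(23, 35, 47, 59, 61)
n <- length(toy_age)

# Sample variance: squared deviations from the mean, divided by n - 1
manual_var <- sum((toy_age - mean(toy_age))^2) / (n - 1)

# Standard deviation is the square root of the variance
manual_sd <- sqrt(manual_var)

print(all.equal(manual_var, var(toy_age)))
print(all.equal(manual_sd, sd(toy_age)))
```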
> dis<-read.csv("UK_districts.csv")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file 'UK_districts.csv': No such file or directory
> UK_districts <- read_csv("C:/Users/georgia/Downloads/DSS/DSS/UK_districts.csv")
Error in read_csv("C:/Users/georgia/Downloads/DSS/DSS/UK_districts.csv") : could not find function "read_csv"
> UK_districts <- read.csv("C:/Users/georgia/Downloads/DSS/DSS/UK_districts.csv")
> head(dis)
Error: object 'dis' not found
> head(UK_districts)
name leave high_education
1 Birmingham 50.42 22.98
2 Cardiff 39.98 32.33
3 Edinburgh City 25.56 21.92
4 Glasgow City 33.41 25.91
5 Liverpool 41.81 22.44
6 Swansea 51.51 25.85
> dim(UK_districts)
[1] 382 3
> UK_districts1<-na.omit(UK_districts)
> dim(UK_districts1)
[1] 380 3
> plot(UK_districts1$high_education, UK_districts1$leave)
> plot(x=UK_districts1$high_education, y=UK_districts1$leave)
> plot(y=UK_districts1$leave, x=UK_districts1$high_education)
> abline(v=mean(UK_districts1$high_education), i=lty="dashed")
Error: unexpected '=' in "abline(v=mean(UK_districts1$high_education), i=lty="
> abline(v=mean(UK_districts1$high_education), lty="dashed")
> abline(h=mean(UK_districts1$leave), lty="dashed")
> cor(UK_districts1$high_education, UK_districts1$leave)
[1] -0.7633185
> cor(UK_districts1$leave, UK_districts1$high_education)
[1] -0.7633185
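The two cor() calls above return the same value because the correlation coefficient does not depend on the order of the variables. A minimal sketch, assuming made-up vectors (toy_x and toy_y are hypothetical, not the UK_districts data), computes the coefficient from z-scores to show why:

```r
# Hypothetical toy data with a downward-sloping relationship
toy_x <- c(10, 20, 30, 40, 50)
toy_y <- c(80, 65, 60, 40, 35)

# z-scores: distance from the mean in standard deviation units
z_x <- (toy_x - mean(toy_x)) / sd(toy_x)
z_y <- (toy_y - mean(toy_y)) / sd(toy_y)

# Correlation: sum of the products of z-scores, divided by n - 1
manual_cor <- sum(z_x * z_y) / (length(toy_x) - 1)

print(all.equal(manual_cor, cor(toy_x, toy_y)))
print(all.equal(cor(toy_x, toy_y), cor(toy_y, toy_x)))
```

Because z_x * z_y equals z_y * z_x, swapping the arguments cannot change the result.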
Problem Set 1
1. Use the function read.csv() to read the CSV file “india.csv” and use the assignment operator <- to store the data in an object called india. (Do not forget to set the working directory first.) Provide the R code you used (without the output). (10 points).
> getwd()
[1] "C:/Users/georg/OneDrive/Documents"
> india <- read.csv("C:/Users/georg/Downloads/india.csv")
2. Use the function head() to view the first few observations of the dataset. Provide the R code you used (without the output). (10 points).
> head(india)
village female water irrigation
1 GP1_village2 1 10 0
2 GP1_village1 1 0 5
3 GP2_village2 1 2 2
4 GP2_village1 1 31 4
5 GP3_village2 0 0 0
6 GP3_village1 0 0 0
3. What does each observation in this dataset represent? (5 points).
Each observation in this dataset represents a village. The six rows shown by head() are the first 6 of the 322 villages.
4. Please substantively interpret the first observation in the dataset. (5 points).
The first observation represents village 2 in Gram Panchayat 1 (GP1_village2), which was assigned a female politician (female = 1). The village had 10 new or repaired drinking water facilities and 0 new or repaired irrigation facilities.
5. For each variable in the dataset, please identify the type of variable (character vs. numeric binary vs. numeric non-binary) (10 points).
The variable “village” is a character variable, “female” is a numeric binary variable, and “water” and “irrigation” are numeric non-binary variables.
6. How many observations are in the dataset? In other words, how many villages were part of this experiment? (Hint: the function dim() might be helpful here.) Provide the R code you used (without the output) and provide the substantive answer. (10 points).
> dim(india)
[1] 322 4
There are 322 villages in India in the dataset.
Problem Set 2
1. Use the function mean() to calculate the average of the variable female. Please provide a full substantive interpretation of what this average means. Make sure to provide the unit of measurement. (10 points).
> mean(india$female)
[1] 0.3354037
Approximately 34% of the villages in the dataset were randomly assigned a female politician.
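The mean of a 0/1 variable can be read as a percentage because it equals the share of observations coded 1. A small sketch with a made-up vector (toy_female is hypothetical, not the india data):

```r
# Hypothetical toy indicator: 1 = treated, 0 = not treated
toy_female <- c(1, 0, 0, 1, 0, 0, 0, 1, 0, 0)

# Mean of the 0/1 values
share_mean <- mean(toy_female)

# Same quantity computed as a count of 1s over the number of observations
share_count <- sum(toy_female == 1) / length(toy_female)

print(share_mean)
print(share_count)
```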
2. Use the function mean() to calculate the average of the variable water. Please provide a full substantive interpretation of what this average means. Make sure to provide the unit of measurement. (10 points).
> mean(india$water)
[1] 17.84161
On average, the villages in the dataset had about 18 new or repaired drinking water facilities each.
3. If we wanted to estimate the average causal effect of having a female politician on the number of new (and repaired) drinking water facilities: (10 points).
> mean(india$water[india$female==1])-mean(india$water[india$female==0])
[1] 9.252423
a. What would be the treatment variable? Please just provide the name of the variable.
female
b. What would be the outcome variable? Please just provide the name of the variable.
water
4. If we wanted to estimate the average causal effect of having a female politician on the number of new (and repaired) irrigation facilities: (10 points).
> mean(india$irrigation[india$female==1])-mean(india$irrigation[india$female==0])
[1] -0.3693319
a. What would be the treatment variable? Please just provide the name of the variable.
female
b. What would be the outcome variable? Please just provide the name of the variable.
irrigation
5. In both analyses above: (10 points)
a. What would be the treatment group?
b. What would be the control group?
a.) The treatment group consists of the villages that were randomly assigned a female politician.
b.) The control group consists of the villages that were not randomly assigned a female politician.
Problem Set 3
1. Considering that the dataset we are analyzing comes from a randomized experiment, what can we compute to estimate the average causal effect of having a female politician on the number of new (or repaired) drinking water facilities? Please provide the name of the estimator. (5 points).
We can compute the average number of new or repaired drinking water facilities in villages that have a female politician and subtract from it the average number of new or repaired drinking water facilities in villages that have a male politician. The name of the estimator is the difference-in-means estimator.
2. In this dataset, what is the average number of new (or repaired) drinking water facilities in villages with a female politician? Please answer with a full sentence. (10 points).
> mean(india$water[india$female==1])
[1] 23.99074
The average number of new or repaired drinking water facilities in a village with a female politician is 24.0.
3. What is the average number of new (or repaired) drinking water facilities in villages with a male politician? Please answer with a full sentence. (10 points).
> mean(india$water[india$female==0])
[1] 14.73832
The average number of new or repaired drinking water facilities in a village with a male politician is 14.7.
4. What is the estimated average causal effect of having a female politician on the number of new (or repaired) drinking water facilities? Please provide a full substantive answer (make sure to include the assumption, why the assumption is reasonable, the treatment, the outcome, as well as the direction, size, and unit of measurement of the average treatment effect) (25 points).
> mean(india$water[india$female==1])-mean(india$water[india$female==0])
[1] 9.252423
It is assumed that the villages assigned a female politician are comparable to the villages that were not assigned a female politician. The assumption is reasonable because the female politicians were assigned at random. The treatment is having a female politician rather than a male politician, and the outcome is the number of new or repaired drinking water facilities. We estimate that having a female politician increases the number of new or repaired drinking water facilities by about 9 facilities per village, on average.
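The calculation above can be wrapped in a small helper. This is only a sketch: diff_in_means is a hypothetical function name, and toy_female and toy_water are made-up vectors, not the india data.

```r
# Hypothetical helper: mean outcome among treated minus mean outcome among controls
diff_in_means <- function(outcome, treatment) {
  mean(outcome[treatment == 1]) - mean(outcome[treatment == 0])
}

# Hypothetical toy data: 1 = female politician, 0 = male politician
toy_female <- c(1, 0, 1, 0, 0, 1)
toy_water  <- c(20, 10, 30, 14, 12, 25)

# Treated mean is (20 + 30 + 25) / 3 and control mean is (10 + 14 + 12) / 3
estimate <- diff_in_means(toy_water, toy_female)
print(estimate)
```

With the real data, the equivalent call would be diff_in_means(india$water, india$female). A positive value means the treated villages had more facilities on average, in the same units as the outcome.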