lab7_statistical_inference
html
keyboard_arrow_up
School
Temple University *
*We aren’t endorsed by this school
Course
1013
Subject
Geography
Date
Dec 6, 2023
Type
html
Pages
23
Uploaded by MegaTitanium11196
Lab 7: Inference and Global Climate Change
¶
By the end of this lab, you should know how to:
1. Test whether observed data appears to be a random sample from a distribution.
2. Analyze a natural experiment.
3. Implement and interpret a sign test.
4. Create a function to run a general hypothesis test.
5. Analyze visualizations and draw conclusions from them.
In [ ]:
name = James
In [ ]:
## import statements
# These lines load the tests.
from gofer.ok import check
import numpy as np
from datascience import *
import pandas as pd
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
# Fix for datascience plots
import collections as collections
import collections.abc as abc
collections.Iterable = abc.Iterable
Overview
¶
Climate change is usually referring to the general trend of warming temperatures
globally. Along with these increasing temperatures, unusual shifts in trends in weather
activity such as hurricanes, storms, winds, etc are also usually classified under climate
change. While the climate can shift due to natural occurrences, scientists have found
that human interventions have potentially caused the trend of warming. One explanation
for the warming could be increased solar solar activity, however scientists have found
that solar activity has not generally increased during the period when temperature has
increased.
Links:
NASA
Canada
Data
¶
While there's several different metrics we could analyze to make some inferences about
overall trends in global climate change, for simplicity's sake we will be focusing on land
temperature across different countries. The original table came from this
database
,
however it's been reformatted to make the downstream analyses easier. There are 15
columns: Year, Country, Average Temperature of that country in that year 'avg', and then
a column for each month of that year with temperatures.
In [5]:
from datascience import Table
temps = Table.read_table('temp_per_country.csv')
Question 1:
Let's explore this data a bit. The cell below creates a pivot table with years
as the rows and each country as a new column. Select two countries from our dataset
and draw a line plot of the changes in temperature over time. You only want to graph the
years that have data for both your countries of interest (Hint: You may want to utilize
where and are.above() to select those years with data). There is not a autocheck for this
question as you all may have different answers depending on the countres you pick.
In [7]:
import numpy as np
np.unique(temps['country'])
Out[7]:
array(['Afghanistan', 'Africa', 'Albania', 'Algeria', 'American Samoa',
'Andorra', 'Angola', 'Anguilla', 'Antigua And Barbuda', 'Argentina',
'Armenia', 'Aruba', 'Asia', 'Australia', 'Austria', 'Azerbaijan',
'Bahamas', 'Bahrain', 'Baker Island', 'Bangladesh', 'Barbados',
'Belarus', 'Belgium', 'Belize', 'Benin', 'Bhutan', 'Bolivia',
'Bonaire, Saint Eustatius And Saba', 'Bosnia And Herzegovina',
'Botswana', 'Brazil', 'British Virgin Islands', 'Bulgaria',
'Burkina Faso', 'Burma', 'Burundi', 'Cambodia', 'Cameroon',
'Canada', 'Cape Verde', 'Cayman Islands',
'Central African Republic', 'Chad', 'Chile', 'China',
'Christmas Island', 'Colombia', 'Comoros', 'Congo',
'Congo (Democratic Republic Of The)', 'Costa Rica', 'Croatia',
'Cuba', 'Curaçao', 'Cyprus', 'Czech Republic', "Côte D'Ivoire",
'Denmark', 'Denmark (Europe)', 'Djibouti', 'Dominica',
'Dominican Republic', 'Ecuador', 'Egypt', 'El Salvador',
'Equatorial Guinea', 'Eritrea', 'Estonia', 'Ethiopia', 'Europe',
'Falkland Islands (Islas Malvinas)', 'Faroe Islands',
'Federated States Of Micronesia', 'Fiji', 'Finland', 'France',
'France (Europe)', 'French Guiana', 'French Polynesia',
'French Southern And Antarctic Lands', 'Gabon', 'Gambia',
'Gaza Strip', 'Georgia', 'Germany', 'Ghana', 'Greece', 'Greenland',
'Grenada', 'Guadeloupe', 'Guam', 'Guatemala', 'Guernsey', 'Guinea',
'Guinea Bissau', 'Guyana', 'Haiti',
'Heard Island And Mcdonald Islands', 'Honduras', 'Hong Kong',
'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran', 'Iraq',
'Ireland', 'Isle Of Man', 'Israel', 'Italy', 'Jamaica', 'Japan',
'Jersey', 'Jordan', 'Kazakhstan', 'Kenya', 'Kingman Reef',
'Kiribati', 'Kuwait', 'Kyrgyzstan', 'Laos', 'Latvia', 'Lebanon',
'Lesotho', 'Liberia', 'Libya', 'Liechtenstein', 'Lithuania',
'Luxembourg', 'Macau', 'Macedonia', 'Madagascar', 'Malawi',
'Malaysia', 'Mali', 'Malta', 'Martinique', 'Mauritania',
'Mauritius', 'Mayotte', 'Mexico', 'Moldova', 'Monaco', 'Mongolia',
'Montenegro', 'Montserrat', 'Morocco', 'Mozambique', 'Namibia',
'Nepal', 'Netherlands', 'Netherlands (Europe)', 'New Caledonia',
'New Zealand', 'Nicaragua', 'Niger', 'Nigeria', 'Niue',
'North America', 'North Korea', 'Northern Mariana Islands',
'Norway', 'Oceania', 'Oman', 'Pakistan', 'Palau', 'Palestina',
'Palmyra Atoll', 'Panama', 'Papua New Guinea', 'Paraguay', 'Peru',
'Philippines', 'Poland', 'Portugal', 'Puerto Rico', 'Qatar',
'Reunion', 'Romania', 'Russia', 'Rwanda', 'Saint Barthélemy',
'Saint Kitts And Nevis', 'Saint Lucia', 'Saint Martin',
'Saint Pierre And Miquelon', 'Saint Vincent And The Grenadines',
'Samoa', 'San Marino', 'Sao Tome And Principe', 'Saudi Arabia',
'Senegal', 'Serbia', 'Seychelles', 'Sierra Leone', 'Singapore',
'Sint Maarten', 'Slovakia', 'Slovenia', 'Solomon Islands',
'Somalia', 'South Africa', 'South America',
'South Georgia And The South Sandwich Isla', 'South Korea', 'Spain',
'Sri Lanka', 'Sudan', 'Suriname', 'Svalbard And Jan Mayen',
'Swaziland', 'Sweden', 'Switzerland', 'Syria', 'Taiwan',
'Tajikistan', 'Tanzania', 'Thailand', 'Timor Leste', 'Togo',
'Tonga', 'Trinidad And Tobago', 'Tunisia', 'Turkey', 'Turkmenistan',
'Turks And Caicas Islands', 'Uganda', 'Ukraine',
'United Arab Emirates', 'United Kingdom', 'United Kingdom (Europe)',
'United States', 'Uruguay', 'Uzbekistan', 'Venezuela', 'Vietnam',
'Virgin Islands', 'Western Sahara', 'Yemen', 'Zambia', 'Zimbabwe',
'Åland'],
dtype='<U41')
In [8]:
pivotTable = temps.select('year', 'country', 'avg').pivot('country', 'year', 'avg',
sum)
yourCountries = pivotTable.select('year','Germany','Ireland')
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
yourCountries.show()
year Germany Ireland
1753 8.02917
9.36908
1754 7.75983
9.35642
1755 7.52
9.04967
1756 8.40875
9.50142
1757 8.10858
9.46283
1758 7.49608
8.50692
1759 8.41275
9.49833
1760 8.38017
8.82892
1761 8.59617
9.65783
1762 7.80592
9.09283
1763 7.71208
8.66508
1764 8.21042
9.03192
1765 7.88458
8.87775
1766 7.91758
9.01617
1767 7.437
9.16967
1768 7.4135
8.941
1769 7.79408
9.069
1770 7.78958
8.8345
1771 7.10917
8.95342
1772 8.60708
9.03833
1773 8.61833
8.995
year Germany Ireland
1774 7.84133
9.07517
1775 8.95067
9.81958
1776 7.72567
9.07183
1777 7.69033
9.23717
1778 8.70817
9.6155
1779 9.46975
10.4047
1780 8.10092
9.83517
1781 9.16917
10.1843
1782 7.5945
8.23492
1783 8.86842
9.3365
1784 6.907
8.49633
1785 6.75508
8.85283
1786 7.01508
8.71192
1787 8.382
9.55617
1788 7.77067
9.61858
1789 7.95517
9.41883
1790 8.32217
9.68358
1791 8.63342
9.56083
1792 8.2015
9.55342
1793 8.41233
9.502
1794 9.129
9.83508
1795 8.03758
9.2285
year Germany Ireland
1796 8.17908
9.40342
1797 9.02317
9.3285
1798 8.49575
9.83033
1799 6.31308
8.41825
1800 8.10667
9.27008
1801 8.81067
9.61617
1802 8.26
9.43033
1803 7.35425
9.26942
1804 7.83858
9.60583
1805 6.4745
9.31617
1806 8.89817
9.50075
1807 8.49675
8.72975
1808 7.31592
8.88542
1809 7.855
8.71733
1810 7.7455
8.57542
1811 9.14383
9.24142
1812 6.76292
8.44908
1813 7.68283
9.02358
1814 6.81
8.21408
1815 7.63192
8.77983
1816 6.76433
7.90617
1817 7.94908
8.70117
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
year Germany Ireland
1818 8.28083
9.58717
1819 8.58633
9.05358
1820 7.17875
8.82867
1821 8.21558
9.67442
1822 9.34867
9.785
1823 7.59567
8.71592
1824 8.56575
9.45258
1825 8.71142
9.79633
1826 8.55108
9.88317
1827 8.15158
9.49625
1828 8.41108
10.1043
1829 5.905
8.57183
1830 7.28958
9.02283
1831 8.4605
9.71025
1832 7.67192
9.22092
1833 8.11117
9.20342
1834 9.51108
10.0865
1835 7.95875
9.3285
1836 8.08342
8.69383
1837 7.30283
8.95083
1838 6.45075
8.33275
1839 7.9685
8.76225
year Germany Ireland
1840 7.02533
8.83408
1841 8.37258
8.88392
1842 7.81542
9.43342
1843 8.27758
9.28242
1844 7.21983
9.09325
1845 7.12217
8.76633
1846 9.16675
10.1389
1847 7.47967
9.45992
1848 8.10742
9.13842
1849 7.56467
9.28108
1850 7.41575
9.3615
1851 7.57817
9.39925
1852 8.75058
9.81867
1853 6.87308
8.797
1854 7.86633
9.55275
1855 6.45633
8.42758
1856 7.73117
9.23425
1857 8.5035
10.0288
1858 7.2865
9.38217
1859 8.85667
9.483
1860 7.06167
8.34942
1861 8.08917
9.45108
year Germany Ireland
1862 8.50383
9.0435
1863 8.88858
9.60842
1864 6.47225
9.14242
1865 8.22075
9.78317
1866 8.61125
9.28358
1867 7.79133
9.19408
1868 9.36025
9.99667
1869 8.13992
9.64317
1870 6.98175
9.25942
1871 6.58008
9.49625
1872 9.11858
9.476
1873 8.40683
9.13883
1874 8.01817
9.48467
1875 7.518
9.54658
1876 7.997
9.48375
1877 8.18242
9.26883
1878 8.293
9.48483
1879 6.66092
8.04358
1880 8.27658
9.30925
1881 7.25325
8.69408
1882 8.31658
9.318
1883 7.8375
8.92783
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
year Germany Ireland
1884 8.48617
9.49492
1885 7.72683
8.64883
1886 7.99058
8.73883
1887 6.99483
8.90733
1888 6.90817
8.716
1889 7.44325
9.25975
1890 7.37008
9.25683
1891 7.49325
8.86517
1892 7.58767
8.40533
1893 7.947
9.91825
1894 8.18025
9.281
1895 7.4225
8.70892
1896 7.68775
9.48142
1897 8.05067
9.53483
1898 8.61475
10.0273
1899 8.23675
9.87783
1900 8.45208
9.32717
1901 7.68108
9.11917
1902 7.25667
9.11017
1903 8.40242
9.15042
1904 8.40033
9.14775
1905 8.02742
9.32375
year Germany Ireland
1906 8.354
9.30325
1907 7.85258
8.97567
1908 7.52133
9.62708
1909 7.33683
8.92933
1910 8.41875
9.21342
1911 9.026
9.72125
1912 7.83742
9.27275
1913 8.50675
9.54842
1914 8.48692
9.72192
1915 7.87533
9.0475
1916 8.42692
9.37983
1917 7.4995
8.75958
1918 8.49358
9.57425
1919 7.30075
8.80692
1920 8.61567
9.63067
1921 9.01467
10.517
1922 7.17592
9.04992
1923 7.98092
9.25017
1924 7.5205
9.42175
1925 8.28917
9.18508
1926 8.7205
9.8275
1927 8.08017
9.41483
year Germany Ireland
1928 8.37158
9.61942
1929 7.40658
9.47533
1930 8.78275
9.31858
1931 7.59575
9.53625
1932 8.35317
9.75708
1933 7.70375
10.0422
1934 9.56967
10.0011
1935 8.47167
9.67033
1936 8.47642
9.64817
1937 8.71192
9.53908
1938 8.80833
10.0117
1939 8.45217
9.87492
1940 6.62508
9.68133
1941 7.13958
9.31383
1942 7.2385
9.48342
1943 8.9215
10.0567
1944 8.31758
9.798
1945 8.9005
10.4441
1946 8.38333
9.598
1947 8.45842
9.50275
1948 9.04242
10.0742
1949 9.1765
10.5766
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
year Germany Ireland
1950 8.65342
9.46183
1951 8.81642
9.22975
1952 8.00683
9.37725
1953 9.067
10.0196
1954 7.78883
9.438
1955 7.68408
9.62958
1956 6.9695
9.40517
1957 8.68608
10.0621
1958 8.32583
9.69983
1959 9.18275
10.4129
1960 8.51417
9.67342
1961 9.12417
9.81108
1962 7.32492
9.06442
1963 7.2205
8.78825
1964 8.23633
9.751
1965 7.633
9.06892
1966 8.64567
9.63133
1967 9.04167
9.49142
1968 8.32358
9.58883
1969 7.913
9.33275
1970 7.85608
9.60358
1971 8.60017
10.0866
year Germany Ireland
1972 8.05483
9.23958
1973 8.40283
9.81667
1974 8.97042
9.47867
1975 9.09367
10.0637
1976 8.6505
9.86158
1977 8.80483
9.61225
1978 7.93275
9.64858
1979 7.90167
8.8925
1980 7.72725
9.54533
1981 8.3205
9.59092
1982 9.07333
9.84675
1983 9.18567
10.0795
1984 8.123
9.78458
1985 7.61333
9.11067
1986 8.08508
8.73333
1987 7.58925
9.43867
1988 9.19558
9.86533
1989 9.67983
10.2957
1990 9.6895
10.3127
1991 8.52308
9.82517
1992 9.50583
9.789
1993 8.6335
9.64158
year Germany Ireland
1994 9.847
9.88483
1995 9.0335
10.5189
1996 7.37442
9.59117
1997 9.04867
10.6835
1998 9.17175
10.4595
1999 9.62075
10.4582
2000 10.0204
10.1308
2001 9.09042
10.0231
2002 9.63733
10.4207
2003 9.49425
10.5224
2004 9.05342
10.4391
2005 9.12842
10.5757
2006 9.706
10.6542
2007 9.9975
10.8427
2008 9.64217
10.1587
2009 9.35917
10.1572
2010 8.009
9.25817
2011 9.81958
10.3454
2012 9.22717
10.0358
In [13]:
yourCountries = pivotTable.select('year','Germany','Ireland')
yourCountries.plot('year')
In [15]:
yourCountries = pivotTable.select('year','Germany','Ireland')
yourCountries.plot('year')
Question 1 continued:
In this markdown cell, explain an observation you see from the
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
figure you generated.
...
Question 2:
Let's visualize the change in temperature for the United States.
In [47]:
print(temps)
print(temps.labels)
us = temps.where("country","United States")
plt.figure(figsize = (10, 5))
plt.plot(us['year'].astype(int), us['avg'])
plt.xticks(np.arange(1750, 2025, 25))
plt.show()
year | country
| avg
| jan
| feb
| mar
| apr
| may
| jun
| jul
| aug
| sep
| oct
| nov
| dec
1753 | Åland
| 5.11833 | -2.412 | -3.273 | 0.71
| 2.778
| 6.226
|
11.102 | 15.159 | 15.786 | 12.106 | 8.756
| 2.248
| -7.766
1753 | Albania
| 12.557
| 1.4
| 2.655
| 8.505 | 11.541 | 16.642 |
22.098 | 23.532 | 21.77
| 18.7
| 13.661 | 7.593
| 2.587
1753 | Andorra
| 11.2345 | 0.938
| 4.083
| 8.352 | 9.165
| 13.783 |
19.796 | 21.148 | 18.796 | 16.546 | 11.706 | 5.991
| 4.51
1753 | Austria
| 6.13892 | -6.398 | -3.537 | 2.681 | 6.498
| 11.331 |
16.209 | 16.881 | 14.751 | 12.34
| 7.072
| -0.011 | -4.15
1753 | Belarus
| 5.65175 | -7.122 | -6.956 | 0.706 | 6.768
| 13.06
|
16.615 | 18.032 | 16.501 | 12.448 | 6.84
| 0.024
| -9.095
1753 | Belgium
| 9.45708 | -1.215 | 2.443
| 6.838 | 8.826
| 13.042 |
17.602 | 18.072 | 16.203 | 14.761 | 10.332 | 4.143
| 2.438
1753 | Bosnia And Herzegovina | 10.3656 | -1.973 | 0.043
| 6.558 | 10.293 | 15.253 |
20.463 | 21.589 | 19.567 | 16.607 | 11.343 | 4.776
| -0.132
1753 | Bulgaria
| 10.3995 | -1.841 | -0.883 | 5.889 | 10.043 | 15.723 |
20.69
| 22.084 | 20.446 | 17.03
| 11.668 | 5.201
| -1.256
1753 | Croatia
| 11.2875 | -1.311 | 1.012
| 7.5
| 11.318 | 16.33
|
21.666 | 22.721 | 20.52
| 17.527 | 12.113 | 5.342
| 0.712
1753 | Czech Republic
| 7.49492 | -4.72
| -2.339 | 3.956 | 8.117
| 12.84
|
17.184 | 18.175 | 16.392 | 13.701 | 8.464
| 1.475
| -3.306
... (44320 rows omitted)
('year', 'country', 'avg', 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep',
'oct', 'nov', 'dec')
In [20]:
check('tests/q2.py')
---------------------------------------------------------------------------
NameError
Traceback (most recent call last)
Cell In[20], line 1
----> 1 check(
'tests/q2.py'
)
NameError: name 'check' is not defined
Question 3:
Null and alternative hypothesis. We may be curious whether globally
temperatures are more likely to increase or decrease on average. Based on our
preliminary figures and what we know about creating good hypotheses, set the null and
alternative hypothesis below:
•
Null hypothesis: ...
•
Alternative hypothesis: ...
To test the null hypothesis we're interested in identifying whether the temperature
increased or decreased in each time period.
Temperatures vary widely across countries and years, presumably due to the vast array
of differences among the climates and human intervention. Rather than attempting to
analyze the temperatures themselves, here we will restrict our analysis to whether or not
temperatures increased or decreased over certain time spans. We will not concern
ourselves with how much temperatures increased or decreased; only the direction of the
changes - whether they increased or decreased.
The np.diff function takes an array of values and computes the differences between
adjacent items of a list or array as such:
[item 1 - item 0 , item 2 - item 1 , item 3 - item 2, ...]
Instead, we may wish to compute the difference between items that are two positions
apart. For example, given a 5-element array, we may want:
[item 2 - item 0 , item 3 - item 1 , item 4 - item 2]
The diff_n function below computes this result. Don't worry if the implementation uses
unfamiliar features of Python, as long as you understand its behavior.
In [24]:
def diff_n(values, n):
return np.array(values)[n:] - np.array(values)[:-n]
diff_n(make_array(1, 10, 100, 1000, 10000), 2)
Out[24]:
array([
99,
990, 9900])
Question 4:
Implement the function changes that takes an array of temperatures for a
country, ordered by increasing year. For all two-year periods (e.g., from 1960 to 1962), it
computes and returns the number of increases minus the number of decreases.
For example, the array r = make_array(10, 7, 12, 9, 13, 9, 11) contains 3 increases (10 to
12, 7 to 9, and 12 to 13), 1 decrease (13 to 11), and 1 change that is neither an increase
or decrease (9 to 9). Therefore, changes(r) would return 2, the difference between 3
increases and 1 decrease.
Hint: Consider using the
diff_n
function combined with boolean functions which use
np.count_non-zero
when array elements after using
diff_n
represent increases and
separately when they represent decreases.
In [55]:
def changes(rates, years=2):
differences = diff_n(rates, years)
greater = np.count_nonzero(differences > 0)
less = np.count_nonzero(differences < 0)
return greater - less
In [53]:
check('tests/q4.py')
---------------------------------------------------------------------------
NameError
Traceback (most recent call last)
Cell In[53], line 1
----> 1 check(
'tests/q4.py'
)
NameError: name 'check' is not defined
Question 5:
Assign changes_by_country to a table with one row per country that has
two columns: the Country name and the Temperature changes statistic computed across
all years in our data set for that country. It may be useful to split this process into two
steps. The final table's first 2 rows should look like this:
country
avg changes
Afghanistan 18
Africa
8
Hint: You can use a
group
method to apply your
changes
function to each column in the
original data set while grouping on each country. See this example from Olympic data
below:
In [ ]:
NORUSA = Table.read_table('NORUSA.csv')
NORUSA_NUMBERS = NORUSA.group(['Year','Team']) # Number of athletes per year
NORUSA_NUMBERS
Now compute the increases - decreases for the winter olympics for each team
¶
Below code allows us to group 'Team' across all the years of the Olympics to give the
following table. |Team| Year changes | count changes| |----|---|---| |Norway|20|10| |United
States|20|18
Apply this concept to create the table showing net change for each country.
In [ ]:
NORUSA_NUMBERS.group('Team',changes)
In [29]:
countries = temps.group("country", changes)
changes_by_country = countries.select('country', 'avg changes')
changes_by_country
Out[29]:
country
avg changes
Afghanistan
18
Africa
8
Albania
-22
Algeria
9
American Samoa
-3
Andorra
10
Angola
-1
Anguilla
6
Antigua And Barbuda
0
Argentina
-1
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
... (232 rows omitted)
In [ ]:
check('tests/q5.py')
Question 6:
Assign test_stat to the total increases minus the total decreases for all two-
year periods and all countries in our data set. For example, if the temperature in Albania
went up 23 times and fell 17 times, the total change for Albania would be 6. We want the
total value for all the countries together.
In [31]:
test_stat = sum(changes_by_country.column('avg changes'))
print('Total increases minus total decreases, across all countries and years:', test_stat)
Total increases minus total decreases, across all countries and years: 1140
In [ ]:
check('tests/q6.py')
"More increases than decreases," one person exclaims, "Temperatures tend to go up
across two-year periods. What dire times we live in."
"Not so fast," another person replies, "Even if temperatures just moved up and down
uniformly at random, there would be some difference between the increases and
decreases. There were a lot of countries and a lot of years, so there were many chances
for changes to happen. If country temperature increase and decrease at random with
equal probability, perhaps this difference was simply due to chance!"
Based on the null hypothesis above that country temperatures increase and decrease by
chance, we can simulate our test statistic. Our test statistic should depend only on
whether temperature increased or decreased, not on the size of any change. Thus we
choose:
Test Statistic: The number of increases minus the number of decreases
The cell below samples increases and decreases at random from a uniform distribution
100 times. The final column of the resulting table gives the number of increases and
decreases that resulted from sampling in this way. Using sample_from_distribution is
faster than using sample followed by group to compute the same result.
In [32]:
uniform = Table().with_columns(
"Change", make_array('Increase', 'Decrease'),
"Chance", make_array(0.5,
0.5))
uniform.sample_from_distribution('Chance', 100)
Out[32]:
Change
Chance
Chance sample
Increase
0.5
52
Decrease 0.5
48
Question 7:
Complete the simulation below, which samples num_changes
increases/decreases at random many times and forms an empirical distribution of your
test statistic under the null hypothesis. Your job is to
•
fill in the function simulate_under_null, which simulates a single sample under the
null hypothesis, and
•
fill in its argument when it's called below.
As a hint, num_changes should be approximately the number of countries times the
number of time comparisons (you can find the number of year comparisons by using
diff_n().
In [36]:
def simulate_under_null(num_chances_to_change):
"""Simulates some number changing several times, with an equal
chance to increase or decrease. Returns the value of your test statistic for these
simulated changes.
num_chances_to_change is the number of times the number changes.
"""
uniform = Table().with_columns(
"Change", make_array('Increase', 'Decrease'),
"Chance", make_array(0.5, 0.5)
)
sample = uniform.sample_from_distribution('Chance', num_chances_to_change)
increase_num = sample.column('Chance sample').item(0)
decrease_num = sample.column('Chance sample').item(1)
return increase_num - decrease_num
result = simulate_under_null(100)
# Replace 100 with the desired value of
num_chances_to_change
print("Result:", result)
Result: -10
In [57]:
def empirical_distribution(tbl):
num_changes = len(tbl.group('country', list).apply(changes, 'avg list')) * len(tbl)
samples = make_array()
for i in np.arange(10000):
samples = np.append(samples, simulate_under_null(num_changes))
return samples
In [59]:
samples = empirical_distribution(temps)
In [ ]:
check('tests/q7.py')
Question 8:
Complete the analysis as follows:
1. Compute a P-value. (Hint: you can use np.count_nonzero())
2. Using a 5% P-value cutoff, draw a conclusion about the null and alternative
hypotheses.
3. Describe your findings using simple, non-technical language. What does your
analysis tell you about temperatures changes over time? What can you claim about
causation from your statistical analysis?
P-value:
...
Conclusion about the hypotheses:
...
Findings:
...
In [60]:
pvalue = np.count_nonzero(samples >= test_stat)/10000
pvalue
Out[60]:
0.0
Some countries have implemented policies and laws to counteract climate change
whereas others have not - we have a table that contains a boolean to indicate whether a
country has any policies or laws to protect the earth and then has a number of policies
and laws implemented in that country. We can test to see if those countries that have
implemented policies to counteract climate change show any difference in changes in
temperatures from those countries who have not implemented policies. A natural
experiment happens when something other than experimental design applies a
treatment to one group and not to another (control) group, and we have some hope that
the treatment and control groups don't have any other systematic differences. This is
likely not the case globally, but if we did believe that the countries didn't have other
systematic differences, how would we set up the experiment.
Data Source:
Climate Change Laws of the World
Question 9:
Describe this investigation in terms of an experiment. What population are
we studying? What is the control group? What is the treatment group? What outcome are
we measuring? Be precise!
Write your answers below.
• Population: Pop. of a country • Control: # of laws • Treatment: Create laws to
counteract climate change • Outcome: Contrasting the temperatures for countries that
have laws vs the countries that don't
In [61]:
laws = Table.read_table('laws.csv')
laws
Out[61]:
country
haveLaws numberLaws
Afghanistan
True
14
Africa
False
0
Albania
True
4
Algeria
True
14
American Samoa
False
0
Andorra
True
25
Angola
True
23
Anguilla
False
0
Antigua And Barbuda
False
0
Argentina
True
19
... (232 rows omitted)
Question 10:
Let's set up to compute an empirical distribution for countries that have
laws and policies that attempt to counteract climate change and an empirical distribution
for countries that have not implemented laws and policies. We want to focus on the time
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
range between 1990 and 2020 as the majority of laws were implemented in this time
period. We're going to split this up into four steps.
1. Combine the temperature table and the laws table.
2. Set year_range to the correct time period.
3. Create two tables: one of countries that have climate change laws and one for
countries that do not.
In [62]:
temp_law = temps.join('country', laws)
year_range = temp_law.where('year',are.between(1990,2020))
haveLaws = year_range.where('haveLaws',True)
noLaws = year_range.where('haveLaws',False)
In [ ]:
check('tests/q10.py')
Question 11:
Calculate the test statistic for both subsets of countries: those that have
implemented climate change laws and those that have not implemented these laws.
In [63]:
laws_test_stat = sum(haveLaws.group('country',changes).column('avg changes'))
laws_test_stat
Out[63]:
199
In [64]:
nolaws_test_stat = sum(noLaws.group('country',changes).column('avg changes'))
nolaws_test_stat
Out[64]:
61
In [ ]:
check('tests/q11.py')
Question 12:
Now using these tables from question 10 and the calculated test statistic
from question 11, create an empirical distribution and calculate a p-value.
In [65]:
haveLawsSamples = empirical_distribution(haveLaws)
lawsPvalue = np.count_nonzero(samples >= test_stat)/10000
print("P-value for countries that have implemented policies to counteract climate change
from 1990 to 2020 :" + str(lawsPvalue))
P-value for countries that have implemented policies to counteract climate change from
1990 to 2020 :0.0
In [66]:
noLawsSamples = empirical_distribution(haveLaws)
nolawsPvalue = np.count_nonzero(samples >= test_stat)/10000
print("P-value for countries that have NOT implemented policies to counteract climate
change from 1990 to 2020 :" + str(nolawsPvalue))
P-value for countries that have NOT implemented policies to counteract climate change from
1990 to 2020 :0.0
Question 13:
Explain what our results show in the markdown cell below:
The two reswults for Question 12 share alot of similarities. The testing statistics sharea
similar p value for climate change from the years 1990 to 2020.
In [68]:
import glob
from gofer.ok import check
correct = 0
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
checks = [2, 4, 5, 6, 7, 10, 11]
total = len(checks)
for x in checks:
print('Testing question {}: '.format(str(x)))
g = check('tests/q{}.py'.format(str(x)))
if g.grade == 1.0:
print("Passed")
correct += 1
else:
print('Failed')
display(g)
Testing question 2:
Passed
All tests passed!
Testing question 4:
Passed
All tests passed!
Testing question 5:
Passed
All tests passed!
Testing question 6:
Passed
All tests passed!
Testing question 7:
Passed
All tests passed!
Testing question 10:
Passed
All tests passed!
Testing question 11:
Passed
All tests passed!
In [ ]:
print("Nice work ",name)
import time;
localtime = time.asctime( time.localtime(time.time()) )
print("Submitted @ ", localtime)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help