lab7_statistical_inference

School

Temple University

Course

1013

Subject

Geography

Date

Dec 6, 2023

Lab 7: Inference and Global Climate Change

By the end of this lab, you should know how to:

1. Test whether observed data appears to be a random sample from a distribution.
2. Analyze a natural experiment.
3. Implement and interpret a sign test.
4. Create a function to run a general hypothesis test.
5. Analyze visualizations and draw conclusions from them.

In [ ]:
name = "James"

In [ ]:
## import statements
# These lines load the tests.
from gofer.ok import check
import numpy as np
from datascience import *
import pandas as pd
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
# Fix for datascience plots
import collections
import collections.abc as abc
collections.Iterable = abc.Iterable

Overview

Climate change usually refers to the general trend of warming temperatures globally. Unusual shifts in weather activity, such as hurricanes, storms, and winds, are also usually classified under climate change. While the climate can shift due to natural occurrences, scientists have found that human interventions have potentially caused the warming trend. One proposed explanation for the warming is increased solar activity; however, scientists have found that solar activity has not generally increased during the period when temperature has increased.
Links: NASA, Canada

Data

While there are several different metrics we could analyze to make inferences about overall trends in global climate change, for simplicity's sake we will focus on land temperature across different countries. The original table came from this database; however, it has been reformatted to make the downstream analyses easier. There are 15 columns: year, country, the average temperature of that country in that year (avg), and one column for each month of that year with its temperature.

In [5]:
from datascience import Table
temps = Table.read_table('temp_per_country.csv')

Question 1: Let's explore this data a bit. The cell below creates a pivot table with years as the rows and each country as a new column. Select two countries from our dataset and draw a line plot of the changes in temperature over time. You only want to graph the years that have data for both of your countries of interest (Hint: You may want to use where and are.above() to select those years with data). There is no autocheck for this question, as you may all have different answers depending on the countries you pick.

In [7]:
import numpy as np
np.unique(temps['country'])

Out[7]:
array(['Afghanistan', 'Africa', 'Albania', 'Algeria', 'American Samoa', 'Andorra', 'Angola', 'Anguilla', 'Antigua And Barbuda', 'Argentina', 'Armenia', 'Aruba', 'Asia', 'Australia', 'Austria', 'Azerbaijan', 'Bahamas', 'Bahrain', 'Baker Island', 'Bangladesh', 'Barbados', 'Belarus', 'Belgium', 'Belize', 'Benin', 'Bhutan', 'Bolivia', 'Bonaire, Saint Eustatius And Saba', 'Bosnia And Herzegovina', 'Botswana', 'Brazil', 'British Virgin Islands', 'Bulgaria', 'Burkina Faso', 'Burma', 'Burundi', 'Cambodia', 'Cameroon', 'Canada', 'Cape Verde', 'Cayman Islands', 'Central African Republic', 'Chad', 'Chile', 'China',
'Christmas Island', 'Colombia', 'Comoros', 'Congo', 'Congo (Democratic Republic Of The)', 'Costa Rica', 'Croatia', 'Cuba', 'Curaçao', 'Cyprus', 'Czech Republic', "Côte D'Ivoire", 'Denmark', 'Denmark (Europe)', 'Djibouti', 'Dominica', 'Dominican Republic', 'Ecuador', 'Egypt', 'El Salvador', 'Equatorial Guinea', 'Eritrea', 'Estonia', 'Ethiopia', 'Europe', 'Falkland Islands (Islas Malvinas)', 'Faroe Islands', 'Federated States Of Micronesia', 'Fiji', 'Finland', 'France', 'France (Europe)', 'French Guiana', 'French Polynesia', 'French Southern And Antarctic Lands', 'Gabon', 'Gambia', 'Gaza Strip', 'Georgia', 'Germany', 'Ghana', 'Greece', 'Greenland', 'Grenada', 'Guadeloupe', 'Guam', 'Guatemala', 'Guernsey', 'Guinea', 'Guinea Bissau', 'Guyana', 'Haiti', 'Heard Island And Mcdonald Islands', 'Honduras', 'Hong Kong', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran', 'Iraq', 'Ireland', 'Isle Of Man', 'Israel', 'Italy', 'Jamaica', 'Japan', 'Jersey', 'Jordan', 'Kazakhstan', 'Kenya', 'Kingman Reef', 'Kiribati', 'Kuwait', 'Kyrgyzstan', 'Laos', 'Latvia', 'Lebanon', 'Lesotho', 'Liberia', 'Libya', 'Liechtenstein', 'Lithuania', 'Luxembourg', 'Macau', 'Macedonia', 'Madagascar', 'Malawi', 'Malaysia', 'Mali', 'Malta', 'Martinique', 'Mauritania', 'Mauritius', 'Mayotte', 'Mexico', 'Moldova', 'Monaco', 'Mongolia', 'Montenegro', 'Montserrat', 'Morocco', 'Mozambique', 'Namibia', 'Nepal', 'Netherlands', 'Netherlands (Europe)', 'New Caledonia', 'New Zealand', 'Nicaragua', 'Niger', 'Nigeria', 'Niue', 'North America', 'North Korea', 'Northern Mariana Islands', 'Norway', 'Oceania', 'Oman', 'Pakistan', 'Palau', 'Palestina', 'Palmyra Atoll', 'Panama', 'Papua New Guinea', 'Paraguay', 'Peru', 'Philippines', 'Poland', 'Portugal', 'Puerto Rico', 'Qatar', 'Reunion', 'Romania', 'Russia', 'Rwanda', 'Saint Barthélemy', 'Saint Kitts And Nevis', 'Saint Lucia', 'Saint Martin', 'Saint Pierre And Miquelon', 'Saint Vincent And The Grenadines', 'Samoa', 'San Marino', 'Sao Tome And Principe', 'Saudi Arabia', 
'Senegal', 'Serbia', 'Seychelles', 'Sierra Leone', 'Singapore', 'Sint Maarten', 'Slovakia', 'Slovenia', 'Solomon Islands', 'Somalia', 'South Africa', 'South America', 'South Georgia And The South Sandwich Isla', 'South Korea', 'Spain', 'Sri Lanka', 'Sudan', 'Suriname', 'Svalbard And Jan Mayen', 'Swaziland', 'Sweden', 'Switzerland', 'Syria', 'Taiwan', 'Tajikistan', 'Tanzania', 'Thailand', 'Timor Leste', 'Togo', 'Tonga', 'Trinidad And Tobago', 'Tunisia', 'Turkey', 'Turkmenistan', 'Turks And Caicas Islands', 'Uganda', 'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United Kingdom (Europe)', 'United States', 'Uruguay', 'Uzbekistan', 'Venezuela', 'Vietnam', 'Virgin Islands', 'Western Sahara', 'Yemen', 'Zambia', 'Zimbabwe', 'Åland'], dtype='<U41')

In [8]:
pivotTable = temps.select('year', 'country', 'avg').pivot('country', 'year', 'avg', sum)
yourCountries = pivotTable.select('year','Germany','Ireland')
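The "years with data for both countries" filter from the Question 1 hint can be sketched in plain Python. This is only an illustration under an assumption: that a country-year with no data shows up as 0 in the pivot table, which is what the are.above() hint suggests. The rows for 1750 and 1751 are made up for the example. With the datascience library, the same idea would probably be chained where calls such as pivotTable.where('Germany', are.above(0)).where('Ireland', are.above(0)).

```python
# Hypothetical pivot rows: (year, Germany, Ireland).
# Assumption: 0 marks a year with no data for that country.
rows = [
    (1750, 0.0, 0.0),          # made-up row: neither country has data
    (1751, 0.0, 9.1),          # made-up row: only Ireland has data
    (1753, 8.02917, 9.36908),  # values from the lab's table
    (1754, 7.75983, 9.35642),
]

def years_with_both(pivot_rows):
    """Keep only rows where both country columns are above 0."""
    return [r for r in pivot_rows if r[1] > 0 and r[2] > 0]

both = years_with_both(rows)
print([r[0] for r in both])
```

A filter on "above 0" would also drop genuine sub-zero averages, so it only suits countries whose yearly averages stay positive, as Germany's and Ireland's do here.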
yourCountries.show() year Germany Ireland 1753 8.02917 9.36908 1754 7.75983 9.35642 1755 7.52 9.04967 1756 8.40875 9.50142 1757 8.10858 9.46283 1758 7.49608 8.50692 1759 8.41275 9.49833 1760 8.38017 8.82892 1761 8.59617 9.65783 1762 7.80592 9.09283 1763 7.71208 8.66508 1764 8.21042 9.03192 1765 7.88458 8.87775 1766 7.91758 9.01617 1767 7.437 9.16967 1768 7.4135 8.941 1769 7.79408 9.069 1770 7.78958 8.8345 1771 7.10917 8.95342 1772 8.60708 9.03833 1773 8.61833 8.995
year Germany Ireland 1774 7.84133 9.07517 1775 8.95067 9.81958 1776 7.72567 9.07183 1777 7.69033 9.23717 1778 8.70817 9.6155 1779 9.46975 10.4047 1780 8.10092 9.83517 1781 9.16917 10.1843 1782 7.5945 8.23492 1783 8.86842 9.3365 1784 6.907 8.49633 1785 6.75508 8.85283 1786 7.01508 8.71192 1787 8.382 9.55617 1788 7.77067 9.61858 1789 7.95517 9.41883 1790 8.32217 9.68358 1791 8.63342 9.56083 1792 8.2015 9.55342 1793 8.41233 9.502 1794 9.129 9.83508 1795 8.03758 9.2285
year Germany Ireland 1796 8.17908 9.40342 1797 9.02317 9.3285 1798 8.49575 9.83033 1799 6.31308 8.41825 1800 8.10667 9.27008 1801 8.81067 9.61617 1802 8.26 9.43033 1803 7.35425 9.26942 1804 7.83858 9.60583 1805 6.4745 9.31617 1806 8.89817 9.50075 1807 8.49675 8.72975 1808 7.31592 8.88542 1809 7.855 8.71733 1810 7.7455 8.57542 1811 9.14383 9.24142 1812 6.76292 8.44908 1813 7.68283 9.02358 1814 6.81 8.21408 1815 7.63192 8.77983 1816 6.76433 7.90617 1817 7.94908 8.70117
year Germany Ireland 1818 8.28083 9.58717 1819 8.58633 9.05358 1820 7.17875 8.82867 1821 8.21558 9.67442 1822 9.34867 9.785 1823 7.59567 8.71592 1824 8.56575 9.45258 1825 8.71142 9.79633 1826 8.55108 9.88317 1827 8.15158 9.49625 1828 8.41108 10.1043 1829 5.905 8.57183 1830 7.28958 9.02283 1831 8.4605 9.71025 1832 7.67192 9.22092 1833 8.11117 9.20342 1834 9.51108 10.0865 1835 7.95875 9.3285 1836 8.08342 8.69383 1837 7.30283 8.95083 1838 6.45075 8.33275 1839 7.9685 8.76225
year Germany Ireland 1840 7.02533 8.83408 1841 8.37258 8.88392 1842 7.81542 9.43342 1843 8.27758 9.28242 1844 7.21983 9.09325 1845 7.12217 8.76633 1846 9.16675 10.1389 1847 7.47967 9.45992 1848 8.10742 9.13842 1849 7.56467 9.28108 1850 7.41575 9.3615 1851 7.57817 9.39925 1852 8.75058 9.81867 1853 6.87308 8.797 1854 7.86633 9.55275 1855 6.45633 8.42758 1856 7.73117 9.23425 1857 8.5035 10.0288 1858 7.2865 9.38217 1859 8.85667 9.483 1860 7.06167 8.34942 1861 8.08917 9.45108
year Germany Ireland 1862 8.50383 9.0435 1863 8.88858 9.60842 1864 6.47225 9.14242 1865 8.22075 9.78317 1866 8.61125 9.28358 1867 7.79133 9.19408 1868 9.36025 9.99667 1869 8.13992 9.64317 1870 6.98175 9.25942 1871 6.58008 9.49625 1872 9.11858 9.476 1873 8.40683 9.13883 1874 8.01817 9.48467 1875 7.518 9.54658 1876 7.997 9.48375 1877 8.18242 9.26883 1878 8.293 9.48483 1879 6.66092 8.04358 1880 8.27658 9.30925 1881 7.25325 8.69408 1882 8.31658 9.318 1883 7.8375 8.92783
year Germany Ireland 1884 8.48617 9.49492 1885 7.72683 8.64883 1886 7.99058 8.73883 1887 6.99483 8.90733 1888 6.90817 8.716 1889 7.44325 9.25975 1890 7.37008 9.25683 1891 7.49325 8.86517 1892 7.58767 8.40533 1893 7.947 9.91825 1894 8.18025 9.281 1895 7.4225 8.70892 1896 7.68775 9.48142 1897 8.05067 9.53483 1898 8.61475 10.0273 1899 8.23675 9.87783 1900 8.45208 9.32717 1901 7.68108 9.11917 1902 7.25667 9.11017 1903 8.40242 9.15042 1904 8.40033 9.14775 1905 8.02742 9.32375
year Germany Ireland 1906 8.354 9.30325 1907 7.85258 8.97567 1908 7.52133 9.62708 1909 7.33683 8.92933 1910 8.41875 9.21342 1911 9.026 9.72125 1912 7.83742 9.27275 1913 8.50675 9.54842 1914 8.48692 9.72192 1915 7.87533 9.0475 1916 8.42692 9.37983 1917 7.4995 8.75958 1918 8.49358 9.57425 1919 7.30075 8.80692 1920 8.61567 9.63067 1921 9.01467 10.517 1922 7.17592 9.04992 1923 7.98092 9.25017 1924 7.5205 9.42175 1925 8.28917 9.18508 1926 8.7205 9.8275 1927 8.08017 9.41483
year Germany Ireland 1928 8.37158 9.61942 1929 7.40658 9.47533 1930 8.78275 9.31858 1931 7.59575 9.53625 1932 8.35317 9.75708 1933 7.70375 10.0422 1934 9.56967 10.0011 1935 8.47167 9.67033 1936 8.47642 9.64817 1937 8.71192 9.53908 1938 8.80833 10.0117 1939 8.45217 9.87492 1940 6.62508 9.68133 1941 7.13958 9.31383 1942 7.2385 9.48342 1943 8.9215 10.0567 1944 8.31758 9.798 1945 8.9005 10.4441 1946 8.38333 9.598 1947 8.45842 9.50275 1948 9.04242 10.0742 1949 9.1765 10.5766
year Germany Ireland 1950 8.65342 9.46183 1951 8.81642 9.22975 1952 8.00683 9.37725 1953 9.067 10.0196 1954 7.78883 9.438 1955 7.68408 9.62958 1956 6.9695 9.40517 1957 8.68608 10.0621 1958 8.32583 9.69983 1959 9.18275 10.4129 1960 8.51417 9.67342 1961 9.12417 9.81108 1962 7.32492 9.06442 1963 7.2205 8.78825 1964 8.23633 9.751 1965 7.633 9.06892 1966 8.64567 9.63133 1967 9.04167 9.49142 1968 8.32358 9.58883 1969 7.913 9.33275 1970 7.85608 9.60358 1971 8.60017 10.0866
year Germany Ireland 1972 8.05483 9.23958 1973 8.40283 9.81667 1974 8.97042 9.47867 1975 9.09367 10.0637 1976 8.6505 9.86158 1977 8.80483 9.61225 1978 7.93275 9.64858 1979 7.90167 8.8925 1980 7.72725 9.54533 1981 8.3205 9.59092 1982 9.07333 9.84675 1983 9.18567 10.0795 1984 8.123 9.78458 1985 7.61333 9.11067 1986 8.08508 8.73333 1987 7.58925 9.43867 1988 9.19558 9.86533 1989 9.67983 10.2957 1990 9.6895 10.3127 1991 8.52308 9.82517 1992 9.50583 9.789 1993 8.6335 9.64158
year Germany Ireland 1994 9.847 9.88483 1995 9.0335 10.5189 1996 7.37442 9.59117 1997 9.04867 10.6835 1998 9.17175 10.4595 1999 9.62075 10.4582 2000 10.0204 10.1308 2001 9.09042 10.0231 2002 9.63733 10.4207 2003 9.49425 10.5224 2004 9.05342 10.4391 2005 9.12842 10.5757 2006 9.706 10.6542 2007 9.9975 10.8427 2008 9.64217 10.1587 2009 9.35917 10.1572 2010 8.009 9.25817 2011 9.81958 10.3454 2012 9.22717 10.0358 In [13]: yourCountries = pivotTable.select('year','Germany','Ireland') yourCountries.plot('year') In [15]: yourCountries = pivotTable.select('year','Germany','Ireland') yourCountries.plot('year') Question 1 continued: In this markdown cell, explain an observation you see from the
figure you generated.

...

Question 2: Let's visualize the change in temperature for the United States.

In [47]:
print(temps)
print(temps.labels)
us = temps.where("country","United States")
plt.figure(figsize = (10, 5))
plt.plot(us['year'].astype(int), us['avg'])
plt.xticks(np.arange(1750, 2025, 25))
plt.show()

year | country | avg | jan | feb | mar | apr | may | jun | jul | aug | sep | oct | nov | dec
1753 | Åland | 5.11833 | -2.412 | -3.273 | 0.71 | 2.778 | 6.226 | 11.102 | 15.159 | 15.786 | 12.106 | 8.756 | 2.248 | -7.766
1753 | Albania | 12.557 | 1.4 | 2.655 | 8.505 | 11.541 | 16.642 | 22.098 | 23.532 | 21.77 | 18.7 | 13.661 | 7.593 | 2.587
1753 | Andorra | 11.2345 | 0.938 | 4.083 | 8.352 | 9.165 | 13.783 | 19.796 | 21.148 | 18.796 | 16.546 | 11.706 | 5.991 | 4.51
1753 | Austria | 6.13892 | -6.398 | -3.537 | 2.681 | 6.498 | 11.331 | 16.209 | 16.881 | 14.751 | 12.34 | 7.072 | -0.011 | -4.15
1753 | Belarus | 5.65175 | -7.122 | -6.956 | 0.706 | 6.768 | 13.06 | 16.615 | 18.032 | 16.501 | 12.448 | 6.84 | 0.024 | -9.095
1753 | Belgium | 9.45708 | -1.215 | 2.443 | 6.838 | 8.826 | 13.042 | 17.602 | 18.072 | 16.203 | 14.761 | 10.332 | 4.143 | 2.438
1753 | Bosnia And Herzegovina | 10.3656 | -1.973 | 0.043 | 6.558 | 10.293 | 15.253 | 20.463 | 21.589 | 19.567 | 16.607 | 11.343 | 4.776 | -0.132
1753 | Bulgaria | 10.3995 | -1.841 | -0.883 | 5.889 | 10.043 | 15.723 | 20.69 | 22.084 | 20.446 | 17.03 | 11.668 | 5.201 | -1.256
1753 | Croatia | 11.2875 | -1.311 | 1.012 | 7.5 | 11.318 | 16.33 | 21.666 | 22.721 | 20.52 | 17.527 | 12.113 | 5.342 | 0.712
1753 | Czech Republic | 7.49492 | -4.72 | -2.339 | 3.956 | 8.117 | 12.84 | 17.184 | 18.175 | 16.392 | 13.701 | 8.464 | 1.475 | -3.306
...
(44320 rows omitted)
('year', 'country', 'avg', 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec')

In [20]:
check('tests/q2.py')

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[20], line 1
----> 1 check('tests/q2.py')
NameError: name 'check' is not defined

Question 3: Null and alternative hypothesis. We may be curious whether globally temperatures are more likely to increase or decrease on average. Based on our preliminary figures and what we know about creating good hypotheses, set the null and alternative hypothesis below:

Null hypothesis: ...
Alternative hypothesis: ...

To test the null hypothesis we're interested in identifying whether the temperature
increased or decreased in each time period. Temperatures vary widely across countries and years, presumably due to the vast array of differences among climates and human intervention. Rather than attempting to analyze the temperatures themselves, here we will restrict our analysis to whether temperatures increased or decreased over certain time spans. We will not concern ourselves with how much temperatures increased or decreased, only with the direction of the changes.

The np.diff function computes the differences between adjacent items of a list or array:

[item 1 - item 0 , item 2 - item 1 , item 3 - item 2, ...]

Instead, we may wish to compute the difference between items that are two positions apart. For example, given a 5-element array, we may want:

[item 2 - item 0 , item 3 - item 1 , item 4 - item 2]

The diff_n function below computes this result. Don't worry if the implementation uses unfamiliar features of Python, as long as you understand its behavior.

In [24]:
def diff_n(values, n):
    return np.array(values)[n:] - np.array(values)[:-n]

diff_n(make_array(1, 10, 100, 1000, 10000), 2)

Out[24]:
array([ 99, 990, 9900])

Question 4: Implement the function changes that takes an array of temperatures for a country, ordered by increasing year. For all two-year periods (e.g., from 1960 to 1962), it computes and returns the number of increases minus the number of decreases.

For example, the array r = make_array(10, 7, 12, 9, 13, 9, 11) contains 3 increases (10 to 12, 7 to 9, and 12 to 13), 1 decrease (13 to 11), and 1 change that is neither an increase nor a decrease (9 to 9). Therefore, changes(r) would return 2, the difference between 3 increases and 1 decrease.

Hint: Consider using the diff_n function together with boolean comparisons and np.count_nonzero: count the elements of the diff_n result that represent increases, and separately count those that represent decreases.
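The worked example for r can be checked by hand with a plain-list stand-in for diff_n. This is a sketch for illustration only: diff_n_plain and count_changes are hypothetical names, not part of the lab, and they avoid numpy so the arithmetic is easy to follow.

```python
def diff_n_plain(values, n):
    """Differences between items n positions apart (plain-list version)."""
    return [later - earlier for earlier, later in zip(values, values[n:])]

def count_changes(values, n=2):
    """Number of increases minus number of decreases over n-step periods."""
    diffs = diff_n_plain(values, n)
    increases = sum(1 for d in diffs if d > 0)
    decreases = sum(1 for d in diffs if d < 0)
    return increases - decreases

r = [10, 7, 12, 9, 13, 9, 11]
print(diff_n_plain(r, 2))   # [2, 2, 1, 0, -2]
print(count_changes(r))     # 2
```

The zero in the middle of the differences is the 9-to-9 pair. It counts as neither an increase nor a decrease, so it does not affect the result.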
In [55]:
def changes(rates, years=2):
    differences = diff_n(rates, years)
    greater = np.count_nonzero(differences > 0)
    less = np.count_nonzero(differences < 0)
    return greater - less

In [53]:
check('tests/q4.py')

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[53], line 1
----> 1 check('tests/q4.py')
NameError: name 'check' is not defined

Question 5: Assign changes_by_country to a table with one row per country that has two columns: the Country name and the Temperature changes statistic computed across all years in our data set for that country. It may be useful to split this process into two
steps. The final table's first 2 rows should look like this:

country | avg changes
Afghanistan | 18
Africa | 8

Hint: You can use the group method to apply your changes function to each column in the original data set while grouping on each country. See this example from Olympic data below:

In [ ]:
NORUSA = Table.read_table('NORUSA.csv')
NORUSA_NUMBERS = NORUSA.group(['Year','Team']) # Number of athletes per year
NORUSA_NUMBERS

Now compute the increases minus decreases for the Winter Olympics for each team. The code below groups by 'Team' across all the years of the Olympics to give the following table.

|Team| Year changes | count changes|
|----|---|---|
|Norway|20|10|
|United States|20|18|

Apply this concept to create the table showing net change for each country.

In [ ]:
NORUSA_NUMBERS.group('Team',changes)

In [29]:
countries = temps.group("country", changes)
changes_by_country = countries.select('country', 'avg changes')
changes_by_country

Out[29]:
country | avg changes
Afghanistan | 18
Africa | 8
Albania | -22
Algeria | 9
American Samoa | -3
Andorra | 10
Angola | -1
Anguilla | 6
Antigua And Barbuda | 0
Argentina | -1
... (232 rows omitted)

In [ ]:
check('tests/q5.py')

Question 6: Assign test_stat to the total increases minus the total decreases for all two-year periods and all countries in our data set. For example, if the temperature in Albania went up 23 times and fell 17 times, the total change for Albania would be 6. We want the total value for all the countries together.

In [31]:
test_stat = sum(changes_by_country.column('avg changes'))
print('Total increases minus total decreases, across all countries and years:', test_stat)

Total increases minus total decreases, across all countries and years: 1140

In [ ]:
check('tests/q6.py')

"More increases than decreases," one person exclaims, "Temperatures tend to go up across two-year periods. What dire times we live in."

"Not so fast," another person replies, "Even if temperatures just moved up and down uniformly at random, there would be some difference between the increases and decreases. There were a lot of countries and a lot of years, so there were many chances for changes to happen. If country temperatures increase and decrease at random with equal probability, perhaps this difference was simply due to chance!"

Based on the null hypothesis above that country temperatures increase and decrease by chance, we can simulate our test statistic. Our test statistic should depend only on whether temperature increased or decreased, not on the size of any change. Thus we choose:

Test Statistic: The number of increases minus the number of decreases

The cell below samples increases and decreases at random from a uniform distribution 100 times. The final column of the resulting table gives the number of increases and decreases that resulted from sampling in this way. Using sample_from_distribution is faster than using sample followed by group to compute the same result.
In [32]:
uniform = Table().with_columns(
    "Change", make_array('Increase', 'Decrease'),
    "Chance", make_array(0.5, 0.5))
uniform.sample_from_distribution('Chance', 100)

Out[32]:
Change | Chance | Chance sample
Increase | 0.5 | 52
Decrease | 0.5 | 48

Question 7: Complete the simulation below, which samples num_changes increases/decreases at random many times and forms an empirical distribution of your test statistic under the null hypothesis. Your job is to fill in the function simulate_under_null, which simulates a single sample under the null hypothesis, and to fill in its argument when it is called below. As a hint, num_changes should be approximately the number of countries times the number of time comparisons (you can find the number of year comparisons by using diff_n()).
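One way to read the num_changes hint is to count how many two-year comparisons each country contributes and add them up. The sketch below assumes a country with k consecutive yearly values yields k - 2 comparisons, which is the length of a diff_n result with n = 2. The function name and the per-country year counts are hypothetical, and this is not necessarily the count the autograder expects.

```python
def num_comparisons(years_per_country, n=2):
    """Total number of n-year comparisons across all countries.

    years_per_country maps a country name to how many yearly values it has.
    A country with fewer than n + 1 values contributes no comparisons.
    """
    return sum(max(k - n, 0) for k in years_per_country.values())

# Hypothetical counts, for illustration only.
years_per_country = {"Country A": 260, "Country B": 150, "Country C": 1}
print(num_comparisons(years_per_country))  # 258 + 148 + 0 = 406
```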
In [36]:
def simulate_under_null(num_chances_to_change):
    """Simulates some number changing several times, with an equal
    chance to increase or decrease. Returns the value of your test
    statistic for these simulated changes.

    num_chances_to_change is the number of times the number changes.
    """
    uniform = Table().with_columns(
        "Change", make_array('Increase', 'Decrease'),
        "Chance", make_array(0.5, 0.5)
    )
    sample = uniform.sample_from_distribution('Chance', num_chances_to_change)
    increase_num = sample.column('Chance sample').item(0)
    decrease_num = sample.column('Chance sample').item(1)
    return increase_num - decrease_num

result = simulate_under_null(100)  # Replace 100 with the desired value of num_chances_to_change
print("Result:", result)

Result: -10

In [57]:
def empirical_distribution(tbl):
    num_changes = len(tbl.group('country', list).apply(changes, 'avg list')) * len(tbl)
    samples = make_array()
    for i in np.arange(10000):
        samples = np.append(samples, simulate_under_null(num_changes))
    return samples

In [59]:
samples = empirical_distribution(temps)

In [ ]:
check('tests/q7.py')

Question 8: Complete the analysis as follows:

1. Compute a P-value. (Hint: you can use np.count_nonzero())
2. Using a 5% P-value cutoff, draw a conclusion about the null and alternative hypotheses.
3. Describe your findings using simple, non-technical language. What does your analysis tell you about temperature changes over time? What can you claim about causation from your statistical analysis?

P-value: ...
Conclusion about the hypotheses: ...
Findings: ...

In [60]:
pvalue = np.count_nonzero(samples >= test_stat)/10000
pvalue

Out[60]:
0.0
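The simulation and P-value steps can be sketched in plain Python with the standard random module instead of the datascience table. This is a minimal sketch, not the lab's implementation: the function names, the seed, and the number of trials are choices made for the example. Each simulated change is an increase with probability 0.5, and the P-value is the fraction of simulated statistics at least as large as the observed one.

```python
import random

def simulate_stat(num_changes, rng):
    """One simulated test statistic: increases minus decreases."""
    increases = sum(1 for _ in range(num_changes) if rng.random() < 0.5)
    decreases = num_changes - increases
    return increases - decreases

def empirical_p_value(observed, num_changes, trials=2000, seed=0):
    """Fraction of simulated statistics that are >= the observed statistic."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    stats = [simulate_stat(num_changes, rng) for _ in range(trials)]
    return sum(1 for s in stats if s >= observed) / trials

print(empirical_p_value(observed=0, num_changes=100))
print(empirical_p_value(observed=60, num_changes=100))
```

An empirical P-value of exactly 0.0 only means that none of the simulated statistics reached the observed value. With 10,000 trials, that is more accurately reported as "less than 1 in 10,000" than as zero.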
Some countries have implemented policies and laws to counteract climate change whereas others have not. We have a table that contains a boolean indicating whether a country has any policies or laws to protect the earth, along with the number of policies and laws implemented in that country. We can test whether the countries that have implemented policies to counteract climate change show any difference in temperature changes from the countries that have not.

A natural experiment happens when something other than experimental design applies a treatment to one group and not to another (control) group, and we have some hope that the treatment and control groups don't have any other systematic differences. This is likely not the case globally, but if we did believe that the countries didn't have other systematic differences, how would we set up the experiment?

Data Source: Climate Change Laws of the World

Question 9: Describe this investigation in terms of an experiment. What population are we studying? What is the control group? What is the treatment group? What outcome are we measuring? Be precise! Write your answers below.

• Population: The countries in our data set
• Control: Countries that have not implemented any laws or policies to counteract climate change
• Treatment: Countries that have implemented laws or policies to counteract climate change
• Outcome: Whether temperatures increased or decreased over two-year periods (increases minus decreases), compared between countries that have laws and countries that don't

In [61]:
laws = Table.read_table('laws.csv')
laws

Out[61]:
country | haveLaws | numberLaws
Afghanistan | True | 14
Africa | False | 0
Albania | True | 4
Algeria | True | 14
American Samoa | False | 0
Andorra | True | 25
Angola | True | 23
Anguilla | False | 0
Antigua And Barbuda | False | 0
Argentina | True | 19
... (232 rows omitted)

Question 10: Let's set up to compute an empirical distribution for countries that have laws and policies that attempt to counteract climate change and an empirical distribution for countries that have not implemented laws and policies. We want to focus on the time
range between 1990 and 2020 as the majority of laws were implemented in this time period. We're going to split this up into three steps.

1. Combine the temperature table and the laws table.
2. Set year_range to the correct time period.
3. Create two tables: one for countries that have climate change laws and one for countries that do not.

In [62]:
temp_law = temps.join('country', laws)
year_range = temp_law.where('year',are.between(1990,2020))
haveLaws = year_range.where('haveLaws',True)
noLaws = year_range.where('haveLaws',False)

In [ ]:
check('tests/q10.py')

Question 11: Calculate the test statistic for both subsets of countries: those that have implemented climate change laws and those that have not implemented these laws.

In [63]:
laws_test_stat = sum(haveLaws.group('country',changes).column('avg changes'))
laws_test_stat

Out[63]:
199

In [64]:
nolaws_test_stat = sum(noLaws.group('country',changes).column('avg changes'))
nolaws_test_stat

Out[64]:
61

In [ ]:
check('tests/q11.py')

Question 12: Now, using the tables from Question 10 and the test statistics calculated in Question 11, create an empirical distribution and calculate a p-value for each group.
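The three steps of Question 10 (join, restrict the years, split by the boolean) can be sketched in plain Python on a few made-up rows. All names and values here are hypothetical. The year filter is half-open, including 1990 and excluding 2020, on the assumption that this mirrors how are.between treats its bounds. Countries missing from the laws lookup are skipped, as an inner join would do.

```python
# Hypothetical temperature rows: (year, country, avg).
temps_rows = [
    (1989, "A", 9.0), (1990, "A", 9.2), (2005, "A", 9.5),
    (1995, "B", 20.1), (2020, "B", 20.4),
    (2000, "C", 15.0),  # "C" has no entry in the laws lookup
]
# Hypothetical laws lookup: country -> haveLaws.
laws_by_country = {"A": True, "B": False}

def split_by_laws(rows, laws_lookup, start=1990, end=2020):
    """Join rows to the laws lookup, keep start <= year < end, split by haveLaws."""
    have_laws, no_laws = [], []
    for year, country, avg in rows:
        if country not in laws_lookup:   # unmatched rows drop out of the join
            continue
        if not (start <= year < end):    # outside the year range
            continue
        target = have_laws if laws_lookup[country] else no_laws
        target.append((year, country, avg))
    return have_laws, no_laws

have_laws, no_laws = split_by_laws(temps_rows, laws_by_country)
print(have_laws)
print(no_laws)
```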
In [65]:
haveLawsSamples = empirical_distribution(haveLaws)
# Compare this group's simulated statistics with its own observed statistic.
lawsPvalue = np.count_nonzero(haveLawsSamples >= laws_test_stat)/10000
print("P-value for countries that have implemented policies to counteract climate change from 1990 to 2020: " + str(lawsPvalue))

In [66]:
# Simulate from the noLaws table, not haveLaws, and compare with nolaws_test_stat.
noLawsSamples = empirical_distribution(noLaws)
nolawsPvalue = np.count_nonzero(noLawsSamples >= nolaws_test_stat)/10000
print("P-value for countries that have NOT implemented policies to counteract climate change from 1990 to 2020: " + str(nolawsPvalue))

Question 13: Explain what our results show in the markdown cell below:

The two results for Question 12 share a lot of similarities. The test statistics share a similar p-value for climate change from 1990 to 2020.

In [68]:
import glob
from gofer.ok import check
correct = 0
checks = [2, 4, 5, 6, 7, 10, 11]
total = len(checks)
for x in checks:
    print('Testing question {}: '.format(str(x)))
    g = check('tests/q{}.py'.format(str(x)))
    if g.grade == 1.0:
        print("Passed")
        correct += 1
    else:
        print('Failed')
    display(g)

Testing question 2:
Passed
All tests passed!
Testing question 4:
Passed
All tests passed!
Testing question 5:
Passed
All tests passed!
Testing question 6:
Passed
All tests passed!
Testing question 7:
Passed
All tests passed!
Testing question 10:
Passed
All tests passed!
Testing question 11:
Passed
All tests passed!

In [ ]:
print("Nice work ",name)
import time
localtime = time.asctime( time.localtime(time.time()) )
print("Submitted @ ", localtime)