array([[  1960,  54211], [  1961,  55438], [  1962,  56225], [  1963,  56695],
       [  1964,  57032], [  1965,  57360], [  1966,  57715], [  1967,  58055],
       [  1968,  58386], [  1969,  58726], [  1970,  59063], [  1971,  59440],
       [  1972,  59840], [  1973,  60243], [  1974,  60528], [  1975,  60657],
       [  1976,  60586], [  1977,  60366], [  1978,  60103], [  1979,  59980],
       [  1980,  60096], [  1981,  60567], [  1982,  61345], [  1983,  62201],
       [  1984,  62836], [  1985,  63026], [  1986,  62644], [  1987,  61833],
       [  1988,  61079], [  1989,  61032], [  1990,  62149], [  1991,  64622],
       [  1992,  68235], [  1993,  72504], [  1994,  76700], [  1995,  80324],
       [  1996,  83200], [  1997,  85451], [  1998,  87277], [  1999,  89005],
       [  2000,  90853], [  2001,  92898], [  2002,  94992], [  2003,  97017],
       [  2004,  98737], [  2005, 100031], [  2006, 100832], [  2007, 101220],
       [  2008, 101353], [  2009, 101453], [  2010, 101669], [  2011, 102053],
       [  2012, 102577], [  2013, 103187], [  2014, 103795], [  2015, 104341],
       [  2016, 104822], [  2017, 105264]])



Question 2

Now that we have our data, we need to split it into a training set and a testing set. But before we split our data into training and testing, we also need to split it into the predictive features (denoted X) and the response (denoted y).

Write a function that takes a 2-d numpy array as input and returns four variables in the form (X_train, y_train), (X_test, y_test), where (X_train, y_train) are the features + response of the training set, and (X_test, y_test) are the features + response of the testing set.

Function Specifications:

Should take a 2-d numpy array as input.
Should split the array such that X is the year, and y is the corresponding population.
Should return two tuples of the form (X_train, y_train), (X_test, y_test).
Should use sklearn's train_test_split function with a test_size = 0.2 and random_state = 42.

Failing Code:

def feature_response_split(arr):
    X, y = arr[:, 0], arr[:, 1]
    X_train, X_test, y_train, y_test = train_test_split(X.reshape(-1,1), y, test_size = 0.2, random_state = 42)
    return (X_train, y_train, X_test,y_test)
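
To see where the posted function departs from the specification, it can help to inspect what it actually returns. The sketch below runs the function as posted on the first ten rows of the array from the question (the ten-row subset and the name `result` are assumptions made here to keep the example short). It suggests two likely causes of the failure: the return value is one flat 4-tuple rather than two tuples, and `X.reshape(-1,1)` makes the feature arrays 2-d, whereas the expected output shows 1-d arrays.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# First ten rows of the array from the question: (year, population).
arr = np.array([[1960, 54211], [1961, 55438], [1962, 56225], [1963, 56695],
                [1964, 57032], [1965, 57360], [1966, 57715], [1967, 58055],
                [1968, 58386], [1969, 58726]])

def feature_response_split(arr):
    # The function as posted in the question, unchanged in behavior.
    X, y = arr[:, 0], arr[:, 1]
    X_train, X_test, y_train, y_test = train_test_split(
        X.reshape(-1, 1), y, test_size=0.2, random_state=42)
    return (X_train, y_train, X_test, y_test)

result = feature_response_split(arr)
print(len(result))       # 4: one flat tuple, not two nested tuples
print(result[0].shape)   # (8, 1): X_train is 2-d because of the reshape
print(result[1].shape)   # (8,): y_train is 1-d
```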

Expected Output:

X_train == array([1996, 1991, 1968, 1977, 1966, 1964, 2001, 1979, 1990, 2009, 2010, 2014, 1975, 1969, 1987, 1986, 1976, 1984, 1993, 2015, 2000, 1971, 1992, 2016, 2003, 1989, 2013, 1961, 1981, 1962, 2005, 1999, 1995, 1983, 2007, 1970, 1982, 1978, 2017, 1980, 1967, 2002, 1974, 1988, 2011, 1998])

y_train == array([ 83200, 64622, 58386, 60366, 57715, 57032, 92898, 59980, 62149, 101453, 101669, 103795, 60657, 58726, 61833, 62644, 60586, 62836, 72504, 104341, 90853, 59440, 68235, 104822, 97017, 61032, 103187, 55438, 60567, 56225, 100031, 89005, 80324, 62201, 101220, 59063, 61345, 60103, 105264, 60096, 58055, 94992, 60528, 61079, 102053, 87277])

X_test == array([1960, 1965, 1994, 1973, 2004, 2012, 1997, 1985, 2006, 1972, 2008, 1963])

y_test == array([ 54211, 57360, 76700, 60243, 98737, 102577, 85451, 63026, 100832, 59840, 101353, 56695])
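
One possible correction, offered as a sketch rather than the graded solution: it assumes the failure comes from the return shape and from the 2-d reshape of X, since the expected arrays above are all 1-d. The function keeps X as a 1-d column and returns two nested tuples. The demo again uses only the first ten rows of the array from the question (an assumption made for brevity), so its split will not reproduce the expected output listed above; calling the function on the full 58-row array is what the question intends.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def feature_response_split(arr):
    # Column 0 is the year (feature), column 1 is the population (response).
    X, y = arr[:, 0], arr[:, 1]
    # Keep X 1-d so the outputs have the same shape as the expected arrays.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)
    # Return two tuples, as the specification asks.
    return (X_train, y_train), (X_test, y_test)

# First ten rows of the array from the question: (year, population).
arr = np.array([[1960, 54211], [1961, 55438], [1962, 56225], [1963, 56695],
                [1964, 57032], [1965, 57360], [1966, 57715], [1967, 58055],
                [1968, 58386], [1969, 58726]])

(X_train, y_train), (X_test, y_test) = feature_response_split(arr)
print(X_train.shape, X_test.shape)   # (8,) (2,)
```

With the full array, 20% of 58 rows rounds up to 12 test rows, leaving 46 for training, which agrees with the lengths of the expected arrays.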
