population_df = pd.read_csv('https://raw.githubusercontent.com/Explore-AI/Public-Data/master/AnalyseProject/world_population.csv', index_col='Country Code') meta_df = pd.read_csv('https://raw.githubusercontent.com/Explore-AI/Public-Data/master/AnalyseProject/metadata.csv', index_col='Country Code')   Question 1   As we've seen previously, the world population data spans from 1960 to 2017. We'd like to build a predictive model that can give us the best guess at what the world population in a given year was. However, as a slight twist this time, we want to compute this estimate for only countries within a given income group. First, however, we need to organise our data such that the sklearn's RandomForestRegressor class can train on our data. To do this, we will write a function that takes as input an income group and return a 2-d numpy array that contains the year and the measured population. Function Specifications: Should take a str argument, called income_group_name as input and return a numpy array type as output. Set the default argument of income_group_name to equal 'Low income'. If the specified value of income_group_name does not exist, the function must raise a ValueError. The array should only have two columns containing the year and the population, in other words, it should have a shape (?, 2) where ? is the length of the data. The values within the array should be of type np.int64. Further Reading: Data types are associated with memory allocation. As such, your choice of data type affects the precision of computations in your program. For example, the np.int data type in numpy can only store values between -2147483648 to 2147483647 and assigning values outside this range for variables of this data type may cause run-time errors. To avoid this, we can use data types with larger memory capacity e.g. np.int64. https://docs.scipy.org/doc/numpy/user/basics.types.html   ### START FUNCTION def get_total_pop_by_income(income_group_name='Low income'):     # your code here     return ### END FUNCTION

Database System Concepts
7th Edition
ISBN:9780078022159
Author:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Chapter1: Introduction
Section: Chapter Questions
Problem 1PE
icon
Related questions
Question

population_df = pd.read_csv('https://raw.githubusercontent.com/Explore-AI/Public-Data/master/AnalyseProject/world_population.csv', index_col='Country Code')
meta_df = pd.read_csv('https://raw.githubusercontent.com/Explore-AI/Public-Data/master/AnalyseProject/metadata.csv', index_col='Country Code')

 

Question 1

 

As we've seen previously, the world population data spans from 1960 to 2017. We'd like to build a predictive model that can give us the best guess at what the world population in a given year was. However, as a slight twist this time, we want to compute this estimate for only countries within a given income group.

First, however, we need to organise our data such that the sklearn's RandomForestRegressor class can train on our data. To do this, we will write a function that takes as input an income group and return a 2-d numpy array that contains the year and the measured population.

Function Specifications:

  • Should take a str argument, called income_group_name as input and return a numpy array type as output.
  • Set the default argument of income_group_name to equal 'Low income'.
  • If the specified value of income_group_name does not exist, the function must raise a ValueError.
  • The array should only have two columns containing the year and the population, in other words, it should have a shape (?, 2) where ? is the length of the data.
  • The values within the array should be of type np.int64.

Further Reading:

Data types are associated with memory allocation. As such, your choice of data type affects the precision of computations in your program. For example, the np.int data type in numpy can only store values between -2147483648 to 2147483647 and assigning values outside this range for variables of this data type may cause run-time errors. To avoid this, we can use data types with larger memory capacity e.g. np.int64.

https://docs.scipy.org/doc/numpy/user/basics.types.html

 

### START FUNCTION
def get_total_pop_by_income(income_group_name='Low income'):
    # your code here
    return

### END FUNCTION

Expert Solution
steps

Step by step

Solved in 2 steps

Blurred answer
Knowledge Booster
Elements of Tables
Learn more about
Need a deep-dive on the concept behind this application? Look no further. Learn more about this topic, computer-science and related others by exploring similar questions and additional content below.
Similar questions
Recommended textbooks for you
Database System Concepts
Database System Concepts
Computer Science
ISBN:
9780078022159
Author:
Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher:
McGraw-Hill Education
Starting Out with Python (4th Edition)
Starting Out with Python (4th Edition)
Computer Science
ISBN:
9780134444321
Author:
Tony Gaddis
Publisher:
PEARSON
Digital Fundamentals (11th Edition)
Digital Fundamentals (11th Edition)
Computer Science
ISBN:
9780132737968
Author:
Thomas L. Floyd
Publisher:
PEARSON
C How to Program (8th Edition)
C How to Program (8th Edition)
Computer Science
ISBN:
9780133976892
Author:
Paul J. Deitel, Harvey Deitel
Publisher:
PEARSON
Database Systems: Design, Implementation, & Manag…
Database Systems: Design, Implementation, & Manag…
Computer Science
ISBN:
9781337627900
Author:
Carlos Coronel, Steven Morris
Publisher:
Cengage Learning
Programmable Logic Controllers
Programmable Logic Controllers
Computer Science
ISBN:
9780073373843
Author:
Frank D. Petruzella
Publisher:
McGraw-Hill Education