Question 1

Suppose that you have been given a text file unis.csv containing lines of comma-separated data about some universities, and how their graduates report on outcomes from the education. Here are the data fields (also called data attributes):

Field name          Description
UniName             Abbreviation of the University's name
State               Abbreviation of the state where the University is mostly located
Employment(2018)    Percentage of 2018 graduates in full-time employment, three months after graduation
Employment(2019)    Percentage of 2019 graduates in full-time employment, three months after graduation

The first few lines look like this (note that the first line is a header, and also note that the fields do not themselves contain any commas):

UniName,State,Employment(2018),Employment(2019)
CQU,QLD,79.1,79.6
Curtin,WA,72.4,71.4
Deakin,VIC,72.8,73.4

Suppose that you are part of a team whose task is to analyse the data in unis.csv to calculate the following: place the values for Employment(2018) into bins, with each bin representing a range of 5 (for example, 70 to 75, 75 to 80, etc). For each bin, find how many states contain universities whose Employment(2018) score is within that bin.

Provide well-commented Python code that will perform this calculation. You do not need to deal with misformatted files or other errors. You are allowed to use a library like Pandas, but this is not required. It is important that your comments should clearly describe the structure used for storing the data in your program (e.g. if you use a dictionary, you must explain what the keys and values represent; if you use Pandas, you must indicate the indices of the dataframes your code refers to, etc).
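One possible approach, sketched with only the standard library: keep a dictionary that maps each bin to the set of states seen in it, then report the size of each set. The function name count_states_per_bin and the constant BIN_WIDTH are illustrative choices, not part of the question. The sketch also assumes that a score exactly on a boundary (such as 75.0) belongs to the higher bin, since the question does not say which way boundaries go.

```python
import csv
import io

BIN_WIDTH = 5  # each bin covers a range of 5 percentage points


def count_states_per_bin(lines, bin_width=BIN_WIDTH):
    """Return {bin lower edge: number of distinct states in that bin}.

    `lines` is any iterable of CSV text lines (for example an open file),
    with the header on the first line.
    """
    # states_in_bin is a dictionary.
    #   key:   lower edge of a bin as an int, e.g. 70 for 70 <= score < 75
    #          (assumption: a boundary score such as 75.0 goes in the higher bin)
    #   value: set of state abbreviations that have at least one university
    #          whose Employment(2018) score falls in that bin
    states_in_bin = {}

    # skipinitialspace tolerates "Curtin, WA, 72.4" as well as "CQU,QLD,79.1"
    reader = csv.reader(lines, skipinitialspace=True)
    next(reader)  # skip the header line
    for row in reader:
        if not row:  # ignore blank lines
            continue
        state = row[1].strip()   # column 1: State
        score = float(row[2])    # column 2: Employment(2018)
        lower_edge = int(score // bin_width) * bin_width
        # A set stores each state once, however many of its universities
        # land in the same bin.
        states_in_bin.setdefault(lower_edge, set()).add(state)

    # Replace each set by its size and order the bins from low to high.
    return {edge: len(states) for edge, states in sorted(states_in_bin.items())}


# Demo on the three sample rows shown in the question.
sample = io.StringIO(
    "UniName,State,Employment(2018),Employment(2019)\n"
    "CQU,QLD,79.1,79.6\n"
    "Curtin,WA,72.4,71.4\n"
    "Deakin,VIC,72.8,73.4\n"
)
for edge, n_states in count_states_per_bin(sample).items():
    print(f"{edge} to {edge + BIN_WIDTH}: {n_states} state(s)")
# 70 to 75: 2 state(s)
# 75 to 80: 1 state(s)
```

To run it on the real file, open unis.csv (for instance with open("unis.csv", newline="")) and pass the file object to count_states_per_bin in place of the sample. Using a set per bin is what makes this a count of states rather than a count of universities.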
