

PYTHON PROGRAMMING

Files are here: http://www.cse.msu.edu/~cse231/Online/Projects/Project05/

 

This exercise is about data manipulation.

 

In this project, we will process agricultural data, namely the adoption of different Genetically Modified (GM) crops in the U.S. The data was collected over the years 2000-2016.

 

In this project, we are interested in how the adoption of different GM food and non-food crops has been proceeding in different states. We are going to determine the minimum and maximum adoption by state and the years when the minimum and maximum occurred.

 

Assignment Specifications:

 

The data files: The files that you use contain yearly state-wise percentage plantings for each type of crop:

• alltablesGEcrops.csv: the actual data from the USDA.

• data1.csv: data modified to test this project.

• data2.csv: data modified to test this project, but you do not get to see it.

 

Input: This is real data so it is messy and you have to deal with that. Here are the constraints.

 

• All files have the same header, the same number of columns, and are comma-separated (i.e. csv format).

 

• You do not know ahead of time which crops are in the file ("Crop" column at index 1). Your program needs to determine this.

For example, the data2.csv file has different crops!

 

• Missouri for some unknown reason has extraneous characters in the state name on some rows. You need to handle those.

 

• The last column, the 'Value' column at index 6 with the percentage data, sometimes has digits, sometimes is blank, and sometimes has extraneous characters. You need to account for that. We are only interested in rows that have digits in this last column.

 

• We are only interested in data for the 50 states, 'State' column at index 0. All other rows are to be ignored. We provide a list of state names in the file state_list.txt that you can copy into your program. Note that most states have no entry!

 

• We are only interested in "All GE varieties" indicated in the 'Variety' column at index 3. All rows with other values in that column are to be ignored.

Output: Here are the specifications for the output.

 

• Output is by crop, ordered alphabetically by crop name as it appears in the 'Crop' column at index 1.

 

• For each crop the data is output alphabetically by state name. Each line has these values: state, max year, max value, min year, min value.

I used this format string: "{:<20s}{:<8s}{:<6d}{:<8s}{:<6d}"

 

• When determining the minimum and maximum values, there are often multiple years with the same value. If multiple years share the minimum (or maximum) value, output the lowest (earliest) of those years.
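The tie-breaking rule and the output format can be sketched together. This is only an illustration under stated assumptions: the helper name min_max_by_year, the {year: value} layout, and the sample percentages are all hypothetical and do not come from the data files.

```python
def min_max_by_year(year_values):
    """Return (max_year, max_value, min_year, min_value) for a {year: value} dict.

    Years are visited in ascending order and only a strictly better value
    replaces the current best, so ties keep the earliest year.
    """
    years = sorted(year_values)
    max_year = min_year = years[0]
    for year in years[1:]:
        if year_values[year] > year_values[max_year]:
            max_year = year
        if year_values[year] < year_values[min_year]:
            min_year = year
    return max_year, year_values[max_year], min_year, year_values[min_year]


# Hypothetical percentages for one state and one crop (made up for illustration).
sample = {2000: 54, 2001: 60, 2002: 54, 2003: 60}
max_year, max_value, min_year, min_value = min_max_by_year(sample)

# The format string uses "s" for the years, so they are converted to str here.
line = "{:<20s}{:<8s}{:<6d}{:<8s}{:<6d}".format(
    "Alabama", str(max_year), max_value, str(min_year), min_value)
print(line)
```

Because the comparison is strict, 2001 wins over 2003 for the maximum and 2000 wins over 2002 for the minimum in this sample.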

 

Specific Requirements :

These are additional requirements that you need to satisfy to ensure a full score:

 

1. Items 1-9 of the Coding Standard will be enforced for this project

 

2. You must use dictionaries in the solution, and they must be used in a meaningful way. (I read into a dictionary indexed by state, where each state had a dictionary by crop with years and percentages. To prepare for output I used a second dictionary indexed by crop.)

 

3. There must be at least 4 functions that must perform non-identical operations, and do them in a meaningful way.

 

a. There must be a function named open_file() that takes no arguments and returns a file pointer. If a file is not found, you must display a proper message and reprompt.

 

b. There must be a function named read_file(fp) that takes a file pointer and returns a dictionary. The input file must be closed in this function and never opened again for reading. You get to decide the organization of your dictionary.

 

c. There must be a function named main() called by if __name__ == "__main__": main()

 

d. There must be at least one other function that does something meaningful (it is good to have more functions).
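One candidate for the extra function in item d is a small helper that applies the input constraints to a single field. The sketch below makes several assumptions that the assignment does not confirm: the helper names clean_state and parse_value are hypothetical, the STATES list is shortened, the extraneous characters around "Missouri" are assumed to leave the name itself intact, and a value with extraneous characters is skipped rather than repaired.

```python
STATES = ["Alabama", "Missouri", "Nebraska"]  # shortened list, for illustration only


def clean_state(raw_state):
    """Return the matching state name, or None if the row should be ignored."""
    name = raw_state.strip()
    if name in STATES:
        return name
    # Assumption: the extra characters surround an otherwise intact name,
    # so a substring check recovers it (hypothetical input: "Missouri 2/").
    if "Missouri" in name:
        return "Missouri"
    return None


def parse_value(raw_value):
    """Return the percentage as an int, or None when the field is not all digits."""
    text = raw_value.strip()
    # isdecimal() is False for blanks and for fields with stray characters.
    return int(text) if text.isdecimal() else None


print(clean_state("Missouri 2/"), parse_value("85"), parse_value("*"))
```

Returning None for rejected fields lets the caller skip a row with a single check instead of catching exceptions.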

 

Suggested Procedure

• Solve the problem using pencil and paper first. You cannot write a program until you have figured out how to solve the problem. This first step may be done collaboratively with another student. However, once the discussion turns to Python specifics and the subsequent writing of Python statements, you must work on your own.

• Construct the program one function at a time.
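Working one function at a time, open_file() is a natural first piece because it can be tested on its own. The following is a minimal sketch of the reprompt loop described in requirement 3a; the prompt and error-message wording are assumptions, not text required by the assignment.

```python
def open_file():
    """Prompt for a file name until one opens, then return the file pointer."""
    while True:
        filename = input("Enter a file name: ")  # prompt wording is an assumption
        try:
            return open(filename, "r")
        except FileNotFoundError:
            # Message wording is an assumption; the loop then reprompts.
            print("File not found. Please try again.")
```

The function only returns once open() succeeds, so the caller always receives a usable file pointer.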

 

 

SKELETON CODE

 

STATES = ["Alaska", "Alabama", "Arizona", "Arkansas", "California"]

 

def open_file():
    # needs error check
    fp = open("alltablesGEcrops.csv")
    return fp

 

def read_file(fp):
    data_dictionary = {}  # to be filled in; the organization is up to you
    # count = 0
    fp.readline()  # skip header
    for line in fp:
        line_lst = line.strip().split(",")  # the files are comma-separated
        state = line_lst[0]
        crop = line_lst[1]
        variety = line_lst[3]
        year = int(line_lst[4])
        value = line_lst[6]  # may not be an int
        # if state in STATES:
        # if "Missouri" in state:
        if variety == "All GE varieties":
            print(state, crop, variety, year, value)
        # if count > 10:
        #     break
        # count += 1
    fp.close()  # the input file must be closed in this function
    return data_dictionary

       

def main():
    fp = open_file()
    data_dictionary = read_file(fp)
    print(data_dictionary)


if __name__ == "__main__":
    main()
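The skeleton leaves the organization of the returned dictionary open. As one possibility, the two-dictionary layout described in requirement 2 could look like the sketch below. The variable names by_state and by_crop, the tuple layout of rows, and every value are hypothetical and only illustrate the nesting.

```python
# Hypothetical cleaned rows: (state, crop, year, value). The values are made up.
rows = [
    ("Alabama", "Cotton", 2000, 70),
    ("Alabama", "Cotton", 2001, 75),
    ("Nebraska", "Corn", 2000, 34),
    ("Alabama", "Corn", 2001, 40),
]

# First dictionary: state -> crop -> {year: value}
by_state = {}
for state, crop, year, value in rows:
    by_state.setdefault(state, {}).setdefault(crop, {})[year] = value

# Second dictionary, prepared for output: crop -> state -> {year: value}
by_crop = {}
for state, crops in by_state.items():
    for crop, year_values in crops.items():
        by_crop.setdefault(crop, {})[state] = year_values

# Sorting the keys gives crops, then states, in alphabetical order.
for crop in sorted(by_crop):
    print(crop)
    for state in sorted(by_crop[crop]):
        print("  ", state, by_crop[crop][state])
```

Using setdefault avoids checking whether a state or crop key already exists, and it also means the program discovers the crops from the file rather than assuming them in advance.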
