hw02.py

School: University of Michigan
Course: 206
Subject: Computer Science
Date: Feb 20, 2024
Type: py
Pages: 6
Uploaded by erzas088

# -*- coding: utf-8 -*-
"""hw02.ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/drive/1MolVV1OaiWD8Xb-gJA-i8LspM1aAODWC
"""

import numpy as np  # run this first

"""# Homework 2, Stats 206/DataSci 101

## Question 1

We will start this homework with some questions on collections in Python.

### Q1.a

Recall that *sets* are collections where each value in the collection only appears once. Using the `set` function, turn both of these strings into sets. Then use set methods to find

* How many letters are shared between the two names?
* Which letters are in "George Washington" that are not in "Abraham Lincoln"?
* How many letters are in neither president's name?
"""

gw = "George Washington".lower()
al = "Abraham Lincoln".lower()
az = "abcdefghijklmnopqrstuvwxyz"

gw_set = set(gw)
al_set = set(al)
az_set = set(az)

# Both names contain a space, so intersect with az_set to count letters only.
shared = gw_set & al_set & az_set
print(len(shared))

# Letters in "George Washington" that are not in "Abraham Lincoln".
print(gw_set - al_set)

# Letters in neither name: remove the union (|) of both names, not the intersection (&).
print(len(az_set - (gw_set | al_set)))

"""### Q1.b

Use a `for` loop to find the unique values in the lunch orders below. Using Python, print out the number of unique items being ordered.

(*Hint*: You can use `set()` to create an empty set.)
"""

lunch_orders = [{"pasta", "coffee"},
                {"fries", "sandwich", "cookie"},
                {"pasta", "salad", "water"},
                {"salad", "pasta", "coffee"},
                {"sandwich", "water"}]

u_set = set()
for order in lunch_orders:
    for item in order:
        u_set.add(item)

print(u_set)
print(len(u_set))  # the question asks for the number of unique items

"""### Q1.c

Here is a menu at the restaurant, with items and their cost:

| Item | Price | Vegetarian |
| --- | --- | --- |
| salad | 8 | True |
| soup | 5 | False |
| pasta | 12 | True |
| sandwich | 9 | False |
| burger | 13 | False |
| fries | 6 | True |
| cake | 6 | True |
| cookie | 3 | True |
| water | 0 | True |
| coffee | 2 | True |
| soda | 3 | True |

Create a dictionary called `menu` that has the item names as keys and a two-tuple with (price, vegetarian) as the value. Demonstrate by retrieving the price and vegetarian status of "burger".
"""

menu = {
    "salad": (8, True),
    "soup": (5, False),
    "pasta": (12, True),
    "sandwich": (9, False),
    "burger": (13, False),
    "fries": (6, True),
    "cake": (6, True),
    "cookie": (3, True),
    "water": (0, True),
    "coffee": (2, True),
    "soda": (3, True),
}

print(menu["burger"])

"""### Q1.d

Write a function that takes a lunch order (a set of menu items) and returns the total price. Write a function that takes a lunch order and returns true if the entire order is vegetarian. Use list comprehensions to compute the price of each order and whether each order is vegetarian.
"""

def total_price(lunch_order):
    # Sum the price (first tuple element) of every item in the order.
    price = 0
    for item in lunch_order:
        price += menu[item][0]
    return price

def veg_status(lunch_order):
    # Iterate over the parameter; the earlier version read the leftover global `order`.
    return all(menu[item][1] for item in lunch_order)

# Hints: try running the following code
# ("this", "is", "a tuple")[1]
# [s.upper() for s in {"hello", "world"}]

"""### Q1.e

When you have completed your menu in part (c), you can remove the `#` characters to make this code run:
"""

menu_dict = {
    "name": list(menu.keys()),
    "price": [i[0] for i in menu.values()],
    "vegetarian": [i[1] for i in menu.values()]
}

"""Use this table to compute the following:

* What is the price of the most expensive item in the list?
* What is the name of the most expensive item? (You can use the `np.argmax()` function to help you.)
* How many vegetarian items are there?
"""

# price of the most expensive item
highest = max(menu_dict["price"])
print(highest)

# name of the most expensive item
highest_index = np.argmax(menu_dict["price"])
item_name = menu_dict["name"][highest_index]
print(item_name)

# number of vegetarian items (True counts as 1 when summed)
num_veg_items = sum(menu_dict["vegetarian"])
print(num_veg_items)

"""### Question 2
"""

# run this to do question 2
import pandas as pd
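Q1.d asks for list comprehensions over the orders. The following is a minimal, self-contained sketch of that step, not the assignment's reference solution. The names `sample_menu`, `sample_orders`, `order_price`, and `order_is_vegetarian` are hypothetical and use a three-item subset of the menu so the block runs on its own.

```python
# Hypothetical subset of the menu: item -> (price, vegetarian)
sample_menu = {
    "salad": (8, True),
    "burger": (13, False),
    "water": (0, True),
}

# Hypothetical orders, each a set of item names
sample_orders = [{"salad", "water"}, {"burger", "water"}]


def order_price(order, menu):
    """Return the summed price of every item in the order."""
    return sum(menu[item][0] for item in order)


def order_is_vegetarian(order, menu):
    """Return True only if every item in the order is vegetarian."""
    return all(menu[item][1] for item in order)


# One list comprehension per question: a price and a flag for each order.
prices = [order_price(order, sample_menu) for order in sample_orders]
veg_flags = [order_is_vegetarian(order, sample_menu) for order in sample_orders]

print(prices)
print(veg_flags)
```

Passing the menu as an argument, rather than reading a global, keeps the helpers usable with any dictionary of the same shape.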
"""### Q2.a Here's a Pandas `DataFrame` version of our menu table. """ menu_df = pd.DataFrame({'name': ['salad', 'soup', 'pasta', 'sandwich', 'burger', 'fries', 'cake', 'cookie', 'water', 'coffee', 'soda'], 'price': [8, 5, 12, 9, 13, 6, 6, 3, 0, 2, 3], 'vegetarian': [True, False, True, False, False, True, True, True, True, True, True]}) """Using **methods** for Series objects (i.e., methods called on the columns of the table. Compute the following: * The name of the least expensive item on the list. * The number of items that cost between 5 and 10 dollars (inclusive). * The names all of vegetarian items that cost more than 10 dollars. """ #name of least expensive item lowest_price = menu_df.loc[menu_df["price"].idxmin(), "name"] print(lowest_price) #number of items between 5 and 10 dollars bw5_and_10 = menu_df[(menu_df["price"] >= 5) & (menu_df["price"] <= 10)].shape[0] print(bw5_and_10) #names of all vegetarian items >$10 above_10 = menu_df[(menu_df["price"] > 10) & (menu_df["vegetarian"] == True)] ["name"].tolist() print(above_10) """### Q2.b Create a table that is only composed of items that cost less than or equal to $5. How many vegetarian items are in this menu? """ #new table five_dollars = menu_df[menu_df['price'] <= 5]
num_veg = five_dollars['vegetarian'].sum()
print(num_veg)

"""### Q2.c

Let's use these techniques on some real data!
"""

from google.colab import drive
drive.mount('/content/gdrive')

spotify = pd.read_csv("/content/gdrive/MyDrive/Stats 206 Winter 2024/data/spotify.csv")

spotify.columns

"""Using Python and Pandas, answer the following questions:

* How many songs have a popularity greater than 80?
* For songs with a popularity greater than 80, what is the shortest song (in milliseconds)?
* What is the name of the song with popularity greater than 80 that has the slowest tempo?
"""

# songs with popularity > 80; the row count of the filtered table replaces the manual loop
eighty_more = spotify[spotify["track.popularity"] > 80]
pop = len(eighty_more)
print(pop)

# shortest song among songs with popularity > 80
shortest = eighty_more.loc[eighty_more["duration_ms"].idxmin(), "track.name"]
print(shortest)

# name of the song with the slowest tempo among songs with popularity > 80
slowest = eighty_more.loc[eighty_more["tempo"].idxmin(), "track.name"]
print(slowest)

"""### Q2.d

Here is a little code that will create a new column with four (approximately) evenly sized groups based on popularity (low, medium, medium high, and high). The code after creating the column shows how many rows are in each group. There are a few songs with the same popularity value, which makes it difficult to get exactly evenly sized groups, but this is pretty close. The `(a, b]` output on the left shows the popularity range of that group.
"""

spotify["popgrp"] = pd.qcut(spotify["track.popularity"], 4)
spotify["popgrp"].value_counts()

"""For each group, find the number of explicit songs using the `"track.explicit"` column. Which group has the most explicit songs?"""

# number of explicit songs per group
explicit = spotify.groupby("popgrp")["track.explicit"].sum()

# which group has the most explicit songs
explicit_group = explicit.idxmax()
print(explicit_group)
print(explicit[explicit_group])

"""### Q2.e

Create a new column that is the product of `danceability` and `energy`. Find the song with the highest product of danceability and energy. Find the tempo of this song. Is it a particularly fast song in terms of tempo? How many songs are faster than this song? How many are slower?
"""

# product of danceability and energy
spotify["d_e_product"] = spotify["danceability"] * spotify["energy"]

# song with the highest product
highest_song_index = spotify["d_e_product"].idxmax()
highest_song = spotify.loc[highest_song_index]

# tempo
tempo_highest_song = highest_song["tempo"]

# is it fast? (here: faster than the mean tempo)
is_fast = tempo_highest_song > spotify["tempo"].mean()

# number of faster and slower songs
num_faster = (spotify["tempo"] > tempo_highest_song).sum()
num_slower = (spotify["tempo"] < tempo_highest_song).sum()

# highest product
print(highest_song["track.name"])

# tempo
print(tempo_highest_song)

# fast? yes/no
print(is_fast)

# number faster
print(num_faster)

# number slower
print(num_slower)
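The Spotify CSV is not included here, so the Question 2 answers cannot be run as-is. The sketch below replays the same pandas techniques (boolean masks, `idxmin` with `.loc`, `pd.qcut`, and `groupby`) on a small made-up table. The `songs` DataFrame and all of its values are hypothetical; only the column names are borrowed from the code above.

```python
import pandas as pd

# Hypothetical toy data; values are invented for illustration only.
songs = pd.DataFrame({
    "track.name": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "track.popularity": [10, 20, 30, 40, 50, 60, 70, 80],
    "track.explicit": [True, False, False, True, True, True, False, False],
    "tempo": [90.0, 120.0, 100.0, 140.0, 110.0, 95.0, 130.0, 105.0],
})

# A boolean mask keeps only the rows where the condition is True.
popular = songs[songs["track.popularity"] > 40]
n_popular = len(popular)

# idxmin() returns the row label of the smallest tempo; .loc looks up the name.
slowest_name = popular.loc[popular["tempo"].idxmin(), "track.name"]

# qcut splits popularity into four quantile-based groups of (roughly) equal size.
songs["popgrp"] = pd.qcut(songs["track.popularity"], 4)

# Summing a boolean column counts the True values within each group.
explicit_per_group = songs.groupby("popgrp", observed=True)["track.explicit"].sum()
most_explicit_group = explicit_per_group.idxmax()

print(n_popular)
print(slowest_name)
print(explicit_per_group)
print(most_explicit_group)
```

Passing `observed=True` to `groupby` is a choice made for this sketch: it states the handling of categorical groups explicitly, which avoids relying on a default that differs between pandas versions.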