Lab-05 - Jupyter Notebook

School: University of North Texas
Course: 5502
Subject: Computer Science
Date: Feb 20, 2024
Pages: 28
Uploaded by venkatasai1999

Part 1: Data Visualizations

1. Grouped Bar Plots: Make both a side-by-side bar plot and a stacked bar plot that displays the number of child visitors and the number of adult visitors at a waterpark in the months of April, May, June and July. Be sure to include titles, legends and appropriate labels sufficiently sized for readability.

April Children: 780 Adults: 315
May Children: 1050 Adults: 400
June Children: 3056 Adults: 1000
July Children: 5025 Adults: 1500

In [179]:
    import numpy as np
    import matplotlib.pyplot as plt

In [180]:
    children = (780, 1050, 3056, 5025)
    ind = np.arange(4)
    width = 0.35
    fig, ax = plt.subplots()
    rects1 = ax.bar(ind, children, width)
    adult = (315, 400, 1000, 1500)
    # Shift the adult bars right by one bar width so the two groups sit side by side.
    rects2 = ax.bar(ind + width, adult, width)
    ax.set_ylabel('Number of Visitors', fontsize=13)
    ax.set_title('Number of Visitors in 4 months (April - July)', fontsize=13)
    ax.set_xticks(ind + width / 2)
    ax.set_xticklabels(('April', 'May', 'June', 'July'), fontsize=10)
    ax.legend((rects1[0], rects2[0]), ('children', 'adult'))

    def autolabel(rects):
        # Write each bar's height just above the bar.
        for rect in rects:
            height = rect.get_height()
            ax.text(rect.get_x() + rect.get_width() / 2., height, '%d' % int(height), ha='center', va='bottom')

    autolabel(rects1)
    autolabel(rects2)
    plt.show()

In [181]:
    children = [780, 1050, 3056, 5025]
    adults = [315, 400, 1000, 1500]
    months = ['April', 'May', 'June', 'July']
    width = 0.5
    ind = np.arange(len(months))
    tick_pos = [i + (width / 50) for i in ind]
    p1 = plt.bar(ind, adults, width, align='center')
    p2 = plt.bar(ind, children, width, bottom=adults, align='center')
    plt.ylabel('Number of Visitors', fontsize=12)
    plt.xlabel('Months', fontsize=12, labelpad=15)
    plt.title('Number of Visitors in 4 months (April - July)', fontsize=13)
    plt.xticks(tick_pos, months, fontsize=10)
    plt.legend((p1[0], p2[0]), ('Adults', 'Children'), loc="best")
    for r1, r2 in zip(p1, p2):
        h1 = r1.get_height()
        h2 = r2.get_height()
        plt.text(r1.get_x() + r1.get_width() / 2., h1 / 2., "%d" % h1, ha="center", va="bottom", color="white", fontsiz
        plt.text(r2.get_x() + r2.get_width() / 2., h1 + h2 / 2., "%d" % h2, ha="center", va="bottom", color="white", fo
    plt.show()

2. Histogram: Make a histogram of the following scores from the Fall 2017 Data Structures course at Loyola University Chicago. Feel free to experiment on the best number of histogram bins for visualization.

114.8, 98.8, 97.3, 96, 94.1, 93.1, 93.1, 91.6, 91.5, 91.3, 90.3, 89.2, 87.5, 87.4, 85.2, 81.7, 81.6, 81.5, 80, 79.3, 78.2, 77.6, 77.1, 76.7, 75.1, 73.9, 72, 71, 64.6, 63.3, 47.2, 38.7

In [182]:
    Fall_2017 = np.array([114.8, 98.8, 97.3, 96, 94.1, 93.1, 93.1, 91.6, 91.5, 91.3, 90.3,
                          89.2, 87.5, 87.4, 85.2, 81.7, 81.6, 81.5, 80, 79.3, 78.2, 77.6,
                          77.1, 76.7, 75.1, 73.9, 72, 71, 64.6, 63.3, 47.2, 38.7])

In [129]:
    print(min(Fall_2017))

38.7

In [130]:
    print(max(Fall_2017))

114.8

In [131]:
    t = [0, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
    plt.hist(Fall_2017, bins=t, ec="black", color='green')
    plt.xticks(t)
    plt.xlabel("scores")
    plt.ylabel("count")
    plt.grid(axis='y', linestyle='--', alpha=0.6)
    plt.title("Fall 2017 - Data Structures Course Scores")

Out[131]: Text(0.5, 1.0, 'Fall 2017 - Data Structures Course Scores')
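
When experimenting with bin choices, it can help to look at the raw counts per bin before drawing anything. The following is a minimal sketch using `np.histogram`; the bin edges (steps of 10 from 30 to 120) are an assumption chosen to cover the observed range of 38.7 to 114.8, not the edges used in the notebook cell above.

```python
import numpy as np

scores = np.array([114.8, 98.8, 97.3, 96, 94.1, 93.1, 93.1, 91.6, 91.5, 91.3, 90.3,
                   89.2, 87.5, 87.4, 85.2, 81.7, 81.6, 81.5, 80, 79.3, 78.2, 77.6,
                   77.1, 76.7, 75.1, 73.9, 72, 71, 64.6, 63.3, 47.2, 38.7])

# Assumed bin edges: 30, 40, ..., 120 (nine bins of width 10).
edges = np.arange(30, 121, 10)

# np.histogram returns the count in each bin and the edges it used.
counts, used_edges = np.histogram(scores, bins=edges)

# Print one line per bin so empty bins are easy to spot.
for low, high, n in zip(used_edges[:-1], used_edges[1:], counts):
    print(f"{low:>3} to {high:<3}: {n}")
```

Bins with a count of zero show up as gaps in the plotted histogram, which is one quick way to judge whether the chosen width is too narrow for 32 scores.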

3. Line Plot: Create a line plot of sin(x) and cos(x + π/2) for -2π < x < 2π where x increases at intervals of π/4. Make the sin(x) graph red and make the cos(x+π/2) graph green. Put both lines onto the same plot.
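
One thing worth checking before plotting: by the identity cos(x + π/2) = −sin(x), the two curves are mirror images of each other about the x-axis. The sketch below builds the same grid the exercise describes and confirms this numerically; the variable names are illustrative only.

```python
import numpy as np

# Grid from -2π up to (but not including) 2π in steps of π/4.
x = np.arange(-2 * np.pi, 2 * np.pi, np.pi / 4)

y = np.sin(x)              # first curve
z = np.cos(x + np.pi / 2)  # second curve

# The identity cos(x + π/2) = -sin(x) should hold at every grid point.
curves_are_mirrored = np.allclose(z, -y)
print(len(x), curves_are_mirrored)
```

Because `np.arange` excludes the stop value, the grid ends at 2π − π/4, which is consistent with the strict inequality −2π < x < 2π at the upper end.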

In [132]:
    x = np.arange(-2 * np.pi, 2 * np.pi, np.pi / 4)
    y = np.sin(x)
    z = np.cos(x + np.pi / 2)
    plt.plot(x, y, color='red', label="sin(x)")
    plt.plot(x, z, color='green', label="cos(x+π/2)")
    plt.xlabel('-2π to 2π')
    plt.ylabel('sin(x) and cos(x+ π/2)')
    plt.title('sin(x) and cos(x+ π/2) line plot')
    plt.grid(True)
    plt.legend()

Out[132]: <matplotlib.legend.Legend at 0x184f1b56410>

Using the same info as above, make a subplot with 2 different graphs- one graph for sin(x) and

In [133]:
    plt.figure(figsize=(10, 4))
    plt.subplot(1, 2, 1)
    plt.plot(x, y, color='red', label='sin(x)')
    plt.xlabel('x')
    plt.ylabel('Value')
    plt.title('sin(x)')
    plt.subplot(1, 2, 2)
    plt.plot(x, z, color='green', label='cos(x + π/2)')
    plt.xlabel('x')
    plt.ylabel('Value')
    plt.title('cos(x + π/2)')
    plt.tight_layout()
    plt.show()

4. Scatter Plot: Using the following data about winter temperatures affecting the number of days for lake ice at Lake Superior, construct a scatter plot to display the data. Include a line of best fit.

Mean Temperature (in Fahrenheit): 22.94, 23.02, 25.68, 19.96, 24.80, 23.98, 22.10, 20.30, 24.20, 22.74, 24.16, 24.94, 22.40, 22.14, 20.84, 25.66, 21.73, 24.49, 24.13, 22.17, 21.73, 20.41, 24.41, 23.95, 20.95, 26.71, 22.81, 23.11, 23.33, 28.83, 23.11, 21.47, 23.97, 24.75, 23.61, 23.08, 21.24,

In [134]:
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    mean_temperature = [22.94, 23.02, 25.68, 19.96, 24.80, 23.98, 22.10, 20.30, 24.20, 22.74,
                        24.16, 24.94, 22.40, 22.14, 20.84, 25.66, 21.73, 24.49, 24.13, 22.17,
                        21.73, 20.41, 24.41, 23.95, 20.95, 26.71, 22.81, 23.11, 23.33, 28.83,
                        23.11, 21.47, 23.97, 24.75, 23.61, 23.08, 21.24, 26.63, 23.88]
    days_of_ice = [87, 137, 106, 97, 105, 118, 118, 136, 91, 107,
                   96, 114, 125, 115, 118, 82, 115, 97, 104, 146,
                   126, 141, 111, 123, 118, 83, 48, 118, 116, 81,
                   116, 123, 112, 99, 102, 118, 63, 62, 132]

    # Least-squares regression of days of ice on mean temperature.
    slope, intercept, r_value, p_value, std_err = stats.linregress(mean_temperature, days_of_ice)
    line = slope * np.array(mean_temperature) + intercept

    plt.figure(figsize=(10, 6))
    plt.scatter(mean_temperature, days_of_ice, label='Data Points', color='blue')
    plt.plot(mean_temperature, line, label='Line of Best Fit', color='red')
    plt.xlabel('Mean Temperature (Fahrenheit)')
    plt.ylabel('Days of Ice')
    plt.title('Relationship between Mean Temperature and Days of Ice')
    plt.legend()
    plt.grid(True)
    plt.show()
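
The line of best fit does not require SciPy. As a sketch of an alternative (not the approach used in the cell above), `np.polyfit` with `deg=1` returns the same least-squares slope and intercept, and the residuals give a quick sanity check on the fit. The names `fitted` and `residuals` are illustrative.

```python
import numpy as np

mean_temperature = np.array([22.94, 23.02, 25.68, 19.96, 24.80, 23.98, 22.10, 20.30, 24.20, 22.74,
                             24.16, 24.94, 22.40, 22.14, 20.84, 25.66, 21.73, 24.49, 24.13, 22.17,
                             21.73, 20.41, 24.41, 23.95, 20.95, 26.71, 22.81, 23.11, 23.33, 28.83,
                             23.11, 21.47, 23.97, 24.75, 23.61, 23.08, 21.24, 26.63, 23.88])
days_of_ice = np.array([87, 137, 106, 97, 105, 118, 118, 136, 91, 107,
                        96, 114, 125, 115, 118, 82, 115, 97, 104, 146,
                        126, 141, 111, 123, 118, 83, 48, 118, 116, 81,
                        116, 123, 112, 99, 102, 118, 63, 62, 132])

# A degree-1 polynomial fit is an ordinary least-squares line.
# polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(mean_temperature, days_of_ice, deg=1)

# Predicted days of ice at each observed temperature, and the leftover error.
fitted = slope * mean_temperature + intercept
residuals = days_of_ice - fitted

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

For a least-squares line with an intercept term, the residuals sum to zero up to floating-point error, so a sum far from zero would point to a mistake in the fit.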

Part 2: Basic Data Structure

1: Lists

1. Make a list with the spelled-out number strings ‘one’, ‘two’, ‘three’, ‘four’, and ‘five’ in that order and call it myList.

In [135]:
    myList = ["one", "two", "three", "four", "five"]
    myList

Out[135]: ['one', 'two', 'three', 'four', 'five']

2. Remove ‘three’ from the list using positional indexing.

In [136]:
    del myList[2]
    print(myList)

['one', 'two', 'four', 'five']

3. Check if ‘four’ is in the list.

In [137]:
    print("four" in myList)

True

4. Append ‘six’ to the end of the list, then print the length of the list.

In [138]:
    myList.append("six")
    print(myList)
    print(len(myList))

['one', 'two', 'four', 'five', 'six']
5

5. Print the contents of the list, but also next to each item print the length of the string (e.g. one is 3, four is 4) using a for loop.

In [139]:
    for i in myList:
        print(i, 'is', len(i))

one is 3
two is 3
four is 4
five is 4
six is 3
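
Positional removal can also be done with `list.pop`, which returns the removed item, and `enumerate` gives the position alongside each word while looping. This is a small sketch of those two variations; `my_list` and `removed` are illustrative names, not part of the lab.

```python
my_list = ["one", "two", "three", "four", "five"]

# pop(2) removes the item at index 2 and hands it back,
# unlike `del my_list[2]`, which discards it.
removed = my_list.pop(2)

my_list.append("six")

# enumerate yields (index, item) pairs, so the position is available in the loop.
for position, word in enumerate(my_list):
    print(position, word, "is", len(word))
```

`pop` is convenient when the removed value is needed afterwards; `del` is enough when it is not.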

6. Create a list containing only the lengths of the strings and show your result. You can reuse the previous loop to fill the list.

In [140]:
    length_string = [len(word) for word in myList]
    length_string

Out[140]: [3, 3, 4, 4, 3]

2: Dictionaries

1. Make a dictionary whose keys are the English words below and whose values are their translations. You can use this language example (German) or choose your own. Note: you need to make sure all of these words are represented as strings, in quotes.

apple - Apfel
apples - Äpfel
I - Ich
and - und
like - mag
strawberries - Erdbeeren

In [141]:
    dictionary_words = {"apple": "Apfel", "apples": "Äpfel", "I": "Ich", "and": "und", "like": "mag", "strawberries": "Erdbeeren"}
    dictionary_words

Out[141]: {'apple': 'Apfel', 'apples': 'Äpfel', 'I': 'Ich', 'and': 'und', 'like': 'mag', 'strawberries': 'Erdbeeren'}

2. Use the dictionary to look up the translation for ‘apple’ and ‘like’.

In [142]:
    print(dictionary_words["apple"])
    print(dictionary_words["like"])

Apfel
mag

3. Make a variable var with the string “I like apples and strawberries”.

In [143]:
    var = "I like apples and strawberries"
    var

Out[143]: 'I like apples and strawberries'
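
Square-bracket lookup raises `KeyError` for a word that is not in the dictionary. A minimal sketch of a more forgiving lookup uses `dict.get` with the word itself as the fallback; the helper `translate_word` and the sample sentence containing "pears" are hypothetical additions for illustration.

```python
translations = {"apple": "Apfel", "apples": "Äpfel", "I": "Ich",
                "and": "und", "like": "mag", "strawberries": "Erdbeeren"}


def translate_word(word):
    """Return the translation if known, otherwise the word unchanged."""
    return translations.get(word, word)


# "pears" is deliberately absent from the dictionary.
sentence = "I like pears and apples"
translated = " ".join(translate_word(w) for w in sentence.split())
print(translated)
```

Falling back to the original word keeps unknown words visible in the output instead of stopping with an error.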

4. Now create a list from var with each word a separate item (this is a string split operation).

In [144]:
    split_string = var.split()
    split_string

Out[144]: ['I', 'like', 'apples', 'and', 'strawberries']

5. Iterate through the list you’ve created and replace any word in your dictionary with the translation.

In [145]:
    replace_list = []
    for item in split_string:
        if item in dictionary_words:
            replace_list.append(dictionary_words[item])
        else:
            replace_list.append(item)
    replace_list

Out[145]: ['Ich', 'mag', 'Äpfel', 'und', 'Erdbeeren']

6. Now take your new list and turn it into a string with spaces between the words.

In [146]:
    replace_string = ' '.join(replace_list)
    replace_string

Out[146]: 'Ich mag Äpfel und Erdbeeren'

3: Arrays

1. Create an array of zeros of size 8 x 8 and print the data type of the array.

In [147]:
    array = np.zeros((8, 8))

In [148]:
    print(array)
    print(type(array))

[[0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]]
<class 'numpy.ndarray'>

2. Fill the array with the numbers 1 to 64 first by row, then by column. You may want to use a for loop inside a for loop to do this.

In [149]:
    s = len(array)
    k = len(array)
    fill_array = [[(j + 1) + s * i for j in range(s)] for i in range(k)]
    fill_array = np.array(fill_array)
    fill_array

Out[149]:
array([[ 1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16],
       [17, 18, 19, 20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29, 30, 31, 32],
       [33, 34, 35, 36, 37, 38, 39, 40],
       [41, 42, 43, 44, 45, 46, 47, 48],
       [49, 50, 51, 52, 53, 54, 55, 56],
       [57, 58, 59, 60, 61, 62, 63, 64]])

3. Transpose the array.
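
Filling by row and filling by column can both be written without loops, and the two results are transposes of each other. The sketch below is one possible alternative to the nested comprehension, using `np.arange` with `reshape`; it also prints `dtype`, which reports the element type of the array (as opposed to `type(...)`, which reports the container class). The names `by_row`, `by_col` and `zeros` are illustrative.

```python
import numpy as np

# Element data type of a zeros array (np.zeros defaults to float64).
zeros = np.zeros((8, 8))
print(zeros.dtype)

# Row-major fill: 1..8 in the first row, 9..16 in the second, and so on.
by_row = np.arange(1, 65).reshape((8, 8))

# Column-major ("Fortran order") fill: 1..8 down the first column, and so on.
by_col = np.arange(1, 65).reshape((8, 8), order="F")

# The column-wise fill is exactly the transpose of the row-wise fill.
print(np.array_equal(by_col, by_row.T))
```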

In [150]:
    array = np.transpose(fill_array)
    array

Out[150]:
array([[ 1,  9, 17, 25, 33, 41, 49, 57],
       [ 2, 10, 18, 26, 34, 42, 50, 58],
       [ 3, 11, 19, 27, 35, 43, 51, 59],
       [ 4, 12, 20, 28, 36, 44, 52, 60],
       [ 5, 13, 21, 29, 37, 45, 53, 61],
       [ 6, 14, 22, 30, 38, 46, 54, 62],
       [ 7, 15, 23, 31, 39, 47, 55, 63],
       [ 8, 16, 24, 32, 40, 48, 56, 64]])

4. Print only the top 4 rows and columns.

In [151]:
    array[0:4, 0:4]

Out[151]:
array([[ 1,  9, 17, 25],
       [ 2, 10, 18, 26],
       [ 3, 11, 19, 27],
       [ 4, 12, 20, 28]])

5. Make a 1D array out of your 2D array with the numbers 1 to 64 in order (note the column vs row issue, you may need transposes.)

In [152]:
    array_1 = array.flatten()
    array_1

Out[152]:
array([ 1,  9, 17, 25, 33, 41, 49, 57,  2, 10, 18, 26, 34, 42, 50, 58,  3,
       11, 19, 27, 35, 43, 51, 59,  4, 12, 20, 28, 36, 44, 52, 60,  5, 13,
       21, 29, 37, 45, 53, 61,  6, 14, 22, 30, 38, 46, 54, 62,  7, 15, 23,
       31, 39, 47, 55, 63,  8, 16, 24, 32, 40, 48, 56, 64])

In [153]:
    array_1.ndim

Out[153]: 1

6. Now take that 1D array you made from before and reshape it back to the original 2D array.
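
The "column vs row issue" in step 5 is that a plain `flatten()` on the transposed array reads along rows, so it yields 1, 9, 17, ... rather than 1 to 64 in order. One way to handle it, sketched here under the assumption that the 2D array is the transposed one from step 3, is to flatten and reshape in column-major order (`order="F"`); flattening the transpose (`array.T.flatten()`) would give the same 1D result.

```python
import numpy as np

fill_array = np.arange(1, 65).reshape((8, 8))
array = fill_array.T  # transposed array: 1..8 runs down the first column

# Default (row-major) flatten walks across rows: 1, 9, 17, ...
flat_default = array.flatten()

# Column-major flatten walks down columns: 1, 2, 3, ..., 64
flat_in_order = array.flatten(order="F")

# Reshaping with the same order restores the transposed 2D array.
restored = flat_in_order.reshape((8, 8), order="F")

print(flat_default[:4], flat_in_order[:4])
```

Using the same `order` for both the flatten and the reshape is what makes the round trip return the array it started from.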

In [154]:
    array_2 = array_1.reshape(8, 8)
    array_2

Out[154]:
array([[ 1,  9, 17, 25, 33, 41, 49, 57],
       [ 2, 10, 18, 26, 34, 42, 50, 58],
       [ 3, 11, 19, 27, 35, 43, 51, 59],
       [ 4, 12, 20, 28, 36, 44, 52, 60],
       [ 5, 13, 21, 29, 37, 45, 53, 61],
       [ 6, 14, 22, 30, 38, 46, 54, 62],
       [ 7, 15, 23, 31, 39, 47, 55, 63],
       [ 8, 16, 24, 32, 40, 48, 56, 64]])

In [155]:
    array_2.ndim

Out[155]: 2

4: Application - Word Counts

Word counts are often used in text processing to automatically classify documents by topic. They are also used to automatically measure the “sentiment” by counting, for example, the number of positive or negative words used in the comment or essay. Write code to count the number of unique words in a very large string using the following steps.

A) First convert the string to a list with each word a separate item in the list. Hint: use a string split function for your language, and make sure it separates by “ ”.
B) Then use a dictionary to associate each word with a count. Note, the dictionary won’t be able to increment a key unless you add it first, so you may have to check to see if it exists before setting the original count of a word to 1.
C) Print each word and its count afterwards, and test with an interesting block of text that will have multiple words counted multiple times. (Note, the words don’t have to be in any particular order.)

For example: “how much wood would a woodchuck chuck if a woodchuck could chuck wood”

Expected output:
how - 1
much - 1
wood - 2
would - 1
a - 2
woodchuck - 2
chuck - 2
if - 1
could - 1

In [156]:
    Input_String = "how much wood would a woodchuck chuck if a woodchuck could chuck wood"
    Input_String

Out[156]: 'how much wood would a woodchuck chuck if a woodchuck could chuck wood'

In [157]:
    List_Of_words = Input_String.split()
    List_Of_words

Out[157]: ['how', 'much', 'wood', 'would', 'a', 'woodchuck', 'chuck', 'if', 'a', 'woodchuck', 'could', 'chuck', 'wood']

In [158]:
    word_counts = {}
    for word in List_Of_words:
        # A key must exist before it can be incremented.
        if word in word_counts:
            word_counts[word] += 1
        else:
            word_counts[word] = 1
    for word, count in word_counts.items():
        print(f"{word} - {count}")

how - 1
much - 1
wood - 2
would - 1
a - 2
woodchuck - 2
chuck - 2
if - 1
could - 1
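
The check-then-increment pattern in step B is what `collections.Counter` from the standard library does internally. As a sketch of an alternative to the manual dictionary (not a replacement for the exercise, which asks for the explicit check), the same counts can be produced like this:

```python
from collections import Counter

text = "how much wood would a woodchuck chuck if a woodchuck could chuck wood"

# Split on single spaces, as the hint in step A suggests.
words = text.split(" ")

# Counter is a dict subclass that starts missing keys at zero,
# so no existence check is needed before incrementing.
word_counts = Counter(words)

for word, count in word_counts.items():
    print(f"{word} - {count}")
```

`word_counts.most_common(3)` would list the three most frequent words, which is often the next step in topic or sentiment analysis.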

Part 3: Data Frames

In this part, we will study a classic data set - the survivors in the sinking of the Titanic. As there were limited lifeboats, decisions were made prioritizing who would and would not survive. We will observe how different factors such as age, sex, and class affected a person’s chance of survival using data frames.

Steps:

1. Input the following data into a data frame called titanic, and display the entire data frame:

Sex, Class, Survived, Died
Children, First, 6, 0
Children, Second, 24, 0
Children, Third, 27, 52
Men, First, 57, 118
Men, Second, 14, 154
Men, Third, 75, 387
Men, Crew, 192, 693
Women, First, 140, 4
Women, Second, 80, 13
Women, Third, 76, 89
Women, Crew, 20, 3

In [159]:
    import pandas as pd

    titanic_data = {
        'Sex': ['Children', 'Children', 'Children', 'Men', 'Men', 'Men', 'Men', 'Women', 'Women', 'Women', 'Women'],
        'Class': ['First', 'Second', 'Third', 'First', 'Second', 'Third', 'Crew', 'First', 'Second', 'Third', 'Crew'],
        'Survived': [6, 24, 27, 57, 14, 75, 192, 140, 80, 76, 20],
        'Died': [0, 0, 52, 118, 154, 387, 693, 4, 13, 89, 3]
    }
    data_frame = pd.DataFrame(titanic_data)
    data_frame

Out[159]:
         Sex   Class  Survived  Died
0   Children   First         6     0
1   Children  Second        24     0
2   Children   Third        27    52
3        Men   First        57   118
4        Men  Second        14   154
5        Men   Third        75   387
6        Men    Crew       192   693
7      Women   First       140     4
8      Women  Second        80    13
9      Women   Third        76    89
10     Women    Crew        20     3

2. Now only show the data of the people in first class.

In [160]:
    data_frame.loc[data_frame["Class"] == "First"]

Out[160]:
        Sex  Class  Survived  Died
0  Children  First         6     0
3       Men  First        57   118
7     Women  First       140     4

3. Delete the crew members from the data.

In [161]:
    data_frame = data_frame[data_frame['Class'] != 'Crew']
    data_frame

Out[161]:
        Sex   Class  Survived  Died
0  Children   First         6     0
1  Children  Second        24     0
2  Children   Third        27    52
3       Men   First        57   118
4       Men  Second        14   154
5       Men   Third        75   387
7     Women   First       140     4
8     Women  Second        80    13
9     Women   Third        76    89

4. Create a new column that is the total number of people for that group (those who survived + died).
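
Adding a column to a frame that was produced by a boolean filter can make pandas emit a `SettingWithCopyWarning`, because it cannot tell whether the filtered result is a view or a copy of the original. One common way to avoid this, sketched below as an assumption about how the filtering step could be written, is to call `.copy()` on the filtered frame before assigning to it. The variable name `passengers` is illustrative.

```python
import pandas as pd

titanic = pd.DataFrame({
    "Sex": ["Children", "Children", "Children", "Men", "Men", "Men", "Men",
            "Women", "Women", "Women", "Women"],
    "Class": ["First", "Second", "Third", "First", "Second", "Third", "Crew",
              "First", "Second", "Third", "Crew"],
    "Survived": [6, 24, 27, 57, 14, 75, 192, 140, 80, 76, 20],
    "Died": [0, 0, 52, 118, 154, 387, 693, 4, 13, 89, 3],
})

# .copy() makes the filtered result an independent frame,
# so later column assignments are unambiguous.
passengers = titanic[titanic["Class"] != "Crew"].copy()

# New column: total people in each group.
passengers["Total_Number"] = passengers["Survived"] + passengers["Died"]
print(passengers)
```

The original `titanic` frame is left untouched, which also makes it easy to bring the crew rows back later.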

In [162]:
    data_frame["Total_Number"] = data_frame["Survived"] + data_frame["Died"]
    data_frame

C:\Users\16036\AppData\Local\Temp\ipykernel_15412\83928789.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data_frame["Total_Number"] = data_frame["Survived"]+data_frame["Died"]

Out[162]:
        Sex   Class  Survived  Died  Total_Number
0  Children   First         6     0             6
1  Children  Second        24     0            24
2  Children   Third        27    52            79
3       Men   First        57   118           175
4       Men  Second        14   154           168
5       Men   Third        75   387           462
7     Women   First       140     4           144
8     Women  Second        80    13            93
9     Women   Third        76    89           165

5. Create a new column with the percentage of people who survived.

In [163]:
    data_frame['Survival Percentage'] = (data_frame['Survived'] / data_frame['Total_Number']) * 100
    data_frame

C:\Users\16036\AppData\Local\Temp\ipykernel_15412\601391844.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data_frame['Survival Percentage'] = (data_frame['Survived'] / data_frame['Total_Number']) * 100

Out[163]:
        Sex   Class  Survived  Died  Total_Number  Survival Percentage
0  Children   First         6     0             6           100.000000
1  Children  Second        24     0            24           100.000000
2  Children   Third        27    52            79            34.177215
3       Men   First        57   118           175            32.571429
4       Men  Second        14   154           168             8.333333
5       Men   Third        75   387           462            16.233766
7     Women   First       140     4           144            97.222222
8     Women  Second        80    13            93            86.021505
9     Women   Third        76    89           165            46.060606

6. Delete the column indicating the total number of people in that group.

In [164]:
    data_frame.drop(labels="Total_Number", axis=1, inplace=True)
    data_frame

C:\Users\16036\AppData\Local\Temp\ipykernel_15412\75101291.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data_frame.drop(labels ="Total_Number",axis=1,inplace=True)

Out[164]:
        Sex   Class  Survived  Died  Survival Percentage
0  Children   First         6     0           100.000000
1  Children  Second        24     0           100.000000
2  Children   Third        27    52            34.177215
3       Men   First        57   118            32.571429
4       Men  Second        14   154             8.333333
5       Men   Third        75   387            16.233766
7     Women   First       140     4            97.222222
8     Women  Second        80    13            86.021505
9     Women   Third        76    89            46.060606

7. Only show the rows where more than 80% of the people survived.

In [165]:
    data_frame[data_frame['Survival Percentage'] > 80]

Out[165]:
        Sex   Class  Survived  Died  Survival Percentage
0  Children   First         6     0           100.000000
1  Children  Second        24     0           100.000000
7     Women   First       140     4            97.222222
8     Women  Second        80    13            86.021505

8. Then only show the rows where less than 40% of the people survived.

In [166]:
    data_frame[data_frame["Survival Percentage"] < 40]

Out[166]:
        Sex   Class  Survived  Died  Survival Percentage
2  Children   Third        27    52            34.177215
3       Men   First        57   118            32.571429
4       Men  Second        14   154             8.333333
5       Men   Third        75   387            16.233766

9. Calculate the total number of people that survived and died for each class, then report the percentages. (Hint: Use a grouped calculation.)

In [167]:
    data_frame.drop(columns="Survival Percentage", axis=1, inplace=True)
    data_frame

C:\Users\16036\AppData\Local\Temp\ipykernel_15412\2717960705.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data_frame.drop(columns="Survival Percentage",axis=1,inplace=True)

Out[167]:
        Sex   Class  Survived  Died
0  Children   First         6     0
1  Children  Second        24     0
2  Children   Third        27    52
3       Men   First        57   118
4       Men  Second        14   154
5       Men   Third        75   387
7     Women   First       140     4
8     Women  Second        80    13
9     Women   Third        76    89

In [168]:
    data_frame["total"] = data_frame["Survived"] + data_frame["Died"]
    data_frame

C:\Users\16036\AppData\Local\Temp\ipykernel_15412\4153513455.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data_frame["total"]=data_frame["Survived"]+data_frame["Died"]

Out[168]:
        Sex   Class  Survived  Died  total
0  Children   First         6     0      6
1  Children  Second        24     0     24
2  Children   Third        27    52     79
3       Men   First        57   118    175
4       Men  Second        14   154    168
5       Men   Third        75   387    462
7     Women   First       140     4    144
8     Women  Second        80    13     93
9     Women   Third        76    89    165

In [169]:
    percentage_survived = round((data_frame.groupby("Class").Survived.sum() / data_frame.groupby("Class").total.sum()) * 100)
    percentage_died = round((data_frame.groupby("Class").Died.sum() / data_frame.groupby("Class").total.sum()) * 100)
    print(percentage_survived)
    print(percentage_died)

Class
First     62.0
Second    41.0
Third     25.0
dtype: float64
Class
First     38.0
Second    59.0
Third     75.0
dtype: float64
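
The grouped calculation can also be done in a single pass: sum both count columns per class, then divide each row by its own total. The sketch below is one way to write that, shown as an alternative to calling `groupby` separately for each series; `totals` and `percentages` are illustrative names, and the data is the non-crew portion of the table above.

```python
import pandas as pd

passengers = pd.DataFrame({
    "Sex": ["Children", "Children", "Children", "Men", "Men", "Men",
            "Women", "Women", "Women"],
    "Class": ["First", "Second", "Third", "First", "Second", "Third",
              "First", "Second", "Third"],
    "Survived": [6, 24, 27, 57, 14, 75, 140, 80, 76],
    "Died": [0, 0, 52, 118, 154, 387, 4, 13, 89],
})

# One groupby: total survivors and deaths per class.
totals = passengers.groupby("Class")[["Survived", "Died"]].sum()

# Divide each row by its row sum (axis=0 aligns the divisor on the Class index).
percentages = totals.div(totals.sum(axis=1), axis=0) * 100

print(totals)
print(percentages.round(2))
```

Keeping `totals` as its own frame answers the first half of the question (the counts per class) directly, and each row of `percentages` sums to 100 by construction.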

In [170]:
    finalResult = pd.concat([percentage_survived, percentage_died], axis=1, keys=["%Survived", "%Died"])
    finalResult

Out[170]:
        %Survived  %Died
Class
First        62.0   38.0
Second       41.0   59.0
Third        25.0   75.0

10. Save your table in CSV format (as e.g. titanic_data.csv) with the first line as headers for the columns.

In [171]:
    finalResult.to_csv("titanic_data.csv")

11. Duplicate the CSV file on your computer since you will be editing the copied version (e.g. titanic_data2.csv). Open the new CSV file in a text editor. Note the way the data is organized. Now, in the text editor, add new lines including the data for the crew that was removed earlier. (Help: the percentage of male crew and female crew that survived was 21.69% and 86.96%.)

In [172]:
    finalResult.to_csv("titanic_data2.csv")

12. Now read that updated CSV file into a new data frame called titanic2, and display the data.

In [173]:
    titanic2 = pd.read_csv("titanic_data2.csv")
    titanic2

Out[173]:
    Class  %Survived  %Died
0   First       62.0   38.0
1  Second       41.0   59.0
2   Third       25.0   75.0
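
For the save-and-reload steps, a round trip through CSV can be checked in code. The following sketch writes the full group-level table (including the crew rows from step 1) with `index=False`, so the first line of the file holds only the column headers, then reads it back. Writing to a temporary directory is an assumption made here to keep the example from touching existing files; the lab itself uses `titanic_data.csv` in the working directory.

```python
import os
import tempfile

import pandas as pd

titanic = pd.DataFrame({
    "Sex": ["Children", "Children", "Children", "Men", "Men", "Men", "Men",
            "Women", "Women", "Women", "Women"],
    "Class": ["First", "Second", "Third", "First", "Second", "Third", "Crew",
              "First", "Second", "Third", "Crew"],
    "Survived": [6, 24, 27, 57, 14, 75, 192, 140, 80, 76, 20],
    "Died": [0, 0, 52, 118, 154, 387, 693, 4, 13, 89, 3],
})

# Temporary location (assumption for this sketch only).
csv_path = os.path.join(tempfile.mkdtemp(), "titanic_data.csv")

# index=False omits the row labels, leaving a header line plus one line per group.
titanic.to_csv(csv_path, index=False)

# Read the file back into a new frame.
titanic2 = pd.read_csv(csv_path)
print(titanic2)
```

Without `index=False`, the row labels are written as an extra unnamed first column, which then shows up as a separate column when the file is read back.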
