Midterm Cheat Sheet

School

University of North Georgia, Dahlonega

Course

1001

Subject

Computer Science

Date

Dec 6, 2023

Uploaded by ProfessorIron11938

October 9, 2023

1 Midterm Cheat Sheet

1.1 Initialization Code Block

[32]:
    ## Do not change this cell, only execute it.
    ## This cell initializes Python so that datascience, numpy and scipy packages are ready to use.
    from datascience import *
    %matplotlib inline
    import matplotlib.pyplot as plots
    plots.style.use('fivethirtyeight')
    import numpy as np
    import scipy.stats as stats
    from IPython.core.display import display, HTML
    toggle_code_str = '''
    <form action="javascript:code_toggle()"><input type="submit" id="toggleButton" value="Hide/Show Code (Too Advanced or Not the Point)"></form>
    '''
    toggle_code_prepare_str = '''
        <script>
        function code_toggle() {
            if ($('div.cell.code_cell.rendered.selected div.input').css('display')!='none'){
                $('div.cell.code_cell.rendered.selected div.input').hide();
            } else {
                $('div.cell.code_cell.rendered.selected div.input').show();
            }
        }
        </script>
    '''
    display(HTML(toggle_code_prepare_str + toggle_code_str))

    def hide_code():
        display(HTML(toggle_code_str))

<IPython.core.display.HTML object>

1.2 The groupstats function

[33]:
    ## Do not change this cell, only execute it.
    ## This cell creates the function `groupstats` which provides descriptive statistics
    ## on a numeric variable for each level of a grouping variable.
    def groupstats(table, group, data):
        ### This function will find all the major descriptive stats you need ###
        cut = table.select(group, data).sort(group)
        favstats = cut.group(group, np.mean).sort(group)
        words = [data, 'mean']
        favstats = favstats.relabeled(' '.join(words), "mean")
        groups = favstats.column(0)
        q1 = make_array()
        for i in np.arange(len(groups)):
            q1 = np.append(q1, np.percentile(table.where(group, groups.item(i)).column(data), 25))
        q3 = make_array()
        for i in np.arange(len(groups)):
            q3 = np.append(q3, np.percentile(table.where(group, groups.item(i)).column(data), 75))
        favstats = favstats.with_column('std', cut.group(group, stats.tstd).sort(group).column(1))
        favstats = favstats.with_column('min', cut.group(group, min).sort(group).column(1))
        favstats = favstats.with_column('Q1', q1)
        favstats = favstats.with_column('median', cut.group(group, np.median).sort(group).column(1))
        favstats = favstats.with_column('Q3', q3)
        favstats = favstats.with_column('max', cut.group(group, max).sort(group).column(1))
        favstats = favstats.with_column('IQR', cut.group(group, stats.iqr).sort(group).column(1))
        favstats = favstats.with_column('n', cut.group(group).sort(group).column(1))
        return favstats

    from hide_code3 import hide_code
    hide_code()

<IPython.core.display.HTML object>

2 Reading in an existing table

name = Table.read_table('table name')

3 .select

Selects one or more columns from a table by label or by numerical position (the column's index, starting at 0) and returns a new table that contains only the selected columns. (By contrast, .item(1), .item(20), etc. pick a single value out of an array by position.)

Ex: new_cars = cars.select('name', 'highway_mpg')

4 .labels and .relabel

.labels: lists the column labels in a table.

.relabel: changes the column heading given in the first argument to the one given in the second, modifying the existing table. To get a copy of the table with the modified label and leave the original unchanged, use .relabeled instead.

5 .column

Returns the values of a column as an array.

Ex: sf.column('height')
outputs: the values in the column 'height' as an array

6 .with_column

Returns a copy of a table with an added column. It takes the new column's label followed by its values.

Ex: southside.with_column('Blocks from campus', values), where values is an array with one entry per row
out: a copy of the table 'southside' with the column 'Blocks from campus' added

7 np.mean

Returns the mean value of an array.

[34]:
    #example
    survey = Table.read_table('welcome_survey_v4.csv')
    survey.group('Sleep position', np.mean)
[34]:
Sleep position     | Year mean | Extraversion mean | Number of textees mean | Hours of sleep mean | Handedness mean
On your back       |           | 5.57947           | 6.60596                | 7.11175             |
On your left side  |           | 5.66845           | 7.46791                | 7.00535             |
On your right side |           | 5.89401           | 7.50922                | 7.05645             |
On your stomach    |           | 5.96226           | 7.58491                | 7.15094             |

8 stats.tstd (standard deviation)

The standard deviation (SD) is the square root of the variance:

sd = variance ** 0.5

stats.tstd(values) returns the sample standard deviation.

WARNING: np.std computes the population standard deviation, not the sample standard deviation.

9 np.percentile

Use np.percentile to find a percentile of a data set, for example the 50th percentile (the median) of a short data set.

The Interquartile Range: IQR = Q3 - Q1

10 np.average

Returns the mean value of an array.

Ex: np.average(sf.column('height'))
outputs: the average (226.470)

11 np.arange

Returns an array of evenly spaced values. np.arange(stop) counts from 0 up to, but not including, stop. np.arange(start, stop, step) counts from start in steps of step and stops before reaching stop.

[35]:
    #Ex:
    make_array(0, 1, 2, 3, 4, 5, 6)
    np.arange(6)

[35]: array([0, 1, 2, 3, 4, 5])

[36]:
    #Ex 2
    np.arange(1, 21, 2)

[36]: array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19])
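As a further illustration of how np.arange treats its arguments, here is a minimal sketch. The variable names are made up for this example. It shows the two-argument form and the fact that the stop value is never included.

```python
import numpy as np

# Two arguments: start and stop, with an implied step of 1.
two_to_six = np.arange(2, 7)

# The stop value itself is never included, even when the step lands on it.
tens = np.arange(0, 30, 10)

print(two_to_six)  # [2 3 4 5 6]
print(tens)        # [ 0 10 20]
```

In both cases the last value is the largest one that can be reached from start without hitting or passing stop.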
[37]:
    #Ex 3
    np.arange(15, 7, -1)

[37]: array([15, 14, 13, 12, 11, 10, 9, 8])

12 .take

Returns a copy of the original table with only the rows at the given indices.

13 .group

Groups rows by the unique values, or combinations of values, in one or more columns.
output: a new table

[38]:
    survey = Table.read_table('welcome_survey_v4.csv')
    survey.group('Sleep position')

[38]:
Sleep position     | count
On your back       | 302
On your left side  | 374
On your right side | 434
On your stomach    | 212

14 .pivot

Returns a table in which each distinct value in the first column named has its own column and each distinct value in the second column named has its own row. Each cell counts the rows that have that combination of values.

[39]:
    #Example:
    survey.pivot('Sleep position', 'Handedness')

/opt/conda/lib/python3.8/site-packages/datascience/tables.py:920: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  values = np.array(tuple(values))

[39]:
Handedness   | On your back | On your left side | On your right side | On your stomach
Both         | 4            | 5                 | 7                  | 1
Left-handed  | 31           | 27                | 30                 | 12
Right-handed | 267          | 342               | 397                | 199
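To make the counting behind a pivot table concrete, the sketch below reproduces the idea in plain Python with collections.Counter. It is not how the datascience library implements .pivot, and the four rows of data are invented for illustration.

```python
from collections import Counter

# Hypothetical (sleep position, handedness) pairs, one per survey row.
rows = [
    ("On your back", "Right-handed"),
    ("On your back", "Left-handed"),
    ("On your stomach", "Right-handed"),
    ("On your back", "Right-handed"),
]

# Count how many rows share each combination of the two values.
counts = Counter(rows)

column_values = sorted({position for position, _ in rows})  # become columns
row_values = sorted({hand for _, hand in rows})             # become rows

# Print one line per row value, with one count per column value.
print("Handedness | " + " | ".join(column_values))
for hand in row_values:
    cells = [str(counts[(position, hand)]) for position in column_values]
    print(hand + " | " + " | ".join(cells))
```

Combinations that never occur, such as a left-handed stomach sleeper in this made-up data, show up as 0.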
15 .num_rows and .num_columns

.num_rows: the number of rows in a table.
.num_columns: the number of columns in a table.

[40]:
    #example
    survey.where("Sleep position", 'On your back').num_rows

[40]: 302

16 .where

Creates a copy of a table that shows only the rows that match what you asked for (the predicate).

[41]:
    #EXAMPLE:
    skyscrapers = Table.read_table('skyscrapers.csv')
    skyscrapers.where('city', 'Los Angeles')

[41]:
name                  | material | city        | height | completed
U.S. Bank Tower       | steel    | Los Angeles | 310.29 | 1990
Aon Center            | steel    | Los Angeles | 261.52 | 1974
Two California Plaza  | steel    | Los Angeles | 228.6  | 1992
Gas Company Tower     | steel    | Los Angeles | 228.3  | 1991
Bank of America Plaza | steel    | Los Angeles | 224.03 | 1975
777 Tower             | steel    | Los Angeles | 221    | 1991
Wells Fargo Tower     | steel    | Los Angeles | 220.37 | 1983
Figueroa at Wilshire  | steel    | Los Angeles | 218.54 | 1989
City National Tower   | steel    | Los Angeles | 213.06 | 1971
Paul Hastings Tower   | steel    | Los Angeles | 213.06 | 1971
… (1 rows omitted)

17 .sort

Creates a copy of a table sorted by the values in a column. Defaults to ascending order unless descending=True is included.

Ex: skyscrapers.where('city', 'New York City').sort('completed')
results in the rows being sorted by year 'completed' from earliest to latest

18 type

The type function tells you what type of object Python considers something to be.

Ex: type(4.5)
output: float
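A few more calls to type, as a small sketch, show how similar-looking values can have different types. The values are arbitrary examples. Note that print displays the type in its `<class '...'>` form, whereas a notebook cell shows just the name.

```python
import numpy as np

print(type(4.5))           # <class 'float'>
print(type(4))             # <class 'int'>
print(type("4.5"))         # <class 'str'>
print(type(np.arange(3)))  # <class 'numpy.ndarray'>
```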
19 are.between, are.containing, are.equal_to

ex:
  • movies.where('Year', are.between(2000, 2005))
  • movies.where('#1 Movie', are.containing('Harry Potter'))
  • movies.where('Year', are.equal_to(2002))

20 Functions of Arrays

make_array: makes a numpy array with the values passed in (can be numbers or words, but not both together)
len(array): tells the length of that array
sum(array): returns the sum of the values in that array
etc…

[42]:
    #example
    my_array = make_array(1, 2, 3, 4)
    my_array

[42]: array([1, 2, 3, 4])

21 Graphs

[43]:
    #BAR GRAPH EX:
    top_movies = Table.read_table('top_movies_2017.csv')
    studios = top_movies.select('Studio')
    studios
    studio_distribution = studios.group('Studio')
    #actual code to get bar graph
    studio_distribution.barh('Studio')
22 .scatter

[44]:
    studios = top_movies.select('Studio')
    studios
    studio_distribution = studios.group('Studio')
    #actual code to get scatter
    studio_distribution.scatter('Studio')

[45]:
    #HISTOGRAM EX
    baby = Table.read_table("baby.csv")
    baby.hist('Birth Weight', group='Maternal Smoker')
23 Boxplots

Box plots are a graphical display of the descriptive statistics that we know as the Five Number Summary.

The Five Number Summary is as follows:
  • The minimum
  • The 25th percentile, Q1
  • The 50th percentile, the median
  • The 75th percentile, Q3
  • The maximum

[46]:
    #EXAMPLE
    pets = make_array(4, 2, 4, 1, 2, 6, 1, 0, 1, 3, 2, 4, 10, 1, 2, 4, 12, 3, 1, 3, 4, 0, 0)
    plots.boxplot(pets, widths=0.4);
    plots.xlabel("class pets");
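The numbers that the box plot draws can also be computed directly. The sketch below does this for the same pets data. It uses np.array in place of make_array so that it runs with numpy alone, and the variable names are chosen for this example.

```python
import numpy as np

# Same values as the pets array in the box plot example.
pets = np.array([4, 2, 4, 1, 2, 6, 1, 0, 1, 3, 2, 4,
                 10, 1, 2, 4, 12, 3, 1, 3, 4, 0, 0])

# Five Number Summary.
minimum = np.min(pets)
q1 = np.percentile(pets, 25)
median = np.percentile(pets, 50)
q3 = np.percentile(pets, 75)
maximum = np.max(pets)

# Interquartile range: the width of the box.
iqr = q3 - q1

print(minimum, q1, median, q3, maximum)  # 0 1.0 2.0 4.0 12
print(iqr)                               # 3.0
```

The box spans Q1 to Q3, so its width equals the IQR of 3, and the line inside the box sits at the median of 2.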
24 .plot

Draws a line graph consisting of one point for each row of the table.

Example: everyone.plot('AGE', '2010')
output: a line graph with the 'AGE' column on the x-axis and the values in the '2010' column on the y-axis