project3

School: University of California, Berkeley
Subject: Computer Science
Date: Jul 2, 2024
Type: pdf
Uploaded by CoachSheepPerson165

Question 1.2.2 Choose two different words in the dataset with a magnitude (absolute value) of correlation higher than 0.2 and plot a scatter plot with a line of best fit for them. Please do not pick “outer” and “space” or “san” and “francisco”. The code to plot the scatter plot and line of best fit is given for you; you just need to calculate the correct values for r, slope and intercept.

Hint 1: It’s easier to think of words with a positive correlation, i.e. words that are often mentioned together. Try to think of common phrases or idioms.

Hint 2: Refer to Section 15.2 of the textbook for the formulas. For additional past examples of regression, see Homework 9.

In [62]:

    word_x = 'blue'
    word_y = 'moon'

    # These arrays should make your code cleaner!
    arr_x = movies.column(word_x)
    arr_y = movies.column(word_y)

    x_su = standard_units(arr_x)
    y_su = standard_units(arr_y)

    r = np.mean(x_su * y_su)

    slope = r * np.std(arr_y) / np.std(arr_x)
    intercept = np.mean(arr_y) - slope * np.mean(arr_x)

    # DON'T CHANGE THESE LINES OF CODE
    movies.scatter(word_x, word_y)
    max_x = max(movies.column(word_x))
    plots.title(f"Correlation: {r}, magnitude greater than .2: {abs(r) >= 0.2}")
    plots.plot([0, max_x * 1.3], [intercept, intercept + slope * (max_x * 1.3)], color='gold');
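As a rough illustration of the regression formulas used in Question 1.2.2, the sketch below recomputes r, slope and intercept with only the Python standard library. The helper names correlation and fit_line and the sample numbers are assumptions made for illustration, and this standard_units is a guess at what the course helper does (subtract the mean, divide by the population SD). It is not the notebook's own code.

```python
from statistics import fmean, pstdev


def standard_units(values):
    """Convert values to standard units: (value - mean) / population SD."""
    mu = fmean(values)
    sd = pstdev(values)  # population SD, matching NumPy's default np.std
    return [(v - mu) / sd for v in values]


def correlation(xs, ys):
    """r is the mean of the products of the paired standard units."""
    products = [a * b for a, b in zip(standard_units(xs), standard_units(ys))]
    return fmean(products)


def fit_line(xs, ys):
    """Return (r, slope, intercept) of the regression line for ys on xs."""
    r = correlation(xs, ys)
    slope = r * pstdev(ys) / pstdev(xs)
    intercept = fmean(ys) - slope * fmean(xs)
    return r, slope, intercept


# Made-up data lying exactly on y = 2x + 1, so r should come out close to 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
r, slope, intercept = fit_line(xs, ys)
print(round(r, 6), round(slope, 6), round(intercept, 6))
```

With real word-frequency columns, the same three values would feed the plotting lines in the cell; a pair of words passes the question's check when abs(r) is at least 0.2.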

Question 1.3.1 Draw a horizontal bar chart with two bars that show the proportion of Comedy movies in each dataset (train_movies and test_movies). The two bars should be labeled “Training” and “Test”. Complete the function comedy_proportion first; it should help you create the bar chart.

Hint: Refer to Section 7.1 of the textbook if you need a refresher on bar charts.

In [66]:

    def comedy_proportion(table):
        # Return the proportion of movies in a table that have the comedy genre.
        movie_len = table.num_rows
        movie_group = table.group('Genre').where('Genre', are.equal_to('comedy')).column('count').i…
        return movie_group / movie_len

    # The staff solution took multiple lines. Start by creating a table.
    # If you get stuck, think about what sort of table you need for barh to work
    comedy_proportion_t = comedy_proportion(train_movies)
    comedy_proportion_test = comedy_proportion(test_movies)
    comedy_tbl = Table().with_columns('Categories', make_array('Training', 'Test'), 'Proportions', …
    comedy_tbl.barh('Categories')
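The proportion calculation in Question 1.3.1 can be sketched without the course's table library. In the version below, comedy_proportion takes a plain list of genre labels rather than a table, and the genre lists are made-up data. Both are assumptions for illustration, not the notebook's code.

```python
def comedy_proportion(genres):
    """Proportion of entries in a list of genre labels equal to 'comedy'."""
    return sum(1 for g in genres if g == 'comedy') / len(genres)


# Made-up genre labels standing in for the 'Genre' columns of the two tables.
train_genres = ['comedy', 'thriller', 'comedy', 'thriller']
test_genres = ['comedy', 'thriller', 'thriller', 'thriller']

# One (label, proportion) row per bar, the shape a two-bar chart needs.
rows = [
    ('Training', comedy_proportion(train_genres)),
    ('Test', comedy_proportion(test_genres)),
]
for label, proportion in rows:
    print(f"{label}: {proportion:.2f}")
```

Each row pairs a category label with one number, which is the two-column layout the hint about barh points toward.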

Question 3.1.7 In two sentences or less, describe how you selected your features.

I selected these features based on the slope, looking at other words around the middle of the slope, which helps ensure that these words appear at least once in each movie.

Question 3.3.3 Do you see a pattern in the types of movies your classifier misclassifies? In two sentences or less, describe any patterns you see in the results or any other interesting findings from the table above. If you need some help, try looking up the movies that your classifier got wrong on Wikipedia.

One pattern I see in the data is that comedy movies cover things that would also appear in a horror or thriller movie.

Question 4.2 Do you see a pattern in the mistakes your new classifier makes? How good an accuracy were you able to get with your limited classifier? Did you notice an improvement from your first classifier to the second one? Describe in two sentences or less.

Hint: You may not be able to see a pattern.

I did not really notice a pattern at first, but when I double-checked I saw that the proportion with the new classifier is higher than with my first one.
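The comparison described in Question 4.2 can be illustrated with a minimal sketch, assuming the "proportion" in the answer means the proportion of test movies classified correctly. The accuracy helper and the label lists below are hypothetical stand-ins, not results from the project.

```python
def accuracy(predicted, actual):
    """Proportion of predictions that match the true labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Made-up true labels and guesses from two hypothetical classifiers.
actual = ['comedy', 'thriller', 'comedy', 'thriller']
first_guesses = ['comedy', 'comedy', 'thriller', 'thriller']
second_guesses = ['comedy', 'thriller', 'thriller', 'thriller']

first_acc = accuracy(first_guesses, actual)
second_acc = accuracy(second_guesses, actual)
improved = second_acc > first_acc
print(first_acc, second_acc, improved)
```

Listing the rows where the guess and the true label differ is one way to start looking for a pattern in the mistakes.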

Question 4.3 Given the constraint of five words, how did you select those five? Describe in two sentences or less.

I chose my words by zooming into the…
