Homework 3C

School: University of Arkansas
Course: 4143
Subject: Computer Science
Date: Dec 6, 2023
Uploaded by SuperBatMaster87

Homework 3

Problem 3

Consider the following dataset shown in Table 1.

Samples   Feature 1   Feature 2
1         0           0
2         0           1
3         1           0
4         1           1

Table 1: Simple Dataset for Problem 3

Answer the following questions:

1. Suppose that
• we assign Samples 1 and 2 to the first cluster (Cluster 1) and Samples 3 and 4 to the second cluster (Cluster 2), i.e., C1 = {1, 2}, C2 = {3, 4}, and
• we use Euclidean distance to calculate how “dissimilar” two points are.
Use the definition of S(C) introduced in 3-clustering.pdf to compute S(C1) + S(C2) and show your work.

2. Suppose that
• we assign Samples 1, 2, and 3 to Cluster 1 and Sample 4 to Cluster 2, and
• we use squared Euclidean distance to calculate how “dissimilar” two points are.
Use the definition of S(C) introduced in 3-clustering.pdf to compute S(C1) + S(C2) and show your work.
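The definition of S(C) lives in 3-clustering.pdf and is not reproduced here. The sketch below assumes S(C) is the sum of pairwise dissimilarities inside a cluster divided by the number of samples in it, which is an inference from the worked values in Problem 4 (0.5 for a two-sample cluster whose points are one unit apart, 4/3 for a three-sample cluster). The names `POINTS`, `dissimilarity`, `S`, and `total_score` are illustrative only.

```python
from itertools import combinations
from math import sqrt

# Table 1: sample id -> (Feature 1, Feature 2)
POINTS = {1: (0, 0), 2: (0, 1), 3: (1, 0), 4: (1, 1)}


def dissimilarity(a, b, squared=True):
    """Squared Euclidean distance by default, plain Euclidean otherwise."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return sq if squared else sqrt(sq)


def S(cluster, squared=True):
    """ASSUMED definition: (sum of pairwise dissimilarities in C) / |C|."""
    pairs = combinations(sorted(cluster), 2)
    total = sum(dissimilarity(POINTS[i], POINTS[j], squared) for i, j in pairs)
    return total / len(cluster)


def total_score(clusters, squared=True):
    """S(C1) + S(C2) + ... for a list of clusters given as sets of sample ids."""
    return sum(S(c, squared) for c in clusters)


print(total_score([{1, 2}, {3, 4}], squared=True))  # 1.0
```

Passing `squared=False` switches to plain Euclidean distance, so the same helper covers both variants of the question. If the course definition of S(C) differs from the assumption above, only the body of `S` needs to change.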

Problem 4

1. If the goal is to assign all four data samples into 2 clusters, list all candidate clusterings (i.e., all possible ways to assign the four samples into 2 clusters). For example, C1 = {1, 2}, C2 = {3, 4} could be one candidate clustering; C1 = {1}, C2 = {2, 3, 4} could be another, and what else? (Please show your work.)

Options:
Option 1: C1 = {1}, C2 = {2, 3, 4}
Option 2: C1 = {1, 2}, C2 = {3, 4}
Option 3: C1 = {1, 2, 3}, C2 = {4}
Option 4: C1 = {1, 3}, C2 = {2, 4}
Option 5: C1 = {1, 2, 4}, C2 = {3}
Option 6: C1 = {1, 4}, C2 = {2, 3}
Option 7: C1 = {2}, C2 = {1, 3, 4}

2. Compute the total “dissimilarity” scores, i.e., S(C1) + S(C2), of all the candidate clusterings above (using squared Euclidean distance). What is the best way to divide the four data points into 2 clusters?

Options:
Option 1: C1 = {1}, C2 = {2, 3, 4}: S(C1) + S(C2) = 0 + 4/3 = 4/3
Option 2: C1 = {1, 2}, C2 = {3, 4}: S(C1) + S(C2) = 0.5 + 0.5 = 1
Option 3: C1 = {1, 2, 3}, C2 = {4}: S(C1) + S(C2) = 4/3 + 0 = 4/3
Option 4: C1 = {1, 3}, C2 = {2, 4}: S(C1) + S(C2) = 0.5 + 0.5 = 1
Option 5: C1 = {1, 2, 4}, C2 = {3}: S(C1) + S(C2) = 4/3 + 0 = 4/3
Option 6: C1 = {1, 4}, C2 = {2, 3}: S(C1) + S(C2) = 1 + 1 = 2
Option 7: C1 = {2}, C2 = {1, 3, 4}: S(C1) + S(C2) = 0 + 4/3 = 4/3

The best ways to divide the four data points into 2 clusters are Options 2 and 4, because their total dissimilarity score of 1 is the lowest among the seven candidates.
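The enumeration and scoring above can be cross-checked with a short script. This is a minimal sketch, not the course's reference solution: it assumes S(C) is the sum of pairwise squared Euclidean distances within a cluster divided by the cluster size (a definition inferred from the scores listed above, since 3-clustering.pdf is not reproduced here), and the helper names are hypothetical.

```python
from itertools import combinations

# Table 1: sample id -> (Feature 1, Feature 2)
POINTS = {1: (0, 0), 2: (0, 1), 3: (1, 0), 4: (1, 1)}


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def S(cluster):
    """ASSUMED definition: (sum of pairwise squared distances in C) / |C|."""
    pairs = combinations(sorted(cluster), 2)
    return sum(sq_dist(POINTS[i], POINTS[j]) for i, j in pairs) / len(cluster)


def two_cluster_partitions(samples):
    """Yield every split of the samples into two non-empty clusters, once each."""
    samples = sorted(samples)
    first, rest = samples[0], samples[1:]
    # Keep the first sample in C1 so that swapping C1 and C2 is not counted twice.
    # r < len(rest) guarantees that C2 keeps at least one sample.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            c1 = {first, *extra}
            yield c1, set(samples) - c1


scores = {}
for c1, c2 in two_cluster_partitions(POINTS):
    scores[(tuple(sorted(c1)), tuple(sorted(c2)))] = S(c1) + S(c2)

best = min(scores.values())
best_partitions = [p for p, s in scores.items() if abs(s - best) < 1e-12]

for (c1, c2), score in scores.items():
    print(c1, c2, round(score, 4))
print("lowest total:", best, "achieved by", best_partitions)
```

Fixing one sample in C1 is what keeps the count at seven rather than fourteen, since the two clusters are unlabeled. Under the assumed definition, the lowest total is 1, reached by {1, 2} | {3, 4} and {1, 3} | {2, 4}.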

Problem 5

Download USArrests.csv from the Data folder on Blackboard Learn, and complete the following tasks:

1. How many rows and columns does this data set have?
2. Using only two features, “Murder” and “Assault”, perform a K-means clustering analysis in Python. Here, we let K = 2 and set n_init = 200. Plot the two clusters generated by Python (differentiate the two clusters using different colors) and insert the figure below.


