CSCI_5080_Assignment_07
Austin Peay State University, CSCI 5080 (Computer Science), Jan 9, 2024
Question 1

Points: A1(2, 10), A2(2, 5), A3(8, 4), B1(5, 8), B2(7, 5), B3(6, 4), C1(1, 2), C2(4, 9).

Using R to compute the Euclidean distance matrix:

> data_hmwk7 <- matrix(c(2,10, 2,5, 8,4, 5,8, 7,5, 6,4, 1,2, 4,9), byrow = T, nrow = 8)
> hmwk7 <- dist(data_hmwk7, method = "euclidean")
> hmwk7
          1        2        3        4        5        6        7
2  5.000000
3  8.485281 6.082763
4  3.605551 4.242641 5.000000
5  7.071068 5.000000 1.414214 3.605551
6  7.211103 4.123106 2.000000 4.123106 1.414214
7  8.062258 3.162278 7.280110 7.211103 6.708204 5.385165
8  2.236068 4.472136 6.403124 1.414214 5.000000 5.385165 7.615773

First iteration (initial centers A1(2, 10), B1(5, 8), C1(1, 2)):

Point      d(A1)      d(B1)      d(C1)      Cluster
A1 (2,10)  0          3.605551   8.062258   1
A2 (2,5)   5.000000   4.242641   3.162278   3
A3 (8,4)   8.485281   5.000000   7.280110   2
B1 (5,8)   3.605551   0          7.211103   2
B2 (7,5)   7.071068   3.605551   6.708204   2
B3 (6,4)   7.211103   4.123106   5.385165   2
C1 (1,2)   8.062258   7.211103   0          3
C2 (4,9)   2.236068   1.414214   7.615773   2

(a.) After the first iteration, the three clusters are (1) {A1}, (2) {A3, B1, B2, B3, C2}, and (3) {A2, C1}. Each new center is the mean of its members' x- and y-values, giving (1) (2, 10), (2) (6, 6), and (3) (1.5, 3.5).
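Individual pairwise distances can be pulled out of the dist object programmatically rather than read off the printed triangle, by coercing it to a full symmetric matrix. A short sketch (object names as in the appendix; the matrix variable dm is introduced here for illustration):

```r
# Rebuild the distance object from Question 1
data_hmwk7 <- matrix(c(2,10, 2,5, 8,4, 5,8, 7,5, 6,4, 1,2, 4,9),
                     byrow = TRUE, nrow = 8)
hmwk7 <- dist(data_hmwk7, method = "euclidean")

# Coerce to an 8 x 8 symmetric matrix; entry [i, j] is d(point i, point j)
dm <- as.matrix(hmwk7)

dm[1, 4]  # A1 to B1 (rows 1 and 4): 3.605551
dm[2, 7]  # A2 to C1 (rows 2 and 7): 3.162278
```

These two entries are exactly the d(B1) and d(C1) values used for A1 and A2 in the first-iteration table.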
(b.) Second iteration (centers (2, 10), (6, 6), (1.5, 3.5)):

Point      d(c1)      d(c2)      d(c3)      Cluster
A1 (2,10)  0          5.656854   6.519202   1
A2 (2,5)   5.000000   4.123106   1.581139   3
A3 (8,4)   8.485281   2.828427   6.519202   2
B1 (5,8)   3.605551   2.236068   5.700877   2
B2 (7,5)   7.071068   1.414214   5.700877   2
B3 (6,4)   7.211103   2.000000   4.527693   2
C1 (1,2)   8.062258   6.403124   1.581139   3
C2 (4,9)   2.236068   3.605551   6.041523   1

The clusters become (1) {A1, C2}, (2) {A3, B1, B2, B3}, and (3) {A2, C1}, with updated centers (1) (3, 9.5), (2) (6.5, 5.25), and (3) (1.5, 3.5).

Third iteration (centers (3, 9.5), (6.5, 5.25), (1.5, 3.5)):

Point      d(c1)      d(c2)      d(c3)      Cluster
A1 (2,10)  1.118034   6.543126   6.519202   1
A2 (2,5)   4.609772   4.506939   1.581139   3
A3 (8,4)   7.433034   1.952562   6.519202   2
B1 (5,8)   2.500000   3.132491   5.700877   1
B2 (7,5)   6.020797   0.559017   5.700877   2
B3 (6,4)   6.264982   1.346291   4.527693   2
C1 (1,2)   7.762087   6.388466   1.581139   3
C2 (4,9)   1.118034   4.506939   6.041523   1

The final three clusters are (1) {A1, B1, C2}, (2) {A3, B2, B3}, and (3) {A2, C1}.

Question 2:

Clustering-Based SVM (CB-SVM) is designed for very large datasets, where a traditional Support Vector Machine (SVM) cannot feasibly be trained on the entire dataset. CB-SVM applies a hierarchical micro-clustering algorithm that scans the entire dataset only once, providing the SVM with high-quality samples that carry statistical summaries of the data. This maximizes the learning benefit to the SVM while keeping training scalable.

The core idea is to use the hierarchical micro-clusters to generate fine-grained descriptions of the data close to the decision boundary and coarser descriptions farther away. The algorithm begins by constructing two micro-cluster trees from the positive and negative training data; each higher-level node in a tree is a summarized representation of its children. Once the trees are built, CB-SVM trains an initial SVM from the root nodes only. After this "rough" boundary is established, it selectively declusters only the data summaries near the boundary into lower (finer) levels of the tree. The hierarchical representation of the data summaries is what makes this selective declustering efficient, and the process repeats until the leaf level is reached.

CB-SVM is valuable for analyzing very large datasets, including streaming data and large data warehouses, where random sampling can hurt performance because important data may occur infrequently or the patterns may be irregular. The algorithm greatly reduces the total number of data points used for SVM training while preserving high-quality support vectors (SVs) that describe the boundary well. Whereas traditional selective sampling must scan the entire dataset at each round, CB-SVM operates on the CF tree, constructed in a single scan of the data, whose statistical summaries support efficient and effective construction of the SVM boundary.

The CB-SVM algorithm can be outlined as follows:
1. Construct two CF trees independently from the positive and negative datasets.
2. Train an SVM boundary function from the centroids of the root entries of the CF trees. If the root node contains too few entries, train from the entries at the second level instead.
3. Decluster the entries near the boundary into the next level, accumulating the children entries declustered from parent entries into the training set together with the non-declustered parent entries.
4. Construct another SVM from the centroids of the entries in the training set, and repeat from step 3 until no further entries are accumulated.

Time spent: 2 hrs.
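The decluster-and-retrain loop in steps 2 to 4 can be sketched in base R. This is only a toy illustration under stand-in assumptions: kmeans() plays the role of the CF-tree micro-clusters, and a logistic boundary (glm) stands in for the SVM; all data, thresholds, and object names here are hypothetical, and a real CB-SVM uses actual CF trees and an SVM solver:

```r
set.seed(1)

# Hypothetical 2-D training data: 200 positive and 200 negative points
pos <- cbind(rnorm(200, mean =  1.5), rnorm(200, mean =  1.5))
neg <- cbind(rnorm(200, mean = -1.5), rnorm(200, mean = -1.5))

# Step 1 (stand-in): micro-cluster each class; the centroids act as the
# coarse statistical summaries a CF tree's root entries would provide
k <- 5
pos_mc <- kmeans(pos, centers = k)
neg_mc <- kmeans(neg, centers = k)

# Step 2: train a boundary from the centroids only
# (logistic regression via glm stands in for the SVM)
centroids <- data.frame(x1 = c(pos_mc$centers[, 1], neg_mc$centers[, 1]),
                        x2 = c(pos_mc$centers[, 2], neg_mc$centers[, 2]),
                        y  = c(rep(1, k), rep(0, k)))
fit <- glm(y ~ x1 + x2, family = binomial, data = centroids)

# Step 3: decluster summaries near the rough boundary: a centroid whose
# predicted probability is close to 0.5 is replaced by its raw member points
p    <- predict(fit, newdata = centroids, type = "response")
near <- abs(p - 0.5) < 0.45          # illustrative threshold

expand_near <- function(data, mc, near_mask) {
  fine   <- data[mc$cluster %in% which(near_mask), , drop = FALSE]
  coarse <- mc$centers[!near_mask, , drop = FALSE]
  rbind(coarse, fine)  # raw points near the boundary, summaries far away
}
pos_fine <- expand_near(pos, pos_mc, near[1:k])
neg_fine <- expand_near(neg, neg_mc, near[(k + 1):(2 * k)])

# Step 4: retrain on the refined training set (one refinement round shown;
# CB-SVM repeats until no new entries are accumulated)
refined <- data.frame(x1 = c(pos_fine[, 1], neg_fine[, 1]),
                      x2 = c(pos_fine[, 2], neg_fine[, 2]),
                      y  = c(rep(1, nrow(pos_fine)), rep(0, nrow(neg_fine))))
fit2 <- glm(y ~ x1 + x2, family = binomial, data = refined)
```

The key property the sketch preserves is that the second fit sees fine-grained data only where the first boundary passes, which is what keeps CB-SVM's training set small.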
PART 2

Lesson 3: [screenshot]
Lesson 4

TM_Decision Tree Model:
Decision Tree Tab: [screenshot]
Dependency Network Tab: [screenshot]
TM_Clustering Model:
Cluster Diagram Tab: [screenshot]
Cluster Profiles Tab: [screenshot]
Cluster Characteristics Tab: [screenshot]
Cluster Discrimination Tab: [screenshot]

TM_Naïve Bayes Model:
Dependency Network Tab: [screenshot]
Attribute Profile Tab: [screenshot]
Attribute Characteristics Tab: [screenshot]
Attribute Discrimination Tab: [screenshot]

Time spent: 4 hours.
APPENDIX

R code for Question 1

> data_hmwk7 <- matrix(c(2,10, 2,5, 8,4, 5,8, 7,5, 6,4, 1,2, 4,9), byrow = T, nrow = 8)
> hmwk7 <- dist(data_hmwk7, method = "euclidean")
> hmwk7
          1        2        3        4        5        6        7
2  5.000000
3  8.485281 6.082763
4  3.605551 4.242641 5.000000
5  7.071068 5.000000 1.414214 3.605551
6  7.211103 4.123106 2.000000 4.123106 1.414214
7  8.062258 3.162278 7.280110 7.211103 6.708204 5.385165
8  2.236068 4.472136 6.403124 1.414214 5.000000 5.385165 7.615773
> # 2nd iteration
> a  = c(6, 6)      # cluster 2 center
> a1 = c(2, 10)
> a2 = c(2, 5)
> a3 = c(8, 4)
> a4 = c(5, 8)
> a5 = c(7, 5)
> a6 = c(6, 4)
> a7 = c(1, 2)
> a8 = c(4, 9)
> b  = c(1.5, 3.5)  # cluster 3 center
> dist(rbind(a, a1), method = "euclidean")
          a
a1 5.656854
> dist(rbind(a, a2), method = "euclidean")
          a
a2 4.123106
> dist(rbind(a, a3), method = "euclidean")
          a
a3 2.828427
> dist(rbind(a, a4), method = "euclidean")
          a
a4 2.236068
> dist(rbind(a, a5), method = "euclidean")
          a
a5 1.414214
> dist(rbind(a, a6), method = "euclidean")
   a
a6 2
> dist(rbind(a, a7), method = "euclidean")
          a
a7 6.403124
> dist(rbind(a, a8), method = "euclidean")
          a
a8 3.605551
> # Distances to the cluster 3 center b = (1.5, 3.5)
> dist(rbind(b, a1), method = "euclidean")
          b
a1 6.519202
> dist(rbind(b, a2), method = "euclidean")
          b
a2 1.581139
> dist(rbind(b, a3), method = "euclidean")
          b
a3 6.519202
> dist(rbind(b, a4), method = "euclidean")
          b
a4 5.700877
> dist(rbind(b, a5), method = "euclidean")
          b
a5 5.700877
> dist(rbind(b, a6), method = "euclidean")
          b
a6 4.527693
> dist(rbind(b, a7), method = "euclidean")
          b
a7 1.581139
> dist(rbind(b, a8), method = "euclidean")
          b
a8 6.041523
> # 3rd iteration
> c1 = c(3, 9.5)
> c2 = c(6.5, 5.25)
> c3 = c(1.5, 3.5)
> # Distances to c1
> dist(rbind(c1, a1), method = "euclidean")
         c1
a1 1.118034
> dist(rbind(c1, a2), method = "euclidean")
         c1
a2 4.609772
> dist(rbind(c1, a3), method = "euclidean")
         c1
a3 7.433034
> dist(rbind(c1, a4), method = "euclidean")
    c1
a4 2.5
> dist(rbind(c1, a5), method = "euclidean")
         c1
a5 6.020797
> dist(rbind(c1, a6), method = "euclidean")
         c1
a6 6.264982
> dist(rbind(c1, a7), method = "euclidean")
         c1
a7 7.762087
> dist(rbind(c1, a8), method = "euclidean")
         c1
a8 1.118034
> # Distances to c2
> dist(rbind(c2, a1), method = "euclidean")
         c2
a1 6.543126
> dist(rbind(c2, a2), method = "euclidean")
         c2
a2 4.506939
> dist(rbind(c2, a3), method = "euclidean")
         c2
a3 1.952562
> dist(rbind(c2, a4), method = "euclidean")
         c2
a4 3.132491
> dist(rbind(c2, a5), method = "euclidean")
         c2
a5 0.559017
> dist(rbind(c2, a6), method = "euclidean")
         c2
a6 1.346291
> dist(rbind(c2, a7), method = "euclidean")
         c2
a7 6.388466
> dist(rbind(c2, a8), method = "euclidean")
         c2
a8 4.506939
> # Distances to c3
> dist(rbind(c3, a1), method = "euclidean")
         c3
a1 6.519202
> dist(rbind(c3, a2), method = "euclidean")
         c3
a2 1.581139
> dist(rbind(c3, a3), method = "euclidean")
         c3
a3 6.519202
> dist(rbind(c3, a4), method = "euclidean")
         c3
a4 5.700877
> dist(rbind(c3, a5), method = "euclidean")
         c3
a5 5.700877
> dist(rbind(c3, a6), method = "euclidean")
         c3
a6 4.527693
> dist(rbind(c3, a7), method = "euclidean")
         c3
a7 1.581139
> dist(rbind(c3, a8), method = "euclidean")
         c3
a8 6.041523
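As a cross-check on the hand-traced iterations in Question 1, base R's kmeans() started from the same initial centers (A1, B1, C1) reproduces the final clustering. A short sketch (the variable names init and km are introduced here for illustration):

```r
data_hmwk7 <- matrix(c(2,10, 2,5, 8,4, 5,8, 7,5, 6,4, 1,2, 4,9),
                     byrow = TRUE, nrow = 8)
rownames(data_hmwk7) <- c("A1", "A2", "A3", "B1", "B2", "B3", "C1", "C2")

# Initial centers A1(2, 10), B1(5, 8), C1(1, 2), as in Question 1
init <- rbind(c(2, 10), c(5, 8), c(1, 2))

# algorithm = "Lloyd" matches the assign-then-update scheme traced by hand
km <- kmeans(data_hmwk7, centers = init, algorithm = "Lloyd")

split(rownames(data_hmwk7), km$cluster)  # cluster memberships
km$centers                               # final centers
```

The memberships returned should match the hand result: {A1, B1, C2}, {A3, B2, B3}, and {A2, C1}.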