Test: Exam2 bcis 466 | Quizlet

School

New Mexico State University

Course

466

Subject

Computer Science

Date

Dec 6, 2023

Uploaded by PrivateSnow15846

4/13/23, 5:01 PM
Test: Exam2 bcis 466 | Quizlet
https://quizlet.com/503903269/test?answerTermSides=4&promptTermSides=6&questionCount=21&questionTypes=4&showImages=true

Name:
Score:

21 Multiple choice questions

1 of 21
Term: The k in k-means in cluster analysis is the:
  a) initial number of centroids
  b) captures the frequency of that item set
  c) all options are correct
  d) save a model to a file

2 of 21
Term: Cluster results can be used to:
  - segment the data into groups so that each group can be analyzed further
  - create labeled samples for a classification task
  - determine anomalous samples
  - classify new samples
  - all of the options are correct
  a) save a model to a file
  b) captures the frequency of that item set
  c) all options are correct
  d) initial number of centroids

3 of 21
Term: The support of an item set:
  a) captures the number of items in that set
  b) captures the correlation between the items in that item set
  c) captures how many times that item set is used in a rule
  d) captures the frequency of that item set

4 of 21
Term: Why was the churn variable converted into a string variable in the Churn Prediction example (KNIME)?
  - only non-numerical variables are allowed in the predictor model used
  - make interpretation of churn variable easier
  - output variable of the predictor model must be categorical
  a) determining whether power usage will rise or fall
  b) determine the regression line that best fits the samples
  c) output variables of the predictor model must be categorical
  d) a transaction or set of items that occur together

5 of 21
Term: In linear regression, the least squares method is used to:
  a) determine the regression line that best fits the samples
  b) determine the distance between two pairs of samples
  c) determine whether the target is categorical or numerical
  d) determine how to partition the data into training and test sets
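
As a worked illustration for question 5 of 21, the sketch below fits a straight line to a few points by least squares, using the standard closed-form slope and intercept for one predictor. The function name fit_line and the sample points are assumptions made up for this example; they do not come from the exam or from the KNIME workflow.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical samples that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

The line returned is the one that minimizes the sum of squared vertical distances between the samples and the line.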

6 of 21
Term: In association analysis, an item set is:
  a) a set of transactions that occur a certain number of times in the data
  b) a transaction or set of items that occur together
  c) a set of rules that infrequently occur together
  d) a set of items that two rules have in common

7 of 21
Term: For a rule of the form X --> Y:
  a) X is the support, and Y is the confidence
  b) X is the antecedent, and Y is the consequent
  c) X is the consequent, and Y is the antecedent

8 of 21
Term: Which of the following is not an algorithm used in association analysis?
  a) FP-growth
  b) Eclat
  c) k-means
  d) Apriori

9 of 21
Term: Rule confidence is used to:
  a) prune rules by eliminating rules with low confidence
  b) determine the rule with the most items
  c) measure the intuitiveness of a rule
  d) identify frequent item sets
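
To make the association analysis terms in questions 3, 7 and 9 concrete, here is a minimal sketch that computes support and confidence under their usual definitions: support is the fraction of transactions containing an item set, and the confidence of X --> Y is support(X and Y together) divided by support(X). The transactions and the helper names support and confidence are hypothetical and chosen only for illustration.

```python
# Hypothetical market-basket data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent --> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


print(support({"bread", "milk"}, transactions))       # 2 of 4 transactions
print(confidence({"bread"}, {"milk"}, transactions))  # 2 of the 3 bread transactions
```

A rule whose confidence falls below a chosen threshold can then be dropped from the rule set.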

10 of 21
Term: How does simple linear regression differ from multiple linear regression?
  a) predict when a customer is going to stop doing business with the company
  b) simple linear regression has only one variable; multiple linear regression has more than one variable
  c) in classification, you're predicting a category, and in regression, you're predicting a number
  d) to find rules to capture associations between items and events

11 of 21
Term: The purpose of the PMML Writer node (KNIME) in the churn prediction example is to:
  a) create a prediction model
  b) save a model to a file
  c) evaluate the performance of the predictor model

12 of 21
Term: The Equal Size Sampling node in the churn prediction example is needed because the number of customers who churn (churn = 1) is undersampled compared to the number of customers who do not churn (churn = 0). T/F
  a) True
  b) False

13 of 21
Term: Which is not an example of regression?
  - predicting the demand for a product
  - predicting the price of a stock
  - estimating the amount of rain
  - determining whether power usage will rise or fall
  a) output variables of the predictor model must be categorical
  b) determining whether power usage will rise or fall
  c) prune rules by eliminating rules with low confidence
  d) determine the regression line that best fits the samples

14 of 21
Term: Predicting whether a stock price will go up or down is an example of regression. T/F
  a) True
  b) False

15 of 21
Term: What is the main difference between classification and regression?
  a) output variables of the predictor model must be categorical
  b) determining whether power usage will rise or fall
  c) simple linear regression has only one variable; multiple linear regression has more than one variable
  d) in classification, you're predicting a category, and in regression, you're predicting a number

16 of 21
Term: Similarity measures in cluster analysis capture the similarity between clusters. T/F
  a) True
  b) False

17 of 21
Term: The main steps in the k-means clustering algorithm are:
  a) assign each sample to the closest centroid, then calculate the new centroid
  b) none of the options is correct
  c) calculate the distances between the cluster centroids, then find the two closest centroids
  d) count the number of samples, then determine the initial centroids
  e) calculate the centroids, then determine the appropriate stopping criterion depending on the number of centroids

18 of 21
Term: A cluster centroid is:
  - the mean of all samples in all the clusters
  - the mean of all the samples in the two closest clusters
  - the mean of all the samples in the two farthest clusters
  - the mean of all the samples in the cluster
  a) captures the frequency of that item set
  b) determining whether power usage will rise or fall
  c) all options are correct
  d) the mean of all samples in the cluster

19 of 21
Term: Goal of cluster analysis
  a) predict when a customer is going to stop doing business with the company
  b) to segment data so that differences between samples in the same cluster are minimized and differences between samples of different clusters are maximized
  c) to find rules to capture associations between items and events
  d) in classification, you're predicting a category, and in regression, you're predicting a number
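
For questions 17 and 18 of 21, the following is a minimal sketch of a textbook k-means loop (often called Lloyd's algorithm): assign every sample to its nearest centroid, recompute each centroid as the mean of its cluster, and repeat until the centroids stop moving. The function names, the fixed starting centroids and the sample points are assumptions for illustration; a real tool such as KNIME may pick initial centroids and stopping criteria differently.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(cluster):
    """Centroid of a cluster: the coordinate-wise mean of its samples."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def kmeans(points, centroids, iterations=10):
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each sample in the cluster of its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: recompute centroids (an empty cluster keeps its old one).
        new_centroids = [mean_point(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # centroids no longer move
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D samples and k = 2 starting centroids.
points = [(1, 1), (1, 2), (8, 8), (9, 8)]
final_centroids, final_clusters = kmeans(points, [(0, 0), (10, 10)])
print(final_centroids)
```

The number of starting centroids passed in is the k of k-means.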

20 of 21
Term: The goal of churn prediction
  a) assign each sample to the closest centroid, then calculate the new centroid
  b) predict when a customer is going to stop doing business with the company
  c) to find rules to capture associations between items and events
  d) to segment data so that differences between samples in the same cluster are minimized and differences between samples of different clusters are maximized

21 of 21
Term: The goal of association analysis is:
  a) to find rules to capture associations between items and events
  b) initial number of centroids
  c) predict when a customer is going to stop doing business with the company
  d) prune rules by eliminating rules with low confidence