https://www.kaggle.com/datasets/mirzahasnine/telecom-churn-dataset?resource=download

Select three experiments (3 marks each) from the list (your Lecturer may choose one for you):
A. Build a simple classifier apply to dataset (Decision Tree)
B. Cluster Analysis (K-Means)
C. Topic Detection Analysis (Import public post comments from Twitter, Facebook, Instagram with the help of exportcomments.com)
D. Linear Regression

Select Modelling Technique
• Task: Select Modelling Technique.

Output Modelling Technique
• Record the actual modelling technique that is used.

Output Modelling Assumption
• Activities: Define any built-in assumptions made by the technique about the data (e.g., quality, format, distribution). Compare these assumptions with those in the Data Description Report. Make sure that these assumptions hold and step back to the Data Preparation Phase if necessary. You can explain the data file here, even when it is pre-prepared.

Generate Test Design
• Activities: Check existing test designs for each data mining goal separately. Decide on necessary steps (number of iterations, number of folds etc.). Prepare data required for test. (You can use 66% of records for model Building/Training and rest for Testing).

Build a Model
• Task: Build a model. Run the modelling tool on the prepared dataset to create one or more models. (Using KNIME Tool as shown in the lab).

Output Parameter Settings
• Activities: Set initial parameters. Document reasons for choosing those values.
• Activities: Run the selected technique on the input dataset to produce the model. Postprocess data mining results (e.g., editing rules, display trees).

Output Modelling Technique
• Activities: Describe any characteristics of the current model that may be useful for the future. Give a detailed description of the model and any special features.
• Activities: State conclusions regarding patterns in the data (if any); sometimes the model reveals important facts about the data without a separate Assessment process (e.g., that the output or conclusion is duplicated in one of the inputs).

Result Analysis / Evaluation
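
The 66% training / 34% testing split suggested under Generate Test Design can be sketched outside KNIME in a few lines of plain Python. This is only an illustration of the idea, not the lab's workflow: the function name `holdout_split`, the fixed seed, and the toy records with `customer_id` and `churn` fields are assumptions made for the example and are not taken from the Kaggle dataset.

```python
import random


def holdout_split(records, train_fraction=0.66, seed=42):
    """Shuffle a copy of the records and split it into (train, test) lists.

    train_fraction and seed are illustrative defaults; document whatever
    values you actually use in the Parameter Settings section.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # copy, so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for repeatability
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy records standing in for rows of the churn dataset.
rows = [{"customer_id": i, "churn": i % 5 == 0} for i in range(100)]

train, test = holdout_split(rows)
print(len(train), "training records,", len(test), "test records")
```

Fixing the seed makes the partition reproducible, which helps when comparing parameter settings across runs. A plain random split like this does not preserve the churn / no-churn ratio in each part; if the classes are imbalanced, a stratified split may be the better test design.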


 
