DAT 520 Module Three Lab Worksheet Adriana Carroll

School

Southern New Hampshire University *

Course

320

Subject

Electrical Engineering

Date

Feb 20, 2024

Pages

3

DAT 520 Module Three Lab Worksheet
Decision Trees in Power BI

Overview

In this lab, you will construct a decision tree using a bottom-up methodology in Power BI. You will break down its structure, interpret its results, and articulate a response to the proposed research question.

Scenario

In the spring of 1912, one of the most infamous maritime catastrophes occurred. The RMS Titanic, en route to New York City on its maiden voyage, struck an iceberg in the Atlantic Ocean and sank, resulting in the deaths of over 1,500 people. More than 100 years later, the sinking is still considered one of the deadliest maritime accidents to occur outside of warfare. The tragedy has captivated people worldwide and has even inspired multi-million-dollar movies. This fascination has led many to ask, "Would I have survived the sinking of the Titanic?" Leveraging decision modeling within Power BI, we hope to understand more about what would have increased your odds of survival on that fateful day in the North Atlantic.

Instructions

Construct a decision tree leveraging the provided data set for Module 3, which provides insight into key variables that influenced the survivability of those who embarked on the Titanic in 1912. Effectively describe the model’s structure (nodes, branches, etc.), answer the questions posed below, and provide screenshots when prompted.

Please note: This assignment will be submitted and graded in Brightspace.

uCertify Instructions

Navigate to uCertify lab 5.2.1 Decision Trees in Power BI.
Open Power BI Desktop.
Select Get Data and choose Text/CSV.
Navigate to the desktop and select the DAT-520 Data Files folder.
Open Module 3.
Select the data set Titanic.csv.
Select the titanic tab.
Select Transform Data.
Delete the columns PassengerId, Parch, and Embark.
Change the name of the "2urvived" variable to "Survived."
Remove null values from the Age column.
Transform Fare to a two-decimal variable.
Close and Apply Changes.
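The Transform Data steps above can also be sketched outside Power BI as a quick sanity check. The following is a minimal Python sketch, not part of the lab: the sample rows are hypothetical stand-ins for Titanic.csv (only column names mentioned in the worksheet are used), and the helper name `clean_rows` is an assumption for illustration.

```python
import csv
import io

# Hypothetical sample standing in for Titanic.csv; the values are made up.
SAMPLE_CSV = """PassengerId,Age,Fare,Sex,SibSp,Parch,Embark,2urvived
1,22,7.25,0,1,0,2,0
2,,8.05,1,0,0,2,1
3,38,71.2833,1,1,0,0,1
"""

DROP_COLUMNS = {"PassengerId", "Parch", "Embark"}  # columns the worksheet deletes
RENAME = {"2urvived": "Survived"}                  # rename requested by the worksheet


def clean_rows(csv_text):
    """Apply the worksheet's Transform Data steps to CSV text."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Remove rows whose Age is null (an empty field in the CSV).
        if not row["Age"].strip():
            continue
        # Drop the unwanted columns and rename 2urvived to Survived.
        out = {
            RENAME.get(name, name): value
            for name, value in row.items()
            if name not in DROP_COLUMNS
        }
        out["Age"] = float(out["Age"])
        out["Fare"] = round(float(out["Fare"]), 2)  # two-decimal Fare
        out["Survived"] = int(out["Survived"])
        cleaned.append(out)
    return cleaned


rows = clean_rows(SAMPLE_CSV)
for row in rows:
    print(row)
```

In this sample, the row with the empty Age is dropped, so two of the three rows remain.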
Create new calculated columns leveraging z-scores:
  • Create a column called z.Age using the following formula:
    z.Age = ('titanic'[Age] - AVERAGE('titanic'[Age])) / STDEV.P('titanic'[Age])
  • Create a column called z.Fare using the following formula:
    z.Fare = ('titanic'[Fare] - AVERAGE('titanic'[Fare])) / STDEV.P('titanic'[Fare])
  • For practice, create a column called z.Sib using the following formula:
    z.Sib = ('titanic'[SibSp] - AVERAGE('titanic'[SibSp])) / STDEV.P('titanic'[SibSp])
Select the Decision Tree visualization.
Select Get more visuals from the more options button.
Sign in with your SNHU credentials if necessary.
Select Decision Tree.
Enable required scripts and programs.
Construct your decision model.
Select Survived as your target variable.
  • Click the drop-down caret next to Survived and select Do not summarize, as you will use this as a binary predictor.
Select z.Age and z.Fare as your input variables (in that order).
Expand the visualization into Focus mode.

Questions

Provide a screenshot of your existing decision tree.

Describe what you are seeing in terms of the model’s breakdown from the root node down through each level.
With Survived treated as a binary variable, each portion of the decision tree uses age and fare to predict a yes/no survival percentage.

What criteria lead you to a 3% survival rate?
  • Fare > 52, Age <= 45, Fare < 76, and Fare <= 59
  • Fare < 52, Age <= 6.5, and Age < 2.5
  • Fare < 52, Age <= 6.5, and Age > 2.5

Add the variable "sibsp" (siblings) to the model as an input and provide a screenshot of your new model.
Does adding this variable improve the model’s root error?
No, the root error remains at 0.30.
Does it add additional complexity to the model?
It simply adds a new variable to consider when viewing the binary output.
What are your initial impressions of this model versus the original model?
My initial impression is not exciting: the outputs appear to remain relatively the same, with no new decision breaks to consider.

Add the variable Sex to the model as an input variable.
Does adding this variable improve the model’s root error?
The root error has adjusted to 0.29.
Does it add additional complexity to the model?
It looks more complex, with a break at the start of the decision tree on sex, but it results in fewer outcomes.
What are your initial impressions of this model versus the original model?
My initial impression of the new model is that it adds a new layer to consider, but sex doesn't appear to play a major role other than dividing the results.

Filter your visualization so that the model only displays the results for those whose fares were less than $25. Provide a screenshot of your new model.

If you were a female (0 = Male, 1 = Female) and paid less than $13 (hint: use your filters), what were the chances of you surviving the tragedy onboard the Titanic?
6%
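
The z.Age, z.Fare, and z.Sib calculated columns defined in the instructions all apply the same standardization: subtract the column mean, then divide by the population standard deviation (STDEV.P). As a minimal sketch of that arithmetic, assuming plain Python rather than DAX, the helper below (the name `z_scores` and the sample ages are hypothetical, not taken from Titanic.csv) reproduces the calculation:

```python
from statistics import fmean, pstdev


def z_scores(values):
    """Standardize values using the population standard deviation,
    mirroring (x - AVERAGE(x)) / STDEV.P(x) from the worksheet."""
    mean = fmean(values)
    spread = pstdev(values)  # population standard deviation, like STDEV.P
    if spread == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mean) / spread for v in values]


# Hypothetical ages for illustration only.
ages = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z_age = z_scores(ages)
print(z_age)
```

For these sample ages the mean is 5 and the population standard deviation is 2, so an age of 9 maps to a z-score of 2 and an age of 2 maps to -1.5. Using the sample standard deviation instead (STDEV.S in DAX, `statistics.stdev` in Python) would give slightly smaller magnitudes.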