DAT 205 Module Four Data Analytics Lifecycle

School

Southern New Hampshire University

Course

205

Subject

Industrial Engineering

Date

Jan 9, 2024

Ali Boehlke
DAT 205
Professor Augustine
19 November 2023

DAT 205 Module Four Data Analytics Lifecycle Template

Instructions
Fill in the tables below for each section. The tables will expand as you type. You may also insert images into the tables using the copy and paste or Insert Picture features.

Create a diagram of the phases of the data analytics lifecycle (DAL).

Data Analytics Lifecycle (DAL)
1. Discovery
2. Data Preparation
3. Model Planning
4. Model Building
5. Communicating Results
6. Operationalize

Briefly describe the key points of what occurs during each phase.
In the initial Discovery phase, the data science team investigates the problem, identifies relevant data sources, and develops testable hypotheses. In Phase 2, Data Preparation, the team manipulates and refines data within an analytic sandbox, an iterative process aided by tools like Hadoop and OpenRefine. The team then enters Phase 3, Model Planning, examining data relationships, choosing pertinent models and variables, and preparing various datasets. This phase often uses tools like MATLAB and STATISTICA. In Phase 4, Model Building, the team creates datasets for different purposes, such as training, testing, and production, and evaluates whether its current tools are sufficient or a more robust environment is needed. This stage might involve R, Octave, WEKA, and similar tools. In Phase 5, Communicating Results, the team assesses the model's outcomes against set benchmarks, plans how best to relay this information to stakeholders, and notes any caveats or assumptions. Finally, in Phase 6, Operationalize, the team shares the benefits more broadly, starts a pilot project for controlled deployment, and learns from the model's performance in a limited production setting. Tools like SQL and MADlib are typical in this stage, which culminates in the delivery of final reports and code.

Select one phase of the DAL.

Phase 2: Data Preparation

Describe a data analyst’s role in your chosen phase.

In Phase 2: Data Preparation of the Data Analytics Lifecycle, a data analyst's role is central and multifaceted, encompassing a range of activities essential for the project's success. This phase starts with the data analyst gathering data from diverse sources. These can include internal databases, external datasets, cloud sources, or even manual data entry. Throughout, the analyst checks that the data aligns with the problem statement and hypotheses identified in the Discovery phase.
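The gathering step described above can be illustrated with a minimal sketch. It assumes two hypothetical sources that do not come from the assignment: an "internal database" (an in-memory SQLite table named orders) and an "external dataset" (a small CSV export of customer regions). All table names, column names, and values are invented for illustration, and only the Python standard library is used.

```python
import csv
import io
import sqlite3

# Hypothetical "internal database": an in-memory SQLite table of orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 120.0), (2, 75.5), (3, 40.0)],
)

# Hypothetical "external dataset": a CSV export mapping customers to regions.
external_csv = io.StringIO("customer_id,region\n1,North\n2,South\n4,West\n")
regions = {
    int(row["customer_id"]): row["region"]
    for row in csv.DictReader(external_csv)
}

# Combine the two sources on the shared key. Orders with no matching region
# keep region=None so the gap stays visible instead of being silently dropped.
combined = [
    {"customer_id": cid, "amount": amount, "region": regions.get(cid)}
    for cid, amount in conn.execute(
        "SELECT customer_id, amount FROM orders ORDER BY customer_id"
    )
]
conn.close()

# Records that could not be matched across sources need follow-up before
# the data can be said to align with the problem statement.
unmatched = [row["customer_id"] for row in combined if row["region"] is None]
```

Keeping unmatched records in the combined set, rather than discarding them, is one possible design choice: it lets the analyst see where the sources disagree before any cleaning begins.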
The data analyst then moves on to cleaning the data and ensuring its quality, which involves identifying and correcting errors, handling missing values, and removing duplicates. This step is vital to prevent the "garbage in, garbage out" problem and is fundamental to the integrity of the analysis.

At the same time, the analyst transforms the data into a format more suitable for analysis. This involves normalizing data, creating derived variables, and converting data types, such as turning categorical data into numerical form. This transformation makes the data compatible with various analysis tools and methodologies.

Conducting an Exploratory Data Analysis (EDA) is another key responsibility in this phase. The analyst uses statistical summaries and graphical representations to discover patterns, spot anomalies, test hypotheses, or check assumptions. Tools like Python or R are often employed, enabling the analyst to visualize and understand the data's distributions and relationships.

Structuring and formatting the data is also an important task. The analyst organizes the data into tables, creates indexes, and sets up the data in a form ready for advanced analysis or model building. This structured approach makes the data more accessible and understandable.

Documentation and reproducibility are also a significant part of the data analyst's role. Keeping detailed records of the data preparation process ensures that the steps taken for data cleaning and transformation are transparent and can be replicated, which is vital for the integrity and credibility of the analysis.

Lastly, collaboration and communication are integral to the data analyst's role in this phase. The analyst must work closely with other team members, such as data scientists and business analysts, to communicate findings from the data preparation phase. This collaboration is essential for addressing data quality or relevance concerns and for defining the best datasets for achieving the project's goals.

Overall, in the Data Preparation phase, a data analyst is responsible for ensuring that the data is accurate, clean, and structured, laying the groundwork for insightful and reliable analysis in the subsequent phases of the Data Analytics Lifecycle.

Cite all references in APA format.

Life cycle phases of data analytics. (n.d.). GeeksforGeeks. https://www.geeksforgeeks.org/life-cycle-phases-of-data-analytics/
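As a supplementary illustration of the cleaning, transformation, and summary steps described in the role description, here is a minimal sketch using only the Python standard library. The records and field names (id, age, plan) are hypothetical. Median imputation and min-max scaling are assumptions chosen for the example, not methods named in the source; other approaches may suit a given dataset better.

```python
import statistics

# Hypothetical raw records; None marks a missing value.
raw = [
    {"id": 1, "age": 34, "plan": "basic"},
    {"id": 2, "age": None, "plan": "premium"},
    {"id": 2, "age": None, "plan": "premium"},  # exact duplicate
    {"id": 3, "age": 52, "plan": "basic"},
    {"id": 4, "age": 28, "plan": "premium"},
]

# 1. Cleaning: remove exact duplicates while preserving the original order.
seen, deduped = set(), []
for rec in raw:
    key = (rec["id"], rec["age"], rec["plan"])
    if key not in seen:
        seen.add(key)
        deduped.append(dict(rec))  # copy so the raw data stays untouched

# 2. Cleaning: fill missing ages with the median of the observed ages
#    (an assumed imputation strategy).
observed = [r["age"] for r in deduped if r["age"] is not None]
median_age = statistics.median(observed)
for r in deduped:
    if r["age"] is None:
        r["age"] = median_age

# 3. Transformation: min-max normalize age into the range [0, 1].
#    This assumes at least two distinct ages, so hi - lo is not zero.
lo = min(r["age"] for r in deduped)
hi = max(r["age"] for r in deduped)
for r in deduped:
    r["age_scaled"] = (r["age"] - lo) / (hi - lo)

# 4. Transformation: turn the categorical plan into a numeric indicator.
for r in deduped:
    r["plan_is_premium"] = 1 if r["plan"] == "premium" else 0

# 5. EDA: a small statistical summary of the prepared data.
summary = {
    "rows": len(deduped),
    "mean_age": statistics.mean(r["age"] for r in deduped),
    "premium_share": sum(r["plan_is_premium"] for r in deduped) / len(deduped),
}
```

Working on copies of the raw records and keeping each step separate and commented also supports the documentation and reproducibility goal, since every change to the data can be traced to one step.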