DAT 205 Module Four Data Analytics Lifecycle
Ali Boehlke
DAT 205
Professor Augustine
19 November 2023
DAT 205 Module Four Data Analytics Lifecycle Template
Instructions
Fill in the tables below for each section. The tables will expand as you type. You may also insert images
into the tables using the copy and paste or Insert Picture features.
Create a diagram of the phases of the data analytics lifecycle (DAL).
Data Analytics Lifecycle (DAL)
1. Discovery
2. Data Preparation
3. Model Planning
4. Model Building
5. Communicating Results
6. Operationalize
Briefly describe the key points of what occurs during each phase.
In the initial Discovery Phase, the data science team investigates the problem, identifying relevant
data sources and developing testable hypotheses.
In Phase 2, Data Preparation, the team manipulates and refines data within an analytic sandbox, with
tools like Hadoop and OpenRefine aiding this iterative process.
The team then enters Phase 3, Model Planning, examining data relationships, choosing pertinent
models and variables, and preparing various datasets. This phase often utilizes tools like MATLAB and
STATISTICA.
In Phase 4, Model Building, the team creates datasets for training, testing, and production, and
evaluates whether current tools are sufficient or a more robust environment is needed. This stage
might involve R, Octave, WEKA, and similar tools.
In the fifth phase, Communicating Results, the team assesses the model's outcomes against
established benchmarks and plans how best to relay the findings to stakeholders, noting any
caveats or assumptions.
Finally, in the Operationalize phase, the team communicates the project's benefits more broadly, starts
a pilot project for controlled deployment, and learns from the model's performance in a limited
production setting. Tools like SQL and MADlib are typical in this stage, which culminates in the
delivery of comprehensive reports and code.
Select one phase of the DAL.
Phase 2: Data Preparation
Describe a data analyst’s role in your chosen phase.
In Phase 2, Data Preparation, of the Data Analytics Lifecycle, a data analyst's role is central and
covers a range of activities essential to the project's success. The phase starts with the data analyst
gathering data from diverse sources. These can include internal databases, external datasets, cloud
sources, or even manual data entry. The analyst makes sure the data aligns with the problem
statement and hypotheses identified in the Discovery phase.
The data analyst then moves on to cleaning and ensuring data quality, which involves identifying and
correcting errors, handling missing values, and removing duplicates. This step is vital to prevent the
"garbage in, garbage out" problem and is fundamental to the integrity of the analysis.
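The cleaning steps described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the records, the field names (id, region, sales), and the choice to fill missing values with the mean are all assumptions made for the example, not part of the course material.

```python
# Hypothetical raw records; the field names and values are made up for illustration.
raw_rows = [
    {"id": 1, "region": "East", "sales": "120.5"},
    {"id": 2, "region": "west ", "sales": ""},        # missing sales value
    {"id": 1, "region": "East", "sales": "120.5"},    # duplicate of id 1
    {"id": 3, "region": "WEST", "sales": "n/a"},      # unparseable sales value
    {"id": 4, "region": "East", "sales": "80"},
]


def clean_rows(rows):
    """Standardize text, parse numbers, drop duplicate ids, and fill missing sales."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        if row["id"] in seen_ids:
            continue  # keep only the first record for each id
        seen_ids.add(row["id"])
        try:
            sales = float(row["sales"])
        except ValueError:
            sales = None  # mark as missing; filled in below
        cleaned.append({
            "id": row["id"],
            "region": row["region"].strip().title(),  # fix stray spaces and casing
            "sales": sales,
        })

    # Fill missing sales with the mean of the observed values (one simple choice).
    observed = [r["sales"] for r in cleaned if r["sales"] is not None]
    mean_sales = sum(observed) / len(observed)
    for r in cleaned:
        if r["sales"] is None:
            r["sales"] = mean_sales
    return cleaned


cleaned_rows = clean_rows(raw_rows)
```

Filling with the mean is just one option. Depending on the project, an analyst might instead drop incomplete rows or flag them for review.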
At the same time, the analyst transforms the data into a format more suitable for analysis. This involves
normalizing data, creating derived variables, and converting data types, such as encoding
categorical data in numerical form. This transformation makes the data compatible with various
analysis tools and methodologies.
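To make the three transformations named above concrete, here is a short Python sketch. It assumes a small set of already cleaned records with made-up fields (region, sales, units), uses min-max scaling as the normalization method, and uses one-hot flags as the categorical encoding. Other methods would be equally valid.

```python
# Hypothetical cleaned records; names and values are illustrative only.
records = [
    {"region": "East", "sales": 50.0, "units": 5},
    {"region": "West", "sales": 100.0, "units": 20},
    {"region": "East", "sales": 150.0, "units": 10},
]


def transform(rows):
    """Add a min-max scaled column, a derived column, and one-hot region flags."""
    sales = [r["sales"] for r in rows]
    low, high = min(sales), max(sales)
    span = (high - low) or 1.0  # avoid dividing by zero when all values are equal
    categories = sorted({r["region"] for r in rows})

    out = []
    for r in rows:
        new = dict(r)
        new["sales_scaled"] = (r["sales"] - low) / span   # normalized to [0, 1]
        new["price_per_unit"] = r["sales"] / r["units"]   # derived variable
        for cat in categories:                            # categorical -> numeric
            new[f"region_{cat}"] = 1 if r["region"] == cat else 0
        out.append(new)
    return out


transformed = transform(records)
```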
Conducting an Exploratory Data Analysis (EDA) is another critical responsibility in this phase. The
analyst uses statistical summaries and graphical representations to discover patterns, spot anomalies,
test hypotheses, or check assumptions. Tools like Python or R are often employed, enabling the
analyst to visualize and understand the data distributions and relationships.
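A simple form of EDA can be done with Python's built-in statistics module. The sketch below assumes a made-up list of daily order counts, computes a statistical summary, and flags anomalies with the 1.5 x IQR rule of thumb. Both the data and the choice of outlier rule are assumptions for illustration.

```python
import statistics

# Hypothetical daily order counts; the values are made up for illustration.
orders = [12, 15, 14, 13, 16, 15, 14, 48, 13, 15]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def iqr_outliers(values):
    """Flag values more than 1.5 * IQR outside the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


summary = summarize(orders)
outliers = iqr_outliers(orders)
```

In this made-up sample, the single value of 48 is flagged as an anomaly. It also pulls the mean well above the median, which is the kind of pattern an analyst would want to investigate before modeling.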
Additionally, structuring and formatting the data is a critical task. The analyst organizes the data into
tables, creates indexes, and sets up the data in a form ready for advanced analysis or model building.
This structured approach aids in making the data more accessible and understandable.
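As one possible illustration of organizing data into tables with indexes, the sketch below loads a few prepared records into an in-memory SQLite database using Python's sqlite3 module. The table name, column names, and values are hypothetical, and a real project would likely use whatever database the analytic sandbox provides.

```python
import sqlite3

# Hypothetical prepared records; the table and column names are illustrative.
rows = [
    (1, "East", 120.5),
    (2, "West", 100.25),
    (3, "West", 95.0),
]

conn = sqlite3.connect(":memory:")  # in-memory database standing in for a sandbox
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, region TEXT NOT NULL, amount REAL NOT NULL)"
)
conn.executemany("INSERT INTO sales (id, region, amount) VALUES (?, ?, ?)", rows)
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")  # speeds up region lookups
conn.commit()

# Query the structured table: total amount per region.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
)
```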
Documentation and reproducibility are also a significant part of the data analyst's role. Keeping
detailed records of the data preparation process ensures that the steps taken for data cleaning and
transformation are transparent and can be replicated, which is vital for the integrity and credibility of
the analysis.
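One way to keep such records is to have the preparation code log its own steps. The following sketch is an assumption about how that could look, not a standard method: each step is a named function, the runner records row counts before and after, and a checksum of the final data lets a later rerun confirm that it reproduced the same result. All names and data here are hypothetical.

```python
import hashlib
import json


def run_pipeline(rows, steps):
    """Apply each (name, function) step in order and record what it did."""
    log = []
    for name, func in steps:
        before = len(rows)
        rows = func(rows)
        log.append({"step": name, "rows_in": before, "rows_out": len(rows)})
    # A checksum of the final data lets a rerun be compared against this run.
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return rows, log, hashlib.sha256(payload).hexdigest()


# Hypothetical steps and data, for illustration only.
def drop_missing(rows):
    return [r for r in rows if r["sales"] is not None]


def round_sales(rows):
    return [{**r, "sales": round(r["sales"], 1)} for r in rows]


data = [{"id": 1, "sales": 10.26}, {"id": 2, "sales": None}, {"id": 3, "sales": 7.0}]
steps = [("drop_missing", drop_missing), ("round_sales", round_sales)]
prepared, prep_log, checksum = run_pipeline(data, steps)
```

Saving prep_log and checksum alongside the prepared dataset gives teammates a record of what was done and a quick way to check that repeating the steps gives the same data.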
Lastly, collaboration and communication are integral to the data analyst's role in this phase. They must
work closely with other team members, such as data scientists and business analysts, to communicate
findings from the data preparation phase. This collaboration is critical to addressing data quality or
relevance concerns and defining the best datasets for achieving the project's goals.
Overall, in the Data Preparation phase, a data analyst is responsible for ensuring that the data is
accurate, clean, and structured, laying the groundwork for insightful and reliable analysis in the
subsequent phases of the Data Analytics Lifecycle.
Cite all references in APA format.
Life cycle phases of data analytics. (n.d.). GeeksforGeeks. https://www.geeksforgeeks.org/life-cycle-phases-of-data-analytics/