Module 3 Presentation

School: Southern New Hampshire University
Course: 511
Subject: Computer Science
Date: Jan 9, 2024
Type: pptx
Pages: 8

Uploaded by Realgoodman5417

Data Cleaning Methods: Power BI vs. Rattle/R for Data Cleansing
Need For Clean Data

Why We Need Clean Data: In its current form, the bank customer data set yields questionable results after analysis. Without clean data, any further analysis will be unreliable.

Issues in the Data Set: The current data set has many issues that require data cleaning, including:
- Missing/blank data
- Misspelled data/inconsistent spellings
- Special characters/inconsistent formatting
- Statistical outliers/duplicate data
- NULL and N/A values

Importance of Data Cleaning: Clean data is crucial to accurate data analysis and decision-making. Unclean data can waste time, resources, and productivity. This data must be clean so we can predict who is likely to use term deposits.
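The issue types listed above can be illustrated with a short, tool-neutral sketch. It is written in Python rather than Power BI or Rattle/R, uses only the standard library, and is not the method used in this presentation. The column names (`job`, `balance`), the list of missing-value tokens, and the 1.5 x IQR outlier rule are assumptions chosen for illustration.

```python
import re
import statistics

# Tokens treated as missing. This list is an assumption, not taken from the data set.
MISSING_TOKENS = {"", "null", "n/a", "na", "none"}


def clean_text(value):
    """Trim and lowercase a field, strip special characters, map missing tokens to None."""
    if value is None:
        return None
    text = value.strip().lower()
    if text in MISSING_TOKENS:
        return None
    # Keep letters, digits, spaces, dots, and hyphens; drop everything else.
    text = re.sub(r"[^a-z0-9 .-]", "", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text or None


def clean_rows(rows):
    """Normalize every field, then drop exact duplicate rows (first occurrence wins)."""
    seen = set()
    cleaned = []
    for row in rows:
        normalized = {key: clean_text(value) for key, value in row.items()}
        fingerprint = tuple(sorted(normalized.items(), key=lambda item: item[0]))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(normalized)
    return cleaned


def iqr_outliers(values):
    """Return values outside 1.5 x IQR of the quartiles (a common rule of thumb)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


# Hypothetical sample rows, not taken from the bank customer data set.
sample = [
    {"job": " Admin.# ", "balance": "1,200"},
    {"job": "admin.", "balance": "1200"},
    {"job": "N/A", "balance": ""},
]
cleaned_sample = clean_rows(sample)
flagged = iqr_outliers([10, 12, 11, 13, 12, 11, 500])
```

Normalizing before de-duplicating matters here: the first two sample rows only become identical once the formatting differences are removed.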
POWER BI
- To clean data in Power BI, we use the Power Query Editor
- Use the “Transform Data” tab
- Power Query Editor is a “one-stop shop”
- Easy and intuitive
RATTLE/R
- Use of Rattle requires some coding
- Use the “Transform” tab to clean data
- Integration with R for custom scripting
- Complex, but robust
POWER BI
01 User-friendly interface that excels in exploration, visualization, and reporting.
02 Can be easily used for simple data cleaning, but may be limited due to fixed programming.
03 More ready-to-use tabs and tools in the “Transform Data” tab.
04 Harder to find missing, blank, or N/A data due to lack of exploration tools.

RATTLE/R
01 Much greater learning curve if not familiar with R syntax.
02 Suitable for simple or complex data cleaning tasks due to custom scripting through R.
03 “Transform” tab had limitations on scope of use. R custom scripting was required to fully clean the data set.
04 “Summary” and “Exploration” tabs made it easy to find issues in columns.
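The comparison turns partly on how easily missing, blank, or N/A values can be located per column. As a rough illustration of that kind of check, the Python sketch below counts missing entries in each column. It is not what Rattle's tabs or Power Query actually run, and the column names and missing-value tokens are assumptions.

```python
# Tokens treated as missing. This list is an assumption for illustration.
MISSING_TOKENS = {"", "null", "n/a", "na"}


def missing_by_column(rows):
    """Count missing, blank, NULL, or N/A entries for every column seen in the rows."""
    counts = {column: 0 for row in rows for column in row}
    for row in rows:
        for column, value in row.items():
            if value is None or str(value).strip().lower() in MISSING_TOKENS:
                counts[column] += 1
    return counts


# Hypothetical rows, not taken from the bank customer data set.
rows = [
    {"age": "34", "job": "N/A", "marital": "single"},
    {"age": "", "job": "technician", "marital": "married"},
    {"age": None, "job": "NULL", "marital": "single"},
]
summary = missing_by_column(rows)
```

A per-column count like this shows which columns need attention before any transform step is chosen.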
RECOMMENDATION
References
Should you start learning R? Weigh the Pros and Cons of R programming. (2021). TechVidvan. https://techvidvan.com/tutorials/pros-and-cons-of-r/
Williams, G. (2011). Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery. Springer.
Simplilearn. (2023). What is R: Overview, its Applications and what is R used for? Simplilearn.com. https://www.simplilearn.com/what-is-r-article
Foote, K. D. (2023, March 1). The impact of poor data quality (and how to fix it). DATAVERSITY. https://www.dataversity.net/the-impact-of-poor-data-quality-and-how-to-fix-it/#:~:text=Inaccurate%20analytics%3A%20Data%20analysis%20or,that%20should%20not%20be%20trusted.
Fry, H. (2022, January 8). 5 Data cleansing methods in Power Query and Excel. Medium. https://medium.com/@harryfry/5-data-cleansing-methods-in-power-query-and-excel-73160c9179ad
AbsentData. (2022, November 25). Microsoft Power BI Pros and Cons. AbsentData. https://absentdata.com/power-bi-pros-and-cons/