DATA ANALYSIS

School

Government College University Faisalabad

Course

123

Subject

Computer Science

Date

Nov 24, 2024

Pages

31

Uploaded by RaiFia

Data Analysis for Business: Individual written report
Module Leader: Stelios Sotiriadis
Student Name:
Student ID:
Table of Contents
1. Chosen Problem
2. Chosen Dataset
2.1 What is Python?
2.2 What purposes serve Python?
3. Steps for Data Analysis (Using Python)
3.1 Step 1: Required Packages Imported
3.2 Step 2: Gathering Data
3.3 Step 3: Manipulation of Data (Manipulate data according to the needs)
3.4 Step 4: Data Visualization & Exploratory Data Analysis
3.5 Ranked provinces and countries
3.5.1 Part 1: Ranking Most affected countries
I) Top 10 Countries with Confirmed Cases
II) Top ten Countries with Death Cases
III) Top ten Countries with Recovered Cases
IV) Top Ten Countries Active Cases
4. Analysis of COVID-19 Impacts
4.1 Analysis of the Covid-19 Effects (Problem) using Python
4.2 Preparation of Data
4.3 Analysis of the Spread of Pandemic
4.4 Effects of Covid-19 on the Economy Analysis
5. Conclusion
References
Business Decision
The COVID-19 pandemic has significantly affected the worldwide economy (Saurav et al., 2021). Governments and corporations are searching for ways to understand the implications of the pandemic and decide how to react. The business decision supported by this solution is how to understand and lessen the financial impact of the COVID-19 pandemic (Badarinza et al., 2019). This entails examining the impact of the pandemic on several economic metrics, including GDP, the unemployment rate, and the human development index. Investigating these consequences through data analysis can give policymakers, authorities, and organizations the insights they need to decide on recovery methods, resource allocation, and policy revisions.

Data-driven Decision-making Problem
The financial impact of the COVID-19 pandemic was chosen as the data-driven decision-making problem because of its enormous practical importance and the availability of relevant data. The pandemic caused unparalleled disruption to the world's economic growth, which makes it essential to evaluate the repercussions fully (Bundervoet et al., 2022). Data analytics can show how variables such as COVID-19 cases and prevention efforts relate to financial metrics by revealing patterns and relationships within the data. This analysis can help companies and lawmakers make well-informed decisions when dealing with the consequences of the pandemic. Datasets are readily available from sources such as Kaggle, providing a solid basis for a data-driven inquiry. Determining how the pandemic affected the economy is crucial for planning for the coming years and for developing measures that create more stable industries (Goldstein et al., 2020). A dataset on COVID-19 cases, deaths, and recoveries can be used to determine which countries were most severely affected. This information supports well-informed choices about how to lessen the effects of a worldwide pandemic. For instance, authorities may use the data to focus their aid programmes on the regions that need it most, and organizations can use it to decide how to operate after the pandemic.
Data from Kaggle was used in the investigation to evaluate the effects of COVID-19 on different countries. Analytical operations, such as cleaning text and numeric data, were carried out using Python. Two data-cleaning tasks were performed. First, duplicates were removed from the text data, including country names, and letter case was standardized. Second, outliers and missing values were removed from the numerical data relating to COVID-19 cases, deaths, and financial indicators. Data manipulation was performed in Python, and modules such as Matplotlib and Plotly were used to visualise the data. The evaluation concentrated on how the pandemic affected countries' COVID-19 case counts, their GDP per capita, and their human development index scores. Cleaning and evaluating the data with Python provided new insights into the effects of COVID-19 on the world economy.

Chosen Dataset
A Kaggle dataset was used to gather the information relevant to the problem. Kaggle is an online platform where data analysts and machine learning practitioners can connect. Kaggle users can communicate with one another, explore and share datasets, work in notebooks with GPU support, and collaborate on machine learning problems (Wang et al., 2021). The site, which was founded in 2010 by Anthony Goldbloom and acquired by Google in 2017, aims to help both students and experts achieve their goals in data science through its tools and support. Kaggle had more than 8 million active monthly users in 2021 (Quaranta et al., 2021). Many Kaggle datasets are reliable, and a dataset's credibility can be assessed by looking at its upvotes, comments, and the shared notebooks that use it. The data science methods used in Kaggle competitions also work well for comparable real-world situations, and occasionally even very different problems can benefit from them. In addition, the straightforward solutions found in public notebooks are often already very effective.
2.1 What is Python?
Python is a capable general-purpose programming language that offers data analysts a large variety of libraries and tools, and it is simple to learn. In this report, Python was used to investigate how the coronavirus affected the world economy. Python is a well-known programming language used to build websites and software, automate workflows, and analyse data. It is an all-purpose language that can be used to create many kinds of programs and is not specialised for any particular problem (De Smedt and Daelemans, 2012). Its versatility and beginner-friendliness have pushed it to the forefront of the programming languages now in use. A survey of developers by the analyst firm RedMonk found it to be the second most popular programming language among them.

2.2 What purposes serve Python?
Python is frequently used for creating websites and applications, automating repetitive tasks, and analysing and visualising data. Because it is simple to learn, Python has also been adopted by many non-programmers, including economists and researchers, for routine activities such as managing finances; it is not used only by programmers and computer scientists (Arbuckle, 2010). For those who work in less data-intensive fields such as journalism, small business ownership, or online marketing, learning Python can expand career options. Python can also help non-programmers streamline some of their daily tasks.

3. Steps for Data Analysis (Using Python)
1. Required Packages Imported
2. Data Gathering
3. Data Wrangling (Transform data according to the needs)
4. Data Visualization & Exploratory Data Analysis

3.1 Step 1: Required Packages Imported
Every Python data-processing script begins by importing the necessary packages. Although Python offers a wide range of tools for data analysts, the data science tools used in this investigation were NumPy and Pandas for data manipulation and EDA (Manguri et al., 2020). The interactive plotting library Plotly and Matplotlib were used for data visualization. Importing packages into Python code is relatively easy, and the import statements load the main packages needed to carry out data processing.

3.2 Step 2: Gathering Data
Obtaining high-quality data is probably the most crucial requirement for clear and reliable data processing. To improve the reliability of this research, information was gathered from numerous sources (Benis et al., 2021). The Python library Requests was used to retrieve the data from a specified JSON source in response to a query.

3.3 Step 3: Manipulation of Data (Manipulate data according to the needs)
During data wrangling, data is transformed and cleaned in accordance with requirements. The data had to be transformed before the study could move further, and the data wrangling code carried out this transformation.
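The following is a minimal sketch of such wrangling rather than the report's original code. It illustrates the cleaning steps described for this study (a single case form for country names, removal of duplicates, removal of rows with missing numeric values). The column names `country`, `confirmed`, and `deaths`, the helper `clean_cases`, and the sample rows are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the column names are assumptions, not the report's data.
raw = pd.DataFrame({
    "country": ["india", "India ", "BRAZIL", "Peru"],
    "confirmed": [100.0, 100.0, 250.0, np.nan],
    "deaths": [5.0, 5.0, 12.0, 1.0],
})


def clean_cases(df: pd.DataFrame) -> pd.DataFrame:
    """Standardise country names, then drop duplicates and incomplete rows."""
    out = df.copy()
    # Trim whitespace and use a single case form for country names.
    out["country"] = out["country"].str.strip().str.title()
    # Remove rows that became exact duplicates after standardisation.
    out = out.drop_duplicates()
    # Drop rows with a missing value in any numeric column.
    out = out.dropna(subset=["confirmed", "deaths"])
    return out.reset_index(drop=True)


cleaned = clean_cases(raw)
print(cleaned)
```

Standardising the names before removing duplicates matters here: the two spellings of India only become identical rows after the case and whitespace fix.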
The Python module datetime was also imported for this analysis because it makes it possible to work with the dates and times in the datasets. The overall picture of the analysis emerges in the next step, exploratory data analysis and data visualization.

3.4 Step 4: Data Visualization & Exploratory Data Analysis
As the core of data analysis, this step is lengthy. It was therefore broken down into three parts:
• Ranking countries and provinces (based on COVID-19 attributes)
• COVID-19 time series of cases
• Case distribution and classification

3.5 Ranked provinces and countries
Using several exploratory data analysis and visualization techniques, the top countries and provinces were determined from the extracted data according to their confirmed cases, deaths, recoveries, and active cases. The plots that follow were produced from the source code (note: each plot is interactive, and hovering over it shows its data points).
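As a rough illustration of this ranking step, the sketch below picks the top countries for one metric and wraps the chart in a separate function. It is not the report's code: the helpers `top_countries` and `plot_top_countries`, the column names `country` and `confirmed`, and the sample figures are assumptions made for the example.

```python
import pandas as pd


def top_countries(df: pd.DataFrame, metric: str, n: int = 10) -> pd.DataFrame:
    """Return the n rows with the largest value of `metric`, largest first."""
    return df.nlargest(n, metric).reset_index(drop=True)


def plot_top_countries(df: pd.DataFrame, metric: str, n: int = 10):
    """Build an interactive bar chart of the top n countries for `metric`."""
    import plotly.express as px  # imported lazily; only needed for plotting

    top = top_countries(df, metric, n)
    return px.bar(top, x="country", y=metric,
                  title=f"Top {n} countries by {metric}")


# Hypothetical figures used only to exercise the ranking function.
sample = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "confirmed": [40, 10, 30, 20],
})
ranking = top_countries(sample, "confirmed", n=3)
print(ranking)
```

With Plotly installed, `plot_top_countries(sample, "confirmed", n=3).show()` would open the interactive chart. The same two functions can be reused for deaths, recoveries, and active cases by changing the `metric` argument.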
3.5.1 Part 1: Ranking Most affected countries
I) Top 10 Countries with Confirmed Cases:
The plot generated by the code displays the top ten countries by number of confirmed COVID-19 cases.
Source: (Benis et al., 2021)
II) Top ten Countries with Death Cases:
The plot generated by the code displays the top ten countries by number of deaths.
Source: (Verma et al., 2021)
III) Top ten Countries with Recovered Cases:
The plot generated by the code lists the top ten countries by number of recovered cases.
Source: (Shrestha et al., 2020)
IV) Top Ten Countries Active Cases:
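The report does not state how active cases were derived. A common convention, assumed here, is to subtract deaths and recoveries from confirmed cases. The sketch below applies that assumption to made-up numbers; the column names are hypothetical.

```python
import pandas as pd

# Hypothetical counts; the column names are assumptions.
cases = pd.DataFrame({
    "country": ["A", "B", "C"],
    "confirmed": [1000, 500, 800],
    "deaths": [50, 10, 40],
    "recovered": [700, 400, 100],
})

# Assumed definition: a case is active if it is neither recovered nor fatal.
cases["active"] = cases["confirmed"] - cases["deaths"] - cases["recovered"]

# Rank countries by active cases, largest first.
active_ranking = cases.sort_values("active", ascending=False).reset_index(drop=True)
print(active_ranking[["country", "active"]])
```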
Source: (Matta et al., 2020)

4. Analysis of COVID-19 Impacts
Because the world was not prepared for the pandemic, the first wave of COVID-19 affected industry worldwide. It led to an increase in cases, deaths, poverty, and unemployment, which in turn caused unexpected economic losses. It is therefore necessary to assess the growth of COVID-19 cases and their impact on the global economy (Asare and Barfi, 2021). The dataset obtained from Kaggle was used to examine the effects of COVID-19. It includes information on:
• The country code
• The names of all the countries
• The date of each record
• The human development index of each country
• Daily COVID-19 cases
• Daily COVID-19 deaths
• The stringency index of each country
• The population of each country
• The gross domestic product per capita of each country
The datasets on the impacts of COVID-19 are available at the following links (Kaggle.com):
https://www.kaggle.com/datasets/shashwatwork/impact-of-covid19-pandemic-on-the-global-economy
https://www.kaggle.com/imdevskp/corona-virus-report

4.1 Analysis of the Covid-19 Effects (Problem) using Python
Let's begin by importing the required Python modules and loading the dataset in order to address the selected problem, the analysis of the effects of COVID-19 (Hyman et al., 2021):
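A minimal sketch of the loading step, assuming the data arrives as a CSV file, could look like this. To stay self-contained it reads from an in-memory string instead of a downloaded file. The rows are made up, and every column name except `COUNTRY` (which the report uses later) is an assumption about the file layout.

```python
import io

import pandas as pd

# Stand-in for one of the downloaded CSV files; the rows are made up and the
# column names other than COUNTRY are assumptions about the dataset layout.
csv_text = """CODE,COUNTRY,DATE,HDI,TC,TD,STI,POP,GDPCAP
AAA,Country A,2020-03-01,0.50,10,1,40.0,1000,500
AAA,Country A,2020-03-02,0.50,15,1,45.0,1000,500
BBB,Country B,2020-03-01,0.70,20,2,60.0,2000,900
"""

# With a real download, the first argument would be the path to the CSV file.
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["DATE"])

print(data.head())
print(data.dtypes)
```

Parsing the date column on load means later filtering by period can compare real timestamps instead of strings.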
Source: (Asare and Barfi, 2021)
Statistics on COVID-19 cases and their effects on gross domestic product were used for the period from Dec 31, 2020, to Oct 10, 2021 (Sv et al., 2021).

4.2 Preparation of Data
The dataset used in this study consists of two data files. One file contains raw data, while the other contains processed data (Maital and Barzani, 2020). Both files were used for this problem because each holds equally important data in different columns. So, let's examine each dataset separately (Silva et al., 2020):
Source: (Asare and Barfi, 2021)
After a first look at both datasets, it was clear that they needed to be combined into a single dataset (Krarti and Aldubyan, 2021). Before creating the new dataset, however, let's check how many samples of each country are present:
    data["COUNTRY"].value_counts()
The dataset does not contain an equal number of samples for each country. Let's examine the mode of these counts:
    data["COUNTRY"].value_counts().mode()
The mode is thus 294. This value is used to divide the sums of all samples relating to population, gross domestic product per capita, and the human development index. Now let's combine the relevant variables from both datasets to generate a new set of data:
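The aggregation described above can be sketched as follows. This is an illustration under assumptions, not the report's code: the sample rows are made up, and the column names `HDI`, `POP`, `TC`, and `TD` are assumed. Static indicators repeat on every daily row, so their per-country sum is divided by the mode of the sample count, while daily counts are simply added up.

```python
import pandas as pd

# Made-up daily records; column names other than COUNTRY are assumptions.
data = pd.DataFrame({
    "COUNTRY": ["A", "A", "B", "B", "C"],
    "HDI": [0.5, 0.5, 0.8, 0.8, 0.6],
    "POP": [100, 100, 200, 200, 300],
    "TC": [1, 2, 3, 4, 5],
    "TD": [0, 1, 0, 1, 0],
})

# Most common number of rows per country (294 in the report's dataset).
samples_per_country = data["COUNTRY"].value_counts()
mode_count = int(samples_per_country.mode().iloc[0])

# Sum every numeric column per country; groups are sorted by country name.
summed = data.groupby("COUNTRY")[["HDI", "POP", "TC", "TD"]].sum()

aggregated = pd.DataFrame({
    "Country": summed.index.to_numpy(),
    # Static indicators repeat on every row, so the sum is divided by the mode.
    "HDI": (summed["HDI"] / mode_count).to_numpy(),
    "Population": (summed["POP"] / mode_count).to_numpy(),
    # Daily counts are added up.
    "Total Cases": summed["TC"].to_numpy(),
    "Total Deaths": summed["TD"].to_numpy(),
})
print(aggregated)
```

One consequence of dividing by the mode is visible in the sample: country C has fewer rows than the mode, so its averaged indicators come out too low. A per-country mean would avoid this, at the cost of departing from the method the report describes.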
Source: (Asare and Barfi, 2021)
The gross domestic product per capita column has not yet been included, because accurate GDP per capita figures could not be found in the dataset. It is therefore preferable to gather each country's GDP per capita manually. However, because of the large number of countries in this dataset, collecting GDP per capita data for every country by hand would be difficult. So, let's choose a subset of the dataset: the ten countries with the most COVID-19 cases. This makes a suitable sample for studying the effects of COVID-19 on the economy. Therefore, let's sort the data by the total number of COVID-19 cases:
Source: (Asare and Barfi, 2021)
Here is how the ten countries with the most cases were selected:
Source: (Asare and Barfi, 2021)
Two more columns were then added to this dataset (gross domestic product per capita before COVID-19 and gross domestic product per capita during COVID-19):
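One way to attach manually gathered values, sketched here under assumptions, is to keep them in dictionaries keyed by country and map them onto the table. The numbers below are placeholders and not real GDP figures, and the column names `GDP Before Covid` and `GDP During Covid` are hypothetical.

```python
import pandas as pd

# Stand-in for the top-countries table; values are made up.
top = pd.DataFrame({"Country": ["A", "B"], "Total Cases": [900, 700]})

# Placeholder numbers, NOT real GDP figures; real values are looked up by hand.
gdp_before = {"A": 100.0, "B": 200.0}
gdp_during = {"A": 90.0, "B": 190.0}

# Map each country name to its manually collected value.
top["GDP Before Covid"] = top["Country"].map(gdp_before)
top["GDP During Covid"] = top["Country"].map(gdp_during)

# Any country missing from the dictionaries shows up as NaN and can be checked.
missing = top[top[["GDP Before Covid", "GDP During Covid"]].isna().any(axis=1)]
print(top)
```

Mapping by name rather than assigning a positional list keeps the values aligned with the right country even if the table is re-sorted.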
Source: (Asare and Barfi, 2021)
Note: The gross domestic product per capita data was gathered manually.
4.3 Analysis of the Spread of Pandemic
Now let's examine the spread of COVID-19 in the countries with the highest number of cases. First, the total cases in each of these countries are plotted (Roy, 2020):
    import plotly.express as px
    figure = px.bar(data, y='Total Cases', x='Country', title="Countries with Highest Covid Cases")
    figure.show()
Source: (Asare and Barfi, 2021)
The United States was found to have a considerably higher number of COVID-19 cases than India and Brazil, which came second and third, respectively. Let's now examine the total number of deaths in the countries with the highest number of COVID-19 cases:
    figure = px.bar(data, y='Total Deaths', x='Country', title="Countries with Highest Deaths")
    figure.show()
Source: (Asare and Barfi, 2021)
As with the total number of COVID-19 cases, the United States leads in deaths, followed by India and Brazil in second and third position. Another point to note is that, compared with the total number of cases, the death rates in South Africa, India, and Russia are quite low. Let's now examine the total numbers of cases and deaths in each of these countries:
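A possible way to show cases and deaths side by side, sketched here rather than taken from the report, is to reshape the table to long form and draw a grouped bar chart. The sample totals are made up, and the helper `cases_and_deaths_figure` is hypothetical; the column names follow the bar chart snippets above.

```python
import pandas as pd

# Made-up totals; the column names follow the report's bar chart snippets.
sample = pd.DataFrame({
    "Country": ["A", "B"],
    "Total Cases": [900, 700],
    "Total Deaths": [30, 10],
})

# Reshape to one row per (country, metric) so both metrics share one axis.
long_form = sample.melt(
    id_vars="Country",
    value_vars=["Total Cases", "Total Deaths"],
    var_name="Metric",
    value_name="Count",
)


def cases_and_deaths_figure(df: pd.DataFrame):
    """Return a grouped bar chart with one bar per metric and country."""
    import plotly.express as px  # imported lazily; only needed for plotting

    return px.bar(df, x="Country", y="Count", color="Metric", barmode="group",
                  title="Total Cases and Deaths")


print(long_form)
```

With Plotly installed, `cases_and_deaths_figure(long_form).show()` would display the chart. Because deaths are much smaller than cases, a logarithmic y-axis may be worth considering for real data.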
Source: (Asare and Barfi, 2021)
Source: (Asare and Barfi, 2021)
Now let's examine the percentage of total deaths and total cases across the countries with the highest number of COVID-19 cases:
Source: (Asare and Barfi, 2021)
Source: (Asare and Barfi, 2021)
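The exact calculation behind this comparison is not given. One possible reading, assumed here, is that total cases and total deaths are each expressed as a percentage of their combined sum, as a pie chart would show them. The figures below are made up.

```python
import pandas as pd

# Made-up totals; the column names follow the report's bar chart snippets.
sample = pd.DataFrame({
    "Country": ["A", "B"],
    "Total Cases": [900, 700],
    "Total Deaths": [30, 10],
})

# Sum each metric over all countries in the sample.
totals = sample[["Total Cases", "Total Deaths"]].sum()

# Express each total as a percentage of the combined sum (assumed reading).
percent = totals / totals.sum() * 100
print(percent.round(2))
```

Note that this share is not the same quantity as a death rate, which divides deaths by cases alone.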
Python was used to calculate the death rate of COVID-19 cases, as shown below (Feyisa, 2020):
    death_rate = (data["Total Deaths"].sum() / data["Total Cases"].sum()) * 100
    print("Death Rate = ", death_rate)
Death Rate = 3.6145
The stringency index is another important column in this dataset. It is a composite measure of government response based on indicators such as travel bans, workplace closures, and school closures. It shows how strictly countries applied these measures to control the spread of COVID-19 (Vijay et al., 2020):

4.4 Effects of Covid-19 on the Economy Analysis
Now let's examine the economic impact of COVID-19. GDP per capita is the main measure used here to analyse the economic shutdowns brought on by the COVID-19 pandemic. Let's look at GDP per capita before the pandemic in the countries with the highest number of COVID-19 cases (Elavarasan et al., 2020):
Source: (Asare and Barfi, 2021)
Source: (Asare and Barfi, 2021)
Below, the impact of COVID-19 on gross domestic product per capita is examined by comparing GDP per capita before COVID-19 with GDP per capita during COVID-19:
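Beyond plotting the two columns, the comparison can be made explicit as a percentage change, (during - before) / before × 100. The sketch below uses placeholder values, not real GDP figures, and hypothetical column names.

```python
import pandas as pd

# Placeholder values, NOT real GDP figures; column names are assumptions.
gdp = pd.DataFrame({
    "Country": ["A", "B"],
    "GDP Before Covid": [100.0, 200.0],
    "GDP During Covid": [90.0, 190.0],
})

# Percentage change relative to the pre-pandemic value; negative means decline.
gdp["Change (%)"] = (
    (gdp["GDP During Covid"] - gdp["GDP Before Covid"])
    / gdp["GDP Before Covid"]
    * 100
)
print(gdp)
```

A relative change makes countries with very different income levels comparable, which absolute differences in GDP per capita do not.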
Source: (Asare and Barfi, 2021)
Source: (Asare and Barfi, 2021)
In all of the countries with the highest number of COVID-19 cases, GDP per capita declined. The human development index is another important economic factor. It is a statistical composite index of per capita income, education, and life expectancy. Let's take a look at how many countries were spending on human development:
Source: (Asare and Barfi, 2021)
This completes the analysis of the COVID-19 outbreak and its economic consequences.

5. Conclusion
This work examined the spread of COVID-19 across countries and how it affected the world economy. The analysis showed that the United States had the highest number of COVID-19 cases and deaths. One significant factor is the stringency index of the United States, which is rather low relative to its population. In addition, the GDP per capita of each of the countries most affected by the COVID-19 pandemic was analysed. COVID-19, a major current issue, was selected as the problem for this report, and Python was used to investigate its economic impact.
References
Amirova, E.F., Lomakin, D.E., Smoktal, N.N., Khamkhoeva, F.Y. and Zhilenko, V.Y., 2021. The impact of COVID-19 pandemic on the global economy and environment. Journal of Environmental Management & Tourism, 12(5), pp.1236-1241.
Arbuckle, D., 2010. Python Testing: Beginner's Guide. Packt Publishing Ltd.
Asare, P. and Barfi, R., 2021. The impact of Covid-19 pandemic on the Global economy: emphasis on poverty alleviation and economic growth. Economics, 8(1), pp.32-43. Retrieved from: https://www.kaggle.com/datasets/shashwatwork/impact-of-covid19-pandemic-on-the-global-economy
Badarinza, C., Balasubramaniam, V. and Ramadorai, T., 2019. The household finance landscape in emerging economies. Annual Review of Financial Economics, 11, pp.109-129.
Benis, A., Amador Nelke, S. and Winokur, M., 2021. Training the Next Industrial Engineers and Managers about Industry 4.0: a case study about challenges and opportunities in the COVID-19 Era. Sensors, 21(9), p.2905.
Bundervoet, T., Dávalos, M.E. and Garcia, N., 2022. The short-term impacts of COVID-19 on households in developing countries: An overview based on a harmonized dataset of high-frequency surveys. World Development, p.105844.
De Smedt, T. and Daelemans, W., 2012. Pattern for python. The Journal of Machine Learning Research, 13(1), pp.2063-2067.
Elavarasan, R.M., Shafiullah, G.M., Raju, K., Mudgal, V., Arif, M.T., Jamal, T., Subramanian, S., Balaguru, V.S., Reddy, K.S. and Subramaniam, U., 2020. COVID-19: Impact analysis and recommendations for power sector operation. Applied Energy, 279, p.115739.
Feyisa, H.L., 2020. The World Economy at COVID-19 quarantine: contemporary review. International Journal of Economics, Finance and Management Sciences, 8(2), pp.63-74.
Goldstein, M., Gonzalez, P.M., Sreelakshmi, P. and Wimpey, J., 2020. The global state of small business during COVID-19: Gender inequalities. World Bank Blogs. Full Results Forthcoming. Available at: https://blogs.worldbank.org/developmenttalk/global-state-small-business-during-covid-19-gender-inequalities.
Hyman, M., Mark, C., Imteaj, A., Ghiaie, H., Rezapour, S., Sadri, A.M. and Amini, M.H., 2021. Data analytics to evaluate the impact of infectious disease on economy: Case study of COVID-19 pandemic. Patterns, 2(8), p.100315.
Imtyaz, A., Haleem, A. and Javaid, M., 2020. Analysing governmental response to the COVID-19 pandemic. Journal of Oral Biology and Craniofacial Research, 10(4), pp.504-513.
Krarti, M. and Aldubyan, M., 2021. Review analysis of COVID-19 impact on electricity demand for residential buildings. Renewable and Sustainable Energy Reviews, 143, p.110888.
Maital, S. and Barzani, E., 2020. The global economic impact of COVID-19: A summary of research. Samuel Neaman Institute for National Policy Research, 2020, pp.1-12.
Manguri, K.H., Ramadhan, R.N. and Amin, P.R.M., 2020. Twitter sentiment analysis on worldwide COVID-19 outbreaks. Kurdistan Journal of Applied Research, pp.54-65.
Matta, S., Chopra, K.K. and Arora, V.K., 2020. Morbidity and mortality trends of Covid 19 in top 10 countries. Indian Journal of Tuberculosis, 67(4), pp.S167-S172.
Quaranta, L., Calefato, F. and Lanubile, F., 2021, May. KGTorrent: A dataset of python Jupyter notebooks from kaggle. In 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR) (pp. 550-554). IEEE.
Roy, S., 2020. Economic impact of Covid-19 pandemic. A preprint, pp.1-29.
Saurav, A., Kusek, P., Kuo, R. and Viney, B., 2021. The impact of COVID 19 on foreign investors: Evidence from the quarterly global multinational enterprise pulse survey for the first quarter of 2021. World Bank.
Silva, P.C., Batista, P.V., Lima, H.S., Alves, M.A., Guimarães, F.G. and Silva, R.C., 2020. COVID-ABS: An agent-based model of COVID-19 epidemic to simulate health and economic effects of social distancing interventions. Chaos, Solitons & Fractals, 139, p.110088.
Sv, P., Tandon, J. and Hinduja, H., 2021. Indian citizen's perspective about side effects of COVID-19 vaccine–A machine learning study. Diabetes & Metabolic Syndrome: Clinical Research & Reviews, 15(4), p.102172.
Verma, P., Dumka, A., Bhardwaj, A., Ashok, A., Kestwal, M.C. and Kumar, P., 2021. A statistical analysis of impact of COVID19 on the global economy and stock index returns. SN Computer Science, 2, pp.1-13.
Vijay, T., Chawla, A., Dhanka, B. and Karmakar, P., 2020, December. Sentiment analysis on covid-19 twitter data. In 2020 5th IEEE International Conference on Recent Advances and Innovations in Engineering (ICRAIE) (pp. 1-7). IEEE.
Wang, A.Y., Wang, D., Drozdal, J., Liu, X., Park, S., Oney, S. and Brooks, C., 2021, May. What makes a well-documented notebook? A case study of data scientists’ documentation practices in Kaggle. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-7).