Use the file 'Wells.csv' for your data and compute the correlation between: 1. RHOB and NPHI (bulk density and neutron porosity) 2. ROP (rate of penetration) and WOB (weight on bit)

Question

Hi, I am trying to write code that computes the covariance and the standard deviations of X and Y correctly (just one point is awarded for computing the variance) and arrives at the correct answer using only one function. The provided data set contains NaN values, so they need to be filtered out. Attached are a picture of the original prompt and the code I have put together so far.

Write only one function that will take as arguments 2 NumPy arrays (X and Y, which are of the same length, N) and then computes and outputs the following quantity:

$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^{N} (X_i - X_{\mathrm{mean}})(Y_i - Y_{\mathrm{mean}})}{(\sigma_X \sigma_Y)(N - 1)}$$

The quantity above is called the correlation and is defined as the covariance divided by the standard deviation in X and the standard deviation in Y. The standard deviation $\sigma_X$ is the square root of the variance, given by:

$$\sigma_X^2 = \frac{\sum_{i=1}^{N} (X_i - X_{\mathrm{mean}})^2}{N - 1}$$

A similar formula applies to the standard deviation of Y.
You need to use the following programming constructs:
1. Functions
2. Recursive loops
3. Numpy arrays
Use the file 'Wells.csv' for your data and compute the correlation between:
1. RHOB and NPHI (Bulk density and Neutron porosity)
2. ROP (Rate of Penetration) and WOB (Weight on Bit)
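One possible way to satisfy the prompt is sketched below. It is a minimal sketch rather than the graded solution: the function name `correlation` is hypothetical, "recursive loops" is read here as ordinary `for` loops over the arrays (an assumption about what the assignment intends), and NaN handling is done pairwise inside the function so that a value is dropped from both arrays whenever either one is missing.

```python
import math
import numpy as np


def correlation(x, y):
    """Pearson correlation of two equal-length arrays, ignoring NaN pairs."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("X and Y must have the same length")

    # Keep only the positions where both X and Y are present.
    keep = ~(np.isnan(x) | np.isnan(y))
    x, y = x[keep], y[keep]
    n = len(x)
    if n < 2:
        raise ValueError("need at least two complete pairs")

    # First loop: accumulate the sums needed for the two means.
    sum_x = sum_y = 0.0
    for i in range(n):
        sum_x += x[i]
        sum_y += y[i]
    mean_x, mean_y = sum_x / n, sum_y / n

    # Second loop: squared deviations (variance) and cross products (covariance).
    ss_x = ss_y = s_xy = 0.0
    for i in range(n):
        dx = x[i] - mean_x
        dy = y[i] - mean_y
        ss_x += dx * dx
        ss_y += dy * dy
        s_xy += dx * dy

    cov_xy = s_xy / (n - 1)
    sigma_x = math.sqrt(ss_x / (n - 1))
    sigma_y = math.sqrt(ss_y / (n - 1))
    return cov_xy / (sigma_x * sigma_y)
```

On data without NaN values the result should agree with `np.corrcoef(x, y)[0, 1]`, which makes a convenient cross-check. Python-level loops over roughly 56,000 rows are slow compared with vectorized NumPy, but they match the constructs the prompt asks for.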
In [1]: import pandas as pd
In [2]: df = pd.read_csv('Wells.csv')
In [3]: df
Out[3]:
       Well   Depth       GR    PEF1  PEF2       DT      ROP     WOB  DownT  Torque  ...  DownP    Mudflow     ECD   BS      RT
    0     1  2922.5  13.4058  8.7053   NaN  77.1874   4.5008  4.3012   71.0  23.583  ...  382.2  2200.9165  1.4182  8.5  1.6100
    1     1  2923.0  15.2468  6.4380   NaN  75.5047   6.5108  4.9543   71.0  33.721  ...  382.5  1993.9286  1.4188  8.5  1.6648
    2     1  2923.5  11.2243  6.2109   NaN  75.5697   7.6733  7.0439   71.0  34.831  ...  382.7  1993.9286  1.4195  8.5  1.6856
    3     1  2924.0  11.7085  5.9728   NaN  75.9891  10.2010  7.0977   72.0  35.166  ...  383.0  1993.9286  1.4204  8.5  1.4633
    4     1  2924.5  16.3429  6.1139   NaN  75.1929  12.8272  9.6089   72.0  34.892  ...  383.1  1993.9286  1.4206  8.5  1.5418
  ...
55940    15  4083.5  59.7060     NaN   NaN  68.0602      NaN     NaN    NaN     NaN  ...    NaN        NaN     NaN  NaN  1.7590
55941    15  4084.0  58.4170     NaN   NaN  70.3944      NaN     NaN    NaN     NaN  ...    NaN        NaN     NaN  NaN  1.6510
55942    15  4084.5  57.4990     NaN   NaN  71.9931      NaN     NaN    NaN     NaN  ...    NaN        NaN     NaN  NaN  1.5970
55943    15  4085.0  56.7850     NaN   NaN  72.7590      NaN     NaN    NaN     NaN  ...    NaN        NaN     NaN  NaN  1.4820
55944    15  4085.5  61.7220     NaN   NaN  72.8121      NaN     NaN    NaN     NaN  ...    NaN        NaN     NaN  NaN  1.4350
55945 rows × 22 columns
In [ ]: df.shape
In [ ]: df.isnull().sum()
In [ ]: df.dropna(how='any', inplace=True)
In [ ]: df.shape
In [ ]: df['ROP']
In [ ]: df = pd.read_csv('Wells.csv')
In [ ]: df['ROP']
        df['WOB']
        df['RHOB']
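A note on the NaN filtering in the notebook: `df.dropna(how='any')` removes every row that has a NaN in any of the 22 columns. The preview shows PEF2 as NaN in every visible row, so this call may discard far more rows than the correlation needs. A hedged alternative is to drop NaN values only within the pair of columns being compared. In the sketch below the helper name `paired_arrays` is hypothetical, and the small `demo` frame holds made-up values purely for illustration; it is not data from Wells.csv.

```python
import numpy as np
import pandas as pd


def paired_arrays(df, col_x, col_y):
    """Return two NumPy arrays, dropping rows where either column is NaN."""
    pair = df[[col_x, col_y]].dropna()
    return pair[col_x].to_numpy(), pair[col_y].to_numpy()


# Tiny stand-in for Wells.csv (made-up values, for illustration only).
demo = pd.DataFrame({
    "ROP": [4.5, 6.5, np.nan, 10.2],
    "WOB": [4.3, 5.0, 7.0, np.nan],
    "PEF2": [np.nan, np.nan, np.nan, np.nan],
})

print(len(demo.dropna(how="any")))   # 0: PEF2 is NaN in every row
rop, wob = paired_arrays(demo, "ROP", "WOB")
print(len(rop))                      # 2 complete ROP/WOB pairs
```

With the real file, the same helper could be called as `paired_arrays(pd.read_csv('Wells.csv'), 'RHOB', 'NPHI')` and again with `'ROP'` and `'WOB'`, assuming those column names exist in the file as the prompt suggests (RHOB and NPHI are hidden in the truncated preview). The resulting arrays can then be passed to the single correlation function the assignment asks for.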