Write a Python program that reads the data file https://archive.ics.uci.edu/ml/machine-learning-databases/eventdetection/CalIt2.data and finds the total count of outflow and the total count of inflow.
The attributes in the file are as follows:
1. Flow ID: 7 is out flow, 9 is in flow
2. Date: MM/DD/YY
3. Time: HH:MM:SS
4. Count: Number of counts reported for the previous half hour
Rows: Each half hour time slice is represented by 2 rows: one row for the out flow during that time period (ID=7) and one row for the in flow during that time period (ID=9)
Hint: # Importing the dataset
dataset = pd.read_csv('CalIt2.data')
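One possible way to approach the first task is sketched below. It assumes the file is comma-separated and has no header row, which is why `header=None` is passed (the hint's bare `pd.read_csv('CalIt2.data')` would otherwise treat the first data row as column names). The column names, the `total_flows` helper, and the sample rows are made up for illustration and are not part of the assignment.

```python
import io

import pandas as pd

# Column labels are our own choice; the file is assumed to have no header row.
COLUMNS = ["flow_id", "date", "time", "count"]


def total_flows(source):
    """Return (total_outflow, total_inflow) for a CalIt2-style CSV source.

    `source` can be a file path, a URL, or a file-like object.
    """
    dataset = pd.read_csv(source, header=None, names=COLUMNS)
    # Sum the half-hour counts separately for each flow ID.
    totals = dataset.groupby("flow_id")["count"].sum()
    # Flow ID 7 is out flow and 9 is in flow, per the attribute list.
    return int(totals.get(7, 0)), int(totals.get(9, 0))


# A tiny made-up sample in the assumed format, so the sketch runs offline.
sample = io.StringIO(
    "7,07/24/05,00:00:00,0\n"
    "9,07/24/05,00:00:00,0\n"
    "7,07/24/05,00:30:00,1\n"
    "9,07/24/05,00:30:00,3\n"
    "7,07/24/05,01:00:00,4\n"
    "9,07/24/05,01:00:00,6\n"
)
outflow, inflow = total_flows(sample)
print("Total outflow:", outflow)
print("Total inflow:", inflow)
```

For the real data, `total_flows('CalIt2.data')` should work on a downloaded copy, and passing the URL from the question directly is also an option since `pd.read_csv` accepts URLs.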
Use any data set (find a CSV data file on the web) that has the following properties:
- at least 5 features, where some of the features have categorical values,
- has some missing values,
and write code that will pre-process the data to deal with missing values, scale the data, and split the
processed data into 75% training, 15% validation, and 10% test sets.
- After preprocessing, use matplotlib to create xy plots of some of the features.
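The steps above could be sketched roughly as follows. Since the question leaves the choice of data set open, this sketch builds a small made-up DataFrame in place of a downloaded CSV; the column names, the median/mode imputation, the z-score scaling, and the helper names `preprocess`, `split`, and `plot_xy` are all assumptions chosen for illustration, not a prescribed solution.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for a CSV found on the web: 5 features, one categorical,
# with missing values injected into two of the columns.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 80, n).astype(float),
    "income": rng.normal(50_000, 15_000, n),
    "height_cm": rng.normal(170, 10, n),
    "city": rng.choice(["A", "B", "C"], n),
    "score": rng.uniform(0, 100, n),
})
df.loc[rng.choice(n, 20, replace=False), "income"] = np.nan
df.loc[rng.choice(n, 15, replace=False), "city"] = np.nan


def preprocess(frame):
    """Impute missing values, standardize numeric columns, one-hot encode the rest."""
    frame = frame.copy()
    num_cols = frame.select_dtypes(include="number").columns
    cat_cols = frame.columns.difference(num_cols)
    # Numeric gaps get the column median; categorical gaps get the most frequent value.
    frame[num_cols] = frame[num_cols].fillna(frame[num_cols].median())
    for col in cat_cols:
        frame[col] = frame[col].fillna(frame[col].mode().iloc[0])
    # Scale numeric columns to zero mean and unit standard deviation.
    frame[num_cols] = (frame[num_cols] - frame[num_cols].mean()) / frame[num_cols].std()
    # Turn each categorical column into indicator columns.
    return pd.get_dummies(frame, columns=list(cat_cols))


def split(frame, seed=0):
    """Shuffle the rows, then cut them into 75% / 15% / 10% pieces."""
    shuffled = frame.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_train = round(0.75 * len(shuffled))
    n_val = round(0.15 * len(shuffled))
    train = shuffled.iloc[:n_train]
    val = shuffled.iloc[n_train:n_train + n_val]
    test = shuffled.iloc[n_train + n_val:]
    return train, val, test


def plot_xy(frame, x, y, path="xy_plot.png"):
    """Save a scatter plot of column y against column x."""
    # Imported here so the preprocessing steps run even without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(frame[x], frame[y], s=10)
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    fig.savefig(path)
    plt.close(fig)


processed = preprocess(df)
train, val, test = split(processed)
print(len(train), len(val), len(test))
# plot_xy(train, "age", "income")  # uncomment to write xy_plot.png
```

The sketch scales before splitting because that is the order the question describes; computing the mean and standard deviation on the training rows only, then applying them to the validation and test rows, would avoid leaking information across the split. For a real data set, replacing the made-up `df` with `pd.read_csv(...)` on the chosen file should be the only change needed before calling `preprocess`.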