School: Kenyatta University
Course: 545
Subject: Statistics
Date: Nov 24, 2024
Uploaded by CoachDolphinMaster782
CO2 Intensity
Institutional Affiliation
Student’s Name
Date
CO2 Intensity (Draft)
1.1. Descriptive Statistics: Calculate and print the mean and standard deviation of carbon (actual) intensity values over the specified date range. Round the output numbers to three decimal places.
import pandas as pd
# Assumes the dataset is already loaded into a DataFrame named 'df' with a 'CO2_emission' column
# (replace 'df' with the actual variable name if it differs)
# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')
# Calculate mean and standard deviation
mean_co2_emission = df['CO2_emission'].mean()
std_dev_co2_emission = df['CO2_emission'].std()
# Print the results rounded to three decimal places
print(f"Mean CO2 Emission: {mean_co2_emission:.3f}")
print(f"Standard Deviation of CO2 Emission: {std_dev_co2_emission:.3f}")
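A note on the standard deviation: pandas' `Series.std()` returns the sample standard deviation by default (`ddof=1`), while `ddof=0` gives the population version. The short sketch below contrasts the two on a small made-up series (the values are illustrative only and are not taken from the dataset).

```python
import pandas as pd

# Illustrative values only; not part of the assignment dataset.
values = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Default: sample standard deviation (divides by n - 1).
sample_std = values.std()

# Population standard deviation (divides by n).
population_std = values.std(ddof=0)

print(f"Sample std: {sample_std:.3f}")          # 2.138
print(f"Population std: {population_std:.3f}")  # 2.000
```

Which version is appropriate depends on whether the readings are treated as a sample or as the full population; the task statement does not say, so the pandas default is a reasonable starting point.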
1.2. Time Period Analysis: Calculate and print the duration of the data collection period (i.e., the time between the earliest and latest timestamps).
import pandas as pd
# Create a DataFrame with the provided data
data = {
'Country': ['World', 'World', 'World', 'World', 'World', 'World', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa'],
'Energy_type': ['all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types',
'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other',
'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other'],
'Year': [1980] * 30,
'Energy_consumption': [292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415, 0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587, 0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723, 0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933, 0.005893112, 0, 0, 0.005893112, None, 0],
# Each column must hold exactly 30 values (5 countries x 6 energy types); the repeat counts below assume one value per country
'Energy_production': [296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837, 0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587, 0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582, 2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225, 0, 0, 0, 0, None, 0],
'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
'Population': [4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6 + [40.61530287] * 6 + [180.5156037] * 6,
'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
'CO2_emission': [4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, 0, 0, None, 0, 0]
}
df = pd.DataFrame(data)
# Convert 'Year' column to numeric, treating errors as NaN
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')
# Find the minimum and maximum years
min_year = df['Year'].min()
max_year = df['Year'].max()
# Calculate the duration
duration = max_year - min_year
# Print the result
print(f"Data Collection Period Duration: {duration} years")
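When the data carries full timestamps rather than only a year, the same min/max idea yields a `Timedelta`. A minimal sketch, assuming a hypothetical timestamp column named `from` and an intensity column named `actual` (neither name appears in the dataset above, and the values are made up):

```python
import pandas as pd

# Hypothetical readings; the column names 'from' and 'actual' are assumptions.
readings = pd.DataFrame({
    "from": ["2024-01-01 00:00", "2024-01-01 00:30", "2024-01-03 12:00"],
    "actual": [180, 175, 240],
})

# Parse the strings into timestamps so that subtraction yields a Timedelta.
readings["from"] = pd.to_datetime(readings["from"])

# Duration between the earliest and latest timestamps.
duration = readings["from"].max() - readings["from"].min()
print(f"Data collection period: {duration}")
```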
1.3. Peak Intensity Detection: Identify and print the timestamp and value associated with the highest carbon (actual) intensity.
import pandas as pd
# Create a DataFrame with the provided data
data = {
'Country': ['World', 'World', 'World', 'World', 'World', 'World', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa'],
'Energy_type': ['all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types',
'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other'],
'Year': [1980] * 30,
'Energy_consumption': [292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415, 0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587, 0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723, 0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933, 0.005893112, 0, 0, 0.005893112, None, 0],
# Each column must hold exactly 30 values (5 countries x 6 energy types); the repeat counts below assume one value per country
'Energy_production': [296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837, 0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587, 0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582, 2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225, 0, 0, 0, 0, None, 0],
'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
'Population': [4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6 + [40.61530287] * 6 + [180.5156037] * 6,
'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
'CO2_emission': [4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, 0, 0, None, 0, 0]
}
df = pd.DataFrame(data)
# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')
# Find the row with the highest CO2 emission
max_co2_row = df.loc[df['CO2_emission'].idxmax()]
# Extract timestamp and value
timestamp_max_co2 = max_co2_row['Year']
value_max_co2 = max_co2_row['CO2_emission']
# Print the result
print("Highest Carbon Intensity (CO2 Emission):")
print(f"Timestamp: {timestamp_max_co2}")
print(f"Value: {value_max_co2}")
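The `idxmax()` pattern used above carries over unchanged to timestamped readings. A small sketch, again assuming hypothetical `from` and `actual` columns with made-up values:

```python
import pandas as pd

# Hypothetical readings; column names and values are assumptions.
readings = pd.DataFrame({
    "from": pd.to_datetime([
        "2024-01-01 00:00",
        "2024-01-01 00:30",
        "2024-01-01 01:00",
    ]),
    "actual": [180.0, 265.0, 210.0],
})

# idxmax() returns the index label of the largest value (NaN entries are skipped).
peak_row = readings.loc[readings["actual"].idxmax()]
peak_time = peak_row["from"]
peak_value = peak_row["actual"]

print(f"Peak at {peak_time}: {peak_value}")
```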
1.4. Data Filtering: Filter the data to include only entries with (actual) carbon intensity values above 200, and then calculate and print the mean and standard deviation of the filtered data. Round the output numbers to three decimal places.
import pandas as pd
# Create a DataFrame with the provided data
data = {
'Country': ['World', 'World', 'World', 'World', 'World', 'World', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa'],
'Energy_type': ['all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids',
'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other'],
'Year': [1980] * 30,
'Energy_consumption': [292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415, 0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587, 0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723, 0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933, 0.005893112, 0, 0, 0.005893112, None, 0],
'Energy_production': [296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462,
20.77517837, 0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587, 0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582, 2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225, 0.005893112, 0, 0, 0.005893112, None, 0],
# Each column must hold exactly 30 values (5 countries x 6 energy types); the repeat counts below assume one value per country
'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
'Population': [4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6 + [40.61530287] * 6 + [180.5156037] * 6,
'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
'CO2_emission': [4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, 0, 0, None, 0, 0]
}
df = pd.DataFrame(data)
# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')
# Filter the data to include only entries with CO2 emission values above 200
filtered_data = df[df['CO2_emission'] > 200]
# Calculate mean and standard deviation of the filtered data
mean_filtered = filtered_data['CO2_emission'].mean()
std_dev_filtered = filtered_data['CO2_emission'].std()
# Print the results rounded to three decimal places
print(f"Filtered Data - Mean CO2 Emission: {mean_filtered:.3f}")
print(f"Filtered Data - Standard Deviation of CO2 Emission: {std_dev_filtered:.3f}")
1.5. Weekday vs. Weekend Analysis: Create a new column indicating whether each entry corresponds to a weekday or weekend, and then calculate and print the mean (actual) carbon intensity values for weekdays and weekends separately. Round the output numbers to three decimal places.
import pandas as pd
# Create a DataFrame with the provided data
data = {
'Country': ['World', 'World', 'World', 'World', 'World', 'World', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa'],
'Energy_type': ['all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types',
'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other'],
'Year': [1980] * 30,
'Energy_consumption': [292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415, 0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587, 0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723, 0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933, 0.005893112, 0, 0, 0.005893112, None, 0],
'Energy_production': [296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837, 0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587, 0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582, 2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225, 0.005893112, 0, 0, 0.005893112, None, 0],
# Each column must hold exactly 30 values (5 countries x 6 energy types); the repeat counts below assume one value per country
'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
'Population': [4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6 + [40.61530287] * 6 + [180.5156037] * 6,
'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
'CO2_emission': [4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, 0, 0, None, 0, 0]
}
df = pd.DataFrame(data)
# Convert 'Year' column to numeric, treating errors as NaN
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')
# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')
# Create a new column 'Day_Type' indicating whether each entry falls on a weekend (True) or a weekday (False)
# 'Year' carries no day information, so each entry is parsed as 1 January of its year
df['Day_Type'] = pd.to_datetime(df['Year'], format='%Y').dt.dayofweek >= 5 # 0-4: Weekdays, 5-6: Weekends
# Calculate the mean (actual) carbon intensity values for weekdays and weekends separately
# (a group with no rows yields NaN)
mean_weekday = df.loc[~df['Day_Type'], 'CO2_emission'].mean()
mean_weekend = df.loc[df['Day_Type'], 'CO2_emission'].mean()
# Print the results rounded to three decimal places
print(f"Weekday Analysis - Mean CO2 Emission: {mean_weekday:.3f}")
print(f"Weekend Analysis - Mean CO2 Emission: {mean_weekend:.3f}")
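Because a year alone cannot distinguish weekdays from weekends, the split only becomes meaningful with full dates. The sketch below shows the same idea on hypothetical timestamped readings (the `from` and `actual` column names and all values are assumptions made for illustration):

```python
import pandas as pd

# Hypothetical readings spanning a weekend; names and values are assumptions.
readings = pd.DataFrame({
    "from": pd.to_datetime([
        "2024-01-05 12:00",  # Friday
        "2024-01-06 12:00",  # Saturday
        "2024-01-07 12:00",  # Sunday
        "2024-01-08 12:00",  # Monday
    ]),
    "actual": [200.0, 150.0, 170.0, 220.0],
})

# dayofweek is 0 for Monday through 6 for Sunday, so 5 and 6 mark the weekend.
is_weekend = readings["from"].dt.dayofweek >= 5
readings["day_type"] = is_weekend.map({True: "weekend", False: "weekday"})

# Mean intensity per group, rounded to three decimal places.
mean_by_day_type = readings.groupby("day_type")["actual"].mean().round(3)
print(mean_by_day_type)
```

Storing a readable label instead of a boolean keeps the grouped output self-explanatory.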
1.6. Time of Day Analysis: Extract the hour of the day from timestamps, group the data by hour, and create a bar plot to visualize the mean carbon (actual) intensity for each hour of the day.
import pandas as pd
import matplotlib.pyplot as plt
# Create a DataFrame with the provided data
data = {
'Country': ['World', 'World', 'World', 'World', 'World', 'World', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa'],
'Energy_type': ['all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types',
'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other'],
'Year': [1980] * 30,
'Energy_consumption': [292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415, 0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587, 0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723, 0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933, 0.005893112, 0, 0, 0.005893112, None, 0],
'Energy_production': [296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837, 0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587, 0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582, 2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225, 0.005893112, 0, 0, 0.005893112, None, 0],
# Each column must hold exactly 30 values (5 countries x 6 energy types); the repeat counts below assume one value per country
'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
'Population': [4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6 + [40.61530287] * 6 + [180.5156037] * 6,
'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
'CO2_emission': [4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, 0, 0, None, 0, 0],
}
df = pd.DataFrame(data)
# Convert 'Year' column to numeric, treating errors as NaN
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')
# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')
# Extract the hour of the day from timestamps
# ('Year' has no time-of-day component, so every parsed timestamp falls at hour 0)
df['Hour_of_Day'] = pd.to_datetime(df['Year'], format='%Y').dt.hour
# Group data by the hour of the day and calculate the mean carbon intensity for each hour
mean_intensity_by_hour = df.groupby('Hour_of_Day')['CO2_emission'].mean()
# Create a bar plot to visualize the mean carbon intensity for each hour of the day
plt.bar(mean_intensity_by_hour.index, mean_intensity_by_hour)
plt.xlabel('Hour of the Day')
plt.ylabel('Mean CO2 Emission')
plt.title('Mean Carbon Intensity for Each Hour of the Day')
plt.show()
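With timestamps that include a time of day, the hourly grouping produces one bar per hour instead of a single bar at hour 0. A minimal sketch of the grouping step, assuming hypothetical `from` and `actual` columns with made-up values:

```python
import pandas as pd

# Hypothetical readings across two days; names and values are assumptions.
readings = pd.DataFrame({
    "from": pd.to_datetime([
        "2024-01-01 00:00",
        "2024-01-01 00:30",
        "2024-01-01 01:00",
        "2024-01-02 00:15",
        "2024-01-02 01:45",
    ]),
    "actual": [100.0, 120.0, 200.0, 140.0, 220.0],
})

# Extract the hour (0-23) and average the intensity within each hour.
readings["hour"] = readings["from"].dt.hour
hourly_mean = readings.groupby("hour")["actual"].mean()
print(hourly_mean)
```

The resulting series can be passed to `plt.bar(hourly_mean.index, hourly_mean)` exactly as in the code above.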
1.7. Histogram Visualization: Generate a histogram using the seaborn library, enabling the kernel density estimation (KDE) and utilizing 20 bins to portray the distribution of both actual and forecasted carbon intensity values. Ensure proper axis labeling and select a suitable bin count.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Assuming your dataset is named 'df'
# Replace 'df' with the actual variable name if it's different
# Create a DataFrame with the provided data
data = {
'Country': ['World', 'World', 'World', 'World', 'World', 'World', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa'],
'Energy_type': ['all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types',
'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other'],
'Year': [1980] * 30,
'Energy_consumption': [292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415, 0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587, 0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723, 0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933, 0.005893112, 0, 0, 0.005893112, None, 0],
'Energy_production': [296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837, 0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587, 0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582, 2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225, 0.005893112, 0, 0, 0.005893112, None, 0],
# Each column must hold exactly 30 values (5 countries x 6 energy types); the repeat counts below assume one value per country
'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
'Population': [4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6 + [40.61530287] * 6 + [180.5156037] * 6,
'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
'CO2_emission': [4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, 0, 0, None, 0, 0],
}
df = pd.DataFrame(data)
# Convert 'Year' column to numeric, treating errors as NaN
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')
# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')
# Use seaborn to create a histogram with KDE for both actual and forecasted carbon intensity values
plt.figure(figsize=(10, 6))
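ax = sns.histplot(data=df, x='CO2_emission', kde=True, bins=20, hue='Energy_type', multiple='stack')
plt.xlabel('Carbon Intensity')
plt.ylabel('Frequency')
plt.title('Distribution of Actual and Forecasted Carbon Intensity')
# seaborn draws the hue legend itself; retitle it rather than calling plt.legend(), which would replace it with an empty legend
ax.get_legend().set_title('Energy Type')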
plt.show()
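The task asks for both actual and forecasted values in one histogram, which is easiest when the two columns are reshaped into long form so that `hue` can separate them. The sketch below assumes hypothetical `actual` and `forecast` columns with made-up values; `plot_distribution` is an illustrative helper name, not part of any library.

```python
import pandas as pd

# Hypothetical wide-format readings; names and values are assumptions.
readings = pd.DataFrame({
    "actual": [180.0, 265.0, 210.0, 195.0],
    "forecast": [185.0, 250.0, 215.0, 190.0],
})

# Reshape to long form: one row per (series, intensity) pair.
long_df = readings.melt(
    value_vars=["actual", "forecast"],
    var_name="series",
    value_name="intensity",
)


def plot_distribution(long_df, bins=20):
    """Draw overlaid histograms with KDE curves, one per series."""
    # Imported here so the reshaping step above works without a plotting backend.
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.histplot(data=long_df, x="intensity", hue="series", kde=True, bins=bins)
    ax.set_xlabel("Carbon intensity")
    ax.set_ylabel("Count")
    ax.set_title("Distribution of actual and forecasted carbon intensity")
    plt.show()
```

Calling `plot_distribution(long_df)` then produces the figure; the 20-bin default mirrors the task statement.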
1.8. Boxplot of Actual and Forecasted Intensity: Create a Seaborn boxplot to visualize the distribution of carbon intensity values (actual and forecasted) and identify potential outliers (if there are any).
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Create a DataFrame with the provided data
data = {
'Country': ['World', 'World', 'World', 'World', 'World', 'World', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Afghanistan', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Albania', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'Algeria', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa', 'American Samoa'],
'Energy_type': ['all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear', 'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear',
'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear',
'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear',
'renewables_n_other', 'all_energy_types', 'coal', 'natural_gas', 'petroleum_n_other_liquids', 'nuclear',
'renewables_n_other'],
'Year': [1980] * 30,
'Energy_consumption': [292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415, 0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587, 0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723, 0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933, 0.005893112, 0, 0, 0.005893112, None, 0],
'Energy_production': [296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837, 0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587, 0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582, 2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225, 0.005893112, 0, 0, 0.005893112, None, 0],
# Each column must hold exactly 30 values (5 countries x 6 energy types); the repeat counts below assume one value per country
'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
'Population': [4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6 + [40.61530287] * 6 + [180.5156037] * 6,
'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
'CO2_emission': [4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, None, None, None, 0, 0, None, 0, 0, None, 0, 0],
}
df = pd.DataFrame(data)
# Convert 'Year' column to numeric, treating errors as NaN
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')
# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')
# Use seaborn to create a boxplot for actual and forecasted carbon intensity values
plt.figure(figsize=(10, 6))
sns.boxplot(data=df, x='Energy_type', y='CO2_emission')
plt.xlabel('Energy Type')
plt.ylabel('Carbon Intensity')
plt.title('Boxplot of Actual and Forecasted Carbon Intensity')
plt.show()
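A boxplot marks as outliers the points lying beyond the whiskers, which by default extend 1.5 times the interquartile range (IQR) past the quartiles. The same rule can be applied numerically to list the flagged values. A minimal sketch on made-up numbers (not taken from the dataset):

```python
import pandas as pd

# Illustrative values only; 400.0 is deliberately far from the rest.
values = pd.Series([180.0, 190.0, 200.0, 210.0, 220.0, 400.0])

# Quartiles and interquartile range.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Points beyond 1.5 * IQR from the quartiles are treated as potential outliers.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = values[(values < lower) | (values > upper)]

print(f"Bounds: [{lower}, {upper}]")
print(outliers)
```

Listing the flagged values alongside the plot makes it easier to report which observations were identified.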
2. Image Transformations using OpenCV