c02 intensity (2)

.docx

School

Kenyatta University *


Course

545

Subject

Statistics

Date

Nov 24, 2024

Type

docx

Pages

14

Uploaded by CoachDolphinMaster782

CO2 Intensity
Institutional Affiliation
Student’s Name
Date
CO2 Intensity (Draft)

1.1. Descriptive Statistics: Calculate and print the mean and standard deviation of carbon (actual) intensity values over the specified date range. Round the output numbers to three decimal places.

import pandas as pd

# Assuming your dataset is named 'df'
# Replace 'df' with the actual variable name if it's different

# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')

# Calculate mean and standard deviation (NaN values are skipped)
mean_co2_emission = df['CO2_emission'].mean()
std_dev_co2_emission = df['CO2_emission'].std()

# Print the results rounded to three decimal places
print(f"Mean CO2 Emission: {mean_co2_emission:.3f}")
print(f"Standard Deviation of CO2 Emission: {std_dev_co2_emission:.3f}")

1.2. Time Period Analysis: Calculate and print the duration of the data collection period (i.e., the time between the earliest and latest timestamps).

import pandas as pd

# Create a DataFrame with the provided data
# 5 countries x 6 energy types = 30 rows per column
data = {
    'Country': (['World'] * 6 + ['Afghanistan'] * 6 + ['Albania'] * 6
                + ['Algeria'] * 6 + ['American Samoa'] * 6),
    'Energy_type': ['all_energy_types', 'coal', 'natural_gas',
                    'petroleum_n_other_liquids', 'nuclear',
                    'renewables_n_other'] * 5,
    'Year': [1980] * 30,
    'Energy_consumption': [
        292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415,
        0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587,
        0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723,
        0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    'Energy_production': [
        296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837,
        0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587,
        0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582,
        2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225,
        # American Samoa (assumed: four zeros, None for nuclear, zero for renewables)
        0, 0, 0, 0, None, 0,
    ],
    # One value per country, repeated for its six energy types
    'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
    'Population': ([4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6
                   + [40.61530287] * 6 + [180.5156037] * 6),
    'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
    'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
    'CO2_emission': [
        4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        # American Samoa (assumed: nuclear and renewables are 0, as for the other countries)
        None, 0, 0, None, 0, 0,
    ],
}
df = pd.DataFrame(data)

# Convert 'Year' column to numeric, treating errors as NaN
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')

# Find the minimum and maximum years
min_year = df['Year'].min()
max_year = df['Year'].max()

# Calculate the duration (0 here, because every row is from 1980)
duration = max_year - min_year

# Print the result
print(f"Data Collection Period Duration: {duration} years")
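Task 1.2 asks for the time between the earliest and latest timestamps, while the data used in this report carries only a Year column. As a minimal sketch, assuming timestamped readings were available, the duration could be computed as a Timedelta. The `datetime` and `actual` column names and the sample values below are hypothetical and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical sample: column names and values are assumptions for illustration
sample = pd.DataFrame({
    "datetime": ["2024-01-01 00:00", "2024-01-01 00:30", "2024-01-03 12:00"],
    "actual": [210.0, 198.0, 250.0],
})

# Parse the strings into timestamps; unparseable entries become NaT
sample["datetime"] = pd.to_datetime(sample["datetime"], errors="coerce")

# Subtracting the earliest Timestamp from the latest yields a Timedelta
duration = sample["datetime"].max() - sample["datetime"].min()
print(f"Data Collection Period Duration: {duration}")
```

Keeping the result as a Timedelta preserves days, hours, and minutes, which a plain difference of years cannot express.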
1.3. Peak Intensity Detection: Identify and print the timestamp and value associated with the highest carbon (actual) intensity.

import pandas as pd

# Create a DataFrame with the provided data
# 5 countries x 6 energy types = 30 rows per column
data = {
    'Country': (['World'] * 6 + ['Afghanistan'] * 6 + ['Albania'] * 6
                + ['Algeria'] * 6 + ['American Samoa'] * 6),
    'Energy_type': ['all_energy_types', 'coal', 'natural_gas',
                    'petroleum_n_other_liquids', 'nuclear',
                    'renewables_n_other'] * 5,
    'Year': [1980] * 30,
    'Energy_consumption': [
        292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415,
        0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587,
        0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723,
        0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    'Energy_production': [
        296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837,
        0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587,
        0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582,
        2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225,
        # American Samoa (assumed: four zeros, None for nuclear, zero for renewables)
        0, 0, 0, 0, None, 0,
    ],
    # One value per country, repeated for its six energy types
    'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
    'Population': ([4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6
                   + [40.61530287] * 6 + [180.5156037] * 6),
    'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
    'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
    'CO2_emission': [
        4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        # American Samoa (assumed: nuclear and renewables are 0, as for the other countries)
        None, 0, 0, None, 0, 0,
    ],
}
df = pd.DataFrame(data)

# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')

# Find the row with the highest CO2 emission (idxmax skips NaN values)
max_co2_row = df.loc[df['CO2_emission'].idxmax()]

# Extract timestamp and value
timestamp_max_co2 = max_co2_row['Year']
value_max_co2 = max_co2_row['CO2_emission']

# Print the result
print("Highest Carbon Intensity (CO2 Emission):")
print(f"Timestamp: {timestamp_max_co2}")
print(f"Value: {value_max_co2}")

1.4. Data Filtering: Filter the data to include only entries with (actual) carbon intensity values above 200, and then calculate and print the mean and standard deviation of the filtered data. Round the output numbers to three decimal places.

import pandas as pd

# Create a DataFrame with the provided data
# 5 countries x 6 energy types = 30 rows per column
data = {
    'Country': (['World'] * 6 + ['Afghanistan'] * 6 + ['Albania'] * 6
                + ['Algeria'] * 6 + ['American Samoa'] * 6),
    'Energy_type': ['all_energy_types', 'coal', 'natural_gas',
                    'petroleum_n_other_liquids', 'nuclear',
                    'renewables_n_other'] * 5,
    'Year': [1980] * 30,
    'Energy_consumption': [
        292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415,
        0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587,
        0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723,
        0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    'Energy_production': [
        296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837,
        0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587,
        0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582,
        2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    # One value per country, repeated for its six energy types
    'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
    'Population': ([4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6
                   + [40.61530287] * 6 + [180.5156037] * 6),
    'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
    'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
    'CO2_emission': [
        4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        # American Samoa (assumed: nuclear and renewables are 0, as for the other countries)
        None, 0, 0, None, 0, 0,
    ],
}
df = pd.DataFrame(data)

# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')

# Filter the data to include only entries with CO2 emission values above 200
filtered_data = df[df['CO2_emission'] > 200]

# Calculate mean and standard deviation of the filtered data
mean_filtered = filtered_data['CO2_emission'].mean()
std_dev_filtered = filtered_data['CO2_emission'].std()

# Print the results rounded to three decimal places
print(f"Filtered Data - Mean CO2 Emission: {mean_filtered:.3f}")
print(f"Filtered Data - Standard Deviation of CO2 Emission: {std_dev_filtered:.3f}")

1.5. Weekday vs. Weekend Analysis: Create a new column indicating whether each entry corresponds to a weekday or weekend, and then calculate and print the mean (actual) carbon intensity values for weekdays and weekends separately. Round the output numbers to three decimal places.

import pandas as pd

# Create a DataFrame with the provided data
# 5 countries x 6 energy types = 30 rows per column
data = {
    'Country': (['World'] * 6 + ['Afghanistan'] * 6 + ['Albania'] * 6
                + ['Algeria'] * 6 + ['American Samoa'] * 6),
    'Energy_type': ['all_energy_types', 'coal', 'natural_gas',
                    'petroleum_n_other_liquids', 'nuclear',
                    'renewables_n_other'] * 5,
    'Year': [1980] * 30,
    'Energy_consumption': [
        292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415,
        0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587,
        0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723,
        0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    'Energy_production': [
        296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837,
        0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587,
        0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582,
        2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    # One value per country, repeated for its six energy types
    'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
    'Population': ([4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6
                   + [40.61530287] * 6 + [180.5156037] * 6),
    'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
    'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
    'CO2_emission': [
        4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        # American Samoa (assumed: nuclear and renewables are 0, as for the other countries)
        None, 0, 0, None, 0, 0,
    ],
}
df = pd.DataFrame(data)

# Convert 'Year' column to numeric, treating errors as NaN
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')

# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')

# Create a new column 'Day_Type': True for weekend, False for weekday
# dayofweek is 0-4 for Monday-Friday and 5-6 for Saturday-Sunday
# A bare year parses as 1 January, so every row here falls on the same day
df['Day_Type'] = pd.to_datetime(df['Year'], format='%Y').dt.dayofweek >= 5

# Calculate the mean (actual) carbon intensity values for weekdays and weekends separately
mean_weekday = df.loc[~df['Day_Type'], 'CO2_emission'].mean()
mean_weekend = df.loc[df['Day_Type'], 'CO2_emission'].mean()

# Print the results rounded to three decimal places (an empty group prints nan)
print(f"Weekday Analysis - Mean CO2 Emission: {mean_weekday:.3f}")
print(f"Weekend Analysis - Mean CO2 Emission: {mean_weekend:.3f}")

1.6. Time of Day Analysis: Extract the hour of the day from timestamps, group the data by hour, and create a bar plot to visualize the mean carbon (actual) intensity for each hour of the day.
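As a minimal sketch of this task, assuming timestamped readings are available, the hour can be extracted with the `.dt.hour` accessor and the means grouped per hour. The data used in this report carries only a Year column, so the `datetime` and `actual` column names and the sample values below are hypothetical.

```python
import pandas as pd

# Hypothetical sample: column names and values are assumptions for illustration
sample = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2024-01-01 08:00", "2024-01-01 08:30",
        "2024-01-01 17:00", "2024-01-02 08:00",
    ]),
    "actual": [200.0, 220.0, 300.0, 240.0],
})

# Extract the hour of the day (0-23) from each timestamp
sample["hour"] = sample["datetime"].dt.hour

# Mean intensity for each hour of the day
mean_by_hour = sample.groupby("hour")["actual"].mean()
print(mean_by_hour.round(3))
```

Passing `mean_by_hour.index` and `mean_by_hour.values` to `plt.bar` would then give one bar per hour that actually occurs in the data.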
import pandas as pd
import matplotlib.pyplot as plt

# Create a DataFrame with the provided data
# 5 countries x 6 energy types = 30 rows per column
data = {
    'Country': (['World'] * 6 + ['Afghanistan'] * 6 + ['Albania'] * 6
                + ['Algeria'] * 6 + ['American Samoa'] * 6),
    'Energy_type': ['all_energy_types', 'coal', 'natural_gas',
                    'petroleum_n_other_liquids', 'nuclear',
                    'renewables_n_other'] * 5,
    'Year': [1980] * 30,
    'Energy_consumption': [
        292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415,
        0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587,
        0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723,
        0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    'Energy_production': [
        296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837,
        0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587,
        0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582,
        2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    # One value per country, repeated for its six energy types
    'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
    'Population': ([4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6
                   + [40.61530287] * 6 + [180.5156037] * 6),
    'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
    'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
    'CO2_emission': [
        4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        # American Samoa (assumed: nuclear and renewables are 0, as for the other countries)
        None, 0, 0, None, 0, 0,
    ],
}
df = pd.DataFrame(data)
# Convert 'Year' column to numeric, treating errors as NaN
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')

# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')

# Extract the hour of the day from timestamps (assuming 'Year' is a timestamp column)
# A bare year parses as midnight on 1 January, so every row here gets hour 0
df['Hour_of_Day'] = pd.to_datetime(df['Year'], format='%Y').dt.hour

# Group data by the hour of the day and calculate the mean carbon intensity for each hour
mean_intensity_by_hour = df.groupby('Hour_of_Day')['CO2_emission'].mean()

# Create a bar plot to visualize the mean carbon intensity for each hour of the day
plt.bar(mean_intensity_by_hour.index, mean_intensity_by_hour.values)
plt.xlabel('Hour of the Day')
plt.ylabel('Mean CO2 Emission')
plt.title('Mean Carbon Intensity for Each Hour of the Day')
plt.show()

1.7. Histogram Visualization: Generate a histogram using the seaborn library, enabling the kernel density estimation (KDE) and utilizing 20 bins to portray the distribution of both actual and forecasted carbon intensity values. Ensure proper axis labeling and select a suitable bin count.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Create a DataFrame with the provided data
# 5 countries x 6 energy types = 30 rows per column
data = {
    'Country': (['World'] * 6 + ['Afghanistan'] * 6 + ['Albania'] * 6
                + ['Algeria'] * 6 + ['American Samoa'] * 6),
    'Energy_type': ['all_energy_types', 'coal', 'natural_gas',
                    'petroleum_n_other_liquids', 'nuclear',
                    'renewables_n_other'] * 5,
    'Year': [1980] * 30,
    'Energy_consumption': [
        292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415,
        0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587,
        0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723,
        0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    'Energy_production': [
        296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837,
        0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587,
        0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582,
        2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    # One value per country, repeated for its six energy types
    'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
    'Population': ([4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6
                   + [40.61530287] * 6 + [180.5156037] * 6),
    'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
    'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
    'CO2_emission': [
        4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        # American Samoa (assumed: nuclear and renewables are 0, as for the other countries)
        None, 0, 0, None, 0, 0,
    ],
}
df = pd.DataFrame(data)

# Convert 'Year' column to numeric, treating errors as NaN
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')

# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')

# Use seaborn to create a histogram with KDE for both actual and forecasted carbon intensity values
plt.figure(figsize=(10, 6))
ax = sns.histplot(data=df, x='CO2_emission', kde=True, bins=20, hue='Energy_type', multiple='stack')
plt.xlabel('Carbon Intensity')
plt.ylabel('Frequency')
plt.title('Distribution of Actual and Forecasted Carbon Intensity')
# seaborn draws the hue legend itself; only its title is changed here
ax.get_legend().set_title('Energy Type')
plt.show()

1.8. Boxplot of Actual and Forecasted Intensity: Create a Seaborn boxplot to visualize the distribution of carbon intensity values (actual and forecasted) and identify potential outliers (if there are any).

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Create a DataFrame with the provided data
# 5 countries x 6 energy types = 30 rows per column
data = {
    'Country': (['World'] * 6 + ['Afghanistan'] * 6 + ['Albania'] * 6
                + ['Algeria'] * 6 + ['American Samoa'] * 6),
    'Energy_type': ['all_energy_types', 'coal', 'natural_gas',
                    'petroleum_n_other_liquids', 'nuclear',
                    'renewables_n_other'] * 5,
    'Year': [1980] * 30,
    'Energy_consumption': [
        292.8997896, 78.65613403, 53.8652233, 132.0640194, 7.575700462, 20.70234415,
        0.026583217, 0.002479248, 0.002094, 0.014624098, None, 0.00738587,
        0.162981822, 0.024317315, 0.01047, 0.099297277, None, 0.02889723,
        0.780695167, 0.002547398, 0.5428, 0.232740836, None, 0.002606933,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    'Energy_production': [
        296.3372276, 80.11419429, 54.76104559, 133.1111089, 7.575700462, 20.77517837,
        0.072561156, 0.002355286, 0.06282, 0, None, 0.00738587,
        0.15556162, 0.013229039, 0.01047, 0.10154, None, 0.030322582,
        2.803017355, 7.59E-05, 0.48498, 2.31538521, None, 0.002576225,
        0.005893112, 0, 0, 0.005893112, None, 0,
    ],
    # One value per country, repeated for its six energy types (30 rows per column)
    'GDP': [27770.91028] * 6 + [13356.5] * 6 + [2682.7] * 6 + [19221.7] * 6 + [32.646] * 6,
    'Population': ([4298126.522] * 6 + [1.990283134] * 6 + [60.75290633] * 6
                   + [40.61530287] * 6 + [180.5156037] * 6),
    'Energy_intensity_per_capita': [68.14592081] * 6 + [0] * 24,
    'Energy_intensity_by_GDP': [10.54699996] * 6 + [0] * 24,
    'CO2_emission': [
        4946.62713, 1409.790188, 1081.593377, 2455.243565, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        None, None, None, None, 0, 0,
        # American Samoa (assumed: nuclear and renewables are 0, as for the other countries)
        None, 0, 0, None, 0, 0,
    ],
}
df = pd.DataFrame(data)

# Convert 'Year' column to numeric, treating errors as NaN
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')

# Convert 'CO2_emission' column to numeric, treating errors as NaN
df['CO2_emission'] = pd.to_numeric(df['CO2_emission'], errors='coerce')

# Use seaborn to create a boxplot for actual and forecasted carbon intensity values
plt.figure(figsize=(10, 6))
sns.boxplot(data=df, x='Energy_type', y='CO2_emission')
plt.xlabel('Energy Type')
plt.ylabel('Carbon Intensity')
plt.title('Boxplot of Actual and Forecasted Carbon Intensity')
plt.show()

2. Image Transformations using OpenCV