
### THE DATA 

import numpy as np
import pandas as pd

# Map each municipality's Twitter handle to the municipality name.
mun_dict = {
    '@CityofCTAlerts': 'Cape Town',
    '@CityPowerJhb': 'Johannesburg',
    '@eThekwiniM': 'eThekwini',
    '@EMMInfo': 'Ekurhuleni',
    '@centlecutility': 'Mangaung',
    '@NMBmunicipality': 'Nelson Mandela Bay',
    '@CityTshwane': 'Tshwane'
}

# Load the November 2019 tweets and preview the first rows.
twitter_url = 'https://raw.githubusercontent.com/Explore-AI/Public-Data/master/Data/twitter_nov_2019.csv'
twitter_df = pd.read_csv(twitter_url)
twitter_df.head()

### QUESTION

Municipality & Hashtag Detector

Write a function which takes in a pandas dataframe and returns a modified dataframe that includes two new columns that contain information about the municipality and hashtag of the tweet.

Function Specifications:

  • The function should take a pandas dataframe as input.
  • Extract the municipality from a tweet using the mun_dict dictionary given at the start of the notebook and insert the result into a new column named 'municipality' in the same dataframe.
  • Use the entry np.nan when a municipality is not found.
  • Extract a list of hashtags from a tweet into a new column named 'hashtags' in the same dataframe.
  • Use the entry np.nan when no hashtags are found.

Hint: you will need the mun_dict variable defined at the top of this notebook.
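
One possible way to meet the specification is sketched below. It is not the posted solution, only a minimal illustration under stated assumptions: the function name `extract_municipality_hashtags` and the `text_col` parameter are hypothetical, the tweet text is assumed to live in a column called 'Tweets' (the question does not name the column), and handles and hashtags are assumed to appear as whitespace-separated tokens with no trailing punctuation.

```python
import numpy as np
import pandas as pd

# Handle-to-municipality lookup, copied from the question.
mun_dict = {
    '@CityofCTAlerts': 'Cape Town',
    '@CityPowerJhb': 'Johannesburg',
    '@eThekwiniM': 'eThekwini',
    '@EMMInfo': 'Ekurhuleni',
    '@centlecutility': 'Mangaung',
    '@NMBmunicipality': 'Nelson Mandela Bay',
    '@CityTshwane': 'Tshwane'
}


def extract_municipality_hashtags(df, text_col='Tweets'):
    """Add 'municipality' and 'hashtags' columns to df and return it.

    text_col is an assumption: the question does not say which column
    holds the tweet text, so pass the real column name if it differs.
    """

    def find_municipality(tweet):
        # Return the municipality for the first known handle in the tweet.
        for token in str(tweet).split():
            if token in mun_dict:
                return mun_dict[token]
        return np.nan

    def find_hashtags(tweet):
        # Collect every token that starts with '#' and has at least one
        # character after it; fall back to np.nan when there are none.
        tags = [t for t in str(tweet).split()
                if t.startswith('#') and len(t) > 1]
        return tags if tags else np.nan

    # The specification asks for the columns in the same dataframe,
    # so df is modified in place and then returned.
    df['municipality'] = df[text_col].apply(find_municipality)
    df['hashtags'] = df[text_col].apply(find_hashtags)
    return df


# Small made-up sample, used here instead of the real CSV.
sample_df = pd.DataFrame({'Tweets': [
    '@CityPowerJhb outage reported in #Soweto #Eskom',
    'No handles or tags in this one',
]})
print(extract_municipality_hashtags(sample_df))
```

Because matching is done on exact whitespace-separated tokens, a handle followed directly by punctuation (for example `@CityPowerJhb,`) would not be recognised; stripping punctuation from each token or using a regular expression would be a reasonable refinement if the data requires it.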
