Stop Words
Write a function that removes English stop words from a tweet.
Function Specifications:
- It should take a pandas dataframe as input.
- It should tokenise the sentences according to the definition in function 6. Note that function 6 cannot be called within this function.
- It should remove all stop words from the tokenised list. The stop words are defined in the stop_words_dict variable at the top of this notebook.
- The resulting tokenised list should be placed in a column named "Without Stop Words".
- The function should modify the input dataframe.
- The function should return the modified dataframe.
Expected Output:
Specific rows:
stop_words_remover(twitter_df.copy()).loc[0, "Without Stop Words"] == ['@bongadlulane', 'send', 'email', 'mediadesk@eskom.co.za']
stop_words_remover(twitter_df.copy()).loc[100, "Without Stop Words"] == ['#eskomnorthwest', '#mediastatement', ':', 'notice', 'supply', 'interruption', 'lichtenburg', 'area', 'https://t.co/7hfwvxllit']
Whole table:
stop_words_remover(twitter_df.copy())
| | Tweets | Date | Without Stop Words |
|---|---|---|---|
| 0 | @BongaDlulane Please send an email to mediades... | 2019-11-29 12:50:54 | [@bongadlulane, send, email, mediadesk@eskom.c... |
| 1 | @saucy_mamiie Pls log a call on 0860037566 | 2019-11-29 12:46:53 | [@saucy_mamiie, pls, log, 0860037566] |
| 2 | @BongaDlulane Query escalated to media desk. | 2019-11-29 12:46:10 | [@bongadlulane, query, escalated, media, desk.] |
| 3 | Before leaving the office this afternoon, head... | 2019-11-29 12:33:36 | [leaving, office, afternoon,, heading, weekend... |
| 4 | #ESKOMFREESTATE #MEDIASTATEMENT : ESKOM SUSPEN... | 2019-11-29 12:17:43 | [#eskomfreestate, #mediastatement, :, eskom, s... |
| ... | ... | ... | ... |
| 195 | Eskom's Visitors Centres’ facilities include i... | 2019-11-20 10:29:07 | [eskom's, visitors, centres’, facilities, incl... |
| 196 | #Eskom connected 400 houses and in the process... | 2019-11-20 10:25:20 | [#eskom, connected, 400, houses, process, conn... |
| 197 | @ArthurGodbeer Is the power restored as yet? | 2019-11-20 10:07:59 | [@arthurgodbeer, power, restored, yet?] |
| 198 | @MuthambiPaulina @SABCNewsOnline @IOL @eNCA @e... | 2019-11-20 10:07:41 | [@muthambipaulina, @sabcnewsonline, @iol, @enc... |
| 199 | RT @GP_DHS: The @GautengProvince made a commit... | 2019-11-20 10:00:09 | [rt, @gp_dhs:, @gautengprovince, commitment, e... |
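One possible approach, sketched under stated assumptions rather than given as the notebook's reference solution: the expected rows suggest that function 6 lower-cases each tweet and splits it on whitespace (punctuation stays attached, as in 'desk.' and 'yet?'), so the sketch repeats that logic inline instead of calling function 6. The layout of stop_words_dict is not shown in the question; the version below, with a "stopwords" key holding a list, is a hypothetical stand-in containing only the few stop words that can be inferred from rows 1, 2 and 197 of the table.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's stop_words_dict. The real variable
# is defined at the top of the notebook; its key name and contents are assumed.
stop_words_dict = {"stopwords": ["a", "call", "on", "to", "is", "the", "as"]}


def stop_words_remover(df):
    """Add a 'Without Stop Words' column to df and return the same dataframe."""
    # A set makes each membership test constant time.
    stop_words = set(stop_words_dict["stopwords"])

    # Assumed tokenisation (inferred from the expected output, not confirmed):
    # lower-case the tweet, then split on whitespace.
    df["Without Stop Words"] = df["Tweets"].apply(
        lambda tweet: [tok for tok in tweet.lower().split() if tok not in stop_words]
    )
    return df


# Three tweets copied from the untruncated rows of the table above.
sample_df = pd.DataFrame(
    {
        "Tweets": [
            "@saucy_mamiie Pls log a call on 0860037566",
            "@BongaDlulane Query escalated to media desk.",
            "@ArthurGodbeer Is the power restored as yet?",
        ]
    }
)

result = stop_words_remover(sample_df.copy())
print(result["Without Stop Words"].tolist())
```

Assigning the new column directly on df modifies the dataframe that was passed in, which is why the examples in the question call the function on twitter_df.copy(). With the full stop word list from the notebook, the same function should be checked against the rows 0 and 100 given under "Specific rows".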