
Question

Write Python code to count the frequency of hashtags in a twitter feed.

Your code assumes a twitter feed variable tweets exists, which is a list of strings containing tweets. Each element of this list is a single tweet, stored as a string. For example, tweets may look like:

tweets = ["Happy #IlliniFriday!", "It is a pretty campus, isn't it, #illini?", "Diving into the last weekend of winter break like... #ILLINI #JoinTheFight", "Are you wearing your Orange and Blue today, #Illini Nation?"]

 

Your code should produce a sorted list of tuples stored in hashtag_counts, where each tuple looks like (hashtag, count), hashtag is a string and count is an integer. The list should be sorted by count in descending order, and if there are hashtags with identical counts, these should be sorted alphabetically, in ascending order, by hashtag.

From the above example, our unsorted hashtag_counts might look like:

[('#illini', 2), ('#jointhefight', 1), ('#illinifriday!', 1), ('#illini?', 1)]

The hashtag_counts sorted by the above specifications will look like:

[('#illini', 2), ('#illini?', 1), ('#illinifriday!', 1), ('#jointhefight', 1)]
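The ordering described above can be sketched with a single call to sorted() and a tuple key. This is only one possible approach; the name unsorted_counts is an illustrative assumption, and the data is the example list from the question.

```python
# Example data copied from the question; "unsorted_counts" is a hypothetical name.
unsorted_counts = [('#illini', 2), ('#jointhefight', 1),
                   ('#illinifriday!', 1), ('#illini?', 1)]

# Key is (-count, hashtag): negating the count puts larger counts first,
# and ties fall back to the hashtag string in ascending order.
hashtag_counts = sorted(unsorted_counts, key=lambda pair: (-pair[1], pair[0]))

print(hashtag_counts)
```

Because Python compares tuples element by element, the hashtag is only consulted when two counts are equal.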

 

You may use str.split() to split each tweet into a list of words. A hashtag is any word that starts with a hash mark (#). (That means that the hash mark # should be included in the hashtag value above.)
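As a rough illustration of this definition, the sketch below pulls the hashtags out of one tweet. The helper name hashtags_in is an assumption made for this example and is not part of the assignment.

```python
def hashtags_in(tweet):
    """Return the words of a tweet that start with '#', unchanged."""
    # split() with no arguments splits on any run of whitespace.
    return [word for word in tweet.split() if word.startswith("#")]

# One of the example tweets from the question.
tweet = "Diving into the last weekend of winter break like... #ILLINI #JoinTheFight"
found = hashtags_in(tweet)
print(found)
```

Punctuation attached to a word is left in place, so "#illini?" would be kept exactly as written.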

Steps/Hints:

  • Preprocessing: You will need to convert each hashtag to lower case before you count it. For example, in this question #UIUC and #Uiuc both add to the count of the same hashtag (#uiuc).

  • Do not process the tweets or hashtags beyond using .split(); for example, do not attempt to remove punctuation. While in the 'real world' you would absolutely do this, in this problem the autograder will be unhappy with you if you do.

  • If you use .split(), do not pass any arguments (with no arguments, it splits on every kind of whitespace).

  • You may find it helpful to use an intermediate data structure for this problem to count the frequency of each hashtag.

  • If you aren't sure how to sort or convert to lowercase, you may find the Python docs on sorting and on string methods useful.
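Putting the hints together, one possible end-to-end sketch is shown below. It assumes a plain dict as the intermediate data structure (the name counts is illustrative) and defines tweets inline only so the example runs on its own; in the assignment the setup code provides tweets.

```python
# Example feed from the question; in the assignment, tweets already exists.
tweets = ["Happy #IlliniFriday!",
          "It is a pretty campus, isn't it, #illini?",
          "Diving into the last weekend of winter break like... #ILLINI #JoinTheFight",
          "Are you wearing your Orange and Blue today, #Illini Nation?"]

counts = {}  # intermediate structure: lowercased hashtag -> count
for tweet in tweets:
    for word in tweet.split():        # no arguments: split on any whitespace
        if word.startswith("#"):
            tag = word.lower()        # only lowercasing, no punctuation removal
            counts[tag] = counts.get(tag, 0) + 1

# Sort by count descending, then by hashtag ascending; sorted() returns a list.
hashtag_counts = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
print(hashtag_counts)
```

collections.Counter from the standard library would work equally well in place of the dict; the plain dict is used here only to keep each step visible.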

Optional Practice - Plotting with Matplotlib

  • Try using plt.barh() to plot a histogram illustrating the frequency of each word. Try adding x-axis and y-axis labels and a title to your plot.

  • We won't be grading your plot. However, we will be grading plots in future assignments, so we strongly recommend giving this a shot to make sure you are familiar with matplotlib plots and with how to add labels, titles, etc.
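For the optional practice, a minimal sketch of a horizontal bar chart might look like the following. It assumes matplotlib is installed and that hashtag_counts has already been computed; the function name plot_hashtag_counts and the label text are illustrative choices, not requirements.

```python
# Sorted result from the question's example, repeated so the sketch stands alone.
hashtag_counts = [('#illini', 2), ('#illini?', 1),
                  ('#illinifriday!', 1), ('#jointhefight', 1)]

# barh draws the first item at the bottom, so reverse to put the
# most frequent hashtag at the top of the chart.
labels = [tag for tag, _ in reversed(hashtag_counts)]
counts = [count for _, count in reversed(hashtag_counts)]

def plot_hashtag_counts(labels, counts):
    """Draw a horizontal bar chart of hashtag frequencies (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported here so the data prep runs without it
    plt.barh(labels, counts)
    plt.xlabel("Count")
    plt.ylabel("Hashtag")
    plt.title("Hashtag frequency")
    plt.tight_layout()
    plt.show()
```

Calling plot_hashtag_counts(labels, counts) would then display the chart.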

The setup code gives the following variable:

Name      Type    Description
tweets    list    a list of strings containing tweets

Your code snippet should define the following variable:

Name              Type    Description
hashtag_counts    list    list of tuples (hashtag, count) where hashtag is a string and count is an integer
Follow-up Question
  • Max points: 30
  • Earned points: 0
  • Message
    Feedback for case 0
    ---------------------
    hashtag_counts is not a list.

    Feedback for case 1
    ---------------------
    hashtag_counts is not a list.

    Feedback for case 2
    ---------------------
    hashtag_counts is not a list.
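The feedback does not say what type hashtag_counts actually had, so the following is only a guess at one common cause: leaving the result as a dict or as a dict view instead of a list. The names counts and as_view are assumptions made for this sketch.

```python
# Hypothetical intermediate counts, as a dict.
counts = {'#illini': 2, '#jointhefight': 1, '#illinifriday!': 1, '#illini?': 1}

# counts.items() is a dict view, not a list, so a type check would reject it.
as_view = counts.items()
print(isinstance(counts, list), isinstance(as_view, list))

# sorted() always returns a new list, here a list of (hashtag, count) tuples.
hashtag_counts = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
print(isinstance(hashtag_counts, list))
```

If the variable was built some other way, checking type(hashtag_counts) just before submitting should show what the autograder is seeing.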