Question
python
### Exercise 1 - Text Retrieval ###
One important task in information retrieval is finding the news that is most relevant to a user. The idea is to take a set of keywords, test each news source, and find the ones in which the keywords appear most frequently.
Write a program that reads a file and counts how many times each of three keywords appears in the text. Run your code on the three files nytimes.txt, bostonGlobe.txt, and washingtonPost.txt, and indicate which one has the most relevant news for the keywords election, inflation, and climate.
Note that we are not interested in counting every word in the file, just the keywords. For each file, build a dictionary keyed by keyword that records how many times each keyword shows up in the file. At the end, print all three dictionaries to see which newspaper has the best news to read.
Note that news text is usually encoded in UTF-8 (for display in a web browser) rather than in the system's default encoding, so when opening a file use:
```
fhand = open(filename, encoding="utf-8")
```
Hint: read the whole file at once and split the string to check each word in the file.
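The hint can be sketched roughly as follows. This is a minimal sketch, not the reference solution: the helper name `read_words` is hypothetical, and lowercasing each word and stripping surrounding punctuation are assumptions the exercise does not state. Whether they reproduce the expected counts depends on how the exercise defines a match.

```python
import string


def read_words(filename):
    """Read the whole file at once and return its words as a list."""
    # The exercise specifies UTF-8; "with" closes the file automatically.
    with open(filename, encoding="utf-8") as fhand:
        text = fhand.read()
    # Assumption: normalize each word so that "Election," and "election"
    # compare equal. Drop this step to match raw tokens exactly.
    return [w.strip(string.punctuation).lower() for w in text.split()]
```

Calling `text.split()` with no argument splits on any run of whitespace, including newlines, so the file does not need to be processed line by line.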
You should get the following count for each file, in order:
```
[{'inflation': 13}, {'election': 8}, {'climate': 30, 'inflation': 1}]
```
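One possible way to build the dictionaries is sketched below. It is an illustration under stated assumptions, not the expert solution: the function names `count_keywords` and `count_all` are hypothetical, and words are matched exactly as they come out of `split()` (case-sensitive, punctuation attached), which may or may not be the matching rule behind the expected counts. A keyword is added to the dictionary only when it actually occurs, which is consistent with the shape of the expected output above.

```python
KEYWORDS = ["election", "inflation", "climate"]


def count_keywords(words, keywords=KEYWORDS):
    """Return {keyword: count} for the keywords that occur in words."""
    counts = {}
    for word in words:
        if word in keywords:
            # get() supplies 0 the first time a keyword is seen.
            counts[word] = counts.get(word, 0) + 1
    return counts


def count_all(filenames, keywords=KEYWORDS):
    """Return one keyword dictionary per file, in the order given."""
    results = []
    for name in filenames:
        # Read the whole file at once, as the hint suggests.
        with open(name, encoding="utf-8") as fhand:
            words = fhand.read().split()
        results.append(count_keywords(words, keywords))
    return results
```

With the three files in the working directory, `print(count_all(["nytimes.txt", "bostonGlobe.txt", "washingtonPost.txt"]))` prints the list of dictionaries, and the file with the largest total count can be taken as the most relevant one.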