The field of information retrieval is concerned with finding relevant electronic documents based upon a query. For example, given a group of keywords (the query), a search engine retrieves Web pages (documents) and displays them sorted by relevance to the query. This technology requires a way to compare each document with the query to determine how relevant that document is to the query.
A simple way to make this comparison is to compute the binary cosine coefficient. The coefficient is a value between 0 and 1, where 1 indicates that the query is very similar to the document and 0 indicates that the query has no keywords in common with the document. This approach treats each document as a set of words. For example, given the following sample document:
“Chocolate ice cream, chocolate milk, and chocolate bars are delicious.” This document would be parsed into keywords, ignoring case and discarding punctuation, and turned into the set containing the words {chocolate, ice, cream, milk, and, bars, are, delicious}. An identical process is performed on the query to turn it into a set of strings. Once we have a query Q represented as a set of words and a document D represented as a set of words, the similarity between Q and D is computed by:
$$\mathrm{Sim} = \frac{|Q \cap D|}{\sqrt{|Q|}\,\sqrt{|D|}}$$
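As a quick check of the formula, consider a hypothetical query Q = {chocolate, milk} (not part of the original exercise) compared against the eight-word sample document set D above. This assumes the denominator is the product of the square roots of the two set sizes, which is the usual form of the binary cosine coefficient and matches the exercise's hint about sqrt. Both query words appear in D, so |Q ∩ D| = 2, |Q| = 2, and |D| = 8:

$$\mathrm{Sim} = \frac{|Q \cap D|}{\sqrt{|Q|}\,\sqrt{|D|}} = \frac{2}{\sqrt{2}\,\sqrt{8}} = \frac{2}{\sqrt{16}} = 0.5$$

A query with no words in common with D gives 0, and a query identical to D gives 1.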
Modify the StringSet from Programming Project 12 by adding an additional member function that computes the similarity between the current StringSet and an input parameter of type StringSet. The sqrt function is in the cmath library.
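The StringSet class from Programming Project 12 is not shown here, so the following is only a minimal sketch of what the added member function might look like. The class layout, the std::set storage, and the names add, size, and similarity are all assumptions made for illustration; the textbook's class may store its strings differently.

```cpp
#include <cmath>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for the StringSet of Programming Project 12.
// Storing the words in a std::set is an assumption of this sketch.
class StringSet {
public:
    // Adds a word; duplicates are ignored because std::set keeps unique keys.
    void add(const std::string& word) { words_.insert(word); }

    std::size_t size() const { return words_.size(); }

    // Binary cosine coefficient between this set and `other`:
    // |A ∩ B| / (sqrt(|A|) * sqrt(|B|)).
    // Returns 0.0 if either set is empty, to avoid dividing by zero.
    double similarity(const StringSet& other) const {
        if (words_.empty() || other.words_.empty()) {
            return 0.0;
        }
        std::size_t common = 0;
        for (const std::string& w : words_) {
            if (other.words_.count(w) > 0) {
                ++common;  // w is in both sets
            }
        }
        double denom = std::sqrt(static_cast<double>(size())) *
                       std::sqrt(static_cast<double>(other.size()));
        return static_cast<double>(common) / denom;
    }

private:
    std::set<std::string> words_;
};
```

Treating an empty set as similarity 0 is a design choice of this sketch; the exercise does not say how an empty query or empty document should be handled.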
Create two text files on your disk named Document1.txt and Document2.txt. Write some text content of your choice in each file, but make sure that each file contains different content. Next, write a program that allows the user to input from the keyboard a set of strings that represents a query. The program should then compare the query to both text files on the disk and output the similarity to each one using the binary cosine coefficient. Test your program with different queries to see if the similarity metric is working correctly.
Problem Solving with C++ (10th Edition)