Instructions

A common problem in textual analysis is to determine the frequency and location of words in a document. The information is stored in a concordance, which lists the distinct words in alphabetical order and makes references to each line on which the word is used. For instance, consider the quotation:

 
Peter Piper picked a peck of pickled peppers. A peck of pickled
peppers Peter Piper picked. If Peter Piper picked a peck of
pickled peppers, where is the peck that Peter Piper picked?

The word “piper” occurs 4 times in the text and appears on lines 1, 2, and 3. The word “pickled” occurs 3 times and appears on lines 1 and 3.

For the text above, the output of the concordance is:

Write a C++ program to create a concordance for a text file. Building the concordance requires frequently looking up a word among those already stored, updating existing entries, and inserting new words, so a binary search tree is a good candidate for the application. A word is a consecutive sequence of letters of the alphabet. Your concordance should not be case sensitive (e.g., Pickled and pickled are the same word). Your program should input the text word by word, keeping track of the current line (line 1, line 2, etc.). Extract each word and insert it into a binary search tree. Each node of the tree should have the form:

The node should also have a method that returns the key value for the data. The key of the data in this application is the word. All the words are arranged in the binary search tree in alphabetical order:

  1. If the word is encountered for the first time, a new data entry is created. The entry includes the word, a frequency count of 1, the line number in the list of line numbers, and the line number as the “last line number” seen so far. Insert the new entry into the tree.
  2. If the word is already in the tree, update the frequency and the line number list.

After reading the file, print an alphabetized list of words, the frequency count, and the ordered list of lines on which the word occurred.
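The insert-or-update rule above, together with the alphabetical printout, could be sketched roughly as follows. This is only one possible shape, not the assignment's reference solution: the member names (`frequency`, `lines`, `lastLine`), the `record` and `inorder` methods, and the `formatEntry` output layout are all assumptions made for illustration, and the sketch assumes a C++14 compiler for `std::make_unique`.

```cpp
#include <list>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical entry type; the member names are assumptions.
struct WordEntry {
    std::string word;      // lowercase word, used as the key
    int frequency = 0;     // total number of occurrences
    std::list<int> lines;  // distinct line numbers, in ascending order
    int lastLine = 0;      // most recent line number stored in `lines`

    const std::string& key() const { return word; }
};

class BST {
public:
    // Insert a new entry for `word`, or update the existing one.
    void record(const std::string& word, int line) { recordAt(root_, word, line); }

    // Call visit(entry) for every entry in alphabetical (in-order) order.
    template <typename Visit>
    void inorder(Visit visit) const { walk(root_.get(), visit); }

private:
    struct Node {
        WordEntry data;
        std::unique_ptr<Node> left, right;
    };
    std::unique_ptr<Node> root_;

    static void recordAt(std::unique_ptr<Node>& node, const std::string& word, int line) {
        if (!node) {  // first occurrence: create the entry
            node = std::make_unique<Node>();
            node->data.word = word;
            node->data.frequency = 1;
            node->data.lines.push_back(line);
            node->data.lastLine = line;
        } else if (word < node->data.key()) {
            recordAt(node->left, word, line);
        } else if (node->data.key() < word) {
            recordAt(node->right, word, line);
        } else {  // word already present: update it
            ++node->data.frequency;
            if (node->data.lastLine != line) {  // skip repeats on the same line
                node->data.lines.push_back(line);
                node->data.lastLine = line;
            }
        }
    }

    template <typename Visit>
    static void walk(const Node* node, Visit& visit) {
        if (!node) return;
        walk(node->left.get(), visit);
        visit(node->data);
        walk(node->right.get(), visit);
    }
};

// Format one entry as "word count: line line ..."; this layout is an assumption.
std::string formatEntry(const WordEntry& e) {
    std::ostringstream out;
    out << e.word << ' ' << e.frequency << ':';
    for (int n : e.lines) out << ' ' << n;
    return out.str();
}
```

Because lines are read in increasing order, comparing against `lastLine` is enough to keep the list free of duplicates without searching it. The tree is not rebalanced, so already-sorted input would degrade lookups to linear time.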

Program Requirements:

  • Define a node struct with the required data members and key function.
  • Use an STL list for the list of lines.
  • Write a binary search tree class for organizing and storing the words.
  • Develop user-defined functions in the main (client) program to support modularity.

The required files are BST.h, BST.cpp, wordEntry.h, wordEntry.cpp, main.cpp, and text1.dat.

 

text1.dat

Peter Piper picked a peck of pickled peppers. A peck of pickled
peppers Peter Piper picked. If Peter Piper picked a peck of
pickled peppers, where is the peck that Peter Piper picked?

 

 
