I urgently need help understanding how to write this program. It is to be written in C++, but I am lost as to how to start and what it is even asking. I would highly appreciate any help, thank you.

Simple Data Compression
CS 10C Programming Assignment

Huffman coding is used to compress data. The idea is straightforward: represent more common, longer strings with shorter ones via a basic translation matrix. The translation matrix is easily computed from the data itself by counting and sorting by frequency.

For example, in a well-known corpus used in Natural Language Processing called the "Brown" corpus (see nltk.org), the top-20 most frequent tokens, which are words or punctuation marks, are listed below with their frequency and code. The word "and", for example, requires writing three characters. However, if I encoded it differently, say, using the word "5" (yes, I called "5" a word on purpose), then I save having to write two extra characters! Note, the word "and" is so frequent, I save those two extra characters many times over!

Token           Frequency   Code
the             62713       1
[punctuation]   58334       2
[punctuation]   49346       3
of              36080       4
and             27932       5
to              25732       6
a               21881       7
in              19536       8
that            10237       9
is              10011       10
was             9777        11
for             8841        12
[punctuation]   8837        13
[punctuation]   8789        14
The             7258        15
with            7012        16
it              6723        17
as              6706        18
he              6566        19
his             6466        20

So the steps of Huffman coding are relatively straightforward:

1. Pass through the data once, collecting a list of token-frequency counts.
2. Sort the token-frequency counts by frequency, in descending order.
3. Assign codes to tokens using a simple counter, for example by incrementing over the integers; this is just to keep things simple.
4. Store the new mapping (token -> code) in a hashtable called "encoder".
5. Store the reverse mapping (code -> token) in a hashtable called "decoder".
6. Pass through the data a second time. This time, replace all tokens with their codes.

Now, be amazed at how much you've shrunk your data!

Delivery Notes:

(1) Implement your own hashtable from scratch; you are not allowed to use existing hashtable libraries.
(2) To be useful, your output should include the coded data as well as the decoder (code -> token) mapping file.

Now GZIP all that and watch it shrink immensely!
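The six steps above could be sketched roughly as follows. This is only an outline under several assumptions: tokens are taken to be whitespace-separated, codes start at 1 as in the table, and std::map is used purely as a short stand-in where the assignment requires your own hashtable. The names tokenize, buildTables, encode and decode are made up for this sketch and do not come from the assignment.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Stand-ins only: the assignment requires a hand-written hashtable here.
using Encoder = std::map<std::string, int>;   // token -> code
using Decoder = std::map<int, std::string>;   // code  -> token
using Count   = std::pair<std::string, long>; // (token, frequency)

// Split text into whitespace-separated tokens (an assumed tokenization).
std::vector<std::string> tokenize(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string tok;
    while (in >> tok) tokens.push_back(tok);
    return tokens;
}

// Steps 1-5: count, sort by descending frequency, assign codes 1, 2, 3, ...
void buildTables(const std::vector<std::string>& tokens,
                 Encoder& encoder, Decoder& decoder) {
    std::map<std::string, long> counts;                  // step 1: count
    for (const std::string& t : tokens) ++counts[t];

    std::vector<Count> ranked(counts.begin(), counts.end());
    std::stable_sort(ranked.begin(), ranked.end(),       // step 2: sort
        [](const Count& a, const Count& b) { return a.second > b.second; });

    int next = 1;                                        // step 3: counter
    for (const Count& entry : ranked) {
        encoder[entry.first] = next;                     // step 4: encoder
        decoder[next] = entry.first;                     // step 5: decoder
        ++next;
    }
    assert(encoder.size() == decoder.size());
}

// Step 6: replace every token with its code, separated by single spaces.
std::string encode(const std::vector<std::string>& tokens,
                   const Encoder& encoder) {
    std::string out;
    for (const std::string& t : tokens) {
        if (!out.empty()) out += ' ';
        out += std::to_string(encoder.at(t));
    }
    return out;
}

// Reverse of step 6: turn space-separated codes back into tokens.
std::string decode(const std::string& coded, const Decoder& decoder) {
    std::istringstream in(coded);
    std::string out;
    int code;
    while (in >> code) {
        if (!out.empty()) out += ' ';
        out += decoder.at(code);
    }
    return out;
}
```

Because the sort is stable and the counts come out of std::map in alphabetical order, tokens with equal frequency get codes in alphabetical order, which keeps the output reproducible. Note that this sketch does not preserve the original spacing or line breaks; a real submission would have to decide how to handle them, and would also write the decoder mapping to a file next to the coded data.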

Every hash-table stores data in the form of a (key, value) combination. Interestingly every key is…
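To make the (key, value) idea concrete, here is a minimal sketch of a hashtable written from scratch, assuming string keys and int values (which would fit the "encoder" mapping). It uses separate chaining with a fixed number of buckets and a simple polynomial string hash; the class name HashTable and the methods put, get and size are illustrative choices, not something the assignment prescribes, and whether std::vector and std::list are allowed as building blocks is something to check with the instructor.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Minimal separate-chaining hashtable: std::string keys, int values.
// Fixed bucket count, no resizing; a sketch, not a tuned implementation.
class HashTable {
    using Entry  = std::pair<std::string, int>;  // one (key, value) pair
    using Bucket = std::list<Entry>;             // chain of colliding entries

public:
    explicit HashTable(std::size_t bucketCount = 1024)
        : buckets_(bucketCount), size_(0) {
        assert(bucketCount > 0);
    }

    // Insert a new (key, value) pair, or overwrite the value of an existing key.
    void put(const std::string& key, int value) {
        Bucket& bucket = buckets_[indexOf(key)];
        for (Entry& e : bucket) {
            if (e.first == key) { e.second = value; return; }
        }
        bucket.push_back(Entry(key, value));
        ++size_;
    }

    // Look up key; returns true and writes the value to out if it is present.
    bool get(const std::string& key, int& out) const {
        const Bucket& bucket = buckets_[indexOf(key)];
        for (const Entry& e : bucket) {
            if (e.first == key) { out = e.second; return true; }
        }
        return false;
    }

    // Number of distinct keys stored.
    std::size_t size() const { return size_; }

private:
    // Polynomial hash (multiplier 31), reduced to a bucket index.
    std::size_t indexOf(const std::string& key) const {
        std::size_t h = 0;
        for (unsigned char c : key) h = h * 31 + c;
        return h % buckets_.size();
    }

    std::vector<Bucket> buckets_;
    std::size_t size_;
};
```

A call such as put("and", 5) followed by get("and", v) leaves 5 in v. The "decoder" direction (code -> token) needs int keys and string values, so the same structure would have to be repeated with the types swapped, or turned into a template. With a fixed bucket count the chains grow as more tokens are added, so a large corpus would call for a bigger table or a resize step.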
