
Viterbi algorithm
You will develop a first-order HMM (Hidden Markov Model) for POS (part of speech)
tagging in Python. This involves:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing,
• finding the best sequence of parts of speech for a list of words in the test corpus,
according to an HMM model with smoothed probabilities,
• computing the accuracy, that is, the percentage of parts of speech that are guessed
correctly.
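The two counting steps can be sketched with plain dictionaries. (The corpus format below is a made-up list of (word, tag) sentences for illustration, not the assignment's actual data loader; adapt the loop to however you read the treebank.)

```python
from collections import defaultdict

def count_transitions_emissions(sentences):
    """Count tag-to-tag transitions and word emissions per tag.

    `sentences` is a list of sentences, each a list of (word, tag) pairs.
    The special tag "<s>" marks the start of each sentence.
    """
    transitions = defaultdict(int)  # (previous tag, tag) -> count
    emissions = defaultdict(int)    # (tag, word) -> count
    for sentence in sentences:
        prev = "<s>"
        for word, tag in sentence:
            transitions[(prev, tag)] += 1
            emissions[(tag, word)] += 1
            prev = tag
    return transitions, emissions

corpus = [[("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
          [("the", "DET"), ("cat", "NOUN")]]
trans, emit = count_transitions_emissions(corpus)
# trans[("DET", "NOUN")] → 2, emit[("NOUN", "dog")] → 1
```

The same counts feed directly into the relative frequency estimates of the next step.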
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a
form of smoothing yourself, for this assignment you can use the implementation of
Witten-Bell smoothing in NLTK (among the forms of smoothing in NLTK, this seems
to be the most robust one). An example of use for emission probabilities is in file
smoothing.py; one can similarly apply smoothing to transition probabilities.
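If you do not have smoothing.py at hand: the NLTK class in question is `nltk.probability.WittenBellProbDist`. A minimal sketch for the emission distribution of a single tag might look as follows (the word list and the `bins` value are invented for illustration):

```python
from nltk.probability import FreqDist, WittenBellProbDist

# Toy emission counts for the NOUN tag (invented for illustration).
noun_words = ["dog", "dog", "cat", "house"]
fd = FreqDist(noun_words)

# `bins` should be (an estimate of) the vocabulary size, so that
# unseen words still receive a small non-zero probability.
smoothed = WittenBellProbDist(fd, bins=1000)

print(smoothed.prob("dog"))      # seen word: relatively high probability
print(smoothed.prob("unicorn"))  # unseen word: small but non-zero
```

You would build one such distribution per tag (and, analogously, per conditioning tag for the transitions).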
Run your application on the English (EWT) training and testing corpora. You
should get an accuracy above 89%. If your accuracy is much lower, then you are
probably doing something wrong.
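As a reminder of the decoding step, a bare-bones Viterbi over log probabilities can be sketched as below. The `trans_prob` and `emit_prob` functions are placeholders for your smoothed estimates, and the toy tables at the bottom are invented numbers just to exercise the function:

```python
import math

def viterbi(words, tags, trans_prob, emit_prob):
    """Most probable tag sequence for `words` under a first-order HMM.

    `trans_prob(prev, tag)` and `emit_prob(tag, word)` are assumed to
    return (smoothed) probabilities; "<s>" marks the sentence start.
    """
    # chart[i][t] = (log prob of the best path ending in tag t at word i,
    #                backpointer to the previous tag on that path)
    chart = [{t: (math.log(trans_prob("<s>", t) * emit_prob(t, words[0])), None)
              for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            score, prev = max(
                (chart[i - 1][p][0]
                 + math.log(trans_prob(p, t) * emit_prob(t, words[i])), p)
                for p in tags)
            column[t] = (score, prev)
        chart.append(column)
    # Trace back from the best tag at the last position.
    best = max(tags, key=lambda t: chart[-1][t][0])
    path = [best]
    for i in range(len(words) - 1, 0, -1):
        path.append(chart[i][path[-1]][1])
    return list(reversed(path))

# Toy probability tables (invented numbers):
tags = ["DET", "NOUN"]
t_tab = {("<s>", "DET"): 0.9, ("<s>", "NOUN"): 0.1,
         ("DET", "DET"): 0.1, ("DET", "NOUN"): 0.9,
         ("NOUN", "DET"): 0.5, ("NOUN", "NOUN"): 0.5}
e_tab = {("DET", "the"): 0.9, ("DET", "dog"): 0.1,
         ("NOUN", "the"): 0.1, ("NOUN", "dog"): 0.9}
result = viterbi(["the", "dog"], tags,
                 lambda p, t: t_tab[(p, t)], lambda t, w: e_tab[(t, w)])
# → ["DET", "NOUN"]
```

Working in log space avoids numerical underflow on long sentences; with smoothed probabilities every argument to `math.log` is strictly positive.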
Comparisons between languages
Investigate, by visual inspection and by computational means, the upos parts of speech
in different treebanks from Universal Dependencies. (Take a few languages based on
your own interests, but no more than about 10. Go for the quality of your submission,
not quantity!) Two examples of specific questions you could address:
• Which of the chosen languages have a rich morphology and which have a poor
morphology?
• How similar are the chosen languages, in terms of bigram models of their parts
of speech?
For the first question, note that you can access the lemma of a token by
token['lemma']. What can you say about the relation between forms and lemmas
in the case of languages with rich morphology?
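One simple quantitative handle on this question is the ratio of distinct forms to distinct lemmas. The sketch below assumes each token is a dict with 'form' and 'lemma' keys (as in the conllu package's token representation); the toy sentence is invented:

```python
def form_lemma_ratio(sentences):
    """Ratio of distinct word forms to distinct lemmas.

    `sentences` is an iterable of token lists, where each token is a
    dict with 'form' and 'lemma' keys. A higher ratio suggests richer
    inflectional morphology: many surface forms per lemma.
    """
    forms, lemmas = set(), set()
    for sentence in sentences:
        for token in sentence:
            forms.add(token["form"].lower())
            lemmas.add(token["lemma"].lower())
    return len(forms) / len(lemmas)

toy = [[{"form": "dogs", "lemma": "dog"},
        {"form": "dog", "lemma": "dog"}]]
# two forms share one lemma → ratio 2.0
```

Comparing this ratio across your chosen treebanks gives a first, rough ranking by morphological richness.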
For the second question, consider that the transition probabilities of two related
languages may be very similar, even though the emission probabilities may be
incomparable due to the mostly disjoint vocabularies. How could we measure the similarity
between two bigram models trained from corpora?
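One possibility, among several, is to view each bigram model as a family of conditional distributions over the shared upos tag set and average a divergence between corresponding rows. The sketch below uses the Jensen-Shannon divergence; the function names are my own, not part of the assignment:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two distributions given as
    dicts mapping outcomes to probabilities (missing keys count as 0)."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    def kl(a, b):
        return sum(a[k] * math.log(a[k] / b[k])
                   for k in keys if a.get(k, 0.0) > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def bigram_model_distance(trans_a, trans_b, tags):
    """Average JS divergence between the rows P(. | t) of two
    transition models, one row per conditioning tag; lower = more similar."""
    return sum(js_divergence(trans_a.get(t, {}), trans_b.get(t, {}))
               for t in tags) / len(tags)

# Identical distributions are at distance 0; disjoint ones at log 2:
same = js_divergence({"NOUN": 1.0}, {"NOUN": 1.0})   # → 0.0
far = js_divergence({"NOUN": 1.0}, {"VERB": 1.0})    # → log 2 ≈ 0.693
```

Because the upos tag set is shared across languages, this comparison is meaningful even when the vocabularies are disjoint, which is exactly why the transition (rather than emission) probabilities are the right object to compare.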
Feel free to think of further questions to address. It is worth noting that next to the
('universal') upos tags, the Universal Dependencies treebanks sometimes also contain
language-specific (xpos) tags.