Consider a text corpus consisting of N tokens drawn from d distinct words, where the number of times each distinct word w appears is given by x_w. We want to apply a version of Laplace smoothing that estimates a word's probability as (x_w + a) / (N + ad) for some constant a (Laplace recommended a = 1, but other values are possible). In the following problems, assume N = 100,000, d = 10,000, and a = 2.
A. Give both the unsmoothed maximum likelihood probability estimate and the Laplace smoothed estimate of a word that appears 1,000 times in the corpus.
B. Do the same for a word that does not appear at all.
C. You are running a Naive Bayes text classifier with Laplace smoothing, and you suspect that you are overfitting the data. Would you increase or decrease the parameter a?
D. Could increasing a increase or decrease training-set error? Could it increase or decrease validation-set error?
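As a quick numeric check of parts A and B, the formula above can be evaluated directly with the given constants (function names here are illustrative, not from the original problem):

```python
# Corpus constants from the problem statement.
N = 100_000  # total tokens
d = 10_000   # distinct words
a = 2        # smoothing constant

def mle(x_w):
    """Unsmoothed maximum likelihood estimate: x_w / N."""
    return x_w / N

def laplace(x_w):
    """Laplace-smoothed estimate: (x_w + a) / (N + a*d)."""
    return (x_w + a) / (N + a * d)

# Part A: a word that appears 1,000 times.
print(mle(1000))      # 0.01
print(laplace(1000))  # 1002 / 120000 = 0.00835

# Part B: a word that does not appear at all.
print(mle(0))         # 0.0
print(laplace(0))     # 2 / 120000, about 1.67e-5
```

Note how smoothing pulls the frequent word's estimate down (0.01 to 0.00835) while giving the unseen word a small nonzero probability instead of zero.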
I actually forgot to type "xw" after "given by" in my question. Consider a text corpus consisting of N tokens of d distinct words and the number of times each distinct word w appears is given by xw.