
The data file includes the text of three books of the Bible (Joshua, Jonah and Philippians) using the ESV translation.  While these are all great books, our only interest for this project is how often each letter is used. 

1) In the Word file containing the Biblical text, use the “Find” feature to identify how many times each letter occurs (i.e. the letter’s frequency).  Create an Excel spreadsheet to display the number of occurrences of each letter in the English alphabet. 

Using the Find feature, I got the following count for each letter of the alphabet:

Letter  Count
A       1810
B       323
C       442
D       1097
E       2845
F       609
G       416
H       1689
I       1381
J       134
K       143
L       935
M       586
N       1503
O       2237
P       379
Q       5
R       1362
S       1407
T       2235
U       703
V       257
W       513
X       17
Z       6

2) In the Excel spreadsheet, sum your frequencies to compute the total number of letters in the 3 books (this is n). 

a) In your spreadsheet, use the formula $\hat{p} = x/n$ (a letter’s count $x$ divided by the total $n$) to compute the sample proportion of each letter’s appearances relative to the total number of letters (i.e. find the relative frequency of each letter). Use the Excel sorting function to sort the letters in order of their frequencies. 

b) Use the simple Confidence Interval (CI) formula $\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$ to find a 95% CI for the proportion of how often each letter is used in English text in general. Enter the lower bound in the first Excel column (using $\hat{p} - 1.96\sqrt{\hat{p}(1-\hat{p})/n}$) and the upper bound in the next column (using $\hat{p} + 1.96\sqrt{\hat{p}(1-\hat{p})/n}$). 
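
For anyone who wants to cross-check the spreadsheet for question 2, here is a minimal Python sketch. It assumes the "simple" CI is the usual normal-approximation (Wald) interval with z = 1.96, and it uses only the counts listed above (no count was given for Y, so Y is left out). The names `counts`, `wald_ci` and `table` are made up for this illustration.

```python
import math

# Letter counts as listed in the question (Y is not listed, so it is omitted).
counts = {
    "A": 1810, "B": 323, "C": 442, "D": 1097, "E": 2845, "F": 609,
    "G": 416, "H": 1689, "I": 1381, "J": 134, "K": 143, "L": 935,
    "M": 586, "N": 1503, "O": 2237, "P": 379, "Q": 5, "R": 1362,
    "S": 1407, "T": 2235, "U": 703, "V": 257, "W": 513, "X": 17, "Z": 6,
}

Z_95 = 1.96  # critical value assumed for a 95% interval

# Question 2: n is the total number of letters counted.
n = sum(counts.values())


def wald_ci(x, n, z=Z_95):
    """Return (p_hat, lower, upper) for a count x out of n letters."""
    p_hat = x / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Question 2a/2b: one row per letter, sorted by relative frequency (high to low).
table = sorted(
    ((letter, *wald_ci(x, n)) for letter, x in counts.items()),
    key=lambda row: row[1],
    reverse=True,
)

print(f"n = {n}")
for letter, p_hat, lower, upper in table:
    print(f"{letter}  p={p_hat:.4f}  CI=[{lower:.4f}, {upper:.4f}]")
```

The printed rows correspond to the letter, proportion, lower-bound and upper-bound columns in Excel, so the two can be compared line by line.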

3) Identify those letters whose CIs do not overlap with the CI of any other letter. (For example, the CI [0.042, 0.052] overlaps with [0.050, 0.060] because the upper bound of the first CI is greater than the lower bound of the second CI.) List the letters with non-overlapping CIs and specify how many such letters there are. 
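
One possible way to do the overlap check in question 3 outside Excel is sketched below. It again assumes the Wald interval with z = 1.96 and the counts as listed (Y missing), and it uses the overlap rule from the example: two intervals overlap when each one's lower bound is below the other's upper bound. The names `intervals`, `overlaps` and `isolated` are illustrative only.

```python
import math

# Letter counts as listed in the question (Y is not listed, so it is omitted).
counts = {
    "A": 1810, "B": 323, "C": 442, "D": 1097, "E": 2845, "F": 609,
    "G": 416, "H": 1689, "I": 1381, "J": 134, "K": 143, "L": 935,
    "M": 586, "N": 1503, "O": 2237, "P": 379, "Q": 5, "R": 1362,
    "S": 1407, "T": 2235, "U": 703, "V": 257, "W": 513, "X": 17, "Z": 6,
}
n = sum(counts.values())

# Build {letter: (lower, upper)} using the assumed 95% Wald interval.
intervals = {}
for letter, x in counts.items():
    p_hat = x / n
    margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    intervals[letter] = (p_hat - margin, p_hat + margin)


def overlaps(a, b):
    """True when intervals a and b share at least one point in their interiors."""
    return a[0] < b[1] and b[0] < a[1]


# A letter is "isolated" when its CI overlaps no other letter's CI.
isolated = sorted(
    letter
    for letter, ci in intervals.items()
    if not any(
        overlaps(ci, other_ci)
        for other, other_ci in intervals.items()
        if other != letter
    )
)

print(f"{len(isolated)} letters with non-overlapping CIs: {isolated}")
```

Some neighbouring intervals differ only in the fifth decimal place, so bounds rounded to three decimals in the spreadsheet may appear to touch where the unrounded values decide the comparison.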

4) The previous analysis could be useful if our goal was to decipher an encrypted message, where each letter is scrambled (for example, each “a” might become a “g”, while each “b” might become an “o” and so forth). 

a) Assume that the letter “z” in the encrypted message has a relative frequency of 0.06 (it accounts for 6% of the total number of letters). Which letters’ confidence intervals from question 2 include 0.06, making them the most likely candidates for the letter that was encrypted as “z”?

b) Further assume that “y” in the encrypted message has a relative frequency of 0.04 (4%). Which letters’ CIs from question 2 include 0.04?

c) If “x” in the encrypted message has a relative frequency of 0.02 (2%), which letters’ CIs from question 2 include 0.02?
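
The lookup in question 4 amounts to asking, for each observed frequency, which letters have a CI containing it. A rough Python sketch follows, under the same assumptions as before (Wald interval, z = 1.96, counts as listed with Y missing); `candidates` is a hypothetical helper name.

```python
import math

# Letter counts as listed in the question (Y is not listed, so it is omitted).
counts = {
    "A": 1810, "B": 323, "C": 442, "D": 1097, "E": 2845, "F": 609,
    "G": 416, "H": 1689, "I": 1381, "J": 134, "K": 143, "L": 935,
    "M": 586, "N": 1503, "O": 2237, "P": 379, "Q": 5, "R": 1362,
    "S": 1407, "T": 2235, "U": 703, "V": 257, "W": 513, "X": 17, "Z": 6,
}
n = sum(counts.values())


def candidates(freq, z=1.96):
    """Letters whose assumed 95% Wald CI contains the observed frequency."""
    hits = []
    for letter, x in counts.items():
        p_hat = x / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= freq <= p_hat + margin:
            hits.append(letter)
    return sorted(hits)


# Encrypted letter -> relative frequency given in question 4.
for cipher_letter, freq in [("z", 0.06), ("y", 0.04), ("x", 0.02)]:
    print(f"encrypted '{cipher_letter}' ({freq:.2f}): {candidates(freq)}")
```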

5) a) How many possible ways are there to assign actual letters of the text to the encrypted letters in a message? (Hint: “A” could be assigned to any one of the 26 letters, including itself.  Once “A” has been assigned, “B” can be assigned to any letter except the letter that corresponds to “A”). 

b) As your answer to part (a) makes clear, there is a super-high number of possible ways all the letters could be assigned. Knowing something about each letter’s relative frequency dramatically reduces the number of likely combinations. For example, if there were only 3 possible options for each encrypted letter in the message, then how many possible ways would there be to assign real letters to the letters in the encrypted message?
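
The two counts in question 5 can be checked with a few lines of Python. This sketch follows the hint in part (a), which describes a permutation of the 26 letters (26 choices, then 25, then 24, and so on). For part (b) it treats the 3 options per letter as independent choices, which is an assumption about what the question intends; it ignores the requirement that no two encrypted letters map to the same real letter, so it is an upper bound.

```python
import math

ALPHABET_SIZE = 26

# Part (a): 26 choices for A, 25 for B, ... 1 for Z, i.e. 26 factorial.
full_keyspace = math.factorial(ALPHABET_SIZE)

# Part (b): 3 independent options for each of the 26 letters, i.e. 3 to the 26th.
reduced_keyspace = 3 ** ALPHABET_SIZE

print(f"26!  = {full_keyspace:,}")
print(f"3^26 = {reduced_keyspace:,}")
print(f"ratio = {full_keyspace / reduced_keyspace:.3e}")
```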

 

(I just need help with parts 3 - 5b)
