
Question
1c) Average sentence length
We will create a function ( avg_sentence_len ) to calculate the average sentence length across a piece of text.
This function should take text as an input parameter.
Within this function:
1. sentences : Use the split() string method to split the input text at every '.'. This will split the text into a list of sentences. Store this in the variable sentences . To keep things simple, we will consider every "." as a sentence separator. (This decision could lead to misleading answers. For example, "Hello Dr. Jacob." is actually a single sentence, but our function will consider it 2 separate sentences.)
2. words : Use the split() method to split the input text into a list of separate words, storing this in words . Again, to limit complexity, we will assume that all words are separated by a single space (" "). (So, while "I am going.to see you later" actually has 7 words, there is no space after the ".", so our function will treat it as containing 6 separate words.)
3. Calculate the average sentence length, returning this from your function:
   - if the last value in sentences is an empty string: the average sentence length should be the number of words divided by len(sentences) - 1.
   - otherwise, the average sentence length should be the number of words divided by the number of sentences.
For the "I am going.to see you later" example, your function should return 3.0 (6 words divided by 2 sentences).
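To make the two split() calls concrete, here is a minimal sketch of what they produce for the examples in the description. The variable names `text2` and `n_sentences` are illustrative and not part of the assignment.

```python
# How str.split() treats the two examples from the description.
text = "Hello Dr. Jacob."
sentences = text.split(".")   # split at every "."
words = text.split(" ")       # split at every single space

print(sentences)  # ['Hello Dr', ' Jacob', '']
print(words)      # ['Hello', 'Dr.', 'Jacob.']

# A trailing "." leaves an empty string at the end of the list,
# which is why the description divides by len(sentences) - 1 in that case.
n_sentences = len(sentences) - 1 if sentences[-1] == "" else len(sentences)
print(len(words) / n_sentences)  # 1.5

text2 = "I am going.to see you later"
print(text2.split("."))       # ['I am going', 'to see you later']
print(len(text2.split(" ")))  # 6, because "going.to" counts as one word
```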
In [16]:
def avg_sentence_len(s):
    sentences=s.split(".")
    words=s.split(" ")
    k=len(words)
    m=len(sentences)
    if(sentences[len(sentences)-1]==''):
        m=m-1
    ans=k/m
    return ans
In [17]:
assert avg_sentence_len("each sentence. has two. words right?.") == 2
assert avg_sentence_len("a. a. a") == avg_sentence_len("a. a. a.")
assert avg_sentence_len("a. a. a") == 1
assert avg_sentence_len("hello Dr. Jacob.") == 1.5
assert avg_sentence_len(news_df['message'][0])//1 == 16
assert avg_sentence_len("one. two.") != avg_sentence_len("one.two.")

AssertionError                            Traceback (most recent call last)
<ipython-input-17-5eab60349a65> in <module>
      1 assert avg_sentence_len("each sentence. has two. words right?.") == 2
----> 2 assert avg_sentence_len("a. a. a") == avg_sentence_len("a. a. a.")
      3 assert avg_sentence_len("a. a. a") == 1
      4 assert avg_sentence_len("hello Dr. Jacob.") == 1.5
      5 assert avg_sentence_len(news_df['message'][0])//1 == 16

AssertionError:
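One way to narrow down which check is failing is to evaluate the checks one at a time and print each result, instead of relying only on the arrow in the traceback. The sketch below restates the function so that it runs on its own and covers only the string-literal checks; the news_df check is left out because that data is not shown here. The `checks` list and the "ok"/"MISMATCH" labels are illustrative.

```python
def avg_sentence_len(text):
    # Same logic as the function above, restated so this block is self-contained.
    sentences = text.split(".")
    words = text.split(" ")
    n_sentences = len(sentences)
    if sentences[-1] == "":
        n_sentences -= 1
    return len(words) / n_sentences

# Each string-only check from the assert cell, paired with its expected value.
checks = [
    ("each sentence. has two. words right?.", 2),
    ("a. a. a", 1),
    ("a. a. a.", 1),
    ("hello Dr. Jacob.", 1.5),
]
for text, expected in checks:
    result = avg_sentence_len(text)
    print(repr(text), result, "ok" if result == expected else "MISMATCH")
```

With the function as written here, all four checks print ok. If the assert cell still fails, it may be worth re-running the cell that defines the function (the kernel could be holding an older definition) and then inspecting the news_df check separately.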
1d) Calculate Average Sentence Length
Apply the above method to the 'message' column of news_df , creating a new column in news_df called 'ASL'. This column should store the average sentence length of each message.
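The step described above can be sketched with `Series.apply`, which calls a function once per value in a column. This is a minimal sketch, not the assignment's reference solution: the three-row `news_df` below is a hypothetical stand-in because the real data is not shown, and the function is restated so the block runs on its own.

```python
import pandas as pd

def avg_sentence_len(text):
    # Words per sentence, ignoring the empty string left by a trailing ".".
    sentences = text.split(".")
    words = text.split(" ")
    n_sentences = len(sentences)
    if sentences[-1] == "":
        n_sentences -= 1
    return len(words) / n_sentences

# Hypothetical stand-in for news_df; the real messages are not shown here.
news_df = pd.DataFrame({"message": ["a. a. a.", "hello Dr. Jacob.", "one. two."]})

# apply() returns one result per row, so the new column always matches the index length.
news_df["ASL"] = news_df["message"].apply(avg_sentence_len)
print(news_df)
```

Because the result has exactly one value per row, this form avoids building and assigning a separate list by hand.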
In [13]:
d = {'col1': [1,2,3,4,5], 'message': ["a. a. a.","a. a. a", "hello, dr. Jacobs.","one. two. ", "each sentence. has two. words r
newdf=pd.DataFrame(d)
#print(newdf)
r=[]
for i in range(len(newdf)):
    print(newdf['message'][i])
    r.append(avg_sentence_len(newdf['message'][i]))
    newdf['ASL']=r
    print(newdf)
print(avg_sentence_len("a. a. a"))
a. a. a.

ValueError                                Traceback (most recent call last)
<ipython-input-13-c584200b7c4b> in <module>
      6     print(newdf['message'][i])
      7     r.append(avg_sentence_len(newdf['message'][i]))
----> 8     newdf['ASL']=r
      9     print(newdf)
     10 print(avg_sentence_len("a. a. a"))

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
   3485         else:
   3486             # set column
-> 3487             self._set_item(key, value)
   3488
   3489     def _setitem_slice(self, key, value):

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in _set_item(self, key, value)
   3562
   3563         self._ensure_valid_index(value)
-> 3564         value = self._sanitize_column(key, value)
   3565         NDFrame._set_item(self, key, value)
   3566

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in _sanitize_column(self, key, value, broadcast)
   3747
   3748             # turn me into an ndarray
-> 3749             value = sanitize_index(value, self.index, copy=False)
   3750             if not isinstance(value, (np.ndarray, Index)):
   3751                 if isinstance(value, list) and len(value) > 0:
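The traceback points at the line that assigns r to newdf['ASL'], and only one message was printed before the error. That is consistent with the assignment running on the first pass through the loop, when r holds a single value while newdf has more rows; pandas raises a ValueError when an assigned list does not have one value per row. A minimal sketch of the same loop with the assignment moved after it follows. The three messages are shortened stand-ins, since the original list is cut off, and the `bad` column exists only to reproduce the error.

```python
import pandas as pd

def avg_sentence_len(text):
    # Words per sentence, ignoring the empty string left by a trailing ".".
    sentences = text.split(".")
    words = text.split(" ")
    n_sentences = len(sentences)
    if sentences[-1] == "":
        n_sentences -= 1
    return len(words) / n_sentences

# Stand-in data; the original message list is truncated in the question.
newdf = pd.DataFrame({"message": ["a. a. a.", "a. a. a", "hello Dr. Jacob."]})

# Reproduce the error: a one-element list cannot fill a three-row column.
try:
    newdf["bad"] = [1.0]
except ValueError as err:
    print("ValueError:", err)

# Collect one value per row first ...
r = []
for i in range(len(newdf)):
    r.append(avg_sentence_len(newdf["message"][i]))

# ... then assign once, after the loop, when len(r) == len(newdf).
newdf["ASL"] = r
print(newdf)
```

The only change to the loop is the indentation of the assignment: it moves out of the loop body so that it runs once, after every row has been processed.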