Background
In Lectures 5 and 6, we learned the Python programming skills needed to process biological
sequences and search for patterns. In this assignment, you are required to write Python programs
to practice processing biological sequences and pattern searching.
Question 1 (3%)
Write a function that checks whether an input sequence is a valid protein sequence. The function
template is given below. The function should return True if the input sequence is a
valid protein sequence, and False otherwise. The following is the list of valid amino acid
symbols:
"A", "C", "D", "E", "F", "G", "H", "I", "L", "M", "N", "P", "Q", "R", "S",
"T", "V", "W", "Y"
    def validate_protein(protein_seq):
        """Checks if protein sequence is valid. Returns True if sequence is
        valid, or False otherwise."""
        # To be completed...
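One possible way to fill in the template is sketched below. It uses the symbol list exactly as given in the question; note that this list has 19 symbols and does not include "K" (lysine), so the alphabet is kept as a parameter in case the intended list differs. The `alphabet` parameter, the case-insensitive comparison, and the treatment of an empty string as invalid are assumptions, not part of the assignment.

```python
# Symbols exactly as listed in Question 1 (19 symbols; "K" is not in the list).
VALID_AMINO_ACIDS = frozenset("ACDEFGHILMNPQRSTVWY")


def validate_protein(protein_seq, alphabet=VALID_AMINO_ACIDS):
    """Checks if protein sequence is valid. Returns True if sequence is
    valid, or False otherwise."""
    # Assumption: lowercase input is accepted by normalising to uppercase.
    seq = protein_seq.upper()
    # Assumption: an empty sequence is treated as invalid.
    if not seq:
        return False
    # Valid only if every symbol belongs to the allowed alphabet.
    return all(symbol in alphabet for symbol in seq)
```

For instance, `validate_protein("MACDE")` returns True, while `validate_protein("MAXDE")` returns False because "X" is not in the list.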
Question 2 (5%)
Write a function that, given a sequence as an argument, detects whether there are repeated
sub-sequences of size k (the second argument of the function). The result should be a dictionary
where keys are sub-sequences and values are the number of times they occur (at least 2). The
function template is given below. (Hint: you can make use of the function
"search_all_occurrences" shown in Lecture 6.)
    def number_of_repeated_subseq(seq, k):
        """Return a dictionary where keys are sub-sequences of size k and
        values are number of times they occur (at least 2)"""
        dic = {}
        # To be completed...
        return dic
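A minimal sketch of one approach follows. The Lecture 6 helper is not reproduced in this assignment, so the `search_all_occurrences` below is a hypothetical stand-in that assumes the helper returns the start index of every occurrence of a pattern, including overlapping ones. Counting overlapping occurrences and returning an empty dictionary for k <= 0 are also assumptions.

```python
def search_all_occurrences(seq, pattern):
    """Hypothetical stand-in for the Lecture 6 helper (assumed behaviour):
    return the start index of every, possibly overlapping, occurrence."""
    size = len(pattern)
    return [i for i in range(len(seq) - size + 1) if seq[i:i + size] == pattern]


def number_of_repeated_subseq(seq, k):
    """Return a dictionary where keys are sub-sequences of size k and
    values are number of times they occur (at least 2)"""
    dic = {}
    if k <= 0:
        # Assumption: a non-positive size has no meaningful sub-sequences.
        return dic
    # Slide a window of size k over the sequence.
    for start in range(len(seq) - k + 1):
        sub = seq[start:start + k]
        if sub in dic:
            continue  # already counted this sub-sequence
        count = len(search_all_occurrences(seq, sub))
        if count >= 2:
            dic[sub] = count  # keep only sub-sequences that repeat
    return dic
```

With these assumptions, `number_of_repeated_subseq("ATATAT", 2)` gives `{'AT': 3, 'TA': 2}`, and a sequence with no repeats gives an empty dictionary.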
Question 3 (7%)
Write a function that performs a text search in a text file. The function takes the filename and
a list of strings as inputs. The list of strings contains the patterns to be searched for within the
text file. A "*" character within a string is a wildcard character, which can stand for unknown
characters of any length greater than zero. The function returns a dictionary where keys are the
patterns in the list of strings being searched, and values are the number of times they occur.
Your function should have the following template:
    def text_search(filename, patterns):
        """It searches the file filename and returns a dictionary of search
        result, showing patterns with number of occurrences"""
        dic = {}
        # To be completed...
        return dic
For example, suppose a file with filename "File 1.txt" contains the following text:
    Mary is a girl.
    They are girls.
    Fish has gills.
By calling the following lines of code,
    strings = ["is", "gi*l"]
    print(text_search("File 1.txt", strings))
the following output is obtained:
    {'is': 2, 'girl': 2, 'gill': 1}
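The behaviour in the example can be reproduced with regular expressions, as in the sketch below. Several choices here are assumptions inferred from the sample output rather than stated in the question: the keys are the matched strings (so "gi*l" yields 'girl' and 'gill'), the wildcard takes the shortest possible match, matches do not cross line boundaries, and patterns with no match are left out of the result.

```python
import re
from collections import Counter


def text_search(filename, patterns):
    """It searches the file filename and returns a dictionary of search
    result, showing patterns with number of occurrences"""
    counts = Counter()
    with open(filename, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
    for pattern in patterns:
        # Escape the literal pieces; each "*" becomes ".+?", a non-greedy
        # run of one or more characters (assumed wildcard semantics).
        literal_parts = (re.escape(part) for part in pattern.split("*"))
        regex = re.compile(".+?".join(literal_parts))
        for line in lines:
            # finditer yields non-overlapping matches within the line.
            for match in regex.finditer(line):
                counts[match.group()] += 1
    return dict(counts)
```

On the three-line file from the example, `text_search("File 1.txt", ["is", "gi*l"])` returns `{'is': 2, 'girl': 2, 'gill': 1}`: "is" is found in "is" and "Fish", and the wildcard matches "girl" twice and "gill" once.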