Question 1 (3%) Write a function that checks whether an input sequence is a valid protein sequence. The function template is given below. The function should return True if the input sequence is a valid protein sequence, and False otherwise. The following is the list of valid amino acid symbols: "A", "C", "D", "E", "F", "G", "H", "I", "L", "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y" def validate_protein(protein_seq): """Checks if protein sequence is valid. Returns True if sequence is valid, or False otherwise.""" # To be completed...

Question

I need help with this Python assignment.

Background
In Lectures 5 and 6, we learned the Python programming skills needed to process biological
sequences and search for patterns. In this assignment, you are required to write Python programs
to practice processing biological sequences and pattern searching.
Question 1 (3%)
Write a function that checks whether an input sequence is a valid protein sequence. The function
template is given below. The function should return True if the input sequence is a
valid protein sequence, and False otherwise. The following is the list of valid amino acid
symbols:
"A", "C", "D", "E", "F", "G", "H", "I", "L", "M", "N", "P", "Q", "R", "S",
"T", "V", "W", "Y"
def validate_protein(protein_seq):
    """Checks if protein sequence is valid. Returns True if sequence is
    valid, or False otherwise."""
    # To be completed...
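One possible way to complete the template is sketched below. It follows the symbol list exactly as given in the assignment, which has 19 symbols and does not include "K" (lysine), so a sequence containing "K" is rejected here. Two choices are assumptions and not part of the assignment: lowercase input is converted to uppercase before checking, and an empty sequence is treated as invalid. The constant name VALID_AMINO_ACIDS is hypothetical.

```python
# Symbols copied from the assignment's list (note: "K" is not in that list).
VALID_AMINO_ACIDS = set("ACDEFGHILMNPQRSTVWY")


def validate_protein(protein_seq):
    """Checks if protein sequence is valid. Returns True if sequence is
    valid, or False otherwise."""
    # Assumption: the check is case-insensitive.
    seq = protein_seq.upper()
    # Assumption: an empty sequence is not a valid protein sequence.
    if not seq:
        return False
    # Valid only if every symbol is in the allowed set.
    return all(symbol in VALID_AMINO_ACIDS for symbol in seq)
```

If the course expects a case-sensitive check or treats the empty string as valid, the two marked lines are the only ones that need to change.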
Question 2 (5%)
Write a function that, given a sequence as its first argument, detects whether there are repeated
sub-sequences of size k (the second argument of the function). The result should be a dictionary
where keys are sub-sequences and values are the number of times they occur (at least 2). The
function template is given below. (Hint: you can make use of the function
"search_all_occurrences" shown in Lecture 6.)
def number_of_repeated_subseq(seq, k):
    """Return a dictionary where keys are sub-sequences of size k and
    values are number of times they occur (at least 2)"""
    dic = {}
    # To be completed...
    return dic
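A minimal sketch of one approach is shown below. It slides a window of size k over the sequence and counts every sub-sequence, then keeps only those seen at least twice. It does not use "search_all_occurrences" from Lecture 6, since that function is not shown here. Counting overlapping occurrences and returning an empty dictionary for k less than 1 are assumptions; the helper name counts is hypothetical.

```python
def number_of_repeated_subseq(seq, k):
    """Return a dictionary where keys are sub-sequences of size k and
    values are number of times they occur (at least 2)"""
    dic = {}
    # Assumption: a non-positive k has no meaningful sub-sequences.
    if k < 1:
        return dic
    counts = {}
    # Slide a window of size k; overlapping occurrences are counted.
    for start in range(len(seq) - k + 1):
        sub = seq[start:start + k]
        counts[sub] = counts.get(sub, 0) + 1
    # Keep only sub-sequences that occur at least twice.
    for sub, count in counts.items():
        if count >= 2:
            dic[sub] = count
    return dic
```

With this counting rule, number_of_repeated_subseq("ATATAT", 2) gives {'AT': 3, 'TA': 2}, because the windows are AT, TA, AT, TA, AT.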
Question 3 (7%)
Write a function that performs a text search in a text file. The function takes the filename and
a list of strings as inputs. The strings in the list are the patterns to be searched for within the
text file. A "*" character within a string is a wildcard character, which stands for a run of
unknown characters of any length greater than zero. The function returns a dictionary where keys
are the patterns in the list of strings being searched, and values are the number of times they
occur. Your function should have the following template:
def text_search(filename, patterns):
    """It searches the file filename and returns a dictionary of search
    result, showing patterns with number of occurrences"""
    dic = {}
    # To be completed...
    return dic
For example, suppose a file named "File 1.txt" contains the following text:
Mary is a girl.
They are girls.
Fish has gills.
By calling the following lines of code,
strings = ["is", "gi*l"]
print(text_search("File 1.txt", strings))
The following output is obtained:
{'is': 2, 'girl': 2, 'gill': 1}
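The sketch below is one way the template could be completed so that it is consistent with the example output. Note that in the example the wildcard pattern is reported under the text it matched ('girl', 'gill') and not under the pattern itself, so the sketch follows the example; this reading is an assumption. Two further assumptions: the wildcard does not match whitespace and matches as few characters as possible, and a plain pattern that never occurs is reported with a count of 0.

```python
import re


def text_search(filename, patterns):
    """It searches the file filename and returns a dictionary of search
    result, showing patterns with number of occurrences"""
    dic = {}
    with open(filename) as handle:
        text = handle.read()
    for pattern in patterns:
        if "*" in pattern:
            # Escape the literal parts; each "*" becomes one or more
            # non-whitespace characters (assumption), matched lazily.
            parts = [re.escape(part) for part in pattern.split("*")]
            regex = r"\S+?".join(parts)
            # Assumption: wildcard hits are keyed by the matched text,
            # as in the example output.
            for match in re.findall(regex, text):
                dic[match] = dic.get(match, 0) + 1
        else:
            # Plain patterns are counted as non-overlapping substrings.
            dic[pattern] = text.count(pattern)
    return dic
```

For the three-line file above, "is" is found in "is" and in "Fish", and "gi*l" matches "girl" twice and "gill" once, which agrees with the output shown in the example.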