Write a function in Python that takes a DNA sequence and kmer size (integer) as input, and returns a dictionary of all kmers (keys) in the string with a list of positions as values. The positions should start at 1. Use your function to make a dictionary of the 'seq' string below and print the dictionary.

The following sequence with size = 3 should return:

    seq = 'ATCGTTCATCG'
    kmerdict(seq, 3)
    {'ATC': [1, 8], 'CAT': [7], 'CGT': [3], 'GTT': [4], 'TCA': [6], 'TCG': [2, 9], 'TTC': [5]}

Note that the order in the output is not important. Use your function and the second string and print the positions of all ATGs:

    seq = 'ATCGTTCATCG'
    def kmerdict(sequence, size):
        index = {}
        return index
    seq2 = 'CACTTCACTCCATGGCCCATCTCTCATGAATCAGTACCAAATGCACTCACATCATTATGCACGGCACTTGCCTCAGCGGTCTATACCCTGTGCCATTTACCCATAACGCCC'
    print("Here are all the ATG positions in seq2: ")
Could you help with the code and explanation?

The approach I used is as follows:
- First, find all possible substrings of the string and collect them in a list.
- Keep only the substrings of the requested size (3 in our case).
- Find the occurrences of each of those substrings in the sequence using startswith.
- Add the occurrences one by one to a list, as positions starting at 1.
- Add a record to the dictionary with the substring as the key and the list of occurrences as the value.
- Finally, return the dictionary.
Everything is explained in the code comments.
The code is added in step 2, along with a screenshot of the code and output.
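The steps above can be sketched roughly as follows. This is a minimal illustration of the described approach, not the original step 2 code, which is not reproduced here. The `seq2` literal is transcribed from the garbled question text, so treat its exact letters as an assumption and check it against the original assignment before relying on the result.

```python
def kmerdict(sequence, size):
    """Map each k-mer of length `size` to a list of its 1-based start positions."""
    index = {}
    # Steps 1-2: enumerate all substrings, keeping only those of length `size`.
    kmers = [sequence[i:j]
             for i in range(len(sequence))
             for j in range(i + 1, len(sequence) + 1)
             if j - i == size]
    for kmer in kmers:
        if kmer in index:
            continue  # this k-mer was already handled at an earlier position
        # Steps 3-4: test every start offset with startswith; store 1-based positions.
        positions = []
        for start in range(len(sequence)):
            if sequence.startswith(kmer, start):
                positions.append(start + 1)
        # Step 5: substring as key, list of occurrences as value.
        index[kmer] = positions
    return index


seq = 'ATCGTTCATCG'
print(kmerdict(seq, 3))

# Assumption: seq2 is transcribed from the garbled question text.
seq2 = ('CACTTCACTCCATGGCCCATCTCTCATGAATCAGTACCAAATGCACTCACATCATTATGCACGG'
        'CACTTGCCTCAGCGGTCTATACCCTGTGCCATTTACCCATAACGCCC')
print("Here are all the ATG positions in seq2: ", kmerdict(seq2, 3).get('ATG', []))
```

Using `.get('ATG', [])` avoids a `KeyError` if the sequence contains no ATG at all. A single pass over `range(len(sequence) - size + 1)` that appends `i + 1` to the list for `sequence[i:i + size]` would build the same dictionary with far less work; the version above simply mirrors the listed steps.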
#To find all the substrings of a string
We use two indices: one (i) holds the position of the current starting character, and the other (j) extends the substring from there.
Ex. Hello
First i holds H and j also holds H, so the first substring = H.
Then i stays where it is and j increments to e, so the second substring = He; j increments again and we get substring = Hel.
We continue like this until the end. When j reaches o we would get Hello, but we eliminate that and do not add it.
Then i increments to e and j = e as well, so the next substring = e; again j keeps incrementing until the end.
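The walk-through above might look like this in code. It is a sketch only; the function name `proper_substrings` is made up for illustration and does not come from the original solution.

```python
def proper_substrings(text):
    """Collect every substring of `text` except the full string itself."""
    subs = []
    for i in range(len(text)):          # i marks the first character
        for j in range(i, len(text)):   # j marks the last character (inclusive)
            piece = text[i:j + 1]
            if piece != text:           # skip the whole string, as described above
                subs.append(piece)
    return subs


print(proper_substrings("Hello")[:6])  # ['H', 'He', 'Hel', 'Hell', 'e', 'el']
```

One consequence of dropping the full string is that a kmer size equal to the length of the whole sequence would never be found in this list, so that case may need separate handling.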