In Python: Using the following code as a starting point def fastaParser(): with open("seqs.fa") as fh: header = "" for line in fh: line = line.strip() if line.startswith(">"): #line is a header header = line.lstrip(">") else: #line is a sequence line sequence = line.strip() yield (header,sequence) But alter it so that it functions with a multi-line (sequences are 1-? number of lines long) FASTA file. The trick with this is that the place in the code we can yield a complete record changes when the sequence is more than one line. Your code must account for this unknown in some manner.
In Python:
Using the following code as a starting point
def fastaParser():
with open("seqs.fa") as fh:
header = ""
for line in fh:
line = line.strip()
if line.startswith(">"):
#line is a header
header = line.lstrip(">")
else:
#line is a sequence line
sequence = line.strip()
yield (header,sequence)
But alter it so that it functions with a multi-line (sequences are 1-? number of lines long) FASTA file.
The trick with this is that the place in the code we can yield a complete record changes when the sequence is more than one line. Your code must account for this unknown in some manner.

Step by step
Solved in 4 steps with 3 images









