Question
The file “dna.seq” (on Blackboard) consists of several DNA sequences. Write a program that reads in the file “dna.seq” and counts the number of sequences with the following properties:
• The total number of sequences in the file
• The number of sequences that have the pattern CTATA
• The number of sequences that have more than 1000 bases
• The number of sequences that have over 50% GC composition
• The number of sequences that have more than 2000 bases and more than 50% GC composition
Use Python.
Expert Solution
Step 1: Algorithms
The goal is to write a Python program that reads the DNA sequences in "dna.seq" and counts how many of them have each of the following properties:
- Total Sequences: Count the total number of sequences in the "dna.seq" file.
- Sequences with Pattern: Count the number of sequences in the file that contain the pattern "CTATA".
- Sequences with Length: Count the number of sequences with more than 1000 bases.
- Sequences with GC Composition: Count the number of sequences with over 50% GC composition, where GC composition is the proportion of guanine (G) and cytosine (C) bases in a sequence.
- Sequences with Length and GC Composition: Count the number of sequences with more than 2000 bases and over 50% GC composition.
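The five counts listed above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code from the original solution. The question does not specify the layout of "dna.seq", so the sketch assumes one sequence per non-empty line; a FASTA-style file with header lines would need a different reader. The function names (gc_fraction, read_sequences, count_properties, report) and the dictionary keys are hypothetical and chosen only for illustration.

```python
def gc_fraction(seq):
    """Return the fraction of bases in seq that are G or C (0.0 if seq is empty)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def read_sequences(path):
    """Read sequences from path.

    ASSUMPTION: one sequence per non-empty line. Whitespace is stripped and
    bases are upper-cased so that lower-case input is counted the same way.
    """
    with open(path) as handle:
        return [line.strip().upper() for line in handle if line.strip()]


def count_properties(sequences):
    """Return the five counts requested in the question as a dictionary."""
    counts = {
        "total": 0,
        "has_CTATA": 0,
        "longer_than_1000": 0,
        "gc_over_50": 0,
        "longer_than_2000_and_gc_over_50": 0,
    }
    for seq in sequences:
        gc_rich = gc_fraction(seq) > 0.5  # strictly more than 50% G or C
        counts["total"] += 1
        if "CTATA" in seq:
            counts["has_CTATA"] += 1
        if len(seq) > 1000:
            counts["longer_than_1000"] += 1
        if gc_rich:
            counts["gc_over_50"] += 1
        if len(seq) > 2000 and gc_rich:
            counts["longer_than_2000_and_gc_over_50"] += 1
    return counts


def report(path):
    """Print each count for the file at path, one per line."""
    for name, value in count_properties(read_sequences(path)).items():
        print(f"{name}: {value}")
```

Calling report("dna.seq") prints the five counts. All comparisons are strict, so a sequence with exactly 1000 bases or exactly 50% GC composition is not counted, matching the "more than" and "over" wording of the question.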