C++ Code:

In checkpoint B, you will build on checkpoint A to load (from standard input) a database of individuals and their known counts for several STRs (Short Tandem Repeats). Given a query DNA sequence, you will then check the database to find the name of the individual most likely to have that DNA.

You will implement the following functions:

  1. /**
      * Reads from standard input a list of Short Tandem Repeats (STRs)
      * and their known counts for several individuals
      *
      * @param nameSTRs the STRs (e.g. AGAT, AATG, TATC)
      * @param nameIndividuals the names of individuals (e.g. Alice, Bob, Charlie)
      * @param STRcounts the count of the longest consecutive occurrences of each STR in the DNA sequence for each individual
      * @pre nameSTRs, nameIndividuals, and STRcounts are empty
      * @post nameSTRs, nameIndividuals, and STRcounts are populated with data read from stdin
      **/
     void readData(vector<string>& nameSTRs, vector<string>& nameIndividuals, vector<vector<int>>& STRcounts)

For example, consider the input:

3 AGAT AATG TATC
Alice 5 2 8
Bob 3 7 4
Charlie 6 1 5

The first line gives the number of STRs followed by the names of those STRs, which will be populated into the vector nameSTRs. The remaining lines contain data for a number of individuals, one per line. Their names will be populated into the vector nameIndividuals, and the longest consecutive counts of the STRs will be stored in the 2D vector STRcounts (a vector of vectors of ints), with one inner vector per individual. Each element of a 2D vector is itself a vector. Check this resource for learning more about 2D Vectors in C++.

Note that an empty line at the end of the input denotes the end of the data. In other words, the code should stop reading names and STR counts as soon as an empty line is encountered.
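The reading logic described above can be sketched as follows. This is only one possible approach, not the official solution: it assumes each individual occupies exactly one line, that names contain no spaces, and that lines end with a plain newline. The helper readDataFrom is a hypothetical name introduced here so the same logic can be run against any input stream; readData simply forwards std::cin to it.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

// Hypothetical helper (not part of the assignment): same logic as readData,
// but reading from any input stream so it can be exercised without stdin.
void readDataFrom(istream& in, vector<string>& nameSTRs,
                  vector<string>& nameIndividuals,
                  vector<vector<int>>& STRcounts) {
    string line;

    // First line: the number of STRs followed by their names.
    if (!getline(in, line)) return;
    istringstream header(line);
    int numSTRs = 0;
    header >> numSTRs;
    string strName;
    for (int i = 0; i < numSTRs && header >> strName; ++i) {
        nameSTRs.push_back(strName);
    }

    // Remaining lines: one individual per line.
    // Stop at the first empty line (or at end of input).
    while (getline(in, line) && !line.empty()) {
        istringstream row(line);
        string person;
        row >> person;

        vector<int> counts;
        int count = 0;
        while (row >> count) counts.push_back(count);

        nameIndividuals.push_back(person);
        STRcounts.push_back(counts);  // one inner vector per individual
    }
}

// The function named in the assignment, reading from standard input.
void readData(vector<string>& nameSTRs, vector<string>& nameIndividuals,
              vector<vector<int>>& STRcounts) {
    readDataFrom(cin, nameSTRs, nameIndividuals, STRcounts);
}
```

Reading line by line with getline makes the empty-line terminator easy to detect; extracting tokens directly with cin >> would skip blank lines silently.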

  2. /**
      * Prints a list of Short Tandem Repeats (STRs) and their
      * known counts for several individuals
      *
      * @param nameSTRs the STRs (e.g. AGAT, AATG, TATC)
      * @param nameIndividuals the names of individuals (e.g. Alice, Bob, Charlie)
      * @param STRcounts the STR counts
      * @pre nameSTRs, nameIndividuals, and STRcounts hold the data intended to be printed
      * @post the names of individuals and their STR counts are printed to stdout in a column-major format
      **/
     void printData(vector<string>& nameSTRs, vector<string>& nameIndividuals, vector<vector<int>>& STRcounts)

This function will print out the information that has been previously read (using the function readData) in a format that aligns an individual's STR counts along a column. For example, the output for the above input will be:

name      Alice     Bob       Charlie
----------------------------------------
AGAT      5         3         6
AATG      2         7         1
TATC      8         4         5

This output uses text manipulators to left-align each name and each count within a field of 10 characters. The row of dashes is 40 characters long.
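A minimal sketch of this layout, assuming STRcounts[i][j] holds the count of STR j for individual i (the row-per-individual layout described for readData). The helper printDataTo is a hypothetical name introduced here so the formatting can be checked against any output stream; printData forwards std::cout to it. Note that every field is padded, so each line ends with trailing spaces.

```cpp
#include <cstddef>
#include <iomanip>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

// Hypothetical helper (not part of the assignment): writes the table to
// any output stream.
void printDataTo(ostream& out, vector<string>& nameSTRs,
                 vector<string>& nameIndividuals,
                 vector<vector<int>>& STRcounts) {
    const int width = 10;  // field width stated in the assignment

    // Header row: "name" followed by one column per individual.
    // std::left stays in effect; std::setw applies to the next item only.
    out << left << setw(width) << "name";
    for (const string& person : nameIndividuals) {
        out << setw(width) << person;
    }
    out << '\n' << string(40, '-') << '\n';

    // One row per STR; each individual's counts run down a column.
    for (size_t j = 0; j < nameSTRs.size(); ++j) {
        out << setw(width) << nameSTRs[j];
        for (size_t i = 0; i < nameIndividuals.size(); ++i) {
            out << setw(width) << STRcounts[i][j];
        }
        out << '\n';
    }
}

// The function named in the assignment, printing to standard output.
void printData(vector<string>& nameSTRs, vector<string>& nameIndividuals,
               vector<vector<int>>& STRcounts) {
    printDataTo(cout, nameSTRs, nameIndividuals, STRcounts);
}
```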

  3. /**
      * Computes the longest consecutive occurrences of several STRs in a DNA sequence
      *
      * @param sequence a DNA sequence of an individual
      * @param nameSTRs the STRs (e.g. AGAT, AATG, TATC)
      * @returns the count of the longest consecutive occurrences of each STR in nameSTRs
      **/
     vector<int> getSTRcounts(string& sequence, vector<string>& nameSTRs)

For example, if the sequence is

AACCCTGCGCGCGCGCGATCTATCTATCTATCTATCCAGCATTAGCTAGCATCAAGATAGATAGATGAATTTCGAAATGAATGAATGAATGAATGAATGAATG


and the vector nameSTRs is {"AGAT", "AATG", "TATC"}, then the output is the vector {3, 7, 4}.
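One straightforward way to compute these counts is a brute-force scan: for every starting position, count how many back-to-back copies of the STR begin there, and keep the maximum. The sketch below takes that approach. It is an illustration rather than the intended solution, and it does not rely on whatever helper checkpoint A provided.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using namespace std;

vector<int> getSTRcounts(string& sequence, vector<string>& nameSTRs) {
    vector<int> counts;
    for (const string& str : nameSTRs) {
        int longest = 0;
        if (!str.empty()) {
            // Try every position where a full copy of the STR could start.
            for (size_t start = 0; start + str.size() <= sequence.size(); ++start) {
                int run = 0;
                size_t pos = start;
                // Count consecutive copies of the STR beginning at 'start'.
                // pos never exceeds sequence.size(), so compare cannot throw.
                while (sequence.compare(pos, str.size(), str) == 0) {
                    ++run;
                    pos += str.size();
                }
                longest = max(longest, run);
            }
        }
        counts.push_back(longest);
    }
    return counts;
}
```

This scan is quadratic in the worst case, which is acceptable for short sequences; a faster version could skip past a run once it has been counted.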

  4. /**
      * Checks whether two vectors of STR counts are identical
      *
      * @param countQuery STR counts that are being queried (such as those computed from an input DNA sequence)
      * @param countDB STR counts that are known for an individual (such as those stored in a database)
      * @returns a boolean indicating whether they are the same or not
      **/
     bool compareSTRcounts(vector<int>& countQuery, vector<int>& countDB)

For example, if countQuery is the vector {3, 7, 4}, and countDB is the vector {3, 3, 4}, the function returns false.
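Since std::vector already defines operator== as an element-by-element comparison (after checking that the sizes match), one compact way to write this function is shown below. An explicit loop over the indices would work equally well if the course expects one.

```cpp
#include <vector>

using namespace std;

bool compareSTRcounts(vector<int>& countQuery, vector<int>& countDB) {
    // True only when both vectors have the same length and equal elements.
    return countQuery == countDB;
}
```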

Bringing it all together in main

  • Part of the main() function is already written for you: it reads a query DNA sequence and also reads (and prints) the database of STR counts for several individuals. Do NOT change that code. You should only need to make modifications beyond this point. In particular, your code should display the counts for each STR in the query DNA sequence. If there is a match with one of the individuals, it should display their name. If there is no match with any individual, then the program should output No Match found. For example, for the above query sequence, the output is:

Counts of the STRs in the DNA sequence is: 3 7 4
Found Match: Bob
