Graded HW 5-2

School: Johns Hopkins University
Course: 633
Subject: Biology
Date: Jan 9, 2024
Uploaded by murphydanyael

Name: Danyael Murphy

Modules 09-11: Graded Homework 5

1. (0.6) Use PROSITE (http://prosite.expasy.org/) to find motifs in CAA22878. Unclick the checkbox that would exclude patterns that match frequently. Scan Prosite.

a. (0.2) How many distinct patterns are above the line that reads “hits by patterns with a high probability of occurrence or by user-defined patterns?”
Answer: 1

b. (0.2) Looking at the pattern that is above that line, what types of molecules do proteins with this pattern transport?
Answer: Sugar/carbohydrates

c. (0.2) Click on the PROSITE number (starts with PS). Scroll down for the consensus pattern. At what position in the pattern is an "any amino acid except threonine (T) or aspartic acid (D)?"
Answer: 7th position

2. (0.6) Analyze the following sequence:

>hypothetical protein [Comamonadaceae bacterium CR]
MSLKERIQEEMKAAMRAKDTARLGAIRLLLAAIKQKEVDERVMLDDAAIIAVVDKLIKQRRDSVTAYQQA
QRSDLADKEAAEITVLEAYLPQRWSRAEVEEAVSRVVAETAATGPGDMGKVMAAVKAQCQGKADMAVVSQ
VVKAALAARGG

Using the Lehninger scale, what is the charge of this protein at pH 4.0, pH 7.0, and pH 9.0? Refer to the attached document for the syntax of calculating charge in the R Peptides package.

pH     Charge
4.0    12.79
7.0    1.96
9.0    0.46
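The kind of calculation behind question 2 can be sketched in Python as a stand-in for the R Peptides call the question refers to. This is a minimal sketch, not the package's implementation: it applies the Henderson-Hasselbalch equation to each ionizable group, and the pKa values below are assumed to approximate the Lehninger scale (check them against the Peptides documentation). The function name `net_charge` and the constants are illustrative, and no attempt is made here to reproduce the values in the table above.

```python
# Assumed pKa values approximating the Lehninger scale (verify before relying on them).
PKA_POSITIVE = {"Nterm": 9.69, "K": 10.5, "R": 12.4, "H": 6.0}
PKA_NEGATIVE = {"Cterm": 2.34, "D": 3.86, "E": 4.25, "C": 8.33, "Y": 10.0}

# Sequence from question 2 (hypothetical protein, Comamonadaceae bacterium CR).
SEQ = (
    "MSLKERIQEEMKAAMRAKDTARLGAIRLLLAAIKQKEVDERVMLDDAAIIAVVDKLIKQRRDSVTAYQQA"
    "QRSDLADKEAAEITVLEAYLPQRWSRAEVEEAVSRVVAETAATGPGDMGKVMAAVKAQCQGKADMAVVSQ"
    "VVKAALAARGG"
)


def net_charge(sequence: str, ph: float) -> float:
    """Estimate net charge at a given pH with the Henderson-Hasselbalch equation."""
    seq = sequence.upper()
    # One free N-terminus and one free C-terminus, plus ionizable side chains.
    pos_counts = {"Nterm": 1, **{aa: seq.count(aa) for aa in "KRH"}}
    neg_counts = {"Cterm": 1, **{aa: seq.count(aa) for aa in "DECY"}}
    # Fraction protonated (charge +1) for basic groups.
    positive = sum(
        n / (1 + 10 ** (ph - PKA_POSITIVE[group])) for group, n in pos_counts.items()
    )
    # Fraction deprotonated (charge -1) for acidic groups.
    negative = sum(
        n / (1 + 10 ** (PKA_NEGATIVE[group] - ph)) for group, n in neg_counts.items()
    )
    return positive - negative


if __name__ == "__main__":
    for ph in (4.0, 7.0, 9.0):
        print(f"pH {ph}: net charge {net_charge(SEQ, ph):.2f}")
```

The net charge falls as pH rises because basic groups lose protons and acidic groups become deprotonated, which is the trend the three pH values in the question are meant to show.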
3. (0.8) Use the protein below for transmembrane prediction:

> Dpy19/1 protein [Mus musculus]
MVTRGFLEFQLFGWLFGKVHPGAVVFAILAAMSIQGSANLQTQWNIVGEFSNLPQEELIEWIRYSTKPDA
VFAGAMPTMASVKLSALRPVVNHPHYEDAGLRARTKIVYSMYSRKAPEDVKKELMKLKVNYYILEESWCI
RRSKPGCSMPEIWDVEDPDNAGKTPLCNILVKDSKPHFTTVFQNSVYKVLEVLRQ

Run Phobius and TMHMM on this protein. Compare and contrast the results. What are the exact predicted transmembrane locations and approximate probabilities from the graph?

Program    Transmembrane location    Approximate probability of TM
Phobius    12…34                     90%
TMHMM      1…41                      100%

4. Use this mitochondrial protein from yeast.

>yeast mitochondrial protein
MVLLHKSTHIFPTDFASVSRAFFNRYPNPYSPHVLSIDTISRNVDQEGNLRTTRLLKKSGKLPTWVKPFL
RGITETWIIEVSVVNPANSTMKTYTRNLDHTGIMKVEEYTTYQFDSATSSTIADSRVKFSSGFNMGIKSK
VEDWSRTKFDENVKKSRMGMAFVIQKLEEARNPQF

Try the following localization programs. List the organelle prediction (if any), and probability (if any).

a. (0.2) TargetP (https://services.healthtech.dtu.dk/service.php?TargetP-2.0)
b. (0.3) DeepMito (http://busca.biocomp.unibo.it/deepmito/)
c. (0.2) CELLO (http://cello.life.nctu.edu.tw/) - FASTA format only
d. (0.3) Predotar (https://urgi.versailles.inra.fr/predotar/)

TargetP
  Protein type: Other
  Likelihood: 0.923
DeepMito
  Predicted mitochondrial? No
  GO-term: Mitochondrial intermembrane space
  Score: 0.17
CELLO
  Localization: Periplasmic
  Reliability: 1.348
Predotar
  Mitochondrial: 0.19
  ER: 0.00
  Elsewhere: 0.55

5. There is a file of ITPR3 cDNAs called itpr3.fst. Open MAFFT and run the multiple alignment followed by the following alignment programs: NJ (both methods), UPGMA.

a. (0.2) What are ITPR3 proteins?
Answer: ITPR3 proteins are encoded by the ITPR3 gene. The encoded protein is a receptor for inositol 1,4,5-trisphosphate, a second messenger that mediates the release of intracellular calcium.

b. (0.2) Comment on how reproducible the tree is using the different methods.
Answer: The trees are very similar. The branch lengths differ slightly.

c. (0.3) List all sister taxa for each tree

Tree              Sister taxa
NJ (conserved)    5&6, 1&2
NJ (gap-free)     1&2, 5&6
UPGMA             1&2, 5&6

d. (0.1) Which tree topology do you think makes the most sense based on the organisms?
Answer: UPGMA
e. (0.2) Briefly explain your previous answer.
Answer: The order of divergence in the UPGMA tree is the most plausible because it follows the increasing complexity of the organisms.

6. Find the sequence in the NCBI nucleotide database with the accession number, KC333876.

a. (0.1) What gene cluster is listed in the definition field?
Answer: ethABCD gene cluster

b. (0.1) List the locations of the -35 signal and the -10 signal for the gene cluster.

-35 signal    1552…1561
-10 signal    1574…1579

c. (0.2) Why is the -10 signal more than 10 bases upstream from the start of the first CDS of the operon?
Answer: Many genes may have a more extensive promoter region upstream of the core promoter, sometimes extending as far as 200 bp farther upstream.

d. (0.1) Why is there no -35 and -10 signal for the third CDS (second of operon)?
Answer: The region may not code for a protein. Thus, a transcription initiation sequence isn’t necessary.

e. (0.2) Run the sequence using FGENESB. Choose Bacterial generic as the training set. List any discrepancies from the GenBank record.

GenBank CDS locations    FGENESB CDS locations
263..1219                263..1297
1685..2923               1655..2923

Other CDS locations listed: 2993..4195, 4201..4521, 4592..4903, 5205..6719

f. (0.2) How many different mRNA transcripts are needed to produce the six CDS regions listed in GenBank and by FGENESB?
Answer: 3

g. (0.1) Run BLAST using the predicted protein from FGENESB from the first CDS (should be on complement strand). Then run BLAST with the first protein sequence from the GenBank record. Based on your BLAST results, which query sequence do you believe to be more correct: the prediction or the NCBI-annotated protein? Briefly explain.
Answer: The NCBI-annotated protein is the more correct of the two.
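
The coordinate arithmetic behind questions 6b, 6c, and 6e can be checked with a short Python sketch. It uses only the locations reported in the answers above and treats them as 1-based inclusive positions. It also assumes that the CDS at 1685..2923 is the first CDS of the operon, which is inferred from the wording of 6d and not stated in the record. The helper names `bases_between` and `boundary_shift` are illustrative.

```python
# Feature locations from the answers to question 6 (1-based, inclusive).
MINUS_35 = (1552, 1561)
MINUS_10 = (1574, 1579)
# Assumption: this CDS is the first one in the operon.
FIRST_OPERON_CDS = (1685, 2923)

# CDS locations reported as discrepant: (GenBank, FGENESB).
CDS_PAIRS = [
    ((263, 1219), (263, 1297)),
    ((1685, 2923), (1655, 2923)),
]


def bases_between(upstream, downstream):
    """Count the bases lying strictly between two non-overlapping features."""
    return downstream[0] - upstream[1] - 1


def boundary_shift(genbank, predicted):
    """Return (start shift, end shift) of the prediction relative to GenBank."""
    return (predicted[0] - genbank[0], predicted[1] - genbank[1])


if __name__ == "__main__":
    print("Spacer between -35 and -10:", bases_between(MINUS_35, MINUS_10))
    print("Bases between -10 and CDS start:", bases_between(MINUS_10, FIRST_OPERON_CDS))
    for genbank, predicted in CDS_PAIRS:
        print(genbank, "vs", predicted, "shift:", boundary_shift(genbank, predicted))
```

Under these assumptions, 105 bases separate the end of the -10 signal from the CDS start, and each of the two FGENESB predictions differs from GenBank at exactly one boundary.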