Expert Solution
INTRODUCTION TO IDENTIFYING HOMOLOGOUS SEQUENCES
Sequence similarity searching to identify homologous sequences is one of the first, and most informative, steps in any analysis of newly determined sequences. Modern protein sequence databases are very comprehensive, so that more than 80% of metagenomic sequence samples typically share significant similarity with proteins in sequence databases. Widely used similarity searching programs, like BLAST, PSI-BLAST, SSEARCH (Smith and Waterman, 1981; Pearson, 1991), FASTA (Pearson and Lipman, 1988), and the HMMER3 programs, produce accurate statistical estimates, ensuring that protein sequences that share significant similarity also have similar structures. Similarity searching is effective and reliable because sequences that share significant similarity can be inferred to be homologous; that is, they share a common ancestor.
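The statistical estimates these programs report are usually given as expectation values (E-values): the number of alignments with a score at least as good that would be expected by chance in a search of that size. The sketch below is only an illustration, assuming the standard Karlin-Altschul relation E = m * n * 2^(-S'), where S' is the bit score, m the query length, and n the total database length in residues. The function names and the 0.001 cutoff are hypothetical choices made for this example, not part of any of the programs named above; real programs also apply effective-length corrections that are ignored here.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value from a bit score via E = m * n * 2**(-S').

    Assumption: raw lengths are used, with no effective-length
    (edge-effect) correction, so values will differ somewhat from
    what a real search program reports.
    """
    if query_len <= 0 or db_len <= 0:
        raise ValueError("sequence lengths must be positive")
    return query_len * db_len * 2.0 ** (-bit_score)


def looks_significant(e_value: float, threshold: float = 1e-3) -> bool:
    """Return True when the E-value falls below an illustrative cutoff.

    The default threshold is an assumption for this sketch; appropriate
    cutoffs depend on the search and on how many searches are run.
    """
    return e_value < threshold


if __name__ == "__main__":
    # Hypothetical search: a 300-residue query against 10**9 residues.
    for score in (30.0, 50.0, 70.0):
        e = expect_value(score, query_len=300, db_len=10**9)
        print(f"bit score {score:5.1f}  E = {e:.3g}  significant: {looks_significant(e)}")
```

Because the E-value scales with the size of the search space, the same bit score is less significant against a larger database, which is why significance is judged from the E-value rather than from the raw score or percent identity.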
The units in this chapter present practical strategies for identifying homologous sequences in DNA and protein databases; once homologs have been found, more accurate alignments can be built from multiple sequence alignments, which can also form the basis for more sensitive searches, phenotype prediction, and evolutionary analysis.
While similarity searching is an effective and reliable strategy for identifying homologs (sequences that share a common evolutionary ancestor), most similarity searches seek to answer a much more challenging question: "Is there a related sequence with a similar function?" The inference of functional similarity from homology is more difficult, both because functional similarity is more difficult to quantify, and because the relationship between homology (structure) and function is complex. This introduction first discusses how homology is inferred from significant similarity, and how those inferences can be confirmed, and then considers strategies that connect homology to more accurate functional prediction.