
Exercise - Sequence Database Searching for Similar Sequences


First we are going to perform successive database searches to retrieve several members of a protein family.

• Use the NCBI protein database search to retrieve the sequence of a protein of the light-harvesting complex (LHC) protein family
• Copy the sequence and paste its first line into the sequence field at the FASTA web page: http://www.ebi.ac.uk/fasta33/index.html
• Perform the search using the default parameters
• Select a more distant match to a tobacco protein by checking the box on the left
• Click ‘Show Alignments’ and interpret the results (note the Z-score, the E value and the percentage of identity)
• Go back and click on the link of the entry
• Display the entry in FASTA format and copy the sequence
• Paste the first line of the sequence into a newly opened search field, select Gap open = -2 and Lower expectation value = 0.001 (this loosens the search restrictions and limits the displayed results to more poorly aligning sequences)
• Run the search and then repeat the last steps to find more distant protein family members
• Save the tobacco sequences found in all searches in a text file (not more than 10)

Next we’re going to use the blastx program to compare a translated DNA sequence to a protein database.
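As a rough illustration of what "translated" means for blastx: the DNA query is conceptually translated in all six reading frames (three per strand), and each translation is compared with the protein database. The sketch below shows only that translation step in plain Python, assuming the standard genetic code. The function names are made up for this example, and it is not how BLAST itself is implemented.

```python
# Six-frame translation sketch (standard genetic code assumed).
# All names here are hypothetical helpers, not part of any BLAST tool.

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Build the codon table: codons are enumerated in TCAG order, "*" marks a stop.
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unrecognised codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    # Toy input, not taken from seq2.txt.
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
```

The sequence from seq2.txt could be passed to `six_frame_translations` in the same way to see which frame yields a long stretch without stop codons.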
• Open the file seq2.txt from the download page and copy and paste the sequence into the blastx search field (go to http://www.ncbi.nlm.nih.gov/BLAST/ and choose blastx from the ‘Translated’ area)
• Perform the search and click on the link of the first sequence to show the entry
• Display the entry in FASTA format, copy the sequence and perform a protein-protein search
• Save all protein sequences into a text file
• Repeat the last search with the first and second sequences that are not annotated as an unknown protein
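The saved text files can also be handled programmatically. The following is a minimal sketch, assuming the sequences were saved in FASTA format and that entries to skip carry the word "unknown" in their header line. The headers and sequences in the example are invented placeholders, not real database entries, and the function names are hypothetical.

```python
# Minimal FASTA handling sketch for the saved text files.
# EXAMPLE_FASTA contains made-up placeholder records for illustration only.

EXAMPLE_FASTA = """\
>hit1 unknown protein (placeholder)
MAAATMALSS
PSFAGKAVKL
>hit2 example protein A (placeholder)
MRKSATTKKV
>hit3 example protein B (placeholder)
MASSTMALSS
"""


def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def first_known(records, n=2):
    """Return the first n records whose header does not mention 'unknown'."""
    known = [rec for rec in records if "unknown" not in rec[0].lower()]
    return known[:n]


if __name__ == "__main__":
    for header, sequence in first_known(parse_fasta(EXAMPLE_FASTA)):
        print(header, len(sequence))
```

Matching on the word "unknown" is only one way to read the last step; the annotation in the actual search results may be phrased differently and should be checked by eye.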

Contact Jitendra Narayan

© BioinformaticsOnline.com, 2007-09, India, All rights reserved
Conceptualized & Designed by: Jitendra Narayan Powered by: BCS-InfoSolutions, India