Saturday 15 March 2014

bioinformatics - Get FASTA source file from BLAST database name


I am currently writing a library that uses BLAST's -outfmt 10 option, which gives you CSV instead of the pretty human-readable format.

Like so:

  tblastn -db dmel_a -query somequery.faa -outfmt 10
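To make the CSV shape concrete, here is a minimal sketch that pulls the subject sequence ID out of each hit. It assumes the default 12-column layout of -outfmt 10 (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the sample hit and the helper name subject_ids are made up for illustration.

```shell
# Made-up sample hit in the default 12-column -outfmt 10 layout.
sample_hit='query1,scaffold_42,87.50,40,5,0,1,40,301,420,2e-15,70.1'

# Print the subject sequence ID (column 2) for every CSV hit read on stdin.
# Assumes sequence IDs contain no commas.
subject_ids() {
  awk -F',' '{ print $2 }'
}

printf '%s\n' "$sample_hit" | subject_ids
```

In practice the output of the tblastn command could be piped into subject_ids in the same way.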

The problem is that I want the source file the database was built from, so that I can pull out some sequences afterwards for processing. The only way I know to do this is to remove -outfmt 10 and run BLAST twice. Then I parse the human-readable output for this line:

  Database: source.fas   
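A rough sketch of that parsing step, assuming the report is available on stdin and contains a line starting with "Database: ". The report excerpt and the helper name db_title are made up for illustration.

```shell
# Made-up excerpt of the default human-readable report header.
report='TBLASTN 2.2.28+

Database: source.fas
           1,234 sequences; 5,678,901 total letters
'

# Print whatever follows "Database: " on the first matching line.
db_title() {
  sed -n 's/^Database: //p' | head -n 1
}

printf '%s' "$report" | db_title
```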

But this only works when -title is not specified to makeblastdb when creating the database. The stitle field of -outfmt 10 seems to be a FASTA header line anyway. I cannot simply search for the database name plus .fna, .fas, or .faa, because you can name the database differently from the source file.

Is there any other way to pull the FASTA source file from the BLAST database name? I do not see it in the list of -outfmt options, or am I blind today?

I found a solution based on a Biostar question and a blog post. It requires BLAST+ 2.2.28 if your FASTA is not exactly like NCBI's.

Use the -parse_seqids flag when you create the BLAST database; then, with blastdbcmd, you can pull out a range of a sequence:

  blastdbcmd -db t/blastTest/dmel -range 1-10 -entry some_seq_id
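Put together, the two steps might look like the sketch below. The file name source.fas, the database name dmel, the nucleotide database type (chosen because tblastn searches a nucleotide database), and the helper names build_db and fetch_range are all assumptions for illustration. The commands only run if BLAST+ is installed and the source file exists.

```shell
src="source.fas"   # hypothetical FASTA input file
db="dmel"          # hypothetical database name

# Build the database and index sequence IDs so they can be looked up later.
build_db() {
  makeblastdb -in "$src" -dbtype nucl -parse_seqids -out "$db"
}

# Fetch part of one sequence: $1 = sequence ID, $2 = range such as 1-10.
fetch_range() {
  blastdbcmd -db "$db" -entry "$1" -range "$2"
}

if command -v makeblastdb >/dev/null 2>&1 && [ -f "$src" ]; then
  build_db && fetch_range some_seq_id 1-10
else
  echo "BLAST+ or $src not found; nothing was run"
fi
```

This sidesteps the source file entirely: the sequences are read back from the database itself, so the original FASTA does not need to be located.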
