Batch Blast simply means running Blast on multiple sequences at once.
blastall -p blastn -d BlastDB -i in_file.fasta >blast_output
When in_file.fasta contains only one sequence, this is a single Blast. in_file.fasta can also hold multiple sequences in FASTA format, in which case it is a batch Blast.
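As a quick check of whether an input file will trigger a single or a batch Blast, the records in it can be counted. The sketch below assumes only that each FASTA record starts with a ">" header line; the function name and the example sequences are made up for illustration.

```python
def count_fasta_records(lines):
    """Count FASTA records: each record begins with a '>' header line."""
    return sum(1 for line in lines if line.startswith(">"))


# Made-up example input with two records.
example = """>seq1
ACGTACGT
>seq2
GGCCTTAA
""".splitlines()

n = count_fasta_records(example)
print(n, "sequence(s):", "batch Blast" if n > 1 else "single Blast")
```

For a real file, pass the open file handle instead of the example list, e.g. `count_fasta_records(open("in_file.fasta"))`.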
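The awkward part, of course, is the output of a batch Blast. With one query we can read the result by eye, but with thousands of queries we cannot look through them one by one. Blast anticipated this long ago, which is where the -m8 parameter comes in: it writes the results as a table, one hit per line. The -b5 parameter shows only the top 5 matches.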
blastall -p blastn -d BlastDB -i in_file.fasta -m8 -b5 >blast_output
The recommended command line is as follows:
blastall -p blastn -d BlastDB -i in_file.fasta -m8 -b5 -b1 -a2 -FF >blast_output
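The -a2 parameter uses two CPUs to speed up the search. -FF turns off filtering of simple repeats and low-complexity sequences (filtering is on by default).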
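Once the results are in -m8 tabular form, they can be processed by a script instead of being read by eye. The following is a minimal sketch, not part of the original post: it assumes the usual 12 tab-separated columns of the -m8 layout, and the column labels, the function name and the sample rows are all made up for illustration. It keeps the hit with the highest bit score for each query.

```python
import csv
import io

# Column order of the -m8 tabular layout; the labels are chosen for this sketch.
M8_COLUMNS = [
    "query", "subject", "pct_identity", "aln_length", "mismatches",
    "gap_opens", "q_start", "q_end", "s_start", "s_end", "evalue", "bit_score",
]


def best_hits(handle):
    """Return {query id: row dict}, keeping the highest bit score per query."""
    best = {}
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(M8_COLUMNS, fields))
        row["evalue"] = float(row["evalue"])
        row["bit_score"] = float(row["bit_score"])
        current = best.get(row["query"])
        if current is None or row["bit_score"] > current["bit_score"]:
            best[row["query"]] = row
    return best


# Invented sample rows, for illustration only.
sample = (
    "q1\tsubjA\t98.50\t200\t3\t0\t1\t200\t10\t209\t1e-100\t370\n"
    "q1\tsubjB\t90.00\t150\t15\t0\t5\t154\t1\t150\t2e-40\t180\n"
    "q2\tsubjC\t85.00\t100\t15\t0\t1\t100\t1\t100\t3e-20\t95.1\n"
)

for query, hit in best_hits(io.StringIO(sample)).items():
    print(query, hit["subject"], hit["evalue"], hit["bit_score"])
```

To use it on a real result, open the file and pass the handle: `best_hits(open("blast_output", newline=""))`.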
For the full original post, see: http://liucheng.name/1221/
Thursday, September 6, 2012
Wednesday, September 5, 2012
How to Blast sequences against a genome
1. Open a DOS window (e.g. with the Run command)
2. Type the following command to run Blast:
blastp -db databaseName -query contigFile -out filename -evalue e-value
For example:
blastp -db octdata -query maydata.fna -out myResults.txt -evalue .00001
3. What the command means:
- blastp invokes the program that compares protein sequence(s) against a database of protein sequences. Other Blast programs to consider:
  - blastn to compare nucleotide sequence(s) against a database of nucleotide sequences
  - blastx to compare nucleotide sequence(s) translated in all six reading frames against a database of protein sequences
  - tblastn to compare protein sequence(s) against a database of nucleotide sequences translated in all six reading frames
  - tblastx to compare nucleotide sequence(s) translated in all six reading frames against a database of nucleotide sequences translated in all six reading frames
- -db databaseName tells the program to use the databaseName you chose when you set up the database.
- -query contigFile tells the program to use the specified file as the query (input) to Blast. Give the full path if the file isn't in the same directory as Blast.
- -out filename tells the program to write its output to the specified file.
- -evalue e-value tells the program to ignore matches whose E-value is greater than the decimal number given. The E-value is the number of matches at least this good that would be expected by chance in a database of this size, so it is not strictly a probability, although for small values the two are nearly the same.
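The options described above can also be assembled by a script, which helps when the same search is repeated for many files. The following is only a sketch: the function name and the program check are assumptions made for illustration, and the flags are the ones used in the example command above.

```python
import shlex

# The five Blast programs discussed in this post.
PROGRAMS = {"blastn", "blastp", "blastx", "tblastn", "tblastx"}


def build_blast_command(program, db, query, out, evalue):
    """Return the command as a list of arguments (suitable for subprocess.run)."""
    if program not in PROGRAMS:
        raise ValueError("unknown Blast program: " + program)
    return [program, "-db", db, "-query", query, "-out", out, "-evalue", str(evalue)]


cmd = build_blast_command("blastp", "octdata", "maydata.fna", "myResults.txt", "0.00001")
print(shlex.join(cmd))  # shlex.join needs Python 3.8 or later

# To actually run the search (requires Blast to be installed and on the PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than one long string avoids quoting problems when a file path contains spaces.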
4. Output can be modest when comparing two small sequences, but with lots of sequences the output can run to dozens of megabytes and fill your disk drive.
5. How do you know whether the program worked? If you have a large output file (i.e. dozens of megabytes), don't try to read it into something like Word (you risk choking it). I don't think that Microsoft has any solution for us, but there is an ancient freeware program from the pre-Windows era that will do the job: DR (standing for DiRectory). Download it and put it in the Blast directory.
6. To run DR, type DR at a DOS prompt to get a list of files in \Blast, then press the F10 key to sort the files by date of creation, then press the End key to go to the end of the list. You should see the file you just made. Press the Enter key to see the contents of the file (you can scroll through the file using the usual keys).
7. However you look at the output file, you should see something like:
BLASTP 2.2.9 [May-01-2004]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.
Query= Contig240-R (500 letters)
Database: octdata.fna
1 sequences; 2,160,837 total letters
If so, you win!
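Another way to check a very large output file without loading all of it is to read only its first few lines. The sketch below is an assumption-laden illustration, not part of the original instructions: the function names are made up, and the check simply looks for a program banner line and a "Query=" line like the ones shown above.

```python
from itertools import islice


def head(path, n=20):
    """Read only the first n lines, so even a huge file opens instantly."""
    with open(path, "r", errors="replace") as fh:
        return [line.rstrip("\n") for line in islice(fh, n)]


def looks_like_blast_report(lines):
    """Heuristic: a BLAST/TBLAST banner line plus a 'Query=' line."""
    has_banner = any(line.upper().startswith(("BLAST", "TBLAST")) for line in lines)
    has_query = any(line.lstrip().startswith("Query=") for line in lines)
    return has_banner and has_query


# Usage, assuming the output file from the example command exists:
# lines = head("myResults.txt")
# print("\n".join(lines))
# print("Looks like Blast output:", looks_like_blast_report(lines))
```

Depending on the Blast version, the "Query=" line may sit further down than 20 lines, so a larger `n` may be needed.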
From http://www.vcu.edu/csbc/bbsi/inst/archives/bioinf/RunLocalBlast.html
How to run a sequence through BLAST at TIGR
Click on CMR Blast on the blue bar near the top.
Click the down arrow next to the Program window and choose the appropriate program.
Click the down arrow next to the Database window and choose the appropriate database.
Paste your sequence into the window supplied for that purpose.
Click the Submit BLAST job button.
From the internet
How to set up a local Blast database
Go to the directory where you put the Blast files
Type in the following:
makeblastdb -in file -out name -dbtype prot -hash_index
(for a database of proteins)
OR
makeblastdb -in file -out name -dbtype nucl -hash_index
(for a database of DNA or RNA)
What it means:
- makeblastdb invokes the Blast accessory program that creates the database.
- -in tells the program that the path that follows leads to the input file.
- -out tells the program that the characters that follow should be used as the name of the database (you can name it anything you want, so long as you use 8 or fewer legal characters).
- -dbtype prot tells the program "the file consists of protein sequences".
- -dbtype nucl tells the program "the file consists of nucleotide sequences".
- -hash_index tells the program "you should make an index of hash values for the sequences". Frankly, I don't know what good the index does, but it's cheap.
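If you are not sure which -dbtype a file needs, a rough guess can be made from the letters it contains. The sketch below is a heuristic invented for illustration, not something makeblastdb itself does: it calls a file nucleotide if at least 90% of its sequence letters are A, C, G, T, U or N, and the function names and the threshold are assumptions.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_dbtype(fasta_lines, threshold=0.9):
    """Guess 'nucl' or 'prot' from the share of nucleotide-like letters."""
    letters = 0
    nucleotide_like = 0
    for line in fasta_lines:
        if line.startswith(">"):  # header line, not sequence
            continue
        for ch in line.strip().upper():
            if ch.isalpha():
                letters += 1
                if ch in NUCLEOTIDE_LETTERS:
                    nucleotide_like += 1
    if letters == 0:
        raise ValueError("no sequence letters found")
    return "nucl" if nucleotide_like / letters >= threshold else "prot"


def build_makeblastdb_command(in_file, name, dbtype):
    """Return the makeblastdb command from this post as an argument list."""
    if dbtype not in ("prot", "nucl"):
        raise ValueError("dbtype must be 'prot' or 'nucl'")
    return ["makeblastdb", "-in", in_file, "-out", name,
            "-dbtype", dbtype, "-hash_index"]


# Made-up example sequence.
dbtype = guess_dbtype([">contig1", "ACGTTTGACCANNACGT"])
print(" ".join(build_makeblastdb_command("file", "name", dbtype)))
```

The guess can be wrong for short or unusual sequences, so treat it as a hint and check the file yourself when in doubt.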
WARNING #2: Windows XP and NT users may experience trouble cutting and pasting the command line makeblastdb. Evidently the system does something strange to the hyphens. Type the command in instead.
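One plausible cause of the hyphen problem, offered here as an assumption rather than something the original author confirmed, is that the pasted text carries typographic dashes in place of plain hyphens. If so, a pasted command can be cleaned up before use. The function name below is made up for illustration.

```python
# Dash-like characters that may replace a plain hyphen in pasted text
# (hyphen, non-breaking hyphen, figure dash, en dash, em dash, minus sign).
DASH_LOOKALIKES = "\u2010\u2011\u2012\u2013\u2014\u2212"


def normalize_hyphens(command):
    """Replace dash look-alikes with the plain ASCII hyphen."""
    return command.translate({ord(ch): "-" for ch in DASH_LOOKALIKES})


pasted = "makeblastdb \u2013in file \u2013out name \u2013dbtype prot \u2013hash_index"
print(normalize_hyphens(pasted))
```

Typing the command by hand, as the warning suggests, remains the simplest fix.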
From:http://www.vcu.edu/csbc/bbsi/inst/archives/bioinf/SetupLocalBlast.html