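makeblastdb parameters in BLAST+ explained
From now on I plan to do all the BLAST operations I need at work with BLAST+.
As with the old BLAST, we start by formatting a database and then move on to the alignment.
Usually we have a FASTA file that we want to format into a database. The old command for this was formatdb; now it is makeblastdb.
The typical invocation looks like this: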
makeblastdb -in input_file -dbtype molecule_type -title database_title -parse_seqids -out database_name -logfile File_Name
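-in the input file, i.e. the FASTA sequences you want to format
-dbtype the sequence type: nucl for nucleotide, prot for protein
-title a title for the database; purely cosmetic (it cannot be used as the -db argument in later searches)
-parse_seqids recommended. It parses the Seq-ids in the FASTA input; I have not yet worked out exactly why that matters
-out the database name. Pick something meaningful, because this is the value passed to -db in later BLAST+ searches
-logfile the log file; if omitted, the log is printed to the screen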
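As a minimal sketch of the options above, the script below builds a nucleotide database from a two-sequence FASTA file and then searches it. The file names (demo.fa, demo_db, demo_makeblastdb.log) and the sequences are made up for illustration, and the BLAST+ steps are skipped when makeblastdb is not on the PATH.

```shell
#!/bin/sh
# Write a tiny FASTA file (hypothetical sequences, for illustration only).
cat > demo.fa <<'EOF'
>seq1 first demo sequence
ACGTACGTACGTACGTACGTACGTACGTACGT
>seq2 second demo sequence
TTGACCGGTTAACCGGTTGACCAATTGGCCAA
EOF

# Count the FASTA records (lines starting with ">").
n_seqs=$(grep -c '^>' demo.fa)
echo "sequences in demo.fa: $n_seqs"

# Only run BLAST+ if it is installed.
if command -v makeblastdb >/dev/null 2>&1; then
    # Format the database; -out sets the name that is later passed to -db.
    makeblastdb -in demo.fa -dbtype nucl -title "Demo database" \
        -parse_seqids -out demo_db -logfile demo_makeblastdb.log

    # Search the new database, using the FASTA file itself as the query.
    blastn -query demo.fa -db demo_db -outfmt 6
else
    echo "makeblastdb not found; skipping database creation" >&2
fi
```

Note that the search uses the -out name (demo_db), not the -title string.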
This is quite different from the old formatdb.
Running makeblastdb with -help prints the following:
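For anyone migrating old scripts, the rough correspondence between the two commands can be sketched as below. The helper function formatdb_to_makeblastdb and the file names are hypothetical, and the legacy flags in the comments (-i, -p, -o, -n) are the formatdb options as I remember them, so check them against your own installation.

```shell
#!/bin/sh
# Hypothetical helper: print the makeblastdb command that roughly corresponds
# to the legacy call "formatdb -i INPUT -p T|F -o T -n NAME".
formatdb_to_makeblastdb() {
    input=$1       # FASTA file (formatdb -i)
    is_protein=$2  # T for protein, F for nucleotide (formatdb -p)
    name=$3        # database base name (formatdb -n)

    if [ "$is_protein" = "T" ]; then
        dbtype=prot
    else
        dbtype=nucl
    fi
    # formatdb "-o T" (parse Seq-ids) becomes the -parse_seqids switch.
    echo "makeblastdb -in $input -dbtype $dbtype -parse_seqids -out $name"
}

# Old: formatdb -i genes.fa -p F -o T -n genes_db
formatdb_to_makeblastdb genes.fa F genes_db
```

The function only prints the new command, so it can be used to review a conversion before running anything.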
makeblastdb -help
USAGE
makeblastdb [-h] [-help] [-in input_file] [-dbtype molecule_type]
[-title database_title] [-parse_seqids] [-hash_index]
[-mask_data mask_data_files] [-out database_name]
[-max_file_sz number_of_bytes] [-taxid TaxID] [-taxid_map TaxIDMapFile]
[-logfile File_Name] [-version]
DESCRIPTION
Application to create BLAST databases, version 2.2.23+
OPTIONAL ARGUMENTS
-h
Print USAGE and DESCRIPTION; ignore other arguments
-help
Print USAGE, DESCRIPTION and ARGUMENTS description; ignore other arguments
-version
Print version number; ignore other arguments
*** Input options
-in <File_In>
Input file/database name; the data type is automatically detected, it may
be any of the following:
FASTA file(s) and/or
BLAST database(s)
Default = `-'
-dbtype <String, `nucl', `prot'>
Molecule type of input
Default = `prot'
*** Configuration options
-title <String>
Title for BLAST database
Default = input file name provided to -in argument
-parse_seqids
Parse Seq-ids in FASTA input
-hash_index
Create index of sequence hash values.
*** Sequence masking options
-mask_data <String>
Comma-separated list of input files containing masking data as produced by
NCBI masking applications (e.g. dustmasker, segmasker, windowmasker)
*** Output options
-out <String>
Name of BLAST database to be created
Default = input file name provided to -in argument
Required if multiple file(s)/database(s) are provided as input
-max_file_sz <String>
Maximum file size for BLAST database files
Default = `1GB'
*** Taxonomy options
-taxid <Integer, >=0>
Taxonomy ID to assign to all sequences
* Incompatible with: taxid_map
-taxid_map <File_In>
Text file mapping sequence IDs to taxonomy IDs.
Format:<SequenceId> <TaxonomyId><newline>
* Incompatible with: taxid
-logfile <File_Out>
File to which the program log should be redirected
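The -taxid_map format described in the help text can be illustrated with a small sketch. The file names and sequences below are made up, the taxonomy IDs 9606 and 10090 are used only as examples, and combining -taxid_map with -parse_seqids is my assumption about what is needed for the sequence IDs to match the map.

```shell
#!/bin/sh
# Hypothetical two-sequence FASTA file.
cat > tax_demo.fa <<'EOF'
>seqA
ACGTACGTACGTACGTACGTACGT
>seqB
TTGACCGGTTAACCGGTTGACCAA
EOF

# One "<SequenceId> <TaxonomyId>" pair per line, as described for -taxid_map.
cat > tax_demo_map.txt <<'EOF'
seqA 9606
seqB 10090
EOF

# Check that every FASTA ID has an entry in the map before building.
missing=0
for id in $(grep '^>' tax_demo.fa | sed 's/^>//; s/[[:space:]].*//'); do
    if ! grep -q "^$id " tax_demo_map.txt; then
        echo "no taxid for $id" >&2
        missing=$((missing + 1))
    fi
done
echo "IDs without a taxid: $missing"

# Build the database only if the map is complete and BLAST+ is installed.
if [ "$missing" -eq 0 ] && command -v makeblastdb >/dev/null 2>&1; then
    makeblastdb -in tax_demo.fa -dbtype nucl -parse_seqids \
        -taxid_map tax_demo_map.txt -out tax_demo_db
fi
```

If every sequence belongs to the same organism, -taxid with a single ID is the simpler choice; as the help text notes, the two options cannot be combined.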