blast 批量查询

问题是这样的：有很多很多序列，几百条，想大致了解一下这些序列分别是什么样的微生物，如果一条一条去blast，那是相当的累。想找一个工具告诉我每条序列blast结果的前几条的名称是什么即可，不需要其它信息。

在网上找了一下，没找到合适的软件或工具，虽然有些关于批量blast的教程之类的，比如这个，但是给出的结果及其繁琐，很多不需要的信息。

后来发现Biopython可以很简单就进行批量Blast。只需先安装Python和Biopython，Python和Biopython的下载地址分别为：
http://www.python.org/download/
http://www.biopython.org/wiki/Download

Windows版本下载后直接双击安装即可，非常简单。
然后打开IDLE（Python GUI)，”File”->”New Window”, 分如下两步进行：

第一步，运行下面的代码进行Blast

from Bio.Blast import NCBIWWW
from Bio import SeqIO
import time
SeqNumber = 0
for record in SeqIO.parse("allseq.seq", "fasta"):
    SeqNumber += 1
    try:
        result_handle = NCBIWWW.qblast("blastn", "nr", record.seq)
        save_file = open('xml\\'+str(SeqNumber)+'.xml', 'w')
        save_file.write(result_handle.read())
        save_file.close()
        print SeqNumber,' OK!'
    except:
        print SeqNumber,' Error! Will try again later!'
        time.sleep(600)
        SeqNumber -=1
print 'Done!'

序列要以fasta格式放在allseq.seq这个文件中，另外需要在当前目录下建一个名字为xml文件夹，代码运行结束后结果放在这个文件夹中。

第二步，列出每条序列blast结果的前几条的名称

from Bio.Blast import NCBIXML
import glob
total_xml_file = len(glob.glob('xml\\*.xml'))
for xmlnumber in range(1,total_xml_file+1):
  result_handle = open('xml\\'+str(xmlnumber)+'.xml')
  blast_records = NCBIXML.parse(result_handle)
  blast_record = blast_records.next()
  i = 0
  print '-------------------No.', xmlnumber, '-------------------'
  for alignment in blast_record.alignments:
    if i<10 : ##如果想看前20条结果就该为20
      print alignment.title
    else:
      break
    i+=1
  result_handle.close()

如果序列比较多，第一步可能需要很长时间，可以趁程序运行的时候先出去踢踢球或逛逛街，回来之后再运行第二步，第二步很快：）

第一次接触使用Python语言，非常好用，个人感觉要比Perl好一些，当然Python的生物信息学组件Biopython的可能现在没有Bioperl强大。

上次做了个序列文件合并小程序，后来发现用Bioedit就可以很简单地进行合并，白忙活了。如果进行批量Blast也有更好的方法，请留言告诉我一下，谢谢

转载自：有个博客 [ http://www.yelinsky.com/blog/ ]

本文链接地址：http://www.yelinsky.com/blog/archives/298.html

小生这厢有礼了(BioFaceBook Personal Blog)

分类

Recent Comments

链接表

blast 批量查询

1 comment to blast 批量查询

Leave a Reply to Charity Cancel reply

Archives

Meta

小生这厢有礼了(BioFaceBook Personal Blog)

分类

Recent Comments

链接表

blast 批量查询

1 comment to blast 批量查询

Leave a Reply to Charity Cancel reply

Tags

Archives

Meta