HHblits: Faster, More Accurate, More Sensitive Sequence Alignment

Original poster · Posted: 2018-12-27 00:00 · Replies: 1 · Followers: 81

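Researchers at Ludwig Maximilian University of Munich (LMU Munich), Germany, have published a paper titled "HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment", introducing HHblits, a new tool for protein sequence search and alignment. The software uses a novel search strategy to identify proteins with similar sequences in databases faster and more accurately, which greatly strengthens sequence-based analysis of protein function; the speed-up is reported to be as much as 2,500-fold over existing methods. The work was published in Nature Methods.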

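The study was led by Dr. Johannes Söding of the Gene Center at LMU Munich. "Our method extends the breadth and power of sequence analysis, which in turn makes it easier to work out protein structure and function afterwards," he says.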

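Proteins take part in nearly every biochemical process in living organisms. A protein's function depends largely on the order in which the 20 amino acids are arranged in its sequence, and on the three-dimensional structure into which that sequence folds. For proteins with similar sequences, bioinformatics methods can therefore infer an evolutionary relationship and, from that, predict similar structure and function.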

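Comparing proteins is therefore a routine part of protein research: researchers regularly search public databases for proteins similar to their own and use the known structures of those relatives to infer function. "This kind of sequence analysis is a basic research tool in bioinformatics," says Söding.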

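Sequence-search programs assess sequence similarity by computing pairwise alignments: two amino acid sequences are written one above the other so that identical or similar amino acids are paired up. "Perhaps even more important than pairwise sequence comparison are so-called multiple sequence alignments, in which similar sequences from many related proteins are arranged in a matrix, with one sequence per row and corresponding amino acids in the same column," says Söding. Because the structure and function of evolutionarily related proteins are usually conserved even after mutations have changed their sequences, a multiple sequence alignment can help infer the structure and molecular function of an uncharacterized protein.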
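
To make the idea of a pairwise alignment concrete, here is a minimal sketch of a global alignment by dynamic programming (the classic Needleman-Wunsch scheme). It is not what PSI-BLAST or HHblits actually run; the function name global_align and the toy scores (match = 1, mismatch = -1, gap = -2) are assumptions chosen for illustration, whereas real tools use substitution matrices and more elaborate gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences with a toy scoring scheme.

    Returns (score, aligned_a, aligned_b), where '-' marks a gap.
    """
    n, m = len(a), len(b)

    def pair_score(i, j):
        # Score for pairing a[i-1] with b[j-1] (1-based table indices).
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # pair two residues
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(global_align("ACDE", "ADE"))  # (1, 'ACDE', 'A-DE')
```

In the example, the second sequence lacks the C, so the best alignment places a gap under it: three matches and one gap give a score of 1. Stacking many such gapped rows on top of each other yields the matrix layout of a multiple sequence alignment described above.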

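Over the past 15 years, the most popular tool for comparing protein sequences has been PSI-BLAST, because it combines speed with high sensitivity and accuracy.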

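According to Söding, the new HHblits method outperforms PSI-BLAST on all of these counts, thanks to two ideas. First, both the sequence of the protein of interest and the sequences in the database are converted into profile hidden Markov models (HMMs). HMMs are statistical models that capture how likely mutations are at each position of an alignment, so this step improves the sensitivity and accuracy of searches for weakly similar sequences.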
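
The following sketch hints at why a profile carries more information than a single sequence. It builds position-specific amino acid probabilities from a small multiple sequence alignment and scores a sequence against them with log-odds. This is only a simplified stand-in for the match-state emission probabilities of a profile HMM: it has no insert or delete states and does not reproduce the HMM-HMM comparison in HHblits. The names build_profile and log_odds_score, the pseudocount of 1, and the uniform background frequency of 1/20 are all assumptions made for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(msa, pseudocount=1.0):
    """Return one {amino_acid: probability} dict per alignment column.

    msa is a list of equal-length aligned sequences; '-' marks a gap.
    """
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all alignment rows must have the same length")
    profile = []
    for col in range(length):
        # Start every amino acid at the pseudocount so that residues never
        # observed in this column still get a non-zero probability.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for row in msa:
            residue = row[col]
            if residue in counts:  # gaps and unknown symbols are skipped
                counts[residue] += 1
        total = sum(counts.values())
        profile.append({aa: c / total for aa, c in counts.items()})
    return profile


def log_odds_score(profile, seq, background=1 / 20):
    """Sum log2(column probability / background) over an ungapped sequence."""
    if len(seq) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    return sum(math.log2(col[aa] / background) for col, aa in zip(profile, seq))


msa = ["ACDE", "ACDE", "ACNE", "GCDE"]
profile = build_profile(msa)
print(log_odds_score(profile, "ACDE") > log_odds_score(profile, "WWWW"))  # True
```

A sequence that follows the conserved columns scores above zero, while one made of residues never seen in the alignment scores below zero. The third column also tolerates N as well as D, which a single query sequence could not express.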

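Second, the group developed a prefilter that reduces the number of candidates needing a full comparison without compromising search sensitivity; this is reported to cut search time by a factor of 2,500. Söding stresses that, compared with earlier methods, HHblits predicts protein function and structure faster and more accurately. His group is already working on further improvements, including incorporating three-dimensional protein structure data into the analysis.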
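
The general pattern behind a prefilter, a cheap test that discards most database entries before an expensive comparison runs on the survivors, can be sketched as follows. The cheap test here is a shared k-mer count, which is a deliberately simple substitute and not the discretized-profile prefilter that HHblits uses. The function names, the thresholds, the stand-in identity_score, and the tiny database are all hypothetical.

```python
def kmers(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def prefilter(query, database, k=3, min_shared=2):
    """Cheap stage: keep entries sharing at least min_shared k-mers with query."""
    query_kmers = kmers(query, k)
    return [
        name
        for name, seq in database.items()
        if len(query_kmers & kmers(seq, k)) >= min_shared
    ]


def identity_score(a, b):
    """Stand-in for an expensive comparison: count identical positions."""
    return sum(x == y for x, y in zip(a, b))


def search(query, database, k=3, min_shared=2):
    """Run the expensive comparison only on entries that pass the prefilter."""
    candidates = prefilter(query, database, k, min_shared)
    hits = [(name, identity_score(query, database[name])) for name in candidates]
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits, len(candidates)


database = {
    "p1": "MKTAYIAKQR",
    "p2": "MKTAYIAKQK",
    "p3": "GGGGGGGGGG",
    "p4": "WWPLLNNSSD",
}
hits, compared = search("MKTAYIAKQR", database)
print(hits)      # [('p1', 10), ('p2', 9)]
print(compared)  # 2
```

Only two of the four entries reach the expensive stage. The saving grows with database size, provided the cheap test rarely rejects a true relative, which is the property the article attributes to the HHblits prefilter.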

The English abstract of the paper follows.

Sequence-based protein function and structure prediction depends crucially on sequence-search sensitivity and accuracy of the resulting sequence alignments. We present an open-source, general-purpose tool that represents both query and database sequences by profile hidden Markov models (HMMs): 'HMM-HMM–based lightning-fast iterative sequence search' (HHblits; http://toolkit.genzentrum.lmu.de/hhblits/). Compared to the sequence-search tool PSI-BLAST, HHblits is faster owing to its discretized-profile prefilter, has 50–100% higher sensitivity and generates more accurate alignments.
