Corbi: a new R package for biological network alignment and querying

BMC Systems Biology, 7:S6

First Online: 14 October 2013


In the last decade, many biological networks have been built from the large-scale experimental data produced by rapidly developing high-throughput techniques, as well as from the literature and other sources. However, this huge amount of network data has not been fully utilized because biological network analysis tools remain limited. As basic and essential bioinformatics methods, biological network alignment and querying have been applied in many fields, such as predicting new protein-protein interactions (PPI). Although many algorithms have been published, the network alignment and querying problems have not been solved satisfactorily. In this paper, we extended CNetQ, a novel network querying method based on the conditional random fields model, to solve the network alignment problem by adopting an iterative bi-directional mapping strategy. The new method, called CNetA, was compared with four other methods on fifty simulated and three real PPI network alignment instances using four structural and five biological measures. The computational experiments on the simulated data, which were generated from a biological network evolutionary model to validate the effectiveness of network alignment methods, show that CNetA achieves the best accuracy in terms of both nodes and networks. For the real data, CNetA identified larger biologically conserved subnetworks than the structure-dominated methods and larger connected subnetworks than the biology-dominated methods, which suggests that CNetA better balances biological and structural similarities. Further, CNetQ and CNetA have been implemented in a new R package, Corbi, and freely accessible, easy-to-use web services for CNetQ and CNetA have also been built on the R package. The simulated and real datasets used in this paper are available for downloading at

Abbreviations

MP: Matching pairs

EC: Edge correctness

LCCS: Largest common connected subgraph

MF: Molecular function

BP: Biological process

CC: Cellular component

OP: Orthologous proteins

HP: Hit pathways

PAC: Pathway average coverage

The numbers in LCCS are the number of nodes and edges of the LCCS, respectively. DUP: duplicated nodes. ALL: all the nodes in the alignment networks.
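
As an illustration of one of the structural measures listed above, the following is a minimal Python sketch of edge correctness (EC). It assumes the definition commonly used in the network alignment literature: the fraction of edges in the source network whose mapped endpoints are also joined by an edge in the target network. The abstract does not state the exact formula the authors used, so that definition is an assumption here. The function name `edge_correctness`, the toy networks, and the node mapping are all hypothetical and are not part of the Corbi package.

```python
def edge_correctness(source_edges, target_edges, mapping):
    """Fraction of source edges conserved under a node mapping.

    Assumed definition (not taken from the paper): an undirected source
    edge (u, v) is conserved if both endpoints are mapped and
    (mapping[u], mapping[v]) is an edge of the target network.
    """
    # Store target edges as unordered pairs so direction does not matter.
    target = {frozenset((a, b)) for a, b in target_edges}
    source = list(source_edges)
    if not source:
        return 0.0  # convention chosen for this sketch: empty network gives 0
    conserved = sum(
        1
        for u, v in source
        if u in mapping
        and v in mapping
        and frozenset((mapping[u], mapping[v])) in target
    )
    return conserved / len(source)


# Hypothetical toy networks and alignment, for illustration only.
source_edges = [("a", "b"), ("b", "c"), ("c", "d")]
target_edges = [("A", "B"), ("B", "C"), ("A", "D")]
mapping = {"a": "A", "b": "B", "c": "C", "d": "D"}

ec = edge_correctness(source_edges, target_edges, mapping)
print(f"EC = {ec:.3f}")
```

In this toy case two of the three source edges map onto target edges, so the score is 2/3. A higher EC indicates that the alignment preserves more of the source network's topology.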

Electronic supplementary material

The online version of this article (doi:10.1186/1752-0509-7-S2-S6) contains supplementary material, which is available to authorized users.

Authors: Qiang Huang, Ling-Yun Wu, Xiang-Sun Zhang

