Graphic analysis of population structure on genome-wide rheumatoid arthritis data

BMC Proceedings, 3:S110

First Online: 15 December 2009
DOI: 10.1186/1753-6561-3-S7-S110

Cite this article as: Zhang, J., Weng, C. & Niyogi, P. BMC Proc (2009) 3(Suppl 7): S110. doi:10.1186/1753-6561-3-S7-S110


Principal-component analysis (PCA) has been used for decades to summarize human genetic variation across geographic regions and to infer population migration history. Reducing spurious associations due to population structure is crucial for the success of disease association studies. Recently, PCA has also become a popular method for detecting population structure and correcting for population stratification in such studies. Inspired by manifold learning, we propose a novel method based on spectral graph theory. Regarding each study subject as a node, with suitably defined weights for its edges to close neighbors, one can form a weighted graph. We suggest using the spectrum of the associated graph Laplacian operator, namely the Laplacian eigenfunctions, instead of principal components (PCs) to infer population structure. For the genome-wide association data of the North American Rheumatoid Arthritis Consortium (NARAC), provided by Genetic Analysis Workshop 16, Laplacian eigenfunctions revealed more meaningful structure in the underlying population than PCA did. The proposed method is connected to PCA and naturally includes PCA as a special case. Our simple method is computationally fast and is suitable for disease studies at the genome-wide scale.
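The graph construction described above can be illustrated with a minimal sketch. The abstract does not specify the edge weights, the neighborhood rule, or the Laplacian normalization, so everything concrete here is an assumption made for illustration: a symmetric k-nearest-neighbor graph, heat-kernel weights with a median-distance bandwidth, and the symmetric normalized Laplacian. The function name `laplacian_eigenfunctions`, its parameters, and the simulated genotypes are hypothetical and are not taken from the paper.

```python
import numpy as np


def laplacian_eigenfunctions(genotypes, n_neighbors=5, n_components=2):
    """Leading nontrivial eigenpairs of a kNN graph Laplacian (illustrative).

    genotypes: (n_subjects, n_snps) array of allele counts (0, 1 or 2).
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)  # center each SNP column
    n = X.shape[0]

    # Pairwise squared Euclidean distances between subjects.
    sq = (X ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a subject is not its own neighbor

    # Assumed weighting: heat kernel on each subject's k nearest neighbors,
    # with the median squared distance as bandwidth.
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    sigma2 = np.median(d2[np.isfinite(d2)])
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    W = np.maximum(W, W.T)  # symmetrize the neighbor graph

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; skip the first (trivial) one.
    vals, vecs = np.linalg.eigh(L)
    return vals[1:n_components + 1], vecs[:, 1:n_components + 1]


# Simulated data (not NARAC): two groups with different allele frequencies.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=200)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, size=200), 0.05, 0.95)
pop_a = rng.binomial(2, freq_a, size=(30, 200))
pop_b = rng.binomial(2, freq_b, size=(30, 200))
genotypes = np.vstack([pop_a, pop_b])

vals, vecs = laplacian_eigenfunctions(genotypes)
print(vecs.shape)
```

Each column of `vecs` assigns one coordinate per subject and can be plotted or used as a covariate in the same way principal components are. The dense distance matrix and full eigendecomposition keep the sketch short; a genome-wide cohort would more plausibly use a sparse graph and a sparse eigensolver.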

List of abbreviations used

NARAC: North American Rheumatoid Arthritis Consortium
PCA: Principal component analysis
SNP: Single-nucleotide polymorphism


Authors: Jun Zhang, Chunhua Weng, Partha Niyogi

