Hereditas (Beijing) (遗传), 2006, Vol. 28, Issue 9: 1129-1134.

• Techniques and Methods •

Analysis of Diffuse Large B-cell Lymphoma Heterogeneity Based on Coupled Two-way Clustering

LI Li1, 2; LI Xia2~5; CHEN Yi-Han1; GUO Zheng2, 3; JIANG Wei3; ZHANG Rui-Jie3; RAO Shao-Qi3, 6

  1. Institute of Medical Genetics, Tongji University, Shanghai 200092, China; 2. College of Biological Science and Technology, Tongji University, Shanghai 200092, China; 3. Department of Bioinformatics, Harbin Medical University, Harbin 150086, China;
    4. Department of Computer Science, Harbin Institute of Technology, Harbin 150001, China; 5. Biomedical Engineering Institute of CUMS, Beijing 100054, China; 6. Departments of Cardiovascular Medicine and Molecular Cardiology, Cleveland Clinic Foundation, Cleveland, Ohio 44195, USA

  • Received: 2006-01-20 Revised: 2006-03-15 Online: 2006-09-01 Published: 2006-09-01
  • Contact: LI Xia

Abstract:

Microarray technology provides a powerful tool for studying disease heterogeneity. Most current methods rely on traditional hierarchical clustering and use a large number of the genes on the microarray as features to discover disease subtypes. However, they do not take into account that the many irrelevant (noise) genes among these features may mask meaningful partitions of, and correlations among, disease samples. To avoid this shortcoming, this paper presents a heterogeneity analysis method based on coupled two-way clustering (HCTWC), which searches for meaningful gene clusters (gene signatures) and uses them to find the natural partitions of disease samples. The method was applied to a diffuse large B-cell lymphoma (DLBCL) microarray dataset. Clustering the DLBCL samples on the identified gene clusters revealed two new DLBCL subtypes with survival rates of 55% and 25%, respectively (P<0.05). The results show that HCTWC has the potential to be a powerful tool for resolving disease heterogeneity from gene expression profiles.

Keywords: concept consistency, superparamagnetic clustering, disease heterogeneity, gene expression profile
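The idea of using a gene cluster, rather than all genes, as the feature set for partitioning samples can be illustrated with a small sketch. This is not the authors' HCTWC implementation: it performs a single coupled pass instead of the iterative procedure, it replaces superparamagnetic clustering with a simple correlation-threshold grouping, and the function names, the threshold of 0.8, and the toy expression matrix are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)


def correlated_groups(rows, threshold):
    """Group row indices into connected components of the graph that links
    two rows whenever their Pearson correlation exceeds `threshold`.
    (A stand-in for superparamagnetic clustering, chosen for brevity.)"""
    n = len(rows)
    label = [None] * n
    groups = []
    for start in range(n):
        if label[start] is not None:
            continue
        label[start] = len(groups)
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if label[j] is None and pearson(rows[i], rows[j]) > threshold:
                    label[j] = len(groups)
                    stack.append(j)
        groups.append(sorted(members))
    return groups


def split_samples(expr, gene_idx):
    """Split samples into two groups by the sign of their summed z-scored
    expression over the selected genes only."""
    score = [0.0] * len(expr[0])
    for g in gene_idx:
        row = expr[g]
        m, s = mean(row), pstdev(row) or 1.0
        for k, v in enumerate(row):
            score[k] += (v - m) / s
    return [0 if sc < 0 else 1 for sc in score]


def coupled_pass(expr, threshold=0.8, min_genes=2):
    """One coupled pass: find gene clusters, then partition the samples
    separately on each sufficiently large gene cluster."""
    result = []
    for genes in correlated_groups(expr, threshold):
        if len(genes) >= min_genes:
            result.append((genes, split_samples(expr, genes)))
    return result


# Toy matrix (hypothetical values): rows are genes, columns are samples.
toy = [
    [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],  # gene 0: low in samples 0-2, high in 3-5
    [0.8, 1.1, 1.0, 4.9, 5.3, 5.0],  # gene 1: same pattern
    [3.0, 1.0, 4.0, 2.0, 5.0, 1.5],  # gene 2: unrelated noise
    [2.0, 2.1, 1.9, 6.0, 6.2, 5.9],  # gene 3: same pattern
]

for genes, labels in coupled_pass(toy):
    print(genes, labels)  # prints: [0, 1, 3] [0, 0, 0, 1, 1, 1]
```

In this toy case the noise gene is left out of the feature set, so the two sample groups are recovered from the three co-expressed genes alone. In a real analysis, the resulting sample groups would then be compared on survival, which this sketch does not attempt.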

CLC number: