Prediction method for transcription factor binding sites integrating channel and spatial attention mechanisms

doi:10.16288/j.yczz.25-184

Hereditas (Beijing)

• Techniques and Methods •

Jihua Feng¹,², Zhongxing Chen¹,², Qilin Kang¹,², Longfei Li¹,², Jiahui Yang¹,², Yuting Zhang¹,²

1. Department of Information Engineering, School of Electrical and Information Engineering, Yunnan Minzu University, Kunming 650504, China

2. Yunnan Key Laboratory of Unmanned Autonomous System, Kunming 650504, China

Received: 2025-10-16  Revised: 2026-01-04  Online: 2026-01-13
Supported by:
National Natural Science Foundation of China (No. 31160234)

Abstract

Accurate identification of transcription factor binding sites (TFBSs) at single-nucleotide resolution is a central problem in deciphering gene expression regulatory networks. To improve on existing computational models for cross-cell-type TFBS prediction, we propose a deep learning model that integrates channel and spatial attention mechanisms. The model was trained and tested on 51 chromatin immunoprecipitation sequencing (ChIP-seq) datasets covering 10 core transcription factors (TFs; including CTCF, EGR1, and FOXA1) in 13 representative human cell lines (including A549, GM12878, and H1-hESC), together with 13 deoxyribonuclease I hypersensitive site sequencing (DNase-seq) datasets. Across the 23 TF-cell type combinations tested, the model achieved a mean area under the receiver operating characteristic curve (AUROC) of 0.986, with 91% of samples exceeding an AUROC of 0.970. The mean area under the precision-recall curve (AUPRC) was 0.169, more than 1,000-fold above the random-prediction baseline of 0.000156. Compared with representative models in the field, namely FactorNet, Leopard, and DeepGRN, our model achieved a higher mean AUROC on the 9 TF-cell type datasets shared among them. Visualization analyses indicated that the model accurately identifies cell-type-specific TF binding sites. These results suggest that the model offers an efficient computational tool for accurate cross-cell-type TFBS prediction and may support further study of gene expression regulatory mechanisms and the molecular basis of related diseases.

Key words:

transcription factor binding sites, attention mechanism, deep learning, single-nucleotide resolution, cross-cell prediction
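
The fold improvement quoted for AUPRC can be checked with simple arithmetic. The sketch below uses only the two figures stated in the abstract (0.169 and 0.000156). It also relies on a general property of precision-recall curves: a random ranker's expected AUPRC is approximately the fraction of positive positions. The function names are hypothetical and are not taken from the authors' code.

```python
# Figures quoted in the abstract; nothing here is recomputed from raw data.
MEAN_AUPRC = 0.169
RANDOM_BASELINE = 0.000156


def random_auprc_baseline(n_positive: int, n_total: int) -> float:
    """Expected AUPRC of a random ranker, approximated by class prevalence.

    Hypothetical helper for illustration only.
    """
    return n_positive / n_total


def fold_over_random(auprc: float, baseline: float) -> float:
    """How many times larger an observed AUPRC is than the random baseline."""
    return auprc / baseline


fold = fold_over_random(MEAN_AUPRC, RANDOM_BASELINE)
print(f"Fold improvement over random: about {fold:.0f}x")

# A baseline of 0.000156 corresponds to roughly one bound position
# in every 1 / 0.000156 positions, assuming baseline == prevalence.
positions_per_positive = 1 / RANDOM_BASELINE
print(f"Roughly 1 positive per {positions_per_positive:.0f} positions")
```

Under that assumption, the low absolute AUPRC reflects the extreme class imbalance of single-nucleotide labels rather than weak ranking, which is why the abstract reports it relative to the random baseline.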

Jihua Feng, Zhongxing Chen, Qilin Kang, Longfei Li, Jiahui Yang, Yuting Zhang. Prediction method for transcription factor binding sites integrating channel and spatial attention mechanisms[J]. Hereditas (Beijing), doi: 10.16288/j.yczz.25-184.

