Hereditas (Beijing), 2025, Vol. 47, Issue 3: 382-392. doi: 10.16288/j.yczz.24-213

• Technique and Method •

Enhancing single-cell classification accuracy using image conversion and deep learning

Bingxi Gao, Huaxuan Wu, Zhiqiang Du

  1. College of Animal Science and Technology, Yangtze University, Jingzhou 434025, China
  • Received: 2024-07-19 Revised: 2024-09-23 Published: 2025-03-20 Online: 2024-09-26
  • Corresponding author: Zhiqiang Du, PhD, professor; research area: animal genetics, breeding and reproduction. E-mail: zhqdu@yangtzeu.edu.cn
  • About the author: Bingxi Gao, master's student; specialty: animal genetics, breeding and reproduction. E-mail: gbx15020771250@163.com
  • Supported by:
    Joint Research on Improved Livestock and Poultry Breeds in Anhui Province (2021~2025)

Abstract:

Single-cell transcriptome sequencing (scRNA-seq) obtains transcript abundance data from single cells at high throughput, which can reveal cell types, subtype composition, specific gene markers and functional differences in depth, and it is widely used in animal and plant developmental biology and in the dissection of important traits. However, scRNA-seq data are often affected by high noise, high dimensionality and batch effects, which give rise to large numbers of lowly expressed genes and extensive variation and seriously compromise the accuracy and reliability of data analysis. This not only increases the complexity of data processing but also limits the effectiveness of feature selection and downstream analysis. Although a variety of statistical inference and machine learning methods have been applied to these challenges, existing methods still have limitations in cell type identification, feature selection and batch effect correction, and they struggle to meet the needs of complex biological research. In this study, we propose an innovative single-cell classification method, scIC (single-cell image classification), which converts scRNA-seq data into images and combines them with deep learning for cell classification. Image conversion allows complex patterns in the data to be captured more effectively, and classification models are then built with convolutional neural networks (CNN) and residual networks (ResNet). When tested on scRNA-seq data from four cell types (mouse skin basal cells, mouse lymphocytes, human neuronal cells, and mouse spinal cord cells), the classification models all achieved accuracies above 94%, and the ResNet50 model reached a classification accuracy of 99.8% on the mouse skin basal cell dataset. These results indicate that converting scRNA-seq data into images and combining them with deep learning can significantly improve classification accuracy, providing new ideas and an effective tool for addressing key challenges in single-cell data analysis. The code for this study is publicly available at: https://github.com/Bingxi-Gao/SCImageClassify.
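The abstract does not say how an expression profile is turned into an image, so the following is only a minimal sketch of one plausible conversion, not the scIC implementation. It assumes non-negative counts, a log transform, scaling by the cell's maximum, and row-major layout with zero padding. The function name `expression_to_image` and the `side` parameter are hypothetical.

```python
import math

import numpy as np


def expression_to_image(counts, side=None):
    """Map one cell's gene-expression vector to a square 8-bit grayscale image.

    Illustrative assumption only: the layout and scaling used by scIC may differ.
    """
    # log1p dampens the long right tail typical of count data.
    x = np.log1p(np.asarray(counts, dtype=np.float64))
    if side is None:
        # Smallest side length such that side * side >= number of genes.
        side = math.isqrt(max(x.size - 1, 0)) + 1
    if side * side < x.size:
        raise ValueError("side is too small for the number of genes")
    # Scale to 0..255 by the cell's own maximum; an all-zero cell stays black.
    peak = x.max() if x.size else 0.0
    scaled = x / peak * 255.0 if peak > 0 else np.zeros_like(x)
    # Zero-pad the tail so the vector fills the square, then fold it row by row.
    padded = np.zeros(side * side, dtype=np.uint8)
    padded[: x.size] = np.round(scaled).astype(np.uint8)
    return padded.reshape(side, side)


cell = np.array([0, 3, 0, 12, 1])  # toy counts for five genes
img = expression_to_image(cell)
print(img.shape, img.dtype)  # (3, 3) uint8
```

Images produced this way could then be stacked into a batch and passed to a CNN or ResNet classifier. Passing a fixed `side` for every cell keeps all images the same size, which such models generally require.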

Key words: single-cell sequencing, deep learning, image processing, cell classification