Multi-View Deep Learning for Multifunctional Enzyme Prediction Based on Fusion of Sequence and Multi-Label Embedding Information
First published: 2023-04-28
Abstract: Multifunctional enzymes contribute to the survival and evolution of organisms through different functions and forms, so understanding their functions is essential. Traditional machine learning methods are widely used for enzyme function classification, but most of them address only single-function enzymes, and the few existing multifunctional enzyme classifiers can predict only the first level of the Enzyme Commission (EC) number. To address these limitations, this paper proposes a multi-view deep learning method for multifunctional enzyme prediction that fuses sequence information with multi-label embedding information. In this method, a hybrid network consisting of a convolutional neural network (CNN) with an attention mechanism and a bidirectional long short-term memory network (BiLSTM) learns deep features from enzyme sequences. In parallel, an EC class correlation graph is constructed for the classification model at each level of the EC number, a graph convolutional network (GCN) embeds the correlated EC class labels, and the resulting label embeddings guide the feature learning process. Finally, a multi-label classifier predicts the classes of multifunctional enzymes. Experimental results show that the method achieves a subset accuracy of 75.75% at the fourth level of the EC number, with a Macro_F1 of 90.41%, and that it improves prediction performance to some degree over existing methods at all four EC levels.
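The label-embedding step described in the abstract can be illustrated with a small sketch. It assumes a binary co-occurrence graph over EC classes and a single graph-convolution layer with the symmetric normalization commonly used for GCNs. The abstract does not say how the correlation graph is weighted or how many layers are used, so the function names, the toy label matrix, and the weight matrix below are all hypothetical.

```python
import math

def cooccurrence_adjacency(label_rows):
    """Connect two labels if they appear together on at least one sample (assumed rule)."""
    n = len(label_rows[0])
    adj = [[0.0] * n for _ in range(n)]
    for row in label_rows:
        active = [j for j, v in enumerate(row) if v]
        for a in active:
            for b in active:
                if a != b:
                    adj[a][b] = 1.0
    return adj

def normalize_adjacency(adj):
    """Add self-loops, then apply D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(r) for r in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(norm_adj, features, weight):
    """One graph-convolution layer: ReLU(norm_adj @ features @ weight)."""
    out = matmul(matmul(norm_adj, features), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Toy data (hypothetical): 3 samples, 3 EC classes; classes 0 and 1 co-occur once.
labels = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
one_hot = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # initial label features
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stands in for a learned 3x2 matrix

norm_adj = normalize_adjacency(cooccurrence_adjacency(labels))
label_embeddings = gcn_layer(norm_adj, one_hot, weight)
```

In this toy case the two co-occurring classes end up with the same embedding, while the isolated class keeps its own. This is the sense in which a correlation graph lets related labels share information before the embeddings are combined with sequence features.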
Keywords: deep learning; multifunctional enzyme classification; multi-view learning; graph convolutional network; multi-label classification
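The two metrics quoted in the abstract can be made concrete with a short sketch. It assumes the usual multi-label definitions: subset accuracy counts a sample as correct only when its whole predicted label set matches the true set, and Macro_F1 is the unweighted mean of per-label F1 scores. The abstract does not spell out its exact formulas, and the toy label matrices below are hypothetical.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label vector matches the true one exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1; a label with no positives at all scores 0."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels

# Toy example (hypothetical): 4 enzymes, 3 candidate EC classes.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

acc = subset_accuracy(y_true, y_pred)  # third enzyme misses one label, so 3 of 4 match
f1 = macro_f1(y_true, y_pred)          # per-label F1 values are 1, 2/3, 1
```

Because one missed label makes the whole sample count as wrong, subset accuracy is typically the stricter of the two, which is consistent with the gap between the 75.75% and 90.41% figures reported above.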