Welcome to Resf5C-Pred!
What's in Resf5C-Pred

5-formylcytidine (f5C) is a post-transcriptional RNA modification found in mRNA and at the wobble site of tRNA. It plays a crucial role in mitochondrial protein synthesis and may contribute to the regulation of translation. Recent studies have shown that f5C modifications may drive mitochondrial mRNA translation to power cancer metastasis. Accurate identification of f5C sites is essential for further unraveling their molecular functions and regulatory mechanisms, but no computational methods are currently available for predicting their locations. In this study, we introduce an ensemble approach that enables the computational recognition of Saccharomyces cerevisiae f5C sites for the first time, using genome annotation from BSgenome.Scerevisiae.UCSC.sacCer3 (DOI: 10.18129/B9.bioc.BSgenome.Scerevisiae.UCSC.sacCer3). We conducted a comprehensive model selection process covering multiple basic machine learning algorithms and deep learning architectures, including recurrent neural networks, convolutional neural networks and Transformer-based models. When trained on sequence information alone, the individual models achieved AUROCs ranging from 0.71 to 0.74. After integrating 32 novel genomic features derived from f5C-related domain knowledge (transcripts), the AUROCs of the individual models improved to between 0.73 and 0.80. To further enhance prediction accuracy and robustness, we then constructed ensembles from different combinations of these individual models. The best ensemble reached an AUROC of 0.8391. For details of model construction, such as the Transformer and ResNet models, please visit my GitHub repository: https://github.com/Jiaming21/F5C-codes.git
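The idea of building ensembles from different combinations of individual models and comparing them by AUROC can be illustrated with a minimal sketch. This is not the code from the repository: it assumes that an ensemble simply averages the predicted probabilities of its member models (soft voting), which the description above does not specify. The model names, labels and scores are made-up toy values, and the helper functions `auroc`, `average_scores` and `best_ensemble` are hypothetical.

```python
from itertools import combinations


def auroc(labels, scores):
    """AUROC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_scores(score_lists):
    """Soft voting (an assumption): mean predicted probability per site."""
    return [sum(column) / len(column) for column in zip(*score_lists)]


def best_ensemble(labels, model_scores, min_size=2):
    """Try every combination of at least `min_size` models; keep the best AUROC."""
    best = None
    names = sorted(model_scores)
    for size in range(min_size, len(names) + 1):
        for combo in combinations(names, size):
            averaged = average_scores([model_scores[name] for name in combo])
            score = auroc(labels, averaged)
            # Strict comparison keeps the first (smallest) combination on ties.
            if best is None or score > best[1]:
                best = (combo, score)
    return best


# Toy data (illustrative only): 1 = f5C site, 0 = unmodified site.
labels = [1, 1, 1, 0, 0, 0]
model_scores = {
    "cnn": [0.9, 0.4, 0.7, 0.5, 0.2, 0.3],
    "rnn": [0.6, 0.8, 0.3, 0.4, 0.5, 0.1],
    "transformer": [0.7, 0.6, 0.5, 0.6, 0.3, 0.4],
}

combo, score = best_ensemble(labels, model_scores)
print("best combination:", combo, "AUROC:", round(score, 4))
```

In this toy example the models make errors on different sites, so averaging two of them ranks the sites better than either model alone. Exhaustive search over combinations is only practical for a small number of candidate models, and the selection should be done on held-out validation data rather than on the test set.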
