Review Article

Use of AI in Biological Databases

Prachi Srivastava, Ayushi Mishra, Shrijal Singh

Licence:

Attribution-Non-commercial 4.0 International (CC BY-NC 4.0)

This license enables reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator. 


International Journal of Neurology and Neurosurgery 17(2):p 88-101, May-August 2025. | DOI: 10.21088/ijnns.0975.0223.17225.4

How to Cite This Article:

Ayushi Mishra, Shrijal Singh, Prachi Srivastava. Use of AI in Biological Databases. International Journal of Neurology and Neurosurgery. 2025; 17(2): 88-101.

Timeline

Received: March 13, 2025; Accepted: May 08, 2025; Published: July 30, 2025

Abstract

Biological databases are fundamental to modern life sciences, serving as repositories for genomic sequences, protein structures, metabolic pathways, and more. With the increasing complexity and volume of biological data, artificial intelligence (AI) has become indispensable for data integration, analysis, and interpretation. This review explores the role of AI in transforming biological databases, highlighting key applications such as protein structure prediction, functional annotation, pandemic surveillance, and antibiotic resistance analysis. AI-driven tools like AlphaFold have revolutionized protein structure determination, while UniProt leverages machine learning for automated protein function annotation. In genomic research, AI enhances text mining of PubMed through PubTator, helping to extract relevant biological information. AI also facilitates pathway prediction in KEGG, improves virus tracking in GISAID, and strengthens protein-protein interaction prediction in STRING. Furthermore, AI has advanced the prediction of CRISPR-Cas9 off-target effects, supporting safer genome editing, and has been pivotal in analyzing antibiotic resistance through the CARD database. Despite challenges such as data heterogeneity and model interpretability, AI continues to expand the capabilities of biological databases, accelerating research, innovation, and applications in medicine, drug discovery, and personalized healthcare.



Data Sharing Statement

There are no additional data available. All raw data and code are available upon request.

Funding

This research received no funding.

Author Contributions

All authors contributed significantly to the work and approve its publication.

Ethics Declaration

This article does not involve any human or animal subjects, and therefore does not require ethics approval.

Acknowledgements

We would like to express our gratitude to the patients, their families, and all those who have contributed to this study.

Conflicts of Interest

No conflicts of interest.


Keywords

Artificial Intelligence; Biological Databases; CRISPR-Cas9; KEGG; STRING
