Leveraging Machine Learning to Outsmart the Pharmaceutical Industry: A Pathway to Affordable Drugs

Authors: Amina Khalpey, PhD, Ezekiel Mendoza, BS, Brynne Rozell, BS, Zain Khalpey, MD, PhD, FACS

The high cost of pharmaceutical drugs has become a major barrier to treatment for patients and healthcare systems worldwide. The pharmaceutical industry’s complex pricing structures and barriers to entry have limited access to essential medications for many people. Machine learning (ML), a branch of artificial intelligence, presents an opportunity to transform drug discovery and development, reducing the cost of drugs and ultimately making them more accessible. This white paper explores how ML algorithms can outsmart the pharmaceutical industry behemoth by providing high-quality, cheaper drugs for patients. We discuss several key applications of ML in drug discovery, optimization, and production, supported by real-world examples and cutting-edge research.

Climbing Drug Costs

The cost of life-saving prescription drugs has been a persistent and growing concern for patients, healthcare providers, and policymakers alike. With limited access to affordable medications, the health and well-being of millions of people are at stake. Today, the estimated cost to bring a new drug to market ranges from $314 million to $2.8 billion (Wouters et al., 2020). Medication prices are largely set by the trillion-dollar pharmaceutical industry, and they are driven by a range of factors, including research and development (R&D) costs, regulatory barriers, monopolistic practices, and complex pricing structures (Kesselheim et al., 2016).

Machine learning has the potential to reshape the pharmaceutical industry by streamlining drug discovery, development, and manufacturing, thereby reducing costs and ultimately making drugs more affordable for patients. ML algorithms can analyze volumes of data too large for manual review, identify patterns, and generate predictions, opening up new avenues for drug discovery and optimization (Vamathevan et al., 2019).

Machine Learning in Drug Discovery and Development

Traditional drug discovery relies on labor-intensive, costly, and time-consuming processes. It involves a series of steps, including target identification, target validation, hit identification, lead optimization, and pre-clinical testing. At each step, researchers look for compounds that interact with the target in a way that could be useful for treating a disease. Once a promising compound is identified, it undergoes further testing and refinement before it can be approved for use in humans. ML can streamline these processes by facilitating target identification, lead compound discovery, and predictive modeling (Chen et al., 2018).

Target Identification:

ML can help identify potential drug targets by analyzing large-scale datasets, including genomic, transcriptomic, proteomic, and metabolomic data (Mamoshina et al., 2018). As ML continues to evolve, it will analyze these datasets more efficiently and aid in the development of personalized medicine (Zhu, 2020). For example, the KDEEP model used 3D convolutional neural networks to predict protein-ligand binding affinities, helping researchers assess how strongly a candidate molecule binds its target with greater accuracy and efficiency (Jiménez-Luna et al., 2020). These types of ML strategies will enable potential drug targets to be vetted more efficiently than ever before.

Lead Compound Discovery:

ML algorithms can also sift through vast chemical libraries and identify potential lead compounds based on their molecular structures, biological activities, and other properties (Stokes et al., 2020). For instance, the AtomNet system employed deep learning to predict bioactive molecules, leading to the discovery of a novel antiviral compound with potential applications in treating Ebola (Wallach et al., 2015). These examples highlight how ML algorithms are able to identify patterns and correlations between a compound’s properties and its potential as a drug candidate, allowing researchers to find promising compounds more quickly. Additionally, machine learning can be used to identify novel compounds that may not have been considered previously and to predict the activity of compounds much faster than traditional methods. This saves time and resources at the earliest stages of discovery.
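The library-screening idea can be illustrated with a minimal sketch. It is not the method used by AtomNet or by Stokes et al.; it swaps in a much simpler technique, ranking a library by Tanimoto similarity to a known active compound, which is a common baseline in virtual screening. The fingerprints, compound names, and function names below are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modeled as the set of "on" bit positions.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: shared bits divided by total distinct bits."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return len(a & b) / union


def rank_library(query: Fingerprint,
                 library: Dict[str, Fingerprint]) -> List[Tuple[str, float]]:
    """Score every library compound against the query, most similar first."""
    scores = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints (made-up bit positions, not real molecules).
known_active = frozenset({1, 4, 7, 9, 12})
library = {
    "cmpd_A": frozenset({1, 4, 7, 9, 13}),   # shares 4 of 6 distinct bits
    "cmpd_B": frozenset({2, 3, 5}),          # shares no bits
    "cmpd_C": frozenset({1, 4, 7, 9, 12}),   # identical to the query
}

ranking = rank_library(known_active, library)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice the fingerprints would come from a cheminformatics toolkit and the similarity ranking would typically be replaced or complemented by a trained model, but the workflow of scoring every compound and testing only the top-ranked ones is the same.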

Drug Optimization:

ML can also aid in drug optimization by predicting the pharmacokinetic, pharmacodynamic, and toxicological properties of drug candidates (Gupta et al., 2018). This can help researchers design drugs with improved efficacy, safety, and bioavailability. With these advancements, patients stand not only to save money but also to receive drugs that are safer and more effective.

ADMET Prediction:

ML can predict absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of drug candidates, enabling researchers to optimize their drug design and minimize potential side effects (Wu et al., 2018). For example, the DeepTox project used deep learning to predict the toxicity of compounds with high accuracy, surpassing traditional toxicology prediction methods (Goh et al., 2017). By identifying potential safety concerns early in the drug development process, ML can help save time and resources.
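As a rough illustration of how such a property predictor works, the sketch below trains a tiny logistic regression classifier with plain gradient descent. It is not DeepTox or any published model. The two input descriptors (imagined here as standardized lipophilicity and molecular weight), the toxicity labels, and all function names are hypothetical, and the data are synthetic values invented for the example.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   lr: float = 0.5,
                   epochs: int = 200) -> Tuple[List[float], float]:
    """Fit logistic regression by batch gradient descent on the log loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y          # derivative of log loss w.r.t. z
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_label(weights: Sequence[float], bias: float,
                  x: Sequence[float]) -> int:
    """Return 1 (flag as potentially toxic) if the predicted probability exceeds 0.5."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) > 0.5 else 0


# Synthetic, standardized descriptors (hypothetical values, not measured data).
train_x = [(-2.0, -1.0), (-1.5, -0.5), (-1.0, -1.5),
           (1.0, 1.5), (1.5, 0.5), (2.0, 1.0)]
train_y = [0, 0, 0, 1, 1, 1]   # 1 = flagged toxic in an imagined assay

weights, bias = train_logistic(train_x, train_y)
print(predict_label(weights, bias, (1.2, 0.8)))    # new candidate to screen
```

Real ADMET models use many more descriptors, far larger datasets, and held-out validation, but the principle is the same: candidates predicted to be unsafe can be deprioritized before any costly laboratory work.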

Drug Repurposing:

ML can also be used to identify novel therapeutic applications for existing drugs, which can significantly reduce the cost and time associated with traditional drug development (Aliper et al., 2016). For instance, the Connectivity Map (CMap) project used pattern-matching algorithms to compare the gene expression profiles produced by various drugs, leading to the identification of new therapeutic uses for well-known medications (Lamb et al., 2006).
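The signature comparison behind this kind of repurposing can be sketched in a few lines. The example below ranks drugs by how strongly their expression signature opposes a disease signature, on the assumption that a drug which reverses the disease pattern is a repurposing candidate. It uses Pearson correlation as a simple stand-in; the Connectivity Map itself uses a rank-based enrichment statistic. Gene names, drug names, and expression values are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Signature = Dict[str, float]   # gene name -> expression change


def pearson(a: Signature, b: Signature) -> float:
    """Pearson correlation over the genes the two signatures share."""
    genes = sorted(set(a) & set(b))
    xs = [a[g] for g in genes]
    ys = [b[g] for g in genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def rank_by_reversal(disease: Signature,
                     drugs: Dict[str, Signature]) -> List[Tuple[str, float]]:
    """Most negative correlation first: those drugs best oppose the disease pattern."""
    scores = [(name, pearson(disease, sig)) for name, sig in drugs.items()]
    return sorted(scores, key=lambda item: item[1])


# Hypothetical signatures (made-up values for illustration only).
disease = {"G1": 2.0, "G2": 1.5, "G3": -1.0, "G4": -2.0}
drugs = {
    "drug_X": {"G1": -1.8, "G2": -1.2, "G3": 0.9, "G4": 2.1},   # opposes disease
    "drug_Y": {"G1": 1.9, "G2": 1.4, "G3": -1.1, "G4": -1.9},   # mimics disease
    "drug_Z": {"G1": 0.1, "G2": -0.2, "G3": 0.2, "G4": -0.1},   # little relation
}

ranking = rank_by_reversal(disease, drugs)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

A strongly negative score is only a hypothesis generator. Any candidate surfaced this way still needs experimental and clinical validation.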

Machine Learning Optimizes Drug Production:

ML can optimize drug manufacturing processes, resulting in reduced costs and increased efficiency. ML algorithms can analyze process variables, such as temperature, pressure, and reaction times, and predict optimal conditions for the synthesis of active pharmaceutical ingredients (APIs) and drug formulations (Schork, 2019). For example, the Reaxys database uses ML to predict optimal synthetic routes for organic compounds, facilitating more efficient drug production (Gaulton et al., 2017).

Quality Control and Assurance:

ML can enhance quality control and assurance by automating the detection of defects and predicting process deviations. For example, ML algorithms have been used to analyze images of drug tablets to identify defects, such as cracks or irregularities, with high accuracy (Kästner et al., 2019). Additionally, ML can predict potential manufacturing issues based on historical data, enabling companies to proactively address problems and maintain quality standards (Rathore et al., 2018). By supporting and streamlining quality control, machine learning can reduce labor and costs, which translates into both savings and greater safety for patients.
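Using historical data to catch process deviations can be illustrated with a very simple statistical baseline. The sketch below is not the approach of Rathore et al. or any specific vendor system; it learns control limits (mean plus or minus three standard deviations) from past in-specification batches and flags new readings that fall outside them. The readings, the variable being monitored, and the function names are hypothetical.

```python
import statistics
from typing import List, Sequence, Tuple


def control_limits(history: Sequence[float],
                   n_sigma: float = 3.0) -> Tuple[float, float]:
    """Derive lower and upper limits from historical in-specification readings."""
    mean = statistics.mean(history)
    spread = statistics.stdev(history)      # sample standard deviation
    return mean - n_sigma * spread, mean + n_sigma * spread


def flag_deviations(readings: Sequence[float],
                    limits: Tuple[float, float]) -> List[int]:
    """Return the indices of readings that fall outside the control limits."""
    low, high = limits
    return [i for i, value in enumerate(readings) if value < low or value > high]


# Hypothetical reactor temperatures (degrees Celsius) from past good batches.
history = [50.1, 49.8, 50.3, 49.9, 50.0, 50.2, 49.7, 50.0]
limits = control_limits(history)

# Hypothetical readings from a batch in progress.
new_readings = [50.1, 50.9, 49.5, 49.2]
flagged = flag_deviations(new_readings, limits)
print(f"limits: {limits[0]:.2f} to {limits[1]:.2f}, flagged indices: {flagged}")
```

An ML-based monitor would typically combine many process variables and learn more complex patterns, but the goal is the same: raise an alert early enough that a batch can be corrected rather than discarded.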

Real-World Success Stories:

AI-Driven Drug Discovery:

Insilico Medicine, a biotechnology company specializing in artificial intelligence and ML, developed a deep learning-based system called GENTRL to design novel drug candidates. In a proof-of-concept study, GENTRL successfully designed novel inhibitors for the discoidin domain receptor 1 (DDR1), a potential target for the treatment of idiopathic pulmonary fibrosis. The entire process, from target identification to lead compound optimization, took only 46 days, demonstrating the efficiency and cost-effectiveness of ML-driven drug discovery (Zhavoronkov et al., 2019).

ML-Assisted Drug Repurposing:

BenevolentAI, an AI-driven pharmaceutical company, used ML algorithms to identify a potential therapeutic application for the existing drug Baricitinib in treating COVID-19. Baricitinib, originally developed to treat rheumatoid arthritis, was predicted to inhibit the inflammatory response caused by the virus. Subsequent clinical trials confirmed the efficacy of Baricitinib in reducing the time to recovery for COVID-19 patients (Richardson et al., 2021). This example highlights the potential of ML in drug repurposing, providing faster and more cost-effective solutions for patients.

Paving the Pathway to More Affordable Drugs

The AARP reported that in 2019, more than 3.5 million Americans aged 65 and over struggled to pay for their prescription medications (Bunis, 2022). Access to affordable medications ensures that patients can get the treatments they need to stay healthy and reduces the costly burden of long-term illness. Applying machine learning to drug discovery, development, and production has the potential to reduce the cost of drugs and make them more accessible to patients worldwide. By streamlining target identification, personalization, lead compound discovery, drug optimization, and manufacturing, ML can outsmart the pharmaceutical industry and provide cheaper, safer drugs for patients. The real-world examples highlighted in this white paper demonstrate the feasibility and effectiveness of ML-driven drug discovery and repurposing efforts. As machine learning technology continues to advance, its impact on the pharmaceutical industry will become increasingly significant, offering new hope for affordable and accessible healthcare to patients worldwide.


Acknowledgments

We acknowledge the contributions of researchers and developers in the field of machine learning and pharmaceuticals whose work has laid the foundation for this white paper. We are grateful for their dedication to advancing the application of machine learning to drug discovery, development, and production, ultimately making drugs more affordable and accessible for patients worldwide.


Disclaimer

The information presented in this white paper is for informational purposes only and should not be construed as professional advice or endorsement of any specific technology, company, or product. The authors do not accept responsibility for any errors or omissions and do not guarantee the accuracy or completeness of the information provided. The authors shall not be held liable for any damages or losses that may result from the use of the information contained herein.


References

Aliper, A., Plis, S., Artemov, A., Ulloa, A., Mamoshina, P., & Zhavoronkov, A. (2016). Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Molecular Pharmaceutics, 13(7), 2524-2530.

Chen, H., Engkvist, O., Wang, Y., Olivecrona, M., & Blaschke, T. (2018). The rise of deep learning in drug discovery. Drug Discovery Today, 23(6), 1241-1250.

Gaulton, A., Hersey, A., Nowotka, M., Bento, A. P., Chambers, J., Mendez, D., … & Overington, J. P. (2017). The ChEMBL database in 2017. Nucleic Acids Research, 45(D1), D945-D954.

Goh, G. B., Hodas, N. O., & Vishnu, A. (2017). Deep learning for computational chemistry. Journal of Computational Chemistry, 38(16), 1291-1307.

Gupta, A., Müller, A. T., Huisman, B. J. H., Fuchs, J. A., Schneider, P., & Schneider, G. (2018). Generative recurrent networks for de novo drug design. Molecular Informatics, 37(1-2), 1700111.

Hughes, J. P., Rees, S., Kalindjian, S. B., & Philpott, K. L. (2011). Principles of early drug discovery. British Journal of Pharmacology, 162(6), 1239-1249.

Jiménez-Luna, J., Skalic, M., Martinez-Rosell, G., & De Fabritiis, G. (2020). KDEEP: Protein–Ligand Absolute Binding Affinity Prediction via 3D-Convolutional Neural Networks. Journal of Chemical Information and Modeling, 58(2), 287-296.

Kästner, J., Thoma, M., & Müssigmann, J. (2019). Machine learning based analysis of 3D surface structures for defect classification in automated visual inspection of drug tablets. International Journal of Pharmaceutics, 561, 102-111.

Kesselheim, A. S., Avorn, J., & Sarpatwari, A. (2016). The high cost of prescription drugs in the United States: Origins and prospects for reform. JAMA, 316(8), 858-871.

Lamb, J., Crawford, E. D., Peck, D., Modell, J. W., Blat, I. C., Wrobel, M. J., … & Golub, T. R. (2006). The Connectivity Map: Using gene-expression signatures to connect small molecules, genes, and disease. Science, 313(5795), 1929-1935.

Mamoshina, P., Vieira, A., Putin, E., & Zhavoronkov, A. (2018). Applications of deep learning in biomedicine. Molecular Pharmaceutics, 13(5), 1445-1454.

Rathore, A. S., Sharma, A., & Pathak, M. (2018). Process analytics for biologics manufacturing. Journal of Chemical Technology and Biotechnology, 93(9), 2447-2458.

Richardson, P., Griffin, I., Tucker, C., Smith, D., Oechsle, O., Phelan, A., … & Stebbing, J. (2021). Baricitinib as potential treatment for 2019-nCoV acute respiratory disease. The Lancet, 395(10223), e30-e31.

Schork, N. J. (2019). Artificial intelligence and personalized medicine. In Personalized medicine (pp. 165-192). Academic Press.

Stokes, J. M., Yang, K., Swanson, K., Jin, W., Cubillos-Ruiz, A., Donghia, N. M., … & Collins, J. J. (2020). A deep learning approach to antibiotic discovery. Cell, 180(4), 688-702.

Vamathevan, J., Clark, D., Czodrowski, P., Dunham, I., Ferran, E., Lee, G., … & Bender, A. (2019). Applications of machine learning in drug discovery and development. Nature Reviews Drug Discovery, 18(6), 463-477.

Wallach, I., Dzamba, M., & Heifets, A. (2015). AtomNet: A deep convolutional neural network for bioactivity prediction in structure-based drug discovery. arXiv preprint arXiv:1510.02855.

Wu, Z., Ramsundar, B., Feinberg, E. N., Gomes, J., Geniesse, C., Pappu, A. S., … & Pande, V. (2018). MoleculeNet: A benchmark for molecular machine learning. Chemical Science, 9(2), 513-530.

Zhavoronkov, A., Ivanenkov, Y. A., Aliper, A., Veselov, M. S., Aladinskiy, V. A., Aladinskaya, A. V., … & Vanhaelen, Q. (2019). Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nature Biotechnology, 37(9), 1038-1040.

Zhu, H. (2020). Big data and artificial intelligence modeling for drug discovery. Annual Review of Pharmacology and Toxicology, 60, 573-589.

Wouters, O. J., McKee, M., & Luyten, J. (2020). Estimated research and development investment needed to bring a new medicine to market, 2009-2018. JAMA, 323(9), 844-853.

Bunis, D. (2022, January 19). Millions of older Americans can’t afford prescriptions. AARP. https://www.aarp.org/health/medicare-insurance/info-2022/drug-costs-survey.html