Drug discovery and development pipelines are long and complex, and they depend on several factors, including FDA trials. Recently, artificial intelligence has moved from largely theoretical studies to real-world applications. The pharmaceutical industry currently faces challenges in sustaining its drug development programs because of increased R&D costs and reduced efficiency. There is a critical need for time- and cost-efficient strategies to analyze and interpret drug development data, both to advance the prediction of human drug development outcomes and to predict whether new drugs will pass FDA trials. In this study, we attempt to accomplish four tasks: (1) create a reliable dependent variable to categorize drugs with minimal noise, (2) link this dependent variable to predictor variables, (3) use a boosted tree model with the principal component method to develop an algorithm to predict FDA trial outcomes, and (4) develop a design matrix of regressor variables for 3500 approved and investigational drugs, built with DrugBank 5.1.8 Drug Targets and Drug Categories data as well as ATC codes from both the DrugBank and ChEMBL databases. Additionally, intensive computer simulations (1) show a 91% prediction success rate over a wide range of drug categories, (2) provide new insights into predicting the success of drug development, and (3) present data that can save time and resources and support decision-making in companies' future investigations, potentially aiding clinical trials and investment.