Publications: 2020

Zorn KM, Foil DH, Lane TR, Hillwalker W, Feifarek DJ, Jones F, Klaren WD, Brinkman AM, Ekins S. 2020. Comparing machine learning models for aromatase (P450 19A1). Environ Sci Technol 54(23):15546–15555, PMID: 33207874.


Aromatase, or cytochrome P450 19A1, catalyzes the aromatization of androgens to estrogens within the body. Changes in the activity of this enzyme can produce hormonal imbalances that are detrimental to sexual and skeletal development. The enzyme can be inhibited by drugs and natural products as well as by environmental chemicals. Therefore, predicting potential endocrine disruption by exogenous chemicals requires that aromatase inhibition be considered in addition to androgen and estrogen pathway interference. Bayesian machine learning methods can be used for prospective prediction from molecular structure without the need for experimental data. Herein, the generation and evaluation of multiple machine learning models utilizing different sources of aromatase inhibition data are described. These models are applied to two test sets for external validation, using public-domain molecules relevant to drug discovery. In addition, the performance of multiple machine learning algorithms is evaluated by comparing internal five-fold cross-validation statistics on the training data. These methods to predict aromatase inhibition from molecular structure, when used in concert with estrogen and androgen machine learning models, allow for a more holistic assessment of the endocrine-disrupting potential of chemicals with limited empirical data and help reduce the use of hazardous substances.
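
The internal five-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic bit vectors standing in for molecular fingerprints, the classifier is a simple Bernoulli naive Bayes written from scratch as one example of a Bayesian method, and all function names (`k_fold_indices`, `train_bernoulli_nb`, `predict`, `cross_validate`, `make_toy_data`) are hypothetical.

```python
import math
import random


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_bernoulli_nb(X, y):
    """Fit Bernoulli naive Bayes on binary features with Laplace smoothing."""
    n_bits = len(X[0])
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        log_prior = math.log((len(rows) + 1) / (len(X) + 2))
        # Smoothed P(bit_j = 1 | label); never exactly 0 or 1.
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_bits)]
        model[label] = (log_prior, probs)
    return model


def predict(model, x):
    """Return the label (0 or 1) with the higher log posterior score."""
    scores = {}
    for label, (log_prior, probs) in model.items():
        score = log_prior
        for bit, p in zip(x, probs):
            score += math.log(p if bit else 1.0 - p)
        scores[label] = score
    return 1 if scores[1] > scores[0] else 0


def cross_validate(X, y, k=5, seed=0):
    """Return the held-out accuracy for each of the k folds."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        model = train_bernoulli_nb(train_X, train_y)
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


def make_toy_data(n=200, n_bits=16, seed=1):
    """Synthetic stand-in for fingerprints: bits 0-3 track the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = "inhibitor" (assumed toy label)
        row = []
        for j in range(n_bits):
            if j < 4:
                p_on = 0.8 if label == 1 else 0.2  # informative bits
            else:
                p_on = 0.5  # pure noise bits
            row.append(int(rng.random() < p_on))
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    accs = cross_validate(X, y, k=5)
    print("fold accuracies:", [round(a, 3) for a in accs])
    print("mean accuracy:", round(sum(accs) / len(accs), 3))
```

To compare several algorithms in the spirit of the abstract, the same folds would be reused for each model so that the cross-validation statistics are computed on identical splits.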