Development of Machine Learning Models for Accurately Predicting and Ranking the Activity of Lead Molecules to Inhibit PRC2 Dependent Cancer (open access)

Authors
Danishuddin; Kumar, Vikas; Parate, Shraddha; Bahuguna, Ashutosh; Lee, Gihwan; Kim, Myeong Ok; Lee, Keun Woo
Issue Date
Jul-2021
Publisher
MDPI
Keywords
cancer; epigenetic; PRC2; machine learning; multi-class models
Citation
PHARMACEUTICALS, v.14, no.7
Indexed
SCIE
SCOPUS
Journal Title
PHARMACEUTICALS
Volume
14
Number
7
URI
https://scholarworks.bwise.kr/gnu/handle/sw.gnu/3515
DOI
10.3390/ph14070699
ISSN
1424-8247
Abstract
Disruption of epigenetic processes to eradicate tumor cells is among the most promising interventions for cancer control. EZH2 (enhancer of zeste homolog 2), a catalytic component of polycomb repressive complex 2 (PRC2), methylates lysine 27 of histone H3 to promote transcriptional silencing and is an important drug target for controlling cancer via epigenetic processes. In the present study, we developed predictive models of the inhibitory activity of compounds against EZH2. Binary and multiclass models were built using support vector machine (SVM), random forest and XGBoost methods. Rigorous validation approaches, including the predictiveness curve, Y-randomization and applicability domain (AD) analysis, were employed to evaluate the developed models. Eighteen descriptors selected by the Boruta method were used for modeling. For binary classification, random forest and XGBoost achieved accuracies of 0.80 and 0.82, respectively, on the external test set. In contrast, for the multiclass models, random forest and XGBoost achieved accuracies of 0.73 and 0.75, respectively. Five hundred Y-randomization runs demonstrated that the models were robust and that the correlations did not arise by chance. Evaluation metrics from the predictiveness curve show that the eighteen selected descriptors predict active compounds with a total gain (TG) of 0.79 and 0.59 for XGBoost and random forest, respectively. The validated models were then used for virtual screening and molecular docking in search of potential hits. A total of 221 compounds were commonly predicted as active, with probabilities above the set threshold and within the AD of the training set. Molecular docking revealed that three compounds have reasonable binding energies and favorable interactions with critical residues in the active site of EZH2. In conclusion, we highlight the potential of rigorously validated models for accurately predicting and ranking the activities of lead molecules against cancer epigenetic targets. The models presented in this study provide a platform for the development of EZH2 inhibitors.
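The Y-randomization check mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the nearest-centroid classifier is a deliberately simple stand-in for the SVM, random forest and XGBoost models, the 18-feature data are synthetic, and every function name below is hypothetical. The idea shown is only the general one: refit the model many times on shuffled activity labels and compare those accuracies with the accuracy obtained from the true labels.

```python
import random

def fit_centroids(X, y):
    """Fit a nearest-centroid classifier (stand-in for SVM/RF/XGBoost)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict(centroids, X):
    """Assign each row to the class whose centroid is closest."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda c: dist2(centroids[c], row)) for row in X]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def y_randomization(X_train, y_train, X_test, y_test, n_runs=500, seed=0):
    """Compare true-label accuracy with accuracies from shuffled labels."""
    rng = random.Random(seed)
    true_acc = accuracy(y_test, predict(fit_centroids(X_train, y_train), X_test))
    shuffled_accs = []
    for _ in range(n_runs):
        y_perm = list(y_train)
        rng.shuffle(y_perm)  # break the descriptor/activity relationship
        model = fit_centroids(X_train, y_perm)
        shuffled_accs.append(accuracy(y_test, predict(model, X_test)))
    # Empirical p-value: share of shuffled runs that match or beat the true model.
    p_value = (1 + sum(a >= true_acc for a in shuffled_accs)) / (n_runs + 1)
    return true_acc, shuffled_accs, p_value

def make_toy_data(n_per_class, n_features=18, shift=1.0, seed=1):
    """Synthetic data: inactives (0) around 0, actives (1) around `shift`."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y

# Toy training set and external test set (assumed sizes, not from the paper).
X_train, y_train = make_toy_data(50, seed=1)
X_test, y_test = make_toy_data(50, seed=2)
true_acc, shuffled_accs, p_value = y_randomization(X_train, y_train, X_test, y_test)
mean_shuffled = sum(shuffled_accs) / len(shuffled_accs)
print(f"true={true_acc:.2f} shuffled mean={mean_shuffled:.2f} p={p_value:.4f}")
```

If the true-label accuracy sits well above the accuracies obtained from shuffled labels, the model is unlikely to reflect a chance correlation. With a real descriptor matrix, the stand-in classifier would be replaced by whichever model is being validated.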

Related Researcher

Kim, Myeong Ok
Graduate School (Division of Applied Life Science)