Seminars 2017-1
Seminar 1: Quo vadis, QSAR?
Date: 5 April 2017
Venue: ICB I Amphitheater, 14:00
Speaker: Prof. Dr. - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC 27599, USA. - Bioinformatics Area.
Abstract
Quantitative Structure-Activity Relationship (QSAR) modeling is one of the major computational tools employed in medicinal chemistry. However, throughout its history it has drawn both praise and criticism concerning its reliability, limitations, successes, and failures. This presentation will be devoted to the current trends, unsolved problems, and pressing challenges of QSAR modeling, as well as several novel and emerging applications. We will provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive QSAR models [1]. These guidelines are centered on the following elements of the predictive QSAR workflow: data collection and curation; model building and rigorous external validation; and model exploitation.
Data curation. Careful curation of input data is critical for the success of any cheminformatics analysis, including QSAR modeling. We will describe the workflows for both chemical and biological data curation developed in our lab [2]. Treatment of chemical data includes the removal of inorganics, organometallics, counterions, and mixtures; structural cleaning (e.g., detection of valence violations); ring aromatization; normalization of specific chemotypes; standardization of tautomeric forms; deletion of duplicates; and manual checking of complex cases. Biological data curation includes detection and verification of activity cliffs, analysis of experimental variability, calculation and tuning of the dataset modelability index, consensus QSAR predictions, and identification and correction of misannotated compounds.
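As a rough illustration of two of the curation steps listed above (counterion removal and duplicate handling), the following minimal Python sketch may help. It is not the lab's workflow: it assumes structures are already given as canonical SMILES-like strings, it approximates counterion removal by keeping the longest dot-separated fragment, and the names `largest_fragment` and `curate` are hypothetical. A real pipeline would rely on a cheminformatics toolkit for standardization.

```python
from collections import defaultdict


def largest_fragment(smiles: str) -> str:
    """Keep the longest dot-separated fragment.

    Assumption: string length is used as a crude proxy for fragment size,
    standing in for proper counterion and mixture handling.
    """
    return max(smiles.split("."), key=len)


def curate(records):
    """Merge duplicate structures and flag contradictory annotations.

    records: iterable of (smiles, activity_label) pairs.
    Returns (kept, conflicts):
      kept      maps each structure to its label when all duplicates agree;
      conflicts lists structures whose duplicates carry different labels
                and therefore need manual checking.
    """
    labels = defaultdict(set)
    for smiles, label in records:
        labels[largest_fragment(smiles)].add(label)

    kept = {s: next(iter(ls)) for s, ls in labels.items() if len(ls) == 1}
    conflicts = sorted(s for s, ls in labels.items() if len(ls) > 1)
    return kept, conflicts


if __name__ == "__main__":
    demo = [("CCO.Cl", 1), ("CCO", 1), ("c1ccccc1", 0), ("c1ccccc1.[Na+]", 1)]
    kept, conflicts = curate(demo)
    print("kept:", kept)
    print("needs manual check:", conflicts)
```

In this toy input, the two ethanol records collapse into one entry, while the two benzene records disagree on activity and are set aside for manual review rather than silently merged.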
We will also introduce the concept of the "MODelability Index" (MODI) [3], which was proposed not only as a quantitative tool to quickly estimate whether predictive QSAR model(s) can be obtained for a given binary dataset, but also as an attempt to answer the following questions: (i) how the number of activity cliffs in a given dataset correlates with the overall prediction performance of QSAR models for this dataset; (ii) whether such correlation is conserved across different datasets; (iii) whether one could use the fraction of activity cliffs in a dataset to assess the overall possibility of success or failure for QSAR modeling; (iv) why some datasets are modelable whereas others are not; and (v) how (and whether it is possible at all) to find the subset of compounds in an overall non-modelable dataset for which local QSAR models can be obtained.
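To make the idea concrete, MODI for a classification dataset can be computed as the average, over classes, of the fraction of compounds whose nearest neighbor in descriptor space belongs to the same class. The sketch below is a minimal illustration under stated assumptions: Euclidean distance on numeric descriptor vectors, a brute-force neighbor search, and a hypothetical function name `modi`. It is not the reference implementation from [3].

```python
import math


def modi(X, y):
    """Modelability index: mean over classes of the fraction of compounds
    whose first nearest neighbor (Euclidean distance) shares their class.

    X: list of descriptor vectors (sequences of numbers), at least two rows.
    y: list of class labels, one per row of X.
    """
    same = {}   # per class: compounds whose nearest neighbor has the same class
    total = {}  # per class: number of compounds
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Brute-force nearest neighbor, excluding the compound itself.
        nn = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(xi, X[j]),
        )
        total[yi] = total.get(yi, 0) + 1
        same[yi] = same.get(yi, 0) + (1 if y[nn] == yi else 0)
    return sum(same[c] / total[c] for c in total) / len(total)


if __name__ == "__main__":
    # Toy data: the last inactive compound sits next to the actives,
    # which is the kind of situation an activity cliff produces.
    X = [(0, 0), (0, 1), (5, 5), (5, 6), (3, 5)]
    y = [0, 0, 1, 1, 0]
    print(round(modi(X, y), 3))
```

Each compound whose nearest neighbor has the opposite class lowers the index, which is how the measure ties the fraction of activity cliffs to the expected performance of QSAR models.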
Model building and validation. We will describe the predictive QSAR modeling workflow developed in our lab, with particular attention paid to rigorous internal and external cross-validation, estimation of the applicability domain of a QSAR model, and consensus predictions. Here we will also present a brief overview of the most popular descriptors and modeling approaches.
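The interplay between external validation and the applicability domain can be sketched as follows. This is an illustrative toy, not the workflow presented in the talk: it assumes a 1-nearest-neighbor classifier, Euclidean distances, and a distance-based applicability domain whose cutoff is the mean nearest-neighbor distance within the training set plus `z` times its standard deviation (with `z = 0.5` chosen here as an assumption). All function names are hypothetical.

```python
import math
import statistics


def nn_distance(x, others):
    """Distance from x to its closest point in `others`."""
    return min(math.dist(x, o) for o in others)


def ad_threshold(train_X, z=0.5):
    """Applicability-domain cutoff: mean + z * std of the nearest-neighbor
    distances observed inside the training set (z is an assumed parameter)."""
    d = [nn_distance(x, train_X[:i] + train_X[i + 1:]) for i, x in enumerate(train_X)]
    return statistics.mean(d) + z * statistics.pstdev(d)


def predict_1nn(x, train_X, train_y):
    """Predict the label of the closest training compound."""
    j = min(range(len(train_X)), key=lambda k: math.dist(x, train_X[k]))
    return train_y[j]


def external_validation(train_X, train_y, ext_X, ext_y, z=0.5):
    """Score an external set, predicting only compounds inside the domain.

    Returns (accuracy, coverage); accuracy is None if no external
    compound falls within the applicability domain.
    """
    thr = ad_threshold(train_X, z)
    inside = [(x, y) for x, y in zip(ext_X, ext_y) if nn_distance(x, train_X) <= thr]
    coverage = len(inside) / len(ext_X)
    if not inside:
        return None, coverage
    hits = sum(predict_1nn(x, train_X, train_y) == y for x, y in inside)
    return hits / len(inside), coverage


if __name__ == "__main__":
    train_X = [(0, 0), (0, 1), (5, 5), (5, 6)]
    train_y = [0, 0, 1, 1]
    ext_X = [(0, 2), (5, 4), (20, 20)]  # the last compound is far from the training set
    ext_y = [0, 1, 1]
    print(external_validation(train_X, train_y, ext_X, ext_y))
```

Reporting accuracy together with coverage makes the trade-off explicit: a tighter applicability domain usually raises accuracy on the compounds that are predicted while leaving more of the external set without a prediction.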
Model exploitation. Experimental validation is the only indicator of the actual utility of QSAR modeling. In the concluding part of the presentation we will highlight several examples of experimentally assisted computational drug design, including the development of novel compounds with a desired complex polypharmacological profile as well as more traditional optimization and design of novel antivirals, antimicrobials, etc. We will also describe new non-trivial applications and future trends of QSAR, such as modeling of peptides and chemical mixtures, quantitative nanostructure-activity relationships, application of QSAR in materials informatics, approaches for model interpretation, etc.
[1] Cherkasov, A.; Muratov, E.; Fourches, D.; et al. J. Med. Chem. 2014, DOI: 10.1021/jm4004285.
[2] Fourches, D.; Muratov, E.; Tropsha, A. J. Chem. Inf. Model. 2010, 50, 1189.
[3] Golbraikh, A.; Muratov, E.; Fourches, D.; Tropsha, A. J. Chem. Inf. Model. 2014, DOI: 10.1021/ci400572x.