TY - GEN
T1 - Predicting error bars for QSAR models
AU - Schroeter, Timon
AU - Schwaighofer, Anton
AU - Mika, Sebastian
AU - Ter Laak, Antonius
AU - Suelzle, Detlev
AU - Ganzer, Ursula
AU - Heinrich, Nikolaus
AU - Müller, Klaus-Robert
PY - 2007
Y1 - 2007
N2 - Unfavorable physicochemical properties often cause drug failures. It is therefore important to take lipophilicity and water solubility into account early on in lead discovery. This study presents log D7 models built using Gaussian Process regression, Support Vector Machines, decision trees and ridge regression algorithms based on 14556 drug discovery compounds of Bayer Schering Pharma. A blind test was conducted using 7013 new measurements from recent months. We also present independent evaluations using public data. Apart from accuracy, we discuss the quality of error bars that can be computed by Gaussian Process models, and by ensemble- and distance-based techniques for the other modelling approaches.
AB - Unfavorable physicochemical properties often cause drug failures. It is therefore important to take lipophilicity and water solubility into account early on in lead discovery. This study presents log D7 models built using Gaussian Process regression, Support Vector Machines, decision trees and ridge regression algorithms based on 14556 drug discovery compounds of Bayer Schering Pharma. A blind test was conducted using 7013 new measurements from recent months. We also present independent evaluations using public data. Apart from accuracy, we discuss the quality of error bars that can be computed by Gaussian Process models, and by ensemble- and distance-based techniques for the other modelling approaches.
UR - http://www.scopus.com/inward/record.url?scp=40249113486&partnerID=8YFLogxK
U2 - 10.1063/1.2793398
DO - 10.1063/1.2793398
M3 - Conference contribution
AN - SCOPUS:40249113486
SN - 9780735404526
T3 - AIP Conference Proceedings
SP - 158
EP - 167
BT - CompLife 2007 - 3rd International Symposium on Computational Life Science
T2 - 3rd International Symposium on Computational Life Science, CompLife 2007
Y2 - 4 October 2007 through 5 October 2007
ER -