Use your Brain – Artificial Neural Networks in Drug Design
Gerhard F. Ecker
Emerging Field Pharmacoinformatics, Department of Medicinal Chemistry, University of Vienna
Althanstrasse 14, A-1090 Wien, Austria
e-mail: [email protected]; http://homepage.univie.ac.at/gerhard.f.ecker
G. Ecker
Use your Brain!
THM 1
Drug Development - Failures
Kubinyi, Nature Rev Drug Discov 2, 665 (2003)
Toxicity: hERG Potassium Channel
• Long QT syndrome
• Highly promiscuous
Sanguinetti, Nature 443, 463 (2006)
Metabolism: Cytochrome P450
Complex problems:
• Rapid and extensive metabolism
• Toxification
• Drug/drug interactions
• Poor metabolizers
de Graaf, J Med Chem 48, 2725 (2005)
Bioavailability and Transporter
Kim, Mol Pharm, 3, 26 (2006)
ABC-Transporter
• membrane-bound efflux pumps
• energy driven (ATP)
• ATP Binding Cassette (ABC-transporter)
• 48 ABC transporters in humans
– P-glycoprotein (P-gp)
– Multidrug Resistance Related Protein (MRP)
– Breast Cancer Resistance Protein (BCRP, MXR)
• many analogous transport proteins in bacteria, fungi, protozoa and plants
• very often multispecific in ligand recognition
P-Glycoprotein (ABCB1, MDR1)
• 170 kD
• 2 transmembrane domains
• 2 ATP-binding sites
• Xenotoxin transporter
• hydrophobic vacuum cleaner
• intestine, liver, kidney
• blood-brain barrier
• tumors
Linear "classical" QSAR
• Advantages
– clear relationship between descriptors and biological activity (linear or, e.g., bilinear)
– influence and importance of individual descriptors are visible (confidence intervals, t-test)
– scaling gives the relative importance of the descriptors
Linear "classical" QSAR
• Disadvantages
– type of relationship must be specified in advance (linear, bilinear, sigmoidal, ...)
– complex interactions (receptor, membrane) are difficult to treat
– sensitive to noise in the data
Artificial Neural Networks
• Simulation of the human brain for recognition of complex relationships
• Examples: face recognition, stock market, ...
• Currently more than 9000 references on the concept "Artificial Neural Networks" in Chemical Abstracts
Artificial Neural Networks
• Supervised Learning
– Learning via presentation of cases
– Train a kid to separate dogs from cats
• Unsupervised Learning
– Classification of groups via pattern recognition
– Separate cats and dogs without knowing the differences
Artificial Neural Networks Structure of the human brain
General Model for ANNs
[Figure: inputs → weights → combining function → transfer function → output]
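The general model can be sketched in a few lines of Python. This is only an illustration: it assumes a weighted sum as the combining function and a logistic (sigmoid) transfer function, neither of which the slide fixes, and the names `combine`, `sigmoid` and `neuron` as well as the numbers are hypothetical.

```python
import math

def combine(inputs, weights, bias=0.0):
    """Combining function (assumed here): weighted sum of the inputs plus a bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def sigmoid(net):
    """Logistic transfer function; maps any net input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))

def neuron(inputs, weights, bias=0.0, transfer=sigmoid):
    """Output of one artificial neuron: transfer(combine(inputs, weights))."""
    return transfer(combine(inputs, weights, bias))

# Illustrative numbers: the weighted sum is 1.0*0.4 + 0.5*(-0.8) = 0.0,
# and the sigmoid of 0.0 is 0.5.
example_output = neuron([1.0, 0.5], [0.4, -0.8])
print(example_output)
```

Swapping `transfer` for another function (step, linear, tanh) changes the behaviour of the neuron without touching the combining step.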
Transfer Function
Examples: Feedforward Multilayer Network
[Figure: inputs (π, σ, MR, ..., I) → combining function w1π+w2π+w3σ+...+w8I → transfer function (TF) → output]
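A forward pass through such a network might look as follows. This is a sketch under assumptions: the 8:3:1 topology is chosen only because the weighted sum on the slide runs to w8, the weights are random, the transfer function is a sigmoid, and `make_layer` and `forward` are hypothetical helper names.

```python
import math
import random

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def make_layer(n_in, n_out, rng):
    """Create n_out neurons, each holding n_in weights plus one trailing bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward(layers, inputs):
    """Propagate the inputs layer by layer and return the output activations."""
    activations = list(inputs)
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(neuron[:-1], activations)) + neuron[-1])
            for neuron in layer
        ]
    return activations

rng = random.Random(0)                                      # fixed seed, reproducible weights
network = [make_layer(8, 3, rng), make_layer(3, 1, rng)]    # assumed 8:3:1 topology
descriptors = [0.2, 0.4, 0.1, 0.9, 0.5, 0.3, 0.7, 1.0]      # made-up, pre-scaled inputs
output = forward(network, descriptors)
print(output)
```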
Training - Learning
Training occurs via iterative adjustment of the weights until the output signal corresponds to the actual value (supervised learning).
Delta Rule: $\Delta_k W_{ij} = \beta\,(t_j - a_j)\,a_i + \alpha\,\Delta_{k-1} W_{ij}$
• $t_j$: target output
• $a_j$: current output
• $a_i$: current input
• $\beta$: learning rate
• $\alpha$: momentum factor
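The delta rule with momentum can be written out as a single update step. The sketch below assumes one linear output neuron j and made-up values for the learning rate, momentum factor, inputs and target; `delta_rule_step` and `linear_output` are illustrative names.

```python
def delta_rule_step(weights, prev_deltas, inputs, target, output, beta=0.1, alpha=0.5):
    """One delta-rule update for the weights feeding output neuron j.

    delta_k W_ij = beta * (t_j - a_j) * a_i + alpha * delta_(k-1) W_ij
    """
    deltas = [
        beta * (target - output) * a_i + alpha * d_prev
        for a_i, d_prev in zip(inputs, prev_deltas)
    ]
    new_weights = [w + d for w, d in zip(weights, deltas)]
    return new_weights, deltas

def linear_output(weights, inputs):
    """Assumed output neuron: plain weighted sum, no squashing."""
    return sum(w * a for w, a in zip(weights, inputs))

inputs, target = [1.0, 2.0], 1.0          # made-up training case
w0, d0 = [0.0, 0.0], [0.0, 0.0]           # start with zero weights, no momentum history

# Cycle 1: output 0.0, error 1.0, so the weights move to [0.1, 0.2].
w1, d1 = delta_rule_step(w0, d0, inputs, target, linear_output(w0, inputs))
# Cycle 2: output 0.5, error 0.5, plus half of the previous change: [0.2, 0.4].
w2, d2 = delta_rule_step(w1, d1, inputs, target, linear_output(w1, inputs))
print(w1, w2)
```

The momentum term carries part of the previous change into the next cycle, which smooths the path of the weights but can also overshoot when the error is already small.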
Stopping Conditions
• number of cycles
• maximum target error
• minimum improvement of target error
• maximum verification error
• minimum improvement of verification error
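The five stopping conditions can be combined into one check that runs after every training cycle. This is a sketch of one possible reading: the errors are assumed to be recorded once per cycle, all thresholds are made-up defaults, and `stop_reason` is a hypothetical helper, not part of any package named in these slides.

```python
def stop_reason(cycle, target_errors, verification_errors, *,
                max_cycles=1000,
                max_target_error=0.01,
                min_target_improvement=1e-6,
                max_verification_error=0.01,
                min_verification_improvement=0.0):
    """Return the name of the first stopping condition that is met, else None.

    target_errors / verification_errors hold one error value per finished cycle.
    """
    if cycle >= max_cycles:
        return "number of cycles"
    if target_errors[-1] <= max_target_error:
        return "maximum target error"
    if verification_errors[-1] <= max_verification_error:
        return "maximum verification error"
    if len(target_errors) > 1 and \
            target_errors[-2] - target_errors[-1] < min_target_improvement:
        return "minimum improvement of target error"
    if len(verification_errors) > 1 and \
            verification_errors[-2] - verification_errors[-1] < min_verification_improvement:
        return "minimum improvement of verification error"
    return None

# Target error still falls, verification error rises: a typical sign of overtraining.
overtraining = stop_reason(5, [0.5, 0.4], [0.6, 0.7])
# Both errors still fall: keep training.
still_learning = stop_reason(5, [0.5, 0.4], [0.6, 0.5])
print(overtraining, still_learning)
```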
Target Error - Verification Error
Disadvantages of Artificial Neural Networks
• Little information on the relationship between input data and output (positive or negative correlation?)
• Difficult to obtain information on the relative importance of descriptors
• Most often the global minimum is not found; this leads to different results in different runs
Examples for use of Feedforward Networks
• Regression analysis: be cautious with extrapolations, especially if you use min/max scaling of the output variable (better: scale between 0.2 and 0.8).
• Classification problems: for two classes you may use one or two output neurons (examples: drug-like/non-drug-like, active/inactive, substrate or not, ...); for more than two classes, use one output neuron per class rather than a single output coded 0-1-2.
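Both recommendations are easy to express in code. The sketch below scales an output variable into the range 0.2 to 0.8, which leaves headroom for a sigmoid output neuron, and encodes a class label as one output neuron per class; the function names and the example values are illustrative.

```python
def scale_to_range(values, low=0.2, high=0.8):
    """Linearly map values so that min(values) -> low and max(values) -> high."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        # All values identical: place them in the middle of the target range.
        return [(low + high) / 2.0 for _ in values]
    return [low + (high - low) * (v - vmin) / span for v in values]

def one_output_per_class(label, classes):
    """Target vector with 1.0 for the neuron of the true class and 0.0 elsewhere."""
    return [1.0 if label == c else 0.0 for c in classes]

activities = [4.0, 5.0, 6.0]                       # made-up log(1/EC50) values
scaled = scale_to_range(activities)                # approximately [0.2, 0.5, 0.8]
target = one_output_per_class("nutty", ["green", "bell-pepper", "nutty"])
print(scaled, target)
```

Coding three classes as 0-1-2 on a single neuron would impose an artificial order on them; one neuron per class avoids that.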
Examples
• Classification
– exclusive-or (XOR) problem
– drug-like/non-drug-like (H. Kubinyi)
• Regression
– MDR-modulating activity (G. Ecker)
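The exclusive-or problem is the classic case that a single neuron cannot solve but a network with one hidden layer can. The sketch below uses hand-set weights and a step transfer function instead of training, purely to show that two hidden neurons are enough; all weights and names are illustrative.

```python
def step(net):
    """Step transfer function: fire (1) if the net input is positive."""
    return 1 if net > 0 else 0

def xor_net(x1, x2):
    """2:2:1 feedforward network with fixed, hand-chosen weights."""
    h_or = step(x1 + x2 - 0.5)       # hidden neuron 1: at least one input is on
    h_and = step(x1 + x2 - 1.5)      # hidden neuron 2: both inputs are on
    return step(h_or - h_and - 0.5)  # output: "or" but not "and"

truth_table = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)  # {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```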
Drug-like/non-drug-like
• ISIS fingerprints as input
• Training with 5000 compounds from ACD and 5000 from WDI
• Test with the remaining compounds; 80% correct predictions
MDR-modulating Activity
[Figure: general structure with substituents R1 (X8-X19) and R2 (X1-X7); X1-X7 are drawn as structural fragments]
R1 substituents:
• X8: (ortho) CH(OH)C2H4Ph
• X9: (ortho) CH(OCH3)C2H4Ph
• X10: (ortho) COCH3
• X11: (ortho) COC2H4Naphth
• X12: (ortho) COC2H5
• X13: (ortho) COPh
• X14: (ortho → para) COC2H4Ph
• X15: 5-OH
• X16: 5-OCH2Ph
• X17: (ortho → para) COCH3
• X18: (ortho → meta) COC2H4Ph
• X19: (ortho → meta) COCH3
MDR-modulating Activity
[Figure: predicted vs. observed log(1/EC50)]
Topology: 7:4:1
r = 0.92, s = 0.32, Q2cv = 0.85
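For readers unfamiliar with these statistics, the sketch below shows how r and a cross-validated Q2 are commonly computed from observed and predicted values. It assumes the usual definitions (Pearson correlation; Q2 = 1 - PRESS / total sum of squares) and uses made-up numbers, not the data behind the slide; the standard error s is left out because its degrees-of-freedom convention is not stated.

```python
import math

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

def q2(observed, cv_predicted):
    """Cross-validated Q2; cv_predicted must come from models that did not
    see the respective compound during training (e.g. leave-one-out)."""
    mean_o = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, cv_predicted))
    ss_total = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - press / ss_total

observed = [5.0, 6.0, 7.0, 8.0]        # made-up log(1/EC50) values
predicted = [5.5, 5.5, 7.5, 7.5]       # made-up cross-validated predictions
r_value = pearson_r(observed, predicted)
q2_value = q2(observed, predicted)     # PRESS = 1.0, total SS = 5.0, so Q2 = 0.8
print(r_value, q2_value)
```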
Virtual Library Screening
[Figure: structures of the screened compounds]
Predicted activity (μM) by ANN, CoMFA, CoMSIA and HQSAR; values as extracted: 0.036, 0.011, 0.004, 0.003; 0.040, 0.054, 0.042, 0.018
Classification Problems
• Feedforward Networks
– classification of compounds according to their odour impression
• Self Organizing Maps
– identification of new lead compounds
Aroma Quality of 1,4-Pyrazines
• 98 pyrazine derivatives with green, bell-pepper and nutty aroma
• feedforward network with 5:3:3 topology; inputs: sum of electrotopological indices, number of carbon atoms of R2, charge on the first atom of R4, molecular surface of R1 and R3
• nominal output variable with accept and reject threshold levels for each aroma
• 86% correct classifications for the test set
[Figure: pyrazine scaffold with substituents R1-R4]
B. Wailzer, J. Med. Chem. 44, 2805 (2001)
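Accept and reject thresholds turn the three continuous outputs into a class decision or into "unclassified". The sketch below is one plausible reading of such a scheme; the threshold values 0.7 and 0.3 and the function name `assign_aroma` are assumptions, not taken from the cited paper.

```python
AROMAS = ["green", "bell-pepper", "nutty"]

def assign_aroma(outputs, labels=AROMAS, accept=0.7, reject=0.3):
    """Return the label of the only accepted output neuron, or None.

    A compound is classified only if exactly one output is at or above the
    accept level and every other output is at or below the reject level.
    """
    accepted = [lab for out, lab in zip(outputs, labels) if out >= accept]
    undecided = [lab for out, lab in zip(outputs, labels) if reject < out < accept]
    if len(accepted) == 1 and not undecided:
        return accepted[0]
    return None

clear_case = assign_aroma([0.9, 0.1, 0.2])   # one neuron clearly on
weak_case = assign_aroma([0.6, 0.1, 0.2])    # best neuron below the accept level
mixed_case = assign_aroma([0.9, 0.8, 0.1])   # two neurons on at once
print(clear_case, weak_case, mixed_case)
```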
Support Vector Machines
[Figure: linear separation vs. SVM]
Support Vector Machines
[Figure: separation in higher dimensions]
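The idea of moving to higher dimensions can be shown with a toy data set. In the sketch below a simple perceptron stands in for the SVM (it finds some separating plane, not the maximum-margin one), because the point is only the effect of the mapping; the data, the added feature x1*x2 and all names are illustrative.

```python
def train_perceptron(points, labels, epochs=100):
    """Train a linear threshold unit; return (weights, bias, converged)."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:                        # misclassified or on the plane
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                errors += 1
        if errors == 0:                               # every point on the right side
            return w, b, True
    return w, b, False

# XOR-like layout: the class is the sign of x1 * x2, not linearly separable in 2D.
points_2d = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
labels = [1, -1, -1, 1]

# Map each point into 3D by adding the product x1 * x2 as a third coordinate.
points_3d = [(x1, x2, x1 * x2) for x1, x2 in points_2d]

_, _, converged_2d = train_perceptron(points_2d, labels)
w3, b3, converged_3d = train_perceptron(points_3d, labels)
print(converged_2d, converged_3d, w3, b3)
```

In the original two dimensions no straight line separates the classes; after the mapping, a plane along the new coordinate does.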
Self Organising Maps
SONNIA: Self-Organizing Neural Network for Information Analysis
© Gasteiger et al.
The Data Set
[Figure: scaffold types in the data set, each carrying substituents R1 and R2: propafenone type, benzopyranone type, benzofurane type, indanone type]
Connectivity - Autocorrelation
• An autocorrelation vector describes the distribution of atom properties within a molecule:
• $AC(d) = \sum_{i,j \in M(d)} p(i)\,p(j)$
– p(i), p(j): properties of atoms i, j
– d: distance between atoms i and j; the sum runs over all atom pairs at distance d
• 2D autocorrelation: d = number of bonds
• 3D autocorrelation: d = distance in ångström
Autocorrelation
• Example: 4-Hydroxy-2-butanone
– Property: atomic weight
– Distance: 2 bonds
[Figure: structure H3C-C(=O)-CH2-CH2-OH with heavy atoms numbered 1-6: C1 (CH3), C2 (C=O), C3, C4, O5 (OH), O6 (carbonyl O)]
AC(2) = p(1)p(3) + p(2)p(4) + p(3)p(5) + p(1)p(6) + p(3)p(6)
= 12•12 + 12•12 + 12•16 + 12•16 + 12•16 = 864
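The 2D autocorrelation of the example can be reproduced with a short script. The sketch assumes a hydrogen-suppressed molecular graph with the atom numbering implied by the terms of the sum (1-2-3-4-5 as the chain, 6 as the carbonyl oxygen on atom 2) and rounded atomic weights; the function names are illustrative.

```python
from collections import deque

def bond_distances(adjacency, start):
    """Breadth-first search: number of bonds from start to every other atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in dist:
                dist[neighbour] = dist[atom] + 1
                queue.append(neighbour)
    return dist

def autocorrelation(adjacency, prop, d):
    """AC(d): sum of p(i) * p(j) over all atom pairs i < j that are d bonds apart."""
    total = 0
    atoms = sorted(adjacency)
    for i in atoms:
        dist = bond_distances(adjacency, i)
        for j in atoms:
            if i < j and dist.get(j) == d:
                total += prop[i] * prop[j]
    return total

# 4-Hydroxy-2-butanone, heavy atoms only (assumed numbering, see lead-in).
adjacency = {1: [2], 2: [1, 3, 6], 3: [2, 4], 4: [3, 5], 5: [4], 6: [2]}
atomic_weight = {1: 12, 2: 12, 3: 12, 4: 12, 5: 16, 6: 16}

ac2 = autocorrelation(adjacency, atomic_weight, 2)
print(ac2)  # 864
```

Calling `autocorrelation` for d = 0, 1, 2, ... and collecting the results gives the autocorrelation vector of the molecule.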
Kohonen Maps
• Data set: 131 propafenone analogs
• Autocorrelation vectors (PETRA) lead to separation of highly active from inactive compounds
• Increase network size to 250 x 250
• Projection of the propafenones together with the SPECS database (150,000 compounds)
• Analyse co-localisations
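How a Kohonen map places compounds can be sketched with a tiny self-organizing map. This is not the SONNIA implementation: the 6 x 6 grid, the Gaussian neighbourhood, the linear decay of learning rate and radius, and the three-component vectors standing in for autocorrelation vectors are all assumptions made for illustration.

```python
import math
import random

def best_matching_unit(grid, vector):
    """Grid position (row, col) of the neuron whose weights are closest to vector."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(grid):
        for c, weights in enumerate(row):
            dist = sum((w - v) ** 2 for w, v in zip(weights, vector))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best

def train_som(data, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Unsupervised training: pull the winner and its grid neighbours toward each vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)] for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)    # neighbourhood shrinks over time
        for vector in data:
            br, bc = best_matching_unit(grid, vector)
            for r in range(rows):
                for c in range(cols):
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    influence = math.exp(-grid_dist2 / (2.0 * radius ** 2))
                    weights = grid[r][c]
                    for k in range(dim):
                        weights[k] += lr * influence * (vector[k] - weights[k])
    return grid

# Made-up descriptor vectors, already scaled to the range 0..1.
data = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.2], [0.1, 0.2, 0.9], [0.2, 0.1, 0.8]]
som = train_som(data, rows=6, cols=6)
positions = [best_matching_unit(som, vector) for vector in data]
print(positions)
```

Projecting a second data set onto the trained map amounts to calling `best_matching_unit` for each of its vectors and looking for neurons shared with known actives.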
Results
Kohonen Maps - Hits
[Figure: structures of the hit compounds]
• AG-690/11972772
• AG-690/12887361
• AJ-131/15197008
• AJ-292/13162028
• AJ-292/15089034
• AN-989/14669159
• AO-364/14480185
Kohonen Maps - Results
• Of 7 compounds co-localizing with highly active propafenones:
– 2 compounds with EC50 < 1 μM
– 4 compounds with EC50 < 10 μM
– 1 compound inactive
• Of 8 compounds co-localizing with inactive propafenones:
– 1 compound with EC50 = 17 μM
– 5 compounds with 100 μM < EC50 < 500 μM
– 2 compounds with EC50 > 500 μM
D. Kaiser, J. Med. Chem., 2007.
Hit Follow-up
• Select compounds similar to the two hits
• Order and test them
Software Packages
• SONNIA: Molecular Networks
• SNNS: University of Stuttgart
• TRAJAN: implemented in Statistica
References
• J. Zupan & J. Gasteiger: Neural Networks in Chemistry and Drug Design, Wiley-VCH, 1999
• List of Web-Resources:
– www.dsi.unifi.it/neural/w3-sites.html
– www.ncst.ernet.in/education/apgdst/aifac/resources/Neural
– psychology.about.com/od/neuralnetworks/
– www2.chemie.uni-erlangen.de/publications/ANN-book/publications/
Neural Networks - Conclusions
• ANNs are good
– for noisy data (HTS)
– for complex relationships (ADMET)
– for classification problems (yes/no)
• ANNs are bad
– if you have a distinct drug/receptor interaction
– if you want to know what's going on
Thank you! Gerhard Ecker
Michael Gottesman
Silke Schindler Barbara Zdrazil Karin Pleban Michael Demel Dominik Kaiser Claudia Hoffer
Gergely Szarkasz
Peter Chiba Stephan Kopp Manuela Hitzler
Edina Csaszar