doi:10.1016/j.jphysparis.2005.12.084
Journal of Physiology - Paris 99 (2006) 232–244

Development and virtual screening of target libraries

Bioinformatics of the Drug, CNRS, UMR 7175, 74 route du Rhin, F-67400 Illkirch, France

The concomitant development of in silico screening technologies and of three-dimensional information on therapeutically relevant macromolecular targets makes it possible to navigate in the structural proteome and to identify targets fulfilling user-defined queries. This review illustrates some recent in-house advances in the development of target libraries and how they can be browsed to unravel chemogenomic information.
2005 Elsevier Ltd. All rights reserved.
Keywords: Virtual screening; Docking; Chemogenomics

Fax: +33 3 90 24 42 35.
D. Rognan / Journal of Physiology - Paris 99 (2006) 232–244

1. Introduction

Virtual screening of compound libraries has recently gained considerable importance in early hit finding programs, notably when technological or economic hurdles disfavor experimental screening. Numerous successful applications of either ligand-based or structure-based in silico screening have been reported in the literature. Quite unexpectedly, the inverse paradigm still has not been deeply investigated. Given a set of ligands, is it possible to prioritize their most likely targets for experimental validation? Answering this question first requires the development of a library covering the most reliable target space. By target library, we mean here a collection of macromolecules for which either the amino acid sequence and/or three-dimensional (3-D) coordinates are available and can be browsed using simple queries. Then, an appropriate screening method has to be set up which is able to select a panel of targets fulfilling requirements imposed by either a ligand structure, a specific fingerprint or an evolutionary trace. Once a target library has been developed, several applications can be foreseen: (1) simply compare targets and, whenever possible, relevant ligand binding sites, (2) predict the most likely target(s) of a given ligand, (3) predict a selectivity profile for either a target or a ligand, (4) predict the 'druggability' of a given target from a structural point of view. All these issues require early answers in the evaluation of drug discovery programs. We will try to review each of these applications in the coming sections.

2. Setting up target libraries

When developing a target library, a first compromise between available information (notably at the structural level) and the therapeutical relevance of selected targets has to be made. Many proteins for which fine structural details are known (e.g. toxins, antibodies) are not 'druggable'. Conversely, some important protein families for the pharmaceutical industry (e.g. G-protein-coupled receptors) are poorly understood at the 3-D level. Next, a scope has to be assigned to the library. Which target space has to be covered? Last, which kind of data (amino acid sequences, 3-D atomic coordinates) is browsed for defining a target list?

2.1. sc-PDB: a collection of active sites from the Protein Data Bank

2.1.1. Setting up the database

To establish the proof-of-concept that a protein library might be of screening interest, we have chosen the Protein Data Bank (PDB) as it is the major 3-D protein database for which experimentally determined protein coordinates are available. Several protein–ligand databases derived from the PDB have recently been described. One of them allows easy retrieval of protein–ligand complexes from a user-defined query focusing on specific molecular interactions. MSDsite is a database search and retrieval system for listing PDB entries fulfilling user-defined queries based on ligand information. The LPDB stores 195 high-resolution protein–ligand complexes and related physicochemical descriptors as well as binding constants. Its main purpose, like that of related protein–ligand datasets, is to provide reliable 3-D information for calibrating docking algorithms and scoring functions. The ProLINT database contains about 20,000 interaction data for two protein families (kinases, proteases) with attached information about the ligand, the protein, experimental binding constants and published literature. It has been used to derive structure–activity relationships and predict binding constants. LigBase is a database of ligand binding sites aligned with related protein structures and sequences, containing 50,000 binding sites for heterogeneous ligands (ions, solvent, co-factors, inhibitors, etc.).

However, none of the above-mentioned databases is directly usable to generate a collection of 'druggable' protein active sites customized to accommodate small molecular-weight 'drug-like' ligands. Generally, no distinction between solvent, detergent, co-factors and ligands (in the pharmaceutical sense) is made in the above-mentioned databases. To fill this gap, we recently developed a relational database (sc-PDB) specifically customized for screening purposes.

Fig. 1. Flowchart for developing the sc-PDB databank.

Starting from 27,000 PDB entries, a series of hierarchical filters has been applied to constitute the database as follows:

• removal of undesirable entries: low resolution (>2.5 Å) X-ray structures, NMR structures;
• on-the-fly detection of the molecule to which each referenced PDB atom belongs (target, organic ligand, peptide ligand, co-factor, ion, solvent, detergent) thanks to knowledge-based rules and preexisting lists of 'HET' codes defined in the PDBsum database;
• removal of undesirable small molecular-weight ligands (solvent, detergents, ions and co-factors exhibiting atom types not recognized by classical docking algorithms);
• definition of putative ligands (organic or peptidic molecules, co-factor if present alone);
• definition of the binding site for each ligand (collection of amino acids for which any heavy atom is closer than a fixed cutoff distance, in Å, from any ligand atom);
• prioritization of a single ligand/active site for each PDB entry by calculating the buried surface area of the ligand and of the site, and selecting the ligand/site pair for which the percentage of burial is the highest;
• storage, for each selected PDB entry, of 3-D atomic coordinates in readable PDB format (target, active site) and SD/MOL2 formats (ligand, co-factors, ions).

2.1.2. Annotating the database

The current version of the sc-PDB database contains 5947 ligand-binding sites for 2626 small molecules; in total, the database refers to 5947 PDB entries. We assigned a unique UniProt accession number to each protein, thereby identifying 1628 different proteins in the database. Additional information was collected from both UniProt and PDB databanks to obtain the source organism and the biological function of each protein. A functional classification of the database entries is shown in Fig. 2. Entries were separated into two superfamilies, namely enzymatic and non-enzymatic proteins. Out of the 5947 different entries of the database, ca. 85% are enzymes with a well-referenced EC (Enzyme Commission) number. The distribution of enzyme families displayed in Fig. 2A reveals that the most populated family is that of hydrolases (35% of the enzymes). This is correlated to the high number of proteases in the sc-PDB database. Fig. 2B gives an overview of the redundancy of current database entries. In most cases, fewer than 10 copies of an active site corresponding to a given protein are available in the database. The uneven distribution of protein entries, which reflects the intrinsic PDB redundancy, is of great interest for applications like virtual screening. Indeed, conformational differences between several copies of an active site reflect the local protein flexibility.

2.2. hGPCRDb: a collection of human non-olfactory GPCRs

2.2.1. Setting up the database

G Protein-Coupled Receptors (GPCRs) constitute a superfamily of membrane receptors of utmost importance in pharmaceutical research. GPCRs are the macromolecular targets of ca. 30% of marketed drugs. The first draft of the human genome suggests that over 800 genes encode a GPCR, out of which only a few (ca. 30) are currently addressed by marketed drugs. If one excludes the family of sensory receptors, about 400 GPCRs are potentially 'druggable', with ca. 120 proteins still being considered as orphan targets.

Traditionally, the first stage in the design of GPCR ligands has focused on the potency of the ligands for the selected receptor target. Selectivity towards the host receptor is usually considered once potency has already been reached. It would however be highly desirable to consider selectivity as soon as possible in the design process. Ideally, one would like to consider the GPCR universe for designing a ligand with the desired selectivity profile. As addressing this issue by high-throughput screening is currently impossible, 'in silico' screening could provide a reasonable start. Indeed, the recently described 2.8 Å-resolution X-ray structure of bovine rhodopsin provides a possible 3-D template for modeling other GPCRs. Recent reports unambiguously demonstrated that rhodopsin-based GPCR homology models are accurate enough to propose reliable 3-D models of receptors very different from bovine rhodopsin and to identify new ligands by structure-based virtual screening. Of course, using classical homology modeling to establish a 3-D target library including ca. 400 reliable 3-D models is not possible. We therefore designed a chemoinformatic tool (GPCRMod) specifically dedicated to high-throughput GPCR modeling. From the very beginning, several considerations were taken into account in the design of the code: (i) the target library should cover all human non-olfactory GPCRs, (ii) a reliable multiple alignment of all investigated GPCRs should be obtained at the seven-transmembrane (7-TM) domain only, acknowledging that high-throughput modeling of intra- and extra-cellular loops is not feasible, (iii) the 7-TM binding cavity of every 3-D model should not be biased by the X-ray structure of bovine rhodopsin.
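Returning to the sc-PDB filters of Section 2.1.1, the binding-site definition and the prioritization of one ligand/site pair per entry can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the sc-PDB implementation: the function names, the tuple layout of the atoms, the toy coordinates and the example cutoff value are all hypothetical, surface areas are assumed to be precomputed by an external tool, and the percentage of burial is assumed to be the buried area divided by the total area.

```python
from math import dist


def binding_site(protein_atoms, ligand_coords, cutoff):
    """Residues with at least one heavy atom closer than `cutoff` (in Å)
    to any ligand atom.

    protein_atoms: iterable of (residue_id, element, (x, y, z))
    ligand_coords: list of (x, y, z)
    """
    site = set()
    for residue_id, element, xyz in protein_atoms:
        if element == "H":       # only heavy atoms are considered
            continue
        if residue_id in site:   # residue already selected
            continue
        if any(dist(xyz, lig) < cutoff for lig in ligand_coords):
            site.add(residue_id)
    return sorted(site)


def percent_burial(buried_area, total_area):
    """Assumed definition: buried surface as a percentage of total surface."""
    return 100.0 * buried_area / total_area


def most_buried(pairs):
    """Select the ligand/site pair with the highest percentage of burial.

    pairs: list of (ligand_id, buried_area, total_area)
    """
    return max(pairs, key=lambda p: percent_burial(p[1], p[2]))


# Toy data: hypothetical coordinates and an arbitrary example cutoff.
protein = [
    ("ASP128", "C", (0.0, 0.0, 0.0)),
    ("ASP128", "H", (0.5, 0.0, 0.0)),
    ("TRP79", "C", (3.0, 0.0, 0.0)),
    ("GLY10", "N", (10.0, 0.0, 0.0)),
]
ligand = [(1.0, 0.0, 0.0)]

site = binding_site(protein, ligand, cutoff=2.5)
best = most_buried([("LIG_A", 120.0, 300.0), ("LIG_B", 250.0, 310.0)])
```

With these toy inputs, `site` contains the two residues lying within the example cutoff of the ligand atom, and `best` is the pair with the larger buried fraction. A production pipeline would read the coordinates from PDB files and use a spatial index rather than the quadratic loop shown here.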
In a first step, 372 human GPCR amino acid sequences were aligned at the 7-TM by browsing the target sequence for family-specific fingerprints and motifs (Fig. 3). Then, alignments were converted into 3-D models using a comparative modeling tool that uses a set of ligand-biased GPCR models as main chain templates, and two rotamer libraries for side chain positioning (Fig. 4). A key point of the modeling procedure is that 7-TM cavities are modeled starting from templates which prove useful to discriminate known ligands from decoys. Resulting 3-D models are qualitatively quite similar to those obtained by ligand-assisted comparative modeling, but are obtained at a throughput allowing the fast generation of hundreds of targets.

Fig. 2. sc-PDB content (release 3, March 2005): (A) distribution of enzymes and non-enzymes; (B) observed redundancy.

Fig. 3. Multiple alignment flowchart in GPCRMod.

Fig. 4. 3-D model generation flowchart in GPCRMod.

2.2.2. Annotation of the hGPCRdb

Assuming that similar targets recognize similar ligands, an accurate annotation of all entries should consider similarities/differences at their binding cavity. As most small molecular-weight ligands probably bind to the 7-TM core, all GPCR entries have been annotated using a chemogenomic procedure considering a fingerprint characterizing their 7-TM binding cavity. Thirty positions lining the retinal binding site in bovine rhodopsin were extracted from all entries and concatenated into ungapped sequences, out of which a phylogenetic tree could be derived using the standard UPGMA clustering method.

Twenty-two clusters could be unambiguously detected from the present analysis of 30 amino acid positions (Fig. 5). These clusters were defined in order to encompass the maximum number of related entries within a branch characterized by the highest possible statistical bootstrap value. Thirty-four out of 372 entries could not be assigned to one of the existing 22 clusters and are defined as singletons. The tree presented here is very similar to the most complete phylogenetic tree (GRAFS classification) known to date, although the latter has been obtained from full TM sequences. In both classifications, GPCRs of the Frizzled, Glutamate, Secretin and Adhesion families cluster in well-separated groups whereas the large Rhodopsin family can be classified into 18 different clusters. Remarkably, all known GPCR subfamilies (e.g. receptors for biogenic amines, purines, and chemokines) are reproduced with high bootstrap support. The five main families (Glutamate, Rhodopsin, Adhesion, Frizzled, Secretin) reported in the GRAFS classification are recovered with no overlaps between the corresponding clusters, with the single exception of Q9GZN0 (GPR88), a rhodopsin-like GPCR clustered with class III GPCRs. Interestingly, receptors for which the orthosteric binding site is not located in the TM domain (Adhesion, Secretin and Glutamate families) are nevertheless grouped into homogeneous clusters.

Fig. 5. Two-step protocol to generate a TM cavity-driven phylogenetic tree: (1) selection of 30 critical positions, (2) definition of ungapped sequences describing the 7-TM cavity, (3) TM cavity-derived phylogenetic tree for 372 human GPCRs. The consensus tree was derived from 1000 replicas using amino acid identity within a set of 30 discontinuous positions to measure protein distances. Numbers in commas indicate the number of entries in each cluster. Numbers in italic represent bootstrap values to assess the statistical significance of the tree. Receptors classified as singletons (see text) are not displayed here for sake of clarity. Glutamate, Rhodopsin, Adhesion, Frizzled and Secretin subfamilies are colored in green, cyan, yellow, pink and orange, respectively.

Relating cluster members to precise molecular features is here greatly facilitated by the analysis of a small subset of amino acids. For each of the 22 clusters, there is often a clear relationship between known ligand chemotypes (e.g. amines, carboxylic acids, phosphates, peptides, eicosanoids, and lipids) and the cognate TM cavities. For example, receptors for bulky ligands (e.g. phospholipids, prostanoids) have a TM cavity significantly larger than that for smaller compounds (e.g. biogenic amines, nucleotides). Receptors for charged ligands (cationic amines, phosphates, mono- and di-carboxylic acids) always present, among the 30 critical residues, one or more conserved amino acids exhibiting the opposite charge (e.g. Asp3.32 for biogenic amines; Asp4.60/Glu7.39 for chemokines; Arg3.29/Lys6.55/Arg7.35 for nucleotides).

Our clustering approach implies two assumptions: (i) the overall fold of the 7-TM domain around the binding cavity has been conserved along evolution; (ii) critical hotspots spread over the 7-TM domain repeatedly account for ligand binding. Although solid biostructural data for the three most important GPCR classes (class I, class II, class III) are missing, numerous experimental data do provide evidence in favor of strong similarities among many GPCRs: (i) residues known to affect small molecular-weight ligand binding to unrelated GPCRs are mostly spread among the 30 residues selected here, suggesting a common architecture of the TM pocket, (ii) many known ligands are promiscuous for even unrelated GPCRs and are usually anchored through so-called privileged structures to common subpockets of different GPCRs. Of course, we are aware that class II and class III GPCRs exhibit an additional orthosteric site located outside the 7-TM bundle. Therefore, conclusions drawn herein only apply to the 7-TM binding site.

3. Screening target libraries

Provided that a target library has been set up, two screening methods are possible (Fig. 6). In a 1-D screening, a query enclosing amino acid sequence information (e.g. fingerprint) is used to parse family-specific alignments in order to retrieve interesting targets. In a 3-D screening, the 3-D structure of either a ligand or a known active site is used to browse 3-D structures or homology models. Both applications will be detailed in the following section.

Fig. 6. Target library screening flowchart.

3.1. 1-D screening

Simple 1-D screening is less precise than 3-D screening but also less sensitive to errors. When applied to entire target families (e.g. GPCRs, kinases), its accuracy only depends on the quality of the sequence alignment, which is generally much higher than that of 3-D structural models. Assuming that similar ligands should bind to similar cavities, browsing a database of sequence alignments can easily provide access to reliable information if specific fingerprints are already known. Three possible applications of 1-D screening of a GPCR target library are presented here.

3.1.1. Searching for orthologs/paralogs

The amino acid sequence of GPCRs is extremely variable in length (from 290 to 6300 residues for human GPCRs), notably at extra- and intra-cellular loops. Basing receptor comparisons on full sequence alignment may thus be quite misleading. Comparing the above-defined TM cavity-lining residues is much more appropriate. For any GPCR target of interest, these 30 residues can be identified quite unambiguously, at least for rhodopsin-like GPCRs, as several class-specific TM fingerprints previously identified in this family of receptors can guide the sequence alignment. As an example, we have been looking for the human ortholog(s) of a gene product from C. elegans (Y22D7AR_13) in order to predict the functional role of this presumed GPCR. Blasting its full amino acid sequence against human GPCRs leads to ambiguous conclusions because the level of sequence identity with the closest human GPCRs is low (usually in the 15–30% range) and several candidates are possible (Table 1). Looking at local sequence identity within a set of 30 TM cavity-lining residues provides an answer that is easier to interpret because the sequence identities with the input query are much higher (above 70% for the first three 5-HT receptors, Table 1). Since 7 out of the top 10 ranked candidates were 5-HT receptors, the C. elegans gene product was predicted to be a receptor for serotonin, which was further experimentally validated (Segalat, personal communication). The proposed approach has the merit of being extremely fast (a few ms) but requires the a priori identification of the 7-TMs and a good sequence alignment of the latter domain. Therefore, the presence of TM fingerprints (usually present in nearly all entries) in the input query is a prerequisite.

Table 1
Searching for the 10 closest human orthologs of the C. elegans Y22D7AR_13 gene product
Columns: Full sequence blast; Sequence identity, %; Sequence identity, %
a Sequence comparison achieved using standard settings of the BLASTP program.
b Sequence comparison achieved using our in-house GPCRfind program.

3.1.2. Computer-guided target deorphanization

A TM-cavity biased phylogenetic tree offers the opportunity to navigate in target space without the necessity to rely on questionable 3-D structures. Receptors close in target space can be expected to bind ligands close in chemical space. Known GPCR ligands are thus a good starting point for deorphanizing receptors predicted to be close enough to liganded receptors (Table 2). For example, focusing our cavity-based tree on two related orphan receptors (GPR19, GPR83) predicts a significant similarity to the neurokinin receptors (NK1R, NK2R, NK3R; Fig. 7). Likewise, GPR54 is predicted to be close to three galanine receptor subtypes (GALR, GALS, and GALT). Therefore, a rational start to find ligands for these three orphans would be first to test known ligands for neurokinine and galanine receptors, respectively. An experimental validation of this approach has been recently reported by scientists at 7TM-Pharma who identified ligands for the CRTh2 (GPR44) receptor by evaluating angiotensin 2 receptor (AG2R, AG2S) ligands, the corresponding targets being close when considering the 7-TM cavity.

Table 2
Possible ligand source for some orphan GPCRs
Orphan receptor(s) (a): possible ligand source (orphan receptor names survive for three rows only)
-: GABA-B allosteric ligands
-: CaSR allosteric ligands
-: LH/FSH nonpeptide ligands
-: Cannabinoid receptors ligands
-: Tachykinin receptors ligands
-: Galanine receptor ligands
-: Oxytocin/vasopressin receptor ligands
O14804, GP57, GP58: Biogenic amine receptors ligands
-: Brain-gut peptides
-: Neuromedin U receptors ligands
-: Chemokine receptor ligands
-: Somatostatine receptor ligands
GP15, GP25, GP44, GPR1: Angiotensin II receptor ligands
-: LPC/SPC receptor ligands
-: Purinergic nucleotide receptor ligands
GP17, GP34, FK79, P2YA: Cysteinyl Leukotriene receptor ligands
a Receptors are labeled according to their UniProt entry name.
b For cluster definition, see .

Fig. 7. Close up to the peptide receptors cluster.

3.1.3. Matching target space with ligand space

GPCR ligands sharing a common privileged structure and exhibiting promiscuous binding to unrelated GPCRs are currently an important source for GPCR library design. Assuming that conserved moieties of the ligands are likely to bind to conserved subsites of the targets, matching privileged structures with TM hotspots can be achieved very easily without biasing the match by a manual or automated 3-D docking.

Fig. 8. Matching privileged structures of known GPCR ligands to TM hotspots. An in-house GPCR ligands database is searched to retrieve privileged structures common to multiple GPCRs and to find conserved residues within the 7-TM cavity of selected entries. Browsing the in-house GPCR cavity database (sequence of 30 critical positions lining the 7-TM cavity of 372 human GPCRs) allows retrieval of new GPCR entries satisfying the query and likely to accommodate the privileged structure.

As an example, biphenyltetrazoles and biphenylcarboxylic acids are known to bind to at least six GPCRs (AG22, AG2R, AG2S, GHSR, L4R1, L4R2). Details of the 3-D recognition of this privileged substructure by GPCR hotspots have been recently proposed by a thorough mutagenesis-guided manual docking of several GPCR ligands. We propose here a much simpler approach leading to the same outcome: looking at the 30 residues lining the TM cavity of the latter six GPCRs allows us to clearly identify putative TM residues able to interact with this substructure. Conserved aromatic residues likely to interact with the biaryl moiety cluster between TMs 6 and 7 (Phe6.44, Trp6.48, Phe/Tyr/His6.51, Phe/Tyr7.43). A positively charged residue that probably interacts with the bioisosteric tetrazole and carboxylate groups should be located near the aromatic cluster. Three basic residues (Lys5.42, Arg6.55, and Arg7.35) fulfill this requirement. Last, a polar side chain at position 6.52 (His/Gln) is conserved for the six investigated GPCRs and might H-bond to the acidic moiety of the privileged structure. We have thus clearly identified the same important anchoring residues as the docking study by simply looking at sequence alignments of TM cavity-lining amino acids, without relying on any 3-D docking data.

Searching our TM cavity database for additional GPCRs fulfilling the above-described criteria (… Tyr7.43 and Lys5.42 or Arg6.55 or Arg7.35 and His/Gln6.52) permits us to extract 17 new GPCRs that might accommodate biphenyl-tetrazoles and biphenyl-carboxylic acids. Among putative targets are 10 chemoattractant receptors (APJ, C3AR, C5AR, C5L2, CML1, FML1, FMLR, GP15, GP44, and GPR1), three brain-gut peptide receptors (MTLR, NTR1, and Q9GZQ4), two cationic phospholipid receptors (G2A, SPR1) and two peptide receptors (GALR, GALS). This target list encompasses recently identified receptors (e.g. APJ, NTR1). It also suggests totally new putative targets for the investigated privileged structure, which might serve as a common scaffold for small-sized combinatorial libraries targeting the new receptors list.
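The two 1-D screening operations described in Section 3.1 (ranking targets by identity over the cavity-lining positions, and querying the cavity database for a set of hotspot residues) can be illustrated with a minimal Python sketch. It is not the in-house GPCRfind program: all function names, receptor labels and toy sequences are hypothetical, the toy cavities use 5 positions instead of 30, and the hotspot query below is assembled from the conserved residues listed in the text rather than from the exact query used by the author, part of which is missing here.

```python
def cavity_identity(query, target):
    """Percent identity between two ungapped cavity sequences."""
    if len(query) != len(target):
        raise ValueError("cavity sequences must be ungapped and of equal length")
    same = sum(a == b for a, b in zip(query, target))
    return 100.0 * same / len(query)


def closest_targets(query, library, top=10):
    """Rank library entries by decreasing cavity identity to the query."""
    scored = [(name, cavity_identity(query, seq)) for name, seq in library.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top]


def satisfies(cavity, query):
    """True if every group of the query is matched by the cavity.

    cavity: dict mapping a position label (e.g. "6.48") to a one-letter residue
    query:  list of groups (combined with AND); each group maps positions to
            sets of allowed residues, and its alternatives are combined with OR
    """
    return all(
        any(cavity.get(pos) in allowed for pos, allowed in group.items())
        for group in query
    )


# Illustrative query built from the residues discussed in the text.
BIPHENYL_QUERY = [
    {"6.44": {"F"}},
    {"6.48": {"W"}},
    {"6.51": {"F", "Y", "H"}},
    {"7.43": {"F", "Y"}},
    {"5.42": {"K"}, "6.55": {"R"}, "7.35": {"R"}},  # any one basic residue
    {"6.52": {"H", "Q"}},
]

# Hypothetical cavity database (toy entries, not real receptors).
cavities = {
    "REC_1": {"5.42": "K", "6.44": "F", "6.48": "W", "6.51": "Y",
              "6.52": "H", "6.55": "A", "7.35": "L", "7.43": "Y"},
    "REC_2": {"5.42": "K", "6.44": "F", "6.48": "W", "6.51": "Y",
              "6.52": "A", "6.55": "A", "7.35": "L", "7.43": "Y"},
    "REC_3": {"5.42": "S", "6.44": "F", "6.48": "W", "6.51": "F",
              "6.52": "Q", "6.55": "R", "7.35": "T", "7.43": "F"},
}
hits = sorted(name for name, cav in cavities.items() if satisfies(cav, BIPHENYL_QUERY))

toy_library = {"REC_1": "DWFYH", "REC_2": "DWFAA", "REC_3": "AAAAA"}
ranking = closest_targets("DWFYH", toy_library, top=2)
```

Both operations reduce to string comparisons over a small, fixed set of aligned positions, which is consistent with the very short run times mentioned in Section 3.1.1. The quality of the result still depends entirely on the upstream alignment that assigns residues to positions.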
3.2. 3-D screening

High-throughput docking of large chemical libraries has established itself as a promising tool for identifying new hits from protein 3-D structures, coming mostly from X-ray diffraction data but also from homology modeling. Finding out which ligands of a large library are likely to bind to a protein of interest is slowly turning into routine computational chemistry. Surprisingly, the opposite question is still an issue. Given a known ligand, is it possible to recover its most likely target(s)? Answering this question using the above-mentioned docking approach implies first the development of a collection of protein active sites (such as the sc-PDB described above) and then the use of an inverse docking tool able to dock a single ligand to multiple macromolecules. Although inverse screening uses the same paradigm as ligand screening (predicting the most likely ligand-target interactions from molecular docking), docking a single ligand to a target library is more difficult to set up than classical docking of a ligand library to a single target. One should automate the generation of input files (3-D coordinates of the target and/or of the cognate binding site; docking configuration file) for a large array of heterogeneous targets, which is much more difficult than setting up a reliable set of coordinates for a ligand library. Notably, protein and binding site 3-D coordinates should be prepared automatically and should be rendered suitable for docking by removal of any additional molecule (solvent, ion, and co-factor) not essential for ligand binding. We have chosen the GOLD docking software for two main reasons: (i) it is one of the most robust and accurate docking tools in our hands; (ii) it only requires a single configuration file whose distribution over a target library is easy to process.

Fig. 9. Inverse screening of the sc-PDB database for finding the target of four small molecular weight ligands: top panel, biotin; bottom panel: 4-hydroxy tamoxifen. Filled stars indicate the different sc-PDB copies of the true target (top: streptavidin, bottom: estrogen receptor α). Filled triangles and squares indicate known secondary targets of 4-hydroxy tamoxifen (3α-hydroxysteroid dehydrogenase and NADP[H] quinone oxidoreductase, respectively). Targets are ranked by decreasing GOLD fitness scores averaged over 10 independent docking runs.
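The ranking step used in inverse screening (one ligand docked to every library entry, entries sorted by fitness score averaged over independent runs) can be sketched as follows. This is a minimal illustration, not the author's pipeline: it assumes the docking scores have already been collected into a dictionary, does not call GOLD or any other docking program, and uses hypothetical entry names and invented scores, with three runs per entry instead of ten for brevity.

```python
from statistics import mean


def rank_targets(run_scores):
    """Rank library entries by decreasing fitness score averaged over runs.

    run_scores: dict mapping an entry identifier to the list of fitness
                scores obtained in independent docking runs
    """
    averaged = {entry: mean(scores) for entry, scores in run_scores.items()}
    return sorted(averaged.items(), key=lambda kv: kv[1], reverse=True)


def ranks_of(ranking, true_entries):
    """1-based positions of the known true-target entries in a ranking."""
    return [
        position
        for position, (entry, _) in enumerate(ranking, start=1)
        if entry in true_entries
    ]


# Invented scores for four hypothetical library entries.
scores = {
    "target_A_copy1": [62.0, 60.0, 61.0],
    "target_A_copy2": [58.0, 59.0, 57.0],
    "target_B": [40.0, 42.0, 41.0],
    "target_C": [59.5, 60.5, 60.0],
}
ranking = rank_targets(scores)
true_ranks = ranks_of(ranking, {"target_A_copy1", "target_A_copy2"})
```

Averaging over several runs dampens the run-to-run variability of stochastic docking searches, and keeping each copy of a target as a separate entry makes it possible to see how consistently the copies of the true target are ranked.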
for which a key amino acid (Asp128) has been mutated(1swt) or alternative binding sites (peptide binding sites 3.2.1. 3-D screening of the PDB: proof of concept for 1vwr and 1rsu). Altogether, the proposed inverse The ﬁrst validation of inverse screening was to recover screening protocol is able to unambiguously rank streptavi- among 2 150 entries of the sc-PDB (release 1, February din as the most likely target for biotin with a percentage of 2004) the true target(s) of either selective (e.g. biotin, 6- coverage of 70% (7 out of 10) among the top 10 (0.5%) hydroxyl-1,6-dihydropurine ribonucleoside) or promiscu- ous ligands (e.g. 4-hydroxytamoxifen, methotrexate).
Likewise, the two sc-PDB entries of the estrogen recep- Screening the sc-PDB database clearly allowed to unam- tor a were ranked at the top two positions when screening biguously recover the true targets of the four investigated for the target of 4-hydroxy tamoxifen ) Interestingly, ligands (). When screening our database two other targets ((NADP[H] quinone oxidoreductase, 3a- for potential targets of biotin, 7 out of the 10 streptavidin hydroxysteroid dehydrogenase) at least ranked twice entries present in the sc-PDB were ranked at the top eight among the top 25 scorers, are known minor targets of this positions with very good averaged ﬁtness scores ligand. Therefore, inverse screening of target databases Interestingly, the three streptavidin copies with lower rank- could also be viewed as a computational ﬁlter to roughly ings (90th, 195th, 315th) correspond to either an active site predict the selectivity proﬁle of a given ligand and thus D. Rognan / Journal of Physiology - Paris 99 (2006) 232–244 Fig. 10. Percentage of recovery of known targets as a function of the top scoring fraction found by inverse screening (green line) and random picking (redline). The percentage of coverage of known targets is the ratio in percentage between the number of true target entries recovered by inverse screening at adeﬁned top scoring fraction and the total number of true target entries in the sc-PDB dataset.
putative side eﬀects. When compared to random screening, representative compounds from the library In a signiﬁcant enrichment in the true target is observed the sc-PDB, a target is deﬁned either as an enzyme from among the top scorers ). Analyzing both the enrich- the PDB with a unique EC number, or a non-enzymatic ment factor and the percentage of coverage of known tar- protein with a unique name according to our previous gets indicates that the best compromise can be reached byselecting a very small fraction (0.5%) of the sc-PDB data-base. Even for the rather diﬃcult case of methotrexate, Table 3Predicted targets for ﬁve compounds from a triazepanedione library selecting the top 2.6% scorers would allow to select 40%of all dihydrofolate reductase entries with a 15-fold enrich- ment with respect to random screening.
3.2.2. 3-D screening of the PDB: test case Having validated the inverse screening approach for four unrelated ligands, a prospective screening was applied to the identiﬁcation of putative targets for representative compounds of a scaﬀold-focused combinatorial library (). Release 1 of the sc-PDB (2148 entries) was screened to prioritize targets likely to accommodate ﬁve a Enzyme commission number.
b Number of copies in the sc-PDB (release 1, February 2004).
Target rate: Percentage of targets ranked in the top 2% scoring entries.
h Methionine aminopeptidase.
j Purine nucleoside phosphorylase.
Fig. 11. The 1,3,5-triazepane-2,6-dione scaﬀold with ﬁve diversity points.
k Thymidine kinase.
annotation of the database. Differences related to species, isoforms or mutations are thus not considered in our classification scheme. For each of the five investigated compounds, a target was selected if it fulfilled any of the three following criteria: (i) 50% of the target entries present in the sc-PDB were scored, according to the average GOLD fitness score, among the top 2% scoring entries; (ii) the average fitness score for all entries of the corresponding target was above 50; (iii) two entries of the same target were scored in the top 2% scoring entries.

Out of the nine targets fulfilling this selection procedure, five were finally selected for biological evaluation (ES, MA, PLA2, PNP, TK). About 24 compounds, including the five representatives used for inverse screening, were tested for inhibition of these five enzymes. Micromolar inhibitors from this small library could be found for three out of the five predicted entries (MA, PNP, PLA2). A detailed description of the corresponding structures and inhibitory constants will be reported elsewhere.

3.2.3. 3-D screening of the hGPCR library:
Screening the collection of human GPCRs to identify the receptors of known ligands is a demanding task given the current limited accuracy of GPCR models. We nevertheless tried to recover, from the GPCR target database, either the known receptor of a selective purinergic P2Y1 antagonist (MRS-2179) or the known receptors of a promiscuous antagonist (NAN-190) previously shown to bind to several monoamine receptors with nanomolar affinities (α1A, D2, D3, 5-HT1A, 5-HT1D, 5-HT1F, 5-HT2A, 5-HT2C, 5-HT7). When screening the protein library for putative receptors of MRS-2179, the P2Y1 receptor is indeed ranked among the top scorers (7th). Five out of the nine known targets of NAN-190, the second ligand investigated herein, are ranked in the top 25 positions, and seven out of nine in the top 31 positions. The worst-ranked true receptor (5-HT1A) is ranked 68th. For both ligands, ca. 80% of the GPCRs closely related to the true target(s) (P2Y receptors for MRS-2179; 5-HT receptors for NAN-190) clustered in the top 20% scorers. Thus, the current inverse screening procedure is better suited to identifying the likely receptor subfamily (dopamine, serotonin, adenosine, etc.) than to precisely mapping the individual preference for highly related GPCR subtypes. It could thus be used as a computational filter to shortlist the most likely targets when addressing the selectivity profile of a given compound, or when trying to identify the yet unknown receptor of a molecule showing promising in vivo biological effects.

Fig. 12. Ranking of the true receptor(s) of a selective ligand (A: MRS-2179, P2Y1 receptor antagonist) and of a promiscuous ligand (B: NAN-190, antagonist of the dopamine D2 and D3 receptors, serotonin 5-HT1A, 5-HT1D, 5-HT1F, 5-HT2A, 5-HT2C, 5-HT7 receptors, and adrenergic α1A receptor). Known receptor(s) are indicated by filled stars. Targets are ranked by decreasing GOLD fitness scores averaged over 10 independent docking runs.

Although the hGPCR database contained ground-state models suitable for docking antagonists and inverse agonists, we checked whether the same protocol could be applied to identify the receptor of endogenous ligands. The hGPCR database was therefore screened to recover the receptor of succinic acid, a recently identified ligand for the previously orphan GPR91 receptor. Although ground-state 3-D models were screened, the native receptor was surprisingly ranked among the top-scoring receptors (11th) in our inverse screening. Again, the true receptor was not ranked first, but high enough to appear in a shortlist that could be experimentally evaluated.

Fig. 13. Ranking of the true receptor (GPR91, filled star) of an endogenous ligand (succinic acid) by an inverse screening of a GPCR 3-D library. Targets are ranked by decreasing GOLD fitness scores averaged over 10 independent docking runs.

Virtual screening of target libraries offers new opportunities to prioritize a few targets for experimental evaluation by applying simple ligand-based or target-based queries. There is no reason why docking a single ligand to a wide array of targets should be less useful than the classical docking of ligand libraries to a single protein, assuming input data of comparable accuracy. The increasing coverage of target space by the Protein Data Bank, as well as the development of accurate comparative models describing entire protein families, is likely to favor target screening in the near future. Pharmacophore-based and protein-based computational filters are nowadays used sequentially in virtual screening. One could imagine very similar scenarios for target screening, where interesting cavities would first be filtered by similarity measurements to a binding site of interest, and then selected by ligand docking. Furthermore, orthogonal clustering of target families and of their ligands should soon provide precise chemogenomic information for selecting the most interesting compounds/scaffolds according to a predefined selectivity profile. Addressing potency and selectivity simultaneously in hit evaluation will undoubtedly afford added-value molecules in early drug discovery processes.
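As an illustration of the target selection procedure of Section 3.2.2 and of the rank-based read-out of Section 3.2.3, the bookkeeping can be sketched in a few lines of Python. This is not the code used in this work. All function and parameter names are hypothetical, the input is assumed to be a plain table of fitness scores per library entry and docking run, criterion (i) is read as "at least 50% of the entries", and criterion (iii) as "at least two entries".

```python
from collections import defaultdict
from math import ceil
from statistics import mean


def average_scores(run_scores):
    """Average the fitness score of each library entry over its docking runs.

    run_scores: {entry_id: [score_run1, score_run2, ...]} (assumed layout).
    """
    return {entry: mean(scores) for entry, scores in run_scores.items()}


def rank_entries(avg):
    """Return entry identifiers sorted by decreasing average fitness score."""
    return sorted(avg, key=avg.get, reverse=True)


def select_targets(run_scores, target_of, top_fraction=0.02,
                   score_cutoff=50.0, min_fraction=0.5, min_hits=2):
    """Apply the three selection criteria to every target of the library.

    target_of: {entry_id: target_name}, i.e. which protein an entry represents.
    Returns {target_name: [labels of the criteria it fulfils]} for every
    target fulfilling at least one criterion.
    """
    avg = average_scores(run_scores)
    ranked = rank_entries(avg)
    # Size of the "top 2% scoring entries" list (at least one entry).
    n_top = max(1, ceil(top_fraction * len(ranked)))
    top = set(ranked[:n_top])

    # Group entries by the target they belong to.
    entries_of = defaultdict(list)
    for entry in ranked:
        entries_of[target_of[entry]].append(entry)

    selected = {}
    for target, entries in entries_of.items():
        hits = sum(entry in top for entry in entries)
        fulfilled = []
        # (i) assumed reading: at least 50% of the target's entries in the top list
        if hits / len(entries) >= min_fraction:
            fulfilled.append("i")
        # (ii) average fitness over all entries of the target above the cutoff
        if mean(avg[entry] for entry in entries) > score_cutoff:
            fulfilled.append("ii")
        # (iii) assumed reading: at least two entries of the target in the top list
        if hits >= min_hits:
            fulfilled.append("iii")
        if fulfilled:
            selected[target] = fulfilled
    return selected
```

With such a table, the rank of a known receptor in a target library, as reported for the hGPCR screen, is simply `rank_entries(average_scores(run_scores)).index(entry_id) + 1`. Returning the list of fulfilled criteria rather than a yes/no flag is a design choice of this sketch: it makes it easy to see why a given target was shortlisted.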
I would like to thank several former and current collaborators of the Bioinformatics group (C. Bissantz, G. Bret, E. Kellenberger, A. Logean, P. Muller, N. Paul, and C. Schalon) for their invaluable work in the development of target libraries. Financial support of the French Ministry of Research and Technology, and of the Alsace-Lorraine Genopole is acknowledged, as well as the allocation of computing resources at the Centre Informatique National de l'Enseignement supérieur (CINES, Montpellier, France).
Derecho y Ciencia Seminario de Derecho y Ciencia Departamento Académico de Derecho Instituto Tecnológico Autónomo de México Cuadernos de Derecho y Ciencia Consejo Editorial Isabel Davara F. de Marcos Christian López Silva Ana Teresa Valdivia Alvarado Instituto Tecnológico Autónomo de México Arturo Fernández Departamento Académico de Derecho Coordinadora General