Journal of Physiology - Paris 99 (2006) 232–244

Development and virtual screening of target libraries

Bioinformatics of the Drug, CNRS, UMR 7175, 74 route du Rhin, F-67400 Illkirch, France

The concomitant development of in silico screening technologies and of three-dimensional information on therapeutically relevant macromolecular targets makes it possible to navigate in the structural proteome and to identify targets fulfilling user-defined queries. This review illustrates some recent in-house advances in the development of target libraries and how they can be browsed to unravel chemogenomic information.

© 2005 Elsevier Ltd. All rights reserved.
Keywords: Virtual screening; Docking; Chemogenomics

Fax: +33 3 90 24 42 35.
0928-4257/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.

Virtual screening of compound libraries has recently gained considerable importance in early hit finding programs, notably when technological or economic hurdles disfavor experimental screening. Numerous successful applications of either ligand-based or structure-based in silico screening have been reported in the literature. Quite unexpectedly, the inverse paradigm still has not been deeply investigated. Given a set of ligands, is it possible to prioritize their most likely targets for experimental validation? Answering this question first requires the development of a library covering the most reliable target space. By target library, we mean here a collection of macromolecules for which the amino acid sequence and/or three-dimensional (3-D) coordinates are available and can be browsed using simple queries. Then, an appropriate screening method has to be set up which is able to select a panel of targets fulfilling requirements imposed by either a ligand structure, a specific fingerprint or an evolutionary trace.

Once a target library has been developed, several applications can be foreseen: (1) simply compare targets and, whenever possible, relevant ligand binding sites, (2) predict the most likely target(s) of a given ligand, (3) predict a selectivity profile for either a target or a ligand, (4) predict the 'druggability' of a given target from a structural point of view. All these issues require early answers in the evaluation of drug discovery programs. We will try to review each of these applications in the coming sections.

2. Setting up target libraries

When developing a target library, a first compromise between available information (notably at the structural level) and the therapeutic relevance of selected targets has to be made. Many proteins for which fine structural details are known (e.g. toxins, antibodies) are not 'druggable'. Conversely, some important protein families for the pharmaceutical industry (e.g. G-protein-coupled receptors) are poorly understood at the 3-D level. Next, a scope has to be assigned to the library. Which target space has to be covered? Last, which kind of data (amino acid sequences, 3-D atomic coordinates) is browsed for defining a target list?

2.1. sc-PDB: a collection of active sites from the Protein Data Bank

2.1.1. Setting up the database

To establish the proof-of-concept that a protein library might be of screening interest, we have chosen the Protein

D. Rognan / Journal of Physiology - Paris 99 (2006) 232–244

Fig. 1. Flowchart for developing the sc-PDB databank.
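Two stages of the flowchart in Fig. 1 lend themselves to a short illustration: the definition of a binding site as the amino acids having any heavy atom within a distance cutoff of any ligand atom, and the choice of the most buried ligand/site pair (both listed among the filters of Section 2.1.1). The following Python lines are only a minimal sketch under stated assumptions: the function names, the tuple-based atom representation, the toy coordinates and the cutoff value are hypothetical, and the percentage of burial is assumed here to be the ratio of buried to total surface area, which the text does not specify.

```python
from math import dist


def binding_site_residues(protein_atoms, ligand_atoms, cutoff):
    """Residues with any heavy atom closer than `cutoff` (in Å) to any ligand atom.

    protein_atoms: iterable of (residue_id, element, (x, y, z))
    ligand_atoms:  iterable of (x, y, z)
    """
    ligand_atoms = list(ligand_atoms)
    site = set()
    for residue_id, element, xyz in protein_atoms:
        if element == "H" or residue_id in site:
            continue  # skip hydrogens and residues that are already selected
        if any(dist(xyz, lig_xyz) < cutoff for lig_xyz in ligand_atoms):
            site.add(residue_id)
    return sorted(site)


def most_buried_pair(pairs):
    """pairs: iterable of (label, buried_area, total_area).

    Returns the label with the highest buried/total ratio (assumed definition
    of the percentage of burial).
    """
    return max(pairs, key=lambda p: p[1] / p[2])[0]


# Toy data (hypothetical coordinates, not taken from any PDB entry).
toy_protein = [
    ("ASP128", "O", (0.0, 0.0, 0.0)),
    ("ASP128", "H", (0.5, 0.0, 0.0)),
    ("TRP79", "C", (3.0, 0.0, 0.0)),
    ("GLY10", "C", (20.0, 0.0, 0.0)),
    ("SER45", "H", (1.0, 1.0, 0.0)),   # hydrogen only, therefore ignored
]
toy_ligand = [(1.0, 0.0, 0.0)]

# The cutoff of 4.0 Å is an arbitrary illustrative value.
toy_site = binding_site_residues(toy_protein, toy_ligand, cutoff=4.0)
toy_choice = most_buried_pair([("LIG_A", 150.0, 300.0), ("LIG_B", 270.0, 300.0)])
```

In practice the atom records would be parsed from PDB files and the surface areas computed by a dedicated program; neither step is shown here.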
Data Bank (PDB) as it is the major 3-D protein database for which experimentally determined protein coordinates are available. Several protein–ligand databases derived from the PDB have been recently described. One of them easily allows retrieving protein–ligand complexes from a user-defined query focusing on specific molecular interactions. MSDsite is a database search and retrieval system for listing PDB entries fulfilling user-defined queries based on ligand information. The LPDB stores 195 high-resolution protein–ligand complexes and related physicochemical descriptors as well as binding constants. Its main purpose, like that of related protein–ligand datasets, is to provide reliable 3-D information for calibrating docking algorithms and scoring functions. The ProLINT database contains about 20,000 interaction data for two protein families (kinases, proteases) with attached information about the ligand, the protein, experimental binding constants and published literature. It has been used to derive structure–activity relationships and predict binding constants. LigBase is a database of ligand binding sites aligned with related protein structures and sequences, containing 50,000 binding sites for heterogeneous ligands (ions, solvent, co-factors, inhibitors, etc.).

However, none of the above-mentioned databases are directly usable to generate a collection of 'druggable' protein active sites customized to accommodate small molecular-weight 'drug-like' ligands. Generally, no differences between solvent, detergent, co-factors and ligands (in the pharmaceutical sense) are made in the above-mentioned databases. To fill this gap, we recently developed a relational database (sc-PDB) specifically customized for screening purposes.

Starting from 27,000 PDB entries, a series of hierarchical filters has been applied to constitute the database as follows:

• removal of undesirable entries: low-resolution (>2.5 Å) X-ray structures, NMR structures;
• on-the-fly detection of the molecule to which each referenced PDB atom belongs (target, organic ligand, peptide ligand, co-factor, ion, solvent, detergent) thanks to knowledge-based rules and preexisting lists of 'HET' codes defined in the PDBsum database;
• removal of undesirable small molecular-weight ligands (solvent, detergents, ions and co-factors exhibiting atom types not recognized by classical docking algorithms);
• definition of putative ligands (organic or peptidic molecules, co-factor if present alone);
• definition of the binding site for each ligand (collection of amino acids for which any heavy atom is closer than a cutoff distance, in Å, from any ligand atom);
• prioritization of a single ligand/active site for each PDB entry by calculating the buried surface area of the ligand and of the site, and selecting the ligand/site pair for which the percentage of burial is the highest;

• storage, for each selected PDB entry, of 3-D atomic coordinates in readable PDB format (target, active site) and SD/MOL2 formats (ligand, co-factors, ions).

2.1.2. Annotating the database

The current version of the sc-PDB database contains 5947 ligand-binding sites for 2626 small molecules; in total, the database refers to 5947 PDB entries. We assigned a unique UniProt accession number to each protein, thereby identifying 1628 different proteins in the database. Additional information was collected from both UniProt and PDB databanks to obtain the source organism and the biological function of each protein. A functional classification of the database entries is shown in Fig. 2A. Entries were separated into two superfamilies, namely enzymatic and non-enzymatic proteins. Out of the 5947 different entries of the database, ca. 85% are enzymes with a well-referenced EC (Enzyme Commission) number. The distribution of enzyme families displayed in Fig. 2A reveals that the most populated family is that of hydrolases (35% of the enzymes). This is correlated to the high number of proteases in the sc-PDB database. Fig. 2B gives an overview of the redundancy of current database entries. In most cases, fewer than 10 copies of an active site corresponding to a given protein are available in the database. The uneven distribution of protein entries, which reflects the intrinsic PDB redundancy, is of great interest for applications like virtual screening. Indeed, conformational differences between several copies of an active site reflect the local protein flexibility.

Fig. 2. sc-PDB content (release 3, March 2005): (A) distribution of enzymes and non-enzymes; (B) observed redundancy.

2.2. hGPCRDb: a collection of human non-olfactory GPCRs

2.2.1. Setting up the database

G Protein-Coupled Receptors (GPCRs) constitute a superfamily of membrane receptors of utmost importance in pharmaceutical research. Hence, GPCRs are the macromolecular targets of ca. 30% of marketed drugs. The first draft of the human genome suggests that over 800 genes encode a GPCR, out of which only a few (ca. 30) are currently addressed by marketed drugs. If one excludes the family of sensory receptors, about 400 GPCRs are potentially 'druggable', with ca. 120 proteins still being considered as orphan targets.

Traditionally, the first stage in the design of GPCR ligands has focused on the potency of the ligands for the selected receptor target. Selectivity towards the host receptor is usually considered once potency has already been reached. It would however be highly desirable to consider selectivity as soon as possible in the design process. Ideally, one would like to consider the GPCR universe for designing a ligand with the desired selectivity profile. As addressing this issue by high-throughput screening is currently impossible, 'in silico' screening could provide a reasonable start. Indeed, the recently described 2.8 Å-resolution X-ray structure of bovine rhodopsin provides a possible 3-D template for modeling other GPCRs. Recent reports unambiguously demonstrated that rhodopsin-based GPCR homology models are accurate enough to propose reliable 3-D models of receptors very different from bovine rhodopsin and to identify new ligands by structure-based virtual screening. Of course, using classical homology modeling to establish a 3-D target library including ca. 400 reliable 3-D models is not possible. We therefore designed a chemoinformatic tool (GPCRMod) specifically dedicated to high-throughput GPCR modeling. From the very beginning, several considerations were taken into account in the design of the code: (i) the target library should cover all human non-olfactory GPCRs, (ii) a reliable multiple alignment of all investigated GPCRs should be obtained at the seven-transmembrane (7-TM) domain only, acknowledging that high-throughput modeling of intra- and extra-cellular loops is not feasible, (iii) the 7-TM binding cavity of every 3-D model should not be biased by the X-ray structure of bovine rhodopsin.

In a first step, 372 human GPCR amino acid sequences were aligned at the 7-TM by browsing the target sequence for family-specific fingerprints and motifs. Then, alignments were converted into 3-D

models using a comparative modeling tool that uses a set of ligand-biased GPCR models as main chain templates, and two rotamer libraries for side chain positioning. A key point of the modeling procedure is that 7-TM cavities are modeled starting from templates which prove useful to discriminate known ligands from decoys. Resulting 3-D models are qualitatively quite similar to those obtained by ligand-assisted comparative modeling but are obtained at a throughput allowing the fast generation of hundreds of targets.

Fig. 3. Multiple alignment flowchart in GPCRMod.

Fig. 4. 3-D model generation flowchart in GPCRMod.

2.2.2. Annotation of the hGPCRdb

Assuming that similar targets recognize similar ligands, an accurate annotation of all entries should consider similarities/differences at their binding cavity. As most small molecular-weight ligands probably bind to the 7-TM core, all GPCR entries have been annotated using a chemogenomic procedure considering a fingerprint characterizing their 7-TM binding cavity. Thirty positions lining the retinal binding site in bovine rhodopsin were extracted from all entries and concatenated into ungapped sequences out of which a phylogenetic tree could be derived using the standard UPGMA clustering method.

Twenty-two clusters could be unambiguously detected from the present analysis of 30 amino acid positions. These clusters were defined in order to encompass the maximum number of related entries within a branch characterized by the highest possible statistical bootstrap value. Thirty-four out of 372 entries could not be assigned to one of the existing 22 clusters and are defined as singletons. The herein presented tree is very similar to the most complete phylogenetic tree (GRAFS classification) known to date although the latter has been obtained from full TM sequences. In both classifications, GPCRs of the Frizzled, Glutamate, Secretin and Adhesion families cluster in well-separated groups whereas the large Rhodopsin family can be classified into 18

Fig. 5. Two-step protocol to generate a TM cavity-driven phylogenetic tree: (1) selection of 30 critical positions, (2) definition of ungapped sequences describing the 7-TM cavity, (3) TM cavity-derived phylogenetic tree for 372 human GPCRs. The consensus tree was derived from 1000 replicas using amino acid identity within a set of 30 discontinuous positions to measure protein distances. Numbers in commas indicate the number of entries in each cluster. Numbers in italic represent bootstrap values to assess the statistical significance of the tree. Receptors classified as singletons (see text) are not displayed here for the sake of clarity. Glutamate, Rhodopsin, Adhesion, Frizzled and Secretin subfamilies are colored in green, cyan, yellow, pink and orange, respectively.
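The distance measure named in the Fig. 5 caption (amino acid identity within a fixed set of cavity positions) and the UPGMA clustering step can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the in-house implementation: the function names are hypothetical, the example strings have 6 positions instead of 30, the receptor labels A to D are invented, and the bootstrap resampling over 1000 replicas is not reproduced.

```python
from itertools import combinations


def identity_distance(a, b):
    """1 minus the fraction of identical residues between two ungapped cavity strings."""
    if len(a) != len(b):
        raise ValueError("cavity strings must have the same length")
    return 1.0 - sum(x == y for x, y in zip(a, b)) / len(a)


def upgma(cavities):
    """cavities: dict name -> cavity string. Returns the tree as nested tuples."""
    names = list(cavities)
    # Active clusters: id -> (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances, keyed by (smaller id, larger id).
    dist = {
        (i, j): identity_distance(cavities[names[i]], cavities[names[j]])
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # UPGMA update: size-weighted average of the distances to each other cluster.
        new_dist = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        dist = {pair: d for pair, d in dist.items() if i not in pair and j not in pair}
        dist.update(new_dist)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree


# Toy cavity strings (6 hypothetical positions per receptor).
toy_cavities = {"A": "DWFYNS", "B": "DWFYNT", "C": "RKHQEL", "D": "RKHQEV"}
toy_tree = upgma(toy_cavities)
```

With real data, each string would hold the 30 cavity-lining residues of one receptor, and a consensus tree would be built from resampled replicas rather than from a single run.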

different clusters. Remarkably, all known GPCR subfamilies (e.g. receptors for biogenic amines, purines, and chemokines) are reproduced with high bootstrap support. The five main families (Glutamate, Rhodopsin, Adhesion, Frizzled, Secretin) reported in the GRAFS classification are recovered with no overlaps between the corresponding clusters, with the single exception of Q9GZN0 (GPR88), a rhodopsin-like GPCR clustered with class III GPCRs. Interestingly, receptors for which the orthosteric binding site is not located in the TM domain (Adhesion, Secretin and Glutamate families) are nevertheless grouped into homogeneous clusters. Relating cluster members to precise molecular features is here greatly facilitated by the analysis of a small subset of amino acids. For each of the 22 clusters, there is often a clear relationship between known ligand chemotypes (e.g. amines, carboxylic acids, phosphates, peptides, eicosanoids, and lipids) and the cognate TM cavities. For example, receptors for bulky ligands (e.g. phospholipids, prostanoids) have a TM cavity significantly larger than that for smaller compounds (e.g. biogenic amines, nucleotides). Receptors for charged ligands (cationic amines, phosphates, mono- and di-carboxylic acids) always present, among the 30 critical residues, one or more conserved amino acids exhibiting the opposite charge (e.g. Asp3.32 for biogenic amines; Asp4.60/Glu7.39 for chemokines; Arg3.29/Lys6.55/Arg7.35 for nucleotides).

Our clustering approach implies two assumptions: (i) the overall fold of the 7-TM domain around the binding cavity has been conserved along evolution; (ii) critical hotspots spread over the 7-TM domain repeatedly account for ligand binding. Although solid biostructural data for the three most important GPCR classes (class I, class II, class III) are missing, numerous experimental data do provide evidence in favor of strong similarities among many GPCRs: (i) residues known to affect small molecular-weight ligand binding to unrelated GPCRs are mostly spread among the herein selected 30 residues, suggesting a common architecture of the TM pocket, (ii) many known ligands are promiscuous for even unrelated GPCRs and are usually anchored through so-called privileged structures to common subpockets of different GPCRs. Of course, we are aware that class II and class III GPCRs exhibit an additional orthosteric site located outside the 7-TM bundle. Therefore, conclusions drawn herein only apply to the 7-TM binding site.

3. Screening target libraries

Provided that a target library has been set up, two screening methods are possible (Fig. 6). In a 1-D screening, a query enclosing amino acid sequence information (e.g. fingerprint) is used to parse family-specific alignments in order to retrieve interesting targets. In a 3-D screening, the 3-D structure of either a ligand or a known active site is used to browse 3-D structures or homology models. Both applications will be detailed in the following section.

Fig. 6. Target library screening flowchart.

3.1. 1-D screening

Simple 1-D screening is less precise than 3-D screening but also less sensitive to errors. When applied to entire target families (e.g. GPCRs, kinases), its accuracy only depends on the quality of the sequence alignment, which is generally much higher than that of 3-D structural models. Assuming that similar ligands should bind to similar cavities, browsing a database of sequence alignments can easily provide access to reliable information if specific fingerprints are already known. Three possible applications of 1-D screening of a GPCR target library are presented here.

3.1.1. Searching for orthologs/paralogs

The amino acid sequence of GPCRs is extremely variable in length (from 290 to 6300 residues for human GPCRs), notably at extra- and intra-cellular loops. Basing receptor comparisons on full sequence alignment may thus be quite misleading. Comparing the above-defined TM cavity-lining residues is much more appropriate. For any GPCR target of interest, these 30 residues can be identified quite unambiguously, at least for rhodopsin-like GPCRs, as several class-specific TM fingerprints previously identified in this family of receptors can guide the sequence alignment.

As an example, we have been looking for the human ortholog(s) of a gene product from C. elegans (Y22D7AR_13) in order to predict the functional role of this presumed GPCR. Blasting its full amino acid sequence against human GPCRs leads to ambiguous conclusions because the level of sequence identity with the closest human GPCRs is low (usually in the 15–30% range) and several candidates are possible (Table 1). Looking at local sequence identity within a set of 30 TM cavity-lining residues provides an answer that is easier to interpret because the sequence identities with the input query are much higher (above 70% for the first three 5-HT receptors, Table 1). Since 7 out of the top 10 ranked candidates were 5-HT receptors, the C. elegans gene product was predicted to be a receptor for serotonin, which was further experimentally validated (Segalat, personal communication). The proposed approach has the merit of being extremely fast (a few ms) but requires the a priori identification of the 7-TMs and a good sequence alignment of the latter
domain. Therefore, the presence of TM fingerprints (usually present in nearly all entries) in the input query is a prerequisite.

Table 1
Searching for the 10 closest human orthologs of the C. elegans Y22D7AR_13 gene product
Column headers: Full sequence blast (a); Sequence identity, %; Sequence identity, % (b)
a Sequence comparison achieved using standard settings of the BLASTP program.
b Sequence comparison achieved using our in-house GPCRfind program.

3.1.2. Computer-guided target deorphanization

A TM-cavity biased phylogenetic tree offers the opportunity to navigate in target space without the necessity to rely on questionable 3-D structures. Receptors close in target space can be expected to bind ligands close in chemical space. Known GPCR ligands are thus a good starting point for deorphanizing receptors predicted to be close enough to liganded receptors (Table 2).

For example, focusing our cavity-based tree on two related orphan receptors (GPR19, GPR83) predicts a significant similarity to the neurokinin receptors (NK1R, NK2R, NK3R; Fig. 7). Likewise, GPR54 is predicted to be close to three galanin receptor subtypes (GALR, GALS, and GALT). Therefore, a rational start to find ligands for these three orphans would be first to test known ligands for neurokinin and galanin receptors, respectively. An experimental validation of this approach has been recently reported by scientists at 7TM-Pharma who identified ligands for the CRTh2 (GPR44) receptor by evaluating angiotensin 2 receptor (AG2R, AG2S) ligands, the corresponding targets being close when considering the 7-TM cavity.

Table 2
Possible ligand source for some orphan GPCRs
Column header: Orphan receptor(s) (a)
Entries, in the order of the source: GABA-B allosteric ligands; CaSR allosteric ligands; LH/FSH nonpeptide ligands; Cannabinoid receptors ligands; Tachykinin receptors ligands; Galanin receptor ligands; Oxytocin/vasopressin receptor ligands; O14804, GP57, GP58; Biogenic amine receptors ligands; Brain-gut peptides; Neuromedin U receptors ligands; Chemokine receptor ligands; Somatostatin receptor ligands; GP15, GP25, GP44, GPR1; Angiotensin II receptor ligands; LPC/SPC receptor ligands; Purinergic nucleotide receptor ligands; GP17, GP34, FK79, P2YA; Cysteinyl leukotriene receptor ligands
a Receptors are labeled according to their UniProt entry name.
b For cluster definition, see Fig. 5.

3.1.3. Matching target space with ligand space

GPCR ligands sharing a common privileged structure and exhibiting promiscuous binding to unrelated GPCRs are currently an important source for GPCR library design. Assuming that conserved moieties of the ligands are likely to bind to conserved subsites of the targets, matching privileged structures with TM hotspots can be achieved very easily without biasing the match by a manual or automated 3-D docking.

As an example, biphenyltetrazoles and biphenylcarboxylic acids are known to bind to at least six GPCRs (AG22, AG2R, AG2S, GHSR, L4R1, L4R2). Details of 3-D recognition of this privileged substructure by GPCR hotspots have been recently proposed by a thorough mutagenesis-guided manual docking of several

Fig. 7. Close-up of the peptide receptors cluster.

Fig. 8. Matching privileged structures of known GPCR ligands to TM hotspots. An in-house GPCR ligands database is searched to retrieve privileged structures common to multiple GPCRs and to find conserved residues within the 7-TM cavity of selected entries. Browsing the in-house GPCR cavity database (sequence of 30 critical positions lining the 7-TM cavity of 372 human GPCRs) allows the retrieval of new GPCR entries satisfying the query and likely to accommodate the privileged structure.
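The cavity database search summarized in the Fig. 8 caption amounts to testing each receptor against a residue-at-position query. A minimal Python sketch is given below. Several things in it are assumptions: the query is read as "Phe6.44 and Tyr7.43 and (Lys5.42 or Arg6.55 or Arg7.35) and His/Gln6.52", which is one interpretation of the grouping stated in Section 3.1.3; the data layout, the function names and the receptors R1 to R4 are hypothetical toy items, not entries of the in-house database.

```python
# Query as a conjunction of clauses; each clause is a list of alternatives
# (position, allowed one-letter residues), and at least one alternative must hold.
# The grouping of the "and"/"or" terms is an assumed reading of the text.
BIPHENYL_QUERY = [
    [("6.44", {"F"})],
    [("7.43", {"Y"})],
    [("5.42", {"K"}), ("6.55", {"R"}), ("7.35", {"R"})],
    [("6.52", {"H", "Q"})],
]


def matches(cavity, query):
    """cavity: dict mapping a TM position label (e.g. '6.52') to a one-letter residue."""
    return all(
        any(cavity.get(position) in allowed for position, allowed in clause)
        for clause in query
    )


def screen(cavity_db, query):
    """Return the names of all receptors whose cavity satisfies the query."""
    return [name for name, cavity in cavity_db.items() if matches(cavity, query)]


# Hypothetical toy database (only the queried positions are listed).
toy_db = {
    "R1": {"6.44": "F", "7.43": "Y", "5.42": "K", "6.55": "L", "7.35": "T", "6.52": "H"},
    "R2": {"6.44": "F", "7.43": "Y", "5.42": "A", "6.55": "R", "7.35": "T", "6.52": "Q"},
    "R3": {"6.44": "F", "7.43": "F", "5.42": "K", "6.52": "H"},   # Tyr7.43 missing
    "R4": {"6.44": "F", "7.43": "Y", "6.55": "R", "6.52": "N"},   # no His/Gln at 6.52
}
toy_hits = screen(toy_db, BIPHENYL_QUERY)
```

Encoding the query as a list of clauses keeps the search independent of any particular privileged structure: a different fingerprint only requires a different list.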
GPCR ligands. We propose here a much simpler approach leading to the same outcome: looking at the 30 residues lining the TM cavity of the latter six GPCRs allows us to clearly identify putative TM residues able to interact with this substructure (Fig. 8). Conserved aromatic residues that are likely to interact with the biaryl moiety cluster between TMs 6 and 7 (Phe6.44, Trp6.48, Phe/Tyr/His6.51, Phe/Tyr7.43). A positively charged residue that probably interacts with the bioisosteric tetrazole and carboxylate groups should be located near the aromatic cluster. Three basic residues (Lys5.42, Arg6.55, and Arg7.35) fulfill this requirement. Last, a polar side chain at position 6.52 (His/Gln) is conserved for the six investigated GPCRs and might H-bond to the acidic moiety of the privileged structure. We have thus clearly identified the same important anchoring residues as those proposed by manual docking, simply by looking at sequence alignments of TM cavity-lining amino acids, without relying on any 3-D docking data. Searching our TM cavity database for additional GPCRs fulfilling the above-described requirements (Phe6.44, Tyr7.43 and Lys5.42 or Arg6.55 or Arg7.35 and His/Gln6.52) permits us to extract 17 new GPCRs that might accommodate biphenyl-tetrazoles and biphenyl-carboxylic acids. Among putative targets are 10 chemoattractant receptors (APJ, C3AR, C5AR, C5L2, CML1, FML1, FMLR, GP15, GP44, and GPR1), three brain-gut peptide receptors (MTLR, NTR1, and Q9GZQ4), two cationic phospholipid receptors (G2A, SPR1) and two peptide receptors (GALR, GALS). This target list encompasses receptors recently identified in the literature (e.g. APJ, NTR1). It also suggests totally new putative targets for the investigated privileged structure, which might serve as a common scaffold for small-sized combinatorial libraries targeting the new receptors list.

3.2. 3-D screening

High-throughput docking of large chemical libraries has established itself as a promising tool for identifying new hits from protein 3-D structures coming mostly from X-ray diffraction data but also from homology modeling. Finding out of a large library which ligands are likely to bind to a protein of interest is slowly turning into routine computational chemistry. Surprisingly, the opposite question is still an issue. Given a known ligand, is it possible to recover its most likely target(s)? Answering this question using the above-mentioned docking approach implies first the development of a collection of protein active sites (see Section 2) and then the use of an inverse docking tool able to dock a single ligand to multiple macromolecules. Although inverse screening uses the same paradigm as ligand screening (predicting the most likely ligand-target interactions from molecular docking), docking a single ligand to a target library is more difficult to set up than classical docking of a ligand library to a single target. One should automate the generation of input files (3-D coordinates of the target and/or of the cognate binding site; docking configuration file) for a large array of heterogeneous targets, which is much more difficult than setting up a reliable set of coordinates for a ligand library. Notably, protein and binding site 3-D coordinates should be prepared automatically and should be rendered suitable for docking by removal of any additional molecule (solvent, ion, and co-factor) not essential for ligand binding. We have chosen the GOLD docking software for two main reasons: (i) it is one of the most robust and accurate docking tools in our hands; (ii) it only requires a single configuration file whose distribution over a target library is easy to process.

3.2.1. 3-D screening of the PDB: proof of concept

The first validation of inverse screening was to recover among 2150 entries of the sc-PDB (release 1, February 2004) the true target(s) of either selective (e.g. biotin, 6-hydroxyl-1,6-dihydropurine ribonucleoside) or promiscuous ligands (e.g. 4-hydroxytamoxifen, methotrexate). Screening the sc-PDB database allowed us to unambiguously recover the true targets of the four investigated ligands (Fig. 9). When screening our database for potential targets of biotin, 7 out of the 10 streptavidin entries present in the sc-PDB were ranked at the top eight positions with very good averaged fitness scores. Interestingly, the three streptavidin copies with lower rankings (90th, 195th, 315th) correspond to either an active site for which a key amino acid (Asp128) has been mutated (1swt) or alternative binding sites (peptide binding sites for 1vwr and 1rsu). Altogether, the proposed inverse screening protocol is able to unambiguously rank streptavidin as the most likely target for biotin with a percentage of coverage of 70% (7 out of 10) among the top 10 (0.5%) scorers.

Fig. 9. Inverse screening of the sc-PDB database for finding the target of four small molecular weight ligands: top panel, biotin; bottom panel: 4-hydroxy tamoxifen. Filled stars indicate the different sc-PDB copies of the true target (top: streptavidin, bottom: estrogen receptor α). Filled triangles and squares indicate known secondary targets of 4-hydroxy tamoxifen (3α-hydroxysteroid dehydrogenase and NADP[H] quinone oxidoreductase, respectively). Targets are ranked by decreasing GOLD fitness scores averaged over 10 independent docking runs.

Likewise, the two sc-PDB entries of the estrogen receptor α were ranked at the top two positions when screening for the target of 4-hydroxy tamoxifen (Fig. 9). Interestingly, two other targets (NADP[H] quinone oxidoreductase, 3α-hydroxysteroid dehydrogenase), each ranked at least twice among the top 25 scorers, are known minor targets of this ligand. Therefore, inverse screening of target databases could also be viewed as a computational filter to roughly predict the selectivity profile of a given ligand and thus

Fig. 10. Percentage of recovery of known targets as a function of the top scoring fraction found by inverse screening (green line) and random picking (red line). The percentage of coverage of known targets is the ratio in percentage between the number of true target entries recovered by inverse screening at a defined top scoring fraction and the total number of true target entries in the sc-PDB dataset.
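The percentage of coverage defined in the Fig. 10 caption, and the enrichment over random picking that follows from it, can be written down directly. The Python lines below are a minimal sketch: the function names and the toy ranking are hypothetical, and the enrichment is assumed here to be the coverage divided by the selected fraction, since random picking is expected to recover, on average, a share of true entries equal to the fraction picked.

```python
def coverage(ranked_entries, true_entries, top_fraction):
    """Percentage of true target entries found in the top-scoring fraction of a ranking."""
    n_top = max(1, round(len(ranked_entries) * top_fraction))
    found = sum(1 for entry in ranked_entries[:n_top] if entry in true_entries)
    return 100.0 * found / len(true_entries)


def enrichment(coverage_percent, top_fraction):
    """Enrichment with respect to random picking (assumed definition, see lead-in)."""
    return (coverage_percent / 100.0) / top_fraction


# Toy ranking of 10 entries, 4 of which ("t1".."t4") are copies of the true target.
toy_ranking = ["t1", "d1", "t2", "d2", "d3", "t3", "d4", "d5", "d6", "t4"]
toy_true = {"t1", "t2", "t3", "t4"}
toy_coverage = coverage(toy_ranking, toy_true, top_fraction=0.3)

# Consistency check with the figures quoted for methotrexate in Section 3.2.1:
# 40% coverage in the top 2.6% of the database.
methotrexate_enrichment = enrichment(40.0, 0.026)
```

Under this definition, the methotrexate figures quoted in Section 3.2.1 (40% coverage at a 2.6% fraction) give a ratio slightly above 15, in line with the 15-fold enrichment reported there.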
putative side effects. When compared to random screening, a significant enrichment in the true target is observed among the top scorers (Fig. 10). Analyzing both the enrichment factor and the percentage of coverage of known targets indicates that the best compromise can be reached by selecting a very small fraction (0.5%) of the sc-PDB database. Even for the rather difficult case of methotrexate, selecting the top 2.6% scorers would retrieve 40% of all dihydrofolate reductase entries with a 15-fold enrichment with respect to random screening.

3.2.2. 3-D screening of the PDB: test case

Having validated the inverse screening approach for four unrelated ligands, a prospective screening was applied to the identification of putative targets for representative compounds of a scaffold-focused combinatorial library. Release 1 of the sc-PDB (2148 entries) was screened to prioritize targets likely to accommodate five representative compounds from the library. In the sc-PDB, a target is defined either as an enzyme from the PDB with a unique EC number, or a non-enzymatic protein with a unique name according to our previous

Fig. 11. The 1,3,5-triazepane-2,6-dione scaffold with five diversity points.

Table 3
Predicted targets for five compounds from a triazepanedione library
a Enzyme commission number.
b Number of copies in the sc-PDB (release 1, February 2004).
Target rate: Percentage of targets ranked in the top 2% scoring entries.
h Methionine aminopeptidase.
i Phospholipase A2.
j Purine nucleoside phosphorylase.
k Thymidine kinase.
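The target rate defined in the Table 3 footnote, together with the three selection criteria of Section 3.2.2, can be expressed as a small filter over a scored entry list. The following Python sketch rests on several assumptions: "50% of target entries" is read as "at least 50%", "two entries" as "at least two", the function and parameter names are hypothetical, and the toy scores are invented for illustration only.

```python
from statistics import mean


def select_targets(entries, top_fraction=0.02, min_rate=50.0,
                   min_mean_score=50.0, min_hits=2):
    """entries: list of (entry_id, target_name, average_fitness_score).

    Returns {target_name: target_rate} for targets meeting any of the three criteria.
    """
    ranked = sorted(entries, key=lambda e: e[2], reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    top_ids = {entry_id for entry_id, _, _ in ranked[:n_top]}

    by_target = {}
    for entry_id, target, score in entries:
        by_target.setdefault(target, []).append((entry_id, score))

    selected = {}
    for target, members in by_target.items():
        hits = sum(1 for entry_id, _ in members if entry_id in top_ids)
        rate = 100.0 * hits / len(members)            # target rate, in percent
        mean_score = mean(score for _, score in members)
        if rate >= min_rate or mean_score > min_mean_score or hits >= min_hits:
            selected[target] = rate
    return selected


# Toy library of 100 entries: the top 2% therefore holds 2 entries.
toy_entries = (
    [("a1", "A", 70.0), ("a2", "A", 65.0)]                        # both in the top 2%
    + [("b%d" % i, "B", s) for i, s in enumerate([55.0, 52.0, 51.0, 50.5])]
    + [("c%d" % i, "C", 10.0 + 0.1 * i) for i in range(94)]       # low scorers
)
toy_selection = select_targets(toy_entries)
```

In this toy case, target A passes through its top-ranked copies, whereas target B passes only on its mean score, which shows why the three criteria are combined with a logical "or".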
annotation of the database. Differences related to species, isoforms or mutations are thus not considered in our classification scheme. For each of the five investigated compounds, a target was selected if it fulfilled any of the three following criteria: (i) 50% of target entries present in the sc-PDB were scored, according to the average GOLD fitness score, among the top 2% scoring entries; (ii) the average fitness score for all entries of the corresponding target was above 50; (iii) two entries of the same target were scored in the top 2% scoring entries.

Out of the nine targets fulfilling this selection procedure, five were finally selected for biological evaluation (ES, MA, PLA2, PNP, TK; Table 3). About 24 compounds, including the five representatives used for inverse screening, were tested for inhibition of the above-described five enzymes. Micromolar inhibitors from this small library could be found for three out of the five predicted entries (MA, PNP, PLA2). A detailed description of corresponding structures and inhibitory constants will be reported elsewhere.

3.2.3. 3-D screening of the hGPCR library

Screening the collection of human GPCRs for identifying the receptors of known ligands is a quite demanding task given the currently limited accuracy of GPCR models. We however tried to recover, from the GPCR target database, either the known receptor of a selective purinergic P2Y1 antagonist (MRS-2179) or the known receptors of a promiscuous antagonist (NAN-190) previously shown to bind to several monoamine receptors with nanomolar affinities (α1A, D2, D3, 5-HT1A, 5-HT1D, 5-HT1F, 5-HT2A, 5-HT2C, 5-HT7). When screening the protein library for putative receptors of MRS-2179, the P2Y1 receptor is indeed ranked among the top scorers (7th, Fig. 12A). Five out of the nine known targets of NAN-190, the second ligand investigated herein, are ranked in the top 25 positions, and seven out of nine in the top 31 positions (Fig. 12B). The worst-ranked true receptor (5-HT1A) is ranked 68th. For both ligands, ca. 80% of GPCRs closely related to the true target(s) (P2Y receptors for MRS-2179; 5-HT receptors for NAN-190) cluster in the top 20% of scorers. Thus, the current inverse screening procedure is aimed more at identifying the likely receptor subfamily (dopamine, serotonin, adenosine, etc.) than at precisely mapping the individual preference for highly related GPCR subtypes. It could thus be used as a computational filter to study the most likely targets when addressing the selectivity profile of a given compound or trying to identify the yet unknown receptor of a molecule showing promising in vivo biological effects. Although the hGPCR database enclosed ground-state models suitable for docking antagonists and inverse agonists, we checked whether the same protocol could be applied to identify the

Fig. 12. Ranking of the true receptor(s) of a selective ligand (A: MRS-2179, P2Y1 receptor antagonist) and of a promiscuous ligand (B: NAN-190, antagonist of the dopamine D2 and D3 receptors, serotonin 5-HT1A, 5-HT1D, 5-HT1F, 5-HT2A, 5-HT2C, 5-HT7 receptors, and adrenergic α1A receptor). Known receptor(s) are indicated by filled stars. Targets are ranked by decreasing GOLD fitness scores averaged over 10 independent docking runs.

Fig. 13. Ranking of the true receptor (GPR91, filled star) of an endogenous ligand (succinic acid) by an inverse screening of a GPCR 3-D library. Targets are ranked by decreasing GOLD fitness scores averaged over 10 independent docking runs.
receptor of endogenous ligands. The hGPCR database was therefore screened to recover the receptor of succinic acid, a recently identified ligand for the previously orphan GPR91 receptor. Although ground-state 3-D models were screened, the native receptor was surprisingly ranked among the top-scoring receptors (11th) in our inverse screening. Again, the true receptor was not ranked first but high enough in a shortlist that could be experimentally evaluated.
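The ranking step used throughout these inverse screening experiments (fitness scores averaged over independent docking runs, targets sorted by decreasing mean score, then the position of the known receptors read off the list) can be illustrated with a minimal Python sketch. It is not the in-house implementation: the function names, the tie-breaking rule and the toy scores are assumptions introduced here for illustration only, and the numbers are invented rather than taken from GOLD output.

```python
from statistics import mean


def rank_targets(scores_by_target):
    """Rank targets by decreasing mean score over independent runs.

    scores_by_target maps a target name to a list of docking scores
    (higher is better), one per run. Returns (target, mean_score)
    tuples, best first.
    """
    averaged = {t: mean(runs) for t, runs in scores_by_target.items()}
    # Sort by mean score, descending; break ties by name (an assumption)
    # so that the ranking is deterministic.
    return sorted(averaged.items(), key=lambda kv: (-kv[1], kv[0]))


def ranks_of(ranking, known_targets):
    """Return the 1-based rank of each known target in the ranking."""
    position = {t: i + 1 for i, (t, _) in enumerate(ranking)}
    return {t: position[t] for t in known_targets}


def fraction_in_top(ranking, members, top_fraction=0.2):
    """Fraction of `members` found in the top `top_fraction` of the ranking."""
    cutoff = max(1, round(len(ranking) * top_fraction))
    top = {t for t, _ in ranking[:cutoff]}
    members = list(members)
    return sum(1 for t in members if t in top) / len(members)


# Toy data, invented for illustration (three runs per target).
toy_scores = {
    "P2Y1": [62.0, 60.0, 61.0],
    "P2Y2": [58.0, 59.0, 57.0],
    "D2": [40.0, 42.0, 41.0],
    "5-HT1A": [35.0, 36.0, 34.0],
    "GPR91": [30.0, 31.0, 29.0],
}
ranking = rank_targets(toy_scores)
print(ranks_of(ranking, ["P2Y1"]))
print(fraction_in_top(ranking, ["P2Y1", "P2Y2"], top_fraction=0.4))
```

With a real target library, `ranks_of` would give the kind of figure quoted above for a true receptor (e.g. 7th or 11th), and `fraction_in_top` the share of closely related receptors found among the top 20% of scorers.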
Virtual screening of target libraries offers new opportunities to prioritize a few targets for experimental evaluation by applying simple ligand-based or target-based queries. There is no reason why docking a single ligand to a wide array of targets should not be as useful as the classical docking of ligand libraries to a single protein, assuming comparable accuracies of input data. The increasing coverage of target space by the Protein Data Bank, as well as the development of accurate comparative models describing entire protein families, is likely to favor target screening in the near future. Pharmacophore-based and protein-based computational filters are nowadays used sequentially in virtual screening. One could imagine very similar scenarios for target screening, where interesting cavities would be first filtered by similarity measurements to a binding site of interest, and then selected by ligand docking. Furthermore, orthogonal clustering of target families and of their ligands should soon provide precise chemogenomic information for selecting the most interesting compounds/scaffolds according to a predefined selectivity profile. Addressing potency and selectivity simultaneously in hit evaluation will undoubtedly afford added-value molecules in early drug discovery processes.

I would like to thank several former and current collaborators of the Bioinformatics group (C. Bissantz, G. Bret, E. Kellenberger, A. Logean, P. Muller, N. Paul, and C. Schalon) for their invaluable work in the development of target libraries. Financial support of the French Ministry of Research and Technology, and of the Alsace-Lorraine Genopole is acknowledged, as well as the allocation of computing resources at the Centre Informatique National de l'Enseignement Supérieur (CINES, Montpellier, France).

