CHEMICAL BIOLOGY

Chemical Libraries: Screening for Biologically Active Small Molecules

Ansgar Schuffenhauer, Novartis Institutes for BioMedical Research, Basel, Switzerland

doi: 10.1002/9780470048672.wecb072

The screening of chemical libraries is one of the major sources of new leads in drug discovery. The large size of chemistry space compared with the library sizes that are feasible to screen requires careful selection of the compounds for the screening library to maximize screening success. Besides issues of technology compatibility and chemical tractability of the compounds, the main objective is to increase the probability of obtaining hits for the screened targets. Diversity selection approaches have often shown only limited success. In the absence of any knowledge, it is proposed that smaller, "lead-like" ligands be screened preferentially. When knowledge about the target is available, it can be used for target-focused compound selection or library design. In the screening process, physical high-throughput screening (HTS) can be combined with virtual screening, either to avoid the high-throughput primary screen of the whole library or to limit false negatives by combining primary HTS and virtual screening results. Screening an initial subset and then using the results obtained to predict likely hits for subsequent screening rounds, as in sequential screening, can lessen the number of compounds to be screened, but it increases the logistics effort and carries the risk of missing compounds that are not well represented structurally by the initial set. Data analysis and visualization of the screening results are a necessary final step of a screening campaign to ensure that the prioritization of the compounds followed up is based on all available relevant information.

When in vitro biologic assays replaced in vivo animal models as the first tool to assess the biologic activity of molecules in drug discovery, it became possible to test many more compounds than before. This triggered the hope that the slow process of lead discovery, which relies to a large extent on medicinal chemists' intuition and serendipity, could be accelerated by systematic brute-force screening of large collections of chemical compounds, for which the term "chemical libraries" was introduced. Consequently, the pharmaceutical industry has built up high-throughput screening (HTS) facilities (see the article "High Throughput Screening (HTS) Techniques: Overview of Applications in Chemical Biology"), in which in vitro assays can be performed in a highly parallel, miniaturized, and automated way. With HTS available, it became possible not only to screen the historically accumulated compound collections of pharmaceutical companies, but also to test far more compounds than these collections contained. This triggered the demand for highly parallelized and automated synthesis of compounds to feed the HTS machinery (see the articles "Combinatorial Libraries: Overview of Applications in Chemical Biology" and "Small Molecule Combinatorial Libraries"). Although large pharmaceutical companies screen compound libraries on the order of 10⁶ molecules, this approach is far from a systematic brute-force approach, because the chemistry space is estimated to contain 10¹³ to 10⁶⁰ small molecules (1). The conservative estimate of 10¹³ molecules is based on well-established chemical reactions and commercially available reagents (2). Extrapolation from a systematic enumeration of all theoretically viable organic molecules with up to 11 non-hydrogen atoms toward 25 non-hydrogen atoms (the average size of drug-like molecules) suggests the existence of 10²⁷ unique structures (3). Because only a small subset of the chemistry space can be screened, the compounds must be chosen appropriately to maximize the success of the screen. Three groups of criteria exist for this. First, the compounds must be compatible with the compound handling and screening technology used and should not cause assay artifacts. Second, the library must contain molecules with the desired activity. Last, once a hit is identified, the molecule must be optimizable into a drug candidate with suitable efficacy, bioavailability, therapeutic window, and, in the case of industrial drug discovery, patentability (see the article "Lead Optimization in Drug Discovery").

Technology Compatibility and Optimizability

These two objectives are discussed together because many of the selection criteria for screening compounds, namely physicochemical properties as well as chemical purity and stability, are important for fulfilling both. Especially with respect to optimizability, a violation of the selection criteria discussed here is not a definite reason for exclusion, but it is a liability of the compound that needs to be addressed during the optimization that follows the discovery of the hit. The more such liabilities a compound exhibits, the more difficult the optimization of a lead compound will be. Likewise, not every violation of a technology compatibility criterion is a hard reason for exclusion; rather, it increases the potential of a compound to cause artifacts under a certain assay technology, and appropriate experimental procedures are required to detect these artifacts.

Physical-chemical properties

Generally, biologic assays are performed in aqueous solution, typically at concentrations of up to 50-100 μmol/L. These solutions are produced by diluting a stock solution of the compound in dimethylsulfoxide (DMSO) in the millimolar concentration range with buffer. Therefore, the compounds must be soluble in water and in DMSO under the respective conditions; otherwise, a potential activity of the compound remains undetected or is largely underestimated. Water solubility is equally important for the bioavailability of a drug, for which sufficiently high blood plasma levels must be achieved for efficacy. Unfortunately, neither the experimental determination of water solubility nor its prediction by computational methods is straightforward, because both depend not only on the hydrophilicity of the compound, but also on the lattice energy of the crystal (4). Therefore, based on Yalkowsky's general solubility equation (5) as a guideline to estimate water solubility, the logarithm of the octanol-water partition coefficient (logP) has been used frequently. It can be predicted by summing up fragment contributions that have been fitted to experimental data (6), reflecting the balance between hydrophobic and hydrophilic fragments. From a high lipophilicity, as indicated by a high computed logP (ClogP), it can be concluded that the water solubility of the neutral compound is low; however, a low ClogP does not guarantee high water solubility. Protonation of basic groups or deprotonation of acidic groups leads to ionic species that frequently have higher solubility than the neutral compounds.
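As an illustration of such fragment-based estimation, the short sketch below computes logP values with the open-source RDKit toolkit; RDKit's Wildman-Crippen estimator is assumed here as a stand-in for the ClogP calculator referenced in the text, and the SMILES strings are arbitrary examples.

```python
# A minimal sketch of fragment-contribution logP estimation, using RDKit's
# Wildman-Crippen method as a stand-in for the ClogP calculator in the text.
from rdkit import Chem
from rdkit.Chem import Crippen

for smiles in ("CCO", "CCCCCCCCc1ccccc1", "NCCc1ccc(O)c(O)c1"):  # arbitrary examples
    mol = Chem.MolFromSmiles(smiles)
    # Each atom receives a fitted contribution; their sum estimates logP.
    print(smiles, round(Crippen.MolLogP(mol), 2))
```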

In this context, it is noteworthy that lipophilicity is not only related to low aqueous solubility, but also to the tendency of compounds to form aggregates. Such aggregates can sequester the enzyme in biochemical assays in an unspecific way and can lead to the detection of false-positive hits. The exact cause for the aggregate formation and the mechanism and conditions of the enzyme sequestration are not understood completely; however, experimental procedures have been suggested to detect false positives caused by aggregation (7).

The second property of importance for bioavailability is the polar surface area (PSA), which is associated with intestinal absorption and cell membrane penetration by passive transport. Compounds with a high polar surface area are less likely to penetrate the lipophilic environment of the cell membranes by passive transport. Like logP, the PSA can be computed by summing up fragment contributions (8), with H-bonding fragments as the main contributors.

The role of the physicochemical properties discussed so far is the rationale behind two popular rules of thumb to estimate "drug-likeness": Lipinski's rule-of-five (9), in which counts of hydrogen bond donors and acceptors take the place of the PSA, and the "Egan Egg" (10) (see Table 1).

Table 1. Empirical "rules of thumb" to estimate the suitability of compounds at different stages of drug discovery based on structural properties

Rule of five (9)
Rule: Two or more of the following conditions violated: MW ≤ 500 Da; ClogP ≤ 5; HBD ≤ 5; HBA ≤ 10.
Purpose: Estimate whether a compound's absorption and membrane permeation are good enough for it to be orally bioavailable.

Egan Egg (10)
Rule: Ellipse defined in the ClogP-PSA space.
Purpose: Estimate whether a compound's absorption and membrane permeation are good enough for it to be orally bioavailable.

Lead-likeness (11)
Rule: MW ≤ 460 Da; −4 ≤ ClogP ≤ 4.2; logSw ≥ −5; RTB ≤ 5; RNG ≤ 4; HBD ≤ 5; HBA ≤ 9.
Purpose: Identify compounds that have the potential to be successful leads.

Rule of three (12)
Rule: MW ≤ 300 Da; ClogP ≤ 3; HBD ≤ 3; HBA ≤ 3; PSA ≤ 60 Å²; RTB ≤ 3.
Purpose: Identify compounds that have the potential to be successful fragment screening hits.

logSw, logarithm of aqueous solubility; RTB, number of rotatable bonds; RNG, number of rings; HBD, number of H-bond donors; HBA, number of H-bond acceptors; PSA, polar surface area.
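To make the counts in Table 1 concrete, the following sketch implements the rule-of-five and rule-of-three checks with RDKit descriptors; the descriptor choices (for example, RDKit's Crippen logP in place of ClogP) are assumptions, not the calculators used in the original publications.

```python
# Sketch of the rule-of-five and rule-of-three filters from Table 1,
# using RDKit descriptors as approximations of the published calculators.
from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, Lipinski

def rule_of_five_violations(mol):
    """Count of violated rule-of-five conditions; two or more flags the compound."""
    return sum([
        Descriptors.MolWt(mol) > 500,
        Crippen.MolLogP(mol) > 5,
        Lipinski.NumHDonors(mol) > 5,
        Lipinski.NumHAcceptors(mol) > 10,
    ])

def passes_rule_of_three(mol):
    """Fragment 'rule of three' as listed in Table 1."""
    return (Descriptors.MolWt(mol) <= 300
            and Crippen.MolLogP(mol) <= 3
            and Lipinski.NumHDonors(mol) <= 3
            and Lipinski.NumHAcceptors(mol) <= 3
            and Descriptors.TPSA(mol) <= 60
            and Lipinski.NumRotatableBonds(mol) <= 3)

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin, as an example
print(rule_of_five_violations(mol), passes_rule_of_three(mol))
```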

Chemical purity and stability

To rationalize the results of a screen and to derive structure-activity relationships (SAR) guiding the further optimization of the compounds, it is a prerequisite that the activity of a compound sample results from the structure attributed to it. This can be ensured only if the compounds included in the screening collection are reasonably pure. Typical purity requirements are in the range of 85% to 95%. Impurities that interfere with the assay technology must especially be avoided. To remain in that state of purity, the compounds must be chemically stable under the conditions of storage; and because the fresh production of screening solutions from powder samples is not feasible for each individual HTS, the compounds must also be stable in DMSO solution over the period in which the stock solution is intended to be used.

A chemically unstable compound is also not well suited to be marketed as a drug; therefore, insufficient chemical stability is also an issue for optimizability. If the compound contains chemical groups that are reactive toward DNA, the compound can become mutagenic, which constitutes another liability for optimization.

Practically, the stability issue is mostly addressed by applying substructural filters to remove compounds with known labile and reactive functionality. Several published substructure filter sets share a large degree of overlap (13, 14).
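A minimal sketch of such substructure filtering with SMARTS patterns in RDKit is shown below; the short pattern list is illustrative only and much smaller than the published filter sets (13, 14).

```python
# Sketch: flagging reactive or unstable functionality with SMARTS substructure
# filters. The pattern list is a small illustrative subset, not a published set.
from rdkit import Chem

REACTIVE_SMARTS = {
    "acyl halide": "[CX3](=O)[F,Cl,Br,I]",
    "aldehyde": "[CX3H1](=O)[#6]",
    "epoxide": "C1OC1",
    "Michael acceptor (simple)": "[CX3]=[CX3][CX3]=O",
}
FILTERS = [(name, Chem.MolFromSmarts(s)) for name, s in REACTIVE_SMARTS.items()]

def structural_alerts(mol):
    """Return the names of all filter patterns matched by the molecule."""
    return [name for name, patt in FILTERS if mol.HasSubstructMatch(patt)]

print(structural_alerts(Chem.MolFromSmiles("O=CC1CO1")))  # aldehyde + epoxide
```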

Other reasons for technology incompatibility

In addition to the general criteria discussed above, each assay technology can also suffer from technology-specific interference of chemical compounds. Auto-fluorescent compounds can interfere with any fluorescence-based readout. In assays that use the biotin-streptavidin interaction, biotin analogs are potential false positives.

Selection of Compounds with Desired Biologic Activity

This section discusses how to maximize the likelihood that the screening library contains compounds that are active on the targets of interest. In the absence of any knowledge about the structure of the target or its ligands, only diversity-based sampling methods can be used. Knowledge about the target or its target family can be used for the design of target- or target-family-focused libraries.

Diversity-based strategies

The central hypothesis for all diversity-based strategies of compound selection is the similarity property principle, which states that molecules with similar structures can be expected to have similar properties and to bind to the same target proteins (15). Following this principle, it is only necessary to screen one representative out of a group of molecules with similar structures, because the other molecules of the group should have the same binding behavior as the representative. Consequently, many algorithms exist to select diverse subsets of molecules from a database such that the selected molecules represent the groups of unselected ones. These algorithms have been reviewed elsewhere (16, 17), and only a short overview is provided here. Most methods encode the molecular structures as a descriptor vector, from which similarity coefficients for pairs of molecules can be calculated without aligning the molecules. These similarity coefficients are then used in diversity selection or clustering algorithms (18). From a clustering solution, a diversity selection is obtained by choosing one or more representative molecules from each cluster. Each molecule in a diverse subset is expected to represent the nonselected molecules, and it can be interpreted as the center of a cluster formed by its similar neighbors. For the sake of a more descriptive discussion, the clustering viewpoint is assumed in the following paragraphs; however, the arguments made generalize to other diversity-selecting procedures. As an alternative to clustering, rule-based methods, typically based on the molecular scaffold, can be used to create partitions of molecules from which the representatives are selected (19-21).
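As a concrete illustration, the sketch below performs a MaxMin diversity selection over Tanimoto distances computed from Morgan fingerprints in RDKit; the fingerprint type and the toy library are assumptions for illustration, not the descriptors discussed in the cited reviews.

```python
# Sketch: diversity selection by MaxMin picking over Tanimoto distances.
# Morgan fingerprints serve as an example descriptor vector.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from rdkit.SimDivFilters.rdSimDivPickers import MaxMinPicker

smiles = ["CCO", "CCCO", "c1ccccc1O", "c1ccccc1N", "CC(=O)NC", "CCCCCC"]
fps = [AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(s), 2, 2048)
       for s in smiles]

def distance(i, j):
    return 1.0 - DataStructs.TanimotoSimilarity(fps[i], fps[j])

picker = MaxMinPicker()
picks = list(picker.LazyPick(distance, len(fps), 3, seed=42))  # 3 representatives
print([smiles[i] for i in picks])
```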

Despite initially high expectations, diversity-based strategies for compound selection have shown only limited success. Diversity selection from the MDL Drug Data Report (MDDR), a database that contains only molecules with documented pharmacological properties, led to an enrichment of covered activity classes (22). However, diversity selections from a compilation of screening data that includes inactive molecules did not lead to an enrichment of targets covered by selected compounds (23). In a clustering experiment, the intracluster similarity of the IC50 vectors of the compounds measured in a uniform panel of assays was not much greater than the intra-group IC50 similarities of compounds grouped randomly (24).

How can these results be understood? First, the similarity property principle is valid only over rather short similarity ranges. According to a popular rule of thumb, molecules that have a Tanimoto similarity coefficient of 0.85 calculated over Daylight fingerprints are supposed to be very similar; however, they often differ significantly in their protein binding properties (25). Also, the inversion of the similarity property principle, that dissimilar molecules should also have dissimilar protein binding properties, is not generally true (26).

Second, the theory that screening only one representative per structural cluster is sufficient to determine the activity of the cluster assumes that the screening procedure is error free. Any error in the screening of the representative molecule is extrapolated to the whole cluster and leads to its misclassification. To compensate for screening errors and to determine the activity of a cluster reliably, it is necessary to screen several representative molecules from each cluster of molecules with assumed common biologic activity. A statistical model to determine the number of representatives that need to be screened per cluster, based on empirically estimated false-positive and false-negative rates, has been published by Harper et al. (27). They describe the probability that a compound is active as the product of a probability πi that cluster i contains active compounds (variable from cluster to cluster) and a probability α that an individual compound is found active, provided the cluster is active. In this model, α accounts for the average error of the screening process, which leads to an erroneous determination of an individual compound's activity, and for errors of the clustering procedure, which lead to the erroneous grouping of a compound into a cluster with different activity. The activity related to the common "chemotype" or pharmacophore of cluster i is described by πi, and even an active cluster with a high πi is likely to be missed if α is low. If n compounds per cluster are screened, the probability of finding at least one hit, given that the cluster is active, equals 1 − (1 − α)^n.
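A short numerical sketch of this relationship follows; the α value and the sampling depths are arbitrary illustrative choices, not figures from Reference 27.

```python
# Sketch: probability of detecting an active cluster when n of its members
# are screened, following the 1 - (1 - alpha)^n relationship in the text.
def p_cluster_detected(alpha, n):
    """Probability of at least one hit among n compounds of an active cluster."""
    return 1.0 - (1.0 - alpha) ** n

for n in (1, 3, 5, 10):
    print(n, round(p_cluster_detected(0.1, n), 2))
# n=1 -> 0.10, n=3 -> 0.27, n=5 -> 0.41, n=10 -> 0.65: single representatives
# miss most active clusters when alpha is low.
```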

The third reason for the limited success of diversity-based strategies is the low baseline probability for bioactivity, with hit rates of 0.1% as the typical order of magnitude. Even if a clustering were highly predictive of biologic activity, with active clusters showing a 100-fold enrichment of active compounds, on average only 10% of the compounds in any given active cluster would be active. If the cluster were sampled with only one compound, the probability that the cluster is correctly identified as active would be only 10%. This is illustrated qualitatively in Fig. 1.

Figure 1. Qualitative illustration of a cluster-based sampling of a compound data set for screening and its impact on finding active compounds. In the clustering stage, illustrated in the left picture, the compounds are distributed into clusters, and one (centroid) representative from each cluster is selected for testing. After testing these compounds, each cluster is attributed as active or inactive, depending on the testing outcome of the representative. In cluster A, this works well, and a cluster with several active compounds (29%) is identified correctly. It is important to note in this context that even "active" clusters often contain only 20% active compounds, which is still a large fraction compared with the overall baseline probability for bioactivity (19). In the case of cluster B, the representative was inactive, which leads to the misclassification of the cluster as inactive although 29% of its compounds are active. For most of the compounds in this cluster, the prediction that they are inactive is correct; nevertheless, the overall outcome is a false-negative active cluster.

How large should a screening library be?

Because of the inefficiency of untargeted, diversity-oriented sampling, it is not possible to specify a library size that would be sufficient to sample the chemistry space meaningfully. However, value can only be obtained from a screening library that is matched by adequate resources for screening, compound logistics, and data handling. These costs, and the costs of the compounds, practically limit the library size. The statistical model by Harper et al. (27) predicts how the expected rate of successful screens grows with library size; successful means here that at least one hit series is found. According to this model, the average number of hit series identified per assay increases linearly with the number of compounds in the library, assuming the average number of compounds per series is kept unchanged. However, the probability of success does not grow linearly with the size of the library: each incremental addition of compounds increases the probability of success less than the previous addition of the same number of compounds.

How large should the molecules in the library be?

The probability that a molecule matches the binding site of its target is assumed to depend on the size of the molecule. This relationship is described by the qualitative model of Hann et al. (28). After an initial increase, the probability that a ligand matches the binding site in exactly one orientation decreases with the ligand size. In addition, the number of potential molecules increases exponentially with the number of atoms; therefore, with an increasing cutoff for molecular size, it becomes more and more difficult to sample the chemistry space (3, 11). On the other hand, the larger the molecule, the more binding contacts it makes if it fits the binding site perfectly, which leads to a higher binding affinity (29). The binding affinity minimally required for detection depends on the sensitivity of the assay: an increase of assay sensitivity decreases the minimal size of ligands that have the potential to bind with a detectable affinity. For these reasons, it makes sense to screen molecules in a size range that is large enough to allow for a detectable affinity, but not larger. Lead-likeness criteria have been formulated on this basis (see Table 1). In fragment-based screening (FBS), highly sensitive biophysical assay technologies are used to detect the binding events of small molecular fragments to proteins (12). In the molecular size range used in FBS, the hit rate is, in accordance with the Hann model, much higher than in conventional biologic assays, but the observed affinity is much lower (30); this requires the fragments to be amenable to chemical transformations that evolve them into molecules whose activity can be validated and optimized using biochemical assays.

Binding or ligand efficiency (LE) metrics (31), in which the activity is normalized by the molecular size, have been introduced to prioritize screening hits. Hajduk (32) observed that, during the optimization of a chemical series, the ligand efficiency of the best compound after each optimization step is in most cases constant, which indicates that an increase of affinity coincides with an increase of molecular weight (MW). To achieve a final drug candidate with a potency of 10 nM or better and a MW of at most 500 Da to comply with Lipinski's rule, a LE of 0.016 pKi·Da⁻¹ is the minimum requirement. Conventional HTS is typically sensitive enough to detect compounds with Ki in the range of 1 μM. Assuming that the ligand efficiency is indeed constant for a chemical series, and that only ligands with LE of at least 0.016 pKi·Da⁻¹ have the potential to be optimized into a suitable drug candidate, it would be sufficient to screen compounds with a maximum MW of 375 Da. Likewise, for a biophysical fragment screen that can detect KD in the range of 1 mmol/L, a maximum MW of 188 Da would be sufficient if the binding constant translates into an inhibition constant of the same order of magnitude. However, the criterion of LE of at least 0.016 pKi·Da⁻¹ may not be achievable for every target; Reference 32 lists chemical series with lower LE values that nevertheless went into preclinical development. It must also be taken into account that a screening hit with a low LE can still be a suitable tool compound and may serve as a starting point to design a scaffold with a higher LE that was not present in the screening library. The definition of the optimization of a chemical series used by Hajduk is very narrow: it allows an initial hit fragment to grow but not to have parts of it removed. In the exploration of HTS hits, pruning operations are frequent, although an ideal library would also have contained the pruned compound in the first place. Therefore, such LE-derived cut-off criteria for screening hits can rarely be applied stringently.
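The arithmetic behind these size limits can be reproduced in a few lines; the sketch below uses the LE definition implied in the text (pKi divided by molecular weight) and the affinities quoted there.

```python
# Sketch: ligand efficiency (LE = pKi / MW, in pKi per Dalton) and the
# maximum MW at which a hit of given affinity still meets an LE threshold.
import math

def ligand_efficiency(ki_molar, mw_da):
    return -math.log10(ki_molar) / mw_da

def max_mw_for_threshold(ki_molar, le_min=0.016):
    return -math.log10(ki_molar) / le_min

print(max_mw_for_threshold(1e-6))    # 375 Da for an HTS hit with Ki = 1 uM
print(max_mw_for_threshold(1e-3))    # 187.5 Da for a fragment with KD = 1 mM
print(ligand_efficiency(1e-8, 500))  # 0.016 for a 10-nM, 500-Da drug candidate
```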

Target family-focused screening libraries

The limitations of the brute-force sampling of the chemistry space led to attempts to identify structures with a greater chance of success. Virtual screening methods are now well-established technologies (33) (see the article "Computational Approaches in Drug Discovery and Development"). However, these methods work on individual targets, for which some knowledge is needed either about the protein structure or about ligands. When assembling a screening library for lead discovery, the main objective is to design a broadly usable collection that also contains ligands for targets about which very little is known. Fortunately, drug targets are not isolated in the pharmacology space, and many pairs of targets share some common ligands, especially if they belong to the same protein family (34). Within a protein family, ligands are often similar enough that searching for chemical similarity with the ligands of one representative member also identifies ligands of the other members without reported activity on the reference protein (35). According to the SAR homology concept (36), targets in one family often share similar ligands and similar SAR behavior. This allows library design to focus on a target family rather than an individual target and to leverage the whole ligand knowledge for a target class (see the article "Target Family-Biased Compound Library: Optimization, Target Selection, and Validation"). Typically, chemical similarity searching is based on the comparison of molecular descriptor vectors with similarity coefficients (37). Although similarity searching with the individual ligands, followed by combination of the results by data fusion, can be highly successful (38), the numerical or binary nature of the descriptor vectors also allows a whole range of machine learning techniques from other areas of multivariate statistics to be applied. Examples of these techniques are binary kernel discriminators (38), support vector machines (39), emerging patterns (40), naive Bayesian classifiers (41), and self-organizing maps (42). Self-organizing maps have become especially popular because of the intuitive visualization of their results (43, 44). Often, the results depend strongly on the target class chosen and the available data. For this reason, one key success factor is to compile as comprehensive and accurate a reference set as possible, which requires bioactivity databases that are well integrated into both bioinformatics databases that describe protein family membership and chemical databases that characterize the ligands. Although considerable room for improvement still exists in this sector, a wide range of databases has become available, and they were reviewed recently (45) (see the article "Small Molecule, Drug-Target Databases").
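As a minimal illustration of similarity searching over multiple reference ligands, in the spirit of the data fusion approach cited above (38), consider the following sketch; the MAX fusion rule, the fingerprint type, and the molecules are illustrative assumptions.

```python
# Sketch: ligand-based screening by MAX group fusion -- each library compound
# is scored by its highest Tanimoto similarity to any reference ligand.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def fingerprint(smiles):
    return AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles), 2, 2048)

references = [fingerprint(s) for s in ("CCOc1ccccc1", "CCNc1ccccc1")]  # hypothetical

def fused_score(smiles):
    fp = fingerprint(smiles)
    return max(DataStructs.TanimotoSimilarity(fp, ref) for ref in references)

library = ["CCOc1ccc(C)cc1", "CCCCCC", "c1ccccc1"]
for s in sorted(library, key=fused_score, reverse=True):
    print(s, round(fused_score(s), 2))
```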

Structure-based virtual screening technologies use the complementarity between the structural features of a ligand and its target's protein-binding site. Docking, which tries to predict explicitly the orientation of a ligand within the binding site (the pose) and to estimate its binding energy (the score), is the most frequently used method and has become well established (46). However, a newer method uses target-ligand complementarity to generate predictive models without generating binding poses (47).

To make docking suitable for identifying the ligands of a whole target family, it is necessary to address how to deal with family members for which no protein structure is available and how to overcome the inaccuracy of the scoring functions in the analysis of the docking results. A substantial improvement of the results can be obtained by using not only the docking score as a decision criterion to retain or reject a pose, but also the key interactions between ligand and protein. Based on the binding affinities of known ligands, the scoring weights of different interactions can be adjusted to reflect the SAR of the ligands (48). Ligand-protein interactions can be described by discrete bit vectors comparable with chemical fingerprint descriptors, which allow the efficient and fast filtering of poses, as described for the design of a kinase-focused library (49). Recently, a method has been described to train the weights of interactions in the scoring function automatically based on a set of known ligands. This method allowed the authors to predict not only the activity for the kinase into which the ligands were actually docked, but also, by using a training set of activity data for a second kinase different from the one used for docking, the activity for that second kinase. The method thus allows activity to be predicted for kinases without a known structure (50); it is an example of an approach in which classic docking is combined with statistical learning. To apply these methods, both protein structures and activity information over large sets of ligands are required. Even further generalizations of binding sites are schematic descriptions of kinase binding sites that summarize the features of several kinase-inhibitor complexes and the variations of the binding pockets between different kinases (51, 52). Such qualitative models have been used successfully to design kinase-focused libraries. In the case of G-protein coupled receptors (GPCRs), the structural information is rather sparse; bovine rhodopsin is the only GPCR for which a crystal structure is currently available. However, this structure could be used to determine which residues are positioned within the binding cavity. A qualitative model has been derived to visualize the molecular recognition of ligands in different GPCRs depending on the amino acid side chains exposed in the binding site (52).

Although many methods described above can be used to screen individual structures or enumerated libraries for potential biologic activity, they offer, with the exception of the generalized binding site models, no guidance for the de novo design of new scaffolds with an increased potential for biologic activity on a range of targets. One of the earliest concepts to offer such guidance is based on the observation that common core scaffolds exist that can be differentiated, by modification of their side chains, into ligands that are individually selective for different target proteins. Evans et al. (53), who made this observation for the benzodiazepine core contained in selective ligands for different cholecystokinin receptors, called such scaffolds privileged structures. This concept has been elaborated further (54) and studied systematically on the ligands in the MDDR. For each target family in the MDDR, ligands were exported and maximum common substructures (MCS) were extracted. These substructures were then assumed to be privileged structures and were checked for presence in the ligands of other target families, with the result that many of these structures were indeed present in the ligands of more than one target family (55). Targets that have similar ligand sets are not necessarily members of the same target family (56). It has been proposed that the limited number of protein folds in the proteome also leads to a limited number of ligand-sensing cores. A ligand-sensing core is defined as the folding pattern of the protein in a sphere with a diameter of 20-30 Å around the binding site, without taking the individual protein side chains into account (57). Different side chains in the ligand-sensing core can lead to a variety of diversely functionalized binding cavities, which may fulfill different functions and may occur in more than one target family. A privileged structure might be a scaffold that orients its side chains into different regions of the binding pocket defined by the ligand-sensing core. Depending on the amino acid side chains that the individual protein exposes in the binding pocket, different functional groups are required on the privileged scaffold. The concept of biology-oriented synthesis (BIOS) (58) suggests that, in the absence of detailed knowledge of an individual target protein's structure, a diversely functionalized library around a privileged or biologically prevalidated scaffold for the ligand-sensing core must be screened to identify the correct substitution pattern for the individual target.

Natural products

Natural products are a traditional source of biologically active compounds, which are used either as drugs themselves or have inspired the discovery of synthetic drugs (59) (see the articles "Natural Products: An Overview" and "Natural Products to Probe Biosynthetic Pathways"). They cover a wide range of chemical classes (59), and they are expected to be fine-tuned by evolution to fulfill a purpose that is often still unknown but likely to involve interaction with biomolecules. The capability of an organism to produce many variations of a metabolite at low effort is considered a beneficial evolutionary trait of a species, because it allows the species to adapt its range of produced metabolites quickly to a changing environment. For this reason, the metabolic pathways involved in the synthesis of natural products are often highly branched and lead to high chemical diversity in collections of natural products (60). The synthetic complexity of natural products often makes them difficult to optimize. It has been shown, however, that not only natural products themselves, but also synthetic libraries of simplified analogs that retain only the key features of the original natural product can be applied successfully in screens for biologically active compounds. It has been demonstrated that from such a simplified natural product core, selective ligands for different targets can be derived, and such a core may be regarded as a privileged structure (58).

Screening Processes and Strategies

The next question is how to make the best use of a library assembled according to these design principles. The goal of screening is to identify as many actives in the screening library as possible and to characterize them by measuring dose-response curves and confirming both the chemical identity and the purity of the samples. Part of a thorough characterization of the hits is also running the counter assays necessary to confirm that the observed activity results from a specific interaction of the ligand with the target and is not an artifact caused by a technology incompatibility of the compound. These tasks must be executed with the lowest possible experimental effort, reagent cost, and protein and compound consumption. Although in basic research it may be sufficient to identify a limited number of tool compounds, in the pharmaceutical industry it is of interest to identify as many of the active chemical series present in the screening library as possible and to establish intellectual property, because the attrition in the subsequent steps of drug discovery is high.

HTS processes

Typically, the screening process (see the article “High Throughput Screening (HTS) Techniques: Overview of Applications in Chemical Biology”) begins with the production of stock solutions by dissolving powder samples and reformatting the solution samples into a uniform deck of stock solution plates. These samples are then stored under controlled conditions, and from these samples the screening plates are produced by plate replication systems.

Perhaps the most direct approach to screening is to measure the dose-response curves immediately, with prefabricated assay plates that contain the compounds in different concentrations (Fig. 2a). This technique has been shown to be feasible, with a high level of automation, for libraries up to a size of 100,000 samples. Because the same number of data points is measured for active and inactive compounds, absence and presence of activity are determined with the same degree of reliability. This reliability is an advantage for building SAR models. In addition, analysis of the dose-response curve shapes allows some conclusions as to whether the interaction between ligand and protein is specific (61).

However, in the pharmaceutical industry, larger libraries of a million or more compounds are often screened, and it is desirable to spend less protein and compound on the compounds that are inactive. Therefore, HTS begins with a single-concentration screen: the primary screen. Then, for the compounds found active in the primary screen, dose-response curves are determined in a validation phase. Between the primary screen and validation, a confirmation screen can be performed in which the single-concentration experiment is repeated for the primary hits, and only hits with confirmed activity are validated (Fig. 2b). In any case, it is necessary to confirm the chemical identity and purity of the samples found active to avoid misleading SAR information. For the same reason, counter screens or secondary assays that use different read-out methods are performed to exclude an unspecific interaction of the compound with the assay system.

This process requires the capability to access large subsets of the screening library. In this process step, called cherry picking, individual samples must be taken from the mother plates with stock solution and dispensed into plates for the confirmation or validation screen. In addition, dilution series for dose-response curve measurement must be produced from the cherry-picked samples for validation. Technically, cherry picking is a nontrivial task; if all compounds with significant primary activity are to be confirmed, which can be several thousand compounds, then not only the screening capacity for confirmation screening but also the cherry-picking capacity for these samples must be available.

For large screening libraries, these processes can only be run with a high degree of automation in sample storage, cherry picking, screening, and chemical analytics. These automation systems must be driven by an informatics platform that tracks the contents of plates; collects the results of the different readers used for screening; and performs normalization, curve fitting, and detection of errors that may result from spillage and carry-over of compounds in the pipetting process or from edge effects (62, 63). The results of these automated preprocessing steps must be presented to the screener in an appropriate visualization after each screening step for quality control and final decision making. If the primary screening and its results justify the follow-up of more compounds than can be processed, chemoinformatics techniques such as clustering can be used to ensure appropriate representation of all chemical classes in the validation set. Also in this step, compounds can be removed that interfere with the assay technology and are unlikely to interact specifically with the target (64). The decisions taken at these steps must be captured, and the lists of the selected compounds must be handed over to the cherry-picking system for automated processing. The software tools used for these different tasks must be well integrated to achieve a process that runs smoothly (65).
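As one concrete example of such preprocessing, the sketch below computes control-based percent inhibition and the Z'-factor plate-quality statistic; neither metric is named in the text, so this is an assumption about what such a platform typically computes.

```python
# Sketch: per-plate normalization to percent inhibition and the Z'-factor
# quality statistic, computed from positive and negative control wells.
import statistics

def percent_inhibition(signal, neg_controls, pos_controls):
    mu_n = statistics.mean(neg_controls)  # neutral controls (0% inhibition)
    mu_p = statistics.mean(pos_controls)  # full-inhibition controls (100%)
    return 100.0 * (mu_n - signal) / (mu_n - mu_p)

def z_prime(neg_controls, pos_controls):
    """Z' = 1 - 3*(sd_p + sd_n)/|mu_p - mu_n|; > 0.5 is commonly taken as robust."""
    mu_n, mu_p = statistics.mean(neg_controls), statistics.mean(pos_controls)
    sd_n, sd_p = statistics.stdev(neg_controls), statistics.stdev(pos_controls)
    return 1.0 - 3.0 * (sd_p + sd_n) / abs(mu_p - mu_n)

neg, pos = [100, 98, 102, 101], [10, 12, 9, 11]
print(round(percent_inhibition(55, neg, pos), 1), round(z_prime(neg, pos), 2))
```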

Figure 2. Workflows for experimental physical HTS and virtual screening of compound libraries and their combination.

Integration of in silico screening

Because of the large investments in the hardware and software infrastructure required for HTS, replacing the primary screen of HTS with virtual screening, followed by validation of a relatively small number of hits in experimental screening, is seen as a valuable alternative (Fig. 2c). However, this is only feasible if either information about the protein structure or reference ligands is available (33). In addition to the physically available in-house collection, virtual screening can include compounds from vendor catalogs and even enumerated virtual libraries, from which the hit compounds are then purchased or synthesized. Compilations of screening compound catalogs exist both publicly, such as ZINC (compiled by the Shoichet laboratory at UCSF, San Francisco, http://zinc.docking.org), which contains docking-ready 3D structures (66), and in the commercial sector, such as ChemNavigator (ChemNavigator, San Diego, http://www.chemnavigator.com), which is linked to a sample procurement service. Several cases have been reported in which active ligands have been discovered successfully using such processes (67). However, if automated high-throughput experimentation is abandoned, only small numbers of compounds can be validated (typically below 1000), whereas typical HTS setups allow the validation of many times that number. Similar to physical HTS, but to a higher extent, virtual screening is affected by false positives and false negatives. In typical virtual screening, accumulating 90% of the true positives in the top 10% of ranked compounds is an excellent result that is almost never reached in practice (68, 69). For an industrial HTS library of a million compounds, validating a virtual screening hit list consisting of only 1% of the library, despite the inevitably high false-negative rate this causes, already requires high-throughput experimentation. Data fusion of high-throughput experimentation and virtual screening can be expected to compensate for the errors of each method and to allow the validation of a significant number of hits. Virtual screening in this setup no longer has the purpose of saving investment in high-throughput experimentation, but of maximizing the number of positively validated compounds over the whole process, to identify as many true positive hits in the collection as possible to feed into the drug discovery process (Fig. 2d).
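One simple way to combine the two result lists is rank fusion, sketched below; the rank-sum rule is a hypothetical illustration, as the text does not prescribe a specific fusion algorithm.

```python
# Sketch: hypothetical rank-sum fusion of primary-HTS and virtual-screening
# scores; compounds with a low summed rank are prioritized for validation.
def rank_of(scores):
    """Map compound id -> rank (0 = best), ranking higher scores first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {cid: rank for rank, cid in enumerate(ordered)}

def fuse(hts_scores, vs_scores):
    r_hts, r_vs = rank_of(hts_scores), rank_of(vs_scores)
    return sorted(hts_scores, key=lambda cid: r_hts[cid] + r_vs[cid])

hts = {"cpd1": 85.0, "cpd2": 20.0, "cpd3": 60.0}  # % inhibition (toy values)
vs = {"cpd1": 0.80, "cpd2": 0.90, "cpd3": 0.75}   # docking/similarity score
print(fuse(hts, vs))  # ['cpd1', 'cpd2', 'cpd3']
```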

Sequential screening

Instead of screening the whole library in one batch, it has been proposed to screen an initial subset and to use the screening results from this subset to train a statistical model that predicts and prioritizes the remaining library. The remaining library is then screened, and the cycle of model building, prediction, and screening can be executed several times, which is referred to as sequential or iterative screening (70). Although this seems very attractive because it reduces the number of compounds that require screening, the multiple selection cycles lead to a longer overall screening time, which increases the assay logistics effort. Together with the multiple cherry-picking and data-processing cycles, this may cost more effort than is saved by screening fewer compounds. In sequential screening, it is necessary to choose an initial set. In the absence of reasonable knowledge for the selection of a focused subset, the initial set must be selected by diversity selection, whose limitations have been discussed above. In a compound collection that has been designed to avoid unnecessary redundancy by applying reasonable diversity selection, little can be gained by additional diversity selection. Any active compound class not reasonably represented in the initial screening set is unlikely to be recovered in the subsequent screening cycles, because the statistical models built on the screening results cannot make valid predictions for it. However, one can expect to identify additional actives in the series covered by the initial set. Recently, it has been demonstrated that screening 25% of a one-million-compound library, selected as a diversity set based on full plates, followed by one prediction and screening cycle, offers a reasonable compromise between logistical effort, number of compounds screened, and hit series covered (Fig. 2e) (71). When the screening cost per compound is high and dominates the logistics effort, sequential screening can be expected to be beneficial, provided it is acceptable to identify only a limited number of tool compounds instead of as many hit series from the library as possible.
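The sketch below shows one model-building and prediction cycle of such a scheme, assuming binary fingerprints in a NumPy array and a scikit-learn random forest as a stand-in for the unspecified statistical model.

```python
# Sketch: one model-building/prediction cycle of sequential screening.
# X holds fingerprints for the whole library; the random forest is an assumed
# stand-in for the statistical model, which the text does not specify.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def next_screening_batch(X, screened_idx, y_screened, batch_size):
    """Train on results so far, then rank unscreened compounds by P(active)."""
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X[screened_idx], y_screened)
    unscreened = np.setdiff1d(np.arange(X.shape[0]), screened_idx)
    p_active = model.predict_proba(X[unscreened])[:, 1]
    return unscreened[np.argsort(p_active)[::-1][:batch_size]]

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 64))  # toy fingerprint matrix
screened = np.arange(250)                # initial diversity set (25%)
y = rng.integers(0, 2, size=250)         # toy screening outcomes
print(next_screening_batch(X, screened, y, 10))
```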

Analysis, Reporting, and Visualization

Often, the number of validated hits in an HTS is so high that the results cannot be analyzed without computational assistance. The primary objective of HTS analysis is to identify chemical series, each characterized by a common structural core, which allows the series to be explored in a joint chemical synthesis effort and the structures of a series to be aligned to derive SAR. Therefore, in HTS analysis, compounds are grouped by structure, which can be achieved either with clustering procedures, especially those based on maximal common substructures (72, 73), or with rule-based classifications (18, 19, 74). An advantage of the clustering methods is that they adapt themselves to the data set, whereas an advantage of the rule-based methods is that they are rigid and resilient to changes in the composition of the screening library. In the beginning of HTS, hits were often prioritized by potency alone; nowadays, however, it is recognized that a wide range of properties has an impact on the further success of a chemical series. Therefore, each series must be annotated with all information relevant for its further prioritization, such as potency and ligand efficiency on the screening target, activity on other targets, and calculated or measured physicochemical properties. It is also of interest to identify inactive compounds that contain the substructure of an active series. The data set generated in this way is very information-rich, which requires special visualization techniques that can display several properties simultaneously and interactively (75, 76). Because the visualized data result from different experiments or computational procedures, their aggregation is a nontrivial task, which requires a highly modular information technology architecture that is flexible enough to integrate new analysis algorithms or visualization software. In this respect, the monolithic software packages of the first generation of HTS analysis software, in which data warehouse building, clustering, and visualization are hard-wired together, are often problematic. Traditionally, the data storage techniques used for screening data processing were oriented more toward safekeeping data than toward facilitating data analysis. The recent activity in data analysis, from both the commercial and academic sectors, gives rise to cautious optimism that this particular bottleneck in the usage of HTS data might be overcome.
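A minimal sketch of grouping hits by a maximum common substructure with RDKit follows; the three SMILES are hypothetical stand-ins for a validated hit list.

```python
# Sketch: deriving a common structural core for a candidate series via
# maximum common substructure (MCS) search, then collecting its members.
from rdkit import Chem
from rdkit.Chem import rdFMCS

hits = [Chem.MolFromSmiles(s) for s in
        ("CCNc1ccc2[nH]ccc2c1", "CCOc1ccc2[nH]ccc2c1", "CC(=O)c1ccc2[nH]ccc2c1")]
mcs = rdFMCS.FindMCS(hits, timeout=10)   # common core as a SMARTS pattern
core = Chem.MolFromSmarts(mcs.smartsString)
members = [Chem.MolToSmiles(m) for m in hits if m.HasSubstructMatch(core)]
print(mcs.smartsString, members)
```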

References

1. Gorse AD. Diversity in medicinal chemical space. Curr. Top. Med. Chem. 2006; 6:3-18.

2. Andrews KM, Cramer RD. Toward general methods of targeted library design: topomer shape similarity searching with diverse structures as queries. J. Med. Chem. 2000; 43:1723-1740.

3. Fink T, Reymond J-L. Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discovery. J. Chem. Inf. Model. 2007; 47:342-353.

4. Delaney JS. Predicting aqueous solubility from structure. Drug Disc. Today 2005; 10:289-295.

5. Ran Y, Yalkowsky SH. Prediction of drug solubility by the general solubility equation (GSE). J. Chem. Inf. Comput. Sci. 2001; 41:354-357.

6. Leo AJ. Calculating log Poct from structures. Chem. Rev. 1993; 93:1281-1306.

7. McGovern SL, Helfand BT, Feng B, Shoichet BK. A specific mechanism of nonspecific inhibition. J. Med. Chem. 2003; 46:4265-4272.

8. Ertl P, Rohde B, Selzer P. Fast calculation of molecular polar surface area as a sum of fragment-based contributions and its application to the prediction of drug transport properties. J. Med. Chem. 2000; 43:3714-3717.

9. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug. Deliv. Rev. 1997; 23:3-25.

10. Egan WJ, Merz KM, Baldwin JJ. Prediction of drug absorption using multivariate Statistics. J. Med. Chem. 2000; 43:3867-3877.

11. Hann MM, Oprea TI. Pursuing the leadlikeness concept in pharmaceutical research. Curr. Opin. Chem. Biol. 2004; 8:255-263.

12. Rees DC, Congreve M, Murray CW, Carr R. Fragment-based lead discovery. Nature Rev. Drug Disc. 2004; 3:660-672.

13. Rishton GM. Nonleadlikeness and leadlikeness in biochemical screening. Drug Disc. Today 2002; 8:86-96.

14. Charifson PS, Walters WP. Filtering databases and chemical libraries. J. Comput.-Aid. Mol. Design 2002; 16:311-323.

15. Dean PM. Molecular recognition: the measurement and search for molecular similarity in ligand-receptor interaction. In: Concepts and Applications of Molecular Similarity. Maggiora GM, Johnson MA, eds. 1990. John Wiley & Sons, New York. pp. 99-117.

16. Schuffenhauer A, Brown N. Chemical diversity and biological activity. Drug Disc. Today Technol. 2006; 3:387-395.

17. Gibbs AC, Agrafiotis DK. Chemical diversity: definition and quantification. In: Exploiting Chemical Diversity for Drug Discovery. Bartlett PA, Entzeroth M, eds. 2006. RSC Publishing, Cambridge. pp 137-159.

18. Downs GM, Barnard JM. Clustering methods and their uses in computational chemistry. Rev. Comput. Chem. 2002; 18:1-40.

19. Bemis GW, Murcko MA. The properties of known drugs. 1. Molecular frameworks. J. Med. Chem. 1996; 39:2887-2893.

20. Xu YJ, Johnson M. Using molecular equivalence numbers to visually explore structural features that distinguish chemical libraries. J. Chem. Inf. Comput. Sci. 2002; 42:912-926.

21. Schuffenhauer A, Ertl P, Wetzel S, Koch MA, Waldmann H. The scaffold tree-visualization of the scaffold universe by hierarchical scaffold classification. J. Chem. Inf. Model. 2007; 47:47-58.

22. Snarey M, Terrett NK, Willett P. Comparison of algorithms for dissimilarity-based compound selection. J. Mol. Graph. Mod. 1998; 15:372-385.

23. Schuffenhauer A, Brown N, Selzer P, Ertl P, Jacoby E. Relationships between molecular complexity, biological activity, and structural diversity. J. Chem. Inf. Model. 2006; 46:525-535.

24. Schuffenhauer A, Brown N, Ertl P, Jenkins JL, Selzer P, Hamon J. Clustering and rule-based classifications of chemical structures evaluated in the biological activity space. J. Chem. Inf. Model. 2007; 47:325-336.

25. Martin YC, Kofron JL, Traphagen LM. Do structurally similar molecules have similar biological activity? J. Med. Chem. 2002; 45:4350-4358.

26. Barbosa F, Horvath D. Molecular similarity and property similarity. Curr. Top. Med. Chem. 2004; 4:589-600.

27. Harper G, Pickett SD, Green DVS. Design of a compound screening collection for use in high throughput screening. Comb. Chem. HTS 2004; 7:63-70.

28. Hann MM, Leach AR, Harper G. Molecular complexity and its impact on the probability of finding leads for drug discovery. J. Chem. Inf. Comput. Sci. 2001; 41:856-864.

29. Selzer P, Roth HJ, Ertl P, Schuffenhauer A. Complex molecules - do they add value? Curr. Opin. Chem. Biol. 2005; 9:310-316.

30. Schuffenhauer A, Ruedisser S, Marzinzik A, Jahnke W, Selzer P, Jacoby E. Library design for fragment based screening. Curr. Top. Med. Chem. 2005; 5:751-762.

31. Hopkins AL, Groom CR, Alex A. Ligand efficiency: a useful metric for lead selection. Drug Disc. Today 2004; 9:430-431.

32. Hajduk PJ. Fragment-based drug design: how big is too big? J. Med. Chem. 2006; 49:6972-6976.

33. Bajorath J. Integration of virtual and high-throughput screening. Nature Rev. Drug Disc. 2002; 1:882-894.

34. Paolini GV, Shapland RHB, van Hoorn WP, Mason JS, Hopkins AL. Global mapping of pharmacological space. Nature Biotechnol. 2006; 24:805-815.

35. Schuffenhauer A, Floersheim P, Acklin P, Jacoby E. Similarity metrics for ligands reflecting the similarity of the target proteins. J. Chem. Inf. Comput. Sci. 2003; 43:391-405.

36. Frye SV. Structure-activity relationship homology (SARAH); a conceptual framework for drug discovery in the genomic era. Chem. Biol. 1999; 6:R3-R7.

37. Willett P, Barnard JM, Downs GM. Chemical similarity searching. J. Chem. Inf. Comput. Sci. 1998; 38:983-996.

38. Hert J, Willett P, Wilton DJ, Acklin P, Azzaoui K, Jacoby E, Schuffenhauer A. Comparison of fingerprint-based methods for virtual screening using multiple bioactive reference structures. J. Chem. Inf. Comput. Sci. 2004; 44:1177-1185.

39. Byvatov E, Schneider G. SVM-based feature selection for characterization of focused compound collections. J. Chem. Inf. Comput. Sci. 2004; 44:993-999.

40. Auer J, Bajorath J. Emerging chemical patterns: a new methodology for molecular classification and compound selection. J. Chem. Inf. Model. 2006; 46:2502-2514.

41. Xia X, Maliski EG, Galliant P, Rogers D. Classification of kinase inhibitors using a bayesian model. J. Med. Chem. 2004; 47:4463-4470.

42. Zupan J, Gasteiger J. Neural Networks in Chemistry and Drug Design. 1999. Wiley-VCH, Weinheim.

43. von Korff M, Hilpert K. Assessing the predictive power of unsupervised visualization techniques to improve the identification of GPCR-focused compound libraries. J. Chem. Inf. Model. 2006; 46:1580-1587.

44. Selzer P, Ertl P. Applications of self-organizing neural networks in virtual screening and diversity selection. J. Chem. Inf. Model. 2006; 46:2319-2323.

45. Oprea TI, Tropsha A. Target, chemical and bioactivity databases- integration is key. Drug Disc. Today: Technol. 2006; 3:357-365.

46. Kitchen DB, Decornez H, Furr JR, Bajorath J. Docking and scoring in virtual screening for drug discovery: methods and applications. Nat. Rev. Drug. Discov. 2004; 3:935-949.

47. Oloff S, Zhang S, Sukumar N, Breneman C, Tropsha A. Chemometric analysis of ligand receptor complementarity: identifying complementary ligands based on receptor information (CoLiBRI). J. Chem. Inf. Model. 2006; 46:844-851.

48. Jansen JM, Martin EJ. Target-biased scoring approaches and expert systems in structure-based virtual screening. Curr. Opin. Chem. Biol. 2004; 8:359-364.

49. Sun D, Chuaqui C, Deng Z, Bowes S, Chin D, Singh J, Cullen P, Hankins G, Lee WC, Donelly J, Friedmann J, Josiah S. A kinase-focused compound collection: compilation and screening strategy. Chem. Biol. Drug. Des. 2006; 67:385-394.

50. Martin E, Sullivan D. “Surrogate docking” with AUTOSHIM ensembles: using PLS/MAGNET to customize scoring functions for an ensemble of diverse kinases to predict the activity of new kinases, even without crystal structures or homology models. 2006. 232nd ACS National Meeting, San Francisco, CA.

51. Liao JJL. Molecular recognition of protein kinase binding pockets for design of potent and selective kinase inhibitors. J. Med. Chem. 2007; 50:1-16.

52. Harris JC, Stevens AP. Chemogenomics: structuring the drug discovery process to gene families. Drug Disc. Today 2006; 11:880-888.

53. Evans BE, Rittle KE, Bock MG, DiPardo RM, Freidinger RM, Whitter WL, Lundell GF, Veber DF, Anderson PS, Chang RSL, Lotti VJ, Cerino DJ, Chen TB, Kling PJ, Kunkel KA, Springer JP, Hirshfield J. Methods for drug discovery: development of potent, selective, orally effective cholecystokinin antagonists. J. Med. Chem. 1988; 31:2235-2246.

54. Muller G. Medicinal chemistry of target family-directed masterkeys. Drug Disc. Today 2003; 8:681-691.

55. Schnur DM, Hermsmeier MA, Tebben AJ. Are target-family-privileged substructures truly privileged? J. Med. Chem. 2006; 49:2000-2009.

56. Keiser MJ, Roth BL, Armbruster BN, Ernsberger P, Irwin JJ, Shoichet BK. Relating protein pharmacology by ligand chemistry. Nature Biotechnol. 2007; 25:197-206.

57. Koch MA, Wittenberg L-O, Basu S, Jeyaraj DA, Gourzoulidou E, Reinecke K, Odermatt A, Waldmann H. Compound library development guided by protein structure similarity clustering and natural product structure. Proc. Nat. Acad. Sci. U.S.A. 2004; 101:16721-16726.

58. Noren-Muller A, Reis-Correa I, Prinz H, Rosenbaum C, Saxena K, Schwalbe HJ, Vestweber D, Cagna G, Schunk S, Schwarz O, Schiewe H, Waldmann H. Discovery of protein phosphatase inhibitor classes by biology-oriented synthesis. Proc. Nat. Acad. Sci. U.S.A. 2006; 103:10606-10611.

59. Newman DJ, Cragg GM, Snader KM. Natural products as sources of new drugs over the period 1981-2002. J. Nat. Prod. 2003; 66:1022-1037.

60. Firn RD, Jones CG. Natural products - a simple model to explain chemical diversity. Nat. Prod. Rep. 2003; 20:382-391.

61. Inglese J, Auld DS, Jadhav A, Johnson RL, Simeonov A, Yasgar A, Zheng W, Austin CP. Quantitative high-throughput screening: a titration-based approach that efficiently identifies biological activities in large chemical libraries. Proc. Nat. Acad. Sci. U.S.A. 2006; 103:11473-11478.

62. Harper G, Pickett SD. Methods for mining HTS data. Drug Disc. Today 2006; 11:694-699.

63. Heyse S. Comprehensive analysis of high-throughput screening data. Proc. SPIE 2002; 4626:535-547.

64. Davies JW, Glick M, Jenkins JL. Streamlining lead discovery by aligning in silico and high-throughput screening. Curr. Opin. Chem. Biol. 2006; 10:343-351.

65. Fay N. The role of the discovery informatics framework in early lead discovery. Drug Disc. Today 2006; 11:1075-1084.

66. Irwin JJ, Shoichet B. ZINC - A free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 2005; 45:177-182.

67. Fara DC, Oprea TI, Prossnitz ER, Bologa CG, Edwards BS, Sklar LA. Integration of virtual and physical screening. Drug Disc. Today: Technol. 2006; 3:377-385.

68. Hert J, Willett P, Wilton DJ, Addin P, Azzaoui K, Jacoby E, Schuffenhauer A. Comparison of topological descriptors for similarity-based virtual screening using multiple bioactive reference structures. Org. Biomol. Chem. 2004; 2:3256-3266.

69. Warren GL, Andrews CW, Capelli A-M, Clarke B, LaLonde J, Lambert MH, Lindvall M, Nevins N, Semus SF, Senger S, Tedesco G, Wall ID, Woolven JM, Peishoff CE, Head MS. A critical assessment of docking programs and scoring functions. J. Med. Chem. 2006; 49:5912-5931.

70. Engels MFM, Venkatatangam P. Smart screening: approaches to efficient HTS. Curr. Opin. Drug. Disc. Dev. 2001; 4:275-283.

71. Crisman TJ, Jenkins JL, Parker CN, Hill WAG, Bender A, Deng Z, Nettles JH, Davies JW, Glick M. "Plate cherry picking": a novel semi-sequential screening paradigm for cheaper, faster, information-rich compound selection. J. Biomol. Screen. 2007; 12:320-327.

72. Raymond JW, Kibbey CE. An automated method for exploring targeted substructural diversity within sets of chemical structures. J. Chem. Inf. Model. 2005; 45:1195-1204.

73. Tamura SY, Bacha PA, Gruver HS, Nutt RF. Data analysis of high-throughput screening results: application of multidomain clustering to the NCI anti-HIV data set. J. Med. Chem. 2002; 45:3082-3093.

74. Roberts G, Myatt GJ, Johnson WP, Cross KP, Blower PE Jr. LeadScope: software for exploring large sets of screening data. J. Chem. Inf. Comput. Sci. 2000; 40:1302-1314.

75. Howe TJ, Mahieu G, Marichal P, Tabruyn T, Vugts P. Data reduction and presentation in drug discovery. Drug Disc. Today 2007; 12:45-53.

76. Agrafiotis DK, Bandyopadhyay D, Farnum M. Radial cluster-grams: visualizing the aggregate properties of hierarchical clusters. J. Chem. Inf. Model. 2007; 47:69-75.

Further Reading

Bartlett PA, Entzeroth M, eds. Exploiting Chemical Diversity for Drug Discovery. 2006. RSC Publishing, Cambridge.

Jacoby E, ed. Chemogenomics. Knowledge-based Approaches to Drug Discovery. 2006. Imperial College Press, London.

Lipinski C, Hopkins A. Navigating chemical space for biology and medicine. Nature 2004; 432:855-861.

Oprea TI, Davis AM, Teague SJ. Is there a difference between leads and drugs? A historical perspective. J. Chem. Inf. Comput. Sci. 2001; 41:1308-1315.

See Also

Combinatorial Libraries: Overview of Applications in Chemical Biology

Compound Handling

Computational Approaches in Drug Discovery and Development

High Throughput Screening (HTS) Techniques: Overview of Applications in Chemical Biology

Lead Optimization in Drug Discovery

Natural Products to Probe Biosynthetic Pathways

Natural Products: An Overview

Small Molecule Combinatorial Libraries

Small Molecule, Drug-Target Databases

Target Family-Biased Compound Library: Optimization, Target Selection and Validation