Structure-activity relationship (SAR) analysis is the use of information on the molecular structure of chemicals to predict important characteristics related to persistence, distribution, uptake and absorption, and toxicity. SAR is an alternative method of identifying potentially hazardous chemicals, which holds promise of assisting industries and governments in prioritizing substances for further evaluation or in early-stage decision making for new chemicals.
Toxicology is an increasingly expensive and resource-intensive undertaking. Increased concerns over the potential for chemicals to cause adverse effects in exposed human populations have prompted regulatory and health agencies to expand the range and sensitivity of tests to detect toxicological hazards. At the same time, the real and perceived burdens of regulation upon industry have provoked concerns about the practicality of toxicity testing methods and data analysis. At present, the determination of chemical carcinogenicity depends upon lifetime testing of at least two species, both sexes, at several doses, with careful histopathological analysis of multiple organs, as well as detection of preneoplastic changes in cells and target organs. In the United States, the cancer bioassay is estimated to cost in excess of $3 million (1995 dollars).
Even with unlimited financial resources, the burden of testing the approximately 70,000 existing chemicals produced in the world today would exceed the available resources of trained toxicologists. Centuries would be required to complete even a first tier evaluation of these chemicals (NRC 1984). In many countries ethical concerns over the use of animals in toxicity testing have increased, bringing additional pressures upon the uses of standard methods of toxicity testing.
SAR has been widely used in the pharmaceutical industry to identify molecules with potential for beneficial use in treatment (Hansch and Zhang 1993). In environmental and occupational health policy, SAR is used to predict the dispersion of compounds in the physical-chemical environment and to screen new chemicals for further evaluation of potential toxicity. Under the US Toxic Substances Control Act (TSCA), the EPA has used an SAR approach since 1979 as a “first screen” of new chemicals in the premanufacture notification (PMN) process; Australia uses a similar approach as part of its new chemicals notification (NICNAS) procedure. In the US, SAR analysis is an important basis for determining whether there are reasonable grounds to conclude that manufacture, processing, distribution, use or disposal of the substance will present an unreasonable risk of injury to human health or the environment, as required by Section 5(f) of TSCA. On the basis of this finding, EPA can then require actual tests of the substance under Section 6 of TSCA.
Rationale for SAR
The scientific rationale for SAR is based upon the assumption that the molecular structure of a chemical will predict important aspects of its behaviour in physical-chemical and biological systems (Hansch and Leo 1979).
SAR Process
The SAR review process includes identification of the chemical structure, including empirical formulations as well as the pure compound; identification of structurally analogous substances; searching databases and literature for information on structural analogs; and analysis of toxicity and other data on structural analogs. In some rare cases, information on the structure of the compound alone can be sufficient to support some SAR analysis, based upon well-understood mechanisms of toxicity. Several databases on SAR have been compiled, as well as computer-based methods for molecular structure prediction.
With this information, the following endpoints can be estimated with SAR:
- physical-chemical parameters: boiling point, vapour pressure, water solubility, octanol/water partition coefficient
- biological/environmental fate parameters: biodegradation, soil sorption, photodegradation, pharmacokinetics
- toxicity parameters: aquatic organism toxicity, absorption, acute mammalian toxicity (limit test or LD50), dermal, lung and eye irritation, sensitization, subchronic toxicity, mutagenicity.
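The analog-identification step of the review process described above can be illustrated with a minimal Python sketch. It assumes each chemical has already been reduced to a set of hand-assigned structural feature labels and ranks library compounds against a query using the Tanimoto (Jaccard) coefficient. The feature labels, the small library, the query and the 0.5 threshold are all hypothetical choices made for illustration. This is not the procedure used by any regulatory agency.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_analogs(
    query: FrozenSet[str],
    library: Dict[str, FrozenSet[str]],
    threshold: float = 0.5,  # hypothetical cut-off, chosen for illustration
) -> List[Tuple[str, float]]:
    """Return library entries at or above the threshold, most similar first."""
    scored = [(name, tanimoto(query, feats)) for name, feats in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Hypothetical, hand-assigned feature sets (illustrative labels, not measured data).
LIBRARY: Dict[str, FrozenSet[str]] = {
    "aniline": frozenset({"aromatic_ring", "primary_amine"}),
    "nitrobenzene": frozenset({"aromatic_ring", "nitro"}),
    "4-nitroaniline": frozenset({"aromatic_ring", "primary_amine", "nitro"}),
    "ethanol": frozenset({"hydroxyl", "alkyl_chain"}),
}

# Hypothetical query: an aromatic amine carrying a methyl substituent.
query = frozenset({"aromatic_ring", "primary_amine", "methyl"})

analogs = rank_analogs(query, LIBRARY)
for name, score in analogs:
    print(f"{name}: {score:.2f}")
```

In this toy example, toxicity and other data on the highest-ranked analogs would then be retrieved and reviewed, as in the final step of the process. Real systems derive structural features from the full chemical structure, not from hand-written labels.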
It should be noted that SAR methods do not exist for such important health endpoints as carcinogenicity, developmental toxicity, reproductive toxicity, neurotoxicity, immunotoxicity or other target organ effects. This is due to three factors: the lack of a large database upon which to test SAR hypotheses, lack of knowledge of structural determinants of toxic action, and the multiplicity of target cells and mechanisms that are involved in these endpoints (see “The United States approach to risk assessment of reproductive toxicants and neurotoxic agents”). Some limited attempts have been made to utilize SAR for predicting pharmacokinetics using information on partition coefficients and solubility (Johanson and Naslund 1988). More extensive quantitative SAR has been done to predict P450-dependent metabolism of a range of compounds and binding of dioxin- and PCB-like molecules to the cytosolic “dioxin” receptor (Hansch and Zhang 1993).
SAR has been shown to have varying predictability for some of the endpoints listed above, as shown in table 1. This table presents data from two comparisons of predicted activity with actual results obtained by empirical measurement or toxicity testing. SAR as conducted by US EPA experts performed more poorly for predicting physical-chemical properties than for predicting biological activity, including biodegradation. For toxicity endpoints, SAR performed best for predicting mutagenicity. Ashby and Tennant (1991) in a more extended study also found good predictability of short-term genotoxicity in their analysis of NTP chemicals. These findings are not surprising, given current understanding of molecular mechanisms of genotoxicity (see “Genetic toxicology”) and the role of electrophilicity in DNA binding. In contrast, SAR tended to underpredict systemic and subchronic toxicity in mammals and to overpredict acute toxicity to aquatic organisms.
Table 1. Comparison of SAR and test data: OECD/NTP analyses
Endpoint | Agreement (%) | Disagreement (%) | Number |
Boiling point | 50 | 50 | 30 |
Vapour pressure | 63 | 37 | 113 |
Water solubility | 68 | 32 | 133 |
Partition coefficient | 61 | 39 | 82 |
Biodegradation | 93 | 7 | 107 |
Fish toxicity | 77 | 22 | 130 |
Daphnia toxicity | 67 | 33 | 127 |
Acute mammalian toxicity (LD50) | 80 | 20¹ | 142 |
Skin irritation | 82 | 18 | 144 |
Eye irritation | 78 | 22 | 144 |
Skin sensitization | 84 | 16 | 144 |
Subchronic toxicity | 57 | 32 | 143 |
Mutagenicity² | 88 | 12 | 139 |
Mutagenicity³ | 82–94⁴ | 1–10 | 301 |
Carcinogenicity³: Two year bioassay | 72–95⁴ | — | 301 |
Source: Data from OECD, personal communication, C. Auer, US EPA. Only those endpoints for which comparable SAR predictions and actual test data were available were used in this analysis. NTP data are from Ashby and Tennant 1991.
¹ Of concern was the failure by SAR to predict acute toxicity in 12% of the chemicals tested.
² OECD data, based on Ames test concordance with SAR.
³ NTP data, based on genetox assays compared to SAR predictions for several classes of “structurally alerting chemicals”.
⁴ Concordance varies with class; highest concordance was with aromatic amino/nitro compounds; lowest with “miscellaneous” structures.
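To compare groups of endpoints in Table 1 on a common footing, the per-endpoint agreement percentages can be pooled, weighting each by the number of chemicals tested. The short Python sketch below does this for the physical-chemical endpoints and for biodegradation, using only the values printed in the table. The function name and the grouping of rows are choices made here for illustration. Because the published percentages are rounded, the pooled figures are approximate.

```python
from typing import List, Tuple

# (endpoint, agreement %, number of chemicals), copied from the OECD rows of Table 1.
Row = Tuple[str, float, int]

PHYSICAL_CHEMICAL: List[Row] = [
    ("Boiling point", 50, 30),
    ("Vapour pressure", 63, 113),
    ("Water solubility", 68, 133),
    ("Partition coefficient", 61, 82),
]
BIODEGRADATION: List[Row] = [("Biodegradation", 93, 107)]


def pooled_agreement(rows: List[Row]) -> float:
    """Agreement percentage averaged over rows, weighted by chemicals tested."""
    total = sum(n for _, _, n in rows)
    weighted = sum(pct * n for _, pct, n in rows)
    return weighted / total


print(f"Physical-chemical endpoints: {pooled_agreement(PHYSICAL_CHEMICAL):.1f}%")
print(f"Biodegradation: {pooled_agreement(BIODEGRADATION):.1f}%")
```

On these figures, pooled agreement for the physical-chemical endpoints is about 63%, against 93% for biodegradation, which is consistent with the comparison drawn in the text.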
For other toxic endpoints, as noted above, SAR has less demonstrable utility. Mammalian toxicity predictions are complicated by the lack of SAR for toxicokinetics of complex molecules. Nevertheless, some attempts have been made to propose SAR principles for complex mammalian toxicity endpoints (for instance, see Bernstein (1984) for an SAR analysis of potential male reproductive toxicants). In most cases, the database is too small to permit rigorous testing of structure-based predictions.
At this point it may be concluded that SAR is useful mainly for prioritizing the investment of toxicity testing resources or for raising early concerns about potential hazard. Only in the case of mutagenicity is it likely that SAR analysis by itself can be utilized with reliability to inform other decisions. For no endpoint is it likely that SAR can provide the type of quantitative information required for risk assessment purposes, as discussed elsewhere in this chapter and in this Encyclopaedia.