ToxSci Advance Access originally published online on August 3, 2007
Toxicological Sciences 2007 99(2):532-544; doi:10.1093/toxsci/kfm185

© The Author 2007. Published by Oxford University Press on behalf of the Society of Toxicology. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Categorical QSAR Models for Skin Sensitization based upon Local Lymph Node Assay Classification Measures Part 2: 4D-Fingerprint Three-State and Two-2-State Logistic Regression Models

Yi Li*, Dahua Pan*, Jianzhong Liu†,‡, Petra S. Kern§, G. Frank Gerberick||, Anton J. Hopfinger†,‡ and Yufeng J. Tseng,‡,1

* Laboratory of Molecular Modeling and Design (MC 781), College of Pharmacy, University of Illinois at Chicago, Chicago, Illinois 60612-7231 † College of Pharmacy, SC09 5360, University of New Mexico, Albuquerque, New Mexico 87131-0001 ‡ The Chem21 Group, Inc., Lake Forest, Illinois 60045 § Procter & Gamble Eurocor, B-1853 Strombeek-Bever, Belgium || The Procter & Gamble Company, Miami Valley Innovation Center, Cincinnati, Ohio 45253-8707 Graduate Institute of Biomedical Engineering and Bioinformatics, Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan 106

1 To whom correspondence should be addressed at Graduate Institute of Biomedical Engineering and Bioinformatics, Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, Taiwan 106. Fax: +886-2-236-28167. E-mail: yjtseng@csie.ntu.edu.tw.

Received May 4, 2007; accepted July 9, 2007


    ABSTRACT
Three- and four-state categorical quantitative structure–activity relationship (QSAR) models for skin sensitization have been constructed using data from murine Local Lymph Node Assay studies. These are the same data we previously used to build two-state (sensitizer, nonsensitizer) QSAR models (Li et al., 2007, Chem. Res. Toxicol. 20, 114–128). 4D-fingerprint descriptors derived from the 4D-molecular similarity paradigm are used to generate these models. A training set of 196 and a test set of 22 structurally diverse compounds were used in this study. Logistic regression and partial least squares coupled logistic regression were used to build the models. The three-state QSAR model gives a classification accuracy of 73.4% for the training set and 63.6% for the test set, while the random average value of classification accuracy for any three-state data set is 33.3%. The two-2-state [four categories in total] QSAR model gives a classification accuracy of 83.2% for the training set and 54.6% for the test set, while the random average value of classification accuracy for any two-2-state data set is 25%. An analysis of the skin-sensitization models developed in this study, as well as the two-state QSAR models developed in our previous analysis, suggests that the "moderate" sensitizers may be the main source of limited model accuracy.

Key Words: skin sensitization; QSAR; logistic regression; 4D-fingerprints; categorical models.


    INTRODUCTION
The only validated methods currently available to identify skin sensitization effects are in vivo models such as the murine Local Lymph Node Assay (LLNA) (Basketter et al., 2000; Kimber and Basketter, 1992). However, there is a tremendous need to develop new approaches as alternatives to animal testing strategies in general, and for skin sensitization in particular. Quantitative structure–activity relationships (QSARs) are increasingly seen as playing a role in compound evaluation and screening, and are considered an important alternative for the estimation of toxicity effects.

Allergic contact dermatitis (ACD) was discussed in detail in our previous paper (Li et al., 2007), and here we focus only upon work done in establishing ACD cause–effect and structure–activity relationships. Landsteiner and Jacobs (1936) pioneered the linking of the occurrence of ACD with the protein reactivity of allergens. Such hapten–protein reactions usually involve the allergen, or its metabolite, acting as an electrophile, and a protein acting as a nucleophile. Based on this concept, structure–activity relationships (SARs) could be established for a variety of chemical classes that relate skin sensitization potential to physicochemical parameters using a variety of sensitization assay data.

Published QSARs have been largely developed for individual classes of chemicals. Moreover, structure-skin sensitization correlations have been derived mainly for homologous series of chemicals (Basketter et al., 1992; Roberts, 1987; Roberts et al., 1991). Roberts and Williams established SARs based on electrophilicity and hydrophobicity parameters using the relative alkylation index. This approach, and its corresponding models, have been used to evaluate data on various sets of skin sensitizing chemicals, including alkylating agents (Roberts and Basketter, 2000), sulfonate esters and acrylates (Roberts, 1987), α,β-diketones (Roberts et al., 1999), aldehydes (Patlewicz et al., 2001), and other classes of chemicals (Roberts and Williams, 1982).

Overall, skin-sensitization modeling can be divided into two major categories: mechanism-based and data correlation-based. Most of the studies carried out until the early 1990s belong to the former category (Basketter and Roberts, 1990; Roberts and Basketter, 1990; Roberts and Benezra, 1993). The principal limitation of these types of predictive QSAR studies is that they can only be reliably applied to homologous sets of chemicals, which, in effect, ensures that the chemicals share a common mechanism of action. Predictive QSAR models for heterogeneous, structurally diverse databases have been sought since the middle 1990s using categorical statistical methods. Discriminant analysis is the most widely adopted categorical method (Cronin and Basketter, 1994; Enslein et al., 1997; Magee et al., 1994), while logistic regression (LR) is receiving increasing attention (Fedorowicz et al., 2004). LR requires few assumptions, in theory, and is easier to use and understand than discriminant analysis (Fedorowicz et al., 2004).

Development of the LLNA has facilitated the use of QSARs to predict skin sensitization potential by providing a standardized, continuous, quantitative scale for the assessment of skin sensitization (Basketter et al., 1999). From the dose–response curve, the EC3 value (the estimated concentration of a test material required to produce a stimulation index of 3) can be calculated and used as a response variable for biological activity QSAR modeling. In addition, the skin sensitization potency of a chemical can be categorized into potency classes according to its EC3 value. Consequently, categorical discrimination across potency classes, e.g., non-, weak, moderate, strong, and extreme sensitizers, can be assigned on the basis of outcomes from the LLNA.

Recently, we reported the development of two-state (sensitizer, nonsensitizer) categorical QSAR models based upon an LLNA database (Li et al., 2007). These two-state models fit the training set data with accuracies ranging from 87.1 to 89.4%. Correspondingly, these models performed, on a representative test set, with prediction accuracies ranging from 80.0 to 86.7%.

A set of universal descriptors, called 4D-fingerprints, were used to develop the two-state (sensitizer, nonsensitizer) categorical QSAR models. The 4D-fingerprints (Senese et al., 2004) are derived from a methodology called 4D-molecular similarity analysis (Duca and Hopfinger, 2001), which is based upon the 4D-QSAR paradigm (Hopfinger et al., 1997) pioneered in our laboratory. Each "finger" of the 4D-fingerprints corresponds to a particular atom/pharmacophore pair type in a molecule. Moreover, the 4D-fingerprints not only capture conformational ensemble, molecular size, and chemical structure information of the molecule, but can also be determined independent of molecular alignment. The 4D-fingerprints have again been used in this study to generate the trial descriptor pool for the construction of the three-state and two-2-state categorical QSAR models. Generation of these models was accomplished using both LR and partial least squares coupled logistic regression (PLS-CLR) (Nguyen and Rocke, 2002).


    MATERIALS AND METHODS
The LLNA Database
The LLNA training and test sets used in this study are the same as used in our previous study (Li et al., 2007) and taken from a master skin sensitization database constructed from data contributions made by a set of interested organizations (Gerberick et al., 2005). Metal ions and mixtures in the master database were excluded in selecting compounds for the studies reported here. This left a pruned database of 218 compounds with EC3 values, which are correspondingly categorized as non-, weak-, moderate-, strong-, and extreme-skin sensitizers.

Training and Test Sets for the Three-State QSAR Models
The pruned database was reclassified into three categories: 101 "non–weak" sensitizers, 71 "moderate" sensitizers, and 46 "strong–extreme" sensitizers, which have been further divided into the training set composed of 196 compounds, and the test set composed of 22 compounds. Table 1 summarizes the distribution of compounds across the pruned database, and the corresponding training and test sets used to perform the three-state categorical QSAR modeling. The three-state QSAR models are built for a three-state dependent variable having a value of "1" for "non and weak" sensitizers, a value of "2" for "moderate" sensitizers, and a value of "3" for "strong and extreme" sensitizers.
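The relabeling described above can be sketched in Python as follows. The dictionary and function names are illustrative assumptions and are not part of the original study; only the class codes and the category counts are taken from the text.

```python
# Hypothetical mapping from the five LLNA potency classes to the
# three-state dependent variable described in the text.
THREE_STATE = {
    "non": 1, "weak": 1,        # "non and weak" sensitizers
    "moderate": 2,              # "moderate" sensitizers
    "strong": 3, "extreme": 3,  # "strong and extreme" sensitizers
}


def three_state_label(potency: str) -> int:
    """Return 1, 2, or 3 for an LLNA potency class name."""
    return THREE_STATE[potency.lower()]


# Category sizes reported for the pruned database.
counts = {"non-weak": 101, "moderate": 71, "strong-extreme": 46}
total = sum(counts.values())  # size of the pruned database
```

The category sizes sum to the 218 compounds of the pruned database, which equals the 196 training plus 22 test compounds.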


TABLE 1 Number of Compounds in Each Skin Sensitization Potency Category for the Training Set and the Test Set for Three-State Categorical QSAR Modeling

 
Training and Test Sets for the Two-2-State QSAR Models
The skin sensitization database for two-2-state model construction is the same pruned database used for three-state model building. However, the pruned database has been reclassified in two steps.

In the first step, compounds in the database are categorized as 101 "non–weak" sensitizers, and 117 "moderate–strong–extreme" sensitizers, which then are further divided into the training set and the test set, as shown and defined in Table 1. Values for the dependent variables have been assigned based on the skin sensitization potency of each compound, that is, a value of "0" for "non and weak" sensitizers, and a value of "1" for "moderate, strong, and extreme" sensitizers.

In the second step, compounds are selected from the database if their skin sensitization potencies are predicted correctly using the two-state PLS-CLR QSAR model (see below) developed in the first step. For the selected "non–weak" sensitizers, values of the dependent variable have again been assigned based upon the skin sensitization potency of each compound, that is, a value of "0" for a nonsensitizer, and a value of "1" for a weak sensitizer. These data are then used to build one of the two second-step two-state categorical models.

Similarly, for the selected "moderate, strong, and extreme" sensitizers, values for the dependent variables have been assigned based on the skin sensitization potency of each compound, that is, a value of "0" for moderate sensitizers, and a value of "1" for the "strong and extreme" sensitizers. These data are correspondingly used to build the other two-state categorical model of the second step in the model building scheme.
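A minimal sketch of the two-step coding of the dependent variables is given below. The function name and the string labels are hypothetical; the 0/1 codes follow the assignments described in the text.

```python
from typing import Tuple

POTENCIES = ("non", "weak", "moderate", "strong", "extreme")


def two_2_state_labels(potency: str) -> Tuple[int, int]:
    """Return (first-step label, second-step label) for a potency class."""
    p = potency.lower()
    if p not in POTENCIES:
        raise ValueError(f"unknown potency class: {potency!r}")
    # Step 1: 0 = "non and weak", 1 = "moderate, strong, and extreme".
    step1 = 0 if p in ("non", "weak") else 1
    if step1 == 0:
        # Step 2, lower branch: 0 = nonsensitizer, 1 = weak sensitizer.
        step2 = 0 if p == "non" else 1
    else:
        # Step 2, upper branch: 0 = moderate, 1 = "strong and extreme".
        step2 = 0 if p == "moderate" else 1
    return step1, step2
```

For example, a moderate sensitizer is coded (1, 0) and a weak sensitizer (0, 1), so the pair of labels identifies one of the four categories.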

4D-Fingerprint Formalism and Methodology
The theory and formalism underlying the 4D-molecular fingerprints (4D-FP) (Senese et al., 2004) were presented in our previous paper on two-state (sensitizer, nonsensitizer) categorical QSAR models for skin sensitization (Li et al., 2007). The formalism and methodology are only summarized here.

All 3D structures of the compounds in the LLNA database were built using the Chemlab-II molecular modeling package (Pearlstein, 1998). Molecular dynamics simulations, using the Molsim program (Doherty, 2001), were then carried out on each molecule to perform a conformational ensemble sampling of the set of compounds. A total of 1000 conformations of each molecule were sampled. Such conformational sampling constitutes part of the pseudo "fourth dimension" of the 4D methodology.

The molecules are divided into their "functional pieces," called interaction pharmacophore elements (IPEs), as defined in Table 2 (Senese et al., 2004). The IPE 4D-fingerprint descriptors are eigenvalues from the eigenvectors determined for a molecule from its absolute molecular similarity main distance-dependent matrix (MDDM) (Duca and Hopfinger, 2001). This matrix captures the intrinsic size, shape, and conformational flexibility of the molecules, and it is constructed for each IPE pair. The elements of the MDDM are defined as

Formula (1)
where λ is a constant determined by maximizing the sum of differences of the eigenvalues for any two arbitrary molecules with the same number, N, of IPE type (Duca and Hopfinger, 2001). The term <dij> refers to the average distance between the atom pair i,j of IPE types u and v, such that

$$\langle d_{ij} \rangle = \sum_{k} p(k)\, d_{ij}(k) \qquad (2)$$
where p(k) is the thermodynamic probability of the kth conformer state sampled in the assessment of conformational flexibility, and dij(k) is the corresponding distance between atom pair i and j of IPE type u and v for the kth conformer state.
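The conformer-weighted average distance described for Equation 2 can be illustrated with a short Python sketch. The function name and the toy numbers are assumptions made for illustration; the division by the sum of the probabilities is an added guard in case the supplied weights are not exactly normalized.

```python
def mean_pair_distance(distances, probabilities):
    """Probability-weighted average of d_ij(k) over sampled conformers k."""
    if len(distances) != len(probabilities):
        raise ValueError("one probability is needed per conformer distance")
    total_p = sum(probabilities)
    if total_p <= 0:
        raise ValueError("probabilities must sum to a positive value")
    weighted = sum(p * d for p, d in zip(probabilities, distances))
    return weighted / total_p


# Toy example: three conformers of one atom pair (distances in angstroms).
d_ij = [3.0, 4.0, 5.0]
p_k = [0.5, 0.3, 0.2]
avg = mean_pair_distance(d_ij, p_k)  # 0.5*3.0 + 0.3*4.0 + 0.2*5.0
```

In the study itself the sum would run over the sampled conformers of each molecule rather than three hand-picked values.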


TABLE 2 IPEs Currently Used in the 4D-QSAR Paradigm

 
For the same-type IPE, i.e., u = v, the MDDMs can be directly diagonalized. To calculate the n normalized eigenvalues for IPE type m of compound α, ε_mn(α), the nonscaled eigenvalues ε'_mn(α) are scaled relative to the rank of the MDDM

Formula (3)

Thus, ε_1,2(5) corresponds to the second eigenvalue of the MDDM for IPE type 1 (nonpolar atom) of compound 5.

When the IPE types are different (u ≠ v), the following square MDDMs are constructed:

Formula (4)
and for [nv x nu]

Formula (5)

Since the same rank and trace are present in Equations 4 and 5, both MDDM (u, u) and MDDM (v, v) have the same set of eigenvalues. Consequently, for each pair of IPEs, where u ≠ v

Formula (6)

The 4D-fingerprint descriptor set, for each compound in the training and test sets, comprises all the eigenvalues of all IPE eigenvector pairs for the compound. There are 36 possible combinations of the eight IPE types when the distinct cross-IPE terms (u ≠ v) are included. A threshold cutoff value of 0.001 was applied in this study, and those normalized eigenvalues below the threshold value were disregarded.
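The count of 36 IPE pairs and the eigenvalue cutoff can be checked with a brief Python sketch. The integer IPE indices and the helper name `apply_cutoff` are illustrative assumptions; treating a value exactly equal to the cutoff as retained is also an assumption, since the text only says that values below the threshold are disregarded.

```python
from itertools import combinations_with_replacement

N_IPE_TYPES = 8

# Unordered pairs (u, v) with u <= v: 8 same-type pairs plus 28 cross pairs.
ipe_pairs = list(combinations_with_replacement(range(1, N_IPE_TYPES + 1), 2))
same_type = [pair for pair in ipe_pairs if pair[0] == pair[1]]
cross_type = [pair for pair in ipe_pairs if pair[0] != pair[1]]


def apply_cutoff(eigenvalues, cutoff=0.001):
    """Keep normalized eigenvalues that are not below the cutoff."""
    return [value for value in eigenvalues if value >= cutoff]


kept = apply_cutoff([0.5, 0.0005, 0.001, 0.02])
```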

Construction of the 4D-Fingerprint Trial Descriptor Pool
The construction of the 4D-fingerprints is similar to that done in our previous work on the development of two-state (sensitizer, nonsensitizer) categorical QSAR models for skin sensitization (Li et al., 2007). A set of 729 4D-fingerprints has been calculated for the entire LLNA database. Based on the results of our two-state categorical QSAR modeling (Li et al., 2007), autoscaling of the descriptors is preferred if PLS fitting is involved in the construction of QSAR models (Glen et al., 1989). Therefore, the universal autoscaled matrix (USMAX), consisting of 729 autoscaled 4D-fingerprints, has been used as the independent descriptor pool for QSAR model construction.
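Autoscaling is commonly understood as centering each descriptor column to zero mean and scaling it to unit variance. The sketch below assumes that definition and uses the population standard deviation; whether the study used the population or the sample form is not stated, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def autoscale(column):
    """Center a descriptor column to zero mean and scale to unit variance."""
    mu = mean(column)
    sd = pstdev(column)
    if sd == 0:
        # A constant descriptor carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sd for x in column]


scaled = autoscale([1.0, 2.0, 3.0])
```

Applying such a transformation column by column to the 729 descriptors would give a matrix in which all descriptors are on a comparable scale before PLS components are extracted.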

Statistical Data-Fitting and Model Optimization Tools
Three-state model building.
PLS-CLR, described in our previous paper on two-state skin sensitization QSAR models (Li et al., 2007), has been used as the statistical modeling technique for the three-state QSAR study. The effect-selection method for model construction is stepwise selection, where the number of variables in a resulting model can be changed by adjusting the significance level of variable entry into, or remaining in, a model.

There are three predicted probabilities for each compound, corresponding to a "non–weak," a "moderate," and a "strong–extreme" sensitizer, respectively. The largest value among these three probabilities defines the predicted skin sensitization potency of a compound. For example, if the three values of predicted probabilities of a compound being a "non–weak," a "moderate," and a "strong–extreme" sensitizer, are 0.300, 0.600, 0.100, the compound is classified as a "moderate" skin sensitizer. The classification accuracy of a PLS-CLR three-state model is defined as

$$\text{classification accuracy} = \frac{t_1 + t_2 + t_3}{\text{total}} \times 100\% \qquad (7)$$
where t1, t2, and t3 are, respectively, the number of compounds for which both the observed and the predicted skin sensitization potencies are "non–weak," "moderate," and "strong–extreme," and "total" represents the total number of compounds in the corresponding database. Therefore, the classification accuracy is actually the percentage of the compounds in the database correctly classified.
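The assignment rule and the accuracy measure can be sketched together in Python. The function names and the small observed/predicted lists are illustrative assumptions; the probability triple is the example given in the text.

```python
LABELS = ("non-weak", "moderate", "strong-extreme")


def classify(probabilities):
    """Assign the category with the largest predicted probability."""
    best = max(range(len(LABELS)), key=lambda i: probabilities[i])
    return LABELS[best]


def classification_accuracy(observed, predicted):
    """Percentage of compounds whose predicted category matches the observed one."""
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


example = classify((0.300, 0.600, 0.100))  # the worked example from the text

# Toy data: three of four compounds are classified correctly.
observed = ["non-weak", "moderate", "strong-extreme", "moderate"]
predicted = ["non-weak", "moderate", "moderate", "moderate"]
toy_accuracy = classification_accuracy(observed, predicted)
```

Summing the matches over all compounds is equivalent to adding t1, t2, and t3, because every correct prediction falls into exactly one of the three categories.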

For a three-state model, the classification accuracy for the training set compounds is an indication of the goodness of fit of the model. The higher the value of the classification accuracy of the training set, the better the goodness of fit of the model. Similarly, a high classification accuracy of the test set indicates high prediction capability. In these three-state models, as well as in the two-2-state models discussed below, the cutoff value for each model was selected on the basis of maximizing the predictivity across the corresponding training set.

Two-2-state QSAR models.
The USMAX, which consists of 729 autoscaled 4D-fingerprints, is the same as that of the three-state model. This matrix is used as the independent descriptor pool for the construction of the two-2-state QSAR models. In the first step of building the two-2-state models, PLS-CLR is applied in a similar fashion to that used in building the two-state sensitizer, nonsensitizer categorical QSAR models in our previous study (Li et al., 2007). For a certain cutoff value, according to the value of the predicted probability for a compound being a "moderate–strong–extreme" sensitizer, the skin sensitization potency of a compound can be predicted either as "non–weak" if the predicted probability is lower than the cutoff value, or "moderate–strong–extreme" if the predicted probability is higher than the cutoff value.

In the second step of building the two-2-state models, compounds were selected from the database if their skin sensitization potencies were predicted correctly using the two-state PLS-CLR QSAR model constructed in the first step. For the selected "non–weak" sensitizers, the values of the dependent variable have been assigned based on the skin sensitization potency of each compound, that is, a value of "0" for a nonsensitizer, and a value of "1" for a weak sensitizer. The USMAX, which consists of 729 autoscaled 4D-fingerprints of the selected "non–weak" sensitizers, made up the descriptor pool, and PLS-CLR was again the statistical method used in the construction of the two-2-state QSAR models. For a certain cutoff value, according to the value of the predicted probability for a compound being a weak sensitizer, the skin sensitization potency of a compound can be predicted either as "non" or "weak". Similarly, for the selected "moderate–strong–extreme" sensitizers, the values of the dependent variable have been assigned based on the skin sensitization potency of each compound. That is, a value of "0" was assigned to a moderate sensitizer, while a value of "1" was given to a "strong–extreme" sensitizer. The corresponding USMAX again was the descriptor pool for the construction of the two-2-state QSAR models, also using PLS-CLR. For a certain cutoff value, according to the value of the predicted probability for a compound being a "strong–extreme" sensitizer, the skin sensitization potency of a compound can be predicted either as "moderate" if the predicted probability is lower than the cutoff value, or "strong–extreme" if the predicted probability is higher than the cutoff value.

The goodness of fit of a two-2-state QSAR model can be evaluated using either the Hosmer–Lemeshow goodness-of-fit statistic, χ², or the classification accuracy for the training set compounds (Agresti, 2002; Hosmer and Lemeshow, 2000). A lower value of χ², or a higher value of classification accuracy for the training set, indicates a better goodness of fit of the model. Similarly, the higher the value of the classification accuracy for the test set, the better the predictive capability of a two-2-state QSAR model.


    RESULTS
Three-State QSAR Models
By varying the significance level of a descriptor's entry into and retention within a model, different numbers of descriptor terms can be realized in the resulting PLS-CLR three-state QSAR models. As shown in Table 3, when there are 19 terms in the three-state models, the models give the largest measure of classification accuracy for the training set, 91.3%, but the lowest value of classification accuracy for the test set, 45.6%. This contrary behavior indicates the 19-term three-state models "overfit" the data. In contrast, the eight-term three-state QSAR models give a moderate value of classification accuracy for the training set, 73.4%, but the highest value of classification accuracy for the test set, 63.6%. Since the random average value of classification accuracy for any three-state data set is 33.3%, the eight-term three-state QSAR models fit the training set data well, and also have meaningful predictive capability.


TABLE 3 Summaries of the Performance Measures of the PLS-CLR Three-State QSAR Models Having Different Numbers of Model Terms

 
The eight-term PLS-CLR three-state QSAR models are as shown below:

Formula (8)

Formula (9)
where xscr_i is the ith component extracted from the USMAX using PLS analysis, and P(Y ≤ N|X) is the response probability of success in the linear logistic model, where N = 1 if a compound belongs to the strong- or extreme-skin sensitizer category, and N = 2 if a compound belongs to the strong, extreme, or moderate skin sensitizer categories. All the regression coefficients, and their corresponding standard errors of fit, are on the same scale because each PLS component has a zero mean and unit variance. Using Equations 8 and 9, the predicted probabilities, and corresponding classification assignments of the training set and the test set, are given in Tables 4 and 5, respectively.


TABLE 4 The Predicted Probabilities and Corresponding Classification Assignments of the Training Set Using the Eight-term PLS-CLR QSAR Model for the USMAX

 

TABLE 5 The Predicted Probabilities and the Corresponding Classifications for the Test Set Using the Eight-Term PLS-CLR QSAR Models for the USMAX

 
The Two-2-State QSAR Model
PLS-CLR two-state ("non–weak" and "moderate–strong–extreme") QSAR models with different numbers of model terms have been constructed for the skin sensitization database. The performance measures of these models are summarized in Table 6. When the number of model terms is seven, the model gives the highest value of classification accuracy for the test set, 77.3%. Therefore, the two-state model with seven terms has the best predictive capability. The seven-term two-state QSAR model is

Formula (10)


TABLE 6 Summary of the Performance Measures of the PLS-CLR Two-State and Two-2-State QSAR Models with Different Numbers of Descriptor Terms

 
In Equation 10, xscr_i denotes the ith PLS component extracted from the 729 autoscaled 4D-fingerprints of the USMAX. Although 196 components, which is the smaller value between the number of training set compounds and the number of descriptors, can be extracted by PLS, only the first several components are selected. The reason for this limited selection is that the first several components account for most of the variance in both the explanatory variables and the response logit. The value of the Hosmer–Lemeshow goodness-of-fit statistic, χ², is 3.161, which corresponds to a p-value of 0.924, and supports the null hypothesis that Equation 10 fits the training set data very well.

As is shown in Table 7 for the cutoff value of 0.53, and using Equation 10 for the USMAX of the training set, 94 out of 105 "moderate–strong–extreme" sensitizers, and 82 out of 91 "non–weak" sensitizers, are correctly classified. This corresponds to a specificity of 90.1%, a sensitivity of 89.5%, and a classification accuracy of 89.8%. The predicted probabilities of being a "moderate–strong–extreme" sensitizer for the test set compounds are listed in Table 8. For the test set, 8 out of 12 "moderate–strong–extreme" sensitizers, and 9 out of 10 "non–weak" sensitizers, are correctly classified using Equation 10. This prediction behavior corresponds to a specificity of 90.0%, a sensitivity of 66.7%, and a classification accuracy of 77.3%.
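The training set performance measures follow directly from the reported counts. The sketch below assumes the conventional definitions, with the "moderate–strong–extreme" class (coded "1") taken as the positive class; the function and variable names are illustrative.

```python
def performance(true_pos, n_pos, true_neg, n_neg):
    """Return (sensitivity, specificity, accuracy) in percent."""
    sensitivity = 100.0 * true_pos / n_pos   # positives correctly classified
    specificity = 100.0 * true_neg / n_neg   # negatives correctly classified
    accuracy = 100.0 * (true_pos + true_neg) / (n_pos + n_neg)
    return sensitivity, specificity, accuracy


# Training set counts for Equation 10 at the cutoff of 0.53.
train_sens, train_spec, train_acc = performance(94, 105, 82, 91)

# Test set accuracy from 8 + 9 correct classifications out of 22 compounds.
test_acc = 100.0 * (8 + 9) / (12 + 10)
```

Rounded to one decimal place, these expressions give the 89.5%, 90.1%, and 89.8% quoted for the training set and the 77.3% test set accuracy.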


TABLE 7 Summary of the Performance Measures of the PLS-CLR Two-State and Two-2-State QSAR Models

 

TABLE 8 The Predicted Probabilities and the Corresponding Classifications for the Test Set Using PLS-CLR Two-State ("Non–weak" and "Moderate–Strong–Extreme") QSAR Model for the USMAX

 
PLS-CLR models have been constructed to discriminate between the "non" and the "weak" sensitizers using the compounds whose skin sensitization potencies can be correctly classified by Equation 10 for the initial training set of 82 "non–weak" sensitizers. The performance measures of these models are summarized in Tables 6 and 7. It should be noted that the test set for this model building step is composed of nine "non" and "weak" sensitizers. These compounds are part of the original test set for the two-state QSAR model, Equation 10, and their skin sensitization potencies have been correctly classified using Equation 10. When the number of model terms is five, the highest value of classification accuracy for the test set is realized at 77.8%. Therefore, a two-state model with five terms is judged to have the best predictive capability and is given by

Formula (11)

In Equation 11, the meanings of the xscr_i PLS components are the same as those for Equation 10. The value of the Hosmer–Lemeshow goodness-of-fit statistic, χ², is 1.571, which corresponds to a p-value of 0.992, and supports the null hypothesis that Equation 11 fits the training set data very well. As shown in Table 7, for the cutoff value of 0.57, using Equation 11 and the USMAX of the training set, 46 out of 49 weak sensitizers, and 31 out of 33 nonsensitizers, are correctly classified. This corresponds to a specificity of 93.9%, a sensitivity of 93.9%, and a classification accuracy of 93.9%. The predicted probabilities of being a weak sensitizer for the test set compounds are listed in Table 9. For the test set, four out of five weak sensitizers, and three out of four nonsensitizers, are correctly classified using Equation 11. This corresponds to a specificity of 75.0%, a sensitivity of 80.0%, and a classification accuracy of 77.8%.


TABLE 9 The Predicted Probabilities and the Corresponding Classifications for the Test Set of the "Non" and "Weak" Sensitizers Using PLS-CLR Two-2-State QSAR Models for the USMAX

 
Using the compounds among the training set of 94 "moderate–strong–extreme" sensitizers whose skin sensitization potencies can be correctly classified by Equation 10, PLS-CLR two-2-state models have been constructed to classify sensitizers as "moderate" or as "strong–extreme". The performance measures of these models are summarized in Tables 6 and 7. It should be noted that the test set in this step is composed of eight "moderate" and "strong–extreme" sensitizers. They are part of the original test set for the two-state QSAR model, and their skin sensitization potencies have been correctly classified using Equation 10. When the number of model terms is five, the model gives the highest classification accuracy for the test set, 62.5%. Therefore, the two-state model with five terms is judged to have the best predictive capability and is given by

Formula (12)

In Equation 12, the meanings of the xscr_i PLS components and the regression coefficients are again similar to those for Equation 10. The value of the Hosmer–Lemeshow goodness-of-fit statistic, χ², is 5.863, which corresponds to a p-value of 0.663, and supports the null hypothesis that Equation 12 fits the training set data well. As shown in Tables 6 and 7, for the cutoff value of 0.42, using Equation 12 and the USMAX of the training set, 33 out of 36 "strong–extreme" sensitizers, and 53 out of 58 moderate sensitizers, are correctly classified. This classification behavior corresponds to a specificity of 91.4%, a sensitivity of 91.7%, and a classification accuracy of 91.5%. The predicted probabilities of being a "strong–extreme" sensitizer for the test set compounds are listed in Table 10. For the test set, two out of four "strong–extreme" sensitizers, and three out of four moderate sensitizers, are correctly classified using Equation 12. This corresponds to a specificity of 75.0%, a sensitivity of 50.0%, and a classification accuracy of 62.5%.


TABLE 10 The Predicted Probabilities and the Corresponding Classifications for the Test Set of the "Moderate" and "Strong–Extreme" Sensitizers Using PLS-CLR Two-2-State QSAR Models for the USMAX

 
Three compounds in the test set, C4-Azlactone, Pyridine, and trans-Anethol, are presented as representative examples regarding how the two-2-state QSAR model is used to predict skin sensitization potency for a compound. C4-Azlactone is a moderate sensitizer. As shown in Table 8, using the two-state QSAR model given by Equation 10, the predicted probability for C4-Azlactone being a "moderate–strong–extreme" sensitizer is 0.385. This probability is less than the cutoff value of 0.53. Therefore, C4-Azlactone is falsely predicted as a "non–weak" sensitizer. Consequently, C4-Azlactone has not been subsequently evaluated using the two-2-state QSAR model to discriminate between "moderate" and "strong–extreme" sensitizers.

Pyridine is classified as a "non–weak" sensitizer by Equation 10, since its predicted probability of being a "moderate–strong–extreme" sensitizer is 0.063, which is less than the cutoff value of 0.53. This prediction is correct, and Pyridine is subsequently evaluated in the two-2-state QSAR model in order to discriminate it as a "non" or a "weak" sensitizer. As shown in Table 9, using the two-2-state QSAR model for separating "non" and "weak" sensitizers, Equation 11, the predicted probability of Pyridine being a weak sensitizer is 0.953, which is larger than the cutoff value of 0.57. Therefore, it is correctly predicted as a weak sensitizer.

trans-Anethol is a moderate sensitizer. As shown in Table 8, using the initial two-state QSAR model, Equation 10, the predicted probability for trans-Anethol being a "moderate–strong–extreme" sensitizer is 0.728, which is larger than the cutoff value of 0.53. Therefore, it is correctly predicted to fall into the "moderate–strong–extreme" sensitizer classification and becomes a compound for evaluation by the corresponding two-2-state QSAR model for discriminating "moderate" sensitizers from "strong–extreme" sensitizers. As reported in Table 10, the two-2-state QSAR model for separating "moderate" sensitizers from "strong–extreme" sensitizers, Equation 12, indicates the probability for trans-Anethol being a "strong–extreme" sensitizer is 0.000, which is less than the cutoff value of 0.42. Therefore, it is correctly predicted as a moderate sensitizer.
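The decision path followed in these three examples can be sketched as a two-step cascade. This is an illustrative sketch only: the function `classify` and the constant names are hypothetical, the probabilities are assumed to come from Equations 10, 11, and 12 (which are not re-implemented here), and assigning a probability exactly equal to a cutoff to the higher-potency class is an assumption not stated in the text.

```python
# Cutoff values quoted in the text for Equations 10, 11, and 12.
CUTOFF_STEP1 = 0.53   # "non-weak" vs "moderate-strong-extreme" (Equation 10)
CUTOFF_WEAK = 0.57    # "non" vs "weak" (Equation 11)
CUTOFF_STRONG = 0.42  # "moderate" vs "strong-extreme" (Equation 12)


def classify(p_step1, p_weak=None, p_strong=None):
    """Two-step classification from precomputed model probabilities.

    p_step1  : probability of being "moderate-strong-extreme" (step 1)
    p_weak   : probability of being "weak" (step 2, lower branch), if available
    p_strong : probability of being "strong-extreme" (step 2, upper branch), if available
    ASSUMPTION: a probability equal to the cutoff goes to the higher-potency class.
    """
    if p_step1 < CUTOFF_STEP1:
        if p_weak is None:
            return "non-weak"  # step-2 probability not supplied
        return "weak" if p_weak >= CUTOFF_WEAK else "non"
    if p_strong is None:
        return "moderate-strong-extreme"  # step-2 probability not supplied
    return "strong-extreme" if p_strong >= CUTOFF_STRONG else "moderate"


# Probabilities reported in the text for the three example compounds.
print(classify(0.385))                  # C4-Azlactone, step 1 only
print(classify(0.063, p_weak=0.953))    # Pyridine
print(classify(0.728, p_strong=0.000))  # trans-Anethol
```

Because a compound misassigned at step 1 never reaches the correct step-2 model, an error at the first cutoff (as for C4-Azlactone) cannot be recovered at the second step.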

For the training set, the two-2-state QSAR models yield classification accuracy, sensitivity, and specificity measures of 83.2, 84.6, and 81.9%, respectively (see Table 7). For the test set, the classification accuracy, sensitivity, and specificity are 54.6, 41.7, and 70.0%, respectively. Since four categories, "non", "weak", "moderate", and "strong–extreme" sensitizers, are involved in the classification using two-2-state models, the classification accuracy expected from random assignment under such a two-2-state modeling strategy is 25%. Therefore, in tandem the PLS-CLR two-state and two-2-state QSAR models increase the four-category classification accuracy for skin sensitization potency well above that due to chance selection.


    DISCUSSION
For the eight-term three-state model, the classification accuracies for the training and the test sets are 73.4 and 63.6%, respectively. The relatively small difference between these two values indicates that "overfitting" in model construction has been minimized by the application of this novel chemometric methodology. However, the classification accuracies for the training and the test sets are 83.2 and 54.6%, respectively, when PLS-CLR is used to build the two-2-state QSAR models. In this case, the large difference between the two accuracy values indicates likely "overfitting" during model construction. The low classification accuracy for the test set is mainly due to a low model sensitivity, 41.7%, indicating that many of the "moderate" and/or "strong–extreme" sensitizers are incorrectly classified.
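The training-to-test comparison used here as an overfitting indicator can be made explicit with a short calculation. This is a minimal sketch using only the accuracies quoted in the text; the function name `generalization_gap` is hypothetical, and no threshold for "overfitting" is implied.

```python
def generalization_gap(train_acc, test_acc):
    """Return (absolute drop in percentage points, drop relative to training accuracy in %)."""
    drop = train_acc - test_acc
    return drop, 100.0 * drop / train_acc


# (training accuracy %, test accuracy %) as reported in the text.
reported = {
    "three-state": (73.4, 63.6),
    "two-2-state": (83.2, 54.6),
}

for model, (train_acc, test_acc) in reported.items():
    drop, relative = generalization_gap(train_acc, test_acc)
    print(f"{model}: drop = {drop:.1f} points, relative drop = {relative:.1f}%")
```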

These results, particularly for the two-2-state model, may indicate that the 4D-fingerprints are not adequate descriptors to describe the chemical reactivity events in skin sensitization. It has been widely accepted that the chemical reactivity of a hapten is one of the main factors that account for the skin sensitization potency of the hapten (Smith and Hotchkiss, 2001). The 4D-fingerprints are ground-state molecular descriptors, and have only been successfully applied (validated) for those cases where noncovalent binding plays an important role (Senese et al., 2004). Thus, descriptors measuring features of chemical reactivity may be necessary to develop better categorical QSAR models of skin sensitization.

On the other hand, a comparison of the two-state model discriminating "non–weak" sensitizers from "strong–extreme" sensitizers in our first study (Li et al., 2007) to the two-state model in this study separating "non–weak" sensitizers from "moderate–strong–extreme" sensitizers is revealing. The best original two-state model, which does not include "moderate" sensitizers, has a test set predictivity of 86.7%. This predictivity is far better than the 77.3% of the corresponding two-state model of the present study that discriminates "non–weak" sensitizers from "moderate–strong–extreme" sensitizers. The difference is even larger when compared to the 63.6% test set accuracy of the three-state model of this study, which discriminates "non–weak" sensitizers, "moderate" sensitizers, and "strong–extreme" sensitizers. These observations suggest that the "moderate" sensitizers may be the source of limited categorical QSAR modeling accuracy. The 4D-fingerprint descriptors may not be able to capture all of the molecular features which make a compound more than a "weak" sensitizer, but less than a "strong" sensitizer.

Figures 1a and 1b present histogram summaries of the performance measures of the two-state model developed in the previous study and the categorical QSAR models developed in this investigation. It is seen in Figure 1a that each of the models has an accuracy of greater than 73%, and sensitivity more than 68% for the training sets used to build the models. However, for the test sets the accuracy of the models decreases except for the initial LR two-state model which excludes "moderate" sensitizers (see Fig. 1b). For all models whose training sets include "moderate" sensitizers, the test set prediction accuracies decrease by more than 10% relative to the values of the corresponding training sets.


FIG. 1. (a) Summary of the performance measures of the LR two-state model (Li et al., 2007), the PLS-CLR three-state model, the PLS-CLR two-state (step 1), and the two-2-state (step 2) QSAR models for the training sets. (b) Summary of the performance measures of the LR two-state model (Li et al., 2007), the PLS-CLR three-state model, the PLS-CLR two-state (step 1), and the two-2-state (step 2) QSAR models for the test sets.

 
There may be two possible ways to improve the predictive power of the categorical QSAR models being sought in this type of study. The first is to add to the trial descriptor pool some specific molecular orbital reactivity descriptors, such as the lowest unoccupied molecular orbital, which was found to be a significant descriptor in QSAR modeling of eye irritation (Kulkarni et al., 2001), and/or 2D reactivity descriptors like the electrotopological descriptors (Kier and Hall, 1999). The second, since molecules react with each other in their excited states, is to include chemical reactivity in a QSAR analysis by developing additional sets of universal 4D-fingerprints based on excited states of a compound. We are now completing the development of a methodology to determine the 4D-fingerprints of excited states of a molecule. These non-ground-state descriptors are being computed for the LLNA database, and will be added to the current ground-state 4D-fingerprint descriptors to generate an expanded descriptor pool to build a new set of categorical QSAR models. The use of random forest methods to generate categorical skin-sensitization models is also an approach we are currently considering.


    FUNDING
National Institutes of Health through the NIH Roadmap for Medical Research, Grant 1 (R21 GM075775-01); and the Procter & Gamble Company.


    ACKNOWLEDGMENTS
 
Information on Novel Preclinical Tools for Predictive ADME-Toxicology can be found at http://grants.nih.gov/grants/guide/rfa-files/RFA-RM-04-023.html. Links to nine initiatives can be found at http://nihroadmap.nih.gov/initiatives.asp. Resources of the Laboratory of Molecular Modeling and Design at UIC and The Chem21 Group, Inc. were used in performing these studies.


    REFERENCES
Agresti A. Categorical Data Analysis (2002) 2nd ed. New York: Wiley-Interscience.

Basketter DA, Blaikie L, Dearman RJ, Kimber I, Ryan CA, Gerberick GF, Harvey P, Evans P, White IR, Rycroft RJG. Use of the local lymph node assay for the estimation of relative contact allergy potency. Contact Dermatitis (2000) 42:344–348.

Basketter DA, Lea LJ, Cooper K, Stocks J, Dickens A, Pate I, Dearman RJ, Kimber I. Threshold for classification as a skin sensitizer in the local lymph node assay: A statistical evaluation. Food Chem. Toxicol. (1999) 37:1167–1174.

Basketter DA, Roberts DW. A quantitative structure activity/dose relationship for contact allergic potential of alkyl group transfer agents. Toxicol. In Vitro (1990) 4:686–687.

Basketter DA, Roberts DW, Cronin M, Scholes EW. The value of the local lymph node assay in quantitative structure-activity investigations. Contact Dermatitis (1992) 27:137–142.

Cronin MT, Basketter DA. Multivariate QSAR analysis of a skin sensitization database. SAR QSAR Environ. Res. (1994) 2:159–179.

Doherty DC. Molsim User's Guide, Ver. 2.1 (2001) Lake Forest, IL: The Chem21 Group, Inc.

Duca JS, Hopfinger AJ. Estimation of molecular similarity based on 4D-QSAR analysis: Formalism and validation. J. Chem. Inf. Comput. Sci. (2001) 41:1367–1387.

Enslein K, Gombar VK, Blake BW, Maibach HI, Hostynek JJ, Sigman CC, Bagheri D. A quantitative structure-toxicity relationships model for the dermal sensitization guinea pig maximization assay. Food Chem. Toxicol. (1997) 35:1091–1098.

Fedorowicz A, Zheng L, Singh H, Demchuk E. QSAR study of skin sensitization using local lymph node assay data. Int. J. Mol. Sci. (2004) 5:56–66.

Gerberick GF, Ryan CA, Kern PS, Schlatter H, Dearman RJ, Kimber I, Patlewicz GY, Basketter DA. Compilation of historical local lymph node assay data for evaluation of skin sensitization alternative methods. Dermatitis (2005) 16:157–202.

Glen WG, Dunn WJ III, Scott DR. Principle component analysis and partial least squares regression. Tetrahedron Comput. Methodol. (1989) 2:349–376.

Hopfinger AJ, Wang S, Tokarski JS, Jin B, Albuquerque M, Madhiv PJ, Duraiswami C. Construction of 3D-QSAR models using the 4D-QSAR analysis formalism. J. Am. Chem. Soc. (1997) 119:10509–10524.

Hosmer DW, Lemeshow S. Applied Logistic Regression (2000) 2nd ed. New York: John Wiley & Sons, Inc.

Kier LB, Hall LH. Molecular Structure Description: The Electrotopological State (1999) San Diego, CA: Academic Press.

Kimber I, Basketter DA. The murine Local Lymph Node assay; collaborative studies and directions: A commentary. Food Chem. Toxicol. (1992) 30:165–169.

Kulkarni A, Hopfinger AJ, Osborne R, Bruner LH, Thompson ED. Prediction of eye irritation from organic chemicals using membrane-interaction QSAR analysis. Toxicol Sci. (2001) 59:335–345.

Landsteiner K, Jacobs J. Studies on the sensitization of animals with simple chemical compounds. J. Exp. Med. (1936) 64:625–639.

Li Y, Pan D, Liu J, Tseng YJ, Kern P, Gerberick GF, Hopfinger AJ. 4D-fingerprint categorical QSAR models for skin sensitization based on classification local lymph node assay measures. Chem. Res. Toxicol. (2007) 20:114–128.

Magee PS, Hostynek JJ, Maibach HI. A classification model for allergic contact dermatitis. Quant. Struct. Act. Relationsh. (1994) 13:22–33.

Nguyen DV, Rocke DM. Tumor classification by partial least squares using microarray gene expression data. Bioinformatics (2002) 18:39–50.

Patlewicz G, Basketter DA, Smith CK, Hotchkiss SAM, Roberts DA. Skin-sensitization structure-activity relationships for aldehydes. Contact Dermatitis (2001) 44:331–336.

Pearlstein RA. CHEMLAB-II Users Guide (1998) Lake Forest, IL: The Chem21 Group, Inc.

Roberts DW. Structure-activity relationships for skin sensitization potential of diacrylates and dimethacrylates. Contact Dermatitis (1987) 17:281–289.

Roberts DW, Basketter DA. A quantitative structure activity/dose response relationship for contact allergic potential of alkyl group transfer agents. Contact Dermatitis (1990) 23:331–335.

Roberts DW, Basketter DA. QSAR: Sulfonate esters in the LLNA. Contact Dermatitis (2000) 42:154–161.

Roberts DW, Benezra C. Quantitative structure-activity relationships for skin sensitization potential of urushiol analogues. Contact Dermatitis (1993) 29:78–83.

Roberts DW, Fraginals R, Lepoittevin JP, Benezra C. Refinement of the relative alkylation index (RAI) model for skin sensitization and application to mouse and guinea-pig test data for alkyl alkanesulphonates. Arch. Dermatol. Res. (1991) 283:387–394.

Roberts DW, Williams DL. The derivation of quantitative correlation between skin sensitisation and physio-chemical parameters for alkylating agents, and their application to experimental data for sultones. J. Theor. Biol. (1982) 99:807–825.

Roberts DW, York M, Basketter DA. Structure-activity relationships in the murine local lymph node assay for skin sensitization: α,β-Diketones. Contact Dermatitis (1999) 41:264–271.

Senese CL, Duca J, Pan D, Hopfinger AJ, Tseng YJ. 4D-fingerprints, universal QSAR and QSPR descriptors. J. Chem. Inf. Comput. Sci. (2004) 44:1526–1539.

Smith CK, Hotchkiss SAM. Allergic Contact Dermatitis: Chemical and Metabolic Mechanisms (2001) London, New York: Taylor & Francis.

