ToxSci Advance Access originally published online on April 19, 2006
Toxicological Sciences 2006 92(1):329-334; doi:10.1093/toxsci/kfj202
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
A Regression Spline Model for Developmental Toxicity Data
Department of Biostatistics, St Jude Children's Research Hospital, Memphis, Tennessee 38105
1 To whom correspondence should be addressed at Department of Biostatistics, St Jude Children's Research Hospital, 332 North Lauderdale Street, Memphis, Tennessee 38105. Fax: (901) 544-8843. E-mail: daniel.hunt{at}stjude.org.
Received February 22, 2006; accepted April 13, 2006
| ABSTRACT |
|---|
|
|
|---|
Observed dose-response patterns of data from several developmental toxicity experiments appear to be nonlinear and should be characterized by an appropriate model to adequately fit this observed pattern. Information from these animal studies of ambient substances that are noncarcinogenic, yet potentially toxic, to humans is used by federal protection agencies (Environmental Protection Agency, Occupational Safety and Health Administration, Food and Drug Administration) to determine safe exposure levels, such as no observed adverse effects level and benchmark dose. We have developed a flexible regression linear B-spline model for application to developmental toxicity dose-response data from animal studies of these noncarcinogens. We apply our model to data from two CD-1 mice studies of the National Toxicology Program; the observed dose-response pattern from both appears nonlinear: (1) experiment of 131 pregnant mice allocated over five exposure levels (0, 0.025, 0.05, 0.10, and 0.15% diet) of diethylhexyl phthalate and (2) experiment of 111 pregnant mice exposed to five levels (0, 62.5, 125, 250, and 500 mg/kg/day) of diethylene glycol dimethyl ether. In each study, we measure litter response as the proportion of adversely affected fetuses. Upon applying our B-spline model to the data from both studies, we predict nonlinear dose-response, with improvement over the more typical logistic dose-response model in each of the two studies.
Key Words: developmental toxicity; dose-response; interior knot; linear B-spline; threshold.
| INTRODUCTION |
|---|
|
|
|---|
Developmental toxicity studies involve impregnated animals, usually rodents, which are randomly allocated to be administered or exposed to dose levels (including control) of some environmentally ambient substance to which humans are exposed. Regulatory agencies, such as the Environmental Protection Agency (EPA), Food and Drug Administration, and Occupational Safety and Health Administration, use the results of such experiments in identifying acceptable exposure levels in the environment, home, and workplace, respectively. For each animal, measured outcomes of fetal endpoints include those related to growth, such as body weight and length, birth defect indicators (structural abnormalities), and death indicators (pre- or postnatal deaths). We analyze only the discrete data. For the experiments described in this paper, in each litter, we group deaths (pre- and postnatal) and malformations together as "adverse events." This grouping is consistent with the assertion that this may be better in estimating the dose-response than modeling death and malformation incidence separately (U.S. EPA, 1991
The linear model has been the default dose-response model when assessing cancer risks to toxic exposure (U.S. EPA, 2005
). In the early 1970's, the advent of the threshold allowed this concept to become a basic assumption by the EPA in noncarcinogenic risk assessment (U.S. EPA, 1991
). "Threshold" is the largest nonzero dose level whose response is equivalent to the control level. If applicable, one or more of the most current guidelines for other areas of risk assessment are used in conjunction with or in lieu of the EPA developmental toxicity guidelines when chemical exposure is suspected to cause additional harmful effects not noted in developmental guidelines (U.S. EPA, 1986
, 1992
, 1996
, 1998
, 2005
). When the data warrants, the benchmark dose (Crump, 1984
) is the current ideal point of departure (POD) in the early phase of developmental risk assessment (as opposed to no observed adverse effects level [NOAEL]), its advantage over NOAEL being its ability to account for the shape of the dose-response curve. A safety factor is then applied to the POD to obtain a suitable reference dose. Although both approaches in some sense adhere to the threshold concept, neither estimates the threshold.
The pure threshold dose-response model presumes, in addition to identical response at control and threshold dose levels, constant response in the control-to-threshold dose range, i.e., all responses in this range are equivalent to control and threshold. The advantage of such a model, if a true threshold exists, is the ability to estimate threshold. We compare the results from applying our threshold model to the results of simply identifying the NOAELs in the two studies under investigation. We will discuss this in more detail in the "Materials and Methods" section. Cox (1987)
theorized several reasonable pure threshold models for application to toxicological data in general, but certainly that can be applied specifically to developmental data with some possible additions and/or modifications. Schwartz et al. (1995)
fit a pure threshold model to data from a National Toxicology Program (NTP) study, which is one of the data sets that we analyze here. Hunt and Rai (2003)
modeled their threshold model to the same NTP data set with the addition of random effect into the dose-response model that we use here. Our model, in addition to including the pure threshold model as a subset, has the flexibility to characterize the dose-response region that behaves on either side of the threshold. The threshold becomes more of a point of change in the local behavior of dose-response.
Our proposed model is based on regression spline theory, characterizes the nonlinear nature of the dose-response pattern, and implicitly models the threshold dose level as well as incorporating the inherent litter effect into the dose-response. We apply our model to data sets from two developmental toxicity studies conducted at the NTP. The "Materials and Methods" section is divided into several parts: "Materials," "DEHP Study," "DYME Study," and "Statistical Analysis." The "Results" section is divided into three subsections: "DEHP Study," "DYME Study," and "B-Spline Plots"; here we present model estimates with SEs, goodness-of-fit tests, and tables and figures illustrating results. The "Discussion" section summarizes and discusses the model's performance on the data and advantages of using the model to characterize dose-response in developmental toxicity studies.
| MATERIALS AND METHODS |
|---|
|
|
|---|
Materials.
The two animal studies involved the following developmental toxins: (1) diethylhexyl phthalate (DEHP) and (2) diethylene glycol dimethyl ether (DYME). These two toxins are in the class of agents that are known to cause toxicity in rodents, and although there is human exposure to these toxins in the environment, threshold for humans is much higher than that in rodents for these two toxins (U.S. EPA, 1991
|
|
DEHP study.
DEHP was administered to the diet of 131 pregnant CD-1 mice on gestational days 017 (Tyl et al., 1983
DYME study.
DYME was administered orally to 111 pregnant CD-1 mice on gestational days 615 (Price et al., 1987
). Mice were randomly assigned one of five exposure levels: 0, 62.5, 125, 250, or 500 mg/kg/day. Resorptions were recorded and live fetuses studied for body weight and malformations. For our analysis, the number of implantation sites is litter size, and we categorize resorption or malformation of a fetus as an adverse event. For the analyses related to the nongrowth endpoints, i.e., fetal deaths and abnormalities, the NOAEL is 125 mg/kg/day (for each endpoint separately). Figure 1b is a plot of observed dose-response.
Statistical analysis.
We model the number of adversely affected fetuses X in a litter of n implanted fetuses. Binomial distribution is not sufficient for X, and various methods have been used over the years to account for the natural clustering due to the litter effect: the beta-binomial distribution (Chen and Kodell, 1989
; Haseman and Kupper, 1979
; Hunt and Bowman, 2004
; Kupper et al., 1986
; Paul, 1982
; Williams, 1975
), overdispersed binomial (Schwartz et al., 1995
), and binomial conditioned on random effects (Hunt and Rai, 2003
; Li and Hunt, 2004
). We use the last method. In addition to being computationally desirable, it has natural interpretation, which we will illustrate. Along with this, we propose a model that accurately characterizes the dose-response relationship, dose here being the exposure level of toxin. Response for each litter is measured as the proportion of adversely affected fetuses, i.e., X/n.
Our proposed dose-response model is a linear combination of numerically stable (for parameter estimation) linear B-spline basis functions, which includes a point of change parameter (called interior knot) that is in the dose range at which point a change in the slope occurs. The linear spline one-knot dose-response model for the data here is represented as a linear combination of linear B-splines as follows:
![]() | (1) |
is a regression spline function (with interior knot
) relating dose level d to linear response in two regions: below
and above
; the
are the linear B-spline basis functions (of dose d), constructed as stable linear functions of dose;
are linear B-spline coefficients; z represents a standard normal random variable to aid in modeling the litter effects, i.e., z varies from litter to litter. As
is the response,
The makeup of Equation 1 (excluding the random component
z) follows from B-spline theory (De Boor, 1978
This approach for modeling data, i.e., using polynomial regression splines, is somewhat driven by the local behavior of the data. Although the number of interior knots can vary depending on the dose-response pattern, in the case of developmental toxicity studies, which typically have a limited number of distinct dose levels and perhaps change in the slope of the dose-response from the low- to high-dose region, generally one interior knot should suffice. Hence, we are building a piecewise linear function with interior knot
as a change point. The term "interior" knot distinguishes the knots within the dose-response range (interior) to the lowest and highest doses of the study, which are technically "knots," but not interior knots. These limits have no bearing in determining the number of parameters, although they are used in the construction of B-splines for the lowest and highest dose levels, respectively. However, they are regarded as fixed with respect to parameter estimation.
Our linear spline Model 1 is flexible in that it includes the pure threshold model (horizontal line representing dose-response below
) and allows litter effects through random component
z to be modeled directly into the dose-response, following Hunt and Rai (2003)
. Most threshold models presume continuity across the dose range. Therefore, in the pure threshold model with constancy below threshold, there is a mathematical dependency between the threshold parameter and the parameters that model the above-threshold region of the dose-response curve, i.e., in addition to the threshold parameter being a change point, it is also a parameter that is part of the dose-response model in the upper dose region (Cox, 1987
; Hunt and Rai, 2003
; Schwartz et al., 1995
). For the linear B-spline model, this is doubly true, as the interior knot parameter
is used to calculate the B-splines in both regions above and below the interior knot. Therefore, there is dependence between the interior knot and the B-spline coefficients
i that determine the shape of the dose-response curve.
The response data being proportions compels us to logistically transform so that Equation 1 becomes:
![]() | (2) |
. Parametric methods can be used for estimation. As the knot parameter makes our model a nonlinear, mixed-effects model, we use a method outlined in Collett (1991)
To estimate the interior knot parameter
, as it enters the dose-response function nonlinearly, we developed and used a grid-searching algorithm (code written in FORTRAN 90) over the range of doses for the experiment to find the optimal location of
in relation to the other parameter estimates. That is, in each search, we find the profile likelihood function evaluated at that search's
value; then we maximize the set of profile likelihood functions over all
to find the optimal
(the true change point). We get the SEs of the parameter estimates from the sample information matrix calculated from the approximate likelihood function of the full model of all parameters
and
. The program based on this latter procedure is written in R version 1.9 code.
| RESULTS |
|---|
|
|
|---|
DEHP Study
Fitting our Model 2 to the data yields the parameter estimates in Table 2. Estimates and SEs were obtained using the approach at the end of the "Statistical Methods" section. The estimate for the interior knot
(0.033%) places it optimally between the second and third dose levels (0.025 and 0.05%, respectively). Figure 2a represents the predicted dose-response curve (with pointwise confidence limits) fit to the observed data. The predicted curve adequately tracks the pattern of the data across the dose range. To get this predicted curve, we back-transformed from the logistic model in Equation 2 to get the probabilistic dose-response function P, and then we substituted parameter estimates from Table 2 into this function to yield the estimated probabilities.
|
|
Of interest is that our model estimates decreasing response from control (0.0%) to knot estimate (0.033%). This would imply that the 0.025% level has lower response rate than that at control level, not just uniform response, as a pure threshold model would imply. As the curve begins to increase above 0.033%, this implies the threshold dose as the dose in the higher-dose region (slightly above interior knot) whose response is identical to control response. We estimate that dose at 0.041%. This is similar to the threshold dose estimated by Hunt and Bowman (2004)
To assess the fit of our model, we compared it to the traditional logistic dose-response model (no B-splines, no interior knots). This traditional model is nested within the regression spline model (Molinari et al., 2001
); hence, we use the likelihood ratio test (LRT). Log likelihood from applying our spline model to the data is 546.334; log likelihood based on the traditional logistic to this data is 557.021. LRT statistic is 21.374 (p < 0.0001), indicative of our model providing better fit than logistic. Logistic curve does not fit well here due to the fact that it models increasing dose-response, but the data indicates nonlinearity. Figure 2b is the fit of logistic curve and of our predicted curve to the data. Our model tracks the pattern better than the logistic. In particular, the logistic curve underestimates control response rate and overestimates responses at intermediate dose levels; also, our model identifies a threshold level.
DYME Study
Estimates from this study are in Table 3. Estimated interior knot is 165.4 mg/kg, placing it optimally between the third and fourth dose levels (125 and 250 mg/kg). We believe that this is due to the closeness in response at the three lowest levels (including control). This result is not that surprising as the NOAELs in the study for the endpoints related to adverse effects (malformation and death) were at 125 mg/kg. Figure 3a is the predicted dose-response curve (with pointwise confidence limits) fit to the observed data. The predicted curve is very slowly increasing below the knot due to lower observed responses at the intermediate dose levels (see Table 1); the predicted dose-response is close to being a pure threshold model. LRT results reveal that our model provides better fit than logistic. Log likelihood for our model is 388.622; for the logistic model it is 395.286; LRT statistic is 13.328 (p = 0.001). Figure 3b is a graph of the logistic curve, and our predicted curve fit to the data. Logistic curve does not estimate control response as poorly as it did for the previous data (due to observed increasing dose-response patterns), but it still does not perform as well as our model in fitting the data; specifically, it overestimates response at the intermediate dose levels (125 and 250 mg/kg/day) due to its increasing nature; however, our model accounts for the slowly increasing pattern at low-to-intermediate levels, then also for the more sharply increasing pattern at the higher levels.
|
|
B-Spline Plots
Figure 4 are the plots of the linear B-splines
discussed in Equation 1, for both studies. Construction of these B-splines represents the initial step in the estimation process. Figures 4a, 4b, and 4c (first row) are the plots for the linear B-spline functions
and
respectively, of the DEHP study. Figures 4d, 4e, and 4f (second row) are plots for
and
respectively, for the DYME study. For each study, during each step of the estimation process described in the "Statistical Analysis" section, B-splines
were calculated for the fixed interior knot value
at that profile likelihood step. Resulting profile likelihood estimates for model parameters were then obtained. This process was repeated over the entire dose range until B-splines based on the optimal interior knot value
were obtained. These are the B-splines in Figure 4.
|
| DISCUSSION |
|---|
|
|
|---|
We have proposed a method, a regression linear spline with interior knot represented as a linear combination of linear B-splines with interior knot, for predicting the dose-response pattern of developmental toxicity exposure leveloutcome data that exhibit nonlinearity. In addition to inherently modeling threshold dose, assuming its existence, our model can also detect nonlinear patterns other than the pure threshold model. It inherently accounts for various dose-responses and can reduce to the pure threshold model used on similar data by other authors (Cox, 1987
Our model can generalize to a model with higher-degree splines (quadratic, etc.) and multiple knots, but in the context of developmental studies, which by nature have few dose levels, fitting these higher-order nonlinear models would be difficult, and perhaps unnecessary (overfitting). Even a simple move up to a quadratic spline-interior knot model, which could give more smoothness to the curve, might overfit. One benefit of a quadratic spline-interior knot model is that the interior knot in this case is equivalent to the threshold dose so that threshold would be estimated directly. One has to consider whether the benefit outweighs the additional complexity. Certainly for such data, the quadratic spline would be sufficient, but perhaps not necessary. The linear spline model suffices here. Ours is a rather simple extension of the pure threshold model. It is flexible enough to accept lower-order models. Our model has the ability to adequately model such data.
Other important aspects of our model are its direct incorporation of the litter effect parameter into the dose-response (following the lead of Hunt and Rai, 2003
) and the ability to still employ likelihood-based methods of parameter estimation (Collett, 1991
). Direct inclusion of the litter effect, in the form of random component in the dose-response model, allows for a more natural interpretation of dose-response. That is, the randomness implies varying litter probabilities (within dose level) due to litter effects. This approach is perhaps more desirable than other approaches like the beta-binomial (models litter effects separate from dose-response) or overdispersed binomial (treats litter effects as fixed rather than random).
The two toxins under study here, DEHP and DYME, are of a subtype of chemicals that are used in developmental experiments as part of the risk assessment process for determining human exposure. Inherent to this process is the assumption of threshold level (U.S. EPA, 1991
). In both experiments, the endpoints related to adverse fetal effects resulted in NOAEL at exposure levels above control, lending credence to the threshold assumption. Additionally, studies of DEHP have illustrated nonlinearity of the dose-response (Doull et al., 1999
). Here, both studies illustrated the goodness of fit of our model.
Application of our model here indicates its utility for the adverse effect data of developmental toxicity studies. Its inherent modeling of threshold and model-based approach satisfies the current guidelines for such studies. It can be generalized, and its ability to model distinct patterns between low- and high-dose regions has been shown. It has also shown agreement with the original NOAEL results of the two experiments. Our model is one that can be effectively used in the dose-response phase of risk assessment.
| ACKNOWLEDGMENTS |
|---|
This work was partially supported by Cancer Center Support Grant CA21765 from the National Institutes of Health and by the American Lebanese Syrian Associated Charities.
| REFERENCES |
|---|
|
|
|---|
Chen, J. J., and Kodell, R. L. (1989). Quantitative risk assessment for teratological effects. J. Am. Stat. Assoc. 84, 966971.[CrossRef]
Collett, D. (1991). Fitting a model that contains a random effect. In Modelling Binary Data, pp. 207212. Chapman and Hall/CRC Press, Boca Raton, FL.
Cox, C. (1987). Threshold dose-response models in toxicology. Biometrics 43, 511523.[CrossRef][Web of Science][Medline]
Crump, K. S. (1984). A new method for determining allowable daily intakes. Fundam. Appl. Toxicol. 4, 854871.[CrossRef][Web of Science][Medline]
De Boor, C. (1978). A Practical Guide to Splines. Springer, New York.
Doull, J., Cattley, R., Elcombe, C., Lake, B. G., Swenberg, J., Wilkinson, C., Williams, G., and van Gemert, M. (1999). A cancer risk assessment of di(2-ethylhexyl)phthalate: Application of the new U.S. EPA Risk Assessment Guidelines. Regul. Toxicol. Pharmacol. 29, 327357.[CrossRef][Web of Science][Medline]
Haseman, J. K., and Kupper, L. L. (1979). Analysis of dichotomous response data from certain toxicological experiments. Biometrics 35, 281293.[CrossRef][Web of Science][Medline]
Hunt, D., and Rai, S. N. (2003). A threshold dose-response model with random effects in teratological experiments. Commun. Stat. Theor. Methods 32, 14391457.[CrossRef]
Hunt, D. L., and Bowman, D. (2004). A parametric model for detecting hormetic effects in developmental toxicity studies. Risk Anal. 24, 6572.[CrossRef][Web of Science][Medline]
Kupper, L. L., Portier, C., Hogan, M. D., and Yamamoto, E. (1986). The impact of litter effects on dose-response modeling in teratology. Biometrics 42, 8598.[CrossRef][Web of Science][Medline]
Li, C. S., and Hunt, D. (2004). Regression splines for threshold selection with application to a random-effects logistic dose-response model. Comput. Stat. Data Anal. 46, 19.
Molinari, N., Daures, J., and Durand, J. (2001). Regression splines for threshold selection in survival data analysis. Stat. Med. 20, 237247.[CrossRef][Web of Science][Medline]
Paul, S. R. (1982). Analysis of proportions of affected foetuses in teratological experiments. Biometrics 38, 361370.[Medline]
Price, C. J., Kimmel, C. A., George, J. D., and Marr, M. C. (1987). The developmental toxicity of diethylene glycol dimethyl ether in mice. Fundam. Appl. Toxicol. 81, 113127.
Schwartz, P. F., Gennings, C., and Chinchilli, V. M. (1995). Threshold models for combination data from reproductive and developmental experiments. J. Am. Stat. Assoc. 90, 862870.[CrossRef]
Tyl, R. W., Jones-Price, C., Marr, M. C., and Kimmel, C. A. (1983). Teratologic Evaluation of Diethylhexyl Phthalate (CAS No. 117-81-7) in CD-1 Mice. Final Study Report for NCTR/NTP Contract No. 222-80-2031 (C), NTIS No. PB85105674. National Technical Information Service, Springfield, VA.
United States EPA (1986). Guidelines for mutagenicity risk assessment. Fed. Regist. 51, 3400634012.
United States EPA (1991). Guidelines for developmental toxicity risk assessment. Fed. Regist. 56, 6379863826.
United States EPA (1992). Guidelines for exposure assessment. Fed. Regist. 57, 2288822938.
United States EPA (1996). Guidelines for reproductive toxicity assessment. Fed. Regist. 61, 5627456322.
United States EPA (1998). Guidelines for neurotoxicity risk assessment. Fed. Regist. 63, 2692626954.
United States EPA (2005). Guidelines for carcinogen risk assessment. Fed. Regist. 70, 1776517817.
Williams, D. A. (1975). The analysis of binary responses from toxicological experiments involving reproduction and teratogenicity. Biometrics 31, 949952.[CrossRef][Web of Science][Medline]
![]()
CiteULike
Connotea
Del.icio.us What's this?
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||





