- Coordinator: Steinberg Pablo
- High-throughput screening (HTS) | BMG LABTECH
- High-Throughput Screening Methods in Toxicity Testing

Together, the ToxCast and Tox21 programs have had a transformative impact on how chemicals are evaluated for safety and for hazard to both human health and the environment. The rich mechanistic information provided by such a large and diverse dataset has led to the results being used in many different contexts.

Predictive models have been developed for reproductive toxicity [ 14 ], hepatotoxicity [ 31 , 32 ], carcinogenicity [ 33 ], developmental toxicity [ 34 ], vascular development toxicity [ 35 , 36 ], and estrogen receptor (ER) disruption [ 37 , 38 ]. In addition, researchers have used the large amount of HTS data to build computational models that predict HTS results for untested chemicals about whose toxicity little is known [ 39 , 40 ]. With this information, results from ToxCast have been used to prioritize chemicals for more targeted testing [ 45 ].

The ability to link HTS results to high-throughput exposure estimates [ 46 ] and to in vivo assays using in vitro to in vivo extrapolation (IVIVE) pharmacokinetics measurements [ 47–49 ] has allowed HTS results to be increasingly used in risk assessment [ 5 , 50 , 51 ]. However, some studies have highlighted limitations in the predictivity of HTS results [ 52 , 53 ]. While numerous factors can contribute to reduced predictivity, the uncertainty in concentration-response parameters of the HTS data has to date been an underexplored contributor.

While the need for incorporating quantitative uncertainty analysis for high-throughput screening has been acknowledged, the increased computational expense has limited the application of robust statistical methods [ 54–56 ]. There are several challenges for calculating uncertainty in HTS data. The choice of a method to quantify uncertainty must consider these issues. In this paper, we introduce non-parametric bootstrap resampling [ 59 , 60 ] as a method that can calculate uncertainty estimates in HTS data.

While the computational expense of a large number of resamples has hindered the adoption of bootstrap methods in the past [ 56 , 61 ], advancements in computational power have made the method feasible to apply to the ToxCast HTS dataset. We describe a bootstrap implementation suitable for incorporation in the ToxCast pipeline and explore how the method meets the challenges for quantifying uncertainty in a diverse dataset like ToxCast.

As a case study, we explore an application to the ToxCast estrogen receptor (ER) model for bioactivity [ 37 , 38 ]. Calculating uncertainty in this model must meet all of the challenges described above. The model calculates area under the curve (AUC) values for a given chemical using the fitted curves for that chemical from 18 ER assays. Uncertainty in the fitted curve requires that we capture uncertainty in the hit call, model selection, and all fit parameters from the winning model (challenge 1).

This model is well characterized and has recently been approved to replace in vivo tests as part of the Endocrine Disruptor Screening Program (EDSP) Tier 1 battery [ 62 , 63 ]. This means that not only do developers need to understand the uncertainty in the model prediction, but the method used must also be easy to communicate to regulators and industry partners who use the model as part of their risk assessments (challenge 4). While numerous bootstrapping algorithms have been described in the literature [ 63–66 ], we chose to use smoothed nonparametric resampling (smooth bootstrap).

There are minimal assumptions used in this method. First, the observed response values are physically possible (a small assumption, since they were observed). Second, each response value includes some measurement noise and uncertainty. While non-smoothed nonparametric resampling (case bootstrap) removes the second assumption, this comes at the cost of jagged parameter distributions in samples with few points and the inability to bootstrap curves with only a single biological replicate.

Smoothing removes the jaggedness, slightly increases the amount of variation, and allows resampling for curves with only a single biological replicate. Because the nonparametric methods do not rely on a specific functional form of the curve, they can be used to quantify the uncertainty in model selection and activity call as well. Methods that resample residuals make a hard assumption on the model. Since the residuals are calculated from the fitted curve, the choice of function must be made prior to bootstrapping, removing the ability to capture uncertainty in model selection and activity.

Directly resampling the residuals makes an additional assumption that the variance of errors is constant, and like case resampling, this method can result in jagged distributions for curves with few points. Wild resampling removes the assumption of homoscedasticity [ 63–66 ] and, depending on the random variable used to multiply the residuals, can smooth out some of the jaggedness in residual resampling. However, the choice of random variable is not trivial and may need to be adjusted for different assay types.
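The two residual-based schemes can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the study: the function names are hypothetical, and the Rademacher (plus or minus one) multiplier used for the wild bootstrap is only one common choice, assumed here for illustration. Both functions take the fitted values as input, which is why the model must be fixed before resampling.

```python
import random


def residual_resample(fitted, observed, rng):
    """Residual bootstrap: add a residual drawn from the pooled set to each
    fitted value. Pooling assumes constant error variance."""
    residuals = [y - f for y, f in zip(observed, fitted)]
    return [f + rng.choice(residuals) for f in fitted]


def wild_resample(fitted, observed, rng):
    """Wild bootstrap: keep each point's own residual but multiply it by a
    random variable (here +1 or -1, an assumed Rademacher multiplier)."""
    return [f + (y - f) * rng.choice((-1.0, 1.0))
            for y, f in zip(observed, fitted)]
```

Because the wild scheme never moves a residual to a different concentration, the magnitude of the error at each point is preserved, which is how it avoids assuming homoscedasticity.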

The wild bootstrap is also sensitive to the regression method and the pattern of heteroscedasticity [ 66 ]. Based on the comparisons summarized in Table 1 , the smooth bootstrap was selected as most applicable to the diverse datasets found in ToxCast in general and the ER assays in particular.

The amount of noise added in the smooth bootstrap can have a significant impact on the results. If too little noise is added, the results will be much like case resampling: discrete bins of parameter values will often be observed for curves with few points. If the random noise is too high, the calculated uncertainty will be artificially inflated. Fortunately, the ToxCast pipeline already contains an estimate of the noise.

In the data fitting process, the baseline median absolute deviation (bmad) is calculated by binning the response values of the lowest two concentrations for every chemical and then computing the scaled median absolute deviation (mad):

$$\mathrm{bmad} = 1.4826 \times \operatorname{median}\left( \left| X_i - \tilde{X} \right| \right)$$

where $X_i$ is the $i$th value in the binned baseline response values and $\tilde{X}$ is the median of the baseline response values [ 67 , 68 ]. We use the median and mad rather than the mean and standard deviation because a small number of chemicals are highly potent and have a response even at the lowest concentrations.
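A minimal Python sketch of this calculation is given below. It is not the pipeline's own code: the function name is hypothetical, and the 1.4826 consistency constant (the conventional scaling that makes the mad comparable to a standard deviation for normal data) is an assumption about what "scaled" means here.

```python
from statistics import median

MAD_SCALE = 1.4826  # assumed conventional consistency constant


def bmad(baseline_responses):
    """Scaled median absolute deviation of the pooled baseline responses
    (the responses at the two lowest concentrations of every chemical)."""
    center = median(baseline_responses)
    return MAD_SCALE * median(abs(x - center) for x in baseline_responses)
```

With the baseline values `[-2.0, -1.0, 0.0, 1.0, 2.0, 40.0]`, where 40.0 stands in for a highly potent chemical, the result is driven by the bulk of the values rather than by the single large response, which is the reason for preferring the median and mad.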

Within the ToxCast pipeline, the bmad is used as a measure of noise. In addition, many assays have the cutoff value for a statistically significant response set to a multiple of the bmad, with 3, 6, and 10x bmad frequently used. Given that the assumption that bmad represents the noise in assay data is already built into how the ToxCast pipeline is constructed, maintaining that assumption for the smooth bootstrap makes sense.

Therefore, we sampled random noise from a normal distribution with standard deviation equal to the bmad for that assay. In Fig 1, we compare the empirical baseline values for the two lowest concentrations tested across all chemicals to the normal distribution built on the bmad for all 18 ER assays. In each panel, the empirical values are plotted as the empirical cumulative distribution function in black, while the normal distribution with standard deviation set to the bmad is plotted as orange lines.
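The resampling step can be sketched as follows. This is a hedged illustration, not the study's implementation: the function name is hypothetical, and resampling responses with replacement within each concentration group is an assumption about how the case resampling is organized. The added noise is drawn from a normal distribution with standard deviation equal to the bmad, as described above.

```python
import random
from collections import defaultdict


def smooth_bootstrap(conc, resp, bmad, rng):
    """Return one smooth bootstrap resample of a concentration-response curve.

    Responses are resampled with replacement within each concentration
    (assumed grouping), then normal noise with sd = bmad is added.
    """
    by_conc = defaultdict(list)
    for c, r in zip(conc, resp):
        by_conc[c].append(r)

    new_conc, new_resp = [], []
    for c, values in by_conc.items():
        for _ in values:  # keep the same number of points per concentration
            new_conc.append(c)
            new_resp.append(rng.choice(values) + rng.gauss(0.0, bmad))
    return new_conc, new_resp
```

With `bmad` set to zero this reduces to plain case resampling, so a concentration with a single replicate always returns the same value; the added noise is what lets single-replicate curves vary between resamples.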

In all 18 assays, there is substantial similarity between the two distributions. This indicates that the normal distribution is a good approximation to the actual underlying distribution. For each assay, the bmad is calculated as the scaled mad of the response values for the lowest two concentrations per chemical. Deviations between the ecdf and the normal distribution at higher response values can be attributed to highly potent chemicals with a biological response at the lowest two concentrations as well as sources of noise that are from a non-normally distributed process. It is also clear that the largest deviation occurs as the response value increases.

This occurs due to highly potent chemicals with activity within the lowest two concentrations of the tested range. Because these assays were tested at fewer concentrations clustered at the higher concentration values, more chemicals show activity in the baseline. In contrast, the Tox21 assays were tested at more than fold lower concentration. Because of this, there is a much smaller deviation between the two distributions for the Tox21 assays. In all cases, the normal function makes an excellent approximation for the background noise in the assay, highlighting that a normal distribution built on the bmad represents a good choice for sampling noise in the smooth bootstrap as well as providing confidence in the use of bmad within the pipeline for hit call cutoffs.

The most straightforward analysis of the bootstrap results is to consider the distribution of the model fit parameters.

The three parameters fit in the hill and gain-loss (gnls) models are the log AC50, top, and hill coefficient.

Fig 2 (caption, panels J to L). J: Correlation plot of winning model top vs. log AC50. K: Normalized experimentally measured values (black circles) and winning model (gain-loss, black curve), with a subset of fitted bootstrap resamples plotted for the winning hill (red lines) and gain-loss (blue lines) models. Horizontal black lines represent 3x bmad (dashed) and the activity cutoff (solid). L: Comparison to results from other assays: empirical cumulative distribution function of the winning model gain log AC50 value for all bisphenol AF samples in all assays where the experimental result was determined to be a positive hit.

However, the other parameters have different distributions. A long tail is observed for the hill coefficient (Fig 2B). The gnls log AC50 (Fig 2D) has large tails on both sides of the distribution, and the gnls top (Fig 2F) is bimodal. This indicates that a simple normal distribution and its associated confidence intervals cannot be assumed to apply. While the distribution of an individual model parameter is informative, many analyses of ToxCast data make use of the winning model rather than focusing on the hill or gnls models specifically. In addition to individual model fit parameters, each bootstrap sample has the calculated Akaike information criterion (AIC) for all three models.

Using this, we choose a winning model for each resampled curve by selecting the lowest AIC, using the same algorithm as the ToxCast pipeline point estimate. Fig 2 is an example where the winning model can vary between bootstrap samples. While there are measurements for the hill log AC50 and the gnls log AC50, not all of those represent curves where those models are the winning models. For each bootstrap sample, we select the log AC50 that corresponds to the winning model and pool those results (Fig 2G). Comparing Fig 2A, 2D and 2G, it is clear that the distribution of the winning model log AC50 is broader than either the hill or gnls log AC50 and is bimodal, representing the combination of the two different distributions from the hill and gnls subsets.
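The selection and pooling step can be illustrated with a short Python sketch. The data layout (one dictionary per bootstrap fit with `aic` and `log_ac50` entries) and the function names are hypothetical, and the rule shown is a plain lowest-AIC choice; any additional tie-breaking or filtering rules in the actual pipeline are not reproduced here.

```python
MODELS = ("cnst", "hill", "gnls")


def winning_model(aic):
    """Return the model name with the lowest AIC (first listed wins ties)."""
    return min(MODELS, key=lambda name: aic[name])


def pool_winning_log_ac50(bootstrap_fits):
    """Collect (model, log AC50) from the winning model of each resample."""
    pooled = []
    for fit in bootstrap_fits:
        winner = winning_model(fit["aic"])
        if winner != "cnst":  # the constant model has no AC50
            pooled.append((winner, fit["log_ac50"][winner]))
    return pooled
```

Keeping the model label next to each pooled value makes it possible to split the pooled distribution back into its hill and gnls components, which is what reveals the source of a bimodal shape.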

Fig 2H and 2I highlight the winning model gain coefficient and top parameter distributions, respectively. The uncertainty in the winning model is adding to the uncertainty in the potency parameter. By keeping the parameters paired with the bootstrap sample, the correlations between parameters can be explored. In Fig 2J , the log AC50 and top parameters for the winning model in all bootstrap samples are shown.

Notably, the hill and gnls components of the winning model parameters have different correlations. The shape and angle are different, with a stronger correlation between the log AC50 and the top parameters observed in the gnls than in the hill model. The reason for the shift in efficacy and potency between the two models is clarified by examining the bootstrap sample curves (Fig 2K).

The response at 0. In the ToxCast pipeline, this data fits to the gnls model solid black curve. When bootstrapped, however, uncertainty in the points shifts the winning model, such that out of bootstrap samples the hill red and gnls blue models are the chosen and times, respectively.

In the gnls model, this is clustered around 4, much greater than the 2. Because the log AC50 represents the calculated concentration where the response is half the value of the top parameter, the shift in the top between gnls and hill manifests as a shift in the log AC50 as well. As the winning model can vary between bootstrap samples, the hit call can change as well.

We explored the uncertainty in the hit call and model selection for all chemicals in 18 ER assays in the ToxCast database. For each chemical-assay pair, a model selection and a hit call were made for each bootstrap sample. Therefore, for each curve a hit probability was calculated, and among the samples that were hits the ratio of hill to gnls was determined. These results are shown in Fig 3.
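As a rough illustration of this bookkeeping, the sketch below summarizes the per-sample calls for one chemical-assay pair. The input format (a list of `(is_hit, winning_model)` tuples) and the function name are assumptions made for the example, and the split is reported as the fraction of hits won by the hill model rather than as a ratio.

```python
def summarize_hit_calls(calls):
    """Summarize bootstrap hit calls for one chemical-assay pair.

    calls: list of (is_hit, winning_model) tuples, one per bootstrap sample.
    Returns (hit_probability, hill_fraction_among_hits); the second value
    is None when no sample was a hit.
    """
    if not calls:
        raise ValueError("at least one bootstrap sample is required")
    hit_models = [model for is_hit, model in calls if is_hit]
    hit_probability = len(hit_models) / len(calls)
    hill_fraction = (hit_models.count("hill") / len(hit_models)
                     if hit_models else None)
    return hit_probability, hill_fraction
```

For example, ten resamples with six hill hits, two gnls hits, and two inactive calls give a hit probability of 0.8 and a hill fraction of 0.75 among the hits.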

For each plot, chemicals are ordered on the x-axis based on their hit call probability. The y-axis indicates the percent of bootstrap resamples that were calculated to be a positive hit with a hill model (red), a positive hit with a gain-loss model (blue), or a negative hit (black). The percentage of chemicals with a hit probability greater than 0 but less than 1 varies substantially between assays.

In contrast, many of the Odyssey Thera assays have a much smaller number of chemicals in this probability range. Propagation of model parameters, model selection, and hit call probability will vary depending on the final use case. If the assay hit call is an input into a model, such as building a QSAR model to predict assay activity, one option is to leave out any chemical with a hit probability between 0 and 1.

Another approach would be to assign a hit probability threshold for a chemical to be included as a positive or negative. In this study, we explore applications to the ER model, which is handled differently. The model returns an AUC value for 26 different "receptors" in the pathway model corresponding to predicted patterns of activity. These include agonist, antagonist, and pseudoreceptor (technology-specific assay interference) activity.

A cutoff of 0. Calculating the uncertainty in the ER AUC value requires meeting the four challenges highlighted in the introduction. The model is built on the entire curve for each chemical-assay pair, including all fit parameters, model selection, and activity call. Robustness is introduced to the model by using 18 assays from five different sources using different assay technologies. With chemicals and 18 assays, over 32, concentration-response curves are used when calculating the model scores. The model also has diverse applications. In addition to being used for regulatory decisions as part of the EDSP Tier 1 screening battery [ 62 ], the model has also been used to build QSAR models so that tens of thousands of additional chemicals can be screened in silico for estrogen agonism [ 39 ].

Therefore, the ER model makes an ideal use case for understanding how uncertainty quantification can be incorporated into analyzing HTS data. Uncertainty in all of the fit parameters, model selection, and activity call must be propagated for thousands of chemicals and 18 assays, in a way compatible with different assay technologies and giving a result useful for both scientific analysis and regulatory risk assessment.

By calculating the ER model score for each bootstrap sample, a distribution of ER model scores was determined. The bootstrapped uncertainty in this value is represented by error bars which mark the 2. Because the AUC value is calculated by aggregating results from 18 assays, noise from one assay will tend to be averaged out by noise in another assay, providing robustness to the AUC value. Chemicals with AUC Agonist values greater than 0. These chemicals are highly potent ER agonist control chemicals and are often active even at the lowest concentration tested in the ToxCast assays S1 Fig, If the low response values of the hill curve are not sampled i.

Other chemicals, like raloxifene hydrochloride, have a larger uncertainty in the AUC Agonist value because there is another AUC value with similar weighting within the model, in this case AUC Antagonist. The uncertainties around both the agonist and antagonist values are large because each bootstrap sample might skew towards either the agonist or the antagonist model being dominant (S1 Fig). Values for AUC pseudoreceptor have high uncertainties in general.

These values are calculated based on a subset of the assays, and are therefore not as robust as the AUC Agonist value. There are, however, a few chemicals that have relatively large AUC Agonist uncertainty values. A closer examination of the first of these, nordihydroguaiaretic acid, is explored in detail in Fig 5. By plotting the bootstrap results for all 18 ER assays for this chemical, the contribution to the ER AUC uncertainty from each assay is explored.

Almost all the assays have a relatively narrow range of intra-assay potency values. This high potency estimate combined with high uncertainty in the activity call translates into large uncertainty in the ER AUC Agonist value. When the bootstrap samples are inactive, the calculated AUC values decrease. Therefore, we conclude that the large uncertainty in the nordihydroguaiaretic acid ER AUC Agonist value is driven primarily by the large uncertainty in the ACEA activity call for this chemical.

One follow-up to such a finding would be to rerun the assay driving the overall large uncertainty. Each of the 18 ER assays is shown in a separate panel, with the assay cutoff indicated by a dashed horizontal line. Circles represent the pipeline normalized concentration-response data, and the solid black line indicates the winning model fit to the data if the hit call was positive.

All bootstrap curves with a positive hit call are drawn, with hill and gnls models colored red and blue, respectively. Because one of the purposes of the ER model is to predict in vivo activity, it is informative to compare model scores to known in vivo activity for the subset of chemicals that have been tested in vivo. In Fig 6 we plot the ER AUC Agonist value for all chemicals that have at least two guideline-like studies in the uterotrophic assay, and color the values based on the results of the in vivo experiments.

The majority of in vivo positives are above the 0. By adding uncertainty quantification, we are able to give further context around the model score and to increase confidence in decision making. The majority of compounds have small uncertainty around their model score, and therefore a decision based on that score can be made more confidently. Others, such as 4-nonylphenol and benz(a)anthracene, have confidence intervals that cross the activity cutoff, and therefore these cannot be confidently predicted to be ER in vivo active or not. Similarly, one of the false negatives in the model, tamoxifen, has a relatively large uncertainty that spans into the inconclusive range of model values.
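The comparison of a bootstrap interval against the activity cutoff can be sketched as below. This is a hedged example: the function names are hypothetical, the 95% percentile interval is an assumed default, and the cutoff passed in is a placeholder argument rather than the model's actual threshold.

```python
def percentile(values, p):
    """Percentile with linear interpolation, for p between 0 and 1."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * p
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def classify_score(bootstrap_scores, cutoff, level=0.95):
    """Label a chemical from the bootstrap distribution of its model score.

    Returns (label, (low, high)), where the label is 'inconclusive' when
    the percentile interval crosses the cutoff.
    """
    tail = (1.0 - level) / 2.0
    low = percentile(bootstrap_scores, tail)
    high = percentile(bootstrap_scores, 1.0 - tail)
    if low > cutoff:
        return "active", (low, high)
    if high < cutoff:
        return "inactive", (low, high)
    return "inconclusive", (low, high)
```

Only the chemicals labeled inconclusive would then need a closer look at the individual curves that feed the score.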

By quantifying the uncertainty around the model score, results with low confidence can be flagged to avoid incorrect decision making. Point estimates for agonist are colored by the uterotrophic consensus result: positive (red), equivocal (blue), and negative (black). Equivocal results in the uterotrophic assay indicate that some tests were positive while others were negative.

Using smooth nonparametric bootstrapping, we were able to quantify uncertainty in model fits to the experimental data, and propagate that uncertainty throughout the analysis of the data. Through the use of the ER model, we showed that the method is applicable to the use cases highlighted in the introduction. We calculated the uncertainty in all model fit parameters, and then propagated that uncertainty through model selection, hit call, and finally the ER AUC calculation. This method worked on data from numerous assay sources and technologies, and was fast enough to allow the full propagation to be calculated for all chemicals.

The limited number of assumptions and tuning parameters makes the method simple for non-subject-matter-experts to apply to other analyses and provides confidence in interpreting the results of the uncertainty quantification. The latter is particularly important for analyses like the ER model that are used in a regulatory context. One question that might be raised is how our approach compares with the asymptotic, maximum likelihood method for estimating confidence intervals. The estimation process we use includes features that invalidate standard asymptotic theory for evaluating the uncertainty of estimates.

First, the parameter space is bounded, and estimates do end up on the boundaries of the space.

Standard theory requires estimates to fall in the interior of the parameter space and is invalid on the boundary. Second, we fit multiple models and select the model with the best AIC. Again, standard theory does not apply. Finally, we believe that the sample sizes are such that we could not trust the asymptotic theory, even if the two issues above did not apply. Thus, we believe that one would want to use some sort of resampling method in any case to more reliably quantify uncertainty.

By quantifying uncertainty in the ER model score, we were able to better understand the semi-arbitrary activity cutoff for in vivo ER activity prediction.

The distribution of ER model scores gives a measure of confidence around this cutoff. In particular, we were able to identify a false positive by the large uncertainty around the ER AUC Agonist value, and then take a closer look at the individual curves used to calculate this value and identify which curve was contributing the most variability. Flagging for closer inspection is a powerful aspect of this uncertainty quantification. With over 32, concentration-response curves used to calculate the ER AUC values, a manual inspection of every curve would be difficult and error prone.

By limiting the manual inspection to only those chemicals with large variability and quantifying which curves are contributing to that variability, subject-matter-expert time is reserved for studying only the most difficult examples. As the number of assays, molecular targets, tested chemicals, and analyses grows, tools that target the need for manual inspection increase in importance. Uncertainty quantification is an important component of this analysis pipeline. Concentration-response data used in this study were obtained from 18 ER assays in the ToxCast database.

A summary of the assays used in this study can be found in Table 2. All model fits to the data used the ToxCast data pipeline R package tcpl version 1. The steps relevant to this study are briefly described. Three models are fit to the normalized concentration-response data using maximum likelihood to estimate the parameters.

The Nelder-Mead algorithm was used to carry out the optimization. All experimental data concentrations $x_i$ and model potency parameters ($ga$, $la$) are expressed as log10 concentration, where concentration is in uM. The constant ('cnst') model, with a constant value of zero response, is given by:

$$f(x_i) = 0$$

The second model fit is the constrained hill ('hill') model:

$$f(x_i) = \frac{tp}{1 + 10^{(ga - x_i)\,gw}}$$

subject to bound constraints on $tp$, $ga$, and $gw$. Fitted parameters are the top asymptote ($tp$), the log concentration at which the activity is half that of the top asymptote ($ga$), and the hill coefficient ($gw$).

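All constraints are defined relative to the maximum response (max resp), minimum concentration (min conc), and maximum concentration (max conc) of the data being fit, not at the assay level. The bottom asymptote is set to zero. Notably, the constraint that $tp$ be greater than zero, coupled with the bottom asymptote at zero, forces the model to fit only in the gain direction. The final model fit is the constrained gain-loss ('gnls') model. This model is constructed as the product of a gain-direction hill model and a hill model that operates in the loss direction, with shared top and bottom asymptotes:

$$f(x_i) = tp \cdot \frac{1}{1 + 10^{(ga - x_i)\,gw}} \cdot \frac{1}{1 + 10^{(x_i - la)\,lw}}$$

where $la$ and $lw$ are the log AC50 and hill coefficient in the loss direction, subject to bound constraints on the fitted parameters.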
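For readers who prefer code, the three model functions can be written as a short Python sketch. It follows the descriptions above (zero bottom asymptote, log10 concentrations, gain-loss as a product of two hill terms); the function signatures are illustrative, the name `lw` for the loss-direction hill coefficient is an assumption, and the parameter bounds are not enforced here.

```python
def cnst(x):
    """Constant model: zero response at every log10 concentration x."""
    return 0.0


def hill(x, tp, ga, gw):
    """Gain-direction hill model with the bottom asymptote fixed at zero.

    tp: top asymptote, ga: log10 AC50, gw: hill coefficient.
    """
    return tp / (1.0 + 10.0 ** ((ga - x) * gw))


def gnls(x, tp, ga, gw, la, lw):
    """Gain-loss model: product of a gain hill term and a loss hill term
    that share the same top (tp) and a zero bottom asymptote.

    la: log10 AC50 in the loss direction, lw: loss hill coefficient.
    """
    gain = 1.0 / (1.0 + 10.0 ** ((ga - x) * gw))
    loss = 1.0 / (1.0 + 10.0 ** ((x - la) * lw))
    return tp * gain * loss
```

At `x == ga` the hill model returns exactly half of `tp`, matching the definition of the AC50. When `la` lies far above the tested range, the loss term is close to one and the gain-loss curve approaches the hill curve.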
