Introduction

By default, ssdtools does not rescale data when fitting distributions, so the parameter estimates can be used directly to estimate HCx values. However, if rescale = TRUE in ssd_fit_dists() or ssd_fit_burrlioz(), the data are rescaled by dividing by the geometric mean of the minimum and maximum positive finite values, which may aid model fitting in some instances. To examine the extent to which rescaling improves model fitting, we fitted the 9 distributions with valid likelihoods currently implemented in ssdtools to the 729 acute datasets in the envirotox R data package, with and without rescaling.
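The rescaling described above can be sketched in a few lines of base R. This is only an illustration of the stated rule (divide by the geometric mean of the minimum and maximum positive finite values); `rescale_factor()` and the `conc` vector are hypothetical names made up for this example and are not the ssdtools internals.

```r
# Hypothetical helper: geometric mean of the smallest and largest
# positive finite values, as described in the text.
rescale_factor <- function(x) {
  x <- x[is.finite(x) & x > 0]  # drop NA, Inf, zero and negative values
  sqrt(min(x) * max(x))
}

# Made-up concentrations, including values that the rule ignores
conc <- c(0.5, 2, 8, 32, Inf, NA)

scale_by <- rescale_factor(conc)  # sqrt(0.5 * 32) = 4
rescaled_conc <- conc / scale_by

scale_by
rescaled_conc
```

After rescaling, the smallest and largest finite values (0.125 and 8 here) are reciprocals of one another, so the data are centred around 1 on the log scale.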

Methods

Consistent with the default settings in ssdtools, a distribution was considered to have been successfully fitted if it converged, irrespective of whether standard errors were computable for the estimates or whether a parameter was at a boundary.

The R code that performs the analysis is as follows.

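# Candidate distributions returned by ssd_dists_all()
dists <- ssdtools::ssd_dists_all()

# Fit the distributions to a single dataset; the fit is wrapped in a list
# so that it can be stored in a list-column
fit_dists <- function(data, d, r) {
  list(ssdtools::ssd_fit_dists(data = data, dists = d, rescale = r,
                               computable = FALSE, at_boundary_ok = TRUE, silent = TRUE))
}

# Fit each chemical's dataset without and with rescaling, then keep only
# the names of the distributions in each fit
data <- envirotox::envirotox_acute |>
  dplyr::nest_by(Chemical) |>
  dplyr::mutate(ssd_fit_unscale = fit_dists(.data$data, d = dists, r = FALSE),
                ssd_fit_rescale = fit_dists(.data$data, d = dists, r = TRUE),
                dists_unscale = list(names(ssd_fit_unscale)),
                dists_rescale = list(names(ssd_fit_rescale))) |>
  dplyr::select(!c(ssd_fit_unscale, ssd_fit_rescale))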

# Percentage of datasets to which each distribution was fitted without rescaling
unscaled <- data |>
  dplyr::select(Chemical, Distribution = dists_unscale) |>
  tidyr::unnest(Distribution) |>
  dplyr::ungroup() |>
  dplyr::count(Distribution) |>
  dplyr::mutate(n = n / nrow(data) * 100) |>
  dplyr::select(Distribution, Unscaled = n)

# Percentage of datasets to which each distribution was fitted with rescaling
rescaled <- data |>
  dplyr::select(Chemical, Distribution = dists_rescale) |>
  tidyr::unnest(Distribution) |>
  dplyr::ungroup() |>
  dplyr::count(Distribution) |>
  dplyr::mutate(n = n / nrow(data) * 100) |>
  dplyr::select(Distribution, Rescaled = n)

# Combine the two sets of rates; inner_join() keeps only the distributions
# that appear in both tables
results <- unscaled |>
  dplyr::inner_join(rescaled, by = "Distribution")
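The rate calculation and join above can be illustrated on toy data. The following base R sketch is not the analysis code: the `fitted_unscaled` and `fitted_rescaled` lists and the `fit_rate()` helper are hypothetical and invented for this example. It shows one property of the approach worth keeping in mind: a distribution that is never fitted under one setting gets no row in that count, so an inner join would drop it, whereas a full join (here `merge(..., all = TRUE)`) keeps it with a rate of zero.

```r
# Toy input (hypothetical): the distributions fitted to each of 4 datasets
fitted_unscaled <- list(c("lnorm", "gamma"), "lnorm",
                        c("lnorm", "gamma"), "lnorm")
fitted_rescaled <- list(c("lnorm", "gamma", "gompertz"), "lnorm",
                        c("lnorm", "gamma"), c("lnorm", "gompertz"))

# Percentage of datasets to which each distribution was fitted
fit_rate <- function(fitted, label) {
  counts <- table(unlist(fitted))
  out <- data.frame(Distribution = names(counts),
                    rate = as.numeric(counts) / length(fitted) * 100)
  names(out)[2] <- label
  out
}

unscaled <- fit_rate(fitted_unscaled, "Unscaled")
rescaled <- fit_rate(fitted_rescaled, "Rescaled")

# Full join: keep distributions that are missing from either table
results <- merge(unscaled, rescaled, by = "Distribution", all = TRUE)

# A missing rate means the distribution was never fitted under that setting
results$Unscaled[is.na(results$Unscaled)] <- 0
results$Rescaled[is.na(results$Rescaled)] <- 0

results
```

In this toy example gompertz is never fitted without rescaling, so it would be absent from an inner join but appears here with an Unscaled rate of 0.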

Findings

The percentage of the acute datasets in the envirotox R data package to which each distribution was successfully fitted, without (Unscaled) and with (Rescaled) rescaling.

Distribution     Unscaled (%)   Rescaled (%)
burrIII3                100.0          100.0
gamma                   100.0          100.0
lgumbel                 100.0          100.0
llogis                  100.0          100.0
llogis_llogis            95.9           95.7
lnorm                   100.0          100.0
lnorm_lnorm              97.0           96.4
weibull                 100.0          100.0

The results indicate that, for the 729 acute datasets considered, rescaling has little to no effect on fitting for the currently implemented distributions with valid likelihoods, with one exception. The exception is the gompertz distribution for which the fitting rate increases from to %. Despite the substantial improvement for the gompertz, the fitting rate is still only ~ % which is insufficient to warrant reconsideration of its inclusion in the default set.

Recommendations

Rescaling the data has little to no effect on the fitting rate for the models in the default set. Consequently, we recommend that ssd_fit_dists() and ssd_fit_burrlioz() continue to use rescale = FALSE as the default value and that it remain the fixed option in the ssd_fit_bcanz() function.

Session Info

The results were generated with the following packages.

#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.5.0 (2025-04-11)
#>  os       Ubuntu 24.04.2 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en-US
#>  collate  C.UTF-8
#>  ctype    C.UTF-8
#>  tz       UTC
#>  date     2025-06-13
#>  pandoc   3.1.11 @ /opt/hostedtoolcache/pandoc/3.1.11/x64/ (via rmarkdown)
#>  quarto   NA
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  abind          1.4-8      2024-09-12 [1] RSPM
#>  bslib          0.9.0      2025-01-30 [1] RSPM
#>  cachem         1.1.0      2024-05-16 [1] RSPM
#>  chk            0.10.0     2025-01-24 [1] RSPM
#>  cli            3.6.5      2025-04-23 [1] RSPM
#>  codetools      0.2-20     2024-03-31 [3] CRAN (R 4.5.0)
#>  desc           1.4.3      2023-12-10 [1] RSPM
#>  digest         0.6.37     2024-08-19 [1] RSPM
#>  dplyr          1.1.4      2023-11-17 [1] RSPM
#>  envirotox      0.0.0.9001 2025-06-13 [1] Github (poissonconsulting/envirotox@c3dabe2)
#>  evaluate       1.0.3      2025-01-10 [1] RSPM
#>  farver         2.1.2      2024-05-13 [1] RSPM
#>  fastmap        1.2.0      2024-05-15 [1] RSPM
#>  fs             1.6.6      2025-04-12 [1] RSPM
#>  furrr          0.3.1      2022-08-15 [1] RSPM
#>  future         1.58.0     2025-06-05 [1] RSPM
#>  generics       0.1.4      2025-05-09 [1] RSPM
#>  ggplot2        3.5.2      2025-04-09 [1] RSPM
#>  globals        0.18.0     2025-05-08 [1] RSPM
#>  glue           1.8.0      2024-09-30 [1] RSPM
#>  goftest        1.2-3      2021-10-07 [1] RSPM
#>  gtable         0.3.6      2024-10-25 [1] RSPM
#>  htmltools      0.5.8.1    2024-04-04 [1] RSPM
#>  jquerylib      0.1.4      2021-04-26 [1] RSPM
#>  jsonlite       2.0.0      2025-03-27 [1] RSPM
#>  knitr          1.50       2025-03-16 [1] RSPM
#>  lattice        0.22-6     2024-03-20 [3] CRAN (R 4.5.0)
#>  lifecycle      1.0.4      2023-11-07 [1] RSPM
#>  listenv        0.9.1      2024-01-29 [1] RSPM
#>  magrittr       2.0.3      2022-03-30 [1] RSPM
#>  Matrix         1.7-3      2025-03-11 [3] CRAN (R 4.5.0)
#>  parallelly     1.45.0     2025-06-02 [1] RSPM
#>  pillar         1.10.2     2025-04-05 [1] RSPM
#>  pkgconfig      2.0.3      2019-09-22 [1] RSPM
#>  pkgdown        2.1.3      2025-05-25 [1] any (@2.1.3)
#>  plyr           1.8.9      2023-10-02 [1] RSPM
#>  purrr          1.0.4      2025-02-05 [1] RSPM
#>  R6             2.6.1      2025-02-15 [1] RSPM
#>  ragg           1.4.0      2025-04-10 [1] RSPM
#>  rbibutils      2.3        2024-10-04 [1] RSPM
#>  RColorBrewer   1.1-3      2022-04-03 [1] RSPM
#>  Rcpp           1.0.14     2025-01-12 [1] RSPM
#>  Rdpack         2.6.4      2025-04-09 [1] RSPM
#>  rlang          1.1.6      2025-04-11 [1] RSPM
#>  rmarkdown      2.29       2024-11-04 [1] RSPM
#>  sass           0.4.10     2025-04-11 [1] RSPM
#>  scales         1.4.0      2025-04-24 [1] RSPM
#>  sessioninfo    1.2.3      2025-02-05 [1] RSPM
#>  ssddata        1.0.0      2021-11-05 [1] RSPM
#>  ssdtools       2.3.0.9004 2025-06-13 [1] Github (poissonconsulting/ssdtools@17874d3)
#>  stringi        1.8.7      2025-03-27 [1] RSPM
#>  stringr        1.5.1      2023-11-14 [1] RSPM
#>  systemfonts    1.2.3      2025-04-30 [1] RSPM
#>  textshaping    1.0.1      2025-05-01 [1] RSPM
#>  tibble         3.3.0      2025-06-08 [1] RSPM
#>  tidyr          1.3.1      2024-01-24 [1] RSPM
#>  tidyselect     1.2.1      2024-03-11 [1] RSPM
#>  TMB            1.9.17     2025-03-10 [1] RSPM
#>  universals     0.0.5      2022-09-22 [1] RSPM
#>  vctrs          0.6.5      2023-12-01 [1] RSPM
#>  withr          3.0.2      2024-10-28 [1] RSPM
#>  xfun           0.52       2025-04-02 [1] RSPM
#>  yaml           2.3.10     2024-07-26 [1] RSPM
#> 
#>  [1] /home/runner/work/_temp/Library
#>  [2] /opt/R/4.5.0/lib/R/site-library
#>  [3] /opt/R/4.5.0/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────