The canonical expansion entry point (TARGETS-DESIGN.md section 1/section 2):
ssd_scenario_tasks() derives the sample, fit, and hc task tables from a
scenario in one call and bundles them into an ssdsims_task_set. The per-step
derivations (ssd_scenario_sample_tasks(), ssd_scenario_fit_tasks(),
ssd_scenario_hc_tasks()) remain available for callers that need a single
table; each is equivalent to ssd_scenario_tasks(scenario, step) for the
matching step.
All derivations are RNG-free: they perform no random-number generation and add
no seed/primer/stream columns (those arrive in later roadmap steps; see
TARGETS-DESIGN.md section 2). Each row carries a path-style <step>_id
primary key (the Hive partition path) and, for non-root steps, its parent
step's <parent>_id as a joinable foreign key, so a child task references its
parent by a single column.
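The keying scheme can be sketched in base R. This is a minimal illustration rather than the package's implementation: make_path_id() is a hypothetical helper, the sample_id layout is copied from the example output on this page, and the fit_id layout (parent path plus the child axis) is an assumption made only so the join has something to work with.

```r
# Hypothetical helper (not part of ssdsims): build a Hive-style partition path
# such as "dataset=ccme_boron/sim=1/replace=TRUE" from the named axis columns.
make_path_id <- function(tbl, axes) {
  parts <- lapply(axes, function(axis) {
    paste0(axis, "=", as.character(tbl[[axis]]))
  })
  do.call(paste, c(parts, sep = "/"))
}

# Toy sample-task table: one row per dataset x sim x replace cell.
sample_tasks <- expand.grid(
  dataset = "ccme_boron",
  sim = 1:2,
  replace = TRUE,
  stringsAsFactors = FALSE
)
sample_tasks$sample_id <- make_path_id(
  sample_tasks, c("dataset", "sim", "replace")
)

# Toy child table: cross every sample task with an nrow axis (by = NULL gives
# the Cartesian product). The fit_id layout below is assumed for illustration.
fit_tasks <- merge(sample_tasks, data.frame(nrow = c(6L, 10L)), by = NULL)
fit_tasks$fit_id <- paste0(fit_tasks$sample_id, "/nrow=", fit_tasks$nrow)

# The child reaches its parent through the single sample_id column.
joined <- merge(
  fit_tasks[c("fit_id", "sample_id", "nrow")],
  sample_tasks,
  by = "sample_id"
)
```

Because the identifier encodes every axis value, it is unique per row, and a child table needs only the one parent column to join back rather than the full set of parent axes.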
Usage
ssd_scenario_tasks(scenario, step = NULL)
ssd_scenario_sample_tasks(scenario)
ssd_scenario_fit_tasks(scenario)
ssd_scenario_hc_tasks(scenario)
Arguments
- scenario
An ssdsims_scenario from ssd_define_scenario().
- step
Optional single step name ("sample", "fit", or "hc"). When supplied, returns just that step's ssdsims_tasks table (the same as the matching ssd_scenario_*_tasks()); when NULL (default), returns the full ssdsims_task_set.
Value
An ssdsims_task_set object (a list with sample, fit, and hc
elements, each an ssdsims_tasks table), or - when step is supplied - the
single ssdsims_tasks table for that step. Each ssdsims_tasks table is a
classed tibble recording one step, with one row per cell of that step's
cross-join.
Functions
- ssd_scenario_sample_tasks(): Derive just the sample task table: one row per cell of the cross-join of the scenario's dataset names, replicate index (1:nsim), and replace values, keyed by sample_id. Each row is the single random draw that every nrow value sub-truncates (TARGETS-DESIGN.md section 5), so nrow is not a sample axis - the draw is shared. The draw size is the scenario's nrow_max setting, resolved by the runner against each dataset, not a row column: the table carries only the task identity.
- ssd_scenario_fit_tasks(): Derive just the fit task table: cross each sample-task identity (dataset, sim, replace) with the scenario's nrow values and each row of the scenario's fit argument grid (rescale, computable, at_boundary_ok, min_pmix name, range_shape1, range_shape2). nrow is a genuine fit cross-join axis: the fit step truncates its parent sample inline (head(sample, nrow), RNG-free) before fitting, so the shared draw is sub-truncated without a separate data step (TARGETS-DESIGN.md section 5). min_pmix is referenced by name, not by function value (TARGETS-DESIGN.md section 1.1). Each row carries a fit_id primary key and a sample_id foreign key referencing its parent sample task.
- ssd_scenario_hc_tasks(): Derive just the hc task table: cross each fit-task identity with each row of the scenario's hc argument grid (nboot, ci_method, parametric) and with the scenario's declared distribution sets (distset, the set names). The scenario's scalar ci flag and the est_method setting are applied uniformly to every hc row - neither is a cross-join axis nor an emitted column; the runners read ci from the scenario and every requested est_method is summarised within each task from its single bootstrap sample set. When ci = FALSE the bootstrap-only scenario options (nboot, ci_method, parametric) are canonically NA, leaving distset as the only fan-out, so the grid is exactly D hc rows per fit task (one per set); when ci = TRUE the grid fans out across distset x nboot x ci_method x parametric. A single-set collection yields one distset value (one hc row per fit task when ci = FALSE). Each row carries an hc_id primary key, its distset name, and a fit_id foreign key referencing its parent (union) fit task.
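Two of the rules above lend themselves to a small base R sketch: the shared draw that each nrow value truncates, and the hc fan-out per fit task. Everything here is illustrative; the concentrations, set names, option values, and the hc_grid() helper are hypothetical and do not come from ssdsims.

```r
# --- Shared draw, truncated per nrow value ---------------------------------
# One random draw of size nrow_max is taken once; each nrow value then takes
# a prefix of it with head(), which needs no further random numbers.
set.seed(42)
pool <- data.frame(Conc = c(1.0, 1.8, 2.4, 5.1, 9.3, 12, 20, 33))
nrow_max <- 6L
draw <- pool[sample.int(nrow(pool), nrow_max, replace = TRUE), , drop = FALSE]
truncated <- lapply(c(3L, 6L), function(n) head(draw, n))

# --- hc fan-out per fit task -----------------------------------------------
# Illustrative option values only.
distset <- c("set_a", "set_b") # D = 2 declared distribution sets
nboot <- c(10L, 100L)
ci_method <- "weighted_samples"
parametric <- c(TRUE, FALSE)

# Hypothetical helper: the hc rows generated for ONE fit task.
hc_grid <- function(ci) {
  if (ci) {
    # ci = TRUE: full cross-join of the four axes.
    expand.grid(
      distset = distset, nboot = nboot, ci_method = ci_method,
      parametric = parametric, stringsAsFactors = FALSE
    )
  } else {
    # ci = FALSE: bootstrap-only options collapse to NA, so only distset
    # fans out.
    data.frame(
      distset = distset, nboot = NA_integer_, ci_method = NA_character_,
      parametric = NA, stringsAsFactors = FALSE
    )
  }
}

rows_without_ci <- nrow(hc_grid(FALSE)) # D
rows_with_ci <- nrow(hc_grid(TRUE))     # D * length(nboot) * 1 * 2
```

With these values the smaller truncation is exactly the first three rows of the larger one, and each fit task yields 2 hc rows without confidence intervals or 8 with them.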
Examples
data <- ssd_scenario_data(ssddata::ccme_boron)
scenario <- ssd_define_scenario(data, nsim = 3L, seed = 42L)
tasks <- ssd_scenario_tasks(scenario)
tasks
#> <ssdsims_task_set>
#> sample tasks: 3
#> fit tasks: 3
#> hc tasks: 3
tasks$hc
#> <ssdsims_tasks: hc>
#> axes: dataset, sim, replace, nrow, rescale, computable, at_boundary_ok, min_pmix, range_shape1, range_shape2, nboot, ci_method, parametric, distset
#> tasks: 3
#> # A tibble: 3 × 16
#> dataset sim replace nrow rescale computable at_boundary_ok min_pmix
#> <chr> <int> <lgl> <int> <lgl> <lgl> <lgl> <chr>
#> 1 ccme_boron 1 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> 2 ccme_boron 2 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> 3 ccme_boron 3 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> # ℹ 8 more variables: range_shape1 <list>, range_shape2 <list>, nboot <int>,
#> # ci_method <chr>, parametric <lgl>, distset <chr>, hc_id <chr>, fit_id <chr>
ssd_scenario_tasks(scenario, "hc")
#> <ssdsims_tasks: hc>
#> axes: dataset, sim, replace, nrow, rescale, computable, at_boundary_ok, min_pmix, range_shape1, range_shape2, nboot, ci_method, parametric, distset
#> tasks: 3
#> # A tibble: 3 × 16
#> dataset sim replace nrow rescale computable at_boundary_ok min_pmix
#> <chr> <int> <lgl> <int> <lgl> <lgl> <lgl> <chr>
#> 1 ccme_boron 1 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> 2 ccme_boron 2 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> 3 ccme_boron 3 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> # ℹ 8 more variables: range_shape1 <list>, range_shape2 <list>, nboot <int>,
#> # ci_method <chr>, parametric <lgl>, distset <chr>, hc_id <chr>, fit_id <chr>
ssd_scenario_sample_tasks(scenario)
#> <ssdsims_tasks: sample>
#> axes: dataset, sim, replace
#> tasks: 3
#> # A tibble: 3 × 4
#> dataset sim replace sample_id
#> <chr> <int> <lgl> <chr>
#> 1 ccme_boron 1 TRUE dataset=ccme_boron/sim=1/replace=TRUE
#> 2 ccme_boron 2 TRUE dataset=ccme_boron/sim=2/replace=TRUE
#> 3 ccme_boron 3 TRUE dataset=ccme_boron/sim=3/replace=TRUE
data <- ssd_scenario_data(ssddata::ccme_boron)
scenario <- ssd_define_scenario(
data,
nsim = 3L,
seed = 42L,
rescale = c(FALSE, TRUE)
)
ssd_scenario_fit_tasks(scenario)
#> <ssdsims_tasks: fit>
#> axes: dataset, sim, replace, nrow, rescale, computable, at_boundary_ok, min_pmix, range_shape1, range_shape2
#> tasks: 6
#> # A tibble: 6 × 12
#> dataset sim replace nrow rescale computable at_boundary_ok min_pmix
#> <chr> <int> <lgl> <int> <lgl> <lgl> <lgl> <chr>
#> 1 ccme_boron 1 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> 2 ccme_boron 1 TRUE 6 TRUE FALSE TRUE ssd_min_pmix
#> 3 ccme_boron 2 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> 4 ccme_boron 2 TRUE 6 TRUE FALSE TRUE ssd_min_pmix
#> 5 ccme_boron 3 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> 6 ccme_boron 3 TRUE 6 TRUE FALSE TRUE ssd_min_pmix
#> # ℹ 4 more variables: range_shape1 <list>, range_shape2 <list>, fit_id <chr>,
#> # sample_id <chr>
data <- ssd_scenario_data(ssddata::ccme_boron)
scenario <- ssd_define_scenario(
data,
nsim = 2L,
seed = 42L,
ci = TRUE,
nboot = c(10L, 100L)
)
ssd_scenario_hc_tasks(scenario)
#> <ssdsims_tasks: hc>
#> axes: dataset, sim, replace, nrow, rescale, computable, at_boundary_ok, min_pmix, range_shape1, range_shape2, nboot, ci_method, parametric, distset
#> tasks: 4
#> # A tibble: 4 × 16
#> dataset sim replace nrow rescale computable at_boundary_ok min_pmix
#> <chr> <int> <lgl> <int> <lgl> <lgl> <lgl> <chr>
#> 1 ccme_boron 1 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> 2 ccme_boron 1 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> 3 ccme_boron 2 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> 4 ccme_boron 2 TRUE 6 FALSE FALSE TRUE ssd_min_pmix
#> # ℹ 8 more variables: range_shape1 <list>, range_shape2 <list>, nboot <int>,
#> # ci_method <chr>, parametric <lgl>, distset <chr>, hc_id <chr>, fit_id <chr>