
Scenarios and task expansion

Define a declarative simulation scenario and expand it into the per-step task tables (sample, fit, hc) that the targets-based pipeline builds on, with a baseline loop runner.

ssd_scenario_data()
Assemble and Validate Datasets for a Simulation Scenario
ssd_gen()
Materialise Generator Datasets for a Simulation Scenario
ssd_pmix()
Assemble and Validate min_pmix Functions for a Simulation Scenario
ssd_distset()
Assemble One or More Distribution Sets
ssd_define_scenario()
Define a Simulation Scenario
ssd_scenario_tasks() ssd_scenario_sample_tasks() ssd_scenario_fit_tasks() ssd_scenario_hc_tasks()
Expand a Scenario into Task Tables
ssd_run_scenario_baseline()
Run a Scenario with the Baseline Loop Runner
ssd_run_scenario_shards()
Run a Scenario over Hive-partitioned Parquet Shards (single core)
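
As a rough illustration of what expanding a scenario into per-step task tables means, the sketch below uses base R only. It does not call ssdsims: the `scenario` list, its axis names (`dataset`, `n`, `sim`, `distset`, `ci`) and their values are hypothetical stand-ins, and the real task tables may carry different columns.

```r
# Hypothetical scenario: a plain named list of axes, not an ssdsims object.
scenario <- list(
  dataset = c("data_a", "data_b"),  # dataset names (illustrative)
  n       = c(6L, 10L),             # sample sizes
  sim     = 1:3,                    # replicate index
  distset = c("set_1", "set_2"),    # distribution-set names (illustrative)
  ci      = c(FALSE, TRUE)          # whether the hc step computes intervals
)

# Sample step: one task per dataset x sample size x replicate.
sample_tasks <- expand.grid(
  dataset = scenario$dataset,
  n = scenario$n,
  sim = scenario$sim,
  stringsAsFactors = FALSE
)

# Fit step: every sample task crossed with every distribution set.
# merge(..., by = NULL) returns the Cartesian product of its inputs.
fit_tasks <- merge(
  sample_tasks,
  data.frame(distset = scenario$distset, stringsAsFactors = FALSE),
  by = NULL
)

# hc step: every fit task crossed with every hc setting.
hc_tasks <- merge(fit_tasks, data.frame(ci = scenario$ci), by = NULL)

# Each step's table extends the previous one by its own axes.
c(sample = nrow(sample_tasks), fit = nrow(fit_tasks), hc = nrow(hc_tasks))
```

Each downstream table is a cross-product of the upstream table with that step's own axes, so a downstream task can always be traced back to the sample task it depends on.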

Targets pipeline

Group a step’s tasks into per-shard tables keyed by partition_by, run a shard with the per-task RNG primitives, writing one Parquet file per shard, and fan in a summary. These are the building blocks of the static-branching targets pipeline (see the inst/targets-templates/small/_targets.R template).

ssd_scenario_sample_shards() ssd_scenario_fit_shards() ssd_scenario_hc_shards()
Group Tasks into Shards
ssd_run_sample_step() ssd_run_fit_step() ssd_run_hc_step()
Run a Step Shard
ssd_scenario_targets()
Build the Targets Pipeline for a Scenario
ssd_summarise()
Summarise a Run's hc Estimates Across Shards
scenario_results_dir()
Seed- and Layout-keyed Results Root for a Scenario
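
The idea of grouping tasks into shards keyed by partition_by can be sketched in base R as follows. This is an assumption-laden illustration, not the package's implementation: the `tasks` table and its columns are hypothetical, and no Parquet file is written here.

```r
# Hypothetical task table (illustrative columns, not the ssdsims schema).
tasks <- expand.grid(
  dataset = c("data_a", "data_b"),
  n = c(6L, 10L),
  sim = 1:3,
  stringsAsFactors = FALSE
)

# Columns that define a shard; every other column varies within a shard.
partition_by <- c("dataset", "n")

# Build a Hive-style key such as "dataset=data_a/n=6" for every task.
key_parts <- lapply(partition_by, function(col) paste0(col, "=", tasks[[col]]))
shard_key <- do.call(paste, c(key_parts, sep = "/"))

# One table per shard; in a real run each would become one output file.
shards <- split(tasks, shard_key)

names(shards)
vapply(shards, nrow, integer(1))
```

Choosing fewer partition columns gives fewer, larger shards; choosing more gives many small ones. That trade-off is what a partition_by setting controls in this sketch.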

Designs (combining scenarios)

Run several scenarios as one pipeline: a design is the de-duplicated union of its members’ grids (the irregular/ragged grid - finer detail over a subset of the axes without the full cross-product), addressed by cell under a seed=/layout= tree. Build an ssd_design(), turn it into one targets pipeline with ssd_design_targets(), and fan in per-scenario and combined summaries.

ssd_design()
Assemble and Validate a Design of Scenarios
ssd_design_targets()
Build the Targets Pipeline for a Design
ssd_summarise_member()
Summarise One Design Member from the Shared hc Shards
ssd_summarise_design()
Combine Per-scenario Summaries into One Design Summary
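
To make "de-duplicated union of its members' grids" concrete, here is a small base R sketch. The two member grids and their axes (`dataset`, `n`) are hypothetical; the sketch only shows why the union is a ragged grid that is smaller than the full cross-product.

```r
# Member 1: a coarse grid over both datasets (hypothetical axes).
coarse <- expand.grid(
  dataset = c("data_a", "data_b"),
  n = c(6L, 10L),
  stringsAsFactors = FALSE
)

# Member 2: finer detail in n, but only for one dataset.
fine <- expand.grid(
  dataset = "data_a",
  n = c(6L, 8L, 10L, 12L),
  stringsAsFactors = FALSE
)

# The design's cells: stack the member grids and drop repeated cells,
# so a cell shared by both members is only computed once.
design_cells <- unique(rbind(coarse, fine))

# For comparison, the full cross-product of all axis values seen.
full_grid <- expand.grid(
  dataset = unique(design_cells$dataset),
  n = unique(design_cells$n),
  stringsAsFactors = FALSE
)

c(design = nrow(design_cells), full_cross_product = nrow(full_grid))
```

The shared cells (here `data_a` at `n = 6` and `n = 10`) appear once in the design, and `data_b` is never run at the finer sample sizes.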

Cloud upload

Typed, self-validating upload destinations (ssd_upload_azure(), ssd_upload_dryrun()) and the class-dispatched generics that probe credentials, ship each shard, and read the uploaded results back in place - the remote-destination sibling of root on ssd_scenario_targets().

ssd_upload_azure() ssd_upload_dryrun()
Upload Destinations for a Scenario's Shards
ssd_test_upload()
Probe an Upload Destination's Credentials and Connectivity
ssd_upload_shard()
Ship Shard (or Summary) Parquet Files to an Upload Destination
ssd_open_uploaded()
Open Uploaded Results for Querying, In Place
ssd_summarise_uploaded()
Summarise Uploaded Results, In Place (the cloud ssd_summarise())

Cost estimation

Predict, before launching, roughly how much compute a scenario costs and how long its single longest task runs. Calibrate the per-task cost model on the target machine (or use the shipped default), then apply it to a scenario read-only - no fit, bootstrap, or RNG.

ssd_estimate_cost()
Estimate a Scenario's Compute Cost and Longest Task
ssd_calibrate_cost()
Calibrate the Per-task Cost Model on the Current Machine
ssd_cost_calibration()
Default Cost Calibration
ssd_cost_calibration_default
Default Cost Calibration Object
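
A read-only cost estimate of this kind can be pictured as applying a per-task formula to the task table and then taking a sum and a maximum. The sketch below is purely illustrative: the linear cost form, the coefficient names and every number in `calibration` are made-up placeholders, not the shipped default calibration or the package's actual model.

```r
# Placeholder calibration: seconds per task as a linear function of
# sample size and bootstrap replicates (assumed form, made-up values).
calibration <- list(base = 0.05, per_row = 0.002, per_boot = 0.04)

# Hypothetical task table: sample size and bootstrap replicates per task.
tasks <- expand.grid(n = c(6L, 50L), nboot = c(0L, 1000L))

# Apply the model to each task; nothing is fitted, bootstrapped or sampled.
tasks$seconds <- calibration$base +
  calibration$per_row * tasks$n +
  calibration$per_boot * tasks$nboot

total_seconds   <- sum(tasks$seconds)  # overall compute to budget for
longest_seconds <- max(tasks$seconds)  # lower bound on wall-clock time
longest_task    <- tasks[which.max(tasks$seconds), c("n", "nboot")]
```

The total informs how much compute to request, while the longest single task bounds the wall-clock time from below no matter how many workers are available.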

Cost analysis

Read a completed run’s observed compute back from the per-task timings its fit/hc shards carry: attribute it to the scenario axes, compare it against the prediction, and recalibrate the cost model from the measured durations. All read-only - no pipeline, fit, bootstrap, or RNG.

ssd_analyse_cost()
Analyse a Run's Observed Compute Cost
ssd_compare_cost()
Compare Predicted Against Observed Compute Cost
ssd_calibrate_cost_from_run()
Recalibrate the Cost Model from an Observed Run

Scenario accessors

A technical detail of the pipelines: isolate an already-materialised value from a scenario by name - the dataset tibble or the min_pmix function. Names (not values) drive task hashing, so these accessors resolve a name back to the value carried on the scenario for execution.

scenario_dataset()
Isolate a Materialised Dataset from a Scenario by Name
scenario_min_pmix()
Isolate a Materialised min_pmix Function from a Scenario by Name
scenario_distset()
Isolate a Distribution Set from a Scenario by Name

Reproducible RNG

Parallel-safe seeding helpers for the dqrng + hash backend: a per-task primer derived from the scenario seed, scoped backend activation, and scoped state installation.

task_primer()
Derive a Per-task Primer from its Parameters
local_dqrng_backend()
Local dqrng pcg64 Backend
local_dqrng_state() with_dqrng_state()
Local/With dqrng State
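
The principle behind a per-task primer can be shown with a toy base R sketch: derive the seed from the scenario seed plus the task's parameters, so a task's draws do not depend on which other tasks ran first. Everything here is a stand-in. `toy_hash()`, `task_seed()` and `draw_for_task()` are hypothetical names, the polynomial string hash replaces whatever hash the package uses, and `set.seed()` replaces the dqrng pcg64 backend. Unlike the scoped helpers listed above, this sketch changes the global RNG state.

```r
# Toy string hash (polynomial rolling hash); a stand-in, not the real one.
toy_hash <- function(x) {
  h <- 0
  for (code in utf8ToInt(x)) {
    h <- (h * 31 + code) %% 2147483647
  }
  h
}

# Seed derived from the scenario seed and the task's own parameters.
task_seed <- function(scenario_seed, dataset, n, sim) {
  key <- paste(scenario_seed, dataset, n, sim, sep = "|")
  as.integer(toy_hash(key))
}

# Seed immediately before drawing, so the result depends only on the task.
draw_for_task <- function(scenario_seed, dataset, n, sim) {
  set.seed(task_seed(scenario_seed, dataset, n, sim))
  rnorm(n)
}

a_first <- draw_for_task(42L, "data_a", 6L, 1L)
b_other <- draw_for_task(42L, "data_b", 6L, 1L)
a_again <- draw_for_task(42L, "data_a", 6L, 1L)

# Same task, same draws, regardless of what ran in between.
identical(a_first, a_again)
```

Because the seed is a function of the task's parameters rather than of execution order, tasks can be distributed across workers or re-run individually and still reproduce the same values.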

Package

ssdsims ssdsims-package
ssdsims: Simulation Analyses for Species Sensitivity Distributions