
Classifies each water temperature value as reasonable, questionable, or erroneous and records the result in the status_id column.

Usage

classify_water_temp_data(
  data,
  questionable_min = 0,
  questionable_max = 30,
  erroneous_min = -0.5,
  erroneous_max = 40,
  questionable_rate = 2,
  erroneous_rate = 5,
  questionable_buffer = 1,
  erroneous_buffer = 1,
  gap_range = 5
)

Arguments

data

A data frame with temperature_date_time and water_temperature columns.

questionable_min

A numeric value indicating the lower bound of the questionable range of temperature values.

questionable_max

A numeric value indicating the upper bound of the questionable range of temperature values.

erroneous_min

A numeric value indicating the lower bound of the erroneous range of temperature values.

erroneous_max

A numeric value indicating the upper bound of the erroneous range of temperature values.

questionable_rate

A numeric value indicating the rate of change (temperature per hour) of temperature values that is considered questionable.

erroneous_rate

A numeric value indicating the rate of change (temperature per hour) of temperature values that is considered erroneous.

questionable_buffer

A numeric value indicating a time buffer for questionable values.

erroneous_buffer

A numeric value indicating a time buffer for erroneous values.

gap_range

A numeric value indicating the maximum gap, in hours, between two non-reasonable values within which the intervening reasonable values are coded as questionable or erroneous.

Value

A data frame

Details

The function only works on a single deployment of a logger. The table output will be sorted by temperature_date_time.

The function will error if the data contains columns with any of the following names, as they are used internally: "status_id", ".lag_temp", ".diff_temp", ".lag_time", ".diff_time", ".rate_temp_per_time", ".lag_id", ".lead_id", ".id_row", ".quest_higher_next_id", ".quest_lower_next_id", ".error_higher_next_id", ".error_lower_next_id", ".quest_higher_next_time", ".quest_lower_next_time", ".error_higher_next_time", ".error_lower_next_time", ".quest_higher_time_diff_h", ".quest_lower_time_diff_h", ".error_higher_time_diff_h", ".error_lower_time_diff_h", ".gap_fill_higher_time", ".gap_fill_higher_type", ".gap_fill_lower_time", ".gap_fill_lower_type", ".gap_diff_time_h".

The function will error if any temperature_date_time values are missing. Missing values in water_temperature are ignored during classification, as if those rows were not present, but the rows remain in the output. To drop them, apply tidyr::drop_na() to the output.
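As a rough illustration of dropping rows with a missing water_temperature, the sketch below uses a small hand-built data frame named output as a stand-in for the function's result (the name and its values are assumptions made for this example, and the status_id column is omitted).

```r
library(tidyr)

# Stand-in for the classified output; the values are illustrative only.
output <- data.frame(
  temperature_date_time = as.POSIXct(
    c("2021-05-07 08:00:00", "2021-05-07 09:00:00", "2021-05-07 10:00:00"),
    tz = "UTC"
  ),
  water_temperature = c(4.124, NA, 4.102)
)

# Drop only the rows where water_temperature is missing.
cleaned <- drop_na(output, water_temperature)
```

Naming the column in drop_na() restricts the check to water_temperature, so rows with missing values in other columns are kept.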

The data is processed by:

  1. Classifying the temperature values based on their values (questionable_min, questionable_max, erroneous_min, erroneous_max).
  2. Calculating the rate of change between adjacent values and classifying values based on the rate parameters (questionable_rate, erroneous_rate).
  3. Coding values adjacent to questionable/erroneous values as questionable/erroneous.
  4. Applying a buffer so that any value within the buffer is classified as questionable/erroneous based on the buffer parameters (questionable_buffer, erroneous_buffer).
  5. Coding reasonable values that fall between two questionable/erroneous values as questionable/erroneous based on the gap hour difference allowed (gap_range).
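The first two steps can be sketched in base R as follows. This is a minimal illustration and not the package's implementation: sketch_classify() is a hypothetical helper, the strict inequalities and the choice to flag the later value of each adjacent pair are assumptions, and steps 3 to 5 (adjacent values, buffers, gap filling) are left out.

```r
# Hypothetical helper, not part of the package: sketches steps 1 and 2 only.
sketch_classify <- function(data,
                            questionable_min = 0, questionable_max = 30,
                            erroneous_min = -0.5, erroneous_max = 40,
                            questionable_rate = 2, erroneous_rate = 5) {
  # Sort by time so that adjacent rows are adjacent readings.
  data <- data[order(data$temperature_date_time), , drop = FALSE]
  temp <- data$water_temperature
  time <- data$temperature_date_time

  # Step 1: classify on value alone (strict inequalities are an assumption).
  status <- rep("reasonable", nrow(data))
  status[which(temp < questionable_min | temp > questionable_max)] <- "questionable"
  status[which(temp < erroneous_min | temp > erroneous_max)] <- "erroneous"

  # Step 2: absolute rate of change from the previous reading, per hour.
  # Flagging the later value of each pair is an assumption.
  hours <- as.numeric(difftime(time[-1], time[-length(time)], units = "hours"))
  rate <- c(NA, abs(diff(temp)) / hours)
  status[which(rate > questionable_rate & status == "reasonable")] <- "questionable"
  status[which(rate > erroneous_rate)] <- "erroneous"

  data$status_id <- status
  data
}

example_data <- data.frame(
  temperature_date_time = as.POSIXct(
    c("2021-05-07 11:00:00", "2021-05-07 12:00:00", "2021-05-07 13:00:00"),
    tz = "UTC"
  ),
  water_temperature = c(4.189, 4.243, 6.578)
)

sketched <- sketch_classify(example_data)
```

In this sketch the last reading rises by 2.335 per hour, which exceeds the default questionable_rate of 2 but not the erroneous_rate of 5, so it is flagged as questionable. The real function goes on to apply steps 3 to 5, so its output for the same data may flag additional values.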

Examples

data <- data.frame(
  temperature_date_time =
    as.POSIXct(c(
      "2021-05-07 08:00:00", "2021-05-07 09:00:00",
      "2021-05-07 10:00:00", "2021-05-07 11:00:00", "2021-05-07 12:00:00",
      "2021-05-07 13:00:00"
    )),
  water_temperature = c(4.124, 4.078, 4.102, 4.189, 4.243, 6.578)
)

classified_data <- classify_water_temp_data(data)