Water temperature data are classified as reasonable, questionable, or erroneous in the status_id column.
Usage
classify_water_temp_data(
data,
questionable_min = 0,
questionable_max = 30,
erroneous_min = -0.5,
erroneous_max = 40,
questionable_rate = 2,
erroneous_rate = 5,
questionable_buffer = 1,
erroneous_buffer = 1,
gap_range = 5
)
Arguments
- data
A data frame.
- questionable_min
A numeric value indicating the lower bound of the questionable range of temperature values.
- questionable_max
A numeric value indicating the upper bound of the questionable range of temperature values.
- erroneous_min
A numeric value indicating the lower bound of the erroneous range of temperature values.
- erroneous_max
A numeric value indicating the upper bound of the erroneous range of temperature values.
- questionable_rate
A numeric value indicating the rate of change (temperature per hour) of temperature values that is considered questionable.
- erroneous_rate
A numeric value indicating the rate of change (temperature per hour) of temperature values that is considered erroneous.
- questionable_buffer
A numeric value indicating a time buffer for questionable values.
- erroneous_buffer
A numeric value indicating a time buffer for erroneous values.
- gap_range
A numeric value indicating the range of hours between two non-reasonable values within which intervening values will be coded as questionable or erroneous.
Details
The function only works on a single deployment of a logger. The table output will be sorted by temperature_date_time.
The function will error if you have columns with the following names as they are used internally: status_id, ".lag_temp", ".diff_temp", ".lag_time", ".diff_time", ".rate_temp_per_time", ".lag_id", ".lead_id", ".id_row", ".quest_higher_next_id", ".quest_lower_next_id", ".error_higher_next_id", ".error_lower_next_id", ".quest_higher_next_time", ".quest_lower_next_time", ".error_higher_next_time", ".error_lower_next_time", ".quest_higher_time_diff_h", ".quest_lower_time_diff_h", ".error_higher_time_diff_h", ".error_lower_time_diff_h", ".gap_fill_higher_time", ".gap_fill_higher_type", ".gap_fill_lower_time", ".gap_fill_lower_type", ".gap_diff_time_h"
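Because a clash with any of these names stops the function, it can help to check a data frame beforehand. The following is a minimal sketch in base R, not part of the package: find_reserved_columns() is a hypothetical helper, the "original_" prefix is an arbitrary choice, and the list of names is copied from the paragraph above.

```r
# Names the documentation lists as reserved for internal use.
reserved <- c(
  "status_id", ".lag_temp", ".diff_temp", ".lag_time", ".diff_time",
  ".rate_temp_per_time", ".lag_id", ".lead_id", ".id_row",
  ".quest_higher_next_id", ".quest_lower_next_id",
  ".error_higher_next_id", ".error_lower_next_id",
  ".quest_higher_next_time", ".quest_lower_next_time",
  ".error_higher_next_time", ".error_lower_next_time",
  ".quest_higher_time_diff_h", ".quest_lower_time_diff_h",
  ".error_higher_time_diff_h", ".error_lower_time_diff_h",
  ".gap_fill_higher_time", ".gap_fill_higher_type",
  ".gap_fill_lower_time", ".gap_fill_lower_type", ".gap_diff_time_h"
)

# Hypothetical helper: returns the reserved names present in a data frame.
find_reserved_columns <- function(data) {
  intersect(names(data), reserved)
}

# Example input that already carries a status_id column.
data <- data.frame(
  temperature_date_time = as.POSIXct("2021-05-07 08:00:00", tz = "UTC"),
  water_temperature = 4.124,
  status_id = "unchecked"
)

clashes <- find_reserved_columns(data)

# Rename any clashing columns so the original values are kept under a new name.
idx <- names(data) %in% reserved
names(data)[idx] <- paste0("original_", names(data)[idx])
```

Renaming rather than dropping keeps the original values available for comparison after classification.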
The function will error if there are missing temperature_date_time values.
Missing values in water_temperature are ignored and treated as if
they are not present. If you want to drop these values, you can do so on
the output using tidyr::drop_na().
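As a rough illustration of the two cases, the sketch below checks for missing timestamps and drops rows with a missing water_temperature. It assumes the tidyr package is installed and uses a small made-up data frame. The same drop_na() call could be applied to the function's output instead.

```r
# Made-up input with one missing temperature reading.
data <- data.frame(
  temperature_date_time = as.POSIXct(
    c("2021-05-07 08:00:00", "2021-05-07 09:00:00", "2021-05-07 10:00:00"),
    tz = "UTC"
  ),
  water_temperature = c(4.124, NA, 4.102)
)

# Missing timestamps cause an error, so check for them first.
has_missing_time <- anyNA(data$temperature_date_time)

# Drop only the rows where water_temperature is missing;
# naming the column leaves NAs in other columns untouched.
complete <- tidyr::drop_na(data, water_temperature)
```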
The data is processed by:
1. Classifying the temperature values based on their values (questionable_min, questionable_max, erroneous_min, erroneous_max).
2. Calculating the rate of change between adjacent values and classifying values based on the rate parameters (questionable_rate, erroneous_rate).
3. Coding values adjacent to questionable/erroneous values as questionable/erroneous.
4. Applying a buffer so that any value within the buffer is classified as questionable/erroneous based on the buffer parameters (questionable_buffer, erroneous_buffer).
5. Coding reasonable values identified between two questionable/erroneous values as questionable/erroneous based on the gap hour difference allowed (gap_range).
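The first two steps can be sketched in base R as follows. This is an illustration only, not the package's implementation. It makes several assumptions: sketch_classify() and the sketch_status column are hypothetical names, the thresholds are strict inequalities, the rate is the absolute change per hour, and the data contain no missing values. The adjacency, buffer, and gap steps are omitted.

```r
# Hypothetical helper covering steps 1 and 2 only (assumptions noted above).
sketch_classify <- function(data,
                            questionable_min = 0, questionable_max = 30,
                            erroneous_min = -0.5, erroneous_max = 40,
                            questionable_rate = 2, erroneous_rate = 5) {
  # Sort by time so that adjacent rows are adjacent readings.
  data <- data[order(data$temperature_date_time), , drop = FALSE]
  temp <- data$water_temperature
  time <- data$temperature_date_time

  # Step 1: classify by value; erroneous overrides questionable.
  status <- rep("reasonable", nrow(data))
  status[temp < questionable_min | temp > questionable_max] <- "questionable"
  status[temp < erroneous_min | temp > erroneous_max] <- "erroneous"

  # Step 2: absolute rate of change per hour relative to the previous reading.
  diff_hours <- c(NA, as.numeric(diff(time), units = "hours"))
  rate <- abs(c(NA, diff(temp))) / diff_hours
  over_quest <- !is.na(rate) & rate > questionable_rate
  over_error <- !is.na(rate) & rate > erroneous_rate
  status[over_quest & status == "reasonable"] <- "questionable"
  status[over_error] <- "erroneous"

  data$sketch_status <- status
  data
}

data <- data.frame(
  temperature_date_time = as.POSIXct(c(
    "2021-05-07 08:00:00", "2021-05-07 09:00:00", "2021-05-07 10:00:00",
    "2021-05-07 11:00:00", "2021-05-07 12:00:00", "2021-05-07 13:00:00"
  ), tz = "UTC"),
  water_temperature = c(4.124, 4.078, 4.102, 4.189, 4.243, 6.578)
)

out <- sketch_classify(data)
```

In this sketch the final reading rises by 2.335 degrees in one hour, which exceeds the default questionable_rate of 2, so it alone is marked questionable. The real function also applies steps 3 to 5, so its output may differ.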
Examples
data <- data.frame(
temperature_date_time =
as.POSIXct(c(
"2021-05-07 08:00:00", "2021-05-07 09:00:00",
"2021-05-07 10:00:00", "2021-05-07 11:00:00", "2021-05-07 12:00:00",
"2021-05-07 13:00:00"
)),
water_temperature = c(4.124, 4.078, 4.102, 4.189, 4.243, 6.578)
)
classified_data <- classify_water_temp_data(data)