Transforms data according to the modified coefficient of variation (CV) rule. This is used to add variance to datasets with unexpectedly low variance, which is sometimes encountered when new materials are tested over short periods of time.
Two versions of this transformation are implemented. The first version, transform_mod_cv(), transforms the data in a single group (with no other structure) according to the modified CV rules. The second version, transform_mod_cv_ad(), transforms data that is structured according to both condition and batch, as is commonly done for the Anderson-Darling k-Sample and Anderson-Darling tests when pooling across environments.
transform_mod_cv_ad(x, condition, batch)
transform_mod_cv(x)
A vector of transformed data
The modified CV transformation takes the general form:
$$\frac{S_i^*}{S_i} (x_{ij} - \bar{x_i}) + \bar{x_i}$$
where \(S_i^*\) is the modified standard deviation (mod CV times mean) for the \(i\)th group, \(S_i\) is the standard deviation for the \(i\)th group, \(\bar{x_i}\) is the group mean, and \(x_{ij}\) is the observation.
transform_mod_cv() takes a vector containing the observations and transforms the data. The equation above is used, and all observations are considered to be from the same group.
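As a rough illustration of the single-group case, the sketch below applies the general form in base R. The helper names mod_cv_rule() and sketch_transform_mod_cv() are hypothetical, and the rule inside mod_cv_rule() is the modified CV rule as it is commonly stated in CMH-17-1G, which is an assumption here because the text above does not spell it out. This is not the package's implementation, and the input values are made up.

```r
# Hypothetical helper: modified CV rule as commonly stated in CMH-17-1G
# (an assumption; the rule is not spelled out in the text above).
mod_cv_rule <- function(cv) {
  if (cv < 0.04) {
    0.06
  } else if (cv < 0.08) {
    cv / 2 + 0.04
  } else {
    cv
  }
}

# Hypothetical re-implementation of the general form for a single group
sketch_transform_mod_cv <- function(x) {
  x_bar <- mean(x)                            # group mean
  s <- sd(x)                                  # group standard deviation
  s_star <- mod_cv_rule(s / x_bar) * x_bar    # modified standard deviation
  s_star / s * (x - x_bar) + x_bar
}

x <- c(126, 139, 116, 132, 129, 130)  # made-up observations
x_trans <- sketch_transform_mod_cv(x)

mean(x_trans) - mean(x)        # the mean is unchanged, up to rounding
sd(x_trans) / mean(x_trans)    # matches mod_cv_rule(sd(x) / mean(x))
```

Because the deviations from the mean are only rescaled, the group mean is preserved and the CV of the transformed data equals the modified CV.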
transform_mod_cv_ad() takes a vector containing the observations plus a vector containing the corresponding conditions and a vector containing the batches. This function first calculates the modified CV value from the data from each condition (independently). Then, within each condition, the transformation above is applied to produce the transformed data \(x'\). This transformed data is further transformed using the following equation:
$$x_{ij}'' = C (x'_{ij} - \bar{x_i}) + \bar{x_i}$$
Where:
$$C = \sqrt{\frac{SSE^*}{SSE'}}$$
$$SSE^* = (n-1) (CV^* \bar{x})^2 - \sum(n_i(\bar{x_i}-\bar{x})^2)$$
$$SSE' = \sum(x'_{ij} - \bar{x_i})^2$$
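The second step can be sketched for a single condition as follows. This is a minimal sketch under stated assumptions, not the package's code: it assumes that the index \(i\) refers to batches within one condition, that \(\bar{x}\) and \(n\) are the mean and number of observations of that condition, and that \(CV^*\) is the modified CV of the condition computed with the rule commonly stated in CMH-17-1G. The names mod_cv_rule() and sketch_mod_cv_one_condition() are hypothetical, and the data are made up.

```r
# Hypothetical helper: modified CV rule as commonly stated in CMH-17-1G
# (an assumption; the rule is not spelled out in the text above).
mod_cv_rule <- function(cv) {
  if (cv < 0.04) 0.06 else if (cv < 0.08) cv / 2 + 0.04 else cv
}

# Hypothetical sketch of both steps for the data of one condition
sketch_mod_cv_one_condition <- function(x, batch) {
  n <- length(x)
  x_bar <- mean(x)                         # condition mean
  cv_star <- mod_cv_rule(sd(x) / x_bar)    # modified CV of the condition
  batch_mean <- ave(x, batch)              # batch mean for each observation

  # First step: general form applied within each batch (assumed grouping)
  x_prime <- ave(x, batch, FUN = function(v) {
    s_star <- mod_cv_rule(sd(v) / mean(v)) * mean(v)
    s_star / sd(v) * (v - mean(v)) + mean(v)
  })

  # Second step: rescale the within-batch deviations by C
  sse_star <- (n - 1) * (cv_star * x_bar)^2 - sum((batch_mean - x_bar)^2)
  sse_prime <- sum((x_prime - batch_mean)^2)
  scale_c <- sqrt(sse_star / sse_prime)
  scale_c * (x_prime - batch_mean) + batch_mean
}

x <- c(126, 139, 116, 132, 129, 130, 131, 124, 125, 120)  # made-up values
batch <- rep(c(1, 2), each = 5)
x_pp <- sketch_mod_cv_one_condition(x, batch)

sd(x_pp) / mean(x_pp)   # matches mod_cv_rule(sd(x) / mean(x))
```

Under these assumptions, the batch means are left unchanged, and the within-batch scatter is scaled so that the CV of the whole condition equals its modified CV.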
# Transform data according to the modified CV transformation
# and show the first rows of the original and transformed strength
library(dplyr)
library(cmstatr)
carbon.fabric %>%
  filter(test == "FT") %>%
  group_by(condition) %>%
  mutate(trans_strength = transform_mod_cv(strength)) %>%
  head(10)
#> # A tibble: 10 × 6
#> # Groups: condition [1]
#> id test condition batch strength trans_strength
#> <chr> <chr> <chr> <int> <dbl> <dbl>
#> 1 FT-RTD-1-1 FT RTD 1 126. 126.
#> 2 FT-RTD-1-2 FT RTD 1 139. 141.
#> 3 FT-RTD-1-3 FT RTD 1 116. 115.
#> 4 FT-RTD-1-4 FT RTD 1 132. 133.
#> 5 FT-RTD-1-5 FT RTD 1 129. 129.
#> 6 FT-RTD-1-6 FT RTD 1 130. 130.
#> 7 FT-RTD-2-1 FT RTD 2 131. 131.
#> 8 FT-RTD-2-2 FT RTD 2 124. 124.
#> 9 FT-RTD-2-3 FT RTD 2 125. 125.
#> 10 FT-RTD-2-4 FT RTD 2 120. 119.
# The CV of this transformed data can be computed to verify
# that the resulting CV follows the rules for modified CV
carbon.fabric %>%
  filter(test == "FT") %>%
  group_by(condition) %>%
  mutate(trans_strength = transform_mod_cv(strength)) %>%
  summarize(cv = sd(strength) / mean(strength),
            mod_cv = sd(trans_strength) / mean(trans_strength))
#> # A tibble: 3 × 3
#> condition cv mod_cv
#> <chr> <dbl> <dbl>
#> 1 CTD 0.0423 0.0612
#> 2 ETW 0.0369 0.06
#> 3 RTD 0.0621 0.0711
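The examples above only use transform_mod_cv(). A similar call for transform_mod_cv_ad() might look like the sketch below, which follows the usage shown earlier, transform_mod_cv_ad(x, condition, batch), and passes the condition and batch columns directly instead of grouping first. It assumes the cmstatr and dplyr packages are installed; no output is shown because it has not been reproduced here.

```r
library(dplyr)
library(cmstatr)

# Transform strength using both condition and batch structure,
# then compare the original and transformed CV for each condition
res_ad <- carbon.fabric %>%
  filter(test == "FT") %>%
  mutate(trans_strength = transform_mod_cv_ad(strength, condition, batch)) %>%
  group_by(condition) %>%
  summarize(cv = sd(strength) / mean(strength),
            mod_cv = sd(trans_strength) / mean(trans_strength))

res_ad
```

The transformed column can then be passed to a batch-level test within each condition, in place of the original strength values.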