This tutorial demonstrates the quadratic_plateau()
function for fitting a continuous response model and estimating a
critical soil test value (CSTV). This function fits a segmented
regression model that follows two phases: a positive curvilinear
response followed by a flat plateau phase. The join point is often
interpreted as the CSTV. See Bullock and Bullock (1994) for
example.$$
\begin{cases}
x < j,\ y = a + bx + cx^2 \\
x > j,\ y = a + bj + cj^2
\end{cases}
$$
where
y
represents the fitted crop relative yield
x
the soil test value
a
the intercept (ry
when stv
=
0)
b
the linear slope (as the change in ry per unit of soil
nutrient supply or nutrient added)
c
the quadratic coefficient (giving the curve shape)
j
the join point when the plateau phase starts (i.e., the
CSTV).
This model is slightly more complex than the linear-plateau, but the
curvature of the response is argued to be more biologically reasonably
and economical useful. The quadratic_plateau()
function
works automatically with self-starting initial values to facilitate the
model convergence.
Disadvantages are that:
boot_quadratic_plateau()
function for a reliable confidence
interval estimation of parameters via bootstrapping (resampling with
replacement).Load your dataframe with soil test value and relative yield data.
Specify the following arguments into the function
quadratic_plateau()
:
data
(optional)
stv
(soil test value)
ry
(relative yield) columns or vectors
target
(optional) for calculating the soil test
value at some RY level along the slope segment.
tidy
TRUE
(produces a data.frame with
results) or FALSE
(store results as list),
plot
TRUE
(produces a ggplot as main
output) or FALSE
(no plot, only results as
data.frame),
resid
TRUE
(produces plots with
residuals analysis) or FALSE
(no plot)
Run and check results.
Check residuals plot, and warnings related to potential limitations of this model.
Adjust curve plots as desired with additional
ggplot2
functions.
Suggested packages
# Install if needed
library(ggplot2) # Plots
library(dplyr) # Data wrangling
library(tidyr) # Data wrangling
# library(utils) # Data wrangling
# library(data.table) # Mapping
library(purrr) # Mapping
This is a basic example using three different datasets:
tidy
= TRUE (default)It returns a tidy data frame (more organized results)
quadratic_plateau(corr_df, STV, RY, tidy = TRUE)
#> # A tibble: 1 × 17
#> intercept slope quadratic equation plateau CSTV lowerCL upperCL CI_type
#> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 44.1 2.86 -0.0392 44.1 + 2.86x … 96.4 36.5 29.7 43.4 Wald C…
#> # ℹ 8 more variables: target <dbl>, STVt <dbl>, AIC <dbl>, AICc <dbl>,
#> # BIC <dbl>, R2 <dbl>, RMSE <dbl>, pvalue <dbl>
tidy
= FALSEIt returns a LIST (may be more efficient for multiple fits at once)
quadratic_plateau(corr_df, STV, RY, tidy = FALSE)
#> $intercept
#> [1] 44.14905
#>
#> $slope
#> [1] 2.862649
#>
#> $quadratic
#> [1] -0.03919089
#>
#> $equation
#> [1] "44.1 + 2.86x + -0.04x^2 when x < 36.5"
#>
#> $plateau
#> [1] 96.42367
#>
#> $CSTV
#> [1] 36.52186
#>
#> $lowerCL
#> [1] 29.7
#>
#> $upperCL
#> [1] 43.4
#>
#> $CI_type
#> [1] "Wald Conf. Interval"
#>
#> $target
#> [1] 96.4
#>
#> $STVt
#> [1] 36.5
#>
#> $AIC
#> [1] 1023.43
#>
#> $AICc
#> [1] 1023.73
#>
#> $BIC
#> [1] 1035.11
#>
#> $R2
#> [1] 0.53
#>
#> $RMSE
#> [1] 9.85
#>
#> $pvalue
#> [1] 0
# Example 1. Fake dataset manually created
data_1 <- data.frame("RY" = c(65,80,85,88,90,94,93,96,97,95,98,100,99,99,100),
"STV" = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15))
# Example 2. Native fake dataset from soiltestcorr package
data_2 <- soiltestcorr::data_test
# Example 3. Native dataset from soiltestcorr package, Freitas et al. (1966), used by Cate & Nelson (1971)
data_3 <- soiltestcorr::freitas1966 %>%
rename(STV = STK)
data.all <- bind_rows(data_1, data_2, data_3, .id = "id")
Note: the stv
column needs to have the same name for all
datasets if binding rows.
map()
# Run multiple examples at once with purrr::map()
data.all %>%
nest(data = c("STV", "RY")) %>%
mutate(model = map(data, ~ quadratic_plateau(stv = .$STV, ry = .$RY))) %>%
unnest(model)
#> # A tibble: 3 × 19
#> id data intercept slope quadratic equation plateau CSTV lowerCL
#> <chr> <list> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 1 <tibble> 61.0 8.59 -0.502 61 + 8.59x + -… 97.7 8.56 6.6
#> 2 2 <tibble> 44.1 2.86 -0.0392 44.1 + 2.86x +… 96.4 36.5 29.7
#> 3 3 <tibble> 12.9 1.91 -0.0111 12.9 + 1.91x +… 95.3 86.2 45.5
#> # ℹ 10 more variables: upperCL <dbl>, CI_type <chr>, target <dbl>, STVt <dbl>,
#> # AIC <dbl>, AICc <dbl>, BIC <dbl>, R2 <dbl>, RMSE <dbl>, pvalue <dbl>
group_modify()
Alternatively, with group_modify
, nested data is not
required. However, it still requires a grouping variable (in this case,
id
) to identify each dataset. group_map()
may
also be used, though list_rbind()
is required to return a
tidy data frame of the model results instead of a list.
data.all %>%
group_by(id) %>%
group_modify(~ quadratic_plateau(data = ., STV, RY))
#> # A tibble: 3 × 18
#> # Groups: id [3]
#> id intercept slope quadratic equation plateau CSTV lowerCL upperCL CI_type
#> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 1 61.0 8.59 -0.502 61 + 8.… 97.7 8.56 6.6 10.5 Wald C…
#> 2 2 44.1 2.86 -0.0392 44.1 + … 96.4 36.5 29.7 43.4 Wald C…
#> 3 3 12.9 1.91 -0.0111 12.9 + … 95.3 86.2 45.5 127. Wald C…
#> # ℹ 8 more variables: target <dbl>, STVt <dbl>, AIC <dbl>, AICc <dbl>,
#> # BIC <dbl>, R2 <dbl>, RMSE <dbl>, pvalue <dbl>
Bootstrapping is a suitable method for obtaining confidence intervals for parameters or derived quantities. Bootstrapping is a resampling technique (with replacement) that draws samples from the original data with the same size. If you have groups within your data, you can specify grouping variables as arguments in order to maintain, within each resample, the same proportion of observations than in the original dataset.
This function returns a table with as many rows as the resampling size (n) containing the results for each resample.
boot_qp <- boot_quadratic_plateau(corr_df, STV, RY, n = 500) # only 500 for sake of speed
boot_qp %>% head(n = 5)
#> # A tibble: 5 × 14
#> boot_id intercept slope quadratic plateau CSTV target STVt AIC AICc BIC
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 30.8 4.09 -0.0645 95.6 31.7 95.6 31.7 1063. 1064. 1075.
#> 2 2 32.3 3.97 -0.0614 96.5 32.3 96.5 32.3 1024. 1024. 1035.
#> 3 3 56.1 1.85 -0.0211 96.7 43.9 96.7 43.9 992. 992. 1003.
#> 4 4 44.6 2.86 -0.0397 96.3 36.1 96.3 36.1 1004. 1004. 1015.
#> 5 5 28.5 4.48 -0.0760 94.5 29.5 94.5 29.5 1091. 1091. 1103.
#> # ℹ 3 more variables: R2 <dbl>, RMSE <dbl>, pvalue <dbl>
# CSTV Confidence Interval
quantile(boot_qp$CSTV, probs = c(0.025, 0.5, 0.975))
#> 2.5% 50% 97.5%
#> 28.54063 37.30023 50.58490
# Plot
boot_qp %>%
ggplot2::ggplot(aes(x = CSTV))+
geom_histogram(color = "grey25", fill = "#9de0bf", bins = 10)
We can generate a ggplot with the same
quadratic_plateau()
function.
We just need to specify the argument plot = TRUE
.
As ggplot object, plots can be adjusted in several ways, such as modifying titles and axis scales.
Set the argument resid = TRUE
.
#> # A tibble: 1 × 17
#> intercept slope quadratic equation plateau CSTV lowerCL upperCL CI_type
#> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 12.9 1.91 -0.0111 12.9 + 1.91x … 95.3 86.2 45.5 127. Wald C…
#> # ℹ 8 more variables: target <dbl>, STVt <dbl>, AIC <dbl>, AICc <dbl>,
#> # BIC <dbl>, R2 <dbl>, RMSE <dbl>, pvalue <dbl>
Bullock, D.G. and Bullock, D.S. (1994), Quadratic and Quadratic-Plus-Plateau Models for Predicting Optimal Nitrogen Rate of Corn: A Comparison. Agron. J., 86: 191-195. 10.2134/agronj1994.00021962008600010033x