The standardized precipitation index (SPI) (Mckee and Kleist (1993)) is a probability-based drought indicator that categorizes dry and wet events as a function of their probability of occurrence or expected frequencies. This widely used drought index was initially developed as a stationary method in which the parameters of the parametric distribution used to estimate the cumulative probability of its unique input variable (rainfall), do not vary over time (Blain et al. (2022)). However, there have been observed changes in drought frequency and intensity worldwide (Strzepek et al. (2010), Dai (2013), Spinoni et al. (2019), Stagge and Sung (2022), and Blain et al. (2022)), which violate the assumption of stationarity (Coles (2001), Zhang, Zwiers, and Li (2004), Cheng et al. (2014)).
In the context of the current climate changes, studies such as (Russo et al. (2013), Li et al. (2015), and Rashid and Beecham (2019)) proposed the nonstationary standardized precipitation index (NSPI), which incorporates several indices (e.g. time) as external covariates. Therefore, the NSPI was designed to capture nonstationary characteristics of rainfall and, thereby, to capture drought signals under transient rainfall distribution patterns (Blain et al. (2022)). However, the use of the NSPI is not a straightforward task (Blain et al. (2022)). Because time-varying parametric distributions account for temporal changes in rainfall distributions, they remove the effect of rainfall trends on the NSPI estimates (Shiau (2020)). For instance, by applying the NSPI to rainfall series presenting significant increasing trends, (Shiau (2020)) noted that the NSPI indicated an increase in the frequency and severity of drought events. Similar results were observed by (Park et al. (2018)) that found that, in the presence of decreasing rainfall trends, the use of time-varying distribution leads to an increase in the cumulative probability of a particular rainfall event over time, also increasing the corresponding NSPI estimate. Although this contradictory behavior (Shiau (2020)) may be algebraically explained by the NSPI calculation algorithm (Blain et al. (2022)), it potentially prevents widespread acceptance of nonstationary distributions for calculating standardized drought indices such as the SPI. This statement particularly holds for the operational use of these indices.
In spite of the above-mentioned limitation, the nonstationary approach adopted by the NSPI algorithm is capable of modelling how the probabilities of rainfall events have changed over a time series. In addition, the use of nonstationary distributions enables isolate the effect of trends on the central tendency and on the dispersion of the rainfall frequency distributions. In this context, the {SPIChanges} package demonstrates that the challenge of interpreting SPI estimates under changing climatic conditions can be overcome by using information generated by the NSPI calculation algorithm. This package uses nonstationary distributions to detect trends in rainfall patterns and quantify their effect on the occurrence of any SPI estimate. In other words, the {SPIChanges} package leverages the nonstationary approach of the NSPI algorithm to detect trends in rainfall series and enhance the understanding of how changes in rainfall patterns affect the expected frequency of drought occurrence. See Theoretical Background for further information on the {SPIChanges} package.
Load the library in your R session.
As described in the introduction, the SPI requires only rainfall data
as its input variable. The {SPIChanges} package includes daily rainfall
data (CampinasRain
) to provide use case for its functions.
The daily rainfall amounts were measured in millimetres, and recorded in
Campinas state of Sao Paulo, Brazil (1980-2023). This data series were
obtained from the CPC Global Unified Gauge-Based Analysis of Daily
Precipitation, provided by the NOAA/OAR/ESRL PSL, Boulder, CO https://psl.noaa.gov/. The
data is included in the package as a dataframe with 2 columns and 16071
rows, where the first and second columns are the Date and rainfall
amounts, respectively.
TSaggreg()
Aggregates or accumulates daily rainfall totals at quasiWeek time scales.
As a multi-scalar index, the SPI accumulates rainfall totals (often
recorded at the daily scale) at any time scale selected by the users.
The {SPIChanges} package employs a basic time scale (quasiWeek) that
splits each month into four sub-periods. Therefore, a quasiWeek time
scale (TS) equal to 4 corresponds to a moving window with a 1-month
length. TS equal to 48 corresponds to a moving window with a 1-year
length. TS equals to 1 corresponds 1-quasiWeek time scale.
At this point, it is worth emphasizing that the use of calendar weeks as
a basic time scales may pose challenges since the SPI is a relative
metric that require homogeneous periods (Vicente-Serrano et al. (2022)). The first day of
each year can fall on different weekdays, causing inconsistency
throughout the year. Leap years may also complicate this comparison in
routine drought monitoring (Vicente-Serrano et
al. (2022)). Additionally, the 4-quasiWeek time scale aligns with
the widely used 1-month time scale.
## Usage
Rainfall amounts aggregated at the time scale selected by the user.
Aggregating daily rainfall values at 4-quasiWeek time scale, which corresponds to monthly data.
library(SPIChanges)
daily.rain <- CampinasRain[, 2]
RainTS4 <- TSaggreg(daily.rain = daily.rain, start.date = "1980-01-01", TS = 4)
#> Done. Just ensure the last quasi-week is complete.
#> The last day of your series is 31 and TS is 4
head(RainTS4)
#> Year Month quasiWeek rain.at.TS4
#> [1,] 1980 1 4 223.1143
#> [2,] 1980 2 1 217.4197
#> [3,] 1980 2 2 207.0196
#> [4,] 1980 2 3 203.8757
#> [5,] 1980 2 4 183.2537
#> [6,] 1980 3 1 177.9945
SPIChanges()
Detect trends and quantify their effect on the probability of SPI values occurring.
The SPIChanges()
is the central function of the package
is based on the following steps, which adapted from the study of (Blain et al. (2022)):
Given a rainfall series with data accumulated at a particular quasiWeek time scale, the function uses the stationary version of the two-parameter gamma distribution to calculate SPI estimates and the cumulative probability of each rainfall amount occurring (i.e., the probability of each SPI estimate occurring).
Using the second-order Akaike Information Criterion (AICc), the package selects the best-fitting model among distinct nonstationary two-parameter gamma distributions (see details).
The package uses the selected nonstationary model to calculate the cumulative probability of each rainfall amount occurring under changing climate conditions.
The package compares the probabilities estimated in steps 1 and 3 to indicate whether the frequency of drought or wet events, quantified by the SPI, has increased or decreased over time.
The gamma distributions are fitted to rainfall data using Generalized Additive Models (GAMLSS) with time as a covariate. This fitting process is based on the maximum likelihood method (McCullagh (2018)) and considers the following increasingly complex functions (candidate models). Further information on GAMLSS can be found in (Rigby and Stasinopoulos (2005)).
Model 1 (stationary): The mean (µ) and dispersion (δ) of the distribution are constant over time.
Model 2 (homoscedastic): Only µ is allowed to vary over time linearly.
Model 3: Only δ is allowed to vary over time linearly.
Model 4: Both µ and δ are allowed to vary over time linearly.
Model 5 (homoscedastic): Only µ is allowed to vary over time non-linearly with a quadratic polynomial function.
Model 6: Only δ is allowed to vary over time non-linearly with a quadratic polynomial function.
Model 7: µ is allowed to vary over time non-linearly with a quadratic polynomial function; δ is allowed to vary over time linearly.
Model 8: µ is allowed to vary over time linearly; δ is allowed to vary over time non-linearly with a quadratic polynomial function.
Model 9: Both µ and δ are allowed to vary over time non-linearly with a quadratic polynomial function.
Model 10: Only µ is allowed to vary over time non-linearly with a cubic polynomial function.
Model 11: Only δ is allowed to vary over time non-linearly with a cubic polynomial function.
Model 12: µ is allowed to vary over time non-linearly with a cubic polynomial function; δ is allowed to vary over time linearly.
Model 13: µ is allowed to vary over time linearly; δ is allowed to vary over time non-linearly with a cubic polynomial function.
Model 14: µ is allowed to vary over time non-linearly with a cubic polynomial function; δ is allowed to vary over time non-linearly with a quadratic polynomial function.
Model 15: µ is allowed to vary over time non-linearly with a quadratic polynomial function; δ is allowed to vary over time non-linearly with a cubic polynomial function.
Model 16: µ is allowed to vary over time non-linearly with a cubic polynomial function; δ is allowed to vary over time non-linearly with a cubic polynomial function.
The gamma distribution has two parameters: the shape and scale. Their relationships with the mean (mu) and dispersion (sigma) are given by equations 1 and 2.
\[ mu = shape*scale \tag{1} \] \[ sigma = \frac{\sqrt{shape*scale^2}}{shape*scale} \tag{2} \]
A list object with:
data.week: The precipitation amounts, SPI, cumulative probability of the SPI values (stationary approach), cumulative probability of the precipitation values calculated from the nonstationary algorithm of the NSPI, and the changes in the frequency of the SPI values caused by the changes in precipitation patterns.
model.selection: The generalized additive model that best fits the precipitation series.
Changes.Freq.Drought: changes in the frequency of moderate, severe and extreme drought events, as defined by the SPI classification system (Table 1), caused by the changes in precipitation patterns.
Statistics: Year to year changes in the expected frequency of moderate, severe and extreme drought events.
Using only linear nonstationary parametric models to assess the
probability of rainfall amounts. The models are based on the
two-parameter gamma distribution with parameters estimated by the
maximum likelihood method. The packages gamlss
and
gamlss.dist
are used for such estimations.
library(SPIChanges)
daily.rain <- CampinasRain[, 2]
rainTS4 <- TSaggreg(daily.rain = daily.rain, start.date = "1980-01-01", TS = 4)
#> Done. Just ensure the last quasi-week is complete.
#> The last day of your series is 31 and TS is 4
Changes_SPI <- SPIChanges(rain.at.TS = rainTS4, only.linear = "Yes")
#> Warning in SPIChanges(rain.at.TS = rainTS4, only.linear = "Yes"): rainfall
#> series Month 9 Week 1 has more than 6.7% of zeros. In this situation the SPI
#> cannot assume values lower than -1.5
head(Changes_SPI$data.week)
#> Year Month quasiWeek rain.at.TS SPI Exp.Acum.Prob Actual.Acum.Prob
#> 1 1980 1 4 223.1143 -0.203 0.420 0.420
#> 2 1980 2 1 217.4197 -0.035 0.486 0.486
#> 3 1980 2 2 207.0196 0.031 0.513 0.513
#> 4 1980 2 3 203.8757 0.122 0.549 0.549
#> 5 1980 2 4 183.2537 0.317 0.624 0.624
#> 6 1980 3 1 177.9945 0.413 0.660 0.660
#> ChangeDryFreq
#> 1 0
#> 2 0
#> 3 NoDrought
#> 4 NoDrought
#> 5 NoDrought
#> 6 NoDrought
head(Changes_SPI$Statistics)
#> Month quasiWeek ProbZero mu sigma
#> 1 1 1 0 218.182 0.326
#> 2 1 1 0 218.182 0.326
#> 3 1 1 0 218.182 0.326
#> 4 1 1 0 218.182 0.326
#> 5 1 1 0 218.182 0.326
#> 6 1 1 0 218.182 0.326
head(Changes_SPI$model.selection)
#> Month quasiWeek model
#> [1,] 1 1 1
#> [2,] 1 2 1
#> [3,] 1 3 1
#> [4,] 1 4 1
#> [5,] 2 1 1
#> [6,] 2 2 1
head(Changes_SPI$Changes.Freq.Drought)
#> Month quasiWeek StatProbZero NonStatProbZero StatNormalRain
#> [1,] 1 1 0 0 210.51
#> [2,] 1 2 0 0 228.64
#> [3,] 1 3 0 0 231.14
#> [4,] 1 4 0 0 237.84
#> [5,] 2 1 0 0 220.15
#> [6,] 2 2 0 0 204.63
#> NonStatNormalRain ChangeMod ChangeSev ChangeExt
#> [1,] 210.51 0 0 0
#> [2,] 228.64 0 0 0
#> [3,] 231.14 0 0 0
#> [4,] 237.84 0 0 0
#> [5,] 220.15 0 0 0
#> [6,] 204.63 0 0 0
Using linear and non-linear nonstationary parametric models to assess
the probability of rainfall amounts. The models are based on the
two-parameter gamma distribution with parameters estimated by the
maximum likelihood method. The packages gamlss
and
gamlss.dist
are used for such estimations.
library(SPIChanges)
daily.rain <- CampinasRain[, 2]
rainTS4 <- TSaggreg(daily.rain = daily.rain, start.date = "1980-01-01", TS = 4)
#> Done. Just ensure the last quasi-week is complete.
#> The last day of your series is 31 and TS is 4
Changes_SPI <- SPIChanges(rain.at.TS = rainTS4, only.linear = "No")
#> Warning in SPIChanges(rain.at.TS = rainTS4, only.linear = "No"): rainfall
#> series Month 9 Week 1 has more than 6.7% of zeros. In this situation the SPI
#> cannot assume values lower than -1.5
head(Changes_SPI$data.week)
#> Year Month quasiWeek rain.at.TS SPI Exp.Acum.Prob Actual.Acum.Prob
#> 1 1980 1 4 223.1143 -0.203 0.420 0.420
#> 2 1980 2 1 217.4197 -0.035 0.486 0.486
#> 3 1980 2 2 207.0196 0.031 0.513 0.513
#> 4 1980 2 3 203.8757 0.122 0.549 0.549
#> 5 1980 2 4 183.2537 0.317 0.624 0.624
#> 6 1980 3 1 177.9945 0.413 0.660 0.660
#> ChangeDryFreq
#> 1 0
#> 2 0
#> 3 NoDrought
#> 4 NoDrought
#> 5 NoDrought
#> 6 NoDrought
head(Changes_SPI$Statistics)
#> Month quasiWeek ProbZero mu sigma
#> 1 1 1 0 218.182 0.326
#> 2 1 1 0 218.182 0.326
#> 3 1 1 0 218.182 0.326
#> 4 1 1 0 218.182 0.326
#> 5 1 1 0 218.182 0.326
#> 6 1 1 0 218.182 0.326
head(Changes_SPI$model.selection)
#> Month quasiWeek model
#> [1,] 1 1 1
#> [2,] 1 2 1
#> [3,] 1 3 1
#> [4,] 1 4 1
#> [5,] 2 1 1
#> [6,] 2 2 1
head(Changes_SPI$Changes.Freq.Drought)
#> Month quasiWeek StatProbZero NonStatProbZero StatNormalRain
#> [1,] 1 1 0 0 210.51
#> [2,] 1 2 0 0 228.64
#> [3,] 1 3 0 0 231.14
#> [4,] 1 4 0 0 237.84
#> [5,] 2 1 0 0 220.15
#> [6,] 2 2 0 0 204.63
#> NonStatNormalRain ChangeMod ChangeSev ChangeExt
#> [1,] 210.51 0 0 0
#> [2,] 228.64 0 0 0
#> [3,] 231.14 0 0 0
#> [4,] 237.84 0 0 0
#> [5,] 220.15 0 0 0
#> [6,] 204.63 0 0 0