In order to get up to speed with the `portvine`

package
you will find in this vignette:

**Exemplary workflows**for the unconditional and conditional risk measure estimation**Opinionated visualization functions**not included in the package- Some notes on
**backtesting**the estimated risk measures

What you do not find here is a theoretical description of the
algorithms. For this purpose there is the **Under the hood
article** on the package website and actually it is probably most
often the better idea to start off with the theoretical view. For a very
detailed theoretical view with lots of illustrations and two case
studies have a look at my thesis - Sommer (2022).

```
#################### required ####################
library(portvine)
#################### optional ####################
## the following packages are only necessary for the visualizations
# for visualization:
library(ggplot2)
library(patchwork)
theme_set(
theme_minimal() +
theme(plot.title = ggtext::element_markdown(size = 11),
plot.subtitle = ggtext::element_markdown(size = 9))
)
# some data wrangling for visualizations
library(dplyr)
library(tidyr)
```

The sampling within the estimation algorithms is reproducible with a fixed seed (this also holds for parallel processing).

With the package comes a small test data set containing 1000 daily log returns of 3 stocks (Apple ~ AAPL, Alphabet ~ GOOG and Amazon ~ AMZN).

```
data("sample_returns_small")
head(sample_returns_small)
#> AAPL GOOG AMZN
#> 1: 0.005221515 -0.0063910288 -0.016940943
#> 2: 0.019702688 0.0232542613 0.016639145
#> 3: 0.019878308 -0.0006788564 -0.004209719
#> 4: -0.005595481 0.0065947671 -0.000176859
#> 5: -0.024806737 -0.0049333142 0.009580031
#> 6: 0.015416878 0.0113819042 0.018446965
dim(sample_returns_small)
#> [1] 1000 3
```

So as one financial year is comprised of roughly 250 days one has the data of roughly 4 years. The aim is now to estimate portfolio level risk measures for the last 40 days i.e. the interval \([960,1000]\). For a realistic scenario this would of course be a very short window. First one specifies the marginal assetwise time series models.

In practice one would for example like to refit the marginal models every two months which is in trading days roughly 50 days, but here we will refit every 20 days. The training data size is already specified by the interval of interest i.e. here one uses the whole available past of the first marginal window which has the size 960.

Here in the unconditional case one will not deviate from the default
thus for all marginal models an ARMA(1,1)-GARCH(1,1) model is fitted.
The default model can be assessed via the function
`default_garch_spec()`

.

```
default_garch_spec()
#>
#> *---------------------------------*
#> * GARCH Model Spec *
#> *---------------------------------*
#>
#> Conditional Variance Dynamics
#> ------------------------------------
#> GARCH Model : sGARCH(1,1)
#> Variance Targeting : FALSE
#>
#> Conditional Mean Dynamics
#> ------------------------------------
#> Mean Model : ARFIMA(1,0,1)
#> Include Mean : TRUE
#> GARCH-in-Mean : FALSE
#>
#> Conditional Distribution
#> ------------------------------------
#> Distribution : sstd
#> Includes Skew : TRUE
#> Includes Shape : TRUE
#> Includes Lambda : FALSE
```

The marginal model specifications are then collected in a
`marginal_settings`

object as shown below.

For the conditional case one will use a more individual specification. One changes the default to an ARMA(1,2)-GARCH(1,1) model while sticking with the default for the Amazon stock.

```
cond_marg_settings <- marginal_settings(
train_size = 960,
refit_size = 20,
individual_spec = list("AMZN" = default_garch_spec()),
default_spec = default_garch_spec(ma = 2)
)
```

The next step is to specify the settings for the vine fitting.

Here one would like to refit the vine copula models every 10 or 20 days which is allowed by the package as 20 is dividable by 10. One would like to capture the dependence on a training set of 100 trading days for each rolling window.

For the unconditional case one will use the default
`vine_type`

which is the R-vine. For the bivariate building
blocks of the vine copula one will restrict oneself to the single
parametric pair copulas. For other possible choices have a look at the
help page of the function `rvinecopulib::bicop()`

.

In the conditional case up to now only D-vine copulas are allowed. As copula families one will pick some arbitrary ones for this example. Note that “parametric” would be the most nonrestrictive valid option here.

```
cond_vine_settings <- vine_settings(
train_size = 100,
refit_size = 20,
family_set = c("gumbel", "joe", "t"),
vine_type = "dvine"
)
```

Now one is ready to perform the actual risk measure estimation.

The risk measures should be estimated at both popular confidence levels 1% and 2.5%. In terms of risk measures one would like to estimate the Value at Risk (VaR) and the expected shortfall (ES) with mean estimation based on here a very small and not realistic sample size of 50. A realistic one would be 100000.

Here an equally weighted portfolio of the three stocks is assumed
thus one can leave the default `NULL`

value for the
`weights`

argument. The trace argument set to
`TRUE`

will trigger the printing of some basic information
during training. This workhorse function of the package is very well
suited for parallel and even nested parallel processing. For details
have a look at the help page and consult the article covering the
parallel processing on the package website for some performance
measurements. Down below one can facilitate basic parallel processing by
just using the commented code.

```
## specify parallel strategy
# future::plan("multisession", workers = 4)
uncond_risk_roll <- estimate_risk_roll(
data = sample_returns_small,
weights = NULL,
marginal_settings = uncond_marg_settings,
vine_settings = uncond_vine_settings,
alpha = c(0.01, 0.025),
risk_measures = c("VaR", "ES_mean"),
n_samples = 50,
trace = TRUE
)
#>
#> Fit marginal models:
#> AAPL GOOG AMZN
#>
#> Fit vine copula models and estimate risk.
#> Vine windows:
#> (1/4) (2/4) (3/4) (4/4)
## return to default sequential settings (cut off any background processes)
# future::plan("sequential")
```

Having estimated the risk measures one can get a short and more detailed summary of the process by printing the object and calling the summary method.

```
uncond_risk_roll
#> An object of class <portvine_roll>
#> Number of ARMA-GARCH/ marginal windows: 2
#> Number of vine windows: 4
#> Risk measures estimated: VaR ES_mean
#> Alpha levels used: 0.01 0.025
#>
#> Time taken: 0.0675 minutes
```

```
summary(uncond_risk_roll)
#> An object of class <portvine_roll>
#>
#> --- Marginal models ---
#> Number of ARMA-GARCH/ marginal windows: 2
#> Train size: 960
#> Refit size: 20
#>
#> --- Vine copula models ---
#> Number of vine windows: 4
#> Train size: 100
#> Refit size: 10
#> Vine copula type: rvine
#> Vine family set: onepar
#>
#> --- Risk estimation ---
#> Risk measures estimated: VaR ES_mean
#> Alpha levels used: 0.01 0.025
#> Number of estimated risk measures: 160
#> Number of samples for each risk estimation: 50
#>
#> Time taken: 0.0675 minutes.
```

The package currently supports up to two conditioning assets. Here
the single conditional case will be covered but the double conditional
case is almost identical. One will choose the Amazon stock to be the
conditioning asset (via the `cond_vars`

argument).
Furthermore one will give the Alphabet stock a greater weight and the
weight of the conditioning asset must of course be zero. Note that the
weights for example according to market capitalization can be even
changing on the vine window granularity level. The `cond_u`

argument then specifies the quantile level of the conditioning value. So
a small quantile level will emulate the behavior of a very weak
performance of the conditioning Amazon stock. Details on all arguments
are given on the help page.

```
cond_risk_roll <- estimate_risk_roll(
data = sample_returns_small,
weights = c("AAPL" = 1, "GOOG" = 2, "AMZN" = 0),
marginal_settings = cond_marg_settings,
vine_settings = cond_vine_settings,
alpha = c(0.01, 0.025),
risk_measures = c("VaR", "ES_mean"),
n_samples = 50,
cond_vars = "AMZN",
cond_u = 0.1
)
```

Having estimated the conditional risk measures one can again get a short and more detailed summary of the process by printing the object and calling the summary method.

```
cond_risk_roll
#> An object of class <cond_portvine_roll>
#> Conditional variable(s): AMZN
#> Number of ARMA-GARCH/ marginal windows: 2
#> Number of vine windows: 2
#> Risk measures estimated: VaR ES_mean
#> Alpha levels used: 0.01 0.025
#>
#> Time taken: 0.073 minutes
```

```
summary(cond_risk_roll)
#> An object of class <cond_portvine_roll>
#>
#> --- Conditional settings ---
#> Conditional variable(s): AMZN
#> Number of conditional estimated risk measures: 320
#> Conditioning quantiles: 0.1
#>
#> --- Marginal models ---
#> Number of ARMA-GARCH/ marginal windows: 2
#> Train size: 960
#> Refit size: 20
#>
#> --- Vine copula models ---
#> Number of vine windows: 2
#> Train size: 100
#> Refit size: 20
#> Vine copula type: dvine
#> Vine family set: gumbel joe t
#>
#> --- Risk estimation ---
#> Risk measures estimated: VaR ES_mean
#> Alpha levels used: 0.01 0.025
#> Number of estimated risk measures: 160
#> Number of samples for each risk estimation: 50
#>
#> Time taken: 0.073 minutes.
```

In order to extract the marginal models one can use the function
`fitted_marginals()`

. The result is a named list with a
`rugarch::ugarchroll`

object for each asset. These objects
contain all fitted marginal time series models.

For the extraction of the fitted residuals for a residual analysis
the `portvine`

package further provides the utility function
`roll_residuals()`

. In the following an opinionated
visualization function useful for a visual residual analysis is given
and used. It allows to visualize the raw standardized residuals, the
empirical autocorrelation function of them and the results (p-values) of
Ljung-Box tests for serial correlation at different lags. This is useful
to assess the appropriateness of the mean equation of the ARMA-GARCH
model. In order to assess the volatility equation one can have a look at
the same visualization for the squared standardized residuals. As the
function is highly opinionated it was not integrated in the
`portvine`

package.

```
# An opinionated function for residual analysis. Note this function has no
# input checks and is not tested in any ways.
# Output: named list with a composition of residual plots for each asset
marg_resid_viz_list <- function(
roll, # portvine_roll or cond_portvine_roll object
asset_names = NULL, # filter for certain assets
marg_window = 1, # specify a marginal window
squared = FALSE # if set to true the results for the squared stand. residuals
# are displayed
) {
fitted_marginals <- fitted_marginals(roll)
if (is.null(asset_names)) asset_names <- fitted_vines(roll)[[1]]$names
sapply(
asset_names,
function(asset_name) {
# use again a utility function from the portvine package to extract the
# fitted standardized residuals
model_resid <- roll_residuals(
fitted_marginals[[asset_name]], marg_window
)
if (squared) model_resid <- model_resid^2
simple_exploratory <- data.frame(resid = model_resid) %>%
mutate(id = seq(length(model_resid))) %>%
ggplot(aes(x = id, y = resid)) +
geom_line(size = 0.2) +
labs(x = "t", y = ifelse(squared, expression(z[t]^2),expression(z[t])),
title = asset_name) +
theme(axis.text.x = element_blank(),
axis.ticks.x = element_blank())
acf_plot <- data.frame(
acf = as.numeric(acf(model_resid, type = "cor", lag.max = 20,
plot = FALSE)$acf),
lag = 0:20
) %>%
filter(lag != 0 & lag <= 10) %>%
ggplot() +
geom_hline(yintercept = 0, col = "black", size = 0.3) +
geom_hline(yintercept = qnorm(c(0.025, 0.975)) /
sqrt(length(model_resid)),
linetype = "longdash", col = "#92B8DE", size = 0.5) +
geom_segment(aes(x = lag, xend = lag, y = 0, yend = acf)) +
geom_point(aes(x = lag, y = acf)) +
scale_x_continuous(breaks = seq(1, 10, 1)) +
ylim(-1, 1) +
labs(x = "h", y = "ACF(h)")
ljungplot <- data.frame(
pval = sapply(
1:10,
function(i) Box.test(model_resid, lag = i,
type = "Lju")$p.value),
lag = 1:10) %>%
ggplot() +
geom_hline(yintercept = 0, col = "black", size = 0.3) +
geom_hline(yintercept = 0.05,
linetype = "longdash", col = "#92B8DE", size = 0.5) +
geom_line(aes(x = lag, y = pval)) +
geom_point(aes(x = lag, y = pval)) +
scale_x_continuous(breaks = seq(1, 10, 1)) +
labs(x = "h", y = "p-value of Ljung-Box test at lag h")
(simple_exploratory / (ljungplot + acf_plot)) +
plot_layout(nrow = 2)
}, USE.NAMES = TRUE, simplify = FALSE)
}
```

Apply the function to the result of the unconditional estimation and display the residual plots for Apple and Alphabet.

```
uncond_residual_viz <- marg_resid_viz_list(uncond_risk_roll)
#> Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
uncond_residual_viz_squared <- marg_resid_viz_list(
uncond_risk_roll, squared = TRUE
)
```

For a better overview of all assets, especially in higher dimensions, a heatmap with the p-values of the Ljung-Box tests can be useful. For this I will now showcase another opinionated utility function.

```
# The function creates a heatmap of Ljung Box test p-values for a
# (cond-)portvine_roll object. Note this function again has no
# input checks and is not tested in any ways.
# Output: The heatmap (ggplot2 object)
ljung_heatmap <- function(roll, roll_num = 1, squared = FALSE) {
asset_names <- fitted_vines(roll)[[1]]$names
roll_marginals <- fitted_marginals(roll)
ljung_data <- sapply(asset_names, function(asset_name) {
model_resid <- roll_residuals(
roll_marginals[[asset_name]], roll_num = roll_num
)
if (squared) model_resid <- model_resid^2
sapply(
1:10,
function(i) Box.test(model_resid, lag = i, type = "Lju")$p.value
)
}, USE.NAMES = TRUE, simplify = TRUE)
ljung_data <- ljung_data %>%
as.data.frame() %>%
mutate(lag = seq(nrow(ljung_data))) %>%
pivot_longer(-lag, names_to = "asset", values_to = "pval")
if (all(ljung_data$pval >= 0.05)) {
legend_scale <- scale_fill_gradient(
low = "#92B8DE", high = "#2a82db"
)
} else {
legend_scale <- scale_fill_gradientn(
colours = c("#db4f59","#C37285" ,
"#92B8DE", "#2a82db"),
values = scales::rescale(c(0, 0.05 - 0.01, 0.05, 1)),
breaks = c(0.05),
labels = c(0.05),
guide = guide_colourbar(nbin = 1000))
}
ljung_data %>%
ggplot(aes(x = lag, y = asset, fill = pval)) +
geom_tile() +
scale_x_continuous(breaks = 1:10) +
labs(y = "", x = "h", fill = "p-value",
title = "Results of the Ljung-Box tests",
caption = paste("Rolling window:", roll_num)) +
legend_scale +
theme(legend.position = "right",
panel.grid.minor.x = element_blank())
}
```

This time one uses the result of the conditional risk measure estimation for a display of the function for the residual analysis. \(h\) again denotes the lag.

P-values lower than 0.05 would be highlighted in red on the heatmap for easy spotting.

Next up one can have a look at the fitted vine copulas. For this
again a utility function is available in the `portvine`

package. The function `fitted_vines()`

extracts all fitted
vines in a list, one vine copula object for each vine window.

```
uncond_fitted_vines <- fitted_vines(uncond_risk_roll)
uncond_fitted_vines[[3]]
#> 3-dimensional vine copula fit ('vinecop')
#> nobs = 100 logLik = 56 npars = 3 AIC = -106.01 BIC = -98.19
```

As the list elements are `rvinecopulib::vinecop`

objects
one can further analyze them with tools from the great `rvinecopulib`

package e.g. plot the objects to exploit the graphical representation,
which aids interpretability.

The last utility function that is presented in this vignette is
`risk_estimates()`

and as the name suggests it can be used to
easily extract the risk measure estimates from the estimation result
object. One can not only filter by the risk measure, confidence levels
but also choose between an output as `data.frame`

or
`data.table`

. Moreover optionally it calculates a logical
column indicating whether the risk measure was exceeded.

```
## a glimpse at the tidy data.frame with all estimated risk measures
head(risk_estimates(uncond_risk_roll), 10)
#> risk_measure risk_est alpha row_num vine_window realized
#> 1 VaR -0.06515662 0.010 961 1 -0.001246766
#> 2 VaR -0.04451489 0.025 961 1 -0.001246766
#> 3 ES_mean -0.08349987 0.010 961 1 -0.001246766
#> 4 ES_mean -0.06478227 0.025 961 1 -0.001246766
#> 5 VaR -0.08514404 0.010 962 1 0.048242745
#> 6 VaR -0.04493658 0.025 962 1 0.048242745
#> 7 ES_mean -0.12128020 0.010 962 1 0.048242745
#> 8 ES_mean -0.08440657 0.025 962 1 0.048242745
#> 9 VaR -0.06361489 0.010 963 1 0.011817887
#> 10 VaR -0.05623254 0.025 963 1 0.011817887
## a glimpse at the estimated VaR at confidence level 1% with exceeded column
head(
risk_estimates(
uncond_risk_roll,
risk_measures = "VaR",
alpha = 0.01,
exceeded = TRUE
),
6
)
#> risk_measure risk_est alpha row_num vine_window realized exceeded
#> 1 VaR -0.06515662 0.01 961 1 -0.001246766 FALSE
#> 2 VaR -0.08514404 0.01 962 1 0.048242745 FALSE
#> 3 VaR -0.06361489 0.01 963 1 0.011817887 FALSE
#> 4 VaR -0.06957757 0.01 964 1 0.012662108 FALSE
#> 5 VaR -0.09127966 0.01 965 1 0.023044956 FALSE
#> 6 VaR -0.15228774 0.01 966 1 -0.013655480 FALSE
## of course also applies for the conditional case
head(
risk_estimates(
cond_risk_roll,
risk_measures = "ES_mean",
alpha = 0.01,
cond_u = c("resid", 0.1)
),
6
)
#> risk_measure risk_est alpha row_num AMZN cond_u vine_window
#> 1 ES_mean -0.10060582 0.01 961 -0.021734368 0.1 1
#> 2 ES_mean -0.04506207 0.01 961 -0.008608346 resid 1
#> 3 ES_mean -0.15585150 0.01 962 -0.019588962 0.1 1
#> 4 ES_mean -0.03063990 0.01 962 0.015758643 resid 1
#> 5 ES_mean -0.11187111 0.01 963 -0.019490136 0.1 1
#> 6 ES_mean -0.02746338 0.01 963 0.008117431 resid 1
#> realized
#> 1 0.007439544
#> 2 0.007439544
#> 3 0.039189494
#> 4 0.039189494
#> 5 -0.002692899
#> 6 -0.002692899
```

This tidy representation can be easily used for visualizations. For example comparisons of risk measures over time.

```
risk_estimates(uncond_risk_roll, alpha = 0.025) %>%
ggplot() +
geom_line(aes(x = row_num, y = realized), col = "grey") +
geom_line(aes(x = row_num, y = risk_est,
col = factor(risk_measure))) +
scale_color_manual(values = c("#92B8DE", "#477042")) +
labs(x = "trading day",
y = "portfolio log returns",
col = "Risk measure",
title = "Comparison of unconditional risk measure estimates at
confidence level 2.5%",
subtitle = paste0("Realized log returns in ",
"<span style='color:",
"grey",
"'>**grey**</span>",
"."))
```

Or one can explore exceedances.

```
risk_estimates(
uncond_risk_roll,
risk_measures = c("VaR"),
alpha = 0.01,
exceeded = TRUE) %>%
ggplot() +
geom_line(aes(x = row_num, y = realized), col = "lightgrey") +
geom_line(aes(x = row_num, y = risk_est), col = "#92B8DE") +
geom_point(aes(x = row_num, y = realized), col = "#db4f59",
inherit.aes = FALSE, data = . %>% filter(exceeded)) +
labs(x = "trading days",
y = "portfolio log returns",
col = "Exceeded",
subtitle = paste0("Exceedances are highlighted in ",
"<span style='color:",
"#db4f59",
"'>**red**</span>",
"."),
title = "Risk measure: VaR with confidence level 1%"
) +
theme(legend.position = "none")
```

In the conditional case one can first have a look at how the different conditional strategies behave. On the one side a conditioning value based on the quantile level 0.1 was forecasted (below shown in light blue) and on the other side the residual strategy is by default set to the one that uses the realized residuals (below shown in green, it should almost overlay the grey realized line). Those are only for the comparison as they conceptually are an oracle as the actual realized conditional asset returns would not be known at sampling time.

```
risk_estimates(
cond_risk_roll,
risk_measures = "ES_mean",
alpha = 0.01) %>%
ggplot() +
geom_line(
data = sample_returns_small[961:1000, "AMZN"],
aes(x = 961:1000,
y = AMZN), col = "grey", size = .3) +
geom_line(aes(x = row_num, y = AMZN, col = factor(cond_u)),
size = 0.5) +
scale_color_manual(values = c("#92B8DE", "#477042"),
name = "Conditional strategy") +
labs(title = "Conditional variable: AMZN",
x = "trading days", y = "log returns of the asset AMZN",
subtitle = paste0("Realized log returns in ",
"<span style='color:",
"grey",
"'>**grey**</span>",
"."))
```

Then one can compare the resulting conditional risk measure estimates.

```
risk_estimates(cond_risk_roll, alpha = 0.025, risk_measures = "ES_mean") %>%
ggplot() +
geom_line(aes(x = row_num, y = realized), col = "grey") +
geom_line(aes(x = row_num, y = risk_est,
col = factor(cond_u))) +
scale_color_manual(values = c("#477042", "#92B8DE")) +
labs(x = "trading day",
y = "portfolio log returns",
col = "Conditional strategy",
title = "Comparison of conditional risk measure estimates at
confidence level 2.5%",
subtitle = paste0("Realized log returns in ",
"<span style='color:",
"grey",
"'>**grey**</span>",
"."))
```

This is of course just the tip of the iceberg and countless other visualizations are possible at that point. Especially for multiple conditional quantile levels in a stress testing scenario. An especially promising comparison is the one comparing multiple levels of quantile based risk measure estimates with the respective unconditional one. This allows to assess how well the unconditional approach accounts for the market risk on the portfolio associated with fluctuations of the conditional asset(s).

The interpretation is left out for these exemplary workflows. Detailed case studies can again be found in Sommer (2022).

The risk measure estimation performed is in many cases performed with
the goal of backtesting in mind. This can be easily achieved given the
objects returned by the `estimate_risk_roll()`

function. As
there are already excellent resources for performing these here are some
suggestions for implementations that can be applied in this portfolio
level setting:

**Value at Risk backtesting**: The package`rugarch`

contains the straightforward to use function`rugarch::VaRTest()`

. It features classical backtests based on violation sequences.**Expected shortfall backtesting**: The package`esback`

contains a suite of easy to use and theoretically backed backtests for the expected shortfall.