MultiDoE
packageThe MultiDoE
package can be used to construct
multi-stratum experimental designs (for any number of strata) that
optimize up to six statistical criteria simultaneously. To solve such
optimization problems, the innovative MS-Opt and MS-TPLS algorithms are
implemented. The former relies on a local search procedure to find a
locally optimal experimental design. More precisely, it is an extension
of the Coordinate-Exchange (CE) algorithm that allows both the search of
experimental designs for any type of nested multi-stratum experiment and
the optimization of multiple criteria simultaneously. The latter, by
embedding MS-Opt in a Two-Phase Local Search framework, is able to
generate a good Pareto front approximation for the optimization problem
under study. The package provides different ways to choose the final
optimal experimental design among those belonging to the Pareto
front.
In what follows, we report a practical example.
In this experiment, a split-plot design is used. The final aim is to investigate the effect of five factors on protein extraction. More precisely, a mixture containing two valuable proteins, among other components, is considered after fermentation and purification processes.
The experiment is intended to separate the two proteins from the mixture, and the responses were the yields and purities of the two proteins. The factors are: \(x_1\), the feed position for the inflow of a mixture, which is hard to set; \(x_2\) the feed flow rate; \(x_3\) the gas flow rate; \(x_4\) the concentration of the first protein; \(x_5\), the concentration of the second protein. Three levels are used for each factor. The split-plot design was set up as follows: one whole-plot factor, four subplot factors, twenty-one whole plots of size two, and 42 runs.
Let use the package for simultaneously optmize more than one criterion.
First step is to upload the package.
It is time to initialize the main arguments for defining the experimental design.
backup_options <- options()
set.seed(13)
options(digits = 15)
facts <- list(1, 2:5)
units <- list(21, 2)
levels <- 3
etas <- list(1)
criteria <- c('Id', 'Ds')
model <- "quadratic"
facts
is a list of vectors representing the distribution
of factors across strata. Each item in the list represents a stratum and
the first item is the highest stratum of the multi-stratum structure of
the experiment. Within the vectors, experimental factors are indicated
by progressive integer from 1 (the first factor of the highest stratum)
to the total number of experimental factors (the last factor of the
lowest stratum). Blocking factors are differently denoted by empty
vectors.
units
is a list whose \(i\)-th element, \(n_i\), is the number of experimental units
within each unit at the previous stratum (\(i-1\)).
levels
is the number of available levels for each
experimental factor.
etas
is used to specify the ratios of error variance
between subsequent strata, starting from the highest strata.
criteria
is a list specifying the criteria to be
optimized among I-, Id-, D-, A-, Ds, and As-optimality.
model
is a string which indicates the type of model,
among main effects only (“main”), interaction (“interaction”) and full
quadratic (“quadratic”).
Now, we need to declare the main arguments for the MS-TPLS algorithms in order to get the final Pareto Front.
model
is an integer indicating the number of iterations
of the MS-TPLS algorithm. In this case, we set the iteration as 5 times
the number of criteria.
restarts
defines the number of times the MS-Opt
procedure is altogether called within each iteration of the MS-TPLS
algorithm.
restInit
determines how many of the iterations of MS-Opt
should be used for each criterion in the first step of the MS-TPLS
algorithm.
restarts
and restInit
could be tuned.
We now simultaneously minimize Id- and Ds-criteria.
tpls <- runTPLS(facts,units, criteria, model, iters,
"Restarts", restarts,
"RestInit", restInit)
#> [1] 1
#> [1] 2
#> [1] 3
#> [1] 4
#> [1] 5
#> [1] 6
#> [1] 7
#> [1] 8
#> [1] 9
#> [1] 10
Given the tpls
object, you have now different options.
First of all, you can inspect your Pareto front by using the
plotPareto
function. As input of the
plotPareto
function, you should use the megaAr
output of the runTPLS
function. megaAR
contains different information, among them it includes a matrix on which
every row contains the criteria values for each Pareto front design.
Now, you would probably like to find the best compromise between Id-
and Ds-criteria. For this purpose, the optMultiCrit
is the
function you are searching for. The optMultiCrit
function
provides an objective criterion for the selection of the best
experimental design among all Pareto front solutions. The selection is
based on minimizing the euclidean distance in the criteria space between
all the Pareto front points and an approximate utopian point. By
default, the coordinates of the utopian point correspond to the minimum
value reached by each criterion during the runTPLS
optimization procedure. Alternatively, the utopian point can be chosen
by the user (please see the documentation).
optMultiCrit(tpls$megaAR)
#> $solution
#> $solution[[1]]
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 1 1 1 -1
#> [2,] 1 -1 -1 1 0
#> [3,] 0 0 -1 0 0
#> [4,] 0 -1 0 1 1
#> [5,] 1 0 0 0 0
#> [6,] 1 1 -1 -1 1
#> [7,] 0 1 0 -1 0
#> [8,] 0 0 1 0 1
#> [9,] 1 -1 -1 -1 1
#> [10,] 1 1 1 -1 1
#> [11,] -1 -1 1 1 1
#> [12,] -1 -1 -1 -1 0
#> [13,] 0 0 1 1 -1
#> [14,] 0 1 0 0 0
#> [15,] -1 1 -1 -1 -1
#> [16,] -1 -1 0 1 0
#> [17,] 0 1 0 0 1
#> [18,] 0 0 1 -1 0
#> [19,] 1 -1 1 1 0
#> [20,] 1 1 -1 1 -1
#> [21,] 1 -1 -1 -1 -1
#> [22,] 1 0 -1 1 1
#> [23,] -1 1 -1 1 1
#> [24,] -1 0 0 0 -1
#> [25,] 0 1 -1 0 0
#> [26,] 0 0 0 -1 -1
#> [27,] 1 1 1 -1 -1
#> [28,] 1 -1 0 1 -1
#> [29,] -1 1 1 0 -1
#> [30,] -1 -1 1 -1 1
#> [31,] 0 0 0 0 0
#> [32,] 0 -1 -1 1 -1
#> [33,] -1 1 1 1 0
#> [34,] -1 -1 -1 0 1
#> [35,] 1 0 -1 -1 -1
#> [36,] 1 -1 1 -1 1
#> [37,] -1 -1 1 -1 -1
#> [38,] -1 0 0 -1 1
#> [39,] -1 1 1 -1 1
#> [40,] -1 0 -1 1 -1
#> [41,] 1 -1 1 0 -1
#> [42,] 1 1 1 1 1
#>
#>
#> $scores
#> Id Ds
#> 0.3327840041018626 0.0784106692294139
Another option is the use of the topsisOpt
function.
This approach is based on the principle that the best solutions must be
near to a positive ideal solution \((I+)\) and far from a negative ideal
solution \((I-)\) in the criteria
space. Please see the main reference for more details.
M. Méndez, M. Frutos, F. Miguel and R. Aguasca-Colomo. TOPSIS Decision on Approximate Pareto Fronts by Using Evolutionary Algorithms: Application to an Engineering Design Problem. Mathematics, 2020.
You can access the scores of the best compromised design found by the TOPSIS procedure.
topsis_solutions <- topsisOpt(tpls)
topsis_solutions$bestScore
#> Id Ds
#> 0.3357919136841818 0.0778816088473012
Then, you can also investigate the best design matrix.
topsis_solutions$bestSol
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 1 0 0 0
#> [2,] 1 -1 -1 1 -1
#> [3,] 0 0 0 0 0
#> [4,] 0 -1 1 -1 -1
#> [5,] 1 1 1 1 1
#> [6,] 1 -1 -1 0 1
#> [7,] -1 -1 1 -1 1
#> [8,] -1 0 -1 0 -1
#> [9,] 1 1 1 -1 1
#> [10,] 1 -1 0 -1 -1
#> [11,] -1 -1 -1 -1 1
#> [12,] -1 1 1 1 1
#> [13,] -1 1 1 1 -1
#> [14,] -1 -1 -1 -1 -1
#> [15,] -1 0 1 -1 0
#> [16,] -1 -1 -1 1 1
#> [17,] -1 1 0 -1 0
#> [18,] -1 -1 1 1 -1
#> [19,] 0 1 1 0 0
#> [20,] 0 0 0 1 1
#> [21,] 1 0 -1 -1 1
#> [22,] 1 1 -1 0 0
#> [23,] 1 0 1 -1 -1
#> [24,] 1 1 -1 1 1
#> [25,] 1 -1 1 -1 1
#> [26,] 1 1 1 1 -1
#> [27,] -1 1 -1 1 -1
#> [28,] -1 0 1 0 1
#> [29,] 0 1 0 -1 1
#> [30,] 0 -1 1 1 1
#> [31,] -1 1 -1 -1 1
#> [32,] -1 1 1 -1 -1
#> [33,] 1 1 -1 -1 -1
#> [34,] 1 -1 1 1 0
#> [35,] 0 -1 -1 -1 0
#> [36,] 0 0 0 0 -1
#> [37,] 0 1 -1 -1 -1
#> [38,] 0 0 0 0 0
#> [39,] 0 0 -1 1 0
#> [40,] 0 -1 0 0 -1
#> [41,] -1 -1 0 1 0
#> [42,] -1 1 -1 0 1
At this point, another interesting analysis is related to the
identification of the best design matrices for each criterion in the
Pareto front. optSingleCrit
function selects he best design
from the Pareto Front for each criterion.
optSingleCrit(tpls$megaAR)
#> $Id
#> $Id$scores
#> Id Ds
#> 0.3327840041018626 0.0784106692294139
#>
#> $Id$solution
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 1 1 1 -1
#> [2,] 1 -1 -1 1 0
#> [3,] 0 0 -1 0 0
#> [4,] 0 -1 0 1 1
#> [5,] 1 0 0 0 0
#> [6,] 1 1 -1 -1 1
#> [7,] 0 1 0 -1 0
#> [8,] 0 0 1 0 1
#> [9,] 1 -1 -1 -1 1
#> [10,] 1 1 1 -1 1
#> [11,] -1 -1 1 1 1
#> [12,] -1 -1 -1 -1 0
#> [13,] 0 0 1 1 -1
#> [14,] 0 1 0 0 0
#> [15,] -1 1 -1 -1 -1
#> [16,] -1 -1 0 1 0
#> [17,] 0 1 0 0 1
#> [18,] 0 0 1 -1 0
#> [19,] 1 -1 1 1 0
#> [20,] 1 1 -1 1 -1
#> [21,] 1 -1 -1 -1 -1
#> [22,] 1 0 -1 1 1
#> [23,] -1 1 -1 1 1
#> [24,] -1 0 0 0 -1
#> [25,] 0 1 -1 0 0
#> [26,] 0 0 0 -1 -1
#> [27,] 1 1 1 -1 -1
#> [28,] 1 -1 0 1 -1
#> [29,] -1 1 1 0 -1
#> [30,] -1 -1 1 -1 1
#> [31,] 0 0 0 0 0
#> [32,] 0 -1 -1 1 -1
#> [33,] -1 1 1 1 0
#> [34,] -1 -1 -1 0 1
#> [35,] 1 0 -1 -1 -1
#> [36,] 1 -1 1 -1 1
#> [37,] -1 -1 1 -1 -1
#> [38,] -1 0 0 -1 1
#> [39,] -1 1 1 -1 1
#> [40,] -1 0 -1 1 -1
#> [41,] 1 -1 1 0 -1
#> [42,] 1 1 1 1 1
#>
#>
#> $Ds
#> $Ds$scores
#> Id Ds
#> 0.4112907857518298 0.0741784584427386
#>
#> $Ds$solution
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 1 -1 -1 0
#> [2,] 1 -1 -1 1 -1
#> [3,] 1 0 -1 -1 1
#> [4,] 1 -1 1 -1 -1
#> [5,] 1 1 1 1 1
#> [6,] 1 -1 -1 1 1
#> [7,] -1 -1 1 -1 1
#> [8,] -1 0 -1 0 -1
#> [9,] 1 1 1 -1 1
#> [10,] 1 -1 -1 -1 -1
#> [11,] -1 -1 -1 -1 1
#> [12,] -1 1 1 1 1
#> [13,] -1 1 1 1 -1
#> [14,] -1 -1 -1 -1 -1
#> [15,] -1 -1 1 -1 -1
#> [16,] -1 -1 -1 1 1
#> [17,] -1 1 0 -1 0
#> [18,] -1 -1 1 1 -1
#> [19,] 0 1 1 0 0
#> [20,] 0 0 0 1 1
#> [21,] 1 -1 0 -1 1
#> [22,] 1 1 -1 1 -1
#> [23,] 1 0 1 -1 -1
#> [24,] 1 1 -1 1 1
#> [25,] 1 -1 1 0 1
#> [26,] 1 1 1 1 -1
#> [27,] -1 1 -1 1 -1
#> [28,] -1 0 1 0 1
#> [29,] 0 1 1 -1 1
#> [30,] 0 -1 1 1 1
#> [31,] -1 1 -1 -1 1
#> [32,] -1 1 1 -1 -1
#> [33,] 1 1 -1 -1 -1
#> [34,] 1 -1 1 1 0
#> [35,] 0 -1 -1 -1 0
#> [36,] 0 -1 0 1 -1
#> [37,] -1 1 -1 -1 -1
#> [38,] -1 -1 0 0 0
#> [39,] 0 0 -1 1 0
#> [40,] 0 1 0 0 -1
#> [41,] -1 -1 0 1 0
#> [42,] -1 1 -1 0 1