NEWS | R Documentation |
add_reftable() supports an extended interface for the Simulate function, which may now handle a data frame of parameter values.
New default refine(., nbCluster) value, based on a more definite guess of a good value and aiming at faster analyses.
New convenient wrapper init_reftable() around the init_grid() function imported from the 'blackbox' package. Infusion now depends on versions >= 1.1.41 of that package, which include a faster init_grid() function.
plot2Dprof() can use parallelisation.
plot2Dprof(., pars) syntax extended for more flexible specification of sets of profiles.
refine.default() gets new argument 'CIs' for more transparent control of CI computations.
refine.default() can now request parallel execution of ranger(), according to refine()'s 'nb_cores' argument by default, but with refine()'s 'cluster_args' argument further allowing control of ranger()'s 'num.threads' independently from 'nb_cores'.
project.default() gets new arguments 'use_oob' and 'is_trainset' to bypass a costly step in specific (but commonplace) cases; and also argument 'methodArgs'.
nbCluster="max" syntax allowed in some contexts for fast Gaussian mixture modelling: the number of clusters tried is fixed to a single, automatically generated value.
The 'xLLiM' package (new 'Suggested' dependency) can be used as an alternative to 'Rmixmod' for joint density modelling, by calling infer_SLik_joint(., using="xLLiM").
New convenience function write_workflow().
The structure of 'SLik_j' objects has been modified, meaning that the result of running infer_SLik_joint() has to be re-generated to be compatible with functions that take this object as an argument.
add_reftable() and add_simulation() gain argument 'parsTable', which is an alias (and now the preferred name) for pre-existing argument 'par.grid'.
The documentations for add_reftable() and add_simulation() have been revised, to better emphasize the inference workflow based on add_reftable() over the primitive workflow based on add_simulation(), and split in different files for clarity. The definitions of the functions have also been revised, mostly in a backward-compatible way, although subtle changes may result in the primitive workflow.
'matrixStats' is an imported package.
New function SLRT() for "summary-likelihood ratio tests", including bootstrap-corrected ones. For the latter correction, another function get_LRboot() provides a fast approximation to the bootstrap distribution of the likelihood ratio statistic.
Better choice of initial value of optimization operations in likelihood profile computations, giving more correct 1D profile plots and likelihood ratio tests (and possibly improving all subsequent operations in iterative workflows).
More thorough implementation of parallelization in add_simulation() (and add_reftable()), allowing forking, and control of random number generator.
Finer control of parallelisation in refine(), through new features of the 'cluster_args' argument.
project() accepts (barely documented) "fastai" and "keras" methods, interfacing the packages of the same name (themselves interfacing python libraries).
New default values of various controls in calls to ranger::ranger.
infer_SLik_joint() and goftest() will check for linear dependencies among the summary statistics (which would cause bugs in goftest()), thanks to the new convenience function check_raw_stats().
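A linear dependency of the kind mentioned above can be illustrated with base R alone. The sketch below is not the implementation of check_raw_stats(); the helper has_linear_dependency() and the toy statistics s1, s2, s3 are hypothetical names introduced only for illustration. It assumes that a rank-deficient matrix of statistics is what signals the problem.

```r
# Hypothetical helper (not part of Infusion): TRUE if some column of the
# table of raw summary statistics is a linear combination of the others.
has_linear_dependency <- function(stats) {
  X <- as.matrix(stats)
  # The QR decomposition gives the numerical rank of the matrix;
  # a rank lower than the number of columns reveals a dependency.
  qr(X)$rank < ncol(X)
}

set.seed(123)
toy_stats <- data.frame(s1 = rnorm(50), s2 = rnorm(50))
toy_stats$s3 <- toy_stats$s1 + 2 * toy_stats$s2  # exact linear combination

dep_all <- has_linear_dependency(toy_stats)                   # three columns
dep_two <- has_linear_dependency(toy_stats[, c("s1", "s2")])  # two columns
```

In this toy case the dependency disappears once the redundant statistic s3 is dropped.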
MSL() now computes the hessian of summary likelihood at its maximum, to check parameter identifiability (with limited success). This hessian is included in the return value of MSL() and then in the 'MSL' environment stored in (e.g.) objects of class 'SLik_j'.
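The general idea of using a hessian to detect poor identifiability can be sketched with base R's optim(). This is not the code used by MSL(): the toy objective neg_loglik() is a hypothetical negative log-likelihood (so it is minimized rather than maximized), and the relative threshold of 1e-4 is an arbitrary assumption.

```r
# Toy negative log-likelihood (hypothetical): only the sum p[1] + p[2]
# affects the fit, so the two parameters are not separately identifiable.
neg_loglik <- function(p) (p[1] + p[2] - 1)^2

# Minimize, asking optim() for a numerical hessian at the optimum.
fit <- optim(c(0, 0), neg_loglik, method = "BFGS", hessian = TRUE)

# A (near-)zero eigenvalue means the surface is flat in some direction
# of parameter space, i.e. that direction is not identifiable.
ev <- eigen(fit$hessian, symmetric = TRUE, only.values = TRUE)$values
flat_direction <- min(abs(ev)) < 1e-4 * max(abs(ev))
```

Here the exact hessian has one zero eigenvalue, corresponding to the direction along which p[1] + p[2] stays constant.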
New function focal_refine() to refine the likelihood surface for given parameter values.
The previously private function predict.SLik_j() that evaluates the likelihood for specific parameter points is now part of the API.
New argument 'plot.slices' for plot.SLik_j() and plot.SLik().
New argument 'decorations' for plot1Dprof() and plot2Dprof().
Improved control of number of clusters in Gaussian mixture modelling to avoid fitting more parameters than data.
Projections using 'ranger' now call it with argument importance="permutation", which may be used to select raw statistics for the goodness of fit test.
Check by add_reftable() of its result, to detect and more clearly warn about otherwise obscure problems that may occur at later steps.
profile.SLik() gets an 'init' argument, though it's more for programming purposes than intended as a user-level feature.
New 'summLik' extractor. summLik() uses a fit object to evaluate the summary-likelihood function for distinct parameter values and even for new data.
plot_proj(), a hastily written convenience function to draw a diagnostic plot for a projection from an SLik_j object.
Several code fixes for more efficient handling of large reference tables (notably, resolving some memory issues when using "ranger" projection results).
New functions get_nbCluster_range() (and seq_nbCluster(), less directly useful) to help in controlling the number of clusters used in Gaussian mixture modelling.
get_from() gets new argument 'force' to force computation of elements that may be missing from the fit object.
Formals of add_reftable() modified (backward incompatibility if the former first argument 'simulations' was named in the function call).
Infusion.getOption("nb_cores") now controls ranger::predict and ranger::ranger, with the expected effect according to the general effect of 'nb_cores' in Infusion (a single CPU is used if 'nb_cores' is NULL, while ranger functions use multithreading by default, i.e., when their 'num.threads' argument is NULL).
Backward-incompatible changes in project() arguments controlling training sizes.
Backward-incompatible changes in global options controlling the number of clusters used in Gaussian mixture modelling.
Incorrect reference to randomForest in documentation for project().
'caret' is back in suggested packages, but for its findLinearCombos() function rather than for any modelling method.
'ranger' is now an imported package.
New goftest() function, which may return a test of goodness of fit.
Improved sampling of parameter space in refine().
Several functions have a new argument 'cluster_args', passed to parallel::makeCluster(), for better control of the parallel computations. These functions' 'nb_cores' argument now acts as a shortcut for cluster_args$spec, in a backward-compatible way.
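The relation between 'nb_cores' and 'cluster_args' can be sketched as follows. This is only an illustration, not Infusion's internal code: resolve_cluster_args() is a hypothetical helper, and the precedence rule (an explicit cluster_args$spec wins over 'nb_cores') and the fallback to a single process are assumptions.

```r
# Hypothetical helper (not part of Infusion): treat 'nb_cores' as a
# shortcut for cluster_args$spec when 'spec' is not given explicitly.
resolve_cluster_args <- function(nb_cores = NULL, cluster_args = list()) {
  if (is.null(cluster_args$spec) && !is.null(nb_cores)) {
    cluster_args$spec <- nb_cores
  }
  # Assumed fallback: a single process when nothing is specified.
  if (is.null(cluster_args$spec)) cluster_args$spec <- 1L
  cluster_args
}

# 'nb_cores' alone, plus an extra makeCluster() argument.
args1 <- resolve_cluster_args(nb_cores = 2, cluster_args = list(type = "PSOCK"))
# An explicit 'spec' takes precedence in this sketch.
args2 <- resolve_cluster_args(nb_cores = 2, cluster_args = list(spec = 1))

# The resolved list is passed as-is to parallel::makeCluster().
cl <- do.call(parallel::makeCluster, args1)
squares <- parallel::parSapply(cl, 1:4, function(i) i^2)
parallel::stopCluster(cl)
```

Keeping all cluster settings in one list means any argument accepted by parallel::makeCluster() (such as 'type' or 'outfile') can be forwarded without each function needing a dedicated argument for it.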
add_reftable() results can be more easily subsetted thanks to a new '[' method.
For better control of 1D profiles, plot1Dprof() gets new options for argument 'type', and new argument 'control'.
Refined use of function optimizers, which generally performs better than the previously used base optimizer (optim()).
plot.SLik_j(), plot1Dprof() and plot2Dprof() now re-run MSL(., eval_RMSEs=FALSE, CIs=FALSE) when they detect a new likelihood maximum, and discard pre-existing CI/RMSE information as these depend on the maximum. plot1Dprof() and plot2Dprof() now have a return value indicating whether MSL() was re-run.
New extractor get_from() to extract elements from summary-likelihood objects in a backward-compatible way.
Better control of verbosity in refine().
Tenfold larger default 'knotNbr' for ranger() projection method than previously.
New 'logLik' extractor.
'Rmixmod', which had disappeared from the dependencies in recent versions, is back in Suggested packages.
'nloptr' and 'minqa' now used for optimization.
'crayon' added in Suggests, with no visible effects yet in standard use.
Default model for Gaussian mixture modelling changed.
add_simulation() argument 'Simulate' can now be a function (rather than a name) and the return value's attr(.,"Simulate") is always the function rather than its name.
New Infusion.options "mixturing" (which replaces some previously undocumented options).
Slightly modified parameter sampling in refine(), so that all ensuing results will be slightly modified.
add_simulation() could fail to pass the definition of 'Simulate' to child processes in parallel computations.
project.default() could fail on one-dimensional statistics (a toy case).
Out-of-bag projections used whenever relevant in projections using random-forests methods (which definitely improves the results over "in-bag" projections).
New 'eval_RMSEs' argument of function refine.default().
New argument 'update_projectors' in refine.default(), effective only for the reference table method.
'ranger' package included in Suggests; random forest by 'ranger' is the new default projection 'method' in project.character().
infer_SLik_joint() can now (automatically) provide correct results when there is *one* probability mass in the distribution of the summary statistics.
infer_SLik_joint() can now handle 'using="mclust"' (but this will still provide incorrect results when there is a probability mass in the distribution of the summary statistics).
Objects of class 'SLik_j' using projection now retain a cumulative table of unprojected simulations, from which new projections can be generated.
Revised documentation including examples for the inference method based on reference tables.
Internal methods in infer_SLik_joint() are modified by default so that numerical results are slightly modified. See its new argument 'marginalize' to control this.
Default Infusion options 'projTrainingSize' and 'projKnotNbr' for projection by REML have been increased, because the called procedures are more efficiently implemented in versions of spaMM >= 2.7.0.
RMSE computation on SLik_j objects could fail following Rmixmod::mixmodCluster()'s failure to fit a bootstrap replicate.
Potential problems with figure rendering when Rstudio's graphic device was *not* used.
New 'nb_cores', 'packages' and 'env' arguments for more widely allowing parallel simulation in add_simulation() and refine().
New '...' (or 'control.Simulate') argument in add_simulation() for controlling the 'Simulate' arguments.
Dependencies changed for parallel computations; 'doSNOW' usage has been reinstated optionally (though only 'foreach' appears in the package dependencies). See help("Infusion.options") for details.
plot.SLik() failed for more than two parameters.
project() did not work on reference tables.
DESCRIPTION now Suggests 'Rmixmod' rather than Imports it. This means that, even if 'Rmixmod' were to be archived from CRAN (which nearly occurred in March 2018), 'Infusion' could still be installed and run without 'Rmixmod', using 'mclust' as an alternative for model-based clustering. 'Rmixmod' is still the preferred method.
'mclust', 'caret' and 'ks' have been removed from DESCRIPTION (but can still be used without changes in user's code).
Imports package 'pbapply' to draw progress bars, including in parallel computations; no longer Suggests the package 'foreach' (and optionally 'doSNOW') for the same effect.
infer_SLik_joint() can now use procedures from the 'mclust' package (controlled by new argument 'using').
The calls to 'mclust' procedures in other Infusion functions have been revised to be comparable to calls to 'Rmixmod' procedures. This means in particular that the 'G' and 'modelNames' arguments of 'mclust' procedures are set to values equivalent to the corresponding arguments of calls to 'Rmixmod' procedures, and that Infusion uses AIC to select the number of Gaussian clusters in both cases (although selection by AIC is implemented in neither of the clustering packages).
infer_logLs() can now analyze the input distributions in parallel. This is controlled by its new 'nb_cores' argument, and may make use of another new argument, 'packages'.
Infusion.options(<new values>) did not return old values.
New function infer_SLik_joint() to infer likelihood surfaces from a simulation table where each simulated data set is drawn for a distinct (vector-valued) parameter, as is usual for reference tables in Approximate Bayesian Computation.
'Infusion' now depends on package 'blackbox' for several functions moved there from package 'spaMM'.