Function to evaluate methods defined in a BenchDesign on a supplied data set to generate a SummarizedBenchmark of benchmarking results. In addition to the results of applying each method on the data, the returned SummarizedBenchmark also includes metadata for the methods in the colData of the returned object, metadata for the data in the rowData, and session information in the metadata.
buildBench(bd, data = NULL, truthCols = NULL, ftCols = NULL, sortIDs = FALSE, keepData = FALSE, catchErrors = TRUE, parallel = FALSE, BPPARAM = bpparam())
bd |
|
---|---|
data | Data set to be used for benchmarking, will take priority over
data set specified to |
truthCols | Character vector of column names in data set corresponding to
ground truth values for each assay. If specified, column will be added to
the |
ftCols | Vector of character names of columns in data set that should be
included as feature data (row data) in the returned |
sortIDs | Whether the output of each method should be merged and sorted using IDs.
See Details for more information. (default = FALSE) |
keepData | Whether to store the data as part of the |
catchErrors | Logical; whether errors produced by methods during evaluation
should be caught and printed as a message without stopping the entire
build process. (default = TRUE) |
parallel | Whether to use parallelization for evaluating each method.
Parallel execution is performed using BiocParallel. Parameters for
parallelization should be specified with |
BPPARAM | Optional |
SummarizedBenchmark object.
Parallelization is performed across methods. Therefore, there is currently no benefit to specifying more cores than the total number of methods in the BenchDesign object.
By default, errors thrown by individual methods in the BenchDesign are caught during evaluation and handled in a way that allows buildBench to continue running with the other methods. The error is printed as a message, and the corresponding column in the returned SummarizedBenchmark object is set to NA. Because many benchmarking experiments are time-consuming and computationally intensive, having to rerun the entire analysis because of a single failed method can be frustrating; default error catching is meant to avoid this. If this behavior is not desired, setting catchErrors = FALSE turns off error handling.
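The error-catching behavior described above can be illustrated with a minimal base R sketch. This is not buildBench's implementation; the `methods` list, the input `x`, and the helper `run_method` are hypothetical names introduced only to show the idea of catching a failure, printing it as a message, and filling the corresponding column with NA.

```r
# Hypothetical toy methods (not part of SummarizedBenchmark); `bad` always fails.
methods <- list(
  double = function(x) x * 2,
  bad    = function(x) stop("method failed"),
  negate = function(x) -x
)
x <- c(1, 2, 3)

# Evaluate one method; on error, print a message and return an NA column
# so that the remaining methods can still be evaluated.
run_method <- function(name) {
  tryCatch(
    methods[[name]](x),
    error = function(e) {
      message("Method '", name, "' failed: ", conditionMessage(e))
      rep(NA_real_, length(x))
    }
  )
}

# One column per method; the failed method's column is all NA.
results <- vapply(names(methods), run_method, numeric(length(x)))
```

Here the `double` and `negate` columns are filled in normally, while the `bad` column is NA. Without the `tryCatch` wrapper, the first error would abort the whole loop, which is roughly what catchErrors = FALSE opts into.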
If sortIDs = TRUE, each method must return a named vector or list. The names will be used to align the output of each method in the returned SummarizedBenchmark. Missing values from each method will be set to NA. This can be useful if the different methods return overlapping, but not identical, results. If truthCols is also specified and sorting by IDs is necessary, then instead of sortIDs = TRUE, set sortIDs to the name (a string) of a column in the data; that column is used to sort the method output to match the order of truthCols.
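The alignment by names can be sketched in base R as follows. This is only an illustration of the idea, not buildBench's code: the list `out` and its IDs are made up, and taking the sorted union of all names is an assumption about how overlapping outputs might be combined.

```r
# Hypothetical outputs of two methods: named vectors over partly different IDs.
out <- list(
  m1 = c(a = 0.1, b = 0.2, c = 0.3),
  m2 = c(b = 0.5, c = 0.6, d = 0.7)
)

# Sorted union of all IDs returned by any method (an assumption of this sketch).
ids <- sort(unique(unlist(lapply(out, names))))

# Index each vector by the full ID set; IDs a method did not return become NA.
aligned <- vapply(out, function(v) unname(v[ids]), numeric(length(ids)))
rownames(aligned) <- ids
```

The result has one row per ID and one column per method, with NA wherever a method returned no value for that ID (for example, `d` for `m1` and `a` for `m2`).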
When a method specified in the BenchDesign does not have a postprocessing function specified to post =, the trivial base::identity function is used as the default postprocessing function.
## load the package providing BenchDesign, addMethod and buildBench
library(SummarizedBenchmark)

## with toy data.frame
df <- data.frame(pval = rnorm(100))
bench <- BenchDesign(data = df)

## add methods
bench <- addMethod(bench, label = "bonf", func = p.adjust,
                   params = rlang::quos(p = pval, method = "bonferroni"))
bench <- addMethod(bench, label = "BH", func = p.adjust,
                   params = rlang::quos(p = pval, method = "BH"))

## evaluate benchmark experiment
sb <- buildBench(bench)

## evaluate benchmark experiment w/ data specified
sb <- buildBench(bench, data = df)