The boilerplate
package provides tools for managing and
generating standardised text for methods and results sections of
scientific reports. It handles template variable substitution and
supports hierarchical organisation of text through dot-separated
paths.
You can install the development version of boilerplate from GitHub with:
# install the devtools package if you don't have it already
install.packages("devtools")
devtools::install_github("go-bayes/boilerplate")
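- Hierarchical organisation of text through dot-separated paths (e.g., statistical.longitudinal.lmtp)
- Template variable substitution: replaces {{variable}} placeholders with actual values

The boilerplate package includes several safety features to prevent accidental data loss: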
- The boilerplate_save() function requires explicit specification of categories when saving individual databases
- Standard file names (boilerplate_unified.rds for unified databases, {category}_db.rds for individual categories)
- Confirmation (confirm=TRUE) before overwriting existing files
- Timestamps (timestamp=TRUE) to prevent overwrites
- Backups (create_backup=TRUE in interactive sessions)
- Directory creation (create_dirs=TRUE)

# install from github if not already installed
if (!require(boilerplate, quietly = TRUE)) {
# install devtools if necessary
if (!require(devtools, quietly = TRUE)) {
install.packages("devtools")
  }
  devtools::install_github("go-bayes/boilerplate")
}
# create a directory for this example (in practice, use your project directory)
example_dir <- file.path(tempdir(), "boilerplate_example")
dir.create(example_dir, showWarnings = FALSE)
# initialise unified database with example content
boilerplate_init(
data_path = example_dir,
create_dirs = TRUE,
create_empty = FALSE, # FALSE loads default example content
confirm = FALSE,
quiet = TRUE
)
# import the unified database
unified_db <- boilerplate_import(data_path = example_dir, quiet = TRUE)
# add a new method entry directly to the unified database
unified_db$methods$sample_selection <- "Participants were selected from {{population}} during {{timeframe}}."
# save all changes at once (JSON by default)
boilerplate_save(unified_db, data_path = example_dir, confirm = FALSE, quiet = TRUE)
# generate text with variable substitution
methods_text <- boilerplate_generate_text(
  category = "methods",
  sections = c("sample.default", "sample_selection"),
  global_vars = list(
    population = "university students",
    timeframe = "2020-2021"
  ),
  db = unified_db,
add_headings = TRUE
)
cat(methods_text)
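As a rough illustration of what the substitution step involves, the base R sketch below fills {{name}} placeholders from a named list. It is not the package's implementation: the helper fill_template() is hypothetical, and leaving unknown placeholders untouched is an assumption made for this example.

```r
# Hypothetical helper (not part of boilerplate): substitute {{name}}
# placeholders in a template string using a named list of values.
fill_template <- function(template, vars) {
  for (name in names(vars)) {
    placeholder <- paste0("{{", name, "}}")
    # fixed = TRUE treats the placeholder as a literal string, not a regex
    template <- gsub(placeholder, as.character(vars[[name]]), template, fixed = TRUE)
  }
  template
}

template <- "Participants were selected from {{population}} during {{timeframe}}."
filled <- fill_template(
  template,
  list(population = "university students", timeframe = "2020-2021")
)
cat(filled, "\n")

# Placeholders without a matching value are left as they are
partial <- fill_template("{{x}} and {{y}}", list(x = "a"))
```

Leaving unmatched placeholders visible makes a forgotten variable easy to spot in the generated text.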
The boilerplate package can manage bibliography files for your projects, ensuring consistent citations across all your boilerplate text:
# Make sure you have the unified_db loaded from previous example
# If not, load it:
# unified_db <- boilerplate_import(data_path = example_dir, quiet = TRUE)
# Add bibliography information to your database
# Using the example bibliography included with the package
example_bib <- system.file("extdata", "example_references.bib", package = "boilerplate")
unified_db <- boilerplate_add_bibliography(
  unified_db,
  url = paste0("file://", example_bib),
local_path = "references.bib"
)
# Save the updated database
boilerplate_save(unified_db, data_path = example_dir, confirm = FALSE, quiet = TRUE)
# Generate text and automatically copy bibliography
methods_text <- boilerplate_generate_text(
  category = "methods",
sections = "statistical.default", # Use full path to the default text
db = unified_db,
copy_bibliography = TRUE,
bibliography_path = "manuscript/"
)
# Validate all citations exist in bibliography
validation <- boilerplate_validate_references(unified_db)
if (!validation$valid) {
warning("Missing references: ", paste(validation$missing, collapse = ", "))
}
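To make the idea of citation validation concrete, here is a minimal base R sketch that compares Pandoc-style citation keys in text against the keys found in BibTeX lines. The helper find_missing_citations() and its regular expressions are assumptions for illustration only. They are not how boilerplate_validate_references() works internally, and the pattern would also match things like e-mail addresses.

```r
# Hypothetical helper (not part of boilerplate): report citation keys that
# appear in the text but have no matching entry in the BibTeX lines.
find_missing_citations <- function(text, bib_lines) {
  # Pandoc-style citations look like @key or [@key]
  cited <- unlist(regmatches(text, gregexpr("@[A-Za-z0-9_:-]+", text)))
  cited <- unique(sub("^@", "", cited))
  # BibTeX entries start with "@type{key,"
  entry_lines <- grep("^@[A-Za-z]+\\{", bib_lines, value = TRUE)
  bib_keys <- sub("^@[A-Za-z]+\\{([^,]+),.*$", "\\1", entry_lines)
  setdiff(cited, bib_keys)
}

text <- c(
  "Anxiety was measured with the GAD-7 [@spitzer2006].",
  "Neuroticism items follow @sibley2011."
)
# Illustrative BibTeX content with a single entry
bib_lines <- c("@article{spitzer2006,", "  year = {2006}", "}")

missing <- find_missing_citations(text, bib_lines)
print(missing)
```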
The boilerplate package supports JSON format for all database operations. JSON provides several advantages over the traditional RDS format:
For detailed JSON workflows, see vignette("boilerplate-json-workflow").
# First ensure you have a database to import
# Initialise if needed:
# boilerplate_init(data_path = "path/to", create_dirs = TRUE)
boilerplate_init(data_path = "my_project/data", create_dirs = TRUE, confirm = FALSE, quiet = TRUE)
# import database (automatically detects JSON or RDS format)
unified_db <- boilerplate_import(data_path = "my_project/data", quiet = TRUE)
# save as JSON (this is the default format)
boilerplate_save(unified_db, data_path = "my_project/data", format = "json", confirm = FALSE, quiet = TRUE)
# If you have old RDS files from a previous version, you can migrate them:
# results <- boilerplate_migrate_to_json(
# source_path = "old_project/data", # Path containing .rds files
# output_path = "new_project/data", # Where to save JSON files
# format = "unified", # Create single unified file
# backup = TRUE # Backup RDS files first
# )
# Example: Using a specific project directory for JSON data
my_json_path <- file.path("my_analysis", "boilerplate_data")
# Initialise if needed
# boilerplate_init(data_path = my_json_path, create_dirs = TRUE, confirm = FALSE, quiet = TRUE)
# import database (auto-detects JSON format)
# db <- boilerplate_import(data_path = my_json_path, quiet = TRUE)
# make changes
# db$methods$new_method <- "This is a new method using {{technique}}."
# save back as JSON (default format)
# boilerplate_save(db, data_path = my_json_path, confirm = FALSE, quiet = TRUE)
# Example: Validate JSON database structure
# Note: This requires the JSON schema files to be installed
# validation_errors <- validate_json_database(
# file.path("my_project/data", "boilerplate_unified.json"),
# type = "unified"
# )
#
# if (length(validation_errors) == 0) {
# message("JSON structure is valid!")
# } else {
# message("Validation errors found:")
# print(validation_errors)
# }
By default, boilerplate stores database files using
tools::R_user_dir("boilerplate", "data")
for CRAN
compliance. However, there are many situations where you might need to
use a different location:
All key functions in the package (boilerplate_init(), boilerplate_import(), boilerplate_save(), and boilerplate_export()) accept a data_path parameter to specify a custom location. When working with custom paths, be sure to use the same path consistently across all functions.
# define your custom path
my_project_path <- file.path("my_research_project", "data")
# Initialise databases in your custom location
boilerplate_init(
categories = c("measures", "methods", "results", "discussion", "appendix", "template"),
data_path = my_project_path, # Specify custom path here
create_dirs = TRUE,
confirm = FALSE,
quiet = TRUE
)
# import all databases from your custom location
unified_db <- boilerplate_import(
  data_path = my_project_path # Specify the same custom path
)
# make some changes
unified_db$measures$new_measure <- list(
  name = "new measure scale",
description = "a newly added measure",
reference = "author2023",
waves = "1-2",
keywords = c("new", "test"),
items = list("test item 1", "test item 2")
)
# save changes back to your custom location
boilerplate_save(
db = unified_db,
data_path = my_project_path, # Specify the same custom path
confirm = TRUE
)
# to save just a specific category:
boilerplate_save(
db = unified_db$measures,
category = "measures",
data_path = my_project_path,
confirm = TRUE
)
The boilerplate package now supports projects - isolated namespaces that keep different boilerplate collections separate. This is ideal for:
All core functions now accept a project parameter:
# Create a new project for shared lab content
boilerplate_init(
project = "lab_shared",
categories = c("methods", "measures"),
create_dirs = TRUE,
confirm = FALSE
)
# Import from a specific project
lab_db <- boilerplate_import(project = "lab_shared")
# Add content to the lab project
lab_db$methods$ethics <- "This study was approved by {{institution}} ethics committee (ref: {{ethics_ref}})."
# Save to the specific project
boilerplate_save(lab_db, project = "lab_shared")
# List all available projects
projects <- boilerplate_list_projects()
print(projects)
# Create personal and shared projects
boilerplate_init(project = "my_analysis", create_dirs = TRUE, confirm = FALSE, quiet = TRUE)
boilerplate_init(project = "team_templates", create_dirs = TRUE, confirm = FALSE, quiet = TRUE)
# Each project maintains its own isolated namespace
my_db <- boilerplate_import(project = "my_analysis", quiet = TRUE)
team_db <- boilerplate_import(project = "team_templates", quiet = TRUE)
Copy content between projects with conflict handling:
# Copy specific content from team templates to your project
boilerplate_copy_from_project(
from_project = "team_templates",
to_project = "my_analysis",
paths = c("methods.statistical", "measures.demographics"),
merge_strategy = "skip", # skip, overwrite, or rename
confirm = FALSE
)
# Example: Copy with a prefix to avoid naming conflicts
# First create the colleague's project
# boilerplate_init(project = "colleague_jane", create_dirs = TRUE, confirm = FALSE, quiet = TRUE)
# Then copy their content:
# boilerplate_copy_from_project(
# from_project = "colleague_jane",
# to_project = "my_analysis",
# paths = "measures.anxiety",
# prefix = "jane_", # results in "jane_anxiety"
# confirm = FALSE
# )
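The three conflict strategies can be pictured with plain named lists. The sketch below is only an illustration under stated assumptions: merge_entries() is a hypothetical helper, and the way it names renamed entries (a prefix added on conflict) is invented for this example, not taken from boilerplate_copy_from_project().

```r
# Hypothetical helper (not part of boilerplate): merge `source` entries into
# `target`, resolving name clashes with one of three strategies.
merge_entries <- function(target, source,
                          strategy = c("skip", "overwrite", "rename"),
                          prefix = "copy_") {
  strategy <- match.arg(strategy)
  for (name in names(source)) {
    if (!name %in% names(target)) {
      target[[name]] <- source[[name]]        # no conflict: copy as-is
    } else if (strategy == "overwrite") {
      target[[name]] <- source[[name]]        # replace the existing entry
    } else if (strategy == "rename") {
      target[[paste0(prefix, name)]] <- source[[name]]  # keep both entries
    }
    # strategy == "skip": the existing entry is left untouched
  }
  target
}

mine <- list(anxiety = "my anxiety text")
team <- list(anxiety = "team anxiety text", demographics = "team demographics text")

skipped <- merge_entries(mine, team, "skip")
overwritten <- merge_entries(mine, team, "overwrite")
renamed <- merge_entries(mine, team, "rename", prefix = "jane_")
print(names(renamed))
```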
Both relative and absolute paths are supported:
# Example: relative path (relative to working directory)
# boilerplate_import(data_path = "my_project/data", quiet = TRUE)
# Example: absolute path
# boilerplate_import(data_path = "/Users/researcher/projects/study_2023/data", quiet = TRUE)
For portable code, consider using relative paths or the file.path() function to construct paths.
A common workflow in research labs involves maintaining a central boilerplate database on GitHub that team members copy for project-specific use:
# 1. Clone the central database from GitHub
# git clone https://github.com/yourlab/boilerplate-database.git
# 2. Copy the database files to your project
# cp -r boilerplate-database/.boilerplate-data my-project/.boilerplate-data
# 3. Import and use in your project (auto-detects format)
# db <- boilerplate_import(data_path = ".boilerplate-data")
# 4. Make project-specific changes
# db$methods$sample_size <- "We recruited {{n}} participants for {{study_name}}."
# 5. Save locally for your project
# boilerplate_save(db, data_path = ".boilerplate-data")
# For JSON format (now the default):
# boilerplate_save(db, data_path = ".boilerplate-data", format = "json")
# 6. If you make changes that should be shared:
# - Copy back to the central repository
# - Submit a pull request with your improvements
The boilerplate package now supports version management for your databases. When you save databases with timestamps or when backups are created, you can easily manage and restore these versions.
Use boilerplate_list_files() to see all available database files:
# List all database files in your data directory
# First ensure you have initialised a database:
# boilerplate_init(data_path = "my_project/data", create_dirs = TRUE, confirm = FALSE, quiet = TRUE)
# Then list files:
# files <- boilerplate_list_files(data_path = "my_project/data")
# print(files)
# List only methods database files
# files <- boilerplate_list_files(data_path = "my_project/data", category = "methods")
# List files from a specific period
# files <- boilerplate_list_files(data_path = "my_project/data", pattern = "202401") # January 2024 files
The function organises files into:
- Standard files: Current working versions (e.g., methods_db.rds)
- Timestamped versions: Saved with timestamps (e.g., methods_db_20240115_143022.rds)
- Backup files: Automatic backups (e.g., methods_db_backup_20240115_140000.rds)
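The naming convention shown in these examples is regular enough to recognise with a pattern. The following base R sketch classifies file names into the three groups. The helper classify_db_file() is hypothetical, and the assumption that .json files follow the same naming scheme as .rds files is mine rather than something stated here.

```r
# Hypothetical helper (not part of boilerplate): label a database file as
# "standard", "timestamped", or "backup" based on its name.
classify_db_file <- function(filename) {
  stamp <- "[0-9]{8}_[0-9]{6}"  # e.g. 20240115_143022
  if (grepl(paste0("_backup_", stamp, "\\.(rds|json)$"), filename)) {
    "backup"
  } else if (grepl(paste0("_", stamp, "\\.(rds|json)$"), filename)) {
    "timestamped"
  } else {
    "standard"
  }
}

files <- c(
  "methods_db.rds",
  "methods_db_20240115_143022.rds",
  "methods_db_backup_20240115_140000.rds"
)
kinds <- vapply(files, classify_db_file, character(1))
print(kinds)
```

The backup pattern is checked first because a backup file name also ends in a timestamp.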
The enhanced boilerplate_import() function can now import any database file directly:
# Import database examples
# Note: These examples show the pattern - replace paths with your actual files
# Import the current standard version
# db <- boilerplate_import("methods")
# Import a specific timestamped version
# db <- boilerplate_import(data_path = "path/to/methods_db_20240115_143022.rds")
# Import a backup file
# db <- boilerplate_import(data_path = "path/to/methods_db_backup_20240115_140000.rds")
Use boilerplate_restore_backup() for convenient backup restoration:
# Backup restoration examples
# Note: These require existing backup files in your data directory
# View the latest backup without restoring
# backup_db <- boilerplate_restore_backup("methods")
# Restore the latest backup as the current version
# db <- boilerplate_restore_backup(
# category = "methods",
# restore = TRUE,
# confirm = TRUE # Will ask for confirmation
# )
# Restore a specific backup by timestamp
# db <- boilerplate_restore_backup(
# category = "methods",
# backup_version = "20240110_120000",
# restore = TRUE
# )
Here’s a typical workflow for managing versions:
# 1. Check what versions are available
# files <- boilerplate_list_files(data_path = "my_project/data", category = "methods")
# 2. Save current work with timestamp
# boilerplate_save(
# db = unified_db,
# data_path = "my_project/data",
# timestamp = TRUE, # Creates timestamped backup
# confirm = FALSE,
# quiet = TRUE
# )
# 3. If you need to revert changes, restore from backup
# boilerplate_restore_backup(
# data_path = "my_project/data",
# category = "methods",
# restore = TRUE,
# confirm = FALSE
# )
# 4. Work with specific versions
# List available backups first:
# backups <- boilerplate_list_files(data_path = "my_project/data", pattern = "backup")
# Then load a specific version if needed
Rather than creating separate .qmd files, you can embed boilerplate directly in your analysis code chunks:
# At the beginning of your analysis script or Quarto document
library(boilerplate)
# Define global variables
study_params <- list(
  n_participants = 250,
study_name = "Study 1",
recruitment_method = "online panels",
analysis_software = "R version 4.3.0"
)
# Example 1: Using default location (recommended for persistent storage)
# The default location uses tools::R_user_dir() and includes project structure
# db <- boilerplate_import() # Uses default project
# Example 2: Using a temporary directory (for this example)
temp_analysis <- file.path(tempdir(), "analysis_example")
boilerplate_init(
data_path = temp_analysis,
create_dirs = TRUE,
create_empty = FALSE, # Load default content
confirm = FALSE,
quiet = TRUE
)
# Import database
db <- boilerplate_import(data_path = temp_analysis, quiet = TRUE)
# Generate methods text when needed
methods_sample <- boilerplate_generate_text(
  category = "methods",
sections = "sample.default", # Use full path to the default text
global_vars = study_params,
db = db
)
# Use the text directly in your document
cat("## Methods\n\n", methods_sample)
# Clean up
unlink(temp_analysis, recursive = TRUE)
# Example 3: For a real project with existing .boilerplate-data directory:
# If you have an existing directory structure, you may need to specify:
# db <- boilerplate_import(data_path = ".boilerplate-data/projects/default/data")
# Or initialise it first:
# boilerplate_init(data_path = ".boilerplate-data", create_dirs = TRUE)
You can still work with individual databases if preferred:
# Working with individual databases example
temp_dir <- file.path(tempdir(), "individual_db_example")
boilerplate_init(data_path = temp_dir, create_dirs = TRUE, confirm = FALSE, quiet = TRUE)
# Import just the methods database
methods_db <- boilerplate_import("methods", data_path = temp_dir, quiet = TRUE)
# Add a new method entry
methods_db$sample_selection <- "Participants were selected from {{population}} during {{timeframe}}."
# Save just the methods database
boilerplate_save(methods_db, "methods", data_path = temp_dir, confirm = FALSE, quiet = TRUE)
# generate text with variable substitution
methods_text <- boilerplate_generate_text(
  category = "methods",
  sections = c("sample.default", "sample_selection"),
  global_vars = list(
    population = "university students",
    timeframe = "2020-2021"
  ),
  db = methods_db,
add_headings = TRUE
)
cat(methods_text)
# Clean up
unlink(temp_dir, recursive = TRUE)
The package supports initialising empty database structures by default, providing a clean slate for your project without sample content.
# Creating empty databases example
temp_empty <- file.path(tempdir(), "empty_db_example")
# Initialise empty databases (default behavior)
boilerplate_init(
categories = c("methods", "results"),
data_path = temp_empty,
create_dirs = TRUE,
confirm = FALSE,
quiet = TRUE
)
# Check that databases are empty
db_empty <- boilerplate_import(data_path = temp_empty, quiet = TRUE)
print(length(db_empty$methods)) # Should be 0
# Clean up
unlink(temp_empty, recursive = TRUE)
# Initialise with default content when needed
temp_content <- file.path(tempdir(), "content_db_example")
boilerplate_init(
categories = c("methods", "results"),
data_path = temp_content,
create_dirs = TRUE,
create_empty = FALSE, # This loads default content
confirm = FALSE,
quiet = TRUE
)
# Check that databases have content
db_content <- boilerplate_import(data_path = temp_content, quiet = TRUE)
print(length(db_content$methods)) # Should be > 0
# Clean up
unlink(temp_content, recursive = TRUE)
Empty databases provide just the top-level structure without example content, making it easier to start with a clean slate.
The package now supports exporting databases for versioning or sharing specific elements:
# Export database example
temp_export <- file.path(tempdir(), "export_example")
boilerplate_init(data_path = temp_export, create_dirs = TRUE, confirm = FALSE, quiet = TRUE)
# Import database
unified_db <- boilerplate_import(data_path = temp_export, quiet = TRUE)
# Export entire database for versioning
boilerplate_export(
db = unified_db,
output_file = "boilerplate_v1.0.json",
data_path = temp_export,
confirm = FALSE,
quiet = TRUE
)
# Export selected elements (specific methods and results)
boilerplate_export(
db = unified_db,
output_file = "causal_methods_subset.json",
select_elements = c("methods.statistical.*", "results.main_effect"),
data_path = temp_export,
confirm = FALSE,
quiet = TRUE
)
# Check exported files exist
list.files(temp_export, pattern = "\\.(json|rds)$")
# Clean up
unlink(temp_export, recursive = TRUE)
The export function supports:
- Full database export (ideal for versioning)
- Selective export using dot notation (e.g., “methods.statistical.longitudinal”)
- Wildcard selections using “*” (e.g., “methods.*” selects all methods)
- Category-prefixed paths for unified databases
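To show how dot notation and wildcards can map onto a nested list, here is a small base R sketch. It assumes a toy database, uses a hypothetical helper select_paths(), and relies on utils::glob2rx() to turn wildcard patterns into regular expressions. The package's own matching rules may differ.

```r
# A small nested list standing in for a unified database (illustrative content)
db <- list(
  methods = list(
    statistical = list(
      longitudinal = list(lmtp = "LMTP text"),
      default = "Default statistical text"
    ),
    sample = list(default = "Sample text")
  ),
  results = list(main_effect = "Main effect text")
)

# unlist() joins nested names with ".", giving one dot path per leaf entry
all_paths <- names(unlist(db))

# Hypothetical helper (not part of boilerplate): keep the paths that match
# any of the given patterns, where "*" acts as a wildcard.
select_paths <- function(paths, patterns) {
  matches <- lapply(patterns, function(p) grepl(utils::glob2rx(p), paths))
  paths[Reduce(`|`, matches)]
}

selected <- select_paths(all_paths, c("methods.statistical.*", "results.main_effect"))
print(selected)
```

This shortcut only works cleanly when every leaf is a single string and entry names contain no dots, since unlist() would otherwise produce ambiguous or suffixed names.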
Export is distinct from save: use boilerplate_save() for normal database updates and boilerplate_export() for creating standalone exports.
The package provides a simplified way to manage measures and generate formatted text about them. Measures are stored as top-level entries in the measures database, with each measure containing standardised properties like name, description, reference, etc.
# Measures example with temporary directory
temp_measures <- file.path(tempdir(), "measures_example")
boilerplate_init(data_path = temp_measures, create_empty = FALSE, create_dirs = TRUE, confirm = FALSE, quiet = TRUE)
# Import the unified database
unified_db <- boilerplate_import(data_path = temp_measures, quiet = TRUE)
# Add a measure directly to the unified database
# Note: Measures should be at the top level of the measures database
unified_db$measures$anxiety_gad7 <- list(
  name = "generalised anxiety disorder scale (GAD-7)",
description = "anxiety was measured using the GAD-7 scale.",
reference = "spitzer2006",
waves = "1-3",
keywords = c("anxiety", "mental health", "gad"),
items = list(
"feeling nervous, anxious, or on edge",
"not being able to stop or control worrying",
"worrying too much about different things",
"trouble relaxing"
)
)
# Save the entire unified database
boilerplate_save(unified_db, data_path = temp_measures, confirm = FALSE, quiet = TRUE)
# Alternatively, save just the measures portion
boilerplate_save(unified_db$measures, "measures", data_path = temp_measures, confirm = FALSE, quiet = TRUE)
# then generate text referencing the measure by its top-level name
exposure_text <- boilerplate_generate_measures(
  variable_heading = "Exposure Variable",
variables = "anxiety_gad7", # match the name you used above
db = unified_db, # can pass the unified database
heading_level = 3,
subheading_level = 4,
print_waves = TRUE
)
cat(exposure_text)
# you can also use the helper function to extract just the measures
measures_db <- boilerplate_measures(unified_db)
# generate text for outcome variables using just the measures database
psych_text <- boilerplate_generate_measures(
  variable_heading = "Psychological Outcomes",
variables = c("anxiety_gad7", "depression_phq9"),
db = measures_db, # or use the extracted measures database
heading_level = 3,
subheading_level = 4,
print_waves = TRUE
)
cat(psych_text)
# generate statistical methods text
stats_text <- boilerplate_generate_text(
  category = "methods",
sections = c("statistical.longitudinal.lmtp"),
global_vars = list(software = "R version 4.2.0"),
add_headings = TRUE,
custom_headings = list("statistical.longitudinal.lmtp" = "LMTP"),
heading_level = "###",
db = unified_db # pass the unified database
)
# initialise a sample text (assuming this was defined earlier)
sample_text <- boilerplate_generate_text(
  category = "methods",
sections = "sample.default",
global_vars = list(population = "university students", timeframe = "2023-2024"),
db = unified_db
)
# combine all sections into a complete methods section
methods_section <- paste(
  "## Methods\n\n",
  sample_text, "\n\n",
  "### Variables\n\n",
  exposure_text, "\n",
  "### Outcome Variables\n\n",
  psych_text, "\n\n",
  stats_text,
  sep = ""
)
cat(methods_section)
# Save the methods section to a file that can be included in a quarto document
# writeLines(methods_section, "methods_section.qmd")
# Clean up
unlink(temp_measures, recursive = TRUE)
When adding measures to the database:
- Add each measure as a top-level entry in the measures database
- When referencing a measure in boilerplate_generate_measures(), use the top-level name

Incorrect structure (avoid this):
# don't organise measures under categories at the top level
unified_db$measures$psychological$anxiety <- list(...) # WRONG
Correct structure:
# Add measures directly at the top level
unified_db$measures$anxiety_gad7 <- list(...) # CORRECT
unified_db$measures$depression_phq9 <- list(...) # CORRECT
The package includes powerful tools for standardising measure entries and reporting on database quality. This is particularly useful when working with legacy databases or when multiple contributors have added measures with inconsistent formatting.
The boilerplate_standardise_measures() function automatically cleans and standardises your measures:
# Standardisation example
temp_standard <- file.path(tempdir(), "standardise_example")
boilerplate_init(data_path = temp_standard, create_empty = FALSE, create_dirs = TRUE, confirm = FALSE, quiet = TRUE)
# Import your database
unified_db <- boilerplate_import(data_path = temp_standard, quiet = TRUE)
# Check quality before standardisation
boilerplate_measures_report(unified_db$measures)
# Standardise all measures
unified_db$measures <- boilerplate_standardise_measures(
  unified_db$measures,
  extract_scale = TRUE, # Extract scale info from descriptions
identify_reversed = TRUE, # Identify reversed items
clean_descriptions = TRUE, # Clean up description text
verbose = TRUE # Show what's being done
)
# Save the standardised database
boilerplate_save(unified_db, data_path = temp_standard, confirm = FALSE, quiet = TRUE)
# Clean up
unlink(temp_standard, recursive = TRUE)
Extracts Scale Information: Identifies and extracts scale details from descriptions
# Before:
description = "Ordinal response: (1 = Strongly Disagree, 7 = Strongly Agree)"
# After:
description = NULL # Removed if only contains scale info
scale_info = "1 = Strongly Disagree, 7 = Strongly Agree"
scale_anchors = c("1 = Strongly Disagree", "7 = Strongly Agree")
Identifies Reversed Items: Detects items marked with (r), (reversed), etc.
# Items with (r) markers are identified
items = list(
  "I have frequent mood swings.",
  "I am relaxed most of the time. (r)",
  "I get upset easily."
)
# Creates: reversed_items = c(2)
Cleans Descriptions: Removes extra whitespace, fixes punctuation
Standardises References: Ensures consistent reference formatting
Ensures Complete Structure: All measures have standard fields
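The scale extraction and reversed-item detection described above can be approximated in a few lines of base R. This is a sketch under simple assumptions (one parenthesised scale per description, markers written as "(r)" or "(reversed)"), not the logic used by boilerplate_standardise_measures().

```r
items <- list(
  "I have frequent mood swings.",
  "I am relaxed most of the time. (r)",
  "I get upset easily."
)

# Positions of items carrying an (r) or (reversed) marker
reversed_items <- grep("\\((r|reversed)\\)", unlist(items), ignore.case = TRUE)

description <- "Ordinal response: (1 = Strongly Disagree, 7 = Strongly Agree)"

# Pull out the parenthesised scale text, then split it into anchors
scale_info <- sub("^.*\\((.*)\\).*$", "\\1", description)
scale_anchors <- trimws(strsplit(scale_info, ",", fixed = TRUE)[[1]])

print(reversed_items)
print(scale_anchors)
```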
Use boilerplate_measures_report() to assess your measures database:
# get a quality overview
boilerplate_measures_report(unified_db$measures)
# Output:
# === Measures Database Quality Report ===
# Total measures: 180
# Complete descriptions: 165 (91.7%)
# With references: 172 (95.6%)
# With items: 180 (100.0%)
# With wave info: 178 (98.9%)
# Already standardised: 180 (100.0%)
# get detailed report as data frame
quality_report <- boilerplate_measures_report(
  unified_db$measures,
  return_report = TRUE
)
# find measures missing information
missing_refs <- quality_report[!quality_report$has_reference, ]
missing_desc <- quality_report[!quality_report$has_description, ]
# View specific measure details
View(quality_report)
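For intuition about what such a report contains, the base R sketch below builds a tiny completeness table from a list of measures. The helper has_field() and the toy measures are hypothetical. Only the column names has_description and has_reference are borrowed from the example above, and the real report may include more columns.

```r
# Toy measures list (illustrative content only)
measures <- list(
  anxiety_gad7 = list(
    description = "anxiety was measured using the GAD-7 scale.",
    reference = "spitzer2006"
  ),
  new_measure = list(
    description = "a newly added measure",
    reference = NULL
  )
)

# Hypothetical helper: TRUE when a field exists and holds non-empty text
has_field <- function(measure, field) {
  value <- measure[[field]]
  !is.null(value) && length(value) > 0 && any(nzchar(value))
}

report <- data.frame(
  measure = names(measures),
  has_description = vapply(measures, has_field, logical(1), field = "description"),
  has_reference = vapply(measures, has_field, logical(1), field = "reference"),
  row.names = NULL,
  stringsAsFactors = FALSE
)

missing_reference <- report$measure[!report$has_reference]
print(report)
```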
You can also standardise individual measures or a subset:
# standardise only specific measures
unified_db$measures <- boilerplate_standardise_measures(
  unified_db$measures,
  measure_names = c("anxiety_gad7", "depression_phq9", "self_esteem")
)
# or standardise a single measure
unified_db$measures$anxiety_gad7 <- boilerplate_standardise_measures(
  unified_db$measures$anxiety_gad7
)
After standardisation, the boilerplate_generate_measures() function can better format your measures:
# Generate formatted output with enhanced features
measures_text <- boilerplate_generate_measures(
  variable_heading = "Psychological Measures",
variables = c("self_control", "neuroticism"),
db = unified_db,
table_format = TRUE, # Use table format
sample_items = 3, # Show only 3 items per measure
check_completeness = TRUE, # Note any missing information
quiet = TRUE # Suppress progress messages
)
cat(measures_text)
Example output:
### Psychological Measures
#### Self Control
| Field | Information |
|-------|-------------|
| Description | Self-control was measured using two items [@tangney_high_2004]. |
| Response Scale | 1 = Strongly Disagree, 7 = Strongly Agree |
| Waves | 5-current |
**Items:**
1. In general, I have a lot of self-control
2. I wish I had more self-discipline (r)
*(r) denotes reverse-scored item*
#### Neuroticism
| Field | Information |
|-------|-------------|
| Description | Mini-IPIP6 Neuroticism dimension [@sibley2011]. |
| Response Scale | 1 = Strongly Disagree, 7 = Strongly Agree |
| Waves | 1-current |
**Items:**
1. I have frequent mood swings.
2. I am relaxed most of the time. (r)
3. I get upset easily.
*(1 additional items not shown)*
*(r) denotes reverse-scored item*
- Use boilerplate_export() to create a backup before standardising

The package includes powerful functions for batch editing and cleaning your databases. These are particularly useful when you need to update multiple entries at once or clean up inconsistent formatting.
Use boilerplate_batch_edit() to update specific fields across multiple entries:
# First, ensure you have a database to work with
# Example using a temporary directory:
temp_batch <- file.path(tempdir(), "batch_example")
boilerplate_init(
data_path = temp_batch,
create_dirs = TRUE,
create_empty = FALSE, # FALSE loads example content with actual measures
confirm = FALSE,
quiet = TRUE
)
# Load your database
unified_db <- boilerplate_import(data_path = temp_batch, quiet = TRUE)
# Example 1: Update specific references
unified_db <- boilerplate_batch_edit(
  db = unified_db,
field = "reference",
new_value = "sibley2021",
target_entries = c("anxiety", "depression", "life_satisfaction"),
category = "measures"
)
# Example 2: Update all references containing "_reference"
unified_db <- boilerplate_batch_edit(
  db = unified_db,
field = "reference",
new_value = "sibley2023",
match_pattern = "_reference",
category = "measures"
)
# Example 3: Use wildcards to target groups of entries
unified_db <- boilerplate_batch_edit(
  db = unified_db,
field = "waves",
new_value = "1-15",
target_entries = "alcohol*", # All entries starting with "alcohol"
category = "measures"
)
# Example 4: Update entries with specific values
unified_db <- boilerplate_batch_edit(
  db = unified_db,
field = "reference",
new_value = "sibley2024",
match_values = c("anxiety_reference", "depression_reference"),
category = "measures"
)
Always preview changes before applying them:
# preview what would change
boilerplate_batch_edit(
db = unified_db,
field = "reference",
new_value = "sibley2021",
target_entries = c("ban_hate_speech", "born_nz"),
category = "measures",
preview = TRUE
)
# output shows what would change:
# Preview of changes:
# ℹ ban_hate_speech: "dore2022boundaries" -> "sibley2021"
# ℹ born_nz: "sibley2011" -> "sibley2021"
# ✓ Would update 2 entries
Edit multiple fields in one operation:
# update both reference and waves for specific entries
unified_db <- boilerplate_batch_edit_multi(
  db = unified_db,
  edits = list(
    list(
      field = "reference",
      new_value = "sibley2021",
      target_entries = c("ban_hate_speech", "born_nz")
    ),
    list(
      field = "waves",
      new_value = "1-15",
      target_entries = c("ban_hate_speech", "born_nz")
    )
  ),
  category = "measures"
)
Clean up formatting issues across your database:
# Continue with the unified_db from previous examples
# Example 1: Remove unwanted characters from references
unified_db <- boilerplate_batch_clean(
  db = unified_db,
field = "reference",
remove_chars = c("@", "[", "]"),
category = "measures"
)
# Example 2: Clean all entries EXCEPT specific ones
unified_db <- boilerplate_batch_clean(
  db = unified_db,
field = "reference",
remove_chars = c("_", "[", "]"),
exclude_entries = c("anxiety", "depression"),
category = "measures"
)
# Example 3: Clean with pattern matching and exclusions
unified_db <- boilerplate_batch_clean(
  db = unified_db,
field = "description",
remove_chars = c("(", ")"),
target_entries = "life_*", # All entries starting with "life_"
exclude_entries = "life_events", # Except this one (if it existed)
category = "measures"
)
# Example 4: Multiple cleaning operations
unified_db <- boilerplate_batch_clean(
  db = unified_db,
  field = "description",
  remove_chars = c("(", ")"),
  replace_pairs = list("  " = " "), # Replace double spaces with single
trim_whitespace = TRUE,
collapse_spaces = TRUE,
category = "measures"
)
# Save all the changes made through batch operations
boilerplate_save(unified_db, data_path = temp_batch, confirm = FALSE, quiet = TRUE)
# Clean up
unlink(temp_batch, recursive = TRUE)
Before cleaning, identify which entries contain specific characters:
# Using the same unified_db from previous examples
# Find all entries with problematic characters
entries_to_clean <- boilerplate_find_chars(
  db = unified_db,
field = "reference",
chars = c("@", "[", "]"),
category = "measures"
)
# View the results
print(entries_to_clean)
# Find entries but exclude some from results
entries_to_clean <- boilerplate_find_chars(
  db = unified_db,
field = "reference",
chars = c("@", "[", "]"),
exclude_entries = c("forgiveness", "special_*"),
category = "measures"
)
Here’s a complete workflow for cleaning up reference formatting:
# 1. First, see what needs cleaning
problem_refs <- boilerplate_find_chars(
  db = unified_db,
field = "reference",
chars = c("@", "[", "]", " "),
category = "measures"
)
cat("Found", length(problem_refs), "references that need cleaning\n")
# 2. Preview the cleaning operation
boilerplate_batch_clean(
db = unified_db,
field = "reference",
remove_chars = c("@", "[", "]"),
replace_pairs = list(" " = "_"), # Replace spaces with underscores
trim_whitespace = TRUE,
category = "measures",
preview = TRUE
)
# 3. Apply the cleaning
unified_db <- boilerplate_batch_clean(
  db = unified_db,
field = "reference",
remove_chars = c("@", "[", "]"),
replace_pairs = list(" " = "_"),
trim_whitespace = TRUE,
category = "measures",
confirm = TRUE # Will ask for confirmation
)
# 4. Save the cleaned database
boilerplate_save(unified_db)
Always preview first: Use preview = TRUE to see what will change
Make backups: Export your database before major changes
boilerplate_export(unified_db, output_file = "backup_before_cleaning.rds")
Use exclusions carefully: Some entries might have special formatting requirements
Test on subsets: Try operations on a few entries before applying to all
Document changes: Keep notes about what was changed and why
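The backup advice can also be sketched with base R alone. This is only an illustration under stated assumptions: `toy_db`, the backup directory, and the timestamped file name are hypothetical, and `saveRDS()` stands in for whatever export route you actually use (such as boilerplate_export() shown above).

```r
# Hypothetical toy database; in practice this would be your unified_db.
toy_db <- list(
  measures = list(anxiety = list(reference = "@[spitzer2006]"))
)

# Write the backup to a temporary directory so the sketch leaves no files behind.
backup_dir <- file.path(tempdir(), "boilerplate_backups")
dir.create(backup_dir, showWarnings = FALSE)

# A timestamp in the file name keeps earlier backups from being overwritten.
stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
backup_file <- file.path(backup_dir, paste0("backup_before_cleaning_", stamp, ".rds"))
saveRDS(toy_db, backup_file)

# Confirm the backup round-trips before running any batch operation.
restored <- readRDS(backup_file)
stopifnot(identical(restored, toy_db))
```

Checking that the restored object is identical to the original is a cheap way to catch a failed write before any destructive edit runs.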
Standardising References
# Convert various reference formats to consistent style
unified_db <- boilerplate_batch_clean(
  db = unified_db,
  field = "reference",
  remove_chars = c("@", "[", "]", "(", ")"),
  replace_pairs = list(
    " " = "", # Remove spaces
    "," = "_", # Replace commas
    "&" = "and" # Replace ampersands
  ),
  category = "measures"
)
Updating Wave Information
# Update all measures from a specific wave range
unified_db <- boilerplate_batch_edit(
  db = unified_db,
  field = "waves",
  new_value = "1-16",
  match_values = c("1-15", "1-current"),
  category = "measures"
)
Fixing Description Formatting
# Clean up description formatting issues
unified_db <- boilerplate_batch_clean(
  db = unified_db,
  field = "description",
  replace_pairs = list(
    ".." = ".", # Fix double periods
    " ." = ".", # Fix space before period
    "  " = " " # Fix double spaces
  ),
  trim_whitespace = TRUE,
  category = "measures"
)
These batch operations make it easy to maintain consistency across your entire database, especially when dealing with legacy data or contributions from multiple sources.
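The cleaning options used above (removing characters, applying replacement pairs, trimming and collapsing whitespace) can be illustrated on a single string with base R. This is a rough sketch, not the package's code: `clean_text_sketch()` is a hypothetical helper, and the order in which it applies each step is an assumption.

```r
# Hypothetical helper: apply fixed-string clean-up steps to a character vector.
clean_text_sketch <- function(x,
                              remove_chars = character(),
                              replace_pairs = list(),
                              trim_whitespace = TRUE,
                              collapse_spaces = FALSE) {
  # 1. Drop unwanted characters (literal matching, so "[" needs no escaping).
  for (ch in remove_chars) {
    x <- gsub(ch, "", x, fixed = TRUE)
  }
  # 2. Apply replacement pairs in the order they are listed.
  for (from in names(replace_pairs)) {
    x <- gsub(from, replace_pairs[[from]], x, fixed = TRUE)
  }
  # 3. Optionally squeeze any run of spaces down to one.
  if (collapse_spaces) x <- gsub(" +", " ", x)
  # 4. Optionally strip leading and trailing whitespace.
  if (trim_whitespace) x <- trimws(x)
  x
}

clean_ref <- clean_text_sketch(
  "@[rice_short 2014]",
  remove_chars = c("@", "[", "]"),
  replace_pairs = list(" " = "_")
)

clean_desc <- clean_text_sketch(
  "  The scale has  three items .. ",
  replace_pairs = list(".." = ".", " ." = ".", "  " = " ")
)

print(clean_ref)
print(clean_desc)
```

One design point the sketch makes visible: a single pass that replaces two spaces with one still leaves a double space where three stood, which is why a separate collapse step (here `collapse_spaces = TRUE`) is useful for messy legacy text.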
The package supports appendix content that can be managed within the unified database:
# import the unified database
unified_db <- boilerplate_import()
# add detailed measures documentation to appendix
unified_db$appendix$detailed_measures <- "# Detailed Measures Documentation\n\n## Overview\n\nThis appendix provides comprehensive documentation for all measures used in this study, including full item text, response options, and psychometric properties.\n\n## {{exposure_var}} Measure\n\n{{exposure_details}}\n\n## Outcome Measures\n\n{{outcome_details}}"
# save the changes to the unified database
boilerplate_save(unified_db)
# generate appendix text with variable substitution
appendix_text <- boilerplate_generate_text(
  category = "appendix",
  sections = c("detailed_measures"),
  global_vars = list(
    exposure_var = "Perfectionism",
    exposure_details = "The perfectionism measure consists of 3 items...",
    outcome_details = "Anxiety was measured using the GAD-7 scale..."
  ),
  db = unified_db # pass the unified database
)
cat(appendix_text)
You can create complete workflows that integrate methods, results, and templates using the unified database:
# import the unified database
unified_db <- boilerplate_import()
# function to generate a complete document from a template
generate_document <- function(template_name, study_params, section_contents, db) {
  # extract the template using the boilerplate_template helper
  template_text <- boilerplate_template(db, template_name)
  # apply template variables (combining study params and section contents)
  all_vars <- c(study_params, section_contents)
  # replace placeholders in template
  for (var_name in names(all_vars)) {
    placeholder <- paste0("{{", var_name, "}}")
    template_text <- gsub(placeholder, all_vars[[var_name]], template_text, fixed = TRUE)
  }
  return(template_text)
}
# define study parameters
study_params <- list(
  title = "Political Orientation and Social Wellbeing in New Zealand",
  authors = "Jane Smith, John Doe, and Robert Johnson",
  date = format(Sys.Date(), "%B %d, %Y")
)
# define section contents
section_contents <- list(
  abstract = "This study investigates the causal effects of political orientation on social wellbeing using data from the New Zealand Attitudes and Values Study.",
  introduction = "Understanding the causal effects of political orientation on wellbeing has important implications for social policy and public health...",
  methods_sample = "Participants were recruited from university students during 2020-2021.",
  methods_measures = "Political orientation was measured using a 7-point scale...",
  methods_statistical = "We used the LMTP estimator to address confounding...",
  results = "Our analysis revealed reliable causal effects of political conservatism on social wellbeing...",
  discussion = "These findings suggest that political orientation may causally influence wellbeing along the following dimensions..."
)
# generate the document
journal_article <- generate_document(
  template_name = "journal_article",
  study_params = study_params,
  section_contents = section_contents,
  db = unified_db
)
cat(substr(journal_article, 1, 2500), "...")
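When filling a template by hand like this, it can help to check which {{variable}} placeholders are still unfilled before using the output. The following base R sketch shows one way to do that; `find_placeholders()` and the short example template are hypothetical and not part of the package.

```r
# Hypothetical helper: list the {{placeholders}} still present in a text.
find_placeholders <- function(text) {
  matches <- regmatches(text, gregexpr("\\{\\{[^{}]+\\}\\}", text))[[1]]
  # Strip the surrounding braces and drop duplicates.
  unique(gsub("^\\{\\{|\\}\\}$", "", matches))
}

template_text <- "# {{title}}\n\n**Authors**: {{authors}}\n\n## Results\n{{results}}"
vars <- list(title = "Example Study", authors = "A. Author")

# Same fixed-string substitution loop as in generate_document() above.
filled <- template_text
for (var_name in names(vars)) {
  placeholder <- paste0("{{", var_name, "}}")
  filled <- gsub(placeholder, vars[[var_name]], filled, fixed = TRUE)
}

before <- find_placeholders(template_text)
after <- find_placeholders(filled)
print(before)
print(after)
```

Here `after` still contains "results" because no value was supplied for it, which is the kind of gap worth catching before a document is rendered.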
You can create tailored reports for different audiences from the same underlying data:
# import the unified database
unified_db <- boilerplate_import()
# add audience-specific LMTP descriptions
unified_db$methods$statistical_estimator$lmtp$technical_audience <- "We estimate causal effects using the Longitudinal Modified Treatment Policy (LMTP) estimator within a Targeted Minimum Loss-based Estimation (TMLE) framework. This semi-parametric estimator leverages the efficient influence function (EIF) to achieve double robustness and asymptotic efficiency."
unified_db$methods$statistical_estimator$lmtp$applied_audience <- "We estimate causal effects using the LMTP estimator. This approach combines machine learning with causal inference methods to estimate treatment effects while avoiding strict parametric assumptions."
unified_db$methods$statistical_estimator$lmtp$general_audience <- "We used advanced statistical methods that account for multiple factors that might influence both {{exposure_var}} and {{outcome_var}}. This method helps us distinguish between mere association and actual causal effects."
# save the updated unified database
boilerplate_save(unified_db)
# function to generate methods text for different audiences
generate_methods_by_audience <- function(audience = c("technical", "applied", "general"), db) {
  audience <- match.arg(audience)
  # select appropriate paths based on audience
  lmtp_path <- paste0("statistical_estimator.lmtp.", audience, "_audience")
  # generate text
  boilerplate_generate_text(
    category = "methods",
    sections = c("sample.default", lmtp_path),
    global_vars = list(
      exposure_var = "political_conservative",
      outcome_var = "social_wellbeing"
    ),
    db = db
  )
}
# generate reports for different audiences
technical_report <- generate_methods_by_audience("technical", unified_db)
applied_report <- generate_methods_by_audience("applied", unified_db)
general_report <- generate_methods_by_audience("general", unified_db)
cat("General audience report:\n\n", general_report)
The unified database approach includes several helper functions to extract specific categories:
# import the unified database
unified_db <- boilerplate_import()
# extract specific categories using helper functions
methods_db <- boilerplate_methods(unified_db)
measures_db <- boilerplate_measures(unified_db)
results_db <- boilerplate_results(unified_db)
discussion_db <- boilerplate_discussion(unified_db)
appendix_db <- boilerplate_appendix(unified_db)
template_db <- boilerplate_template(unified_db)
# extract specific items using dot notation
lmtp_method <- boilerplate_methods(unified_db, "statistical.longitudinal.lmtp")
anxiety_measure <- boilerplate_measures(unified_db, "anxiety_gad7")
main_result <- boilerplate_results(unified_db, "main_effect")
# you can also directly access via the list structure
causal_assumptions <- unified_db$methods$causal_assumptions$identification
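Dot notation and direct list access reach the same place because the database is a nested list. As a minimal illustration of how a dot-separated path maps onto that structure, here is a base R sketch; `get_by_path()` and the toy `db` are hypothetical and only mirror the idea, not the package's internals.

```r
# Toy nested database; structure and text are illustrative only.
db <- list(
  methods = list(
    statistical = list(
      longitudinal = list(lmtp = "We used the LMTP estimator.")
    )
  )
)

# Hypothetical helper: walk a nested list along a dot-separated path.
get_by_path <- function(x, path) {
  keys <- strsplit(path, ".", fixed = TRUE)[[1]]
  for (key in keys) {
    # Stop with a clear message as soon as a path component is missing.
    if (!is.list(x) || is.null(x[[key]])) {
      stop("Path not found at '", key, "' in '", path, "'")
    }
    x <- x[[key]]
  }
  x
}

via_path <- get_by_path(db$methods, "statistical.longitudinal.lmtp")
via_list <- db$methods$statistical$longitudinal$lmtp
print(identical(via_path, via_list))
```

Failing loudly on a missing component is a deliberate choice in this sketch: plain `$` access on a list silently returns NULL for a mistyped name, which is easy to miss in a long path.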
The package supports document templates that can be used to create complete documents with placeholders for dynamic content:
# import unified database
unified_db <- boilerplate_import()
# add a custom conference abstract template
unified_db$template$conference_abstract <- "# {{title}}\n\n**Authors**: {{authors}}\n\n## Background\n{{background}}\n\n## Methods\n{{methods}}\n\n## Results\n{{results}}"
# save the updated unified database
boilerplate_save(unified_db)
# generate a document from template with variables
abstract_text <- boilerplate_generate_text(
  category = "template",
  sections = "conference_abstract",
  global_vars = list(
    title = "Effect of Political Orientation on Well-being",
    authors = "Smith, J., Jones, A.",
    background = "Previous research has shown mixed findings...",
    methods = "We used data from a longitudinal study (N=47,000)...",
    results = "We found significant positive effects..."
  ),
  db = unified_db
)
cat(abstract_text)
This example demonstrates combining multiple components to create a complete methods section using the unified database approach:
# initialise all databases and import them
boilerplate_init(create_dirs = TRUE, confirm = TRUE)
unified_db <- boilerplate_import()
# add perfectionism measure to the unified database
unified_db$measures$perfectionism <- list(
  name = "perfectionism",
  description = "Perfectionism was measured using a 3-item scale assessing maladaptive perfectionism tendencies.",
  reference = "rice_short_2014",
  waves = "10-current",
  keywords = c("personality", "mental health"),
  items = list(
    "Doing my best never seems to be enough.",
    "My performance rarely measures up to my standards.",
    "I am hardly ever satisfied with my performance."
  )
)
# save the updated unified database
boilerplate_save(unified_db)
# define parameters
study_params <- list(
  exposure_var = "perfectionism",
  population = "New Zealand Residents Enrolled in Electoral Roll in 2021",
  timeframe = "2021-2025",
  sampling_method = "convenience"
)
# generate methods text for participant selection
sample_text <- boilerplate_generate_text(
  category = "methods",
  sections = c("sample_selection"),
  global_vars = study_params,
  add_headings = TRUE,
  heading_level = "###",
  db = unified_db
)
cat(sample_text)
# generate measures text for exposure variable
exposure_text <- boilerplate_generate_measures(
  variable_heading = "Exposure Variable",
  variables = "perfectionism",
  heading_level = 3,
  subheading_level = 4,
  print_waves = TRUE,
  db = unified_db
)
cat(exposure_text)
To cite the boilerplate package in publications, please use:
Bulbulia, J. (2025). boilerplate: Tools for Managing and Generating Standardised Text for Scientific Reports. R package version 1.2.0 https://doi.org/10.5281/zenodo.13370825
A BibTeX entry for LaTeX users:
@software{bulbulia_boilerplate_2025,
author = {Bulbulia, Joseph},
title = {{boilerplate: Tools for Managing and Generating
Standardised Text for Scientific Reports}},
year = 2025,
publisher = {Zenodo},
version = {1.3.0},
doi = {10.5281/zenodo.13370825},
url = {https://github.com/go-bayes/boilerplate}
}
MIT © Joseph Bulbulia
For specific workflows:
- JSON support: See vignette("boilerplate-json-workflow")
- Quarto integration: See vignette("boilerplate-quarto-workflow")
- Getting started: See vignette("boilerplate-intro")
### Example Files
The package includes example files in the inst/ directory:
- Quarto example: system.file("examples", "minimal-quarto-example.qmd", package = "boilerplate")
- JSON workflows: See files in system.file("examples/json-examples", package = "boilerplate")
- Example data: CSV and JSON examples in system.file("extdata", package = "boilerplate")
The boilerplate package is under active development. Here’s our planned roadmap for upcoming features:
Enhanced Documentation and Examples (v1.3.1)
- Comprehensive example testing framework
- Enhanced vignette coverage for all workflows
- Improved error messages with helpful suggestions
- Video tutorials for common workflows

Enhanced Type Safety (v1.4.x)
- Implementation of S3 classes for all database objects
- Custom print methods for cleaner output
- Validation methods for database integrity
- Better IDE support with autocompletion

Modern R Infrastructure (v2.0)
- Migration to S7 object system (once stable)
- Performance optimizations
- Extended validation framework
- Advanced project management features

Our development follows these principles:
- Backward compatibility: No breaking changes without a major version bump
- User-first design: Features driven by real research needs
- Type safety: Progressive enhancement of type checking
- Modern R practices: Adoption of new standards as they mature
We welcome feedback and contributions! Please see our contribution guidelines for more information.