The goal of {charcuterie} is to finally have strings as iterable character vectors.
You can install the published version from CRAN with
install.packages("charcuterie")
You can install the development version with
# install.packages("remotes")
remotes::install_github("jonocarroll/charcuterie")
See this blog post: https://jcarroll.com.au/2024/08/03/charcuterie-what-if-strings-were-iterable-in-r/
Most programming languages seem to treat a string as an array of
characters, but not R, where a “string” is an object of type “character”
which has a length of 1. The number of ‘characters’ in a string is
obtained via nchar(x), but otherwise the individual ‘characters’ comprising the string are rarely exposed.
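A short base R illustration of that point, using only built-in functions (the variable name x is just for the example):

```r
x <- "string"
length(x)  # a single string is a character vector of length 1
#> [1] 1
nchar(x)   # the number of characters inside that one element
#> [1] 6
x[3]       # indexing selects elements, not characters, so this is NA
#> [1] NA
```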
The most common route around this limitation is to split the string into smaller strings, each containing a single character, i.e.
strsplit("string", split = "")
#> [[1]]
#> [1] "s" "t" "r" "i" "n" "g"
which produces a list holding a character vector in which each element is a single character. This is cumbersome to type out, so this package offers a cleaner approach (which does the same thing under the hood)
library(charcuterie)
#>
#> Attaching package: 'charcuterie'
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, union
s <- chars("string")
s
#> [1] "string"
This looks like it did nothing, but that’s the point - it still looks like a “string”. It’s actually a vector, though
unclass(s)
#> [1] "s" "t" "r" "i" "n" "g"
This means you can finally do vector things with it, like reverse it
rev(s)
#> [1] "gnirts"
or sort it
sort(s)
#> [1] "ginrst"
or index into it
s[3]
#> [1] "r"
or count elements
count(chars("strawberry"), "r")
#> [1] 3
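For comparison, here is a rough sketch of the same four operations in base R alone, without the package. It assumes nothing beyond strsplit() and paste(); the variable name v is only for illustration. Each step needs an explicit split and, where a string is wanted back, an explicit collapse.

```r
# Split once into a plain character vector of single characters.
v <- strsplit("string", split = "")[[1]]

paste(rev(v), collapse = "")   # reverse, then glue back together
#> [1] "gnirts"
paste(sort(v), collapse = "")  # sort, then glue back together
#> [1] "ginrst"
v[3]                           # index a single character
#> [1] "r"
sum(strsplit("strawberry", split = "")[[1]] == "r")  # count occurrences
#> [1] 3
```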
{charcuterie} defines S3 methods for a wide range of functions, so that these operations can be performed on a string built from a vector of characters:
[
c
format and print
head and tail
rev
sort
setdiff, union, intersect, and a new except
unique, toupper, and tolower
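To give a feel for how S3 methods like these can make a character vector behave like a string, here is a minimal sketch. It is not the package's actual implementation: the class name my_chars and its constructor are hypothetical, and only format, print, and rev are covered.

```r
# Hypothetical constructor: split a string and tag the result with a class.
my_chars <- function(x) {
  stopifnot(is.character(x), length(x) == 1)
  structure(strsplit(x, split = "")[[1]], class = "my_chars")
}

# format() collapses the characters back into a single string.
format.my_chars <- function(x, ...) {
  paste(unclass(x), collapse = "")
}

# print() shows the collapsed form, so the object still looks like a string.
print.my_chars <- function(x, ...) {
  print(format(x))
  invisible(x)
}

# rev() reverses the underlying vector and keeps the class.
rev.my_chars <- function(x) {
  structure(rev(unclass(x)), class = "my_chars")
}

s2 <- my_chars("string")
format(rev(s2))  # the reversed vector, collapsed for display
#> [1] "gnirts"
length(s2)       # still a vector of six single characters underneath
#> [1] 6
```

Because the class sits on top of an ordinary character vector, any function without a dedicated method falls back to its default vector behaviour.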
For more detailed usage examples, see the vignettes.