This function conditionally splits a code into multiple codes. Note that you
may want to use recode_addChildCodes() instead, so that the original coding
is not lost.
recode_split(
  input,
  codes,
  splitToCodes,
  filter = TRUE,
  output = NULL,
  filenameRegex = ".*",
  outputPrefix = "",
  outputSuffix = "_recoded",
  decisionLabel = NULL,
  justification = NULL,
  justificationFile = NULL,
  preventOverwriting = rock::opts$get("preventOverwriting"),
  encoding = rock::opts$get("encoding"),
  silent = rock::opts$get("silent")
)
input: One of 1) a character string specifying the path to a file with a
source; 2) an object with a loaded source as produced by a call to
load_source(); 3) a character string specifying the path to a directory
containing one or more sources; or 4) an object with a list of loaded
sources as produced by a call to load_sources().
codes: A single character value with the code to split.
splitToCodes: A named list specifying when to split to which new code. Each
element of this list is a filtering criterion that will be passed on to
get_source_filter() to create the actual filter that will be applied. The
name of each element is the code that will be applied to utterances matching
that filter. When calling recode_split() for a single source, instead of
passing the filtering criterion, it is also possible to pass a filter (i.e.
the result of the call to get_source_filter()), which allows more
fine-grained control. Note that these split filters and the corresponding
codes are processed sequentially in the order specified in splitToCodes. This
means that once an utterance that was coded with codes has been matched to
one of these 'split filters' (and so, recoded with the corresponding 'split
code', i.e., with the name of that split filter in splitToCodes), it will not
be recoded again even if it also matches with other split filters down the
line. Any utterances coded with the code to split up (i.e. specified in
codes) that do not match with any of the split filters specified as the
splitToCodes elements will not be recoded and so remain coded with codes. To
create a catch-all ('else') category, pass ".*" or TRUE as a filter (see the
example).
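The sequential, first-match-wins behaviour described above can be illustrated with a small base R sketch. This is not how rock implements the splitting internally: the utterances are made up, and grepl() stands in for the filters that get_source_filter() would create. It only shows why the order of the elements in splitToCodes matters.

```r
# Hypothetical utterances, all assumed to carry the code that is being split
utterances <- c(
  "printing and typesetting",
  "specimen book",
  "a book and a pen",
  "still in their infancy"
);

# Split filters in processing order; TRUE acts as the catch-all ('else')
splitToCodes <- list(
  and_REPLACED  = " and ",
  book_REPLACED = "book",
  else_REPLACED = TRUE
);

# NA marks utterances that no split filter has claimed yet
newCodes <- rep(NA_character_, length(utterances));

for (newCode in names(splitToCodes)) {
  criterion <- splitToCodes[[newCode]];
  if (isTRUE(criterion)) {
    matches <- rep(TRUE, length(utterances));
  } else {
    # Stand-in for the real filter: a plain regular expression match
    matches <- grepl(criterion, utterances);
  }
  # Only recode utterances not yet claimed by an earlier split filter
  newCodes[matches & is.na(newCodes)] <- newCode;
}

names(newCodes) <- utterances;
print(newCodes);
```

In this sketch, "a book and a pen" matches both the " and " and the "book" criteria, but it ends up with and_REPLACED because that filter comes first. Dropping the else_REPLACED element would leave the last utterance as NA, which corresponds to it keeping its original code.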
filter: Optionally, a filter to apply to specify a subset of the source(s) to
process (see get_source_filter()).
output: If specified, the recoded source(s) will be written here.
filenameRegex: Only process files matching this regular expression.
outputPrefix, outputSuffix: The prefix and suffix to add to the filenames
when writing the processed files to disk, in case multiple sources are passed
as input.
decisionLabel: A description of the (recoding) decision that was taken.
justification: The justification for this action.
justificationFile: If specified, the justification is appended to this file.
If not, it is saved to the justifier::workspace(). This can then be saved or
displayed at the end of the R Markdown file or R script using
justifier::save_workspace().
preventOverwriting: Whether to prevent overwriting existing files when
writing the files to output.
encoding: The encoding to use.
silent: Whether to be quiet (TRUE) or chatty (FALSE).
Invisibly, the changed source(s) or source(s) object.
### Get path to example source
examplePath <-
  system.file("extdata", package="rock");
### Get a path to one example file
exampleFile <-
  file.path(examplePath, "example-1.rock");
### Load example source
loadedExample <- rock::load_source(exampleFile);
### Split a code into three codes, showing progress
recoded_source <-
  rock::recode_split(
    loadedExample,
    codes="childCode1",
    splitToCodes = list(
      and_REPLACED = " and ",
      book_REPLACED = "book",
      else_REPLACED = TRUE
    ),
    silent=FALSE
  );
#> Creating 3 source filters.
#> Splitting filtered/matching occurrences of code 'childCode1' into 'and_REPLACED', 'book_REPLACED' & 'else_REPLACED'.
#> Using regular expression '(\[\[|>)childCode1(\]\]|>)'.
#>
#> Out of the 132 utterances in the provided source, 8 match both the general filter and the split filter for 'and_REPLACED' and have not yet been matched by a previous split filter. Of these, 2 have been coded with code 'childCode1' and will now be coded with code 'and_REPLACED'.
#> --------PRE: Lorem Ipsum is simply dummy text of the printing and typesetting industry. [[parentCode1>childCode1]]
#> POST: Lorem Ipsum is simply dummy text of the printing and typesetting industry. [[parentCode1>and_REPLACED]]
#> --------PRE: by accident, sometimes on purpose (injected humour and the like). [[parentCode1>childCode1>grandchildCode3]]
#> POST: by accident, sometimes on purpose (injected humour and the like). [[parentCode1>and_REPLACED>grandchildCode3]]
#>
#> Out of the 132 utterances in the provided source, 2 match both the general filter and the split filter for 'book_REPLACED' and have not yet been matched by a previous split filter. Of these, 1 have been coded with code 'childCode1' and will now be coded with code 'book_REPLACED'.
#> --------PRE: ~specimen book. [[parentCode1>childCode2]] [[childCode1]] [[intensity||2]]
#> POST: ~specimen book. [[parentCode1>childCode2]] [[book_REPLACED]] [[intensity||2]]
#>
#> Out of the 132 utterances in the provided source, 132 match both the general filter and the split filter for 'else_REPLACED' and have not yet been matched by a previous split filter. Of these, 3 have been coded with code 'childCode1' and will now be coded with code 'else_REPLACED'.
#> --------PRE: using 'Content here, content here', making it look like readable English. [[parentCode1>childCode1>grandchildCode1]]
#> POST: using 'Content here, content here', making it look like readable English. [[parentCode1>else_REPLACED>grandchildCode1]]
#> --------PRE: ~still in their infancy. [[parentCode1>childCode1>grandchildCode2]]
#> POST: ~still in their infancy. [[parentCode1>else_REPLACED>grandchildCode2]]
#> --------PRE: accompanied by English versions from the 1914 translation by H. Rackham. [[childCode1>grandchildCode2]]
#> POST: accompanied by English versions from the 1914 translation by H. Rackham. [[else_REPLACED>grandchildCode2]]
#>
#> Split 6 instances of code 'childCode1' into 'and_REPLACED', 'book_REPLACED' & 'else_REPLACED'.
#>
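The regular expression reported in the output above only matches the code when it is delimited by "[[" or ">" on the left and by "]]" or ">" on the right, so it also matches inside hierarchical codes while skipping codes that merely start with the same characters. The following is a base R sketch of that matching using gsub(). It is an illustration under stated assumptions, not the package's internal implementation. The utterances are shortened from the example output, and the last one (with childCode10) is made up.

```r
# The regular expression as printed in the example output
codeRegex <- "(\\[\\[|>)childCode1(\\]\\]|>)";

utterances <- c(
  "printing and typesetting industry. [[parentCode1>childCode1]]",
  "(injected humour and the like). [[parentCode1>childCode1>grandchildCode3]]",
  "~specimen book. [[parentCode1>childCode2]] [[childCode1]] [[intensity||2]]",
  "hypothetical utterance with a similar code. [[childCode10]]"
);

# Replace the code while keeping the delimiters captured on both sides
recoded <- gsub(codeRegex, "\\1and_REPLACED\\2", utterances);

print(recoded);
```

The first three utterances have childCode1 replaced wherever it appears in the code path, childCode2 is left alone, and the last utterance is unchanged because childCode10 is not followed by "]]" or ">".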