This function conditionally splits a code into multiple codes. Note that you may want to use recode_addChildCodes() instead, so that the original coding is not lost.
Usage
recode_split(
  input,
  codes,
  splitToCodes,
  filter = TRUE,
  output = NULL,
  filenameRegex = ".*",
  outputPrefix = "",
  outputSuffix = "_recoded",
  decisionLabel = NULL,
  justification = NULL,
  justificationFile = NULL,
  preventOverwriting = rock::opts$get("preventOverwriting"),
  encoding = rock::opts$get("encoding"),
  silent = rock::opts$get("silent")
)
Arguments
- input
One of: 1) a character string specifying the path to a file with a source; 2) an object with a loaded source as produced by a call to load_source(); 3) a character string specifying the path to a directory containing one or more sources; or 4) an object with a list of loaded sources as produced by a call to load_sources().
- codes
A single character value with the code to split.
- splitToCodes
A named list specifying when to split to which new code. Each element of this list is a filtering criterion that will be passed on to get_source_filter() to create the actual filter that will be applied. The name of each element is the code that will be applied to utterances matching that filter. When calling recode_split() for a single source, instead of passing the filtering criterion, it is also possible to pass a filter (i.e. the result of the call to get_source_filter()), which allows more fine-grained control. Note that these split filters and the corresponding codes are processed sequentially in the order specified in splitToCodes. This means that once an utterance that was coded with codes has been matched to one of these 'split filters' (and so recoded with the corresponding 'split code', i.e. with the name of that split filter in splitToCodes), it will not be recoded again, even if it also matches other split filters further down the list. Any utterances coded with the code to split up (i.e. the code specified in codes) that do not match any of the split filters specified as the splitToCodes elements will not be recoded and so remain coded with codes. To create a catch-all ('else') category, pass ".*" or TRUE as a filter (see the example).
- filter
Optionally, a filter specifying the subset of the source(s) to process (see get_source_filter()).
- output
If specified, the recoded source(s) will be written here.
- filenameRegex
Only process files matching this regular expression.
- outputPrefix, outputSuffix
The prefix and suffix to add to the filenames when writing the processed files to disk, in case multiple sources are passed as input.
- decisionLabel
A description of the (recoding) decision that was taken.
- justification
The justification for this action.
- justificationFile
If specified, the justification is appended to this file. If not, it is saved to the justifier::workspace(). This can then be saved or displayed at the end of the R Markdown file or R script using justifier::save_workspace().
- preventOverwriting
Whether to prevent overwriting existing files when writing the files to output.
- encoding
The encoding to use.
- silent
Whether to be chatty or quiet.
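The sequential, first-match-wins processing described for splitToCodes can be illustrated with a small base R sketch. This is not the rock implementation: the utterances vector is made-up data, grepl() is used as a simplified stand-in for the filters that get_source_filter() would create, and the variable names codeToSplit, recoded and alreadyRecoded are hypothetical.

```r
### Made-up utterances (an assumption for illustration; not from the
### rock example sources)
utterances <- c(
  "salt and pepper [[childCode1]]",
  "a specimen book [[childCode1]]",
  "book and pen [[childCode1]]",
  "nothing relevant [[childCode1]]",
  "not coded and ignored"
);

### Split filters, processed in this order; TRUE is the catch-all
splitToCodes <- list(
  and_REPLACED = " and ",
  book_REPLACED = "book",
  else_REPLACED = TRUE
);

codeToSplit <- "childCode1";
oldTag <- paste0("[[", codeToSplit, "]]");

### Which utterances carry the code to split
hasCode <- grepl(oldTag, utterances, fixed = TRUE);

recoded <- utterances;
alreadyRecoded <- rep(FALSE, length(utterances));

for (newCode in names(splitToCodes)) {
  criterion <- splitToCodes[[newCode]];
  ### grepl() stands in for the real source filter
  matchesFilter <-
    if (isTRUE(criterion)) {
      rep(TRUE, length(utterances));
    } else {
      grepl(criterion, utterances);
    }
  ### Only recode utterances that carry the code and were not
  ### recoded by an earlier split filter
  toRecode <- matchesFilter & hasCode & !alreadyRecoded;
  recoded[toRecode] <-
    sub(oldTag, paste0("[[", newCode, "]]"), recoded[toRecode], fixed = TRUE);
  alreadyRecoded <- alreadyRecoded | toRecode;
}

print(recoded);
```

In this sketch, "book and pen" matches both " and " and "book" but ends up coded with and_REPLACED only, because that filter comes first. The uncoded last utterance is left untouched even though it matches a filter.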
Examples
### Get path to example source
examplePath <-
  system.file("extdata", package="rock");

### Get a path to one example file
exampleFile <-
  file.path(examplePath, "example-1.rock");

### Load example source
loadedExample <- rock::load_source(exampleFile);

### Split a code into three codes, showing progress
recoded_source <-
  rock::recode_split(
    loadedExample,
    codes="childCode1",
    splitToCodes = list(
      and_REPLACED = " and ",
      book_REPLACED = "book",
      else_REPLACED = TRUE
    ),
    silent=FALSE
  );
#> Creating 3 source filters.
#> Splitting filtered/matching occurrences of code 'childCode1' into 'and_REPLACED', 'book_REPLACED' & 'else_REPLACED'.
#> Using regular expression '(\[\[|>)childCode1(\]\]|>)'.
#>
#> Out of the 132 utterances in the provided source, 8 match both the general filter and the split filter for 'and_REPLACED' and have not yet been matched by a previous split filter. Of these, 2 have been coded with code 'childCode1' and will now be coded with code 'and_REPLACED'.
#> --------PRE: Lorem Ipsum is simply dummy text of the printing and typesetting industry. [[parentCode1>childCode1]]
#> POST: Lorem Ipsum is simply dummy text of the printing and typesetting industry. [[parentCode1>and_REPLACED]]
#> --------PRE: by accident, sometimes on purpose (injected humour and the like). [[parentCode1>childCode1>grandchildCode3]]
#> POST: by accident, sometimes on purpose (injected humour and the like). [[parentCode1>and_REPLACED>grandchildCode3]]
#>
#> Out of the 132 utterances in the provided source, 2 match both the general filter and the split filter for 'book_REPLACED' and have not yet been matched by a previous split filter. Of these, 1 have been coded with code 'childCode1' and will now be coded with code 'book_REPLACED'.
#> --------PRE: ~specimen book. [[parentCode1>childCode2]] [[childCode1]] [[intensity||2]]
#> POST: ~specimen book. [[parentCode1>childCode2]] [[book_REPLACED]] [[intensity||2]]
#>
#> Out of the 132 utterances in the provided source, 132 match both the general filter and the split filter for 'else_REPLACED' and have not yet been matched by a previous split filter. Of these, 3 have been coded with code 'childCode1' and will now be coded with code 'else_REPLACED'.
#> --------PRE: using 'Content here, content here', making it look like readable English. [[parentCode1>childCode1>grandchildCode1]]
#> POST: using 'Content here, content here', making it look like readable English. [[parentCode1>else_REPLACED>grandchildCode1]]
#> --------PRE: ~still in their infancy. [[parentCode1>childCode1>grandchildCode2]]
#> POST: ~still in their infancy. [[parentCode1>else_REPLACED>grandchildCode2]]
#> --------PRE: accompanied by English versions from the 1914 translation by H. Rackham. [[childCode1>grandchildCode2]]
#> POST: accompanied by English versions from the 1914 translation by H. Rackham. [[else_REPLACED>grandchildCode2]]
#>
#> Split 6 instances of code 'childCode1' into 'and_REPLACED', 'book_REPLACED' & 'else_REPLACED'.
#>