R/collapse_occurrences.R
collapse_occurrences.Rd
This function collapses all occurrences into groups sharing the same
identifier, by default the stanzaId identifier ([[sid=..]]).
collapse_occurrences(
  parsedSource,
  collapseBy = "stanzaId",
  columns = NULL,
  logical = FALSE
)
parsedSource
The parsed sources as provided by parse_source().
collapseBy
The column in the sourceDf (in the parsedSource object) to collapse by
(i.e. the column specifying the groups to collapse).
columns
The columns to collapse; if unspecified (i.e. NULL), all codes stored in
the code object in the codings object in the parsedSource object are taken
(i.e. all used codes in the parsedSource object).
logical
Whether to return the counts of the occurrences (FALSE) or simply whether
any code occurred in the group at all (TRUE).
A dataframe with one row for each value of collapseBy, and columns for
collapseBy and each of the columns. The cells contain the counts (if
logical is FALSE) or TRUE or FALSE (if logical is TRUE).
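The difference between the two return modes can be sketched in base R. This is an illustration of the idea only, not the package's implementation: the small sourceDf below, its codeX and codeY columns, and the use of stats::aggregate() are assumptions made for the example, on the premise that each code is stored as a 0/1 indicator column.

```r
# Hypothetical stand-in for the sourceDf in a parsedSource object:
# one row per utterance, a grouping column (stanzaId), and one 0/1
# indicator column per code. codeX and codeY are made-up code names.
sourceDf <- data.frame(
  stanzaId = c("a", "a", "b", "b", "b"),
  codeX    = c(1, 1, 0, 0, 0),
  codeY    = c(0, 1, 0, 1, 0)
)

codeColumns <- c("codeX", "codeY")

# Counts per group, analogous to logical = FALSE:
# sum the indicator columns within each stanzaId.
counts <- stats::aggregate(
  sourceDf[, codeColumns],
  by  = sourceDf[, "stanzaId", drop = FALSE],
  FUN = sum
)

# Occurrence per group, analogous to logical = TRUE:
# TRUE if the code occurred at least once within the stanzaId.
occurred <- stats::aggregate(
  sourceDf[, codeColumns],
  by  = sourceDf[, "stanzaId", drop = FALSE],
  FUN = function(x) any(x > 0)
)

print(counts)
print(occurred)
```

Both results have one row per stanzaId value and one column per code, which matches the shape described above; only the cell contents differ.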
### Get path to example source
exampleFile <-
  system.file("extdata", "example-1.rock", package="rock");
### Parse example source
parsedExample <-
  rock::parse_source(exampleFile);
### Collapse by a code column (the code either occurs or not);
### counts are returned because 'logical' defaults to FALSE:
collapsedExample <-
  rock::collapse_occurrences(parsedExample,
                             collapseBy = 'childCode1');
### Show result: only two rows left after collapsing,
### because 'childCode1' is either 0 or 1:
collapsedExample;
#>   childCode1 childCode1 childCode2 childCode3 childCode4 childCode5
#> 1          0          0          0          1          1          1
#> 2          1          2          1          0          0          0
#>   grandchildCode1 grandchildCode2 grandchildCode3 grandchildCode4
#> 1               1               3               1               1
#> 2               0               0               0               0
#>   grandchildCode5 grandchildCode6 grandchildCode7 someOtherCode
#> 1               1               1               1             1
#> 2               0               0               0             0
### Collapse using weights (i.e. count codes in each segment):
collapsedExample <-
  rock::collapse_occurrences(parsedExample,
                             collapseBy = 'childCode1',
                             logical=FALSE);