This function maps the codes from multiple streams onto a primary stream.
Usage
sync_streams(
  x,
  primaryStream,
  columns = NULL,
  anchorsCol = rock::opts$get("anchorsCol"),
  sourceId = rock::opts$get("sourceId"),
  streamId = rock::opts$get("streamId"),
  prependStreamIdToColName = FALSE,
  appendStreamIdToColName = TRUE,
  sep = " ",
  fill = TRUE,
  paddingValue = NA,
  neverFill = grep("_raw$", names(x$qdt), value = TRUE),
  compressFun = NULL,
  compressFunPart = NULL,
  expandFun = NULL,
  carryOverAnchors = FALSE,
  colNameGlue = rock::opts$get("colNameGlue"),
  silent = rock::opts$get("silent")
)
Arguments
- x
The object with the parsed sources.
- primaryStream
The identifier of the primary stream.
- columns
The names of the column(s) to synchronize.
- anchorsCol
The column containing the anchors.
- sourceId
The column containing the source identifiers.
- streamId
The column containing the stream identifiers.
- prependStreamIdToColName, appendStreamIdToColName
Whether to prepend and/or append the stream identifier to the column names before merging the dataframes together.
- sep
When neither compressFun nor compressFunPart is specified, the paste function is used to combine elements, and in that case, sep is passed to paste as the separator.
- fill
When expanding streams, whether to duplicate elements to fill the resulting vector. Ignored if fillFun is specified.
- paddingValue
The value to insert for rows when not filling (by default, filling carries over the value from the last preceding row that had a value specified).
- neverFill
Columns to never fill, regardless of whether fill is TRUE. Set to NULL to always respect the setting of fill. By default, the raw versions of the class instance identification columns are never duplicated (found with regular expression "_raw$"), since those are used for state transition computations.
- compressFun
If specified, when compressing streams, instead of pasting elements together using separator sep, the vectors are passed to function compressFun, which must accept a vector (to compress) and a single integer (with the desired resulting length of the vector).
- compressFunPart
A function to apply to the segments that are automatically created; this can be passed instead of compressFun.
- expandFun
If specified, when expanding streams, instead of potentially filling the new larger vector with elements (if fill is TRUE), the vectors are passed to function expandFun, which must accept a vector (to expand) and a single integer (with the desired resulting length of the vector).
- carryOverAnchors
Whether to carry over anchors for each source.
- colNameGlue
When appending or prepending stream identifiers, the character(s) to use as "glue" or separator.
- silent
Whether to be silent (TRUE) or chatty (FALSE).
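The contract described for compressFun and expandFun (a vector and a desired length go in, a vector of that length comes out) can be sketched in base R. The names compress_by_max and expand_by_repeat are hypothetical, and the way they segment or repeat elements is an assumption made for illustration; it is not necessarily how sync_streams aligns streams internally.

```r
# Hypothetical compressFun: shorten numeric vector `x` to `len` elements by
# cutting it into `len` consecutive segments and keeping each segment's maximum.
# Assumes len <= length(x).
compress_by_max <- function(x, len) {
  # Assign every element of `x` to one of `len` consecutive segments
  segment <- ceiling(seq_along(x) * len / length(x))
  # Summarise each segment by its maximum and drop the segment names
  unname(vapply(split(x, segment), max, FUN.VALUE = numeric(1)))
}

# Hypothetical expandFun: lengthen `x` to `len` elements by repeating values.
# Assumes len >= length(x).
expand_by_repeat <- function(x, len) {
  # Map each of the `len` target positions back to a source element
  x[ceiling(seq_len(len) * length(x) / len)]
}

compress_by_max(c(0, 1, 0, 0, 1, 1), 3)
#> [1] 1 0 1
expand_by_repeat(c(0, 1), 4)
#> [1] 0 0 1 1
```

Assuming the vector is passed first and the target length second, functions of this shape could be supplied as compressFun = compress_by_max or expandFun = expand_by_repeat.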
Value
The object with parsed sources, x, with the synchronization results added in the $syncResults subobject.
Examples
### Get a directory with example sources
examplePath <-
  file.path(
    system.file(package="rock"),
    'extdata',
    'streams'
  );
### Parse the sources
parsedSources <- rock::parse_sources(
  examplePath
);
### Add a dataframe, syncing all streams to the primary stream
parsedSources <- rock::sync_streams(
  parsedSources,
  primaryStream = "streamA",
  columns = c("Code1", "Code2", "Code3"),
  prependStreamIdToColName = TRUE
);
### Look at two examples
parsedSources$syncResults$qdt[
  ,
  c("streamB_Code3_streamB", "streamC_Code1_streamC")
];
#> streamB_Code3_streamB streamC_Code1_streamC
#> 1 0 0
#> 2 0 0
#> 3 1 0
#> 4 1 0
#> 5 0 0
#> 6 0 1
#> 7 1 1 0 0
#> 8 0 0 0
#> 9 0 0 1 0
#> 10 0 0
#> 11 0 0 1
#> 12 <NA> <NA>
#> 13 0 0
#> 14 0 0 1
#> 15 0 0 0 0
#> 16 0 0
#> 17 0 0
#> 18 0 1
#> 19 0 0 1
#> 20 0 0
#> 21 1 0 1
#> 22 0 0
#> 23 1 0
#> 24 <NA> <NA>
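Rows in the output above that hold more than one value are consistent with the default behaviour described for sep: when no compressFun or compressFunPart is given, elements that end up on the same row are pasted together. A minimal base R sketch of that kind of combination follows; the segments list is made-up illustration data, and collapsing each segment with paste is an assumption about the mechanism, not the package's actual code.

```r
# Made-up segments: each list element holds the values of a secondary
# stream that fall onto one row of the primary stream
segments <- list(c(1, 1), c(0), c(0, 0))

# Combine the values in each segment into a single string, separated by " "
combined <- vapply(segments, paste, character(1), collapse = " ")

combined
#> [1] "1 1" "0"   "0 0"
```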