
Expand categorical attribute variables to a series of dichotomous variables
Source: R/expand_attributes.R
expand_attributes.Rd
Usage
expand_attributes(
data,
attributes,
valueLabels = NULL,
prefix = "",
glue = "__",
suffix = "",
falseValue = 0,
trueValue = 1,
valueFirst = TRUE,
append = TRUE
)
Arguments
- data
The data frame, normally the $qdt data frame that exists in the object returned by a call to parse_sources().
- attributes
The name of the attribute(s) to expand.
- valueLabels
The created variables can be given names that differ from the attribute values by passing valueLabels. If only one attribute is specified, pass a named vector for valueLabels; if multiple attributes are specified, pass a named list of named vectors, where the name of each vector corresponds to an attribute passed in attributes. The names of the vector elements must correspond to the values of the attributes (see the example).
- prefix, suffix
The prefix and suffix to add to the names of the variables that are returned.
- glue
The glue used to paste the first and second parts of the composite variable name together.
- falseValue, trueValue
The values to set for rows that, respectively, do not match and do match an attribute value.
- valueFirst
Whether to put the attribute value (TRUE) or the attribute name (FALSE) first in the composite variable names.
- append
Whether to append the columns to the supplied data frame or not.
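How the naming and recoding arguments interact can be illustrated with a minimal base R sketch. This is not the implementation used in rock: expand_sketch() is a hypothetical helper, and it assumes the column name is built as the prefix, then the two name parts joined by the glue, then the suffix. That assumption is consistent with the age_group__youngest column in the example on this page.

```r
# Hypothetical helper, NOT part of rock: dichotomize one categorical column.
# Assumed naming scheme: prefix + (value/attribute joined by glue) + suffix.
expand_sketch <- function(data, attribute, valueLabels = NULL,
                          prefix = "", glue = "__", suffix = "",
                          falseValue = 0, trueValue = 1,
                          valueFirst = TRUE, append = TRUE) {
  values <- sort(unique(data[[attribute]]))

  # Use the attribute values themselves as labels unless others are supplied
  labels <- if (is.null(valueLabels)) {
    stats::setNames(as.character(values), values)
  } else {
    valueLabels[as.character(values)]
  }

  # One column per value: trueValue where the row matches, falseValue otherwise
  res <- lapply(values, function(v) {
    ifelse(data[[attribute]] == v, trueValue, falseValue)
  })

  # valueFirst decides which part of the composite name comes first
  parts <- if (valueFirst) {
    paste0(labels, glue, attribute)
  } else {
    paste0(attribute, glue, labels)
  }
  names(res) <- paste0(prefix, parts, suffix)

  res <- data.frame(res, check.names = FALSE, stringsAsFactors = FALSE)
  if (append) cbind(data, res) else res
}

# Toy data, invented for this sketch
toy <- data.frame(age_group = c("<18", "18-30", "<18", ">60"),
                  stringsAsFactors = FALSE)
labels <- c("<18" = "youngest", "18-30" = "youngish", ">60" = "oldest")

expanded <- expand_sketch(toy, "age_group",
                          valueLabels = labels,
                          valueFirst = FALSE)
names(expanded)
```

In this sketch each row gets trueValue in exactly one of the new columns and falseValue in the others, so the new columns together carry the same information as the original attribute.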
Examples
### Get path to example source
examplePath <-
  system.file("extdata", package="rock");
### Get a path to one example file
exampleFile <-
  file.path(examplePath, "example-1.rock");
### Parse single example source
parsedExample <- rock::parse_source(exampleFile);
### Create a categorical attribute column
parsedExample$qdt$age_group <-
  c(rep(c("<18", "18-30", "31-60", ">60"),
        each=19),
    rep(c("<18", ">60"),
        times = c(3, 4)));
### Expand to four dichotomous (0/1) columns
parsedExample$qdt <-
  rock::expand_attributes(
    parsedExample$qdt,
    "age_group",
    valueLabels =
      c(
        "<18" = "youngest",
        "18-30" = "youngish",
        "31-60" = "oldish",
        ">60" = "oldest"
      ),
    valueFirst = FALSE
  );
### Show some of the result
table(parsedExample$qdt$age_group,
      parsedExample$qdt$age_group__youngest);
#>
#>          0  1
#>   18-30 19  0
#>   31-60 19  0
#>   <18    0 22
#>   >60   23  0
table(parsedExample$qdt$age_group,
      parsedExample$qdt$age_group__oldish);
#>
#>          0  1
#>   18-30 19  0
#>   31-60  0 19
#>   <18   22  0
#>   >60   23  0
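The example above passes a single named vector. For more than one attribute, the Arguments section describes valueLabels as a named list of named vectors. The sketch below only builds and checks such a structure in base R. The gender attribute and its values are hypothetical and do not occur in the example source, and the commented-out call follows the documented interface without having been run.

```r
# Attributes to expand; "gender" is a hypothetical second attribute
attributesToExpand <- c("age_group", "gender")

# One named vector per attribute: element names are the attribute values,
# elements are the labels to use in the new variable names
valueLabels <- list(
  age_group = c("<18" = "youngest", "18-30" = "youngish",
                "31-60" = "oldish", ">60" = "oldest"),
  gender = c("f" = "female", "m" = "male")
)

# Every attribute needs an entry, and every entry must be fully named
stopifnot(all(attributesToExpand %in% names(valueLabels)))
stopifnot(all(vapply(valueLabels,
                     function(x) !is.null(names(x)) && all(names(x) != ""),
                     logical(1))))

# Following the documented interface, the call would then look like this
# (not run, since the example data has no "gender" column):
# rock::expand_attributes(parsedExample$qdt, attributesToExpand,
#                         valueLabels = valueLabels, valueFirst = FALSE)
```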