Factors are R's data type for categorical variables: a factor stores each value as one of a fixed set of levels. Factors can be ordered or unordered, and they are an important class for statistical analysis and for plotting.
The function factor() is used to encode a vector as a factor. If the argument ordered is TRUE, the factor levels are assumed to be ordered.
is.factor(), is.ordered(), as.factor() and as.ordered() are the membership and coercion functions for these classes.
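As a minimal sketch of the ordered argument and of the membership and coercion functions, the example below uses a made-up vector of sizes (the variable names sizes, sizes.ordered and sizes.plain are assumptions for illustration only).

```r
# Hypothetical example data: clothing sizes
sizes <- c("small", "large", "medium", "small")

# Encode as an ordered factor with an explicit level order
sizes.ordered <- factor(sizes,
                        levels = c("small", "medium", "large"),
                        ordered = TRUE)

is.factor(sizes)           # FALSE: a plain character vector
is.factor(sizes.ordered)   # TRUE
is.ordered(sizes.ordered)  # TRUE

# Ordered factors support comparisons that follow the level order
sizes.ordered[1] < sizes.ordered[2]   # TRUE: small < large

# Coercion: as.factor() gives an unordered factor, as.ordered() an ordered one
sizes.plain <- as.factor(sizes)
is.ordered(sizes.plain)               # FALSE
is.ordered(as.ordered(sizes.plain))   # TRUE
```

Note that as.ordered() keeps the existing (alphabetical) level order, so an explicit levels argument is usually the safer choice when the order matters.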
Factors are created by passing a vector to the factor() function.
# let's create a vector
> logical <- c("TRUE", "FALSE", "TRUE", "TRUE", "FALSE")
> logical.factor <- factor(logical)
> logical.factor
[1] TRUE FALSE TRUE TRUE FALSE
Levels: FALSE TRUE
> levels(factor(logical))
[1] "FALSE" "TRUE"
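By default, the levels are the unique values of the input sorted in ascending order, which for character data means alphabetically. That is why FALSE is listed before TRUE above.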
Structure of a Factor
We can inspect the structure of a factor with the str() function. The output lists the levels and the integer codes that R stores internally for each element.
> str(logical.factor)
Factor w/ 2 levels "FALSE","TRUE": 2 1 2 2 1
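The integer codes shown by str() can also be extracted directly. The following sketch reuses the logical vector from above and shows how the codes and the levels together reproduce the original values (codes, level.names and rebuilt are names chosen here for illustration).

```r
logical <- c("TRUE", "FALSE", "TRUE", "TRUE", "FALSE")
logical.factor <- factor(logical)

# Integer codes: the position of each value within levels()
codes <- as.integer(logical.factor)   # 2 1 2 2 1

# The levels, in their stored order
level.names <- levels(logical.factor) # "FALSE" "TRUE"

# Indexing the levels by the codes rebuilds the original character vector
rebuilt <- level.names[codes]
identical(rebuilt, logical)                       # TRUE
identical(as.character(logical.factor), logical)  # TRUE
```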
Changing the Order of Levels
We can change the order of the levels by calling factor() again and passing the desired order in the levels argument.
# Create a vector
> data <- c("East","South","East","North","North","West","West","East","North")
# Create the factors
> data.factor <- factor(data)
> print(data.factor)
[1] East South East North North West West East North
Levels: East North South West
# Apply the factor function with the required order of the levels.
> data.order <- factor(data.factor, levels = c("North", "South", "East", "West"))
> print(data.order)
[1] East South East North North West West East North
Levels: North South East West
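Reordering the levels changes how the factor is stored and summarised, not the data values themselves. The sketch below, which rebuilds the same two factors, checks this by comparing the values, the integer codes and the order in which table() reports the counts.

```r
data <- c("East", "South", "East", "North", "North", "West", "West", "East", "North")
data.factor <- factor(data)
data.order <- factor(data.factor, levels = c("North", "South", "East", "West"))

# The values are unchanged by the reordering
identical(as.character(data.factor), as.character(data.order))   # TRUE

# The underlying integer codes differ, because they index into levels()
as.integer(data.factor)   # 1 3 1 2 2 4 4 1 2
as.integer(data.order)    # 3 2 3 1 1 4 4 3 1

# Summaries such as table() follow the level order
names(table(data.factor)) # "East" "North" "South" "West"
names(table(data.order))  # "North" "South" "East" "West"
```

The same level order is also what plotting functions typically use to arrange categories, which is the usual reason for setting it explicitly.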
Generating Factor Levels
We can also generate factor levels with the gl() function. It takes two integers as input: the number of levels and the number of times each level is repeated.
Syntax
gl(n, k, labels)
Arguments
- n is an integer giving the number of levels.
- k is an integer giving the number of replications.
- labels is a vector of labels for the resulting factor levels.
> factor_lvl <- gl(2, 3, labels = c("TRUE", "FALSE"))
> factor_lvl
[1] TRUE TRUE TRUE FALSE FALSE FALSE
Levels: TRUE FALSE
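Beyond n, k and labels, gl() also accepts length and ordered arguments. The following is a small sketch with invented labels (treatment, cycled and doses are illustrative names) showing a three-level factor, a repeating pattern and an ordered result.

```r
# Three levels, each repeated twice: low low medium medium high high
treatment <- gl(3, 2, labels = c("low", "medium", "high"))
length(treatment)         # 6
levels(treatment)         # "low" "medium" "high"

# With k = 1 and an explicit length, the pattern is recycled: A B A B A B
cycled <- gl(2, 1, length = 6, labels = c("A", "B"))
as.character(cycled)

# ordered = TRUE returns an ordered factor
doses <- gl(3, 2, labels = c("low", "medium", "high"), ordered = TRUE)
is.ordered(doses)         # TRUE
```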
Renaming Factor Levels
We can also rename the levels of a factor. Pass the existing level values ("TRUE" and "FALSE") in the levels argument and the new names ("T" and "F") in the labels argument, in the same order.
> logical <- c("TRUE", "FALSE", "TRUE", "TRUE", "FALSE")
> logical.factor <- factor(logical)
> new_label <- factor(logical.factor, levels = c("TRUE", "FALSE"), labels = c("T", "F"))
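To see the effect of renaming, the sketch below relabels the factor with the "T" and "F" names described above and compares the result with an alternative approach, assigning to levels() directly. The names relabelled and renamed are chosen here for illustration.

```r
logical <- c("TRUE", "FALSE", "TRUE", "TRUE", "FALSE")
logical.factor <- factor(logical)

# Approach 1: levels and labels are matched position by position
relabelled <- factor(logical.factor,
                     levels = c("TRUE", "FALSE"),
                     labels = c("T", "F"))
as.character(relabelled)   # "T" "F" "T" "T" "F"
levels(relabelled)         # "T" "F"

# Approach 2: assign new names to the existing levels.
# The new names must follow the current order of levels(): "FALSE", "TRUE"
renamed <- logical.factor
levels(renamed) <- c("F", "T")
as.character(renamed)      # "T" "F" "T" "T" "F"
```

Both approaches give the same values; they differ only in the resulting level order ("T", "F" versus "F", "T").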
Conclusion
In this post we covered what a factor is, how to create one, how to inspect its structure, how to change the order of its levels, how to generate factor levels with gl(), and how to rename levels.
This brings us to the end of this blog. We really appreciate your time.
Hope you liked it.
Do visit our page www.zigya.com/blog for more informative blogs on Data Science.
Keep Reading! Cheers!
Zigya Academy