What is a Factor in R?

Zigya Academy

Factors are the R data structure for categorical variables: they store values as a fixed set of levels. Factors can be ordered or unordered, and they are an important class for statistical analysis and plotting.

The function factor() encodes a vector as a factor. If the argument ordered is TRUE, the factor levels are treated as ordered.
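As a minimal sketch of the ordered argument, the example below builds an ordered factor from a made-up vector of shirt sizes (the data and the variable names sizes and sizes.ordered are assumptions for illustration only). With ordered = TRUE, comparisons follow the order given in levels rather than alphabetical order.

```r
# Hypothetical data: shirt sizes with a natural order.
sizes <- c("medium", "small", "large", "small", "medium")

# levels fixes the order of the categories; ordered = TRUE marks them as ordered.
sizes.ordered <- factor(sizes,
                        levels = c("small", "medium", "large"),
                        ordered = TRUE)
print(sizes.ordered)

# Comparisons use the level order: small < medium < large.
smaller_than_large <- sizes.ordered < "large"
print(smaller_than_large)

# max() is defined for ordered factors and returns the highest level present.
print(max(sizes.ordered))
```

For an unordered factor the same comparison is not meaningful, so ordered factors are the safer choice whenever the categories have a natural ranking.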

is.factor, is.ordered, as.factor and as.ordered are the membership and coercion functions for these classes.
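The four functions named above can be tried out as follows. This is only an illustrative sketch; the vector x and its values are made up for the example.

```r
# Hypothetical character vector.
x <- c("low", "high", "low")

# Membership tests on the plain character vector.
print(is.factor(x))    # FALSE: x is still a character vector

# Coerce to an unordered factor.
f <- as.factor(x)
print(is.factor(f))    # TRUE
print(is.ordered(f))   # FALSE: levels exist but carry no order

# Coerce to an ordered factor.
o <- as.ordered(x)
print(is.ordered(o))   # TRUE
print(is.factor(o))    # TRUE: an ordered factor is still a factor
```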

Factors are created with the factor() function, which takes a vector as input.

# let's create a vector
> logical <- c("TRUE", "FALSE", "TRUE", "TRUE", "FALSE")
> logical.factor <- factor(logical)
> logical.factor
[1] TRUE  FALSE TRUE  TRUE  FALSE
Levels: FALSE TRUE

> levels(factor(logical))
[1] "FALSE" "TRUE"

By default, the levels are the unique values of the vector, sorted alphabetically.
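The default alphabetical sorting matters when the values have a different natural order. The sketch below uses a made-up vector of month abbreviations (an assumption for illustration) to show the default levels, and one possible way to keep the order of first appearance instead by passing unique() to the levels argument.

```r
# Hypothetical data: month abbreviations, not in alphabetical order.
months <- c("Mar", "Jan", "Feb", "Jan")

# Default: levels are the unique values, sorted alphabetically.
months.factor <- factor(months)
print(levels(months.factor))    # "Feb" "Jan" "Mar"

# Keep the order in which values first appear in the data.
months.appear <- factor(months, levels = unique(months))
print(levels(months.appear))    # "Mar" "Jan" "Feb"
```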

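Structure of a Factor

We can inspect the structure of a factor with the str() function. Internally, a factor stores one integer code per element, and each code is a position in the vector of levels.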

> str(logical.factor)
Factor w/ 2 levels "FALSE","TRUE": 2 1 2 2 1
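The integers 2 1 2 2 1 in the str() output are positions in the levels vector. As a small sketch, the code below extracts them with as.integer() and maps them back to the original strings (the names codes and recovered are chosen for this example only).

```r
logical <- c("TRUE", "FALSE", "TRUE", "TRUE", "FALSE")
logical.factor <- factor(logical)

# The integer codes stored inside the factor.
codes <- as.integer(logical.factor)
print(codes)                      # 2 1 2 2 1

# Each code indexes into the levels: 1 is "FALSE", 2 is "TRUE".
recovered <- levels(logical.factor)[codes]
print(recovered)

# as.character() performs the same mapping in one step.
print(as.character(logical.factor))
```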

Changing the Order of Levels

We can change the order of levels by applying the factor() function again with the new order of levels.

# Create a vector
> data <- c("East","South","East","North","North","West","West","East","North")

# Create the factors
> data.factor <- factor(data)
> print(data.factor)
[1] East  South East  North North West  West  East  North
Levels: East North South West

# Apply the factor function with required order of the level.
> data.order <- factor(data.factor, levels = c("North", "South", "East", "West"))
> print(data.order)
[1] East  South East  North North West  West  East  North
Levels: North South East West
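The order of levels is more than cosmetic: functions such as table() and sort() follow it. The sketch below reuses the direction data from above to illustrate this (the name counts is introduced only for this example).

```r
data <- c("East", "South", "East", "North", "North", "West", "West", "East", "North")

# Factor with an explicit level order.
data.order <- factor(data, levels = c("North", "South", "East", "West"))

# table() reports the counts in level order, not alphabetical order.
counts <- table(data.order)
print(counts)

# sort() arranges the elements by level order as well.
print(sort(data.order))
```

Note that any value missing from the levels argument would become NA, so the new order should list every value present in the data.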

Generating Factor Levels

We can generate factor levels with the gl() function. It takes two integers as input: the number of levels and the number of times each level is repeated.

Syntax

gl(n, k, labels)

Arguments

  • n is an integer giving the number of levels.
  • k is an integer giving the number of replications.
  • labels is a vector of labels for the resulting factor levels.

> factor_lvl <- gl(2, 3, labels = c("TRUE", "FALSE"))
> factor_lvl
[1] TRUE  TRUE  TRUE  FALSE FALSE FALSE
Levels: TRUE FALSE
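As a further sketch, gl() can produce more than two levels, and its optional length argument (part of base R, although not listed in the syntax above) repeats the pattern up to a given length. The labels and variable names here are made up for illustration.

```r
# Three hypothetical groups, two replications each.
groups <- gl(3, 2, labels = c("low", "mid", "high"))
print(groups)        # low low mid mid high high

# With k = 1 and length = 6 the two levels alternate.
alternating <- gl(2, 1, length = 6, labels = c("A", "B"))
print(alternating)   # A B A B A B
```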

Renaming Factor Levels

Similarly, we can rename the levels of a factor. Pass the existing level names, “TRUE” and “FALSE”, to the ‘levels’ argument and the new names, “T” and “F”, to the ‘labels’ argument. Each label replaces the level in the same position.

> logical <- c("TRUE", "FALSE", "TRUE", "TRUE", "FALSE")
> logical.factor <- factor(logical)

> new_label <- factor(logical.factor, levels = c("TRUE", "FALSE"), labels = c("T", "F"))
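To see the effect of renaming with the labels “T” and “F”, the sketch below prints the result and also shows an alternative that assigns to levels() directly. The variable renamed is introduced only for this example.

```r
logical <- c("TRUE", "FALSE", "TRUE", "TRUE", "FALSE")
logical.factor <- factor(logical)

# levels and labels are matched by position: "TRUE" becomes "T", "FALSE" becomes "F".
new_label <- factor(logical.factor,
                    levels = c("TRUE", "FALSE"),
                    labels = c("T", "F"))
print(new_label)     # values: T F T T F; levels: T F

# Alternative: overwrite the level names in place.
# The existing level order is "FALSE", "TRUE", so the new names follow that order.
renamed <- logical.factor
levels(renamed) <- c("F", "T")
print(renamed)       # values: T F T T F; levels: F T
```

Both approaches give the same values. They differ only in the resulting level order, because the first one also reorders the levels.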

Conclusion

In conclusion, we studied what a factor is, how to create one, what its structure looks like, how to change the order of levels, how to generate factor levels, and how to rename factor levels.

This brings us to the end of this blog. We really appreciate your time.

Hope you liked it.

Do visit our page www.zigya.com/blog for more informative blogs on Data Science.

Keep Reading! Cheers!

Zigya Academy
BEING RELEVANT
