Press "Enter" to skip to content

What is Coercion in R?

Zigya Acadmey 0

R’s coercion behavior may seem inconvenient, but it is not arbitrary. R always follows the same rules when it coerces data types. Once you are familiar with these rules, you can use R’s coercion behavior to do surprisingly useful things.

So how does R coerce data types? If a character string is present in an atomic vector, R will convert everything else in the vector to character strings. If a vector only contains logical and numbers, R will convert the logical to numbers; every TRUE becomes a 1, and every FALSE becomes a 0, as shown below.

R always uses the same rules to coerce data to a single type. If character strings are present, everything will be coerced to a character string. Otherwise, locals are coerced to numerics.

This arrangement preserves information. It is easy to look at a character string and tell what information it used to contain. For example, you can easily spot the origins of "TRUE" and "5". You can also easily back-transform a vector of 1s and 0s to TRUEs and FALSEs.

R uses the same coercion rules when you try to do math with logical values. So the following code:

sum(c(TRUE, TRUE, FALSE, FALSE))

will become:

sum(c(1, 1, 0, 0))
## 2

This means that sum will count the number of TRUEs in a logical vector (and mean will calculate the proportion of TRUEs). Neat, huh?

You can explicitly ask R to convert data from one type to another with the as functions. R will convert the data whenever there is a sensible way to do so:

as.character(1)
## "1"

as.logical(1)
## TRUE

as.numeric(FALSE)
## 0

You now know how R coerces data types, but this won’t help you save a playing card. To do that, you will need to avoid coercion altogether. You can do this by using a new type of object, a list.

Before we look at lists, let’s address a question that might be on your mind.

Many data sets contain multiple types of information. The inability of vectors, matrices, and arrays to store multiple data types seems like a major limitation. So why bother with them?

In some cases, using only a single type of data is a huge advantage. Vectors, matrices, and arrays make it very easy to do math on large sets of numbers because R knows that it can manipulate each value the same way. Operations with vectors, matrices, and arrays also tend to be fast because the objects are so simple to store in memory.

In other cases, allowing only a single type of data is not a disadvantage. Vectors are the most common data structure in R because they store variables very well. Each value in a variable measures the same property, so there’s no need to use different types of data.

Conclusion

Hence we saw what is coercion in R with the examples.

This brings the end of this Blog. We really appreciate your time.

Hope you liked it.

Do visit our page www.zigya.com/blog for more informative blogs on Data Science

Keep Reading! Cheers!

Zigya Academy
BEING RELEVANT

Leave a Reply

Your email address will not be published. Required fields are marked *