Now that you have a bit of understanding of R’s data types and other foundational concepts, let’s look at some functions that will help you do something with them.
The following functions test whether a specific piece of data (x in this case) is of a specific type. You can probably guess what each does.
is.numeric(x)
is.character(x)
is.integer(x)
is.logical(x)
is.list(x)
is.vector(x)
is.na(x)
Here are a few examples:
is.numeric(3)
[1] TRUE
is.numeric("3")
[1] FALSE
is.na(NA)
[1] TRUE
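A few more checks in the same spirit, as a small sketch (the names checks and na_flags are just for illustration). Note that is.numeric() is TRUE for both doubles and integers, while is.integer() is only TRUE when the value is actually stored as an integer, such as 3L.

```r
# Illustrative only: collect several type checks in a named logical vector
checks <- c(
  numeric_3L   = is.numeric(3L),        # TRUE: integers count as numeric
  integer_3    = is.integer(3),         # FALSE: 3 without the L is a double
  integer_3L   = is.integer(3L),        # TRUE
  character_3  = is.character("3"),     # TRUE
  logical_TRUE = is.logical(TRUE),      # TRUE
  list_check   = is.list(list(1, "a"))  # TRUE
)
print(checks)

# is.na() works element by element on a vector
na_flags <- is.na(c(1, NA, 3))
print(na_flags)  # FALSE  TRUE FALSE
```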
Approaching this problem from the other direction, the following two functions tell the user what type or class a certain piece of data x is:
typeof(x)
class(x)
The following table shows the results for both functions when applied to a variety of values.
val | typeof(val) | class(val) |
---|---|---|
3.8 | double | numeric |
3 | double | numeric |
3L | integer | integer |
"hi" | character | character |
TRUE | logical | logical |
NA | logical | logical |
NaN | double | numeric |
That’s enough for the moment. What do these indicate?
typeof(): How the data is stored in memory
class(): The abstract type of the data
You can see, at least from these examples, just how closely these two functions are related. For simple values like these, we generally use them interchangeably.
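For richer objects the two functions can give different answers, which is where the storage versus abstract type distinction becomes visible. The sketch below uses a factor, a data frame, and a date as illustrative choices (none of them appear in the table above).

```r
# A factor is stored as integer codes but behaves as its own class
f <- factor(c("low", "high", "low"))
typeof(f)   # "integer"
class(f)    # "factor"

# A data frame is stored as a list of columns
df <- data.frame(a = 1:2)
typeof(df)  # "list"
class(df)   # "data.frame"

# A Date is stored as a double (days since 1970-01-01)
d <- as.Date("2024-01-15")
typeof(d)   # "double"
class(d)    # "Date"
```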
The words “coerce” and “coercion” are technical computer science words that have to do with changing the data type. It matters for us not in a theoretical sense but because it can be a source of some sneaky errors and R not acting the way that we think it “should”.
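Here is a small sketch of the kind of surprise we mean. The specific cases are illustrative choices: R will sometimes coerce values on its own, without being asked, when different types are combined.

```r
# Mixing types in c() silently coerces everything to character
mixed <- c(1, "2", TRUE)
print(mixed)          # "1"    "2"    "TRUE"
typeof(mixed)         # "character"

# Logical values are coerced to numbers in arithmetic
total <- sum(c(TRUE, TRUE, FALSE))
print(total)          # 2

# Comparing a string with a number coerces the number to a string
same <- "1" == 1
print(same)           # TRUE
```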
Let’s dive into some details.
The following functions and examples are about R’s ability to coerce a value x into a number.
as.integer(x)
as.double(x)
First, let’s take a look at the functions that can help us convert into numbers, as.integer(x) and as.double(x).
Here’s the most basic step. We’re going to ask R to coerce the number 3 into an integer. (Yes, you’re right. This is too straightforward, but we have to start somewhere.)
typeof(3)
[1] "double"
as.integer(3)
[1] 3
It may be surprising that this result shows [1] 3 and not [1] 3L (since 3L is how an integer is written in R), but R does not print the L suffix when it displays integer values in the console.
Just checking:
typeof(as.integer(3))
[1] "integer"
Sure enough, the type of as.integer(3) is “integer” (even though it printed 3 in the R Console).
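Another way to confirm this, sketched with the illustrative variable x: identical() checks both the value and the type, while == only compares values.

```r
x <- as.integer(3)

identical(x, 3L)  # TRUE: same value and same type (integer)
identical(x, 3)   # FALSE: 3 without the L is a double
x == 3            # TRUE: the values are equal even though the types differ
is.integer(x)     # TRUE
```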
This isn’t as straightforward as we would hope, but it is consistent with everything that we have seen so far:
typeof(3): As we showed in the table above, the number 3 is represented as a double in R.
as.integer(3): Coercing 3 to an integer results in the value of 3.
typeof(as.integer(3)): The type of the result of as.integer(3) is an integer, not a double.
Okay, what about coercing the double 3.84 into an integer?
as.integer(3.84)
[1] 3
Coercing a double into an integer drops the decimal portion of the number (.84) so that it becomes just 3. Note that the value is truncated, not rounded.
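To make the truncation concrete, here is a short sketch comparing as.integer() with floor() and round(). The negative value is an added illustration: it shows that as.integer() truncates toward zero rather than always rounding down.

```r
trunc_pos <- as.integer(3.84)          # 3: decimal part dropped
trunc_neg <- as.integer(-3.84)         # -3: truncated toward zero
floored   <- floor(-3.84)              # -4: floor always rounds down
rounded   <- as.integer(round(3.84))   # 4: round first if you want rounding

print(c(trunc_pos, trunc_neg, floored, rounded))
```

If you want the nearest whole number rather than the truncated one, call round() before as.integer().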
Now, we’ll get to a bit more complicated coercion—from a string to an integer. But not just any string, a string that contains an integer! What happens?
as.integer("3")
[1] 3
Pleasantly enough, coercing the string "3" into an integer is successful…and turns it into the integer 3! (What? You’re not super excited and thinking in exclamation points? Come on, jump on the geekmobile with us!)
But is it an integer? Or a double?
typeof(as.integer("3"))
[1] "integer"
Yes! It is an integer!
Now, moving on to a less successful, but understandable, case: the conversion of some arbitrary strings.
as.integer("hi")
[1] NA
as.integer("three")
[1] NA
Both of these return the special value NA (a missing value), indicating that R could not convert the string into an integer. R also issues a warning that NAs were introduced by coercion.
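In practice, this NA result is a handy way to find the entries that failed to convert. The following is a minimal sketch; the vector raw and its contents are made up for illustration, and suppressWarnings() is used only to keep the coercion warning quiet.

```r
# Hypothetical input: some strings hold integers, some do not
raw <- c("3", "hi", "42", "three")

# Coerce everything; failures become NA (warning suppressed on purpose)
nums <- suppressWarnings(as.integer(raw))
print(nums)        # 3 NA 42 NA

# is.na() tells us which entries could not be converted
failed <- raw[is.na(nums)]
print(failed)      # "hi"    "three"
```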
Now, possibly surprisingly, look at what the following returns:
as.integer("3.84")
[1] 3
Would you have thought that this would return what it did? Why did this work? Describe how you think it happened.
Even though there are really only two values here (TRUE and FALSE), it is still possible to coerce into a logical value.
There’s only one function here: as.logical(x).
As you would hope, as.logical() does not change the values of either of the logical values:
as.logical(TRUE)
[1] TRUE
as.logical(FALSE)
[1] FALSE
The above also highlights the difference between is.logical() and as.logical(). The first would have returned TRUE for both because they are both logical values. The second returns its argument converted to a logical value, so as.logical(TRUE) has the value TRUE and as.logical(FALSE) has the value FALSE.
Now, as straightforward as all of that might (or might not!) be, it’s now going to get trickier.
The following table tells how R handles coercing a numeric value into a logical value: zero is FALSE and all other values are TRUE. This convention is common across many programming languages.
Numeric value of x | Value of as.logical(x) |
---|---|
0 | FALSE |
Any non-zero value | TRUE |
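A quick sketch of this rule in action. The particular numbers are arbitrary examples; the missing-value case is an addition not covered by the table.

```r
# Zero becomes FALSE; every other number, negative or fractional, becomes TRUE
from_numbers <- as.logical(c(0, 1, -2, 0.5))
print(from_numbers)   # FALSE  TRUE  TRUE  TRUE

# The same rule applies to integers
from_integer <- as.logical(0L)
print(from_integer)   # FALSE

# A missing number stays missing
from_missing <- as.logical(NA_real_)
print(from_missing)   # NA
```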
It turns out that R has a special way of handling character (string) values when coercing to logical values. Look at this table.
Character value of x | Value of as.logical(x) |
---|---|
"TRUE" , "True" , "true" , "T" | TRUE |
"FALSE" , "False" , "false" , "F" | FALSE |
Any other string | NA |
Here, you can see that R tries as hard as it can to use common sense when coercing strings to logical values. If it sees a standard representation of the words true or false, then it coerces to the appropriate value; otherwise, it indicates that it can’t be done (with an NA).
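The sketch below tries a few strings from each group. The unrecognized strings ("yes", "no", "1", "0", "tRUE") are our own illustrative picks of things that look plausible but are not among the accepted spellings.

```r
# Accepted spellings are converted as expected
recognized <- as.logical(c("TRUE", "true", "T", "False", "F"))
print(recognized)     # TRUE  TRUE  TRUE FALSE FALSE

# Anything else, including odd capitalization, becomes NA
unrecognized <- as.logical(c("yes", "no", "1", "0", "tRUE"))
print(unrecognized)   # NA NA NA NA NA
```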
The following is what you might have considered to be a special case for this:
as.logical("3")
[1] NA
You might think, “Hey, "3" can be coerced to 3 (a double), and 3 is not zero, so the value should be TRUE.” But no! R has no reason to coerce to a double first, because it is coercing directly from a string to a logical. Since "3" is not one of the eight special values above, it coerces to NA.
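If you do want the "number first, then logical" behavior, you have to ask for both steps yourself. A minimal sketch, with illustrative variable names:

```r
# Direct coercion from string to logical: "3" is not a recognized spelling
direct <- as.logical("3")
print(direct)      # NA

# Two explicit steps: string -> number -> logical
two_step <- as.logical(as.numeric("3"))
print(two_step)    # TRUE

# The same two steps turn "0" into FALSE
zero_case <- as.logical(as.numeric("0"))
print(zero_case)   # FALSE
```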