Basic R Structures

Now that you have a bit of understanding of R’s data types and other foundational concepts, let’s look at some functions that will help you do something with them.

1 Testing functions

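The following functions test whether a piece of data (x in this case) is of a specific type. You can probably guess what each does. The last one, is.na(x), is slightly different: it tests whether a value is missing rather than what type it is.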

  • is.numeric(x)
  • is.character(x)
  • is.integer(x)
  • is.logical(x)
  • is.list(x)
  • is.vector(x)
  • is.na(x)

Here are a few examples:

is.numeric(3)
[1] TRUE
is.numeric("3")
[1] FALSE
is.na(NA)
[1] TRUE
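To round out the list above, here is a short sketch that exercises the remaining test functions. The values are our own illustrative choices; the comments show what each call returns.

```r
is.integer(3)         # FALSE: a bare 3 is stored as a double
is.integer(3L)        # TRUE: the L suffix makes an integer
is.character("3")     # TRUE
is.logical(NA)        # TRUE: a plain NA is a logical value
is.vector(c(1, 2))    # TRUE
is.list(c(1, 2))      # FALSE: an atomic vector is not a list
is.list(list(1, 2))   # TRUE
is.na(c(1, NA, 3))    # FALSE TRUE FALSE: tested element by element
```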

2 Type and class functions

Approaching this problem from the other direction, the following two functions report the type or class of a piece of data x:

  • typeof(x)
  • class(x)

The following table shows the results for both functions when applied to a variety of values.

val     typeof(val)   class(val)
3.8     double        numeric
3       double        numeric
3L      integer       integer
"hi"    character     character
TRUE    logical       logical
NA      logical       logical
NaN     double        numeric

That’s enough for the moment. What do these indicate?

  • typeof(): How the data is stored in memory
  • class(): The abstract type of the data

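You can see, at least from these examples, just how closely these two functions are related. For simple values like these, we generally use them interchangeably. Be aware that they can give quite different answers for more complex objects.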
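For a feel of where typeof() and class() part ways, here is a minimal sketch using two objects that are not in the table above. The names df and f are hypothetical, chosen only for this illustration.

```r
# A data frame is stored as a list but presents itself as a data.frame
df <- data.frame(a = 1:2)
typeof(df)   # "list"
class(df)    # "data.frame"

# A factor is stored as integer codes but presents itself as a factor
f <- factor(c("lo", "hi"))
typeof(f)    # "integer"
class(f)     # "factor"
```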

3 “Coercion” functions

The words “coerce” and “coercion” are technical computer science terms for converting a value from one data type to another. This matters to us not in a theoretical sense but because coercion can be a source of some sneaky errors and of R not acting the way that we think it “should”.

Let’s dive into some details.

3.1 Into a number

The following functions and examples are about R’s abilities to coerce a value x into a number.

3.1.1 The functions

  • as.integer(x)
  • as.double(x)

3.1.2 Examples

First, let’s take a look at the functions that can help us convert into numbers, as.integer(x) and as.double(x).

Here’s the most basic step. We’re going to ask R to coerce the number 3 into an integer. (Yes, you’re right. This is too straightforward, but we have to start somewhere.)

typeof(3)
[1] "double"
as.integer(3)
[1] 3

It’s surprising to us that this result shows [1] 3 and not [1] 3L (since 3L is an integer in R)…but that’s what it does.

Just checking:

typeof(as.integer(3))
[1] "integer"

Sure enough, the type of as.integer(3) is “integer” (even though it printed 3 in the R Console).

This isn’t as straightforward as we would hope, but it is consistent with everything that we have seen so far:

  • typeof(3): As we showed in the table above, the number 3 is represented as a double in R.
  • as.integer(3): Coercing 3 to an integer results in the value of 3.
  • typeof(as.integer(3)): The type of the result of as.integer(3) is an integer, not a double.
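One way to convince yourself that the printed 3 really is the integer 3L is to compare the values directly. This is only a sketch using base R's identical(), which checks type as well as value, and ==, which checks value only.

```r
identical(as.integer(3), 3L)   # TRUE: same value and same type
identical(as.integer(3), 3)    # FALSE: integer versus double
as.integer(3) == 3             # TRUE: the values are equal
```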

Okay, what about coercing the double 3.84 into an integer:

as.integer(3.84)
[1] 3

Coercing a double into an integer drops the decimal portion of the number (.84) so that it becomes just 3.
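Dropping the decimal portion is not the same as rounding, and the difference shows up most clearly with negative numbers. The sketch below compares as.integer() with a few base R rounding functions; note that those functions return doubles, not integers.

```r
as.integer(3.84)    # 3
as.integer(-3.84)   # -3: truncation moves toward zero, not downward
trunc(-3.84)        # -3: the same idea, but the result is a double
floor(-3.84)        # -4: always rounds down
round(3.84)         # 4: rounds to the nearest whole number
```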

Now, we’ll get to a slightly more complicated coercion: from a string to an integer. But not just any string, a string that contains an integer! What happens?

as.integer("3")
[1] 3

Pleasantly enough, coercing the string "3" into an integer is successful…and turns it into the integer 3! (What? You’re not super excited and thinking in exclamation points? Come on, jump on the geekmobile with us!)

But is it an integer? Or a double?

typeof(as.integer("3"))
[1] "integer"

Yes! It is an integer!
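We listed as.double(x) above but have only tried as.integer(x) so far. As a quick sketch, the same checks can be run in the other direction:

```r
as.double(3L)            # 3
typeof(as.double(3L))    # "double"
as.double("3")           # 3
typeof(as.double("3"))   # "double"
```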

Now, moving on to a less successful, but understandable, case: the conversion of some arbitrary strings.

as.integer("hi")
[1] NA
as.integer("three")
[1] NA

Both of these return the special value NA, R’s marker for a missing value, indicating that the string could not be converted into a number.
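In a live session, R also prints a warning ("NAs introduced by coercion") when a string cannot be converted. The sketch below assumes you expect some failures and want to find them afterwards; the variable x is hypothetical, and suppressWarnings() is used only to keep the output quiet.

```r
# Coerce several strings at once; the failures become NA
x <- suppressWarnings(as.integer(c("3", "hi", "three")))
x           # 3 NA NA
is.na(x)    # FALSE TRUE TRUE: which conversions failed
typeof(x)   # "integer": the NAs here are integer NAs
```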

Now, possibly surprisingly, look at what the following returns:

as.integer("3.84")
[1] 3

Would you have thought that this would return what it did? Why did this work? Describe how you think it happened.

3.2 Into a logical value

Even though there are really only two values here (TRUE and FALSE), it is still possible to coerce into a logical value.

There’s only one function here: as.logical(x).

As you would hope, as.logical() does not change the values of either of the logical values:

as.logical(TRUE)
[1] TRUE
as.logical(FALSE)
[1] FALSE

The above also highlights the difference between is.logical() and as.logical(). The first tests the type of its argument, so it would have returned TRUE for both because they are both logical values. The second converts its argument into a logical value, so as.logical(TRUE) has the value TRUE and as.logical(FALSE) has the value FALSE.
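The difference is easiest to see with FALSE, where the two functions disagree. A minimal sketch:

```r
is.logical(TRUE)    # TRUE: TRUE is a logical value
is.logical(FALSE)   # TRUE: FALSE is a logical value too
as.logical(FALSE)   # FALSE: the value itself comes back unchanged
```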

Now, as straightforward as all of that might (or might not!) be, it’s now going to get trickier.

The following table shows how R handles coercing a numeric value into a logical value: zero is FALSE and every non-zero number is TRUE. This convention is common to many programming languages.

Numeric value of x    Value of as.logical(x)
0                     FALSE
Any non-zero value    TRUE
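Here is a short sketch of that rule in action, with values we picked ourselves. Notice that negative numbers and tiny fractions count as non-zero.

```r
as.logical(0)                 # FALSE
as.logical(1)                 # TRUE
as.logical(-2.5)              # TRUE
as.logical(c(0, 3, 0.001))    # FALSE TRUE TRUE
```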

It turns out that R has a special way of handling character (string) values when coercing to logical values. Look at this table.

Character value of x                Value of as.logical(x)
"TRUE", "True", "true", "T"         TRUE
"FALSE", "False", "false", "F"      FALSE
Any other string                    NA

Here, you can see that R tries as hard as it can to use common sense when coercing strings to logical values. If it sees a standard representation of the words true or false, then it coerces to the appropriate value; otherwise, it indicates that it can’t be done (with an NA).
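To see where that common sense stops, here is a sketch that mixes recognized spellings with a few that look plausible but are not accepted. The vector s is a hypothetical example of our own.

```r
s <- c("TRUE", "true", "T", "False", "F", "yes", "tRuE", "1")
as.logical(s)
# TRUE TRUE TRUE FALSE FALSE NA NA NA
```

Only the exact spellings in the table are recognized, so "yes", the mixed-case "tRuE", and "1" all become NA.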

The following might look like it ought to be a special case:

as.logical("3")
[1] NA

You might think, “Hey, "3" can be coerced to 3 (a double), and 3 is not zero, so the value should be TRUE.” But no! R has no reason to coerce to a double first because it is coercing a string directly to a logical value. Since "3" is not one of the eight special values above, it coerces to NA.
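If you do want the "non-zero means TRUE" behavior for a string that holds a number, one possible approach is to make the intermediate step explicit yourself. A minimal sketch:

```r
as.logical("3")                # NA: "3" is not a recognized spelling
as.logical(as.numeric("3"))    # TRUE: string to number, then number to logical
as.logical(as.numeric("0"))    # FALSE
```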