Missing data

R has standard values for representing missing data, so we’ll start there.

NA: this means that there is no data (not available). Note that this is not the same as "NA", which is a string.
NaN: this value represents “not a number”. It can exist only within numeric vectors, and it arises from an undefined result, such as 0/0.
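
As a minimal sketch of the distinction above, the snippet below compares the class of NA, the string "NA", and the result of 0/0. The variable names (missing_value, text_value, undefined) are illustrative and not part of the original text.

```r
# NA is a missing-value marker; on its own it is a logical value.
missing_value <- NA
# "NA" in quotes is an ordinary two-character string.
text_value <- "NA"
# 0/0 has no defined numeric result, so R returns NaN.
undefined <- 0 / 0

class(missing_value)  # "logical"
class(text_value)     # "character"
nchar(text_value)     # 2
class(undefined)      # "numeric"
print(undefined)      # NaN
```

The quotes are what matter here: with them you have text, and without them you have a marker for missing data.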

1 The functions testing for `NA` and `NaN`

The two most important functions you need to understand are is.na(x) (“is x missing a value?”) and is.nan(x) (“is x an undefined number?”).

2 Examples

Consider the following examples. We’ll start with what we hope are obvious cases:

is.na(3)

[1] FALSE

is.na(NA)

[1] TRUE

We would certainly hope that R recognizes that 3 has a value and that NA does not. Whew.

Okay, now a trickier pair of examples:

is.na(TRUE)

[1] FALSE

is.na(FALSE)

[1] FALSE

Okay, even though FALSE is the absence of truth, it is not the absence of value! So is.na(FALSE) must be false.

What does is.na() think about undefined numbers?

is.na(0/0)

[1] TRUE

is.na(NaN)

[1] TRUE

Well, R’s position is that something that has an undefined numeric value does not have a value. That seems right…you just have to think carefully about it.

Now let’s consider a few cases with is.nan() (not a number) that should be straightforward to interpret:

is.nan("a")

[1] FALSE

is.nan(3.4)

[1] FALSE

is.nan(0/0)

[1] TRUE

is.nan(NaN)

[1] TRUE

Nothing too surprising here. When testing to see if the argument is an undefined number, both "a" and 3.4 fail (because they are not undefined numbers) but both 0/0 and NaN pass (because they are undefined numbers). Which is as it should be.

Let’s look at what R’s position is on the question “is NA an undefined number?”:

is.nan(NA)

[1] FALSE

So, R’s position is that it is not the case that NA (a non-existent value) is an undefined number.
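
The asymmetry between the two functions is easier to see when both are applied to the same vector. The following is a small sketch; the vector x and the name na_only are made up for illustration.

```r
# A numeric vector with one ordinary value, one NA, and one NaN.
x <- c(1, NA, NaN)

is.na(x)   # FALSE  TRUE  TRUE   (NaN counts as missing)
is.nan(x)  # FALSE FALSE  TRUE   (NA does not count as an undefined number)

# Combining the two picks out elements that are NA but not NaN.
na_only <- is.na(x) & !is.nan(x)
na_only    # FALSE  TRUE FALSE
```

In other words, is.na() is the broader test: every NaN is flagged by is.na(), but an NA is not flagged by is.nan().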

Now, sometimes you want to test for the opposite of these two functions. You might want to know the answers to the questions “is data available?” and “is the data anything other than an undefined number?” R does not have to define new operators to accomplish this; you can simply use the not operator (!):

!is.na(x): “does x have a value?”
!is.nan(x): “is x anything other than an undefined number?”
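
One common use of the negated forms is subsetting a vector. This is only a sketch under assumed data; the vector scores and the names observed and defined are hypothetical.

```r
# Hypothetical measurements with one NA and one NaN.
scores <- c(4.5, NA, 3.0, NaN)

# Keep only elements that have a value (drops both NA and NaN).
observed <- scores[!is.na(scores)]
observed              # 4.5 3.0

# Keep everything except undefined numbers (NA is retained).
defined <- scores[!is.nan(scores)]
defined               # 4.5  NA 3.0

# Count how many elements have a value.
sum(!is.na(scores))   # 2
```

Because !is.na() also excludes NaN, it is usually the one to reach for when you simply want the usable data.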
