Real-world data is rarely clean, complete, and ready to use. You almost always have to clean it before you can analyze it, and how you clean it depends on the specific data and its context.
When doing your analyses, you will want to be clear about the following:
- what NA and NaN values are, and
- how you treat NA and NaN values in your analysis: do you want to include them or exclude them from your calculations?
Sometimes you want to examine a list to see if there are missing values. Let’s quickly define a list and test it for them:
[1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE
[1] "a" "3" "TRUE" "FALSE" NA "NaN" "NaN"
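The code that produced the two results above is not shown, so here is a minimal sketch that gives the same output. The list definition and the use of unlist() are assumptions inferred from the printed values.

```r
# Assumed definition: a list mixing types, with one NA and two NaN entries
x <- list("a", 3, TRUE, FALSE, NA, NaN, NaN)

# is.na() works element by element on a list and flags both NA and NaN
missing_flags <- is.na(x)
print(missing_flags)

# unlist() flattens the list; the mixed types are coerced to character
flat <- unlist(x)
print(flat)

# After coercion, "NaN" is ordinary text, so only the true NA is flagged
flat_flags <- is.na(flat)
print(flat_flags)
```

Keep in mind that NA marks a missing value, while NaN ("not a number") is the result of an undefined numeric operation such as 0/0. is.na() treats both as missing as long as they are still numeric.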
         a     b
[1,] FALSE FALSE
[2,] FALSE  TRUE
[3,] FALSE  TRUE
     a   b
[1,] 1   5
[2,] 2  NA
[3,] 3 NaN
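The two matrices above are consistent with a small data frame that has columns a and b. Here is a sketch under that assumption; the exact calls used originally are not shown, so treat the data frame definition and the as.matrix() call as a guess.

```r
# Assumed data: column b holds one NA and one NaN
df <- data.frame(a = c(1, 2, 3), b = c(5, NA, NaN))

# is.na() on a data frame returns a logical matrix; NaN counts as missing too
na_matrix <- is.na(df)
print(na_matrix)

# is.nan() is TRUE only for NaN, so it separates the two kinds of missing value
nan_flags <- is.nan(df$b)
print(nan_flags)

# The underlying values, viewed as a numeric matrix
print(as.matrix(df))
```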
Name | df |
Number of rows | 3 |
Number of columns | 2 |
_______________________ | |
Column type frequency: | |
numeric | 2 |
________________________ | |
Group variables | None |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
a | 0 | 1.00 | 2 | 1 | 1 | 1.5 | 2 | 2.5 | 3 | ▇▁▇▁▇ |
b | 2 | 0.33 | 5 | NA | 5 | 5.0 | 5 | 5.0 | 5 | ▁▁▇▁▁ |
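The n_missing, complete_rate, mean, and sd columns in the summary above can be checked with base R, which also shows the include-or-exclude decision in practice. This is a sketch that assumes df holds the values printed earlier; it is not how the summary table itself was produced.

```r
# Assumed data, matching the values printed earlier
df <- data.frame(a = c(1, 2, 3), b = c(5, NA, NaN))

# Missing values per column, and the share of complete values
n_missing <- colSums(is.na(df))
complete_rate <- colMeans(!is.na(df))
print(n_missing)
print(complete_rate)

# Including missing values: the result is itself missing
mean_with_missing <- mean(df$b)
print(is.na(mean_with_missing))

# Excluding them with na.rm = TRUE: only the single value 5 remains
mean_without_missing <- mean(df$b, na.rm = TRUE)
print(mean_without_missing)

# With one value left, the standard deviation is undefined and comes back NA
sd_without_missing <- sd(df$b, na.rm = TRUE)
print(sd_without_missing)
```

Setting na.rm = TRUE drops both NA and NaN, which is why column b shows a mean of 5 but no standard deviation.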
This is how you retrieve a specific observation (row) from a data frame:
Year ID NPS Field ClassLevel Status Gender BirthYear FinPL
6 2012 mdoqvaalcscx 8 Undecl Sr Part-time Female 1988 Yes
FinSch FinGov FinSelf FinPar FinOther TooDifficult NotRelevant
6 Yes No Yes No No
PoorTeaching UnsuppFac Grades Sched ClassTooBig BadAdvising
6 Strongly Disagree Neutral Agree Strongly Agree <NA> Disagree
FinAid OverallValue
6 <NA> Strongly Agree
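A row like the one above comes from indexing with [row, column] and leaving the column part empty. The survey data is not reproduced here, so this sketch uses a tiny made-up data frame (survey_demo is a hypothetical name with only three of the columns); with the real data the call would presumably be survey[6, ].

```r
# Hypothetical stand-in for the survey data
survey_demo <- data.frame(
  Year   = c(2012, 2012, 2013),
  NPS    = c(8, 10, 6),
  FinAid = c(NA, "Agree", "Neutral")
)

# Row 2, all columns: the result is a one-row data frame
second_row <- survey_demo[2, ]
print(second_row)

# Row 2, one column: the result is a single value
nps_value <- survey_demo[2, "NPS"]
print(nps_value)
```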
The anyNA(x) function determines whether there are any NA values in the vector:
Here we use it on the 6th row of survey:
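A minimal sketch of both uses follows. Since the survey data is not available here, row6 is a hypothetical cut-down copy of the 6th row with four columns, two of them NA as in the printed row.

```r
# anyNA() on plain vectors
no_missing <- anyNA(c(1, 2, 3))
has_missing <- anyNA(c(1, NA, 3))
print(no_missing)
print(has_missing)

# Hypothetical stand-in for survey[6, ]
row6 <- data.frame(NPS = 8, ClassTooBig = NA, BadAdvising = "Disagree", FinAid = NA)

# anyNA() also accepts a data frame and checks every column
row_has_missing <- anyNA(row6)
print(row_has_missing)
```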
The following call returns the positions of the items in the vector that have the value NA:
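The original call is not shown; a common way to get these positions is to combine which() with is.na(). The sketch below assumes that approach and again uses row6, a hypothetical cut-down copy of the 6th row of survey.

```r
# Hypothetical stand-in for survey[6, ]
row6 <- data.frame(NPS = 8, ClassTooBig = NA, BadAdvising = "Disagree", FinAid = NA)

# Positions of the NA entries within the row
missing_idx <- which(is.na(row6))
print(missing_idx)

# The matching column names are often easier to read than positions
missing_cols <- names(row6)[missing_idx]
print(missing_cols)
```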
The following counts how many items in the vector have the value NA:
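One way to do the count, assuming the usual idiom of summing the logical result of is.na() (TRUE counts as 1, FALSE as 0). As before, row6 is a hypothetical cut-down copy of the 6th row of survey.

```r
# Hypothetical stand-in for survey[6, ]
row6 <- data.frame(NPS = 8, ClassTooBig = NA, BadAdvising = "Disagree", FinAid = NA)

# Count the NA entries in the row
n_missing <- sum(is.na(row6))
print(n_missing)

# The same idiom on a plain vector
n_missing_vec <- sum(is.na(c(1, NA, 3, NA, NA)))
print(n_missing_vec)
```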