Access data items

One of a programmer's most basic needs is to access specific columns, rows, or individual items in a data frame. This page shows how to do each.

For the purposes of our work, we’re going to create this tiny data frame.

grades = data.frame(
  instructor_id      = c(1, 2, 3, 4, 5, 6, 7, 8, 9) * 2,
  instr_first        = c("Alice", "Bob", "Charlie", "David",
                         "Eve", "Stanislav", "Yolanda",
                         "Zoe", "Xavier"),
  instr_last         = c("Smith", "Jones", "Kline", "White",
                         "Zettle", "Bernard-Zza", "Zhang",
                         "Xu", "Zimmerman"),
  subject            = c("Math", "Math", "Math", "English",
                         "English", "English", "History",
                         "History", "History"),
  grade              = c("A", "B", "A", "C", "B", "A", "A",
                         "B", "A")
)

1 Whole data frame

We can access the whole data frame by simply typing its name:

grades
  instructor_id instr_first  instr_last subject grade
1             2       Alice       Smith    Math     A
2             4         Bob       Jones    Math     B
3             6     Charlie       Kline    Math     A
4             8       David       White English     C
5            10         Eve      Zettle English     B
6            12   Stanislav Bernard-Zza English     A
7            14     Yolanda       Zhang History     A
8            16         Zoe          Xu History     B
9            18      Xavier   Zimmerman History     A
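
Printing works well for a tiny data frame like this one; for anything larger, a quick look at its shape and first rows is usually enough. The following is a minimal sketch that rebuilds the same data more compactly so it can run on its own. The `stringsAsFactors = FALSE` argument is an assumption added here so the text columns stay character vectors on R versions older than 4.0.

```r
# Compact rebuild of the grades data frame from the top of the page.
grades <- data.frame(
  instructor_id = (1:9) * 2,
  instr_first   = c("Alice", "Bob", "Charlie", "David", "Eve",
                    "Stanislav", "Yolanda", "Zoe", "Xavier"),
  instr_last    = c("Smith", "Jones", "Kline", "White", "Zettle",
                    "Bernard-Zza", "Zhang", "Xu", "Zimmerman"),
  subject       = rep(c("Math", "English", "History"), each = 3),
  grade         = c("A", "B", "A", "C", "B", "A", "A", "B", "A"),
  stringsAsFactors = FALSE  # assumption: keeps text as character on R < 4.0
)

shape     <- dim(grades)      # number of rows, then number of columns
col_names <- names(grades)    # the five column names
first_3   <- head(grades, 3)  # only the first three rows

print(shape)
print(col_names)
print(first_3)
```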

2 Specific column

We can access a whole column at once either by name or by number. Let’s take a look.

2.1 By column name

We have several different ways of accessing a specific column by name. In all cases, one must know the name of the data frame and the name of the column. (Not surprising.)

grades$subject 
[1] "Math"    "Math"    "Math"    "English" "English" "English" "History"
[8] "History" "History"
grades[, "subject"]
[1] "Math"    "Math"    "Math"    "English" "English" "English" "History"
[8] "History" "History"
grades[["subject"]]
[1] "Math"    "Math"    "Math"    "English" "English" "English" "History"
[8] "History" "History"
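
To confirm that the three forms above really return the same thing, we can compare them with `identical()`. This sketch also shows one practical difference: when the column name is stored in a variable, `[[ ]]` and `[ , ]` use the variable's value, while `$` looks for a column literally named after the variable. The variable names (`by_dollar`, `col`, and so on) are illustrative only, and `stringsAsFactors = FALSE` is an assumption added for older R versions.

```r
# Compact rebuild of the grades data frame from the top of the page.
grades <- data.frame(
  instructor_id = (1:9) * 2,
  instr_first   = c("Alice", "Bob", "Charlie", "David", "Eve",
                    "Stanislav", "Yolanda", "Zoe", "Xavier"),
  instr_last    = c("Smith", "Jones", "Kline", "White", "Zettle",
                    "Bernard-Zza", "Zhang", "Xu", "Zimmerman"),
  subject       = rep(c("Math", "English", "History"), each = 3),
  grade         = c("A", "B", "A", "C", "B", "A", "A", "B", "A"),
  stringsAsFactors = FALSE  # assumption: keeps text as character on R < 4.0
)

by_dollar  <- grades$subject
by_bracket <- grades[, "subject"]
by_double  <- grades[["subject"]]

identical(by_dollar, by_bracket)  # TRUE
identical(by_dollar, by_double)   # TRUE

# A column name held in a variable works with [[ ]] and [ , ] ...
col <- "subject"
from_variable <- grades[[col]]
identical(from_variable, by_dollar)  # TRUE

# ... but $ searches for a column literally called "col", which does not exist.
is.null(grades$col)  # TRUE
```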

2.2 By column number

We can also access a specific column by column number:

grades[, 4]
[1] "Math"    "Math"    "Math"    "English" "English" "English" "History"
[8] "History" "History"

This form returns a vector of values, as you can see.

is.vector(grades[, 4])
[1] TRUE

We can use an alternate form, also using a column number, that returns a one-column data frame (which R stores as a list) instead of a vector:

grades[4]
  subject
1    Math
2    Math
3    Math
4 English
5 English
6 English
7 History
8 History
9 History

And, yes, it is a list (every data frame is):

is.list(grades[4])
[1] TRUE
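
The difference between the two forms is easier to see with `class()`. The sketch below also shows the `drop = FALSE` argument, which makes the `[ , ]` form keep the data frame structure instead of simplifying to a vector. Variable names are illustrative, and `stringsAsFactors = FALSE` is an assumption added for older R versions.

```r
# Compact rebuild of the grades data frame from the top of the page.
grades <- data.frame(
  instructor_id = (1:9) * 2,
  instr_first   = c("Alice", "Bob", "Charlie", "David", "Eve",
                    "Stanislav", "Yolanda", "Zoe", "Xavier"),
  instr_last    = c("Smith", "Jones", "Kline", "White", "Zettle",
                    "Bernard-Zza", "Zhang", "Xu", "Zimmerman"),
  subject       = rep(c("Math", "English", "History"), each = 3),
  grade         = c("A", "B", "A", "C", "B", "A", "A", "B", "A"),
  stringsAsFactors = FALSE  # assumption: keeps text as character on R < 4.0
)

as_vector <- grades[, 4]                # simplified to a plain vector
as_frame  <- grades[4]                  # stays a one-column data frame
kept      <- grades[, 4, drop = FALSE]  # [ , ] form, simplification turned off

class(as_vector)  # "character"
class(as_frame)   # "data.frame"
class(kept)       # "data.frame"

identical(as_frame, kept)  # TRUE
```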

3 Parts of a specific column

Both of the following return the first through fourth values of the subject column.

grades[1:4, "subject"]
[1] "Math"    "Math"    "Math"    "English"
grades[1:4, 4]
[1] "Math"    "Math"    "Math"    "English"
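
The row index does not have to be a consecutive range; any vector of row numbers works. As a small sketch, the code below picks rows 1, 4, and 7 and also looks up the column number from the column name with `match()`, so the two forms above can be tied together. The names `subject_pos` and `picked` are illustrative, and `stringsAsFactors = FALSE` is an assumption added for older R versions.

```r
# Compact rebuild of the grades data frame from the top of the page.
grades <- data.frame(
  instructor_id = (1:9) * 2,
  instr_first   = c("Alice", "Bob", "Charlie", "David", "Eve",
                    "Stanislav", "Yolanda", "Zoe", "Xavier"),
  instr_last    = c("Smith", "Jones", "Kline", "White", "Zettle",
                    "Bernard-Zza", "Zhang", "Xu", "Zimmerman"),
  subject       = rep(c("Math", "English", "History"), each = 3),
  grade         = c("A", "B", "A", "C", "B", "A", "A", "B", "A"),
  stringsAsFactors = FALSE  # assumption: keeps text as character on R < 4.0
)

# Find the position of the "subject" column instead of hard-coding 4.
subject_pos <- match("subject", names(grades))
subject_pos  # 4

# Rows 1, 4, and 7 of that column: one row from each subject.
picked <- grades[c(1, 4, 7), subject_pos]
picked  # "Math" "English" "History"

# The by-name and by-number forms agree.
identical(grades[1:4, "subject"], grades[1:4, subject_pos])  # TRUE
```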

4 Parts of a specific row

This returns the values in the third through fifth columns of row 7, as a one-row data frame.

grades[7, 3:5]
  instr_last subject grade
7      Zhang History     A
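
The same slice can be requested by column name, which keeps working if the columns are later reordered. The sketch below also uses `unlist()` to flatten the one-row result into a named vector, which is safe here because all three columns hold text. Variable names are illustrative, and `stringsAsFactors = FALSE` is an assumption added for older R versions.

```r
# Compact rebuild of the grades data frame from the top of the page.
grades <- data.frame(
  instructor_id = (1:9) * 2,
  instr_first   = c("Alice", "Bob", "Charlie", "David", "Eve",
                    "Stanislav", "Yolanda", "Zoe", "Xavier"),
  instr_last    = c("Smith", "Jones", "Kline", "White", "Zettle",
                    "Bernard-Zza", "Zhang", "Xu", "Zimmerman"),
  subject       = rep(c("Math", "English", "History"), each = 3),
  grade         = c("A", "B", "A", "C", "B", "A", "A", "B", "A"),
  stringsAsFactors = FALSE  # assumption: keeps text as character on R < 4.0
)

by_number <- grades[7, 3:5]
by_name   <- grades[7, c("instr_last", "subject", "grade")]

identical(by_number, by_name)  # TRUE
is.data.frame(by_number)       # TRUE

# Flatten the one-row data frame into a named character vector.
row_values <- unlist(by_number)
row_values[["subject"]]  # "History"
```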

5 A specific item in a specific column

R has multiple ways (that's no surprise by now, eh?) of accessing a specific item in a specific column. Each of these accesses the item in row 7 of the subject column:

grades$subject[7]
[1] "History"
grades[7,4]
[1] "History"
grades[7, "subject"]
[1] "History"
grades[["subject"]][7]
[1] "History"
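
As a quick check that the four forms are interchangeable, the sketch below collects their results and compares each one to the expected value. The name `results` is illustrative, and `stringsAsFactors = FALSE` is an assumption added for older R versions.

```r
# Compact rebuild of the grades data frame from the top of the page.
grades <- data.frame(
  instructor_id = (1:9) * 2,
  instr_first   = c("Alice", "Bob", "Charlie", "David", "Eve",
                    "Stanislav", "Yolanda", "Zoe", "Xavier"),
  instr_last    = c("Smith", "Jones", "Kline", "White", "Zettle",
                    "Bernard-Zza", "Zhang", "Xu", "Zimmerman"),
  subject       = rep(c("Math", "English", "History"), each = 3),
  grade         = c("A", "B", "A", "C", "B", "A", "A", "B", "A"),
  stringsAsFactors = FALSE  # assumption: keeps text as character on R < 4.0
)

# The four access forms from this section, gathered into one vector.
results <- c(
  grades$subject[7],
  grades[7, 4],
  grades[7, "subject"],
  grades[["subject"]][7]
)

all(results == "History")  # TRUE
```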