Day 3 - Objects in R

Spring 2023

Smith College

Overview

Timeline

  • Objects in R
  • Meta-Data for Objects
  • Lists

Goal

To understand how R sees objects under the hood so we can make better use of those structures when we get to programming.

Objects in R

How We Understand Objects

You may be familiar with the class() function.


It tells us the class of an object in our environment.


These are important to understand when doing data analyses.

```{r}
# logical
class(c(TRUE, FALSE))
```
[1] "logical"


```{r}
# numeric
class(c(1, 3, 3, 7))
```
[1] "numeric"


```{r}
# character
class(c("Hello", "World!"))
```
[1] "character"

Coercion

When a vector mixes types, R will default to the most inclusive (most general) class for every element.

```{r}
#| output-location: fragment

c(1, TRUE)
```
[1] 1 1
```{r}
#| output-location: fragment

c(1, "TRUE")
```
[1] "1"    "TRUE"
```{r}
#| output-location: fragment

c(1L, FALSE)
```
[1] 1 0

Character > Numeric > Logical

We can coerce data into these types with the as.XXXX() family of functions.

```{r}
# make example vector
type_vec = c(1, 0, 1, 1, 0)

# coerce to logical
as.logical(type_vec)
```
[1]  TRUE FALSE  TRUE  TRUE FALSE
```{r}
# "coerce" to numeric
as.numeric(type_vec)
```
[1] 1 0 1 1 0
```{r}
# coerce to characters
as.character(type_vec)
```
[1] "1" "0" "1" "1" "0"
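
Coercion does not always succeed. As a minimal sketch (messy_vec is a made-up example vector, not part of the original slides), values that cannot be converted become NA, and R raises a warning, which is silenced here to keep the output clean:

```{r}
# hypothetical example vector: one element is not a number
messy_vec = c("1", "two", "3")

# "two" cannot be read as a number, so it becomes NA (with a warning)
messy_num = suppressWarnings(as.numeric(messy_vec))
messy_num

# logicals coerce to 1 and 0, so summing them counts the TRUEs
n_true = sum(c(TRUE, FALSE, TRUE))
n_true
```

Checking for new NAs after a coercion is a cheap way to catch values that did not convert.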

R Sees Objects in Many Ways

R actually sees objects from a number of angles, class being only one. Type is another.

```{r}
# logical
typeof(c(TRUE, FALSE))
```
[1] "logical"
```{r}
# integer (whole numbers, marked with L)
typeof(c(1L, 3L, 3L, 7L))
```
[1] "integer"
```{r}
# double (numbers with decimals)
typeof(c(1.3, 3.7))
```
[1] "double"
```{r}
# character
typeof(c("Hello", "World!"))
```
[1] "character"
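
To see that class and type really are different angles, it may help to ask both questions of the same object. This is a small sketch; whole_num and decimal_num are illustrative names, and mtcars is the data set that ships with R:

```{r}
# class() and typeof() can disagree about the same object
whole_num = 1:3
decimal_num = c(1.3, 3.7)

class(whole_num)     # "integer"
typeof(whole_num)    # "integer"

class(decimal_num)   # "numeric"
typeof(decimal_num)  # "double"

# a data frame has its own class but is stored as a list
class(mtcars)        # "data.frame"
typeof(mtcars)       # "list"
```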

Not-Values

NA & is.na()
NA is used in place of unknown or missing values (though those aren’t the same in reality!). With few exceptions, whenever you try to do something to an NA, the result will be NA.


NULL & is.null()
NULL is used when there is no value; it does not exist. NULL does not take the place of another value and has a length of 0. Contrast this with NA, where there is a value, but it is unknown or missing.


NaN & is.nan()
Rarely seen. NaN ("not a number") appears when a computation was done but the result can’t exist, like 0 / 0.
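
The three not-values behave differently in practice. The following is a short sketch using only base R functions; the numbers are arbitrary:

```{r}
# NA is contagious: most operations on it return NA
sum(c(1, 2, NA))                # NA
sum(c(1, 2, NA), na.rm = TRUE)  # 3

# NA holds a place in a vector; NULL does not
length(c(1, NA, 3))    # 3
length(c(1, NULL, 3))  # 2
length(NULL)           # 0

# NaN comes from undefined arithmetic
is.nan(0 / 0)  # TRUE
is.na(0 / 0)   # TRUE: NaN also counts as missing
is.nan(NA)     # FALSE: NA is not NaN
```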

Meta-Data for Objects

Object Attributes

R can also keep meta-data on objects.


The easiest of these to see is names.


You may have used named vectors before; this is how they work!


Attributes are used throughout R for many common purposes.

```{r}
# normal vector
plain_vec = c("cat", "dog", "fish")
plain_vec
```
[1] "cat"  "dog"  "fish"
```{r}
# named vector
named_vec = c("robert" = "cat",
              "yuki" = "dog",
              "K'nar'st" = "fish")
named_vec
```
  robert     yuki K'nar'st 
   "cat"    "dog"   "fish" 
```{r}
str(named_vec)
```
 Named chr [1:3] "cat" "dog" "fish"
 - attr(*, "names")= chr [1:3] "robert" "yuki" "K'nar'st"
```{r}
attributes(named_vec)
```
$names
[1] "robert"   "yuki"     "K'nar'st"

AKA Object Meta-Data

You can add your own custom attributes too!


In fact, many of the packages in R rely on attributes.


We’ll get into that more later this semester.

```{r}
named_vec
```
  robert     yuki K'nar'st 
   "cat"    "dog"   "fish" 
```{r}
attr(named_vec, "age") = c(4, 6, NaN)
attributes(named_vec)
```
$names
[1] "robert"   "yuki"     "K'nar'st"

$age
[1]   4   6 NaN
```{r}
attr(named_vec, "favorite_color") =
  c("orange", "white", "void")
attributes(named_vec)
```
$names
[1] "robert"   "yuki"     "K'nar'st"

$age
[1]   4   6 NaN

$favorite_color
[1] "orange" "white"  "void"  
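
One thing to watch for with custom attributes: subsetting with [ ] keeps names but drops most other attributes. A minimal sketch, using a fresh vector (pets is a hypothetical name chosen for this example):

```{r}
# hypothetical vector with names plus one custom attribute
pets = c("robert" = "cat", "yuki" = "dog")
attr(pets, "age") = c(4, 6)

# read a single attribute back
attr(pets, "age")

# subsetting with [ ] keeps names but drops the custom attribute
first_pet = pets[1]
attributes(first_pet)
```

Only $names should remain on first_pet, so custom meta-data is best treated as fragile unless a package manages it for you.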

Other Common Uses

  • Factors
  • Matrices
  • Dates
  • Date-time

Understanding Attributes Saves Grief

Factors

```{r}
survey_colors = factor(survey$fav_color)
head(survey_colors)
```
[1] Lavender     Pink         orange       Light purple Blue        
[6] Green       
16 Levels: black Black blue Blue Brown Crimson Green grey ... wine red
```{r}
as.numeric(survey_colors)
```
 [1]  9 14 11 10  4  7  4  4  8 13 16  5  6  2 14  4  7  7  4  3  1  3 15 12
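
The numbers above are the factor's internal integer codes, not the original values. This matters most when a factor holds numbers stored as text. A self-contained sketch (ages_chr and ages_fct are made-up data, not from the class survey):

```{r}
# hypothetical survey answers stored as text
ages_chr = c("10", "20", "10", "30")
ages_fct = factor(ages_chr)

# a factor is integer codes plus a "levels" attribute
typeof(ages_fct)      # "integer"
attributes(ages_fct)

# as.numeric() returns the codes, not the original values
as.numeric(ages_fct)                # 1 2 1 3

# go through character first to recover the values
as.numeric(as.character(ages_fct))  # 10 20 10 30
```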

Dates

```{r}
render_time = Sys.time()
render_time
```
[1] "2023-01-31 22:32:15 CST"
```{r}
as.numeric(render_time)
```
[1] 1675222336
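
That large number is a count of seconds since the start of 1970 (UTC); the class attribute is what makes R print it as a time. A brief sketch with fixed values so the results are reproducible (ten_days and one_minute are illustrative names):

```{r}
# a Date is a number of days since 1970-01-01 plus a class attribute
ten_days = as.Date("1970-01-11")
unclass(ten_days)       # 10
attributes(ten_days)

# a date-time (POSIXct) counts seconds since 1970-01-01 00:00:00 UTC
one_minute = as.POSIXct("1970-01-01 00:01:00", tz = "UTC")
as.numeric(one_minute)  # 60
attributes(one_minute)
```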

Lists

What is a List

Lists are kinda like super-vectors (JSON-like).


They can contain anything in their elements. You could have:

  • A list with one number in each element
  • A list with a vector of temperatures per day
  • A list of dataframes
  • A list of lists
```{r}
list(1, 2, 3, 4)
```
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

[[4]]
[1] 4
```{r}
list("last_week" = c(50, 42, 50, 54, 57, 60, 59),
     "this_week" = c(58, 58, 65, 67, 60))
```
$last_week
[1] 50 42 50 54 57 60 59

$this_week
[1] 58 58 65 67 60
```{r}
list(data.frame("id" = 1:2, "let" = c("a", "b")),
     data.frame("id" = 3:4, "let" = c("c", "d")))
```
[[1]]
  id let
1  1   a
2  2   b

[[2]]
  id let
1  3   c
2  4   d
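
The last case from the bullets, a list of lists, can be sketched the same way. The data here are invented for illustration (pet_records is a hypothetical name):

```{r}
# hypothetical nested list: each pet is itself a list
pet_records = list(
  "robert" = list("species" = "cat", "age" = 4),
  "yuki"   = list("species" = "dog", "age" = 6)
)

# str() gives a compact view of the nesting
str(pet_records)

# length() counts top-level elements only
length(pet_records)
```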

Accessing Vectors

You can ask for data in a specific position in a vector by giving it the number of that position.


For example, if vector <- c(1, 3, 5, 7, 9), then:


  • vector[1]: c(1)
  • vector[2]: c(3)
  • vector[c(1, 2)]: c(1, 3)
  • vector[c(1, 2, 5)]: c(1, 3, 9)
  • vector[vector > 5]: c(7, 9)
  • vector[vector <= 5]: c(1, 3, 5)
  • vector[vector == 5]: c(5)
  • vector[vector != 5]: c(1, 3, 7, 9)
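
The comparison-based examples work in two steps, which the following sketch makes visible. It reuses the vector from above; the negative index at the end is an extra case not listed in the bullets:

```{r}
# "vector" is the example name from above
vector <- c(1, 3, 5, 7, 9)

# a comparison returns one TRUE/FALSE per element
vector > 5             # FALSE FALSE FALSE  TRUE  TRUE

# using that logical vector as an index keeps only the TRUE positions
vector[vector > 5]     # 7 9

# a negative position drops that element instead of selecting it
vector[-1]             # 3 5 7 9
```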

Accessing Lists

Getting the content of lists requires special syntax!


Each list element is accessed using double square brackets, [[ ]], with either its position or its name.

```{r}
test_list = list("num_vec" = c(1, 2, 3, 4, 5),
                 "let_vec" = c("a", "b", "c", "c"),
                 "df" = head(mtcars))

test_list
```
$num_vec
[1] 1 2 3 4 5

$let_vec
[1] "a" "b" "c" "c"

$df
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
```{r}
test_list[[1]]
```
[1] 1 2 3 4 5
```{r}
test_list[["num_vec"]]
```
[1] 1 2 3 4 5
```{r}
test_list[["let_vec"]][1]
```
[1] "a"
```{r}
test_list[["df"]]$mpg
```
[1] 21.0 21.0 22.8 21.4 18.7 18.1
```{r}
cars_df = test_list[[3]]

cars_df$cyl
```
[1] 6 6 4 6 8 6
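
Single square brackets also work on lists, but they return a smaller list rather than the element itself. A minimal sketch of the difference (demo_list is a small hypothetical list, separate from test_list):

```{r}
# small hypothetical list for comparison
demo_list = list("num_vec" = c(1, 2, 3), "let_vec" = c("a", "b"))

# single brackets return a smaller list
class(demo_list["num_vec"])     # "list"

# double brackets return the element itself
class(demo_list[["num_vec"]])   # "numeric"

# $ is shorthand for [[ ]] with a name
demo_list$let_vec               # "a" "b"
```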

In Other Words

IF

example_vec_1 = c(1, 2, 3)

example_vec_2 = c("a", "b", "c")

AND

example_list = list(example_vec_1, example_vec_2)

THEN

example_list[[1]] == example_vec_1 == c(1, 2, 3)

example_list[[2]][3] == example_vec_2[3] == "c"
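
The statements above are written as logic rather than as runnable R (chained == is not valid syntax). One way to check them in the console, assuming identical() is an acceptable stand-in for the equals signs:

```{r}
example_vec_1 = c(1, 2, 3)
example_vec_2 = c("a", "b", "c")
example_list = list(example_vec_1, example_vec_2)

# identical() compares whole objects and returns a single TRUE or FALSE
identical(example_list[[1]], example_vec_1)        # TRUE
identical(example_list[[2]][3], example_vec_2[3])  # TRUE
example_list[[2]][3]                               # "c"
```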

Pepper All the Way Down

```{r}
pepper_shaker = list("packet_1" = c("pepper", "pepper", "pepper"),
                     "packet_2" = c("pepper", "pepper", "pepper"))
```

Why Use This Mess?

You already do!

```{r}
head(mtcars, n = 4)
```
                mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710     22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
```{r}
attributes(mtcars)
```
$names
 [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
[11] "carb"

$row.names
 [1] "Mazda RX4"           "Mazda RX4 Wag"       "Datsun 710"         
 [4] "Hornet 4 Drive"      "Hornet Sportabout"   "Valiant"            
 [7] "Duster 360"          "Merc 240D"           "Merc 230"           
[10] "Merc 280"            "Merc 280C"           "Merc 450SE"         
[13] "Merc 450SL"          "Merc 450SLC"         "Cadillac Fleetwood" 
[16] "Lincoln Continental" "Chrysler Imperial"   "Fiat 128"           
[19] "Honda Civic"         "Toyota Corolla"      "Toyota Corona"      
[22] "Dodge Challenger"    "AMC Javelin"         "Camaro Z28"         
[25] "Pontiac Firebird"    "Fiat X1-9"           "Porsche 914-2"      
[28] "Lotus Europa"        "Ford Pantera L"      "Ferrari Dino"       
[31] "Maserati Bora"       "Volvo 142E"         

$class
[1] "data.frame"
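
Those attributes sit on top of a list: a data frame is a list of equal-length columns with names, row names, and a class. A quick sketch to confirm this with the built-in mtcars data:

```{r}
# a data frame is a list of equal-length columns
typeof(mtcars)      # "list"
is.list(mtcars)     # TRUE
length(mtcars)      # 11 columns

# so list syntax works on columns
head(mtcars[["mpg"]], n = 4)  # 21.0 21.0 22.8 21.4
```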

Not Always Necessary

Sometimes, things don’t really need to be a list.


You may run into this situation when creating functions later on.


It’s possible to “flatten” lists into a vector if they are simple enough.

```{r}
# see list
pepper_shaker
```
$packet_1
[1] "pepper" "pepper" "pepper"

$packet_2
[1] "pepper" "pepper" "pepper"

```{r}
# pour the packets out
unlist(pepper_shaker)
```
packet_11 packet_12 packet_13 packet_21 packet_22 packet_23 
 "pepper"  "pepper"  "pepper"  "pepper"  "pepper"  "pepper" 
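
"Simple enough" matters because the result is a vector, so the coercion rules from earlier apply. A short sketch of what happens when the elements differ in type (mixed_list is a made-up example):

```{r}
# hypothetical list with mixed types
mixed_list = list(1, "a", TRUE)

# flattening forces everything into one type (character wins)
flat_vec = unlist(mixed_list)
flat_vec          # "1" "a" "TRUE"
class(flat_vec)   # "character"
```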

Really Though

Lists let you store any arbitrary structure of data.


This makes them very powerful.


We will learn to harness this power in time.

Code-Along

For Next Time

Topic

Functions

To-Do

  • Finish worksheet
  • Make sure you finished all install guides (especially the GitHub one)