Day 26 - Advanced Functions/S3

Spring 2023

Smith College

Overview

Timeline

  • A Refresher on Classes and Attributes
  • Making Use of the Meta
  • Mechanisms of Methods

Goal

To understand how classes and methods interact.

A Refresher on Classes and Attributes

Most Common Classes in R

You’re almost certainly familiar with the three most common classes in R at this point.


Almost everything in R is eventually categorized as a logical, numeric, or character vector.


These vectors can only contain one kind of information within them, and will be coerced into a more general kind if needed.
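
To see coercion in action, here is a small sketch. The vector names `mixed_int` and `mixed_chr` are made up for illustration.

```r
# Logical + integer: the logicals are promoted to integers (TRUE becomes 1L)
mixed_int = c(TRUE, FALSE, 3L)
class(mixed_int)   # "integer"

# Add a character value and everything becomes character
mixed_chr = c(TRUE, 2.5, "a")
class(mixed_chr)   # "character"
mixed_chr          # "TRUE" "2.5" "a"
```

The order of generality here is logical, then integer, then double, then character.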

Logical Vector

```{r}
log_vec = c(TRUE, FALSE, TRUE, TRUE, FALSE,
            FALSE, TRUE, TRUE)
class(log_vec)
```
[1] "logical"

Integer (Numeric) Vector

```{r}
num_vec = 1:10
class(num_vec)
```
[1] "integer"

Character Vector

```{r}
char_vec = letters[1:5]
class(char_vec)
```
[1] "character"

Attributes in R

We have also been introduced to attributes in R. Attributes act as another way to identify properties of objects.


One of the more common ways we see attributes used is for names.


Here, we can see a named vector being created in three ways, all identical.

Create a Named Vector

named_vec1 = c("cat" = 5, "fish" = 1,
               "dog" = 3, "rock" = 9)
named_vec1
 cat fish  dog rock 
   5    1    3    9 

Add Names using names()

named_vec2 = c(5, 1, 3, 9)
names(named_vec2) = c("cat", "fish",
                      "dog", "rock")
named_vec2
 cat fish  dog rock 
   5    1    3    9 

Add Names as Attribute

named_vec3 = c(5, 1, 3, 9)
attr(named_vec3, "names") = c("cat", "fish",
                             "dog", "rock")
named_vec3
 cat fish  dog rock 
   5    1    3    9 
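
One way to check that the three approaches really give the same object is `identical()`. This is a shortened sketch with two elements instead of four; the names `v1`, `v2`, and `v3` are just for illustration.

```r
# Names given inside c()
v1 = c("cat" = 5, "fish" = 1)

# Names added with names()
v2 = c(5, 1)
names(v2) = c("cat", "fish")

# Names added as an attribute
v3 = c(5, 1)
attr(v3, "names") = c("cat", "fish")

identical(v1, v2)   # TRUE
identical(v2, v3)   # TRUE

# attributes() lists everything attached to the object
attributes(v3)      # a list with one element, $names
```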

Making Use of the Meta

Simple Functions Have One Job

Many functions we use in R are fairly simple.


Take length() as an example. No matter what you give it, it will always do the same thing: tell you how many elements are in an object.


This simplicity can become limiting with more complex functions, however.

length(log_vec)
[1] 8


length(num_vec)
[1] 10


length(char_vec)
[1] 5


length(mtcars)
[1] 11


length(list(log_vec, num_vec, char_vec))
[1] 3

Other Functions are More Complex

Some functions change how they behave given the input.


Consider summary(). How does it know to change what it does based on the input?


Below we can see some of the different ways summary() will behave.

summary(log_vec)
   Mode   FALSE    TRUE 
logical       3       5 


summary(num_vec)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00    3.25    5.50    5.50    7.75   10.00 


summary(char_vec)
   Length     Class      Mode 
        5 character character 


summary(list(log_vec, num_vec, char_vec))
     Length Class  Mode     
[1,]  8     -none- logical  
[2,] 10     -none- numeric  
[3,]  5     -none- character

Getting Functions to React

One way to accomplish this reactivity is with conditional code such as if() and else.


We can imagine summary() working like the code below.


However, that would not be quite correct. Instead it just says UseMethod("summary").

It could be…

if (is.logical(object)) {

  # Do logical stuff

} else if (is.numeric(object)) {

  # Do numeric stuff

} else if (is.character(object)) {

  # Do character stuff

}
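
A runnable version of this idea might look like the sketch below. The function `if_summary()` and what it returns for each type are made up for illustration; this is not how summary() is written.

```r
# A hypothetical summary-like function built from if / else branches
if_summary = function(object) {
  if (is.logical(object)) {
    table(object)      # count the TRUEs and FALSEs
  } else if (is.numeric(object)) {
    range(object)      # smallest and largest value
  } else if (is.character(object)) {
    length(object)     # number of strings
  } else {
    stop("if_summary() does not know this kind of object")
  }
}

if_summary(1:10)           # 1 10
if_summary(letters[1:5])   # 5
```

Every new kind of object needs another branch inside the function itself, which is the limitation discussed next.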

However, it really is …

summary
function (object, ...) 
UseMethod("summary")
<bytecode: 0x0000019e9f596358>
<environment: namespace:base>

What is that?

The Many Faces of summary()

If we did need to write an if() for everything summary() did, the function would be massive!


It would also mean that every time someone wanted to create a new summary for their package or custom object, they would need to edit or overwrite the base summary().


Given we routinely have many packages loaded at once, that isn’t really sustainable.

summary(mtcars[, 1:3])
      mpg             cyl             disp      
 Min.   :10.40   Min.   :4.000   Min.   : 71.1  
 1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8  
 Median :19.20   Median :6.000   Median :196.3  
 Mean   :20.09   Mean   :6.188   Mean   :230.7  
 3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0  
 Max.   :33.90   Max.   :8.000   Max.   :472.0  
summary(lm(mpg ~ disp + wt, data = mtcars))

Call:
lm(formula = mpg ~ disp + wt, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.4087 -2.3243 -0.7683  1.7721  6.3484 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 34.96055    2.16454  16.151 4.91e-16 ***
disp        -0.01773    0.00919  -1.929  0.06362 .  
wt          -3.35082    1.16413  -2.878  0.00743 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.917 on 29 degrees of freedom
Multiple R-squared:  0.7809,    Adjusted R-squared:  0.7658 
F-statistic: 51.69 on 2 and 29 DF,  p-value: 2.744e-10

Introducing Methods

Rather than editing one big function, we can instead create a method for each class of object we want our function to work with.


This lets a single generic function coordinate a common task with several different methods of accomplishing that task in different contexts.


For example, summary() is really a generic function that hands the work to one of the methods we see here, chosen by the class of its input!

methods(summary)
 [1] summary.aov                         summary.aovlist*                   
 [3] summary.aspell*                     summary.check_packages_in_dir*     
 [5] summary.connection                  summary.data.frame                 
 [7] summary.Date                        summary.default                    
 [9] summary.ecdf*                       summary.factor                     
[11] summary.glm                         summary.infl*                      
[13] summary.lm                          summary.loess*                     
[15] summary.manova                      summary.matrix                     
[17] summary.mlm*                        summary.nls*                       
[19] summary.packageStatus*              summary.POSIXct                    
[21] summary.POSIXlt                     summary.ppr*                       
[23] summary.prcomp*                     summary.princomp*                  
[25] summary.proc_time                   summary.rlang:::list_of_conditions*
[27] summary.rlang_error*                summary.rlang_message*             
[29] summary.rlang_trace*                summary.rlang_warning*             
[31] summary.srcfile                     summary.srcref                     
[33] summary.stepfun                     summary.stl*                       
[35] summary.table                       summary.tukeysmooth*               
[37] summary.warnings                   
see '?methods' for accessing help and source code
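
To connect the list above to the earlier output, here is a small sketch showing that the class of the input decides which method runs. The names `cars_df`, `fit`, and `ecdf_method` are just for illustration.

```r
cars_df = mtcars[, 1:3]
fit = lm(mpg ~ disp + wt, data = mtcars)

class(cars_df)   # "data.frame"
class(fit)       # "lm"

# Calling the generic gives the same result as calling the matching method
identical(summary(cars_df), summary.data.frame(cars_df))   # TRUE

# Methods marked with * in the list are not exported, but
# getS3method() from the utils package can still retrieve them
ecdf_method = utils::getS3method("summary", "ecdf")
is.function(ecdf_method)   # TRUE
```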

Mechanisms of Methods

Creating a New Method

The simplest way to create methods in R is the S3 style, named for the S language R grew from.


The syntax goes:

generic_function.class_name =
   function(x){
      # When generic_function is called
      # on an object of class
      # class_name, do this to x.
}

Keep in mind this only works for generic functions!
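
To see why the function has to be generic, here is a small sketch. The function `plain_fun()` and the class `demo` are hypothetical names used only for this example.

```r
# A plain (non-generic) function: it never looks for methods
plain_fun = function(x) "plain_fun ran"
plain_fun.demo = function(x) "demo method ran"

obj = structure(1:3, class = "demo")
before = plain_fun(obj)     # "plain_fun ran": the .demo function is ignored

# Redefine plain_fun as a generic by having it call UseMethod()
plain_fun = function(x, ...) UseMethod("plain_fun")
plain_fun.default = function(x, ...) "default method ran"

after = plain_fun(obj)      # "demo method ran"
fallback = plain_fun(1:3)   # "default method ran"
```

The `.default` method is what a generic falls back on when no method matches the class of the object.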

Make object of class SuperCool

test_vec = 1:5
class(test_vec) = "SuperCool"

Define a method for the generic summary() function for class SuperCool

summary.SuperCool = function(x){
  cat("There are", length(x),
      "cool things in this vector.\n")
  cat("What more do you need to know?")
}

Then run summary()

summary(test_vec)
There are 5 cool things in this vector.
What more do you need to know?
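
A variation worth knowing: the summary() generic is defined as function(object, ...), and methods are usually written with the same arguments as their generic. The sketch below rewrites the method that way and returns the object invisibly. The name `cool_vec` and the invisible() return are choices made for this example, not part of the slides.

```r
cool_vec = structure(1:5, class = "SuperCool")

# Same arguments as the generic: object and ...
summary.SuperCool = function(object, ...) {
  cat("There are", length(object),
      "cool things in this vector.\n")
  invisible(object)   # hand the object back without printing it
}

summary(cool_vec)
```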

Using Multiple Methods

Say we still wanted to get the normal information we would get for numeric vectors like test_vec.


We can do that by letting our SuperCool method pass the object to another method.


In this way, we can re-use some code if objects have multiple classes.

Modify our SuperCool summary method to use the next method, in this case summary.default(), which handles plain numeric vectors

summary.SuperCool = function(x){
  cat("There are", length(x),
      "cool things in this vector.\n")
  
  NextMethod()
}

Rerun our SuperCool summary

summary(test_vec)
There are 5 cool things in this vector.
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      1       2       3       3       4       5 
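
When an object has more than one class, NextMethod() walks through them in order. Here is a small sketch of that chain. The generic `greet()` and the classes `child` and `parent` are hypothetical names for illustration.

```r
# A small generic with a default method and one method per class
greet = function(x, ...) UseMethod("greet")
greet.default = function(x, ...) "default"
greet.parent  = function(x, ...) c("parent", NextMethod())
greet.child   = function(x, ...) c("child", NextMethod())

# The class vector is searched left to right
obj = structure(list(), class = c("child", "parent"))

greet(obj)   # "child" "parent" "default"
```

Each method does its own part and then passes the object along, so the shared code only needs to be written once.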

A Visual

Code-Along

For Next Time

Topic

Finals Work Time 1

To-Do

  • Finish Worksheet