Day 6 - Debugging & Flow

Spring 2023

Smith College

Overview

Timeline

  • Errors
  • Errors in Functions
  • Control Flow

Goal

To learn the primary debugging tools in R.

Errors

Common Error Messages

object X not found / could not find X
R couldn’t find something you tried to use


subscript out of bounds
You tried to subset something, but you asked for a position or name that doesn’t exist


non-numeric argument to binary operator
You tried to do math on something you can’t do math on


replacement has...
You tried to put values somewhere and there are too many/not enough values for that space
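
Each of these messages can be triggered on purpose, which is a handy way to learn to recognize them. The sketch below is illustrative rather than part of the original slides: get_error() is a hypothetical helper that uses tryCatch() to return an error's message instead of halting, and the objects missing_object, pets, and scores are made up for the demonstration.

```r
# Hypothetical helper: run an expression and return the error message, if any
get_error = function(expr) {
  tryCatch(expr, error = function(e) conditionMessage(e))
}

# object X not found: missing_object was never created
get_error(missing_object + 1)

# subscript out of bounds: the list only has two elements
pets = list("cat", "dog")
get_error(pets[[5]])

# non-numeric argument to binary operator: "8" is text, not a number
get_error("8" + 1)

# replacement has...: two new values offered for a three-row data frame
scores = data.frame(id = 1:3)
get_error({scores$grade = c(90, 85)})
```

Reading the message next to the line that caused it makes the plain-language translations above easier to remember.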

Errors in Functions

Error Messages in Functions are Difficult

#| echo: fenced

example = c(1, 2, 3, 3, 4, 5, 7)

error_example = function(numeric_vector) {
  
  # Sum it up!
  vec_sum = sum(numeric_vector)
  
  # divide by sum!
  percents = numeric_vector / vec_sum
  
  # Get a table!
  vec_table = table(numeric_vector)
  
  # Sum the possibilities!
  sum_names = sum(names(vec_table))
  
  # return the table!
  return(vec_table)
}

error_example(example)

Error in sum(names(vec_table)): invalid 'type' (character) of argument
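
The message points at sum(names(vec_table)), so a reasonable first step is to look at what names() hands back. The lines below are a small sketch of that inspection, plus one possible repair (converting the names back to numbers); the repair is an assumption about what the function was meant to do, not something the slides state.

```r
example = c(1, 2, 3, 3, 4, 5, 7)
vec_table = table(example)

# names() always returns text, even when the labels look like numbers
class(names(vec_table))

# One possible repair: convert the labels to numbers before summing
sum_names = sum(as.numeric(names(vec_table)))
sum_names   # 1 + 2 + 3 + 4 + 5 + 7 = 22
```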

Errors WITHOUT Messages are Worse

# Make example
example1 = c(1, 2, 3, 3, 4, 5, 7)
example2 = c(5, 5, 5, 5, 5, 5, 5)

# re-create my failed mode from Monday
get_mode = function(...){
  
  # unlist all input to find mode
  flat = unlist(...)
  
  # find the most common input
  mode_table = table(flat)
  
  # get the name of the most common element
  our_mode = names(which.max(mode_table))
  
  # return output
  return(our_mode)
}

# Bad result!
get_mode(example1, example2)
[1] "3"

Figure out the Why (Debugging Tools)

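traceback()
Show me the chain of function calls that led to the error.
debug() / undebug() / debugonce()
Let me step through the mini-R universe inside the function, line by line. debug() turns this on for every call, undebug() turns it off, and debugonce() does it for the next call only.
browser()
Pause the function right here, at the line where I put this call, and let me explore.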


More on these in the Code-Along
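
As a preview, here is a minimal sketch of where each tool fits. The two functions, check_names() and summarize_vector(), are hypothetical stand-ins invented for this example. The debugging calls themselves are left as comments because they only make sense when typed at the console.

```r
# The error happens inside check_names(), but the call we type is
# summarize_vector(), so the message alone doesn't show the full path.
check_names = function(vec_table) {
  sum(names(vec_table))   # names() returns text, so sum() fails here
}

summarize_vector = function(numeric_vector) {
  vec_table = table(numeric_vector)
  check_names(vec_table)
}

# Confirm the failure without halting the script
caught = tryCatch(summarize_vector(c(1, 2, 3, 3)),
                  error = function(e) conditionMessage(e))
print(caught)

# At the console (interactive use only):
#   summarize_vector(c(1, 2, 3, 3))   # errors
#   traceback()                       # lists the calls that led to the error
#   debugonce(check_names)            # the next call pauses inside check_names()
#   summarize_vector(c(1, 2, 3, 3))   # now step through line by line
# Adding browser() as a line inside check_names() would pause at that line instead.
```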

Conditions

ifelse

```{r, eval=FALSE}
ifelse(TEST, IF TEST IS TRUE, IF TEST IS FALSE)
```
```{r}
#| output-location: column-fragment

ifelse(10 > 5,
       "Yes, it is greater.",
       "No, it is not greater.")
```
[1] "Yes, it is greater."
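
ifelse() is vectorized, so the same call can test many values at once and return one answer per value. A short sketch, using a made-up vector called scores:

```r
# Hypothetical values to test
scores = c(2, 8, 12)

# One answer comes back for each element of scores
answers = ifelse(scores > 5,
                 "Yes, it is greater.",
                 "No, it is not greater.")
answers
```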

if then else

```{r, eval=FALSE}
if(TEST) {
  IF TEST IS TRUE
  DO THIS
  AND THIS
  ETC
} else {
  ELSE, DO THIS
  AND THIS
  ETC
}
```
```{r}
#| output-location: column-fragment

if(10 > 5) {
  
  print("Yes, it is greater.")
  
} else {
  
  print("No, it is not greater.")
  
}
```
[1] "Yes, it is greater."
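
When there are more than two outcomes, extra tests can be chained with else if. The function below, describe_number(), is a hypothetical example written for this note; R checks each test from top to bottom and runs only the first branch that is TRUE.

```r
describe_number = function(x) {
  if (x > 5) {
    "greater than 5"
  } else if (x == 5) {
    "exactly 5"
  } else {
    # reached only when both tests above were FALSE
    "less than 5"
  }
}

describe_number(10)
describe_number(5)
describe_number(1)
```

Unlike ifelse(), if() expects a single TRUE or FALSE, so it is meant for one value at a time.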

switch

switch() compares a single value against a number of possible matches at once and returns the result for the one that matches.


It helps prevent long chains of if else if else if else ...


Oddly, switch() is not vectorized, meaning it can only accept a single value (very weird for R)

```{r}
# set a value (like the input in an argument)
example1 = "cat"

# whatever that matches, return that.
switch(example1,
       "spider" = "<unintelligible chittering>",
       "cow" = "moo",
       "cat" = "meow",
       "dog" = "woof",
       "chicken" = "cluck",
       "human" = "why are you recording me",
       "rock" = "",
       stop("I don't know!"))
```
[1] "meow"
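
Because switch() takes one value at a time, a common workaround is to wrap it in a function and apply that function to each element. The sketch below assumes a hypothetical helper, animal_sound(), with a shortened list of animals; vapply() is one of several base R ways to do the looping.

```r
# Hypothetical wrapper around switch() for a single animal
animal_sound = function(animal) {
  switch(animal,
         "cow" = "moo",
         "cat" = "meow",
         "dog" = "woof",
         stop("I don't know!"))
}

animals = c("cat", "dog", "cow")

# Call animal_sound() once per element; character(1) says each
# call must return exactly one string
sounds = vapply(animals, animal_sound, character(1), USE.NAMES = FALSE)
sounds
```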

dplyr::case_when()

dplyr::case_when() does something similar with different notation.


The upside is that you can pass it whole vectors, which switch() won’t take.


The downside is that it can be a bit harder to set up.

```{r}
example2 = 1:10

dplyr::case_when(
  example2 == 2 ~ "PRIME!",
  example2 == 3 ~ "PRIME!",
  example2 == 5 ~ "PRIME!",
  example2 == 7 ~ "PRIME!",
  TRUE ~ "not prime ..."
)
```
 [1] "not prime ..." "PRIME!"        "PRIME!"        "not prime ..."
 [5] "PRIME!"        "not prime ..." "PRIME!"        "not prime ..."
 [9] "not prime ..." "not prime ..."
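
In the call above, case_when() checks the conditions in order for each element, and the final TRUE line acts as the catch-all. For comparison only, the same labels can be produced in base R with %in% and ifelse(). This is a different technique from the one on the slide, shown here as a sketch:

```r
example2 = 1:10

# %in% asks, element by element, "is this value one of 2, 3, 5, 7?"
prime_labels = ifelse(example2 %in% c(2, 3, 5, 7),
                      "PRIME!",
                      "not prime ...")
prime_labels
```

This works well for a single yes/no split; case_when() tends to read better once there are several different outcomes.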

All Rely on Tests

Comparison operators return TRUE or FALSE, which lets you ask for only the things that meet a condition.

  • == - Equal to
  • != - Not equal to
  • > - Greater than
  • >= - Greater than or equal to
  • < - Less than
  • <= - Less than or equal to

For example:

vector <- c(1, 3, 5, 7, 9, 11)

  • vector[vector > 5]: c(7, 9, 11)
  • vector[vector <= 5]: c(1, 3, 5)
  • vector[vector == 5]: c(5)
  • vector[vector != 5]: c(1, 3, 7, 9, 11)
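
What makes the examples above work is that the comparison itself returns a vector of TRUE and FALSE values, and the square brackets keep only the TRUE positions. A short sketch, with is_big as a made-up name for the intermediate result:

```r
vector <- c(1, 3, 5, 7, 9, 11)

# The test on its own: one TRUE or FALSE per element
is_big = vector > 5
is_big

# Use the test to keep only the matching elements
vector[is_big]

# TRUE counts as 1, so sum() counts how many passed the test
sum(is_big)

# which() gives the positions that passed
which(is_big)
```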

Errors - Stop Execution

We can use conditions to create our own error messages using stop().

```{r}
#| error: true

add_me <- function(x, y) {
  
  if(!is.numeric(x)) {stop("X isn't numeric!")}
  if(!is.numeric(y)) {stop("Y isn't numeric!")}
  
  # Step 1
  result = x + y
  
  # Return Results
  return(result)
}
```
```{r}
#| error: true
#| output-location: fragment

"8" + 1
```
Error in "8" + 1: non-numeric argument to binary operator


```{r}
#| error: true
#| output-location: fragment

add_me("8", 1)
```
Error in add_me("8", 1): X isn't numeric!

Warnings - Something Needs Attention

We can create our own warning messages using warning().

```{r}
add_me <- function(x, y) {
  
  if(!is.numeric(x)) {
    warning("Your X wasn't numeric!
            I tried to fix it.")
    x = as.numeric(x)
  }
  
  if(!is.numeric(y)) {
    warning("Your Y wasn't numeric!
            I tried to fix it.")
    y = as.numeric(y)
  }
  
  # Step 1
  result = x + y
  
  # Return Results
  return(result)
}
```
```{r}
#| error: true
#| output-location: fragment

"8" + 1
```
Error in "8" + 1: non-numeric argument to binary operator


```{r}
#| error: true
#| warning: true
#| output-location: fragment

add_me("8", 1)
```
Warning in add_me("8", 1): Your X wasn't numeric!
            I tried to fix it.
[1] 9
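
One caveat with the "try to fix it" approach: as.numeric() can only convert text that looks like a number. The sketch below repeats the add_me() definition (with the warning text on one line) and feeds it "eight". The seen vector and the withCallingHandlers() wrapper are additions for this illustration, used only to collect the warnings so they can be inspected.

```r
add_me <- function(x, y) {
  if (!is.numeric(x)) {
    warning("Your X wasn't numeric! I tried to fix it.")
    x = as.numeric(x)
  }
  if (!is.numeric(y)) {
    warning("Your Y wasn't numeric! I tried to fix it.")
    y = as.numeric(y)
  }
  x + y
}

# Collect every warning raised during the call
seen = character(0)
result = withCallingHandlers(
  add_me("eight", 1),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

result   # NA, because "eight" could not be converted
seen     # our warning, plus R's own warning about the failed conversion
```

The function still returns without an error, so a warning is the only sign that the result is not trustworthy.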

Messages - Helpful to Know

We can simply report progress using message().

```{r}
add_me <- function(x, y, verbose = TRUE) {
  
  if(verbose){message("Testing inputs ...")}
  
  if(is.numeric(x) && is.numeric(y)){
    if(verbose){message("All good inputs! Adding ...")}
  }
  
  if(!is.numeric(x)) {
    warning("Your X wasn't numeric!
            I tried to fix it.")
    x = as.numeric(x)
  }
  
  if(!is.numeric(y)) {
    warning("Your Y wasn't numeric!
            I tried to fix it.")
    y = as.numeric(y)
  }
  
  # Step 1
  result = x + y
  
  # Return Results
  return(result)
}
```
```{r}
#| error: true
#| warning: true
#| output-location: fragment

add_me(8, 1)
```
Testing inputs ...
All good inputs! Adding ...
[1] 9


```{r}
#| error: true
#| warning: true
#| output-location: fragment

add_me(8, 1, verbose = FALSE)
```
[1] 9
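
A verbose argument is one way to silence messages. Another, built into R, is to wrap the call in suppressMessages(), which hides anything sent with message() while leaving the returned value alone. The sketch uses a trimmed-down add_me() (the input checks are left out to keep it short):

```r
# Trimmed-down version of add_me() for illustration
add_me <- function(x, y, verbose = TRUE) {
  if (verbose) message("Testing inputs ...")
  x + y
}

# Prints the message, then returns 9
add_me(8, 1)

# Same call, but the message is hidden by the caller
quiet_result = suppressMessages(add_me(8, 1))
quiet_result
```

This is part of why message() is preferred over print() for progress reports: the person calling the function can turn it off.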

A Useful Example

```{r}
pet_years = function(pet_age, type) {
  
  # set default human_age if type not found
  human_age = NA
  
  # if pet is a dog
  if(type == "dog"){
    human_age = (16 * log(pet_age)) + 31
  }
  
  # if pet is a cat
  if(type == "cat"){
    human_age = -14.4 + 21.58275*pet_age - 2.98951*pet_age^2 + 0.1550117*pet_age^3
  }
  
  # return
  return(human_age)
}
```
```{r}
print("Your dog is __ years old.")
pet_years(6, "dog")
```
[1] "Your dog is __ years old."
[1] 59.66815


```{r}
print("Your cat is __ years old.")
pet_years(4, "cat")
```
[1] "Your cat is __ years old."
[1] 34.01959
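
The pieces from earlier slides can be combined here. The sketch below is one possible variation, not a replacement for the version above: pet_years_checked() is a hypothetical name, it uses switch() instead of two separate if() blocks, stops on non-numeric ages, and warns when the pet type has no formula.

```r
pet_years_checked = function(pet_age, type) {

  # stop early if the age can't be used in the formulas
  if (!is.numeric(pet_age)) {stop("pet_age isn't numeric!")}

  # pick the formula that matches the pet type
  switch(type,
         "dog" = (16 * log(pet_age)) + 31,
         "cat" = -14.4 + 21.58275*pet_age - 2.98951*pet_age^2 + 0.1550117*pet_age^3,
         {
           # default branch: no formula for this type
           warning("No formula for type '", type, "'. Returning NA.")
           NA
         })
}

# Fill in the blank with paste() instead of printing it separately
paste("Your dog is", round(pet_years_checked(6, "dog")), "years old.")
```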

Keep in Mind the Flow

Functions can seem instantaneous, but always keep in mind that they run as a sequence of steps, one line after another.

The Dangers of Thinking Machines

We are starting to create code that will work without our direct involvement.


While what we create now is simple, that will not always be the case.


The same errors we encounter now plague much more serious endeavors.

Code-Along

Example 1

```{r}
#| output-location: column-fragment
#| error: true

example = c(1, 2, 3, 3, 4, 5, 7)

error_example = function(numeric_vector) {
  
  # Sum it up!
  vec_sum = sum(numeric_vector)
  
  # divide by sum!
  percents = numeric_vector / vec_sum
  
  # Get a table!
  vec_table = table(numeric_vector)
  
  # Sum the possibilities!
  sum_names = sum(names(vec_table))
  
  # return the table!
  return(vec_table)
}

error_example(example)
```
Error in sum(names(vec_table)): invalid 'type' (character) of argument

Example 2

# Make example
example1 = c(1, 2, 3, 3, 4, 5, 7)
example2 = c(5, 5, 5, 5, 5, 5, 5)

# re-create my failed mode from Monday
get_mode = function(...){
  
  # unlist all input to find mode
  flat = unlist(...)
  
  # find the most common input
  mode_table = table(flat)
  
  # get the name of the most common element
  our_mode = names(which.max(mode_table))
  
  # return output
  return(our_mode)
}

# Bad result!
get_mode(example1, example2)
[1] "3"

Why is it giving us “3”?
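
A hint, offered as one likely explanation rather than the official answer: unlist(...) becomes unlist(example1, example2), so only the first vector is treated as data and the second is matched to one of unlist()'s other arguments. The sketch below shows one possible repair, bundling everything into a list first; get_mode_fixed() is a hypothetical name.

```r
example1 = c(1, 2, 3, 3, 4, 5, 7)
example2 = c(5, 5, 5, 5, 5, 5, 5)

get_mode_fixed = function(...){

  # gather ALL inputs into one list, then flatten it
  flat = unlist(list(...))

  # find the most common input
  mode_table = table(flat)

  # get the name of the most common element
  our_mode = names(which.max(mode_table))

  # return output
  return(our_mode)
}

get_mode_fixed(example1, example2)   # "5"
```

Stepping through the original get_mode() with debugonce() and printing flat is a good way to confirm this for yourself.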

For Next Time

Topic

Lab 2

To-Do

  • Finish Worksheet