Spring 2023
Smith College
To learn the primary debugging tools in R.
object X not found / could not find X
subscript out of bounds
non-numeric argument to binary operator
replacement has...
#| echo: fenced
example = c(1, 2, 3, 3, 4, 5, 7)
error_example = function(numeric_vector) {
  # Sum it up!
  vec_sum = sum(numeric_vector)
  # divide by sum!
  percents = numeric_vector / vec_sum
  # Get a table!
  vec_table = table(numeric_vector)
  # Sum the possibilities!
  sum_names = sum(names(vec_table))
  # return the table!
  return(vec_table)
}
error_example(example)
Error in sum(names(vec_table)): invalid 'type' (character) of argument
# Make example
example1 = c(1, 2, 3, 3, 4, 5, 7)
example2 = c(5, 5, 5, 5, 5, 5, 5)
# re-create my failed mode from Monday
get_mode = function(...){
  # unlist all input to find mode
  flat = unlist(...)
  # find the most common input
  mode_table = table(flat)
  # get the name of the most common element
  our_mode = names(which.max(mode_table))
  # return output
  return(our_mode)
}
# Bad result!
get_mode(example1, example2)
[1] "3"
traceback()
debug() / undebug() / debugonce()
browser()
More on these in the Code-Along
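As a preview, here is a minimal sketch of how the debug flag is switched on and off. It uses a shortened copy of `error_example()` and only inspects the flag with `isdebugged()`, so it runs without opening the interactive browser. The interactive steps are left as comments.

```r
# Shortened copy of the failing function, for illustration only
error_example <- function(numeric_vector) {
  vec_table <- table(numeric_vector)
  sum(names(vec_table))  # fails: names() returns characters
}

# debug() flags a function so every call opens the step-through browser
debug(error_example)
flag_on <- isdebugged(error_example)
flag_on   # TRUE

# undebug() removes the flag again
undebug(error_example)
flag_off <- isdebugged(error_example)
flag_off  # FALSE

# In an interactive session you could instead run:
#   debugonce(error_example)      # flag the next call only
#   error_example(c(1, 2, 3))     # step through line by line
#   traceback()                   # after an uncaught error, print the call stack
# Placing browser() inside a function body pauses execution at that line.
```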
switch() tests a single value against a number of possible matches at once.
It helps prevent long chains of if else if else if else ...
Oddly, switch() is not vectorized, meaning it can only accept a single value (very weird for R).
```{r}
# set a value (like the input in an argument)
example1 = "cat"
# whatever that matches, return that.
switch(example1,
       "spider" = "<unintelligible chittering>",
       "cow" = "moo",
       "cat" = "meow",
       "dog" = "woof",
       "chicken" = "cluck",
       "human" = "why are you recording me",
       "rock" = "",
       stop("I don't know!"))
```
[1] "meow"
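Because switch() only accepts one value, handing it a whole vector fails with an error. One possible workaround, sketched below, is to wrap the switch() in a small function and apply it to each element with `vapply()`. The names `speak` and `animals` are made up for this example, and the list of animals is shortened.

```r
animals <- c("cat", "dog", "cow")

# Hypothetical helper wrapping a shortened version of the switch() above
speak <- function(animal) {
  switch(animal,
         "cow" = "moo",
         "cat" = "meow",
         "dog" = "woof",
         stop("I don't know!"))
}

# switch(animals, ...) would fail because EXPR must have length 1,
# so we call speak() once per element instead
sounds <- vapply(animals, speak, character(1), USE.NAMES = FALSE)
sounds
```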
dplyr::case_when() does something similar with different notation.
The upside is that you can pass it whole vectors, which switch() won’t take.
The downside is that it can be a bit harder to set up.
```{r}
example2 = 1:10
dplyr::case_when(
  example2 == 2 ~ "PRIME!",
  example2 == 3 ~ "PRIME!",
  example2 == 5 ~ "PRIME!",
  example2 == 7 ~ "PRIME!",
  TRUE ~ "not prime ..."
)
```
[1] "not prime ..." "PRIME!" "PRIME!" "not prime ..."
[5] "PRIME!" "not prime ..." "PRIME!" "not prime ..."
[9] "not prime ..." "not prime ..."
Comparison operators help you ask for things when a condition is TRUE.
- == : Equal to
- != : Not equal to
- > : Greater than
- >= : Greater than or equal to
- < : Less than
- <= : Less than or equal to

For example:
vector <- c(1, 3, 5, 7, 9, 11)
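A few comparisons on this vector, as a minimal sketch. Each comparison returns one TRUE or FALSE per element, which can then be used to subset or count. The result names (`is_big`, `big_values`, and so on) are chosen here for illustration.

```r
vector <- c(1, 3, 5, 7, 9, 11)

# One TRUE/FALSE per element
is_big <- vector > 5
is_big               # FALSE FALSE FALSE TRUE TRUE TRUE

# Keep only the elements where the condition is TRUE
big_values <- vector[vector > 5]
big_values           # 7 9 11

not_five <- vector[vector != 5]
not_five             # 1 3 7 9 11

# TRUE counts as 1, so sum() counts how many elements match
n_small <- sum(vector <= 3)
n_small              # 2
```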
We can use conditions to create our own error messages using stop().
We can create our own warning messages using warning().
We can simply report progress using message().
```{r}
add_me <- function(x, y, verbose = TRUE) {
  if (verbose) {
    message("Testing inputs ...")
  }
  # Both inputs are already numeric: nothing to fix
  if (is.numeric(x) && is.numeric(y)) {
    if (verbose) {
      message("All good inputs! Adding ...")
    }
  }
  # Coerce x if needed, and warn the user about it
  if (!is.numeric(x)) {
    warning("Your X wasn't numeric! I tried to fix it.")
    x = as.numeric(x)
  }
  # Coerce y if needed, and warn the user about it
  if (!is.numeric(y)) {
    warning("Your Y wasn't numeric! I tried to fix it.")
    y = as.numeric(y)
  }
  # Step 1
  result = x + y
  # Return Results
  return(result)
}
```
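The function above uses message() and warning() but never stop(). Here is a minimal sketch that uses all three. The helper `safe_divide()` is hypothetical (not part of the lab), and `tryCatch()` is used only to capture the error message so the script keeps running.

```r
# Hypothetical helper, for illustration only
safe_divide <- function(x, y, verbose = TRUE) {
  if (verbose) message("Testing inputs ...")
  # stop() halts the function immediately with an error
  if (!is.numeric(x) || !is.numeric(y)) {
    stop("Both x and y must be numeric.")
  }
  # warning() flags a problem but lets the function keep going
  if (any(y == 0)) {
    warning("Dividing by zero; expect Inf or NaN in the result.")
  }
  x / y
}

# Prints the progress message, then returns 5
safe_divide(10, 2)

# Capture the error message instead of halting the script
caught <- tryCatch(
  safe_divide("10", 2, verbose = FALSE),
  error = function(e) conditionMessage(e)
)
caught
```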
```{r}
pet_years = function(pet_age, type) {
  # set default human_age if type not found
  human_age = NA
  # if pet is a dog
  if (type == "dog") {
    human_age = (16 * log(pet_age)) + 31
  }
  # if pet is a cat
  if (type == "cat") {
    human_age = -14.4 + 21.58275*pet_age - 2.98951*pet_age^2 + 0.1550117*pet_age^3
  }
  # return
  return(human_age)
}
```
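The same logic can also be written with switch(), as sketched below. The name `pet_years_switch` is made up for this example. It deliberately behaves differently in one respect: an unknown type raises an error with stop() instead of quietly returning NA.

```r
# Hypothetical variant of pet_years(), using the same formulas
pet_years_switch = function(pet_age, type) {
  switch(type,
         "dog" = (16 * log(pet_age)) + 31,
         "cat" = -14.4 + 21.58275*pet_age - 2.98951*pet_age^2 + 0.1550117*pet_age^3,
         # fallback: no match, so fail loudly instead of returning NA
         stop("Unknown pet type: ", type))
}

dog_result = pet_years_switch(1, "dog")
dog_result  # 31, because log(1) is 0
```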
While it can seem functions are instant, always keep in mind they are a sequence of steps.
We are starting to create code that will work without our direct involvement.
While what we create now is simple, that will not always be the case.
The same errors we encounter now plague much more serious endeavors.
```{r}
#| output-location: column-fragment
#| error: true
example = c(1, 2, 3, 3, 4, 5, 7)
error_example = function(numeric_vector) {
  # Sum it up!
  vec_sum = sum(numeric_vector)
  # divide by sum!
  percents = numeric_vector / vec_sum
  # Get a table!
  vec_table = table(numeric_vector)
  # Sum the possibilities!
  sum_names = sum(names(vec_table))
  # return the table!
  return(vec_table)
}
error_example(example)
```
Error in sum(names(vec_table)): invalid 'type' (character) of argument
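The error message points at `sum(names(vec_table))`. A quick way to see why, sketched below: the names of a table are character strings, and sum() cannot add characters. Converting them with `as.numeric()` first is one possible repair, assuming the intent was to add up the distinct values.

```r
example <- c(1, 2, 3, 3, 4, 5, 7)
vec_table <- table(example)

# The names are text, not numbers
name_class <- class(names(vec_table))
name_class   # "character"

# Convert to numeric before summing
sum_names <- sum(as.numeric(names(vec_table)))
sum_names    # 1 + 2 + 3 + 4 + 5 + 7 = 22
```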
# Make example
example1 = c(1, 2, 3, 3, 4, 5, 7)
example2 = c(5, 5, 5, 5, 5, 5, 5)
# re-create my failed mode from Monday
get_mode = function(...){
  # unlist all input to find mode
  flat = unlist(...)
  # find the most common input
  mode_table = table(flat)
  # get the name of the most common element
  our_mode = names(which.max(mode_table))
  # return output
  return(our_mode)
}
# Bad result!
get_mode(example1, example2)
[1] "3"
Why is it giving us “3”?
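One likely explanation: unlist() expects a single object as its first argument, so in `unlist(...)` only the first vector is flattened, and the second is matched to the `recursive` argument rather than being treated as data. The mode of the first vector alone is "3". A minimal sketch of a fix is to gather the inputs with `list(...)` first. The name `get_mode_fixed` is made up for this example.

```r
example1 = c(1, 2, 3, 3, 4, 5, 7)
example2 = c(5, 5, 5, 5, 5, 5, 5)

get_mode_fixed = function(...) {
  # collect every input into one list, then flatten it
  flat = unlist(list(...))
  # find the most common input
  mode_table = table(flat)
  # get the name of the most common element
  our_mode = names(which.max(mode_table))
  return(our_mode)
}

fixed_mode = get_mode_fixed(example1, example2)
fixed_mode  # "5": the value 5 appears eight times across both vectors
```

Note that the result is still a character string, because it comes from the names of the table.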
Lab 2
SDS 270: Advanced Programming for Data Science