Daniel Jacobs. Presented at Advanced R Book Club
9/7/2019
OO YEAH!
S3=>
Today we will cover!
Expect to understand:
Raise your hand if you are sorta familiar with…
Only one method you need
library(R6)
Accumulator <- R6Class("Accumulator", list(
sum = 0,
add = function(x = 1) {
self$sum <- self$sum + x
invisible(self)
})
)
acc = Accumulator$new()
Access properties with $
acc$sum
[1] 0
acc$add(1)
acc$sum
[1] 1
Method Chaining (without the %>%!)
acc$add(10)$add(10)$sum
[1] 21
Defining $initialize will modify the behavior of “new”
Person <- R6Class("Person", list(
name = NULL,
age = NA,
initialize = function(name, age = NA) {
stopifnot(is.character(name), length(name) == 1)
stopifnot(is.numeric(age), length(age) == 1)
self$name <- name
self$age <- age
}
))
Defining $print will modify the default printing of the object
Looks like this
AccumulatorChatty <- R6Class("AccumulatorChatty",
inherit = Accumulator,
public = list(
add = function(x = 1) {
cat("Adding ", x, "\n", sep = "")
super$add(x = x) #super is alternative to 'NextMethod' in S3
}
)
)
names(acc)
[1] ".__enclos_env__" "sum" "clone" "add"
getters and setters -> yes it's a thing
public/private elements -> also a thing.
a1 = c(0)
a2 = a1
a2 = a2 + 10
y1 <- Accumulator$new()
y1$sum
[1] 0
y2 <- y1
y1$add(10)
c(y1 = y1$sum, y2 = y2$sum)
a1 = c(0)
a2 = a1
a2 =a2 + 10
c( a1 = a1, a2 = a2 )
a1 a2
0 10
y1 <- Accumulator$new()
y2 <- y1
y1$add(10)
c(y1 = y1$sum, y2 = y2$sum)
y1 y2
10 10
How would you implement a stack in R?
stack = new_stack( c(1,2,3))
stack = push(stack, 4) #this is ok
result = pop(stack)
result # 4
result = pop(stack)
result # 3
result = pop(stack)
result # 2
new_stack <- function(items = list()) {
structure(list(items = items), class = "stack")
}
push <- function(x, y) {
x$items <- c(x$items, list(y))
x
}
pop <- function(x) {
n <- length(x$items)
item <- x$items[[n]]
x$items <- x$items[-n]
list(item = item, x = x) #UGLY!
}
There is a package called 'progress' for rendering a progress bar. Why does it use R6?
pb <- progress_bar$new(
format = "(:spin) [:bar] :percent",
total = 30, clear = FALSE, width = 60)
for (i in 1:30) {
pb$tick()
Sys.sleep(40 / 100)
}
RC is a 'reference class' system built into R. Hadley prefers R6 because it's simpler and the docs are better: https://r6.r-lib.org
Learn R6 @hadley says
Discussion: What commonalities do these packages have?
lubridate::period() returns an S4 class. What slots does it have? What class is each slot? What accessors does it provide?
library(lubridate)
p = lubridate::period(12)
p@.Data # this is how you access a 'slot'
[1] 12
str(p)
Formal class 'Period' [package "lubridate"] with 6 slots
..@ .Data : num 12
..@ year : num 0
..@ month : num 0
..@ day : num 0
..@ hour : num 0
..@ minute: num 0
class?Period
class?h2o::H2OModel
Here are some packages that use S4
Notice any patterns? It's not obvious!
What do these have in common?
They are mappings of things from more formal Object Oriented languages.
Here are a bunch of examples of how S4 works.
setClass("Person",
slots = c(
name = "character",
age = "numeric"
))
Construct an object with “new”
john <- new("Person", name = "John Smith", age = NA_real_)
S4 classes have slots ( Similar to attributes )
Access a “slot” with the '@' symbol
john@name
[1] "John Smith"
hadley recommends creating getters and setters and only using '@' in internal methods.
First create generics
setGeneric("age", function(x) standardGeneric("age"))
[1] "age"
setGeneric("age<-", function(x, value) standardGeneric("age<-"))
[1] "age<-"
Then we can create getters and setters
setMethod("age", "Person", function(x) x@age)
setMethod("age<-", "Person", function(x, value) {
x@age <- value
x
})
age(john) <- 50
age(john)
[1] 50
To print an S4 model:
show(john)
An object of class "Person"
Slot "name":
[1] "John Smith"
Slot "age":
[1] 50
To look up documentation:
Look up S4 class docs like this
You can set a default value:
setClass("Person",
slots = c(
name = "character",
age = "numeric"
),
prototype = list(
name = NA_character_,
age = NA_real_
)
)
me <- new("Person", name = "Hadley")
str(me)
Formal class 'Person' [package ".GlobalEnv"] with 2 slots
..@ name: chr "Hadley"
..@ age : num NA
setClass("Employee",
contains = "Person", #This is how it knows to inherit!
slots = c(
boss = "Person"
),
prototype = list(
boss = new("Person")
)
)
If you insantiate an object, and then redefine a class, it doesn't go well!
hadley & Martin Morgan suggest using 'new' internally and exposing a separate constructor with nice warning messages:
Person <- function(name, age = NA) {
age <- as.double(age)
new("Person", name = name, age = age)
}
This is called automatically when you call 'new', and will throw an error if not TRUE.
setValidity("Person", function(object) {
if (length(object@name) != length(object@age)) {
"@name and @age must be same length"
} else {
TRUE
}
})
try({
new("Person", name = "Danimal", age = c(10, 23))
})
Error in validObject(.Object) :
invalid class "Person" object: @name and @age must be same length
You can call validObject at any time to check. It's only called automatically in the constructor
try({
alex <- Person("Alex", age = 30)
alex@age <- 1:10
validObject(alex)
})
Error in validObject(alex) :
invalid class "Person" object: @name and @age must be same length
Create a new generic like this
setGeneric("myGeneric", function(x) standardGeneric("myGeneric"))
[1] "myGeneric"
Instead of this:
myGeneric = function(x){ UseMethod("myGeneric")}
Arguments are : generic; 'signature' and function
setMethod("myGeneric", "Person", function(x) {
# method implementation
})
For reference
methods("generic") #To find all methods for generic
methods(class ='class') #To find methods for class
selectMethod("generic", "class") # to see an implementation
How to lookup an existing generic:
args(getGeneric("show"))
function (object)
NULL
How to implement an existing generic:
setMethod("show", "Person", function(object) {
cat(is(object)[[1]], "\n",
" Name: ", object@name, "\n",
" Age: ", object@age, "\n",
sep = ""
)
})
Here are some packages that use S4
Package | explained |
---|---|
DBI | OK |
RCpp | OK |
H20 | OK |
bioconductor | ? |
lubridate | ? |
colorspace | ? |
Matrix | ? |
The lubridate package has periods, intervals, and durations.
Durations: the way physicists think about time ( it always goes up, one second at a time )
Periods: They way humans think about time. Times are measured in years, month and days. However! The number of seconds in a period might vary, based on when the period started ( thanks to leap years )
Interval: a number of seconds that starts at a specific date
Numeric: sometimes you just want to represent a number as a number of seconds
Why does lubridate use S4? (Or rather, what might be difficult in S3?)
Answer: lubridate wants to combine these classes in all kinds of ways. That's clunky in S3.
There is no easy way in S3 to change your method definition based on the types of two arguments!
`%within%`.Period= function( val ){
if ( class(val) == 'numeric' ){ .... }
if ( class(val) == 'Period' ){ .... }
if ( class(val) == 'POSIXct' ){ .... }
}
For multiple dispatch ( which is annoying in S3)
#' @export
setMethod("+", signature(e1 = "Period", e2 = "Period"),
function(e1, e2) add_period_to_period(e2, e1))
#' @export
setMethod("+", signature(e1 = "Period", e2 = "Date"),
function(e1, e2) add_period_to_date(e1, e2))
#' @export
setMethod("+", signature(e1 = "Period", e2 = "numeric"),
function(e1, e2) add_number_to_period(e2, e1))
#' @export
setMethod("+", signature(e1 = "Period", e2 = "POSIXct"),
function(e1, e2) add_period_to_date(e1, e2))
#' @export
setMethod("+", signature(e1 = "Period", e2 = "POSIXlt"),
function(e1, e2) add_period_to_date(e1, e2))
#' @export
setMethod("+", signature(e1 = "Date", e2 = "Duration"),
function(e1, e2) add_duration_to_date(e2, e1))
#' @export
setMethod("+", signature(e1 = "Date", e2 = "Period"),
function(e1, e2) add_period_to_date(e2, e1))
#' @export
setMethod("+", signature(e1 = "difftime", e2 = "Duration")
Just because something is possible doesn't mean it is a good idea.
Either
Now you can use factor in S4 classes.
Package( classes )
The more times you remember a thing, the less likely you are to forget it.