18.0.1 Example Data

Import the example data. These data are benthic macroinvertebrate samples collected in the littoral zone of Onondaga, Otisco, and Cazenovia lakes.

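library(dplyr)      # %>%, select(), distinct(), mutate()
library(lubridate)  # mdy(), year(), round_date(), ...

taxa.df <- file.path("data",
          "zms_thesis-macro_2017-06-18.csv") %>% 
  read.csv(stringsAsFactors = FALSE)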

Preprocess taxa.df to only include unique instances of station IDs and sample dates. For more details about this process see the select and distinct sections.

dates.df <- taxa.df %>% 
  select(station_id, date) %>% 
  distinct()

DT::datatable(dates.df, options = list(columnDefs = list(list(className = 'dt-center', targets = 0:2))))
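
As a minimal sketch of what this preprocessing step does, the made-up data frame below (toy.df and its values are hypothetical, not taken from the thesis data) has two rows for the same station and date. Applying select() followed by distinct() collapses them into one.

```r
library(dplyr)

# Hypothetical data: two taxa recorded at the same station on the same date.
toy.df <- data.frame(station_id = c("A1", "A1", "B2"),
                     date = c("06/18/2017", "06/18/2017", "07/02/2017"),
                     taxon = c("Chironomidae", "Baetidae", "Chironomidae"),
                     stringsAsFactors = FALSE)

toy.dates.df <- toy.df %>% 
  select(station_id, date) %>%  # keep only the two columns of interest
  distinct()                    # drop repeated station/date combinations

nrow(toy.dates.df)
## [1] 2
```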

18.0.2 mdy, ymd, dmy, ymd_hms, …

In dates.df, the date column is imported as a character class and follows a “mm/dd/yyyy” format. The function mdy() can be used to convert the character strings in the date column to a date class.

mdy.df <- dates.df %>% 
  mutate(date = mdy(date))

DT::datatable(mdy.df, options = list(columnDefs = list(list(className = 'dt-center', targets = 0:2))))

In the example above, it is obvious that the format of the date has changed, but it is not obvious that the R class has changed. First, look at the classes represented in dates.df.

sapply(dates.df, class)
##  station_id        date 
## "character" "character"

Then, looking at the column classes in mdy.df, we can see that date has changed to class “Date”.

sapply(mdy.df, class)
##  station_id        date 
## "character"      "Date"
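
The other parsers named in this section's heading follow the same pattern: the letters in the function name give the order of the components in the input string. Here is a short sketch on made-up strings (the values are illustrative only and do not come from the example data):

```r
library(lubridate)

# y = year, m = month, d = day; the function name matches the input order.
from.ymd <- ymd("2017-06-18")
from.dmy <- dmy("18/06/2017")
from.mdy <- mdy("06/18/2017")

# All three strings describe the same day, so the parsed dates are equal.
from.ymd == from.dmy
## [1] TRUE
from.ymd == from.mdy
## [1] TRUE

# ymd_hms() also parses a time of day and returns a datetime (POSIXct), in UTC by default.
from.ymd_hms <- ymd_hms("2017-06-18 10:38:00")
class(from.ymd_hms)
## [1] "POSIXct" "POSIXt"
```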

18.0.3 year, month, mday, yday, hour, minute, and second

Once a column is a date or datetime class, lubridate functions make it easy to extract parts of the date, such as the year, month, day, hour, minute, and second. In the mutate() call below, I applied many, but not all, of the helpful functions for extracting datetime-related information. The majority of these are straightforward; however, we can change label and abbr to alter the output of functions like month() and wday(). Because date here has class “Date” with no time component, hour(), minute(), and second() all return 0.

  1. label
    • label = FALSE returns a numeric value
    • label = TRUE returns the name as an ordered factor
  2. abbr
    • If label = FALSE, then abbr has no effect
    • label = TRUE and abbr = TRUE returns an abbreviated name
      • day of the week: Sun, Mon, Tue, Wed, Thu, Fri, Sat
      • month: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec
    • label = TRUE and abbr = FALSE returns a full name
      • day of the week: Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday
      • month: January, February, March, April, May, June, July, August, September, October, November, December

extract.df <- mdy.df %>% 
  mutate(year = year(date),
         month_int = month(date),
         month_abv = month(date, label = TRUE),
         month_full = month(date, label = TRUE, abbr = FALSE),
         week = week(date),
         day = day(date),
         wday_int = wday(date),
         wday_abv = wday(date, label = TRUE),
         wday_full = wday(date, label = TRUE, abbr = FALSE),
         mday = mday(date),
         qday = qday(date),
         yday = yday(date),
         hour = hour(date),
         minute = minute(date),
         second = second(date))

DT::datatable(extract.df, options = list(scrollX = TRUE))
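
To see these behaviours in isolation, here is a small sketch on a single made-up date (x is a hypothetical value, not from the example data). The text labels returned with label = TRUE depend on the session locale, so only the locale-independent results are shown.

```r
library(lubridate)

x <- ymd("2017-06-18")  # made-up example value; this date was a Sunday

month(x)  # numeric month
## [1] 6
wday(x)   # numeric day of the week; weeks start on Sunday by default
## [1] 1
yday(x)   # day of the year
## [1] 169

# With label = TRUE the result is an ordered factor, not a plain character vector.
is.ordered(month(x, label = TRUE))
## [1] TRUE

# A Date has no time of day, so the time components are all zero.
c(hour(x), minute(x), second(x))
## [1] 0 0 0
```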

18.0.4 round_date, floor_date, and ceiling_date

round_date() will round the date or datetime to the nearest specified unit of time, such as “15 minutes”, “week”, “month”, or “year”. I find it really convenient that you can round to the nearest “15 minutes”. floor_date() and ceiling_date() provide similar functionality but always round down or up, respectively.

round.df <- mdy.df %>% 
  mutate(round_week = round_date(date, "week"),
         round_month = round_date(date, "month"),
         round_year = round_date(date, "year"),
         round_year5 = round_date(date, "5 years"),
         round_century = round_date(date, "100 years"),
         floor_month = floor_date(date, "month"),
         floor_year = floor_date(date, "year"),
         ceiling_month = ceiling_date(date, "month"),
         ceiling_year = ceiling_date(date, "year"))

DT::datatable(round.df, options = list(scrollX = TRUE,
                                       autoWidth = TRUE,
                                       columnDefs = list(list(width = '70px', targets = c(2)))))
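
The example data contain dates only, so the “15 minutes” unit mentioned above is not exercised there. Here is a minimal sketch on a made-up timestamp (x is hypothetical and parsed as UTC):

```r
library(lubridate)

x <- ymd_hms("2017-06-18 10:38:00")  # made-up timestamp, UTC by default

round.15 <- round_date(x, "15 minutes")      # 10:38 is nearer to 10:45 than to 10:30
floor.15 <- floor_date(x, "15 minutes")      # always rounds down
ceiling.15 <- ceiling_date(x, "15 minutes")  # always rounds up

round.15
## [1] "2017-06-18 10:45:00 UTC"
floor.15
## [1] "2017-06-18 10:30:00 UTC"
ceiling.15
## [1] "2017-06-18 10:45:00 UTC"
```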