metric-calculation_vignette.Rmd
This document demonstrates how to calculate common biological community metrics, such as richness, relative abundance, and diversity indices using the mmir R-package.
The package, devtools, must be installed to use the devtools::install_github()
function. You will also need to install the dplyr package to aid in data manipulation.
install.packages("devtools", "dplyr")
Install the Multi-Metric Index (MMI) package, mmir, that I am developing from GitHub.
devtools::install_github("zsmith27/mmir")
Once mmir is installed, load the packages with library()
.
library(mmir)
One of the major goals of the mmir package was to simplify the metric calculation procedure. Five base taxa functions (i.e., taxa_rich()
, taxa_pct_rich()
, taxa_div()
, taxa_pct()
, and taxa_abund()
) are capable of calculating the majority of metrics that are used in Multi-Metric Indices of biotic integrity.
As an example data set, mmir contains benthic macroinvertebrates data collected in the Northern Appalachians as part of the 2008-2009 National Rivers and Streams Assessment conducted by the United States Environmental Protection Agency.
data("nrsa_nap_0809", package = "mmir")
Use taxa_fill
to replace NA
s in the taxonomic hierarchy with the nearest identified taxonomic rank. Note that the order the columns are presented is used to create the hierarchy that identifies the previous taxonomic rank. For example, you want your columns to be ordered as: Phylum, Class, Order, Family, Genus.
nrsa_nap_fill <- nrsa_nap_0809 %>% taxa_fill(.final_id = target_taxon, .prefix = "unidentified", phylum:genus)
Create a data frame that will store the calculated metrics. This should include a unique site ID and possibly a few other important columns, such as stream gradient classification column (e.g., Reference and Degraded). I recommend not carrying all of the site information through the metric calculation process because it can easily be appended at the end of the calculation process or at another later step. This will reduce clutter, allowing you to focus on the metric values, and will help you manage memory allocation. Additionally, do NOT alter the order of the unique identifier. The output from each metric function is a vector in the order of the supplied data frame of taxonomic counts (in this case onondaga
). If the unique identifier is rearranged, no error will appear but the metric values will be associated with incorrect unique identifier.
nest.df <- nrsa_nap_fill %>% dplyr::group_nest(uid, rt_nrsa_cat, .key = "data")
Standard input variable definitions for functions in the mmir package.
.dataframe
= A data frame in a long data format, where each row represents a unique sampling event and taxon.key.col
= A single column that represents a unique ID or key (i.e., site ID) that can be used to group the data. This variable should be input using NSE syntax (i.e., not quoted)..group_col
= A single column that represents a lower resolution taxonomic rank than the taxonomic rank specified for taxa.col
. This variable should be input using NSE syntax (i.e., not quoted).taxa.col
= A single column that represents a higher resolution taxonomic rank than the taxonomic rank specified for .group_col
. This variable should be input using NSE syntax (i.e., not quoted).taxon
= A object or vector, generally a character string or a character vector, that is used subsets the data frame to only include rows that match the string(s) in taxa.col
..counts_col
= A single numeric column that represents taxonomic counts. This variable should be input using NSE syntax (i.e., not quoted).Richness refers to the number of unique taxa. Typically, richness is used to refer to the number of unique species found in a sample but richness can be calculated for any taxonomic rank (e.g., order, family, genus).
taxa_rich()
is used to calculate richness. A long format data frame containing taxonomic counts is specified as the .dataframe
variable (onondaga
). The name of a unique ID column is specified for the .key_col
variable (uid
). Note that this is done with NSE syntax (i.e., no quotes). The .group_col
and .filter_col
refer to low resolution and high resolution taxonomic rank columns of interest. .filter_col
is required but .group_col
is only necessary when calculating subgroup richness. Furthermore, the taxon
variable is only necessary for specifying which taxon or taxa will be used to calculate subgroup richness; therefore, when calculating community richness, taxon
should be set to NULL
. In this example family-level and genus-level community richness are calculated and appended to metrics.key
. DT::datatable()
is only used to present interactive tables in this document.
rich.df <- nest.df %>% dplyr::mutate( rich_family = taxa_rich(.dataframe = ., .key_col = uid, .group_col = family, .unnest_col = data), rich_genus = taxa_rich(.dataframe = ., .key_col = uid, .group_col = genus, .unnest_col = data) ) rich.df %>% dplyr::mutate(data = "nested dataframe") %>% head() %>% knitr::kable()
uid | rt_nrsa_cat | data | rich_family | rich_genus |
---|---|---|---|---|
11134 | least disturbed | nested dataframe | 25 | 47 |
11135 | least disturbed | nested dataframe | 26 | 54 |
11136 | least disturbed | nested dataframe | 28 | 48 |
11137 | least disturbed | nested dataframe | 23 | 36 |
11138 | least disturbed | nested dataframe | 26 | 37 |
11564 | most disturbed | nested dataframe | 19 | 36 |
taxa_rich()
can also be used to calculate subgroup richness, which refers to calculating the richness of only select set of taxa. In the example below Ephemeroptera (Mayfly) richness is calculated at the family- and genus-levels. More than one taxonomic group can be specified, as shown for rich_ept_gen
(taxon = c("ephemeroptera", "plecoptera", "trichoptera")
). The taxon
variable is used to filter the data frame based on character string matches found in specified .group_col
. In the example, the onondaga
data frame is subset to only include rows where the order
column specifies “ephemeroptera”. The .filter_col
(e.g., family
or genus
) is then used to find the number of unique ephemeroptera taxa at this specified taxonomic level. If taxon
was set to “baetidae” (a family of ephmeroptera) in the example below, then taxa_rich()
would return all zeros because the character string “baetidae” would never be found in the order-level column.
sub_rich.df <- nest.df %>% dplyr::mutate( rich_ephemeroptera_fam = taxa_rich(.dataframe = ., .key_col = uid, .group_col = family, .filter = order %in% "ephemeroptera", .unnest_col = data), rich_ephemeroptera_gen = taxa_rich(.dataframe = ., .key_col = uid, .group_col = genus, .filter = order %in% "ephemeroptera", .unnest_col = data), rich_ept_gen = taxa_rich(.dataframe = ., .key_col = uid, .group_col = genus, .filter = order %in% c("ephemeroptera", "plecoptera", "trichoptera"), .unnest_col = data) ) sub_rich.df %>% dplyr::mutate(data = "nested dataframe") %>% head() %>% knitr::kable()
uid | rt_nrsa_cat | data | rich_ephemeroptera_fam | rich_ephemeroptera_gen | rich_ept_gen |
---|---|---|---|---|---|
11134 | least disturbed | nested dataframe | 2 | 7 | 21 |
11135 | least disturbed | nested dataframe | 4 | 9 | 25 |
11136 | least disturbed | nested dataframe | 4 | 8 | 22 |
11137 | least disturbed | nested dataframe | 5 | 8 | 21 |
11138 | least disturbed | nested dataframe | 4 | 4 | 16 |
11564 | most disturbed | nested dataframe | 1 | 1 | 3 |
Diversity metric in general combine the measure of taxonomic evenness and taxonomic richness. ADD EVENNESS DESCRIPTION. A number of diversity metrics can be calculated using taxa_div()
: Shannon-Wiener Diversity (“shannon”), Simpson’s Diversity (“simpson”), Margalef’s Diversity (“margalef”), Menhinick’s Diversity (“menhinick”), and Pielou Evenness (“pielou”).
div.df <- nest.df %>% dplyr::mutate( shannon_genus = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = genus, .job = "shannon", .base_log = 2, .unnest_col = data), simpson_genus = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = genus, .job = "simpson", .unnest_col = data), margalef_genus = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = genus, .job = "margalef", .unnest_col = data), menhinick_genus = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = genus, .job = "menhinick", .unnest_col = data), pielou_genus = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = genus, .job = "pielou", .unnest_col = data) ) div.df %>% dplyr::mutate(data = "nested dataframe") %>% head() %>% knitr::kable()
uid | rt_nrsa_cat | data | shannon_genus | simpson_genus | margalef_genus | menhinick_genus | pielou_genus |
---|---|---|---|---|---|---|---|
11134 | least disturbed | nested dataframe | 4.074962 | 0.1018417 | 17.43422 | 2.253477 | 1.672098 |
11135 | least disturbed | nested dataframe | 4.584539 | 0.0694405 | 20.37699 | 2.703381 | 1.732394 |
11136 | least disturbed | nested dataframe | 4.202079 | 0.1019651 | 18.66180 | 2.642313 | 1.681241 |
11137 | least disturbed | nested dataframe | 3.723167 | 0.1819546 | 15.93880 | 2.873113 | 1.556302 |
11138 | least disturbed | nested dataframe | 2.869174 | 0.3291442 | 14.06301 | 1.941996 | 1.568202 |
11564 | most disturbed | nested dataframe | 2.678898 | 0.2865829 | 12.91861 | 1.590990 | 1.556302 |
sub_div.df <- nest.df %>% dplyr::mutate( gini_simpson_ept = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = order, .filter = genus %in% c("ephemeroptera", "plecoptera", "trichoptera"), .job = "gini_simpson", .unnest_col = data), simpson_ept = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = order, .filter = genus %in% c("ephemeroptera", "plecoptera", "trichoptera"), .job = "simpson", .unnest_col = data), shannon_ept = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = order, .filter = genus %in% c("ephemeroptera", "plecoptera", "trichoptera"), .job = "shannon", .base_log = 2, .unnest_col = data) ) sub_div.df %>% dplyr::mutate(data = "nested dataframe") %>% head() %>% knitr::kable()
dom.df <- nest.df %>% dplyr::mutate( dom_1_target_taxon = taxa_dom(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = target_taxon, .dom_level = 1, .unnest_col = data), dom_5_target_taxon = taxa_dom(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = target_taxon, .dom_level = 5, .unnest_col = data) ) dom.df %>% dplyr::mutate(data = "nested dataframe") %>% head() %>% knitr::kable()
uid | rt_nrsa_cat | data | dom_1_target_taxon | dom_5_target_taxon |
---|---|---|---|---|
11134 | least disturbed | nested dataframe | 21.37931 | 63.67816 |
11135 | least disturbed | nested dataframe | 18.04511 | 48.37093 |
11136 | least disturbed | nested dataframe | 23.33333 | 58.48485 |
11137 | least disturbed | nested dataframe | 40.12739 | 63.05732 |
11138 | least disturbed | nested dataframe | 55.64738 | 76.30854 |
11564 | most disturbed | nested dataframe | 42.77344 | 83.78906 |
taxa_pct()
calculates the percentage of the sample represented by a taxon or taxa. The example below shows how to calculate the percentage of a single taxon, Ephemeroptera (pct_ephemeroptera
), and multiple taxa, EPT (pct_ept
). Remember that .filter_col
refers to the name of the column that contains the taxon or taxa of interest. The .filter_vec
variable is then used to specify these taxon or taxa.
pct.df <- nest.df %>% dplyr::mutate( pct_ephemeroptera = taxa_pct(.dataframe = ., .key_col = uid, .counts_col = total, .filter = order %in% "ephemeroptera", .unnest_col = data), pct_ept = taxa_pct(.dataframe = ., .key_col = uid, .counts_col = total, .filter = order %in% c("ephemeroptera", "plecoptera", "trichoptera"), .unnest_col = data) ) pct.df %>% dplyr::mutate(data = "nested dataframe") %>% head() %>% knitr::kable()
uid | rt_nrsa_cat | data | pct_ephemeroptera | pct_ept |
---|---|---|---|---|
11134 | least disturbed | nested dataframe | 5.287356 | 17.93103 |
11135 | least disturbed | nested dataframe | 26.817043 | 60.40100 |
11136 | least disturbed | nested dataframe | 12.424242 | 41.51515 |
11137 | least disturbed | nested dataframe | 15.923567 | 77.07006 |
11138 | least disturbed | nested dataframe | 3.581267 | 26.44628 |
11564 | most disturbed | nested dataframe | 1.953125 | 2.34375 |
Abundance refers to reported taxonomic count but may represent counts aggregated at a lower resolution taxonomic rank. For example, taxa may be reported at the genus-level but abundance could be calculated at the order-level. Generally, I have not found abundance metrics used in macorinvertebrate indices but they are common in fish indices. taxa_abund()
has the same input as taxa_pct()
(See Percentages).
abund.df <- nest.df %>% dplyr::mutate( abund_ephemeroptera = taxa_abund(., .key_col = uid, .counts_col = total, .filter = order %in% "ephemeroptera", .unnest_col = data), abund_ept = taxa_abund(., .key_col = uid, .counts_col = total, .filter = order %in% c("ephemeroptera", "plecoptera", "trichoptera"), .unnest_col = data) ) abund.df %>% dplyr::mutate(data = "nested dataframe") %>% head() %>% knitr::kable()
uid | rt_nrsa_cat | data | abund_ephemeroptera | abund_ept |
---|---|---|---|---|
11134 | least disturbed | nested dataframe | 23 | 78 |
11135 | least disturbed | nested dataframe | 107 | 241 |
11136 | least disturbed | nested dataframe | 41 | 137 |
11137 | least disturbed | nested dataframe | 25 | 121 |
11138 | least disturbed | nested dataframe | 13 | 96 |
11564 | most disturbed | nested dataframe | 10 | 12 |
When developing a new index, the developer usually tries to evaluate as many metrics as possible to obtain the most sensitive metrics. Writing individual lines of code for each taxon using taxa_rich()
, taxa_pct_rich()
, taxa_pct()
, or taxa_abund()
would be very time consuming and prone to typos. taxa_seq()
was developed to make these calculations simple and safe.
This is a wrapper function around the previously discussed taxa metrics (taxa_rich()
, taxa_pct_rich()
, taxa_pct()
, or taxa_abund()
), which loops through each taxon in the specified taxonomic rank or taxonomic attribute column(s) calculating taxon richness, taxon percent richness, taxon percentages, or taxon abundance. The input variable, job
, is used to specify which type of metrics to calculate (i.e., “rich”, “pct_rich”, “pct”, and “abund”). The character string used to specify the job
will be used as a prefix for the column names (e.g., “rich_ephemeroptera” or “pct_plecoptera”). The remaining inputs are the same as the base taxa functions (taxa_rich()
, taxa_pct_rich()
, taxa_pct()
, or taxa_abund()
) that taxa_seq()
wraps around. This also means that there will frequently be many input variables that you do not need to use. For example, if job = "pct"
, then hich.taxa.col
, base.log
, and q
do not need to be specified because these variables are only used for richness and/or diversity related calculations. It is often helpful to refer back to the base taxa function to identify the necessary inputs for the specified job. Additionally, each metric only represents a single taxon. The output will therefore not include metrics, such as the percentage of EPT taxa or the richness of Gathers’ and Filter Feeder taxa. These metrics will need to be calculated individually using the base taxa functions.
The example below calculates taxa richness, percent taxa richness, taxa percentage, and taxa abundance for all of the taxa under the order and family columns. The taxa.cols
input variable could be reduced to just one column (taxa.cols = "family"
) or could be expanded to include as many taxonomic rank and/or taxonomic attribute columns as you want (taxa.cols = c("class", order", "suborder", "family", "tolerance_values", "functional_feeding_groups", "habits")
). I suggest specifying all of your taxonomic rank and taxonomic attribute columns. This will create a lot of output, could potentially take awhile to calculate, and the majority of the metrics will have very poor metric sensitivity; however, the sensitivity()
function (See [Calculate Metric Sensitivity]) will allow you to quickly filter out poorly performing metrics. Calculating a large amount of metrics will give you more opptertunities to identify patterns that reflect your defined disturbance gradient. I used this function during the development of the Chessie BIBI and I found several metrics that are not commonly used to have high metric sensitivity values. For example, the percent of Systellognatha (a suborder of Plecoptera) was more sensitive than the commonly used percent of Plecoptera metric. This extensive exploration of your data also comes with the danger of overfitting your index. To prevent this issue please use index validation procedure (e.g., Hold-Out validation or Cross Validation) during your index development.
seq.df <- nest.df %>% dplyr::bind_cols( taxa_seq(.dataframe = ., .key_col = uid, .counts_col = total, .filter_cols_vec = c("class", "order"), .group_col = genus, .job = "rich", .unnest_col = data) ) #> New names: #> * rich_unidentified_nemata -> rich_unidentified_nemata...9 #> * rich_unidentified_cnidaria -> rich_unidentified_cnidaria...11 #> * rich_unidentified_nemata -> rich_unidentified_nemata...36 #> * rich_unidentified_cnidaria -> rich_unidentified_cnidaria...41 #> New names: #> * rich_unidentified_nemata...9 -> rich_unidentified_nemata...12 #> * rich_unidentified_cnidaria...11 -> rich_unidentified_cnidaria...14 #> * rich_unidentified_nemata...36 -> rich_unidentified_nemata...39 #> * rich_unidentified_cnidaria...41 -> rich_unidentified_cnidaria...44 dplyr::tibble( "List of Column Names" = names(seq.df) ) %>% knitr::kable()
List of Column Names |
---|
uid |
rt_nrsa_cat |
data |
rich_insecta |
rich_arachnida |
rich_gastropoda |
rich_oligochaeta |
rich_bivalvia |
rich_malacostraca |
rich_turbellaria |
rich_enopla |
rich_unidentified_nemata…12 |
rich_hirudinea |
rich_unidentified_cnidaria…14 |
rich_polychaeta |
rich_ostracoda |
rich_trichoptera |
rich_diptera |
rich_trombidiformes |
rich_coleoptera |
rich_ephemeroptera |
rich_basommatophora |
rich_plecoptera |
rich_lumbriculida |
rich_megaloptera |
rich_odonata |
rich_haplotaxida |
rich_veneroida |
rich_isopoda |
rich_amphipoda |
rich_unidentified_turbellaria |
rich_enchytraeida |
rich_neotaenioglossa |
rich_unidentified_arachnida |
rich_lepidoptera |
rich_decapoda |
rich_hemiptera |
rich_hoplonemertea |
rich_unidentified_nemata…39 |
rich_architaenioglossa |
rich_unidentified_gastropoda |
rich_sarcoptiformes |
rich_rhynchobdellida |
rich_unidentified_cnidaria…44 |
rich_arhynchobdellida |
rich_sabellida |
rich_cumacea |
rich_unidentified_hirudinea |
rich_unionoida |
rich_heterostropha |
rich_neuroptera |
rich_branchiobdellida |
rich_mysida |
rich_unidentified_ostracoda |
rich_unidentified_bivalvia |
rich_unidentified_oligochaeta |
# seq.df %>% # dplyr::mutate(data = "nested dataframe") %>% # head() %>% # dplyr::select(uid:rich_genus_bivalvia) %>% # knitr::kable()
The following code-chunk will produce more than 1,000 biological metrics.
metrics.df <- nest.df %>% dplyr::mutate( rich_family = taxa_rich(.dataframe = ., .key_col = uid, .group_col = family, .unnest_col = data), rich_genus = taxa_rich(.dataframe = ., .key_col = uid, .group_col = genus, .unnest_col = data), rich_target_taxon = taxa_rich(.dataframe = ., .key_col = uid, .group_col = target_taxon, .unnest_col = data), gini_simpson_ept = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = target_taxon, .filter = order %in% c("ephemeroptera", "plecoptera", "trichoptera"), .job = "gini_simpson", .unnest_col = data), simpson_ept = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = target_taxon, .filter = order %in% c("ephemeroptera", "plecoptera", "trichoptera"), .job = "simpson", .unnest_col = data), shannon_ept = taxa_div(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = target_taxon, .filter = order %in% c("ephemeroptera", "plecoptera", "trichoptera"), .job = "shannon", .base_log = 2, .unnest_col = data), pct_ept = taxa_pct(.dataframe = ., .key_col = uid, .counts_col = total, .filter = order %in% c("ephemeroptera", "plecoptera", "trichoptera"), .unnest_col = data), pct_cote = taxa_pct(.dataframe = ., .key_col = uid, .counts_col = total, .filter = order %in% c("coleoptera", "odonata", "trichoptera", "ephemeroptera"), .unnest_col = data), dom_1_target_taxon = taxa_dom(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = target_taxon, .dom_level = 1, .unnest_col = data), dom_5_target_taxon = taxa_dom(.dataframe = ., .key_col = uid, .counts_col = total, .group_col = target_taxon, .dom_level = 5, .unnest_col = data), tol_index = taxa_tol_index(.dataframe = ., .key_col = uid, .counts_col = total, .tol_col = ptv, .unnest_col = data) ) %>% dplyr::bind_cols( taxa_seq(.dataframe = ., .key_col = uid, .counts_col = total, .filter_cols_vec = c("class", "order", "family"), .group_col = target_taxon, .job = "rich", .unnest_col = data), taxa_seq(.dataframe = ., .key_col = uid, .counts_col = total, .filter_cols_vec = c("class", "order", "family"), .group_col = target_taxon, .job = "pct_rich", .unnest_col = data), taxa_seq(.dataframe = ., .key_col = uid, .counts_col = total, .filter_cols_vec = c("class", "order", "family", "genus"), .job = "pct", .unnest_col = data), taxa_seq(.dataframe = ., .key_col = uid, .counts_col = total, .filter_cols_vec = c("class", "order", "family"), .group_col = target_taxon, .job = "simpson", .unnest_col = data), ) #> New names: #> * rich_unidentified_nemata -> rich_unidentified_nemata...9 #> * rich_unidentified_cnidaria -> rich_unidentified_cnidaria...11 #> * rich_unidentified_turbellaria -> rich_unidentified_turbellaria...28 #> * rich_unidentified_arachnida -> rich_unidentified_arachnida...31 #> * rich_unidentified_nemata -> rich_unidentified_nemata...36 #> * ... #> New names: #> * pct_rich_unidentified_nemata -> pct_rich_unidentified_nemata...9 #> * pct_rich_unidentified_cnidaria -> pct_rich_unidentified_cnidaria...11 #> * pct_rich_unidentified_turbellaria -> pct_rich_unidentified_turbellaria...28 #> * pct_rich_unidentified_arachnida -> pct_rich_unidentified_arachnida...31 #> * pct_rich_unidentified_nemata -> pct_rich_unidentified_nemata...36 #> * ... #> New names: #> * pct_NULL_unidentified_nemata -> pct_NULL_unidentified_nemata...9 #> * pct_NULL_unidentified_cnidaria -> pct_NULL_unidentified_cnidaria...11 #> * pct_NULL_unidentified_turbellaria -> pct_NULL_unidentified_turbellaria...28 #> * pct_NULL_unidentified_arachnida -> pct_NULL_unidentified_arachnida...31 #> * pct_NULL_unidentified_nemata -> pct_NULL_unidentified_nemata...36 #> * ... #> New names: #> * simpson_unidentified_nemata -> simpson_unidentified_nemata...9 #> * simpson_unidentified_cnidaria -> simpson_unidentified_cnidaria...11 #> * simpson_unidentified_turbellaria -> simpson_unidentified_turbellaria...28 #> * simpson_unidentified_arachnida -> simpson_unidentified_arachnida...31 #> * simpson_unidentified_nemata -> simpson_unidentified_nemata...36 #> * ... #> New names: #> * rich_unidentified_nemata...9 -> rich_unidentified_nemata...23 #> * rich_unidentified_cnidaria...11 -> rich_unidentified_cnidaria...25 #> * rich_unidentified_turbellaria...28 -> rich_unidentified_turbellaria...42 #> * rich_unidentified_arachnida...31 -> rich_unidentified_arachnida...45 #> * rich_unidentified_nemata...36 -> rich_unidentified_nemata...50 #> * ...
nrsa_nap_metrics.df <- metrics.df %>% dplyr::select(-data) usethis::use_data(nrsa_nap_metrics.df, overwrite = TRUE) #> √ Setting active project to 'C:/Users/zmsmith.000/OneDrive - New York State Office of Information Technology Services/projects/mmir' #> √ Saving 'nrsa_nap_metrics.df' to 'data/nrsa_nap_metrics.df.rda' #> * Document your data (see 'https://r-pkgs.org/data.html')