Overview

This document demonstrates how to calculate common biological community metrics, such as richness, relative abundance, and diversity indices, using the mmir R package.

Preparation

The devtools package must be installed to use the devtools::install_github() function. You will also need to install the dplyr package to aid in data manipulation.

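# Package names must be supplied as one character vector;
# a second positional argument would be read as the library path.
install.packages(c("devtools",
                   "dplyr"))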

Install mmir, the Multi-Metric Index (MMI) package that I am developing, from GitHub.

devtools::install_github("zsmith27/mmir")

Once mmir is installed, load the packages with library().

library(mmir)
library(dplyr)  # supplies the %>% pipe used in the examples below

Calculate Metrics

One of the major goals of the mmir package was to simplify the metric calculation procedure. Five base taxa functions (i.e., taxa_rich(), taxa_pct_rich(), taxa_div(), taxa_pct(), and taxa_abund()) are capable of calculating the majority of metrics that are used in Multi-Metric Indices of biotic integrity.

Preparation

As an example data set, mmir contains benthic macroinvertebrate data collected in the Northern Appalachians as part of the 2008-2009 National Rivers and Streams Assessment conducted by the United States Environmental Protection Agency.

data("nrsa_nap_0809", package = "mmir")

Use taxa_fill() to replace NAs in the taxonomic hierarchy with the nearest identified taxonomic rank. Note that the order in which the columns are supplied defines the hierarchy used to find the previous taxonomic rank, so the columns should run from the broadest rank to the finest. For example, you want your columns to be ordered as: Phylum, Class, Order, Family, Genus.

nrsa_nap_fill <- nrsa_nap_0809 %>% 
  taxa_fill(.final_id = target_taxon,
            .prefix = "unidentified",
            phylum:genus)
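
To make the filling rule concrete, here is a minimal base-R sketch of the idea. It is not the taxa_fill() implementation: the helper name fill_ranks_sketch() and the toy data are hypothetical, and the "unidentified_<rank name>" label format is an assumption based on column names such as rich_unidentified_nemata that appear later in this document.

```r
# Hypothetical helper (not part of mmir): walk the rank columns from broadest
# to finest and fill each NA with a prefixed copy of the nearest identified rank.
fill_ranks_sketch <- function(df, rank_cols, prefix = "unidentified") {
  for (i in seq_along(rank_cols)[-1]) {
    current <- rank_cols[i]
    previous <- rank_cols[i - 1]
    missing <- is.na(df[[current]])
    # Avoid stacking prefixes when the previous rank was itself filled.
    already_prefixed <- startsWith(df[[previous]], paste0(prefix, "_"))
    filled <- ifelse(already_prefixed,
                     df[[previous]],
                     paste(prefix, df[[previous]], sep = "_"))
    df[[current]][missing] <- filled[missing]
  }
  df
}

# Toy data: one taxon identified to family, one identified only to phylum.
toy_taxa <- data.frame(
  phylum = c("arthropoda", "nemata"),
  class  = c("insecta", NA),
  order  = c("ephemeroptera", NA),
  family = c("baetidae", NA),
  genus  = c(NA, NA),
  stringsAsFactors = FALSE
)

filled <- fill_ranks_sketch(toy_taxa,
                            c("phylum", "class", "order", "family", "genus"))
print(filled)
```

Because each column is filled from the column to its left, supplying the columns out of hierarchical order would propagate the wrong rank.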

Create a data frame that will store the calculated metrics. This should include a unique site ID and possibly a few other important columns, such as a disturbance gradient classification column (e.g., Reference and Degraded). I recommend not carrying all of the site information through the metric calculation process because it can easily be appended at the end of the calculation process or at a later step. This will reduce clutter, allowing you to focus on the metric values, and will help you manage memory allocation. Additionally, do NOT alter the order of the unique identifier. The output from each metric function is a vector in the order of the supplied data frame of taxonomic counts (in this case nest.df). If the unique identifier is rearranged, no error will appear but the metric values will be associated with the wrong unique identifiers.

nest.df <- nrsa_nap_fill %>% 
  dplyr::group_nest(uid,  rt_nrsa_cat, .key = "data")
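
The warning about row order can be illustrated with a small base-R sketch. The site IDs and values below are made up; the point is only that positional assignment fails silently after a re-sort, whereas matching on the key does not.

```r
# Toy site table and a toy metric, both keyed by a site ID.
sites <- data.frame(
  uid = c("a", "b", "c"),
  category = c("least disturbed", "least disturbed", "most disturbed"),
  stringsAsFactors = FALSE
)
metric <- data.frame(uid = c("a", "b", "c"),
                     rich_genus = c(47, 54, 36),
                     stringsAsFactors = FALSE)

# Re-sorting one table and then assigning by position misaligns the values,
# and R raises no error.
shuffled <- sites[order(sites$uid, decreasing = TRUE), ]
shuffled$rich_by_position <- metric$rich_genus

# Matching on the key keeps every value with its own site.
shuffled$rich_by_key <- metric$rich_genus[match(shuffled$uid, metric$uid)]
print(shuffled)
```

If you ever need to reorder the site table, attaching metrics by key as in the last step is the safer pattern.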

Input Variable Definitions

Standard input variable definitions for functions in the mmir package.

.dataframe = A data frame in a long data format, where each row represents a unique sampling event and taxon.
.key_col = A single column that represents a unique ID or key (i.e., site ID) that can be used to group the data. This variable should be input using NSE syntax (i.e., not quoted).
.group_col = A single column that represents the taxonomic rank at which taxa are counted or grouped (e.g., family or genus in the examples in this document). This variable should be input using NSE syntax (i.e., not quoted).
.filter = A logical expression that is used to subset the data frame to only include rows for the taxon or taxa of interest (e.g., order %in% "ephemeroptera" in the examples in this document). The column used in the expression generally represents a lower resolution taxonomic rank than the rank specified for .group_col.
.counts_col = A single numeric column that represents taxonomic counts. This variable should be input using NSE syntax (i.e., not quoted).
.unnest_col = In the examples in this document, the list-column (data) that holds the nested taxonomic counts for each sampling event. This variable should be input using NSE syntax (i.e., not quoted).

Richness Metrics

Richness refers to the number of unique taxa. Typically, richness is used to refer to the number of unique species found in a sample but richness can be calculated for any taxonomic rank (e.g., order, family, genus).
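
As a quick illustration of the definition, richness can be computed by hand in base R by counting unique taxa within each sample at the rank of interest. The toy data frame below is invented for this sketch and is not part of mmir.

```r
# Toy long-format counts: one row per sampling event (uid) and taxon.
toy_counts <- data.frame(
  uid    = c("s1", "s1", "s1", "s1", "s2", "s2"),
  family = c("baetidae", "baetidae", "perlidae", "chironomidae",
             "chironomidae", "simuliidae"),
  genus  = c("baetis", "acentrella", "acroneuria", "cricotopus",
             "cricotopus", "simulium"),
  stringsAsFactors = FALSE
)

# Richness = number of unique taxa per sample at the chosen rank.
rich_family <- tapply(toy_counts$family, toy_counts$uid,
                      function(x) length(unique(x)))
rich_genus  <- tapply(toy_counts$genus, toy_counts$uid,
                      function(x) length(unique(x)))
print(rbind(rich_family, rich_genus))
```

Sample s1 has four genera but only three families, which shows why the rank must be stated whenever a richness value is reported.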

Community Richness

taxa_rich() is used to calculate richness. In the example below, the nested data frame of taxonomic counts (nest.df, passed as .) is specified as the .dataframe variable, and the list-column that holds the counts (data) is specified as .unnest_col. The name of a unique ID column is specified for the .key_col variable (uid). Note that this is done with NSE syntax (i.e., no quotes). The .group_col variable names the taxonomic rank column at which unique taxa are counted. The .filter variable is only necessary for specifying which taxon or taxa will be used to calculate subgroup richness; therefore, it is omitted when calculating community richness. In this example family-level and genus-level community richness are calculated and appended to nest.df as new columns. knitr::kable() is only used to present tables in this document.

rich.df <- nest.df %>% 
  dplyr::mutate(
    rich_family = taxa_rich(.dataframe = .,
                            .key_col = uid,
                            .group_col = family,
                            .unnest_col = data),
    rich_genus = taxa_rich(.dataframe = .,
                            .key_col = uid,
                            .group_col = genus,
                            .unnest_col = data)
  )


rich.df %>% 
  dplyr::mutate(data = "nested dataframe") %>% 
  head() %>% 
  knitr::kable()

uid	rt_nrsa_cat	data	rich_family	rich_genus
11134	least disturbed	nested dataframe	25	47
11135	least disturbed	nested dataframe	26	54
11136	least disturbed	nested dataframe	28	48
11137	least disturbed	nested dataframe	23	36
11138	least disturbed	nested dataframe	26	37
11564	most disturbed	nested dataframe	19	36

Subgroup Richness

taxa_rich() can also be used to calculate subgroup richness, which refers to calculating the richness of only a select set of taxa. In the example below Ephemeroptera (Mayfly) richness is calculated at the family- and genus-levels. More than one taxonomic group can be specified, as shown for rich_ept_gen (.filter = order %in% c("ephemeroptera", "plecoptera", "trichoptera")). The .filter expression is used to subset the data based on character string matches found in the specified column. In the example, the data are subset to only include rows where the order column specifies “ephemeroptera”. The .group_col (e.g., family or genus) is then used to find the number of unique ephemeroptera taxa at this specified taxonomic level. If the filter were instead order %in% "baetidae" (a family of ephemeroptera), then taxa_rich() would return all zeros because the character string “baetidae” would never be found in the order-level column.

sub_rich.df <- nest.df %>% 
  dplyr::mutate(
    rich_ephemeroptera_fam = taxa_rich(.dataframe = .,
                            .key_col = uid,
                            .group_col = family,
                            .filter = order %in% "ephemeroptera",
                            .unnest_col = data),
    rich_ephemeroptera_gen = taxa_rich(.dataframe = .,
                            .key_col = uid,
                            .group_col = genus,
                            .filter = order %in% "ephemeroptera",
                            .unnest_col = data),
    rich_ept_gen = taxa_rich(.dataframe = .,
                            .key_col = uid,
                            .group_col = genus,
                            .filter = order %in% c("ephemeroptera",
                                                       "plecoptera",
                                                       "trichoptera"),
                            .unnest_col = data)
  )

sub_rich.df %>% 
  dplyr::mutate(data = "nested dataframe") %>% 
  head() %>% 
  knitr::kable()

uid	rt_nrsa_cat	data	rich_ephemeroptera_fam	rich_ephemeroptera_gen	rich_ept_gen
11134	least disturbed	nested dataframe	2	7	21
11135	least disturbed	nested dataframe	4	9	25
11136	least disturbed	nested dataframe	4	8	22
11137	least disturbed	nested dataframe	5	8	21
11138	least disturbed	nested dataframe	4	4	16
11564	most disturbed	nested dataframe	1	1	3
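
The filter-then-count logic described above can be sketched in base R. The helper subgroup_rich_sketch() and the toy data are hypothetical and only mirror the idea, not the taxa_rich() implementation. The last call reproduces the pitfall of filtering the order column with a family name.

```r
toy_counts <- data.frame(
  uid   = c("s1", "s1", "s1", "s1", "s2", "s2"),
  order = c("ephemeroptera", "ephemeroptera", "plecoptera", "diptera",
            "diptera", "diptera"),
  genus = c("baetis", "acentrella", "acroneuria", "cricotopus",
            "cricotopus", "simulium"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: richness at `rank` among rows whose order is in `orders`.
subgroup_rich_sketch <- function(df, rank, orders) {
  keep <- df$order %in% orders
  # Fixing the factor levels keeps samples with no matching rows (scored 0).
  by_site <- split(df[[rank]][keep],
                   factor(df$uid[keep], levels = unique(df$uid)))
  vapply(by_site, function(x) length(unique(x)), integer(1))
}

rich_ephem_gen <- subgroup_rich_sketch(toy_counts, "genus", "ephemeroptera")
rich_ept_gen   <- subgroup_rich_sketch(toy_counts, "genus",
                                       c("ephemeroptera", "plecoptera",
                                         "trichoptera"))
# "baetidae" is a family, so it never matches the order column.
rich_wrong     <- subgroup_rich_sketch(toy_counts, "genus", "baetidae")
print(rbind(rich_ephem_gen, rich_ept_gen, rich_wrong))
```

A column of all zeros is therefore worth a second look: it may mean the taxon is absent, or that the filter value and the filter column are at different taxonomic ranks.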

Diversity Metrics

Diversity metrics in general combine measures of taxonomic evenness and taxonomic richness. Evenness describes how equally individuals are distributed among the taxa in a sample. A number of diversity metrics can be calculated using taxa_div(): Shannon-Wiener Diversity (“shannon”), Simpson’s Diversity (“simpson”), Margalef’s Diversity (“margalef”), Menhinick’s Diversity (“menhinick”), and Pielou’s Evenness (“pielou”).
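
For orientation, the sketch below writes out common textbook forms of these five indices for a single sample. It is an illustration under stated assumptions, not the taxa_div() implementation: each index has several published variants (for example, Simpson's index as a dominance sum or as its complement) and the choice of logarithm base varies, so values from the package may differ. The function name diversity_sketch() is hypothetical.

```r
# Textbook forms for ONE sample, given a vector of counts (one per taxon).
diversity_sketch <- function(counts, base_log = exp(1)) {
  counts <- counts[counts > 0]
  n <- sum(counts)       # total number of individuals
  s <- length(counts)    # taxonomic richness
  p <- counts / n        # relative abundance of each taxon
  shannon <- -sum(p * log(p, base = base_log))
  c(
    shannon   = shannon,
    simpson   = sum(p^2),               # dominance form; 1 - this is Gini-Simpson
    margalef  = (s - 1) / log(n),       # natural log assumed here
    menhinick = s / sqrt(n),
    pielou    = shannon / log(s, base = base_log)  # Shannon relative to its maximum
  )
}

# Same richness and total count, very different evenness.
even   <- diversity_sketch(c(25, 25, 25, 25), base_log = 2)
uneven <- diversity_sketch(c(97, 1, 1, 1), base_log = 2)
print(round(rbind(even, uneven), 3))
```

Both toy samples have four taxa and 100 individuals, so the richness-only indices (Margalef, Menhinick) are identical while Shannon, Simpson, and Pielou separate the two samples. In this form Pielou's evenness ranges from 0 to 1.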

Community Diversity

div.df <- nest.df %>% 
  dplyr::mutate(
    shannon_genus = taxa_div(.dataframe = .,
                            .key_col = uid,
                            .counts_col = total,
                            .group_col = genus,
                            .job = "shannon",
                            .base_log = 2,
                            .unnest_col = data),
    simpson_genus = taxa_div(.dataframe = .,
                            .key_col = uid,
                            .counts_col = total,
                            .group_col = genus,
                            .job = "simpson",
                            .unnest_col = data),
    margalef_genus = taxa_div(.dataframe = .,
                            .key_col = uid,
                            .counts_col = total,
                            .group_col = genus,
                            .job = "margalef",
                            .unnest_col = data),
    menhinick_genus = taxa_div(.dataframe = .,
                            .key_col = uid,
                            .counts_col = total,
                            .group_col = genus,
                            .job = "menhinick",
                            .unnest_col = data),
    pielou_genus = taxa_div(.dataframe = .,
                            .key_col = uid,
                            .counts_col = total,
                            .group_col = genus,
                            .job = "pielou",
                            .unnest_col = data)
  )

div.df %>% 
  dplyr::mutate(data = "nested dataframe") %>% 
  head() %>% 
  knitr::kable()

uid	rt_nrsa_cat	data	shannon_genus	simpson_genus	margalef_genus	menhinick_genus	pielou_genus
11134	least disturbed	nested dataframe	4.074962	0.1018417	17.43422	2.253477	1.672098
11135	least disturbed	nested dataframe	4.584539	0.0694405	20.37699	2.703381	1.732394
11136	least disturbed	nested dataframe	4.202079	0.1019651	18.66180	2.642313	1.681241
11137	least disturbed	nested dataframe	3.723167	0.1819546	15.93880	2.873113	1.556302
11138	least disturbed	nested dataframe	2.869174	0.3291442	14.06301	1.941996	1.568202
11564	most disturbed	nested dataframe	2.678898	0.2865829	12.91861	1.590990	1.556302

Subgroup Diversity

# The EPT names occur in the order column, so the filter is applied to order
# and diversity is then measured at the genus level.
sub_div.df <- nest.df %>% 
  dplyr::mutate(
    gini_simpson_ept = taxa_div(.dataframe = .,
                            .key_col = uid,
                            .counts_col = total,
                            .group_col = genus,
                            .filter = order %in% c("ephemeroptera",
                                                       "plecoptera",
                                                       "trichoptera"),
                            .job = "gini_simpson",
                            .unnest_col = data),
    simpson_ept = taxa_div(.dataframe = .,
                            .key_col = uid,
                            .counts_col = total,
                            .group_col = genus,
                            .filter = order %in% c("ephemeroptera",
                                                       "plecoptera",
                                                       "trichoptera"),
                            .job = "simpson",
                            .unnest_col = data),
    shannon_ept = taxa_div(.dataframe = .,
                            .key_col = uid,
                            .counts_col = total,
                            .group_col = genus,
                            .filter = order %in% c("ephemeroptera",
                                                       "plecoptera",
                                                       "trichoptera"),
                            .job = "shannon",
                            .base_log = 2,
                            .unnest_col = data)
  )

sub_div.df %>% 
  dplyr::mutate(data = "nested dataframe") %>% 
  head() %>% 
  knitr::kable()

Dominance

dom.df <- nest.df %>% 
  dplyr::mutate(
    dom_1_target_taxon = taxa_dom(.dataframe = .,
                                  .key_col = uid,
                                  .counts_col = total,
                                  .group_col = target_taxon,
                                  .dom_level = 1,
                                  .unnest_col = data),
    dom_5_target_taxon = taxa_dom(.dataframe = .,
                                  .key_col = uid,
                                  .counts_col = total,
                                  .group_col = target_taxon,
                                  .dom_level = 5,
                                  .unnest_col = data)
  )

dom.df %>% 
  dplyr::mutate(data = "nested dataframe") %>% 
  head() %>% 
  knitr::kable()

uid	rt_nrsa_cat	data	dom_1_target_taxon	dom_5_target_taxon
11134	least disturbed	nested dataframe	21.37931	63.67816
11135	least disturbed	nested dataframe	18.04511	48.37093
11136	least disturbed	nested dataframe	23.33333	58.48485
11137	least disturbed	nested dataframe	40.12739	63.05732
11138	least disturbed	nested dataframe	55.64738	76.30854
11564	most disturbed	nested dataframe	42.77344	83.78906
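
This section does not define dominance, so the sketch below rests on an assumption: that a dominance metric is the percentage of a sample made up by its n most abundant taxa, with n playing the role of .dom_level. That reading is consistent with the values above (dom_5 is never smaller than dom_1, and both are at most 100), but the helper dominance_sketch() is hypothetical and is not the taxa_dom() implementation.

```r
# Hypothetical helper: percent of a sample made up by its `n` most abundant taxa.
dominance_sketch <- function(counts, n = 1) {
  top <- sort(counts, decreasing = TRUE)[seq_len(min(n, length(counts)))]
  sum(top) / sum(counts) * 100
}

# Toy counts for one sample (50 individuals in total).
sample_counts <- c(baetis = 10, acentrella = 5, acroneuria = 2,
                   cricotopus = 30, simulium = 3)

dom_1 <- dominance_sketch(sample_counts, n = 1)  # 30 / 50 * 100
dom_3 <- dominance_sketch(sample_counts, n = 3)  # (30 + 10 + 5) / 50 * 100
print(c(dom_1 = dom_1, dom_3 = dom_3))
```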

Community Metrics

Percentages

taxa_pct() calculates the percentage of the sample represented by a taxon or taxa. The example below shows how to calculate the percentage of a single taxon, Ephemeroptera (pct_ephemeroptera), and multiple taxa, EPT (pct_ept). Remember that the .filter expression names the column that contains the taxon or taxa of interest (here, order) and uses %in% to specify the taxon or taxa to match.

pct.df <- nest.df %>% 
  dplyr::mutate(
    pct_ephemeroptera = taxa_pct(.dataframe = .,
                                 .key_col = uid,
                                 .counts_col = total,
                                 .filter = order %in% "ephemeroptera",
                                 .unnest_col = data),
    pct_ept = taxa_pct(.dataframe = .,
                       .key_col = uid,
                       .counts_col = total,
                       .filter = order %in% c("ephemeroptera",
                                              "plecoptera",
                                              "trichoptera"),
                       .unnest_col = data)
  )

pct.df %>% 
  dplyr::mutate(data = "nested dataframe") %>% 
  head() %>% 
  knitr::kable()

uid	rt_nrsa_cat	data	pct_ephemeroptera	pct_ept
11134	least disturbed	nested dataframe	5.287356	17.93103
11135	least disturbed	nested dataframe	26.817043	60.40100
11136	least disturbed	nested dataframe	12.424242	41.51515
11137	least disturbed	nested dataframe	15.923567	77.07006
11138	least disturbed	nested dataframe	3.581267	26.44628
11564	most disturbed	nested dataframe	1.953125	2.34375
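
The arithmetic behind a percentage metric is simply the count of the selected taxa divided by the total count of the sample, times 100. A base-R sketch with made-up data follows. The helper pct_of_sample() is hypothetical and is not the taxa_pct() implementation.

```r
toy_counts <- data.frame(
  uid   = c("s1", "s1", "s1", "s1", "s2", "s2"),
  order = c("ephemeroptera", "ephemeroptera", "plecoptera", "diptera",
            "diptera", "diptera"),
  total = c(10, 5, 5, 30, 40, 10),
  stringsAsFactors = FALSE
)

# Hypothetical helper: percent of each sample belonging to the given orders.
pct_of_sample <- function(df, orders) {
  in_group <- df$order %in% orders
  # Multiplying by the logical keeps samples with no matching rows at 0.
  group_by_site <- tapply(df$total * in_group, df$uid, sum)
  total_by_site <- tapply(df$total, df$uid, sum)
  group_by_site / total_by_site * 100
}

pct_ephemeroptera <- pct_of_sample(toy_counts, "ephemeroptera")
pct_ept <- pct_of_sample(toy_counts,
                         c("ephemeroptera", "plecoptera", "trichoptera"))
print(rbind(pct_ephemeroptera, pct_ept))
```

The denominator is always the whole sample, so percentages of non-overlapping groups add up to at most 100 within a sample.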

Abundances

Abundance refers to the reported taxonomic count but may represent counts aggregated at a lower resolution taxonomic rank. For example, taxa may be reported at the genus-level but abundance could be calculated at the order-level. Generally, I have not found abundance metrics used in macroinvertebrate indices but they are common in fish indices. taxa_abund() has the same input as taxa_pct() (See Percentages).

abund.df <- nest.df %>% 
  dplyr::mutate(
    abund_ephemeroptera = taxa_abund(.,
                                  .key_col = uid,
                                  .counts_col = total,
                                  .filter = order %in% "ephemeroptera",
                                  .unnest_col = data),
    abund_ept = taxa_abund(.,
                        .key_col = uid,
                        .counts_col = total,
                        .filter = order %in% c("ephemeroptera",
                                                   "plecoptera",
                                                   "trichoptera"),
                        .unnest_col = data)
  )

abund.df %>% 
  dplyr::mutate(data = "nested dataframe") %>% 
  head() %>% 
  knitr::kable()

uid	rt_nrsa_cat	data	abund_ephemeroptera	abund_ept
11134	least disturbed	nested dataframe	23	78
11135	least disturbed	nested dataframe	107	241
11136	least disturbed	nested dataframe	41	137
11137	least disturbed	nested dataframe	25	121
11138	least disturbed	nested dataframe	13	96
11564	most disturbed	nested dataframe	10	12
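
Aggregating genus-level counts to the order level, as described above, can be sketched in base R with xtabs(). The toy data are invented for illustration and this is not how taxa_abund() is implemented.

```r
# Counts reported at the genus level.
toy_counts <- data.frame(
  uid   = c("s1", "s1", "s1", "s1", "s2", "s2"),
  order = c("ephemeroptera", "ephemeroptera", "plecoptera", "diptera",
            "diptera", "diptera"),
  genus = c("baetis", "acentrella", "acroneuria", "cricotopus",
            "cricotopus", "simulium"),
  total = c(10, 5, 5, 30, 40, 10),
  stringsAsFactors = FALSE
)

# Abundance aggregated to the order level: one row per sample, one column per order.
abund_order <- xtabs(total ~ uid + order, data = toy_counts)

# EPT abundance: sum the EPT orders that are actually present in the data.
ept_cols <- intersect(c("ephemeroptera", "plecoptera", "trichoptera"),
                      colnames(abund_order))
abund_ept <- rowSums(abund_order[, ept_cols, drop = FALSE])

print(abund_order)
print(abund_ept)
```

Unlike a percentage, an abundance is not scaled by the sample total, so it is sensitive to differences in sampling effort or subsample size between samples.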

Sequencing Metric Calculations

When developing a new index, the developer usually tries to evaluate as many metrics as possible to obtain the most sensitive metrics. Writing individual lines of code for each taxon using taxa_rich(), taxa_pct_rich(), taxa_pct(), or taxa_abund() would be very time consuming and prone to typos. taxa_seq() was developed to make these calculations simple and safe.

This is a wrapper function around the previously discussed taxa metrics (taxa_rich(), taxa_pct_rich(), taxa_pct(), or taxa_abund()), which loops through each taxon in the specified taxonomic rank or taxonomic attribute column(s), calculating taxon richness, taxon percent richness, taxon percentages, or taxon abundance. The input variable, .job, is used to specify which type of metric to calculate (i.e., “rich”, “pct_rich”, “pct”, or “abund”). The character string used to specify the job will be used as a prefix for the column names (e.g., “rich_ephemeroptera” or “pct_plecoptera”). The remaining inputs are the same as the base taxa functions (taxa_rich(), taxa_pct_rich(), taxa_pct(), or taxa_abund()) that taxa_seq() wraps around. This also means that there will frequently be many input variables that you do not need to use. For example, if .job = "pct", then .group_col, .base_log, and q do not need to be specified because these variables are only used for richness and/or diversity related calculations. It is often helpful to refer back to the base taxa function to identify the necessary inputs for the specified job. Additionally, each metric only represents a single taxon. The output will therefore not include metrics such as the percentage of EPT taxa or the richness of Gatherer and Filter Feeder taxa. These metrics will need to be calculated individually using the base taxa functions.
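
The looping idea can be sketched in base R for the percentage case. This is only an illustration of the pattern (one column per taxon, named with the job as a prefix). The helper pct_seq_sketch() and the toy data are hypothetical and do not reflect how taxa_seq() is written.

```r
toy_counts <- data.frame(
  uid   = c("s1", "s1", "s1", "s2", "s2"),
  class = c("insecta", "insecta", "insecta", "insecta", "oligochaeta"),
  order = c("ephemeroptera", "plecoptera", "diptera", "diptera", "haplotaxida"),
  total = c(15, 5, 30, 40, 10),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the loop: one percent column per taxon found in
# each requested column, named "<job>_<taxon>".
pct_seq_sketch <- function(df, taxa_cols, job = "pct") {
  uids <- unique(df$uid)
  site <- factor(df$uid, levels = uids)
  totals <- vapply(split(df$total, site), sum, numeric(1))
  out <- data.frame(uid = uids, stringsAsFactors = FALSE)
  for (col in taxa_cols) {
    for (taxon in unique(df[[col]])) {
      hit <- df[[col]] == taxon
      counts <- vapply(split(df$total * hit, site), sum, numeric(1))
      out[[paste(job, taxon, sep = "_")]] <- unname(counts / totals * 100)
    }
  }
  out
}

seq_out <- pct_seq_sketch(toy_counts, c("class", "order"))
print(seq_out)
```

One design point to watch: in this sketch a taxon name that occurs in more than one column (for example, a filled value such as unidentified_nemata in both class and order) would overwrite the earlier column of the same name, so unique names across ranks matter whenever per-taxon columns are generated in bulk.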

The example below calculates genus-level taxa richness for all of the taxa under the class and order columns. The .filter_cols_vec input variable could be reduced to just one column (.filter_cols_vec = "family") or could be expanded to include as many taxonomic rank and/or taxonomic attribute columns as you want (.filter_cols_vec = c("class", "order", "suborder", "family", "tolerance_values", "functional_feeding_groups", "habits")). I suggest specifying all of your taxonomic rank and taxonomic attribute columns. This will create a lot of output, could potentially take a while to calculate, and the majority of the metrics will have very poor metric sensitivity; however, the sensitivity() function (See [Calculate Metric Sensitivity]) will allow you to quickly filter out poorly performing metrics. Calculating a large number of metrics will give you more opportunities to identify patterns that reflect your defined disturbance gradient. I used this function during the development of the Chessie BIBI and I found several metrics that are not commonly used to have high metric sensitivity values. For example, the percent of Systellognatha (a suborder of Plecoptera) was more sensitive than the commonly used percent of Plecoptera metric. This extensive exploration of your data also comes with the danger of overfitting your index. To prevent this issue, please use an index validation procedure (e.g., Hold-Out validation or Cross Validation) during your index development.

seq.df <- nest.df %>% 
  dplyr::bind_cols(
    taxa_seq(.dataframe = .,
             .key_col = uid,
             .counts_col = total,
             .filter_cols_vec = c("class", "order"),
             .group_col = genus,
             .job = "rich",
             .unnest_col = data)
    )
#> New names:
#> * rich_unidentified_nemata -> rich_unidentified_nemata...9
#> * rich_unidentified_cnidaria -> rich_unidentified_cnidaria...11
#> * rich_unidentified_nemata -> rich_unidentified_nemata...36
#> * rich_unidentified_cnidaria -> rich_unidentified_cnidaria...41
#> New names:
#> * rich_unidentified_nemata...9 -> rich_unidentified_nemata...12
#> * rich_unidentified_cnidaria...11 -> rich_unidentified_cnidaria...14
#> * rich_unidentified_nemata...36 -> rich_unidentified_nemata...39
#> * rich_unidentified_cnidaria...41 -> rich_unidentified_cnidaria...44

dplyr::tibble(
  "List of Column Names" = names(seq.df)
) %>% 
knitr::kable()

List of Column Names
uid
rt_nrsa_cat
data
rich_insecta
rich_arachnida
rich_gastropoda
rich_oligochaeta
rich_bivalvia
rich_malacostraca
rich_turbellaria
rich_enopla
rich_unidentified_nemata...12
rich_hirudinea
rich_unidentified_cnidaria...14
rich_polychaeta
rich_ostracoda
rich_trichoptera
rich_diptera
rich_trombidiformes
rich_coleoptera
rich_ephemeroptera
rich_basommatophora
rich_plecoptera
rich_lumbriculida
rich_megaloptera
rich_odonata
rich_haplotaxida
rich_veneroida
rich_isopoda
rich_amphipoda
rich_unidentified_turbellaria
rich_enchytraeida
rich_neotaenioglossa
rich_unidentified_arachnida
rich_lepidoptera
rich_decapoda
rich_hemiptera
rich_hoplonemertea
rich_unidentified_nemata...39
rich_architaenioglossa
rich_unidentified_gastropoda
rich_sarcoptiformes
rich_rhynchobdellida
rich_unidentified_cnidaria...44
rich_arhynchobdellida
rich_sabellida
rich_cumacea
rich_unidentified_hirudinea
rich_unionoida
rich_heterostropha
rich_neuroptera
rich_branchiobdellida
rich_mysida
rich_unidentified_ostracoda
rich_unidentified_bivalvia
rich_unidentified_oligochaeta


# seq.df %>% 
#   dplyr::mutate(data = "nested dataframe") %>% 
#   head() %>% 
#   dplyr::select(uid:rich_genus_bivalvia) %>% 
#   knitr::kable()

Putting it All Together

The following code-chunk will produce more than 1,000 biological metrics.

metrics.df <- nest.df %>% 
  dplyr::mutate(
    rich_family = taxa_rich(.dataframe = .,
                            .key_col = uid,
                            .group_col = family,
                            .unnest_col = data),
    rich_genus = taxa_rich(.dataframe = .,
                           .key_col = uid,
                           .group_col = genus,
                           .unnest_col = data),
    rich_target_taxon = taxa_rich(.dataframe = .,
                                  .key_col = uid,
                                  .group_col = target_taxon,
                                  .unnest_col = data),
    gini_simpson_ept = taxa_div(.dataframe = .,
                                .key_col = uid,
                                .counts_col = total,
                                .group_col = target_taxon,
                                .filter = order %in% c("ephemeroptera",
                                                       "plecoptera",
                                                       "trichoptera"),
                                .job = "gini_simpson",
                                .unnest_col = data),
    simpson_ept = taxa_div(.dataframe = .,
                           .key_col = uid,
                           .counts_col = total,
                           .group_col = target_taxon,
                           .filter = order %in% c("ephemeroptera",
                                                  "plecoptera",
                                                  "trichoptera"),
                           .job = "simpson",
                           .unnest_col = data),
    shannon_ept = taxa_div(.dataframe = .,
                           .key_col = uid,
                           .counts_col = total,
                           .group_col = target_taxon,
                           .filter = order %in% c("ephemeroptera",
                                                  "plecoptera",
                                                  "trichoptera"),
                           .job = "shannon",
                           .base_log = 2,
                           .unnest_col = data),
    pct_ept = taxa_pct(.dataframe = .,
                       .key_col = uid,
                       .counts_col = total,
                       .filter = order %in% c("ephemeroptera",
                                              "plecoptera",
                                              "trichoptera"),
                       .unnest_col = data),
    pct_cote = taxa_pct(.dataframe = .,
                        .key_col = uid,
                        .counts_col = total,
                        .filter = order %in% c("coleoptera",
                                               "odonata",
                                               "trichoptera",
                                               "ephemeroptera"),
                        .unnest_col = data),
    dom_1_target_taxon = taxa_dom(.dataframe = .,
                                  .key_col = uid,
                                  .counts_col = total,
                                  .group_col = target_taxon,
                                  .dom_level = 1,
                                  .unnest_col = data),
    dom_5_target_taxon = taxa_dom(.dataframe = .,
                                  .key_col = uid,
                                  .counts_col = total,
                                  .group_col = target_taxon,
                                  .dom_level = 5,
                                  .unnest_col = data),
    tol_index = taxa_tol_index(.dataframe = .,
                               .key_col = uid,
                               .counts_col = total,
                               .tol_col = ptv,
                               .unnest_col = data)
  ) %>% 
  dplyr::bind_cols(
    taxa_seq(.dataframe = .,
             .key_col = uid,
             .counts_col = total,
             .filter_cols_vec = c("class", "order", "family"),
             .group_col = target_taxon,
             .job = "rich",
             .unnest_col = data),
    taxa_seq(.dataframe = .,
             .key_col = uid,
             .counts_col = total,
             .filter_cols_vec = c("class", "order", "family"),
             .group_col = target_taxon,
             .job = "pct_rich",
             .unnest_col = data),
    taxa_seq(.dataframe = .,
             .key_col = uid,
             .counts_col = total,
             .filter_cols_vec = c("class", "order", "family", "genus"),
             .job = "pct",
             .unnest_col = data),
    taxa_seq(.dataframe = .,
             .key_col = uid,
             .counts_col = total,
             .filter_cols_vec = c("class", "order", "family"),
             .group_col = target_taxon,
             .job = "simpson",
             .unnest_col = data),
  )
#> New names:
#> * rich_unidentified_nemata -> rich_unidentified_nemata...9
#> * rich_unidentified_cnidaria -> rich_unidentified_cnidaria...11
#> * rich_unidentified_turbellaria -> rich_unidentified_turbellaria...28
#> * rich_unidentified_arachnida -> rich_unidentified_arachnida...31
#> * rich_unidentified_nemata -> rich_unidentified_nemata...36
#> * ...
#> New names:
#> * pct_rich_unidentified_nemata -> pct_rich_unidentified_nemata...9
#> * pct_rich_unidentified_cnidaria -> pct_rich_unidentified_cnidaria...11
#> * pct_rich_unidentified_turbellaria -> pct_rich_unidentified_turbellaria...28
#> * pct_rich_unidentified_arachnida -> pct_rich_unidentified_arachnida...31
#> * pct_rich_unidentified_nemata -> pct_rich_unidentified_nemata...36
#> * ...
#> New names:
#> * pct_NULL_unidentified_nemata -> pct_NULL_unidentified_nemata...9
#> * pct_NULL_unidentified_cnidaria -> pct_NULL_unidentified_cnidaria...11
#> * pct_NULL_unidentified_turbellaria -> pct_NULL_unidentified_turbellaria...28
#> * pct_NULL_unidentified_arachnida -> pct_NULL_unidentified_arachnida...31
#> * pct_NULL_unidentified_nemata -> pct_NULL_unidentified_nemata...36
#> * ...
#> New names:
#> * simpson_unidentified_nemata -> simpson_unidentified_nemata...9
#> * simpson_unidentified_cnidaria -> simpson_unidentified_cnidaria...11
#> * simpson_unidentified_turbellaria -> simpson_unidentified_turbellaria...28
#> * simpson_unidentified_arachnida -> simpson_unidentified_arachnida...31
#> * simpson_unidentified_nemata -> simpson_unidentified_nemata...36
#> * ...
#> New names:
#> * rich_unidentified_nemata...9 -> rich_unidentified_nemata...23
#> * rich_unidentified_cnidaria...11 -> rich_unidentified_cnidaria...25
#> * rich_unidentified_turbellaria...28 -> rich_unidentified_turbellaria...42
#> * rich_unidentified_arachnida...31 -> rich_unidentified_arachnida...45
#> * rich_unidentified_nemata...36 -> rich_unidentified_nemata...50
#> * ...

nrsa_nap_metrics.df <- metrics.df %>% 
  dplyr::select(-data)

usethis::use_data(nrsa_nap_metrics.df,
                  overwrite = TRUE)
#> √ Setting active project to 'C:/Users/zmsmith.000/OneDrive - New York State Office of Information Technology Services/projects/mmir'
#> √ Saving 'nrsa_nap_metrics.df' to 'data/nrsa_nap_metrics.df.rda'
#> * Document your data (see 'https://r-pkgs.org/data.html')

Metric Calculation

Zachary M. Smith

2020-12-11
