vignettes/austraits_overview.Rmd
AusTraits is an open-source, harmonised database of Australian plant trait data. Traits vary in scope from physiological measures of performance (e.g. photosynthetic gas exchange, water-use efficiency) to morphological attributes (e.g. leaf area, seed mass, plant height).
This vignette provides an overview of our workflow, to demonstrate our commitment to creating a reliable, reproducible resource for anyone interested in plant traits.
The data in AusTraits are derived from nearly 300 distinct sources, each contributed by an individual researcher, government entity (e.g. herbaria), or NGO. Each reflects the research agenda of the individual or organisation that contributed the data: the species selected, traits measured, manipulative treatments performed, and locations sampled encompass the diversity of research interests present in Australia over many decades. The AusTraits data curators have sought out researcher after researcher to share their data, reaching out to as many people as time permitted, but have not explicitly solicited datasets with specific traits. The spotty data coverage by trait or location therefore simply represents what has been merged into AusTraits at this time.
These datasets use different variable names, data structures, units and sometimes methods.
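As a rough illustration of what this heterogeneity means in practice, the sketch below combines two made-up contributed tables that report the same trait under different column names and units. Everything in it is hypothetical: the data frames, the column names, the `harmonise()` helper, the trait name `leaf_area`, the conversion factors and the allowable range are assumptions for illustration only, not the AusTraits files or functions.

```r
# Two hypothetical contributed datasets reporting the same trait
# under different column names and units (all values are made up).
study_a <- data.frame(
  species = c("Species one", "Species two"),
  LA_cm2 = c(1.2, 35.0),        # leaf area in cm^2
  stringsAsFactors = FALSE
)
study_b <- data.frame(
  taxon = "Species three",
  leaf_size_mm2 = 2400,         # leaf area in mm^2
  stringsAsFactors = FALSE
)

# Hypothetical helper: rename columns to shared terms and convert
# the values to a common unit (mm^2) using a per-study factor.
harmonise <- function(data, taxon_col, value_col, trait_name, to_mm2) {
  data.frame(
    taxon_name = data[[taxon_col]],
    trait_name = trait_name,
    value = data[[value_col]] * to_mm2,
    unit = "mm2",
    stringsAsFactors = FALSE
  )
}

combined <- rbind(
  harmonise(study_a, "species", "LA_cm2", "leaf_area", to_mm2 = 100),
  harmonise(study_b, "taxon", "leaf_size_mm2", "leaf_area", to_mm2 = 1)
)

# Flag values that fall outside a hypothetical allowable range.
allowed_range <- c(0.1, 1e7)
combined$in_range <- combined$value >= allowed_range[1] &
  combined$value <= allowed_range[2]

print(combined)
```

In this toy version the per-study details (which column holds the taxon, which holds the trait, and how to convert the units) are passed as function arguments. Keeping such details in a separate file for each study would make the same translation easier to document and reproduce.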
To create a single database for distribution to the research community, we developed a reproducible and transparent workflow in R for merging each dataset into AusTraits. A `metadata` file for each study documents how the data tables submitted by an individual contributor are translated into the standardised terms used in the AusTraits database. The pipeline ensures the following information is standardised across all datasets in AusTraits:

- Allowable traits are listed in the `definitions` file, and only data for traits included in this file can be merged into AusTraits. The trait names used in the incoming dataset are mapped onto the appropriate AusTraits trait name.
- Each numeric trait in the `definitions` file includes `units` and the allowable `range` of values. All incoming data are converted to the appropriate units.
- Each categorical trait in the `definitions` file includes a list of `allowable values`, the allowed terms for the trait. Each categorical trait value is defined in the `definitions` file. A list of substitutions translates the exact syntax and terms in the submitted spreadsheet into the values allowed by AusTraits. This ensures that, for a given trait, the same `value` has an identical meaning throughout the AusTraits database.

The `metadata` file also includes all metadata associated with the study: