Explanation of some important terms used in dealing with Restricted Access Species Data

There are several terms that need to be explained in this framework. A full list is provided at the back of the Framework under Definitions. Some particularly important terms are:

Dataset and Attribute Field

The terms dataset and attribute field are used widely in this framework. A dataset is a set of data held by a data custodian that includes information about species observations including the identity of the species, latitude, and longitude. Other fields that might be included are date, a location description, collector details etc.

An attribute field refers to one of the fields in the dataset used to describe the observations such as location, date, observer etc.

Transformation

Transformation is a term used widely in this document to describe the range of actions that a data custodian applies to a dataset to modify it from its raw state. Actions include:

  1. Withholding attribute fields (removing attribute fields from a dataset to, for example, remove personal identifiable information about observers)
  2. Obfuscating locations (modifying the latitude and longitude to make it difficult to identify the original location of the record)

Sensitive Species or Restricted Access Species Lists (RASLs)

RASLs, also known as sensitive species lists, are lists of species that are regarded as requiring protection from human interference for a variety of reasons. They can include both listed threatened species as well as unlisted species that have some aspect that makes them sensitive, for example, poaching from nest sites. Typically, the decision to classify a species as sensitive is balanced against the potential cost and disadvantages of hiding locality information and potentially impairing conservation efforts. Some RASLs only affect species records at certain times of the year or certain life stages, for example Birdlife Australia treats some RASL species records as sensitive only during breeding season.

Maintaining up-to-date, public, discoverable RASLs supports better and more efficient decision- making, and more informed use and management of data.

RASLs are maintained by all state and territory environmental agencies in Australia. Most jurisdictional RASLs are publicly available lists that delineate which species should have their geographic locations blurred (see obfuscation) or withheld entirely to prevent disturbance or exploitation of the species or the site.

Some custodians of large species-specific datasets such as BirdLife Australia, FrogID and Butterflies Australia also maintain RASLs derived from expert opinion.

Lastly, large data aggregators also use these RASLs, for example the Atlas of Living Australia applies both jurisdictional and third-party RASLs to data.

Obfuscation

Obfuscation or “fuzzing” of data is the practice of transforming or obscuring the original geo-localities of an observation of flora or fauna to make it difficult to discern the original geo-locality by randomisation or generalisation. Randomisation has benefits for map display but is not a robust data management approach in a data ecosystem as it creates artificial points. In the following example Figure 1 (figure one taken from Chapman 2020), two alternative means of taking individual observations and generalising them to a grid are shown.

Two generalisation methods: On the left, a geographic grid, where all records are referenced to the bottom left-hand (SE) corner; right, a metric grid where all records are referenced to the centroid
Figure 1: (Taken from Chapman 2020 – please note that this is not figure one in this framework)

Either of the above generalisation techniques, rounding (A) or the provision of records as grid square polygons (B) are preferred, if data are to be passed between systems that may then apply their own additional obfuscation. For the purposes of this framework, references to obfuscation imply generalisation.

Metadata

Metadata are data about data, helping a user to interpret data (or observations). For the purposes of this framework, metadata are either:

  1. Dataset metadata – a concise description of a dataset which enables a user, not necessarily able to access a dataset, to gauge the relevance of the data for their purposes; or
  2. Record (or row)-level metadata – these are documentation in an attribute field at the level of a record in a dataset. For the purposes of this framework, it refers to documentation of the sensitivity status of the record (or the species of which it is a part) along with access constraints pertaining to the record and details of any generalisation of the data.