Data models, data standards and data types are at the core of proper data organisation, re-use and interoperability. By data standards we mean the different file formats used in science, on the expectation that they are well specified and, where possible, adhere to widely accepted standards and best practices so that they can be interpreted. The term data type has been used in computer science for many years and ranges from simple types (integers, etc.) to composite/complex types, MIME types and abstract data types. In the context of RDA, the term data type has been used to establish a relation between, for example, a file type and a tool that allows its interpretation, or between a concept found in a structured file and its interpretation. The term data model has likewise been used in computer science for many years, in various flavours. In this context we interpret it as specifying how data, collections, different types of metadata (descriptive, provenance, rights, etc.) and persistent identifiers are organised and how they refer to each other, so that machines can easily find all required information about a digital object.
Differences between data standards, types and models, and the failure to make them explicit, are among the major sources of inefficiency when working with data in science and industry. The FAIR principles also address this topic area.
It is widely agreed that Digital Objects and their characteristics are at the core of this topic area: they have bit sequences stored in repositories, they are assigned a persistent identifier, and they are associated with metadata. Digital Objects can be of many sorts: data, software, system configurations, digital fingerprints of physical objects, etc. Data Objects can originate, for example, from files or from executing queries on databases.
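The three characteristics named above (a stored bit sequence, a persistent identifier and associated metadata) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names `DigitalObject` and `Repository`, the `kind` field and the identifier format `example/0001` are hypothetical and do not come from any RDA specification.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DigitalObject:
    """A bit sequence plus the PID and metadata that identify and describe it."""
    pid: str                  # persistent identifier (hypothetical format)
    bits: bytes               # the stored bit sequence
    kind: str = "data"        # e.g. data, software, system configuration
    metadata: dict = field(default_factory=dict)


class Repository:
    """Toy in-memory repository that stores objects and resolves their PIDs."""

    def __init__(self) -> None:
        self._objects = {}    # maps PID -> DigitalObject

    def deposit(self, obj: DigitalObject) -> None:
        # A PID must identify exactly one object, so refuse duplicates.
        if obj.pid in self._objects:
            raise ValueError(f"PID already assigned: {obj.pid}")
        self._objects[obj.pid] = obj

    def resolve(self, pid: str) -> DigitalObject:
        return self._objects[pid]


repo = Repository()
repo.deposit(DigitalObject(
    pid="example/0001",
    bits=b"1,2,3\n",
    metadata={"creator": "A. Researcher", "format": "text/csv"},
))
obj = repo.resolve("example/0001")
```

Resolving the identifier returns the bit sequence together with its metadata, which is the property that lets a machine find what it needs from the PID alone.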
This cluster page refers to all RDA WGs and IGs that deal in some way with the aspects mentioned above.
Array Database Assessment WG The Array Database Assessment WG is working with a completely different data model. It expects all data and metadata belonging to a given study to be entered into one large array, so that one can work efficiently with the data and its metadata, define various views, run calculations, etc., assisted by a query language. All data of this type are accessible through Open Geospatial Consortium (OGC) services exposed on top of them, such as WCS, WMS and WCPS.
BioSharing Registry WG The aim of this working group is to produce a searchable registry of linked and reliable resources (funder policies, databases, content standards, journals) for a variety of stakeholders working in the life sciences. These stakeholders – such as researchers, funders, and journals – will be able to select and recommend community-endorsed standards, while repository developers will be able to confirm the requirements of their products for discoverability and endorsement.
Data Fabric IG The Data Fabric IG focuses on the data creation and consumption cycle as it happens daily in scientific and industrial labs, and on identifying ways to make this work more efficient and thus more cost-effective. The group’s goal is to identify so-called Common Components and to define their characteristics and services, so that they can be used across boundaries and combined to solve a variety of data scenarios.
Data Foundation & Terminology WG The task of the Data Foundation and Terminology (DFT) WG is to describe a basic, abstract data organization model from which a reference data terminology can be derived. This terminology can be used across communities and stakeholders to better synchronize conceptualization, to enable better understanding within and between communities, and finally to stimulate the building of tools, such as data services, that support the use of the basic model. The abstract data organization model will focus on common building blocks and their characteristics, along with relevant protocols.
Data Type Registries WG The Data Type Registry concept is compliant with the Data Foundation and Terminology (DFT) data model. It allows users to define data types, each of which can describe a variable found in a Digital Object or the structure of a Digital Object, and to link those types with functions.
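The idea of defining a data type and linking it with functions can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the WG's actual registry interface: the class `DataTypeRegistry`, its method names and the type identifier `example/temperature-celsius` are all hypothetical.

```python
from typing import Any, Callable


class DataTypeRegistry:
    """Toy registry: data types are described once and linked with functions."""

    def __init__(self) -> None:
        self._descriptions = {}   # type_id -> human-readable description
        self._functions = {}      # type_id -> {function name: callable}

    def register_type(self, type_id: str, description: str) -> None:
        self._descriptions[type_id] = description
        self._functions.setdefault(type_id, {})

    def link_function(self, type_id: str, name: str,
                      func: Callable[[Any], Any]) -> None:
        # Functions can only be linked to types that have been defined.
        if type_id not in self._descriptions:
            raise KeyError(f"unknown data type: {type_id}")
        self._functions[type_id][name] = func

    def functions_for(self, type_id: str) -> list:
        return sorted(self._functions[type_id])

    def apply(self, type_id: str, name: str, value: Any) -> Any:
        return self._functions[type_id][name](value)


registry = DataTypeRegistry()
# Hypothetical type describing a variable found inside a Digital Object.
registry.register_type("example/temperature-celsius",
                       "Air temperature in degrees Celsius")
registry.link_function("example/temperature-celsius", "to_kelvin",
                       lambda celsius: celsius + 273.15)

kelvin = registry.apply("example/temperature-celsius", "to_kelvin", 0.0)
```

A client that meets a value of a registered type can ask the registry which functions are linked to it and call one, without having prior knowledge of the type.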
Metadata IG The Metadata IG is discussing a new package-based approach to modelling metadata. Its intentions are compliant with the Data Foundation and Terminology (DFT) model. The Metadata IG kicked off the Metadata Standards Directory WG, whose output is the Metadata Standards Directory, where everyone should register newly created metadata schemas so that interested experts can build on what has already been done. The Metadata IG aims to facilitate and coordinate the efforts of all the WGs dealing with metadata. Its activity focuses mostly on data management policies and standards.
PID Information Types WG The PID Information Types WG recognises that, in complex data domains, unique and persistent identifiers (PIDs) associated with specific information are at the core of proper data management and access. They can be used to give every data object (including collection objects) an identity that enables references to the data resources and metadata and, additionally, proof of integrity, authenticity and other attributes. But this requires a PID to be uniquely associated with specific types of information, and those types and their association with PIDs must be well managed. It is therefore useful to specify a framework for information types, to start agreeing on some essential types, and to define a process by which other types can be integrated.
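As a rough illustration of a PID associated with typed information, the sketch below attaches a checksum and a location to an identifier and uses the checksum to check integrity. The set of essential types, the record layout, the helper names and the identifier and URL values are all assumptions made for this example; they are not the types agreed by the WG.

```python
import hashlib

# Hypothetical "essential" information types every PID record should carry.
ESSENTIAL_TYPES = {"checksum_sha256", "location"}


def make_record(pid: str, bits: bytes, location: str) -> dict:
    """Build a PID record whose entries are keyed by information type."""
    return {
        "pid": pid,
        "info": {
            "checksum_sha256": hashlib.sha256(bits).hexdigest(),
            "location": location,
        },
    }


def missing_types(record: dict) -> set:
    """Return the essential information types the record does not provide."""
    return ESSENTIAL_TYPES - set(record["info"])


def verify_integrity(record: dict, bits: bytes) -> bool:
    """Check retrieved bits against the checksum stored with the PID."""
    return hashlib.sha256(bits).hexdigest() == record["info"]["checksum_sha256"]


record = make_record("example/0001", b"1,2,3\n",
                     "https://repo.example.org/objects/0001")
intact = verify_integrity(record, b"1,2,3\n")
tampered = verify_integrity(record, b"1,2,4\n")
```

Because the checksum is stored under an agreed type name, any client that knows the type can verify integrity without knowing anything else about the repository.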
Practical Policies WG The Practical Policies WG is largely agnostic to concrete data models, since it collects a wide variety of typical data management and analytics workflows that are executed on collections. It can be applied in a way that supports the Data Foundation and Terminology (DFT) model.
Research Data Collections WG The Research Data Collections WG is working on the specifics of data collections and their description. This group did not produce results.
Rice Data Interoperability WG The objective of the Rice Research Data Interoperability Working Group is to provide a framework of community-accepted standards to aid data integration and analysis, and to bridge the gap in the free sharing of rice research data. The framework will help identify, describe, and link rice data using open standards. The group will also address issues such as the development of a minimal metadata set and the selection of appropriate vocabularies. The group will encourage adoption of the resulting framework even within private (for-profit) institutions.
Wheat Data Interoperability WG The Wheat Data Interoperability Working Group seeks to devise a common framework to promote and sustain wheat data sharing, reusability and operability. The framework will use open standards for the identification, description, mapping and publication of wheat data. It will also examine the requirements for a minimal metadata set to describe wheat data, and seek to develop recommendations on topical vocabularies and ontologies. The group aims to produce a 'cookbook' on how to produce easily shareable, reusable and interoperable wheat data.
No outputs yet
No outputs yet
A document on “Metadata Principles” has been made available and endorsed by all the related metadata groups.
This WG has not yet produced results.