• 2017-05-24
  • Article

The term Data Stewardship refers to the policies, procedures and roles involved in managing data throughout the whole life cycle of data objects. In particular, the term stresses the need for active data management after the end of the project that created the data, metadata, collections, databases, etc.

During a project's lifetime, most administrative roles such as owners, copyright holders, data managers, primary users, metadata creators, etc. have been defined through the project setup or have been implicitly assumed. At the end of a project most, if not all, of these roles, including rights and licensing issues, are no longer well defined.

The end of a project is the moment when data will enter a new phase, because

  • contractual specifications are not valid anymore
  • responsibilities for taking care of the data and metadata at the different layers (storage, updates, access rights, etc.) have generally not been clarified
  • data and metadata need to be curated to make them usable to users outside of the project. This step has often not been seen as part of the project's tasks
  • data may become obsolete after some years, and it needs to be clear who can decide to delete them

This topic page refers to all RDA WGs and IGs that deal in some way with these and other related questions. It is closely related to the Data Management topic and refers to a largely overlapping set of WGs and IGs.

Relevant RDA groups

Active Data Management Plans IG Researchers are being required to provide Data Management Plans (DMPs) for project proposals to show that data management and stewardship are taken seriously and that data will therefore be accessible and reusable. However, during the lifetime of a project the data plan can change for various reasons. DMPs often remain static and do not reflect these changes, and as a result are of limited value. Based on an analysis of current practices, the Active Data Management Plans (ADMP) IG is addressing this gap and is working on three major topics:

  • identifying the requirements for ADMP covering lifecycles of data and changes within projects,
  • specifying practical tools and services to create active data management plans and making them actionable,
  • specifying interfaces and exchange formats for tools that support ADMPs.
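
To make the idea of a plan that records its own changes and can be exchanged between tools more concrete, here is a minimal sketch. It is not the format or tooling of the ADMP IG; the class names, field names and example values are all hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import date


@dataclass
class PlanRevision:
    """One dated change to the plan, with the reason for it (hypothetical)."""
    changed_on: str   # ISO date string, e.g. "2017-05-24"
    field_name: str   # which plan entry changed
    old_value: str
    new_value: str
    reason: str


@dataclass
class ActivePlan:
    """A hypothetical active DMP: current entries plus their change history."""
    project: str
    entries: dict = field(default_factory=dict)
    revisions: list = field(default_factory=list)

    def update(self, field_name, new_value, reason, changed_on=None):
        """Change an entry and record the change instead of overwriting silently."""
        changed_on = changed_on or date.today().isoformat()
        old_value = self.entries.get(field_name, "")
        self.entries[field_name] = new_value
        self.revisions.append(
            PlanRevision(changed_on, field_name, old_value, new_value, reason)
        )

    def to_json(self):
        """Serialise the plan so that other tools could exchange it."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative values only; none of them come from an actual project.
plan = ActivePlan(
    project="example-project",
    entries={
        "repository": "institutional repository",
        "data_steward": "project data manager",
    },
)
plan.update(
    "repository",
    "domain repository",
    reason="data volume exceeded institutional quota",
    changed_on="2017-05-24",
)
exported = plan.to_json()
print(exported)
```

Keeping the revision history next to the current entries is one possible design: a static DMP would only hold `entries`, while the history is what lets a reader see how and why the plan changed during the project.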

Data Citation WG The RDA Working Group on Data Citation (WG-DC) aims to bring together a group of experts to discuss the issues, requirements, advantages and shortcomings of existing approaches for efficiently citing subsets of data. The WG-DC focuses on a narrow field where it can contribute significantly and provide prototypes and reference implementations.

Data Fabric IG The Data Fabric IG is focusing on the data creation and consumption cycle as it happens daily in scientific and industrial labs, and on identifying ways to make this work more efficient and thus more cost-effective. The group’s goal is to identify so-called Common Components and define their characteristics and services, so that they can be used across boundaries and combined to solve a variety of data scenarios.

Data Foundation & Terminology WG The task of the Data Foundation and Terminology (DFT) WG is to describe a basic, abstract data organization model from which a reference data terminology can be derived. This terminology is meant to be used across communities and stakeholders to better synchronize conceptualization, to enable better understanding within and between communities, and finally to stimulate the building of tools, such as data services, that support the use of the basic model. The abstract data organization model will focus on common building blocks and their characteristics, along with relevant protocols.

Libraries for Research Data IG

Libraries have expanded on their traditional roles and developed new services in the digital environment, not just facilitating but becoming active participants in the research process. These services include providing access to and preservation of research data, as well as advising and supporting researchers in the management of research data.

Libraries have a successful history of collaboration and interoperable solutions, something that is increasingly vital in an environment of evolving software and data management products, mobile researchers, and volatile repositories. Maintaining continued long-term access to scholarly assets is essential, and RDA offers a venue for librarians to share their skill sets and expertise in this regard with members of other groups such as the Domain Repositories Interest Group, the Metadata Working Group, and the Data Publishing Interest Group. Librarians in turn can take up best practice developed in other fields and bring it back to the library community. RDA also offers the opportunity to share the principles and practices of librarians experienced in the stewardship of data with domain-specific groups seeking to develop local solutions to often universal problems within data management.

The objectives of the Libraries for Research Data Interest Group include the development of strategies to embed data management services at academic and research institutions, the identification of sustainable organisational business models for libraries in support of research data management (RDM), and the promotion of best practice and interoperability of library infrastructures with domain repositories and other RDM initiatives. Working groups will be formed with reference to specific, short-term activities identified by the Interest Group.

Long tail of research data IG This group draws a fundamental distinction between big data, the massive data sets produced in large scientific projects, and the huge amount of data that is produced daily by researchers and is often stored on local servers and even notebooks. The group is focusing on the latter, the long tail of research data, and wants to develop a set of good practices for managing such data. Based on use cases, the group wants to categorise the types of repositories relevant for long tail data, define the scope of the relevant data sets and repositories, and analyse federation approaches that allow discovery across the many repositories of long tail data. Finally, the group wants to identify and publish good practices and skills needed to manage such long tail data.

Metadata IG The Metadata IG is discussing a new package-based approach to modelling metadata. Its intentions are compliant with the DFT model. The Metadata IG kicked off the Metadata Standards Directory WG, which created the Metadata Standards Directory as its output. Everyone should register newly created metadata schemas there so that interested experts can make use of what has already been done. The Metadata IG aims to facilitate and coordinate the efforts of all the WGs dealing with metadata. Its activity mostly focuses on data management policies and standards.

National Data Services IG The National Data Services IG is analysing a possible special role of national data services (NDS) for data stewardship.

Practical Policies WG The Practical Policies WG is largely agnostic to concrete data models, since it collects a wide variety of typical data management and analytics workflows that are being executed on collections. Its results can be used in a way that supports the DFT model.

Preservation Tools, Techniques, and Policies IG https://www.rd-alliance.org/ig-preservation-tools-techniques-and-policies-rda-9th-plenary-meeting The newly formed Preservation Tools, Techniques, and Policies (PTTP) IG provides a forum to bring together domain researchers, data and informatics experts, and policy specialists to discuss questions such as:

  • what needs to be preserved to enable re-use and reproducibility?
  • are there tools that facilitate preservation and do not hamper research?
  • what preservation policies exist and what are their characteristics?
  • do we need improved preservation policies?

In its first meeting the group focussed on starting a survey and discussing ways of cataloguing preservation tools.

Repository Audit and Certification DSA–WDS Partnership WG Repositories are and will be key pillars for the accessibility and re-usability of digital objects in the emerging global data domain. It is therefore important for all stakeholders involved (data creators, users, funders, etc.) to know which repositories can be trusted to have proper data management capabilities. Two initiatives, Data Seal of Approval (DSA) and World Data System (WDS), worked in parallel on comparable sets of criteria that allow the quality of the policies and procedures followed by a repository to be assessed. Under the umbrella of RDA, both initiatives joined forces to develop a common framework for the certification of trustworthy repositories. The aims are to harmonise the approaches and to give clear signals to stakeholder communities worldwide about the need to assess the quality of repositories using a joint, globally recognised approach.

Repository Platforms for Research Data IG The Repository Platforms for Research Data IG is specifying requirements for proper repository software that supports data stewardship.

Research Data Collections WG The Research Data Collections WG is working on the specifics of data collections and their description. The group has not produced results.

Research Data Provenance IG The Research Data Provenance IG is analysing the requirements for provenance metadata relevant for later data re-use.

Outputs of the Repository Audit and Certification WG

The working group finished its work by producing three documents:

  • a set of harmonized Common Procedures for the certification of repositories, supporting the implementation of the catalogue of common requirements
  • a catalogue of common requirements that merges the DSA and WDS approaches into one joint basis for certification
  • a report on the testbed the group created to evaluate the procedures and requirements

DSA–WDS Partnership – Procedures for Core Certification V1.2.pdf



Outputs of the Practical Policies WG

  • Identification of eleven generic policy areas for operating on data collections stored in repositories, together with a template-based collection of policy specifications for these areas, gathered in a cookbook.
  • Development of code snippets that support the policy specifications and make it easy to turn them into executable procedures.

Outputs of the Active Data Management Plans IG

This IG has not yet produced results.

Outputs of the Long Tail of Research Data IG

The group produced a first document with seven concrete recommendations to support long tail data.


Outputs of the Preservation Tools, Techniques, and Policies IG

The group does not yet have results.