RDM Platform Coscine - FAIR play integrated right from the start

This is a Preprint and has not been peer reviewed. A published version of this Preprint is available on ing.grid. This is version 7 of this Preprint.

Authors

Ilona Lang, Marcel Nellesen, Marius Politze

Abstract

Nowadays, researchers often need to distribute their research data among a multitude of service providers with varying (if any) levels of maturity in terms of FAIR research data management (RDM). To provide researchers with a single point of access to their project data and to add a FAIR layer to already established services, the RDM platform Coscine was developed. Within Coscine, different services (so-called resources) can be added to a project, allowing access to the associated data for all project participants. A persistent identifier (PID) is assigned to each resource, and metadata management is integrated with flexibly definable schemas based on RDF, OWL and SHACL. Thereby, Coscine bundles for each project the research data, metadata, interfaces and PID into a linked record according to the FAIR digital object (FDO) model.

Comments

Comment #105 Gretchen Greene @ 2024-03-29 15:11

As the topical editor for this article, I am pleased to give my final approval and recommendation of the RDM Platform Coscine article for publication in ing.grid. Thank you to the reviewers for their valuable feedback and also to the authors for their improvements to the article. The Coscine platform delivers a rich set of capabilities for research data management, integrating best practices and standards, and has many clear benefits for the research community.

Invited Review Comment #104 Daniel Allan @ 2024-03-26 17:41

Following up on my initial review, I think Section 3.1 now makes clear the layers available to users at various levels.

Invited Review Comment #103 Chandler Becker @ 2024-03-22 17:38

The authors have sufficiently addressed previous comments for the manuscript to be recommended for publication, and the paper will be a useful addition to the literature in this area.

Comment #95 Marius Politze @ 2024-02-14 20:17

Thanks to the reviewers for their time and their valuable and detailed input concerning our contribution. The following aspects were addressed in the revised version (#4) of the article.

Generally, we substantially reworked Section 5, which was renamed to "Discussion" and now includes "Use Cases", "Comparison with other RDM Platforms", and "Limitations". We found this to be the best way to address the concerns of the reviewers, rather than distributing pieces of information throughout several other sections.

Concerning some of the individual remarks:

> - I would be interested to have a clearer sense of how high the "bar" is and whether a graded approach is available.

We found this to be hard to answer within the scope of the current article. Section 3.1 was extended by a paragraph that tries to give some insights on gradual adaptation within a researcher's or a data manager's workflow. We hope that this addresses the concern.

> - However, there was minimal discussion of the use of Coscine in research practice; some use cases or examples would be very valuable here.

Again, we found this hard to answer because, as the operators of the service offering, our overview of who is using Coscine for what is very incomplete. We tried to give an opinionated selection but also expressed our concern in this direction.

> - There should also be more discussion about how this approach compares with other similar efforts.

Section 5 was extended with a discussion of comparable platforms and their differences in terms of goals and/or functionality. However, an in-depth comparison would greatly extend the scope of the paper and is likely better addressed in a separate article.

> - Besides external collaborators being added on RWTH Aachen projects or access ‘at a low-threshold level via ORCID’, how can non-affiliated groups interact or deploy the software?

Sections 2.2 and 2.3 were extended to also cover use cases, with a reduced feature set, for users that are not directly affiliated with any of the organizations providing storage space.

> - More discussion would also be helpful regarding installing the software at other institutions.
> - Is it expected that other groups would download the code and deploy it with minimal developer interactions required?

A remark on this was added in Section 5.2 describing dependencies for a local setup and a future vision on this topic.

> - How can contributions be made outside of the development team? Are there processes for external groups to collaborate?

A remark was added in Section 5.2 briefly describing ways to make contributions from outside the development team. Going into a full description of the processes set up to allow participation is slightly out of scope in this article but will likely be addressed in a future essay.

> - More information about lessons learned would also be helpful for others who are deploying these types of systems.

A paragraph on the topic was added in Section 6. As this article is more about the "FAIR" standards used and less about the introduction of the more technical operating model, we consider a more in-depth discussion out of scope for the current article.

> - The project is open source with an API (though the code repository location is not in the paper)

References to the API documentation and source code were added in the respective sections.

> - Spelling in Fig 1 - ‘automaticaly’

Fixed the typo.

> - Can’t easily read text in Figure 2 due to small size and white text on a dark background.

Text size and contrast were increased for this figure. We hope that we were able to find a good compromise between the figure's density and legibility.

> - Figure 4 is in German. Since the rest of the manuscript is in English (a requirement for the journal), it would be good to be consistent if possible. If translation in the application is not feasible, a translation in the manuscript would be helpful to indicate what the various fields represent.

The screenshot was recreated with the English language version of the application and an increased font size.

> - Line 50-51: “For each project, Coscine allows inviting all project participants, integrate the project-related data from different resources, and add the related metadata.” Does Coscine allow inviting …, integrating …, and adding…? Or was something else intended?

Yes. This was intended and corrected in the text accordingly.

> - Line 61: “Coscine provides researchers of participating universities or access to storage space on the RDS.” Does “or access” refer to ORCID-based users?

Fixed a typo. This was meant to read "Coscine provides [...] access to storage".

> - References 23 and 24 appear to be duplicates.

Duplicate was removed.

We hope that with the updated preprint we were able to sufficiently address the concerns of the reviewers and look forward to further comments and/or their recommendation for publication.

Comment #37 Gretchen Greene @ 2023-07-24 14:14

As the responsible topical editor, I would like to thank the two reviewers for their detailed and constructive feedback. After consideration of the comments, I advise the authors to revise the paper according to the suggestions provided in the reviews. After completion, the new version of the paper should be uploaded and will be sent to the reviewers again. Thank you.

Invited Review Comment #32 Daniel Allan @ 2023-07-12 12:16

The authors effectively justify the choice to integrate with existing well-sourced cloud systems. The low threshold to participate (e.g. ORCID support) is another strong point. Section 2.4 prompted a question about how high the bar is for "metadata": how much time investment would be required of a newcomer to learn, define if necessary, and use the approved metadata standards. The paper makes clear why some minimum standard of metadata quality and completeness is important but, even as the ideas are developed through Section 3, I would be interested to have a clearer sense of how high the "bar" is and whether a graded approach is available. The range of transport technologies available (REST API, web UI, S3) covers a good range of accessibility and performance. Overall, the paper delineates a clear and practical scope for Coscine and contextualizes it within the ecosystem of RDM solutions.

Invited Review Comment #29 Anonymous @ 2023-06-27 16:31

“RDM Platform Coscine - FAIR play integrated right from the start” by Ilona Lang, Marcel Nellesen, and Marius Politze, describes development of a promising approach to capturing metadata associated with research data and keeping that metadata associated with the data for use.  Specifically, Coscine creates groupings of ‘research data, metadata, interfaces and PIDs into a linked record’ based on a FAIR Digital Object model.


The manuscript included a discussion of the various technologies involved, as well as issues around which metadata is available and appropriate for research users of the systems.  One particular strength of the effort appears to be the Coscine Technical Adaptation Group with its focus on working with users.  It is also good that Coscine considers existing data and not just new projects.  However, there was minimal discussion of the use of Coscine in research practice; some use cases or examples would be very valuable here.  There should also be more discussion about how this approach compares with other similar efforts.


More discussion would also be helpful regarding installing the software at other institutions.  Besides external collaborators being added on RWTH Aachen projects or access ‘at a low-threshold level via ORCID’, how can non-affiliated groups interact or deploy the software?  Is it expected that other groups would download the code and deploy it with minimal developer interactions required?  How can contributions be made outside of the development team?  Are there processes for external groups to collaborate?  More information about lessons learned would also be helpful for others who are deploying these types of systems.


Other comments:

- It is a strength that the system is being used in practice at RWTH Aachen and that there is participation in external consortia such as NFDI4Ing and NFDI-MatWerk

- The project is open source with an API (though the code repository location is not in the paper)

- While the manuscript was generally clear, there were some issues.  For example:

  - Spelling in Fig 1 - ‘automaticaly’

  - Can’t easily read text in Figure 2 due to small size and white text on a dark background.

  - Figure 4 is in German.  Since the rest of the manuscript is in English (a requirement for the journal), it would be good to be consistent if possible.  If translation in the application is not feasible, a translation in the manuscript would be helpful to indicate what the various fields represent.

  - Line 50-51: “For each project, Coscine allows inviting all project participants, integrate the project-related data from different resources, and add the related metadata.”  Does Coscine allow inviting …, integrating …, and adding…?  Or was something else intended?

  - Line 61: “Coscine provides researchers of participating universities or access to storage space on the RDS.”  Does “or access” refer to ORCID-based users?

  - References 23 and 24 appear to be duplicates.


Overall, the manuscript is well structured and describes work that helps group data and metadata aligned with a FAIR Digital Objects philosophy.  It provides a good description of overall approach, including both technical and conceptual implementation, but case studies or examples of its use for research projects would be very helpful in demonstrating usability and impact.


Therefore, while the Coscine effort seems to contribute to FAIR approaches for research data, the manuscript needs revision before it can be recommended for publication.

Metadata
  • Published: 2023-05-05
  • Last Updated: 2024-04-11
  • License: Creative Commons Attribution 4.0
  • Subjects: Data Infrastructure, Data Management Software
  • Keywords: Coscine, Research Data Management Platform, FAIR Guiding Principles, FAIR Digital Object Framework, metadata, Data Management Software, FAIR