

Plot Serializer - A Tool for Creating FAIR Data for Scientific Figures

This is a Preprint and has not been peer reviewed. This is version 3 of this Preprint.

Authors

Michaela Leštáková, Ning Xia, Julius Florstedt

Abstract

To fight the reproducibility crisis in science, more and more researchers are adopting the practice of sharing their research data. However, making research data comprehensible and reusable for others often takes a significant amount of time and effort. This software descriptor introduces Plot Serializer, a Python package that supports researchers in creating FAIR datasets corresponding to the figures of their manuscript. Fitting into existing workflows, Plot Serializer enables effortless export of the data plotted in scientific figures into interoperable datasets with customizable metadata for improved reusability, and thus facilitates research data management practices. Besides a clear description of Plot Serializer's scope and functionality, a minimal example of its usage and output is given. Finally, its limitations and future plans are outlined.

Comments

Comment #219 Michaela Leštáková @ 2025-09-23 15:21

Dear editor, dear reviewers,
Thank you for considering our software descriptor for publication in ing.grid and for providing thorough feedback.
We found your comments very helpful and are convinced that addressing them has contributed greatly to the quality of our paper.
The most significant changes include:
• adding a new paragraph in Section 2: Scope to discuss the limitations of Plot Serializer in more detail (78-83), addressing the review comment #214 by Rene Caspart
• adding a new paragraph in Section 2: Scope discussing the human and machine readability of JSON, addressing the review comment #212 by Dorothea Iglezakis
• adding PlotID to Related Work and describing commonalities, differences and synergies between the two tools (103-111), addressing the review comment #212 by Dorothea Iglezakis
• adding a whole new Section 4: Embedding Plot Serializer in Research Workflows, including a diagram (Fig. 1), to describe how the tool can be used within research workflows along with research code, research data and publications (122-148), addressing the review comment #212 by Dorothea Iglezakis
All changes have been marked in the PDF file.
The new section addresses multiple important issues mentioned by Dorothea Iglezakis (including where the JSON file can be stored, which tools and repositories can be used for storing research data and code, the linking between the figure and the JSON file, and the role of RO-Crates). We believe that adding this section greatly improves the understanding of how Plot Serializer can be used in a research data management context.
Moreover, we have made further changes to address the thoughtful feedback from Dorothea Iglezakis, namely:
• adding GitLab badges to link to the relevant Zenodo publications
• reformulating the introduction, omitting the term “research objects” which carries meaning different from the intended one
• providing details on the RO-Crate implementation (179-183)
Regarding the last comment from Dorothea Iglezakis, we fully agree that adding a JSON-LD context file would improve the balance between human and machine readability. We discuss the issue of standardizing our JSON specification possibly by extending it to JSON-LD, building upon existing ontologies, in the outlook – it is something we find important and would like to address in the near future. However, the current version of Plot Serializer does not support this functionality yet.

Comment #218 Christian Stemmer @ 2025-09-22 17:54

Thanks to the reviewers who made valuable remarks.
The authors have been asked to review their submission and we are expecting a revised version.

Christian Stemmer

Invited Review Comment #214 Rene Caspart @ 2025-07-31 11:09

Dear authors,

Thank you for submitting your work. I found the paper interesting to read. It addresses a very important, yet often overlooked, aspect of everyday scientific work and publication.

The paper is well written and clearly illustrates the design, usability and other aspects of Plot Serializer. Plot Serializer itself provides a very important contribution to enabling FAIR research data and output, especially with respect to the "R" aspect.
In itself, the software adheres well to the FAIR principles and also to general good practices for research software.

I have only one minor comment, which I would find a good addition to your submission:
In your work you mention that Plot Serializer, understandably, is not able to reproduce all features of a plot, but mostly captures the nature of a plot and focuses on reproducing the features necessary for interpreting the reproduced plot. To further illustrate this, I would appreciate an example showing (some) features which cannot be captured.

Invited Review Comment #212 Dorothea Iglezakis @ 2025-05-31 13:53

The paper introduces the Python package Plot Serializer, which functions as a wrapper for matplotlib to generate not only the figure itself, but also a serialized version of the data and information included in the figure in the form of a human- and machine-readable JSON file. The tool also allows the figure to be recreated by deserializing the JSON file. The tool itself is published on the Python package index pypi.org and documented via readthedocs.io. The code is also published on Zenodo and archived on Softwareheritage.org, but the DOI of the publication is, interestingly, mentioned neither in the GitLab repository nor in the CITATION.cff file of the software package. (One way to link back from the GitLab repository to the Zenodo publication could be to add a badge just below the header of the README.md file by adding the following line:
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.15082363.svg)](https://doi.org/10.5281/zenodo.15082363) )

The software descriptor is well written and describes the package mainly from an implementation perspective (section 4). The minimal example in section 5 is very helpful for understanding the functionality and the scope of the package. The data model is also described clearly. What is somewhat missing is a description of how the tool can be embedded within research processes to enable FAIRer research. To really take on the challenge of making research data and code FAIRer, much more than the (de-)serialization of figures has to be done. The functionality of the package is helpful and important. But it would be good to describe, at least from a high-level perspective, how Plot Serializer could be embedded into a workflow and with which other tools and services (PlotID, data repositories, ...) the package could be combined to really help researchers find and use the data and code underlying a publication.

  • How can the metadata embedded in the JSON file be used by machines or humans?
  • Where could the JSON file be put to be findable and how could it be connected to the figures in the paper on the one side and the underlying data (and code) on the other side?
  • What role do the mentioned RO-Crates play in the process?
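One possible answer to the last two questions can be sketched as follows. This is only a minimal illustration, not the Plot Serializer implementation: it assumes the JSON file is added to an RO-Crate as an ordinary file entity next to the figure, and the file names `figure1.json` and `figure1.png` as well as the helper `build_ro_crate_metadata` are hypothetical. The figure is linked to its data through the schema.org property `isBasedOn`.

```python
import json


def build_ro_crate_metadata(json_file, figure_file):
    """Return a minimal RO-Crate 1.1 metadata document listing both files.

    Hypothetical helper for illustration; a complete crate would also need
    descriptive metadata (e.g. name, description, license) on the root dataset.
    """
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Metadata descriptor: identifies this file as the crate metadata
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: lists the files contained in the crate
                "@id": "./",
                "@type": "Dataset",
                "hasPart": [{"@id": json_file}, {"@id": figure_file}],
            },
            {
                # The serialized figure data (assumed to be a plain JSON file)
                "@id": json_file,
                "@type": "File",
                "encodingFormat": "application/json",
                "description": "Serialized data and metadata of the figure",
            },
            {
                # The rendered figure, linked to the data it was created from
                "@id": figure_file,
                "@type": "File",
                "encodingFormat": "image/png",
                "isBasedOn": {"@id": json_file},
            },
        ],
    }


crate = build_ro_crate_metadata("figure1.json", "figure1.png")
crate_text = json.dumps(crate, indent=2)  # content for ro-crate-metadata.json
```

Written to `ro-crate-metadata.json` in the same directory as the two files, such a document would let a machine discover which JSON file belongs to which figure without parsing the manuscript.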

Detailed comments:

  • The introduction starts with research objects that are defined as code or data. This very broad definition encompasses much more than the objects addressed by the Plot Serializer, namely the underlying data points of a visualization: raw data, reused data, post processed data, analysed data, research software, software used in research, scripts, .... 
  • As a JSON file is by itself not a FAIR dataset (missing a PID, searchable metadata and a license), even if human and machine readable, the claim in line 9 is a bit far-fetched.
  • The interconnection between different research objects as described in lines 10-12 is not really addressed by Plot Serializer. In particular, the linking between the figure and the data is not really described in the article, but should be (e.g. by using PlotID?). Perhaps you could also mention in the outlook how additional interconnections could be achieved with Plot Serializer (e.g. by adding additional metadata linking to the PIDs of other research objects or artefacts).
  • Perhaps you could already refer in line 58 to the example JSON file in section 5.
  • It does not become clear how the resulting JSON file is added to an RO-Crate as mentioned in line 59 and lines 126f. Just as an additional file object? Or somehow connected to the metadata of the RO-Crate? An example could help here.
  • The related work described in lines 93-96 would be the ideal spot to describe the role of Plot Serializer in a FAIR research workflow. But it does not become clear in what way the work mentioned here relates to Plot Serializer. Is it building upon the functionality of Plot Serializer, somehow compatible with it, or a completely different way to handle the problem? And why is PlotID (https://doi.org/10.48694/inggrid.3632) not mentioned as related work, as it seems to be the ideal tool to link the generated JSON data file to the figure itself?
  • To make the metadata within the JSON file more easily usable by other tools, the standardization of the metadata and its values outlined in section 7 would be very valuable. Adding a JSON-LD context file could help keep the balance between human and machine readability. It would be good not to reinvent a metadata schema, but to build upon existing standards (e.g. DCAT, DublinCore, Croissant).
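To illustrate the JSON-LD suggestion in the last point, here is a minimal sketch of what a context could look like. The key names (`title`, `creator`, `description`) are hypothetical and may not match the actual Plot Serializer JSON keys; they are mapped to Dublin Core terms purely as an example. The `expand_key` helper is a deliberately simplified stand-in for real JSON-LD expansion, which a dedicated JSON-LD library would perform.

```python
import json

# Hypothetical key names; the real Plot Serializer JSON keys may differ.
CONTEXT = {
    "dcterms": "http://purl.org/dc/terms/",
    "title": "dcterms:title",
    "creator": "dcterms:creator",
    "description": "dcterms:description",
}


def with_context(document, context=CONTEXT):
    """Return a copy of a plain JSON document with an embedded @context."""
    return {"@context": context, **document}


def expand_key(key, context=CONTEXT):
    """Resolve a short key to a full IRI via the prefix table.

    Simplified illustration only; returns None for unmapped keys.
    """
    term = context.get(key)
    if term is None:
        return None
    prefix, _, local = term.partition(":")
    return context.get(prefix, prefix + ":") + local


# The human-readable keys stay unchanged; only the context is added.
plain = {"title": "Example figure", "creator": "Jane Doe"}
linked = with_context(plain)
context_file_text = json.dumps({"@context": CONTEXT}, indent=2)
```

The design point is that the JSON keys remain short and readable for humans, while the context tells machines which existing vocabulary term each key refers to.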
     

Small typos:

  • double "the" in line 64
  • "Happinness level" -> "Happiness level" in section 5 (Minimal example)

 


Metadata

  • Published: 2025-04-14
  • Last Updated: 2025-09-23
  • License: Creative Commons Attribution 4.0
  • Subjects: Data Management Software
  • Keywords: research data management, figure, plot, FAIR data, metadata
