plotID - a toolkit for connecting research data and visualization

This is a Preprint and has not been peer reviewed. This is version 3 of this Preprint.

Authors

Martin Hock, Hannes Mayr, Manuela Richter, Jan Lemmer, Peter Pelz

Abstract

The largest share of the information published in papers is contained in visualizations such as 2D or 3D plots. Supporting a generic research workflow, plotID provides tools that can a) create and anchor a reference (ID code, URL, ...) for a figure and b) package the figure together with the data, code and parameters used to create it.
The code is provided as low-impact tools that need to be used consciously by the researcher; it does not aim to relieve researchers of their duty to keep their digital working environment organized. The exported packages help immensely to make results reusable and repeatable.
The initial implementation was created in MATLAB and used internally before the tool was rewritten in the Python programming language for easier distribution and adaptation to diverse environments.
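
The two steps named in the abstract, creating an ID and packaging the files behind a figure, can be illustrated with a minimal standard-library sketch. This is not plotID's code or API: the function names `create_id` and `package`, the ID format, and the folder layout are all assumptions made for illustration only.

```python
import shutil
import time
import uuid
from pathlib import Path
from typing import Iterable


def create_id(prefix: str = "") -> str:
    """Return an ID built from the current Unix time (hex) plus six random hex digits.

    Hypothetical ID scheme, chosen only for this sketch.
    """
    return f"{prefix}{int(time.time()):x}-{uuid.uuid4().hex[:6]}"


def package(figure_id: str, files: Iterable[Path], export_root: Path) -> Path:
    """Copy the given files into export_root/<figure_id> and return that folder.

    Raises FileExistsError if a folder for this ID already exists, so an
    existing package is never silently overwritten.
    """
    target = export_root / figure_id
    target.mkdir(parents=True, exist_ok=False)
    for source in files:
        shutil.copy2(source, target / source.name)
    return target
```

In such a scheme, the same ID would be printed on the figure and used as the folder name, so a reader holding the figure can locate the script, data and parameters that produced it.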

Subjects

Data Management Software

Keywords

research data management, visualization, figure, plot, mapping, referencing, reference, ID, organisation

Dates

Published: 2022-09-05 10:00

Last Updated: 2022-11-18 18:05

License

Creative Commons Attribution 4.0

Comments

Comment #9 Martin Hock @ 2022-11-15 16:54

The authors wish to thank both reviewers for their detailed reviews, advice and points for improvement.
The comments resulted in many changes to both the descriptor and the repository. A new version of the submission has been uploaded. A few points have been added as issues to the repository, to be implemented in the near future.
The changes include:
- Rewrite of the abstract to be more concise in detailing the posed problem and the steps for a solution.
- Added to the statement of need, to better explain the motivation.
- Comparison with similar software: Added general comparisons about the main functions and goals of plotID after an investigation into a wide variety of software used for research data management.
- We made sure to mention the current limitation to Python and matplotlib or picture files earlier and more plainly. Breaking free of these limitations is a future aim of the project.
- The implementation section was supported with a system architecture diagram.
- We decided against listing all Python dependencies; we will add this list to the official documentation instead.
- The example code was provided with a figure showing the resulting folder and picture including the ID.
- We discussed separate installation instructions, but since only the first, optional step diverges, we emphasized this difference instead. Should more steps diverge, we will create separate instructions.
- Added a 'CONTRIBUTING' file containing instructions to the repository, and added a short section with a reference to the readme.
- Added a section in the readme on how to install optional dependencies, which doubles as instructions for a development environment.
- Added a feature request about not printing the ID to the repository. This should be implemented within the next few weeks.
- Added a feature request to add the ID as metadata to picture files and as an additional text file.
- Started investigations on how to add metadata in a generalized way to Python objects and export them.
- Added more details about the plans to export the Python environment as a requirements file.
- Corrected many small errors in wording and formatting.
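
To illustrate the requirements-export item in the list above, here is a minimal sketch of one way to combine import parsing with version capture using only the Python standard library. It is not plotID's implementation; the function names `imported_modules` and `requirements_lines` are hypothetical, and the sketch assumes that the import name equals the installed distribution name, which does not hold for every package.

```python
import ast
from importlib import metadata
from typing import List, Set


def imported_modules(source: str) -> Set[str]:
    """Return the top-level module names imported by the given Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 keeps absolute imports and skips relative ones.
            names.add(node.module.split(".")[0])
    return names


def requirements_lines(source: str) -> List[str]:
    """Build 'name==version' lines for imports that resolve to an installed distribution."""
    lines = []
    for name in sorted(imported_modules(source)):
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Standard-library modules, and packages whose distribution name
            # differs from the import name, end up here and are skipped.
            continue
    return lines
```

Writing the returned lines to a text file yields a pinned requirements file for the plotting script. Packages that are skipped because of a name mismatch would need a separate mapping, which this sketch deliberately leaves out.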

Comment #7 Kevin Logan @ 2022-10-14 10:17

As the responsible topical editor, I would like to thank both reviewers for their detailed and constructive feedback. After consideration of the comments, I advise the authors to revise both the descriptor and the repository according to the suggestions provided in the reviews. After completion, the new version of the descriptor including the revisions may be submitted by the authors for further consideration.

Invited Review Comment #3 Bernd Flemisch @ 2022-10-04 20:53

Overall assessment:

The authors present a Python tool for equipping a plot or, more generally, an image with a unique ID. While the idea is certainly interesting, the manuscript would profit from a clearer motivation and a more careful presentation. I would appreciate it if my comments below were considered before publication.

Detailed comments:

- While I can imagine that IDs for plots can be useful, I would welcome a more elaborate motivation, for example, by means of specific use cases. I would also be very interested in the envisioned interplay of several PIDs for, e.g., paper, data, code and figures.

- Section 3: I would rather have expected a description of the code architecture and its dependencies. I think that there is no need for justifying the usage of Python. Moreover, the license information belongs rather to Section 6.

- Line 101: "everything necessary to recreate the visualization from scratch". I doubt that; usually more will be necessary than just the script: at least the result data, possibly much more. Please provide a more extensive discussion of what should be done in a more general case or point to the (improved) description in Section 5.2.

- Section 5.1.3: It seems to be mandatory that the ID is displayed together with the plot. I could imagine that this is not desired in some cases. Would it also be OK to add the ID to the figure's metadata? Please elaborate on this, and maybe include the option to add it to the metadata.

- Section 5.2: It's currently hard to put the individual "pieces" 5.2.1-3 together to get the full picture. Please organize the section in a better way.

- Section 5.2.1: I wonder if parsing the import statements is really enough to achieve reproducibility. For this, the employed versions should also be captured. Please elaborate on this.

- It would be good to add a specific example with a figure that is put in the paper. Could be taken from the repo, but maybe even Figure 1 qualifies?

Small issues:

- Abstract, line 2: "and or", please choose one.

- Abstract, l.3f: Something seems to be wrong or at least not easy to understand with a) and b). I think that you mean this: "a) create and anchor a reference (ID code, URL,...) for THE FIGURE and b) package figures, data, code and parameters used to create IT." It still sounds odd to "package figures ... used to create a figure". Maybe one can simply omit the "figures," here.

- Abstract, line 4ff: "as tools" and "does" don't fit together.

- L.13f: A closing parenthesis is missing and the next sentence starts without a dot before. I suggest rather "repositories, see ... [6]. Labelling ...".

- L.28: Delete the "on".

- L.36: "even BE shipped".

- Section 5: If a normal word is the first word of a section heading, it should be capitalized.

- Section 5.1.3: "Type of string" should rather be "Of type string".

Invited Review Comment #2 Jane Wyngaard @ 2022-09-21 16:54

PlotID appears to be a useful tool.

Some minor modifications to the descriptor submission and public repository would increase its reuse and impact.

Descriptor:
The abstract should be rewritten; it is poorly structured, making it difficult for a reader to quickly understand what is being reported on. Further, highlights of PlotID-specific features should be included.

The statement of need currently also serves to report on related work. This is limited to naming one other software package. This package's functionality and limitations, along with other similar efforts to fill the same need, should be elaborated on in more detail so as to offer a fuller comparison with PlotID.

The fact that this tool is Python-specific should be pointed out from the start rather than being left ambiguous through the use of the general term 'scripts'. This is a tool that specifically allows for integrated tagging and version control of Python-generated plots for a range of data types such as can be visualised with matplotlib. These facts and other system requirements and features should be highlighted earlier in the description.

A reader would gain more from reviewing the provided example code in Section 4 if it were placed after the contents of Section 5.

Section 5 would be more readable if a system architecture diagram were provided.


Repository:
* Separation of Unix and Windows installation instructions into separate subsections would be more efficient for a reader.
* Contributing: Are pull requests and issues welcome? How would a contributor set up their development environment?