Towards categorizing ethical questions in data literacy

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.




Samira Khodaei , Anas Abdelrazeq , Ingrid Isenhardt


Data literacy is crucial for a sustainable engineering education [11]. In the search for solutions to future challenges, mechanical engineering has started to integrate data literacy into higher education curricula [13]. However, ethics is rarely considered in current frameworks; it is treated as a side topic or equated with data privacy issues [2]. Since literacy aims to empower people to make informed decisions based on their own or others' data [12], the development of critical reflection and discussion on ethics is central to data literacy. In our contribution, we first summarize existing data literacy frameworks and their treatment of ethics. Then, through a focus group study among data literacy experts, ethical questions in data literacy are collected and categorized. The study was conducted with 15 experts at the NFDI4Ing Conference 2022. This approach expands the scope of ethical issues in data literacy beyond data privacy towards applied, current, and pressing ethical topics.


Data Ethics, Data Literacy


ethical literacy, data literacy, ethics, interdisciplinary research, focus group study


Published: 2023-05-12 10:00


Creative Commons Attribution 4.0



Invited Review Comment #42 Björn Schembera @ 2023-08-07 12:37

The paper problematizes that ethics appears in few, if any, curricula in the field of data literacy, even though it is important for enabling critical reflection on data.

The paper makes an important contribution by presenting the results of a workshop that collected and categorized ethical issues related to data science. Overall, it is important that ethical questions receive more attention, especially in the data engineering/literacy domain.

The presentation of the state of the art in section 2 of the paper is convincing, as is the presentation of the methodology in section 3.

Areas for improvement of the paper would be as follows:
- The study collected 20 ethical questions, but these are mentioned only sporadically in the body text. It would be interesting to see all questions, with their classification, in a table or in an appendix (there should be space for this).
- The six categories "fall from the sky": where do they come from? This should be explained.
- The paper never really defines and elaborates what is meant by ethics, and this is a weakness. Such a definition would be important for categorizing the questions.
- Following on from this, there is also the question of why a distinction is made at all between "process-centered" and "human-centered" categories. Ethical judgments or actions can in general only be carried out by humans (possibly mediated by a machine, or by technology in general); ultimately, it is the human being who decides how to act, and that decision can then be ethically evaluated.

Invited Review Comment #40 Anonymous @ 2023-07-27 13:44

(1) In general, the article (apart from the problems mentioned under (2), (3), and (4)) is not clear enough about its objective: is it only about data literacy for mechanical engineering, or about data literacy in general?

(2) In my opinion, the keyword "ethics", which is used sweepingly in this article, needs to be specified, otherwise the topic will be missed. In the context of data literacy (for mechanical engineering), it is not about ethics in general, but can only be about "research ethics". This in turn is closely related to issues of "good scientific practice" (Gute wissenschaftliche Praxis, cf. DFG) and quality assurance in data management and data science. Avoiding biases in data sets, for example, is primarily a question of quality assurance, not of research ethics. The degree of transparency of data should in turn be set out in a policy of the data-holding institution, etc. (research ethics would require: know the policy, follow the policy, or formulate a policy). In this respect, the ethics definition from footnote 8 is simply wrong or unusable; the online publication cited there is neither a scientific nor a suitable source on the topic, and of poor quality. A topic like corporate power, on the other hand, is a political problem, not an ethical one, whereas "digital sovereignty" has meanwhile become a quality criterion of data services and software solutions. For example, an open source preference (or an obligation to be GAIA-X compliant) may apply to data infrastructures. But then it is a defined, "hard" requirement, not ethics.

(3) The methodological goal (and purpose) of using the focus group was not clear to me. Apparently, practitioners were asked about "ethical" concerns/problems at the meeting. As it seems, however, this was done without defining "ethical" beforehand (or even narrowing the topic to research ethics). The collection of problems that came about in this way thus, unsurprisingly, merely reflects the apparently amateurish and highly disparate "general" understanding of ethics of the people involved. What has been collected in this way? At any rate, not components of a possible research ethics curriculum, nor really a "need", but only a (mis)interpretation of the word. For more, a group already trained in established ethics requirements would have been necessary.

(4) The summary of the results of the experiment is correspondingly vague in the text. I do not think that the "collected ethical questions" can point the way to a reasonable "ethics" extension of data literacy. The method does not fit the goal. More importantly, there is not really a need for "categorizing" bottom-up. You need to know practice very well to postulate (or to teach) research ethics criteria, but you cannot derive research ethics criteria from empiricism or by collecting opinions. A better approach would be to systematically evaluate the already existing research ethics codes of other data-driven disciplinary cultures and have experts apply them to the research reality of mechanical engineering. The (research) ethics specifications of the major engineering associations (in Germany: VDI) should also be central. Initial AI research ethics guidelines are also already available.