March 16, 2025
Evaluating the understanding of the ethical and moral challenges of Big Data and AI among Jordanian medical students, physicians in training, and senior practitioners: a cross-sectional study | BMC Medical Ethics

Summary of findings

Our examination of a subset of Jordanian healthcare workers, including medical students, demonstrated a fair yet limited awareness of the ethical and moral dilemmas associated with Big Data and AI, at least at a superficial level, across several themes: privacy and confidentiality, informed consent, ownership, biases and divides, epistemology, and accountability. Participants recognized that Big Data and AI carry privacy risks across all stages of implementation; that obtaining informed consent is an ethical limitation of the technology and cannot be trusted due to the complex inner workings of algorithms; that “broad consent” is ethically inappropriate; that individuals are unable to own their data; that Big Data and AI carry inherent biases precipitated by their development process; and that such technology is vulnerable to the same errors as conventional research and lacks an appropriate clinical context for integration. Furthermore, while participants’ attitudes were generally positive, they believed that Jordan lacks the legal capacity to regulate Big Data and AI.

Content within the literature

The utilization of Big Data and AI in healthcare offers numerous possibilities that hold the potential to improve patient care and the efficiency of healthcare systems. However, this integration also brings a range of significant privacy risks that demand careful consideration. One of the most concerning is the potential for data breaches and unauthorized access to sensitive patient information. Despite the implementation of security measures, a critical privacy risk stems from the process of de-identification itself, which, paradoxically, introduces its own set of vulnerabilities.

One of the primary concerns is the potential for re-identification: even after extensive de-identification efforts, health data remains vulnerable to re-identification breaches. Patsakis et al. demonstrated that Large Language Models (LLMs) can have a devastating impact on document deanonymization; even when not explicitly trained for this purpose, LLMs can exploit minor knowledge hints to achieve complete deanonymization of data [25]. Moreover, organizations are increasingly likely to use LLMs to gain visibility into their customers’ data and, more concerningly, trends within it. This sheds light on the threat LLMs pose in the era of Big Data and AI. Similarly, El Emam et al. demonstrated that even after immense de-identification efforts, re-identification remains a plausible risk [26], and Fredrikson et al. showed that re-identification can be achieved through inference attacks, the process by which AI algorithms uncover sensitive information from what was assumed to be non-sensitive data [27]. Thus, researchers and scientists should be proficient in de-identifying data according to the most appropriate guidelines. It is also essential to establish complementary legal safeguards and governance standards, such as data-sharing agreements, to prohibit re-identification attempts and delineate accountability.
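To make the re-identification risk concrete, consider the minimal sketch below (our own illustration, not drawn from the cited studies): it checks a toy “de-identified” table for k-anonymity over a set of quasi-identifiers. A record that is unique on its quasi-identifiers (k = 1) can be singled out by anyone who knows those attributes, even though all direct identifiers were removed.

```python
from collections import Counter

# Toy "de-identified" records: direct identifiers removed, but
# quasi-identifiers (age band, ZIP prefix, sex) remain.
records = [
    {"age": "30-39", "zip3": "111", "sex": "F", "dx": "diabetes"},
    {"age": "30-39", "zip3": "111", "sex": "F", "dx": "asthma"},
    {"age": "60-69", "zip3": "222", "sex": "M", "dx": "cancer"},  # unique combination!
    {"age": "30-39", "zip3": "111", "sex": "F", "dx": "flu"},
]

QUASI_IDENTIFIERS = ("age", "zip3", "sex")

def k_anonymity(rows, quasi_ids):
    """Smallest equivalence-class size over the quasi-identifiers.
    k = 1 means at least one person is uniquely re-identifiable by
    anyone who already knows their quasi-identifier values."""
    classes = Counter(tuple(r[q] for q in quasi_ids) for r in rows)
    return min(classes.values())

print(f"k-anonymity = {k_anonymity(records, QUASI_IDENTIFIERS)}")
# k = 1: the 60-69 / 222 / M record is unique, so "de-identification" failed.
```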

Informed consent has always been the cornerstone of medical ethics, emphasizing the importance of providing patients with information about their proposed treatments to enable autonomous decision-making [28]. However, with the integration of Big Data or AI systems that are able to make predictions or find trends within data, secondary uses of data become apparent and the concept of ‘consent’ is challenged [7]. In the context of Big Data, the limitations of informed consent become apparent, and broad consent emerges as a potential solution for navigating the complexities of healthcare data sharing and research while respecting patient privacy concerns [11]. However, although this solution can provide legal coverage for Big Data or AI applications, it is ethically questionable at best. Under broad consent, neither the researcher nor the patient knows at the time of consent what data will be used or for what objectives, since these objectives are often determined at a future point, once the data has matured.
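The structural problem with broad consent can be phrased in code. In the hypothetical sketch below (all names and fields are our own invention, not an existing standard), a purpose check against consent metadata is only meaningful when the permitted purposes were enumerable at consent time; under broad consent the check degenerates into always allowing the use, which is precisely the ethical gap described above.

```python
from dataclasses import dataclass

@dataclass
class ConsentRecord:
    """Hypothetical consent metadata attached to a patient's dataset."""
    patient_id: str
    allowed_purposes: frozenset[str]  # purposes enumerated at consent time
    broad: bool = False               # True == "broad consent" was given

def secondary_use_permitted(consent: ConsentRecord, purpose: str) -> bool:
    # Under broad consent no purpose was specified up front, so any future
    # use passes the check. That is exactly the ethical problem: the check
    # cannot distinguish uses the patient would have objected to.
    if consent.broad:
        return True
    return purpose in consent.allowed_purposes

specific = ConsentRecord("p001", frozenset({"diabetes-registry"}))
broad = ConsentRecord("p002", frozenset(), broad=True)

print(secondary_use_permitted(specific, "insurance-risk-scoring"))  # False
print(secondary_use_permitted(broad, "insurance-risk-scoring"))     # True
```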

In addition to the ethical considerations, the legal aspect of data ownership is a subject of ongoing debate and ambiguity. The implementation of recent Big Data and AI tools has not only increased the importance of owning such data, but also increased its value to both the public and private health sectors [29]. Individual ownership of data, including healthcare data, runs contrary to well-established legal precedents in the United States, the United Kingdom, and many other jurisdictions [30]. This perspective stems from a long-established legal model that does not recognize property interests in facts or information. In contrast, European data protection regulations typically frame data-related rights as an extension of fundamental human rights, which gives individuals a certain degree of control over their data and implies that such data are unsuitable for commodification or commercialization. Beyond the European and American perspectives, national legal frameworks differ considerably in their stance on data ownership. Contract law plays a significant role in defining the rights of data originators and processors; however, it does not address the foundational question of who owns the data [5].

Big Data and AI offer promising solutions for global healthcare challenges, addressing resource shortages and improving healthcare infrastructure [31, 32]. However, this transformative potential also carries a substantial risk of exacerbating existing health and economic inequalities. If not carefully and thoughtfully adopted, AI may unintentionally reinforce existing disparities among demographic groups. Algorithms that are not rigorously tested across diverse demographics can yield inaccurate results, leading to diagnostic tests that perform better for some groups at the expense of others [33]. An example of this bias emerged when an AI-powered dermatology application, trained predominantly on Caucasian skin types, showed reduced diagnostic accuracy for Black patients [34]. This implies that such an AI-powered tool, despite its headline diagnostic accuracy, may be of limited use in regions such as Asia, Sub-Saharan Africa, Latin America, or the Middle East, owing to the different epidemiology of skin diseases and the fact that a machine learning (ML) model cannot explain any phenomenon beyond its training dataset. The latter point is of utmost importance, as it can amplify already existing disparities. For example, ML algorithms inherently exhibit bias against underprivileged and minority populations because these groups have less access to healthcare services and therefore contribute fewer data points [35]. This was exemplified by Jacoba et al., who demonstrated that AI-powered diagnostic tools for retinal diseases may show reduced accuracy in underrepresented populations due to the lack of accessible representative images for 45% of the global population [36].
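One lightweight guard against this kind of bias is a routine subgroup audit: computing a model’s accuracy separately for each demographic group rather than only in aggregate. The sketch below, using made-up group labels and predictions purely for illustration, shows how a tolerable-looking aggregate metric can hide a large gap between groups.

```python
from collections import defaultdict

# Hypothetical test-set results: (demographic group, true label, model prediction).
results = [
    ("group_A", 1, 1), ("group_A", 0, 0), ("group_A", 1, 1), ("group_A", 0, 0),
    ("group_B", 1, 0), ("group_B", 0, 0), ("group_B", 1, 0), ("group_B", 0, 1),
]

def accuracy_by_group(rows):
    """Per-group accuracy; aggregate accuracy alone can mask disparities."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, y_true, y_pred in rows:
        totals[group] += 1
        hits[group] += int(y_true == y_pred)
    return {g: hits[g] / totals[g] for g in totals}

overall = sum(int(t == p) for _, t, p in results) / len(results)
print(f"overall accuracy: {overall:.2f}")   # 0.62 looks tolerable in aggregate...
for group, acc in accuracy_by_group(results).items():
    print(f"{group}: {acc:.2f}")            # ...but group_B sits at 0.25
```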

Big Data and AI applications introduce a variety of epistemological challenges, chiefly concerning data collection and data analysis. On the collection side, researchers must take extremely cautious steps when acquiring and preprocessing data, because most of the generated datasets utilized in Big Data are not the output of valid and reliable instruments [37]. On the analytic front, analysis within Big Data and AI is entirely data-driven, an approach that anti-data fundamentalists regard as producing irreproducible studies and unreliable data while relying on inappropriate statistics [32, 38]. While such an approach can be more precise than traditional theory-based science, its processes for extracting and deriving meaning from even hidden trends are semantically blind [37]. Nonetheless, recent epistemological literature considers the data-driven and theory-driven approaches as complementary and potentially convergent rather than radically divergent [1].
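The “semantically blind” character of purely data-driven analysis can be demonstrated in a few lines: testing many random features against a random outcome reliably yields “significant” associations that mean nothing. Below is a minimal simulation of this multiple-comparisons effect; the feature count and the 0.05 threshold are arbitrary choices for illustration.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
n_patients, n_features = 100, 500

# Pure noise: no feature has any real relationship to the outcome.
features = rng.normal(size=(n_patients, n_features))
outcome = rng.normal(size=n_patients)

# Blind, theory-free mining: test everything, keep whatever looks "significant".
p_values = np.array([pearsonr(features[:, j], outcome)[1]
                     for j in range(n_features)])
hits = int((p_values < 0.05).sum())

# Expect roughly 25 "discoveries" (5% of 500) from data containing no signal at all.
print(f"'significant' associations found in pure noise: {hits}")
```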

Within the Jordanian landscape, Big Data and AI applications are gaining traction. In our study, we characterized the knowledge, attitudes, and practices of Jordanian healthcare workers, including medical students, toward Big Data and AI. In terms of risk, participants were moderately aware of the impact of Big Data and AI on privacy, consent, and the widening of inequality. Nonetheless, some items may mirror ideological and cultural differences relative to Western standards. For example, 59% of participants would justify an in-house data breach under certain scenarios; the Apple-FBI dispute of 2016 clearly showed that such a notion would not be accepted in the West, particularly in the United States [39]. Another example is the dominant stance that institutions should have quasi-control of patients’ data, which may indicate that Jordanian institutions, or the healthcare workers they employ, would be willing to use patients’ data for commodification purposes if the opportunity arose. Nonetheless, given participants’ significant lack of experience with Big Data and AI, their poor awareness of the epistemological and methodological biases associated with such technology might be expected. Indeed, healthcare workers were not confident that the Jordanian healthcare landscape could adopt or regulate such technology.

Limitations

Our study represents a preliminary investigation into the understanding of the ethical risks associated with Big Data and AI. However, it is bound by a number of limitations, including the use of a questionnaire that underwent face validation only, vulnerability to the biases inherent in cross-sectional designs, the closed-ended nature of the questionnaire, and a convenience sampling technique. The latter is particularly important, as it may fail to produce a sample representative of the Jordanian healthcare workforce. Moreover, the full spectrum of psychometric properties of the developed scale was not calculated; however, the questionnaire was not designed to produce scores for its sub-components, nor are there similar tools against which to test its validity.
