Ethics And Governance Of Artificial Intelligence In Health


Editor’s Note


With the rapid development of artificial intelligence (AI) technology, the health sector is continually seeking to improve the quality of medical care and enhance work efficiency by introducing AI. Because large model technology can process large-scale data and perform complex tasks, it significantly enhances the generalization, versatility, and practicality of AI. The technology therefore has broad application prospects and potential in disease prediction, diagnosis, treatment, and drug development, but it also brings many ethical challenges and risks that urgently require strengthened governance.

On January 18, 2024, the World Health Organization (WHO) released the English version of “Ethics And Governance Of Artificial Intelligence In Health: Guidance On Large Multi-modal Models,” aimed at assisting countries in mapping the benefits and challenges associated with multi-modal large models in the health sector and providing policy and practical guidance for the appropriate development, provision, and use of these models.

Given the significant strategic importance of multi-modal large model AI for our country, both in gaining new advantages in future strategic competition and in promoting public health, this journal has organized experts engaged in related research to translate the guidance into Chinese for researchers’ reference. The aim is to advance research on and guidance for the ethical governance of medical large models in our country, achieving a virtuous interaction between high-quality innovative development and high-level safety.

This article was first published on CNKI; the citation format is as follows:

Wang Yue, Song Yaxin, Wang Yifei, et al. Ethics And Governance Of Artificial Intelligence In Health: Guidance On Large Multi-modal Models [J/OL]. Chinese Medical Ethics: 1-58 [2024-03-14]. http://kns.cnki.net/kcms/detail/61.1203.R.20240304.1833.002.html.


Ethics And Governance Of Artificial Intelligence In Health: Guidance On Large Multi-modal Models

Ethics and governance of artificial intelligence for health. Guidance on large multi-modal models. Geneva: World Health Organization; 2024. Licence: CC BY-NC-SA 3.0 IGO.

This translation is not derived from the World Health Organization (WHO), and the WHO is not responsible for the content or accuracy of the translation. The original English version should be considered the authoritative text.

Original version ISBNs: 978-92-4-008475-9 (electronic version); 978-92-4-008476-6 (print version)


Translators


Wang Yue1, Song Yaxin1, Wang Yifei1, Translation; Yu Lian2, Wang Jing3, Review

(1 Xi’an Jiaotong University School of Law, Xi’an, Shaanxi 710049; 2 Xi’an Jiaotong University School of Public Health, Xi’an, Shaanxi 710061; 3 Beijing Traditional Chinese Medicine Hospital Affiliated to Capital Medical University, Beijing 100010)


Abstract

Artificial intelligence (AI) refers to the ability of algorithms integrated into systems and tools to learn from data so that they can perform automated tasks without explicit programming for each step. Generative artificial intelligence is a category of AI in which machine learning models are trained on datasets to generate new content (such as text, images, or videos). This guide focuses on one type of generative artificial intelligence, namely, Large Multi-modal Models (LMMs). These models can accept one or more types of data input and produce diverse outputs that are not limited to the type of data fed into the algorithm. It is predicted that multi-modal large models will be widely applied in healthcare, scientific research, public health, and drug development. Multi-modal large models are also referred to as “General-purpose Foundation Models,” although it has yet to be proven that they can accomplish a wide range of tasks and purposes.

The speed of adoption of multi-modal large models surpasses that of any consumer application in history. They are notable for facilitating human-computer interaction, mimicking human communication, and responding to queries or data inputs in a human-like and seemingly authoritative manner. With rapid consumer adoption and acceptance, and considering their potential to disrupt core social services and economic sectors, many large tech companies, startups, and governments are investing in and competing to steer the development of generative artificial intelligence.

In 2021, the World Health Organization (WHO) released a comprehensive guide on “Ethics And Governance Of Artificial Intelligence In Health.” The WHO consulted 20 leading experts in the field of artificial intelligence who identified the potential benefits and risks of using AI in health and published six principles reached through consensus for governments, developers, and providers using AI to consider in policy and practice development. These principles should guide a wide range of stakeholders, including governments, public institutions, researchers, businesses, and implementers in the development and deployment of AI in health. The six principles are: (1) Protect human autonomy; (2) Promote human well-being, safety, and the public interest; (3) Ensure transparency, explainability, and understandability; (4) Foster accountability and responsibility; (5) Ensure inclusivity and fairness; and (6) Promote responsive and sustainable AI (Figure 1).


Figure 1: Consensus on Ethical Principles for Artificial Intelligence in Health by WHO

The purpose of the WHO in publishing this guide is to assist member states in mapping the benefits and challenges associated with multi-modal large models in health and to provide policy and practical guidance for the appropriate development, provision, and use of multi-modal large models. This guide offers governance recommendations for companies, governments, and international cooperation consistent with the guiding principles. The guide is grounded in those guiding principles and governance recommendations while considering the novel ways in which humans use generative AI in health.

Applications, Challenges, and Risks of Multi-modal Large Models

The potential applications of multi-modal large models in health are similar to those of other forms of AI; however, the access and usage of multi-modal large models are new, presenting both new benefits and risks that social systems, health systems, and end users are not yet prepared to address. Table 1 summarizes the main applications of multi-modal large models and their potential benefits and risks.

[Table 1: Main applications of multi-modal large models in health, with their potential benefits and risks]

Systemic risks associated with the use of multi-modal large models include risks that may affect healthcare systems (Table 2).

[Table 2: Systemic risks to healthcare systems associated with the use of multi-modal large models]

The use of multi-modal large models may also bring broader regulatory and systemic risks. One concern being studied by some data protection agencies is whether multi-modal large models comply with existing legal or regulatory frameworks, including international human rights obligations and national data protection laws. Because of how training data for multi-modal large models are collected, how the collected data (or data input by end users) are managed and processed, the limited transparency and accountability of multi-modal large model developers, and the tendency of multi-modal large models to “hallucinate,” these algorithms may not comply with current laws. Multi-modal large models may also violate consumer protection laws.

As the use of multi-modal large models continues to grow, developing them requires vast amounts of computational power, data, human resources, and financial resources. A broader social risk associated with the use of such algorithms in health is the fact that multi-modal large models are predominantly developed and deployed by large tech companies. This may strengthen the dominance of these tech giants over the development and use of AI compared to smaller businesses and governments, including directing the focus of AI research in the public and private sectors. Other concerns regarding the potential dominance of large tech companies also include inadequate corporate commitments to ethics and transparency. New voluntary commitments between companies and between companies and governments can mitigate some risks in the short term but cannot replace the government oversight that may ultimately be implemented.

Another social risk is the carbon and water footprint of multi-modal large models. Like other forms of AI, multi-modal large models require significant energy and produce an increasing water footprint. While multi-modal large models and other forms of AI can bring significant social benefits, the growing carbon emissions may become a major factor in climate change, and the increasing water consumption may further negatively impact water-scarce communities. Another social risk associated with the emergence of multi-modal large models is that despite providing seemingly plausible responses, they may gradually be regarded as sources of knowledge, ultimately undermining the authority of human knowledge, including in healthcare, scientific, and medical research fields.

Ethics And Governance Of Multi-modal Large Models In Healthcare And Pharmaceuticals

Multi-modal large models can be seen as the product of a series (or chain) of decisions made by one or more actors regarding programming and product development (Figure 2). Decisions made at each stage of the AI value chain can have direct or indirect impacts on downstream entities involved in the development, deployment, and use of multi-modal large models. Governments can influence and regulate these decisions by enacting and enforcing laws and policies at national, regional, and global levels.


Figure 2: Value Chain for the Development, Provision, and Deployment of Multi-modal Large Models

The AI value chain typically begins with a large tech company, referred to as the “developer” in this guide. Developers can also be universities, smaller tech companies, national health systems, public-private partnerships, or other entities with the resources and capabilities to utilize several inputs. These inputs comprise the “AI infrastructure” (a term used by governments in legislation and regulation to describe multi-modal large models), such as data, computational power, and AI expertise used to develop general-purpose foundation models. These models can be used directly to perform various tasks, including unanticipated ones, among them tasks related to healthcare. Several general-purpose foundation models have been specifically trained for use in healthcare and pharmaceuticals.

Third parties (“providers”) can use general-purpose foundation models for specific purposes or applications through application programming interfaces (APIs). This includes: (i) fine-tuning new multi-modal large models, which may require additional training on the foundation model; (ii) integrating multi-modal large models into applications or larger software systems to provide services to users; or (iii) integrating add-on components known as “plugins” that steer, filter, and curate a multi-modal large model’s output into a formal or standardized format to generate “digestible” results.
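To make the provider role concrete, here is a minimal, hypothetical sketch in Python of patterns (ii) and (iii): wrapping a general-purpose foundation model behind a health application and adding a “plugin”-style step that curates the raw output into a standardized format. The function call_foundation_model and the output schema are illustrative assumptions, not any real vendor’s API.

```python
# A minimal, hypothetical sketch (not a real vendor API) of a provider
# wrapping a general-purpose foundation model and curating its output.
import json

def call_foundation_model(prompt: str) -> str:
    # Placeholder for an authenticated API call to the developer's hosted
    # model; the canned reply below stands in for a real model response.
    return "Possible causes: seasonal allergy; viral infection."

def curate_output(raw: str) -> dict:
    # "Plugin"-style post-processing: steer free text into a fixed schema
    # so deployers (e.g., a hospital system) receive uniform records.
    causes = [c.strip(" .") for c in raw.split(":", 1)[1].split(";")]
    return {"differential": causes, "disclaimer": "Not a medical diagnosis."}

raw = call_foundation_model("Patient reports sneezing and itchy eyes.")
print(json.dumps(curate_output(raw), indent=2))
```

Pattern (i), by contrast, would fine-tune the underlying model on domain data before it is served, a step performed on the developer’s or provider’s infrastructure rather than in application code like the above.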

Subsequently, providers can sell products or services based on multi-modal large models to clients (or “deployers”), such as health departments, healthcare systems, hospitals, pharmaceutical companies, or even individuals, such as healthcare service providers. Clients who purchase or obtain licenses to use products or applications can use them directly with patients, with other entities in healthcare systems, with non-professionals, or in their own businesses. The value chain can be “vertically integrated,” allowing companies (or other entities, such as national health systems) that collect data and train general-purpose foundation models to modify multi-modal large models for specific purposes and provide applications directly to users.

Governance is a means to embody ethical principles and human rights obligations through existing laws and policies, newly enacted or revised laws, guidelines, internal codes of conduct, developer programs, and international agreements and frameworks.

One approach to building a governance framework for multi-modal large models is to organize it around three stages of the AI value chain: (i) designing and developing general-purpose foundation models or multi-modal large models; (ii) providing services, applications, or products based on general-purpose foundation models; and (iii) deploying healthcare services or applications. In this guide, each stage is reviewed from three aspects:

1. What risks should be addressed at each stage of the value chain (as described above)? Which actors are best positioned to address these risks?

2. What can relevant actors do to address risks? What ethical principles must be adhered to?

3. What is the role of the government, including relevant laws, policies, and regulations?

Some risks can be addressed at various stages of the AI value chain, and certain actors may play a more significant role in mitigating particular risks and upholding ethical values. Although there may be disagreements and tensions regarding the allocation of responsibilities among developers, providers, and deployers, there are some clear areas in which an actor is either best placed, or is the only entity able, to address a potential or actual risk.

Design And Development Of General-purpose Foundation Models (Multi-modal Large Models)

During the design and development of general-purpose foundation models, the responsibility lies with the developers. Governments have the responsibility to establish laws and standards that require certain practices to be taken or prohibited. Chapter 4 of this guide provides some recommendations to help address risks and maximize benefits during the development of multi-modal large models.

Provision Of General-purpose Foundation Models (Multi-modal Large Models)

In the process of providing services or applications, governments have the responsibility to define requirements and obligations for developers and providers to address specific risks associated with the use of AI-based systems in healthcare settings. Chapter 5 of this guide provides some recommendations to address risks and maximize benefits when using multi-modal large models for healthcare services and applications.

Deployment Of General-purpose Foundation Models (Multi-modal Large Models)

Even when relevant laws, policies, and ethical practices are applied during the development and provision of multi-modal large models, risks may arise during their use, partly due to the unpredictability of multi-modal large models and the responses they provide. Users may apply general-purpose foundation models in ways that developers and providers did not anticipate, and the outputs of multi-modal large models may change over time. Chapter 6 of this guide offers recommendations on risks and challenges that should be addressed during the use of multi-modal large models and applications.

Accountability Of General-purpose Foundation Models (Multi-modal Large Models)

With the widespread use of multi-modal large models in healthcare and pharmaceuticals, errors, misuse, and ultimately harm to individuals are inevitable. Accountability mechanisms can therefore ensure that users harmed by multi-modal large models receive adequate compensation or other forms of redress, alleviate the burden of proof on harmed users, and ensure that they receive full and fair compensation.

Governments can achieve this by introducing a presumption of causation. They may also consider introducing strict liability standards to address harm caused by the deployment of multi-modal large models. While strict liability can ensure compensation for harmed individuals, it may also hinder the use of increasingly complex multi-modal large models. Governments may also consider establishing no-fault compensation funds.

International Governance Of General-purpose Foundation Models (Multi-modal Large Models)

Governments must work together to establish new institutional structures and rules to ensure that international governance keeps pace with the globalization of technology. Governments should also ensure that cooperation and collaboration within the UN system are strengthened to address the opportunities and challenges of broader deployment of AI applications in health and social and economic fields.

To ensure that governments are accountable for their investments in and participation in the development and deployment of AI-based systems, and to ensure that appropriate regulations are enacted to uphold ethical principles, human rights, and international law, international governance is essential. International governance can also ensure that multi-modal large models developed and deployed by companies meet appropriate international safety and efficacy standards and comply with ethical principles and human rights obligations. Governments should also avoid enacting regulations that provide competitive advantages or disadvantages to particular companies or to governments themselves.

To give meaning to international governance, these rules must be jointly formulated by all countries, not just by high-income countries (and the tech companies collaborating with high-income country governments). As the UN Secretary-General proposed in 2019, international governance of AI may require all stakeholders to cooperate through networked multilateralism, enabling the UN family, international financial institutions, regional organizations, and trade groups, together with civil society, cities, businesses, local authorities, and youth, to collaborate more closely, effectively, and inclusively.

1 Introduction

This guide addresses the emerging uses of multi-modal large models in relevant applications in the health sector. It covers the potential benefits and risks of using multi-modal large models in healthcare and pharmaceuticals, as well as the governance methods that best ensure compliance with ethical, human rights, and safety standards and obligations. This guide builds on the WHO’s June 2021 guidelines on “Ethics And Governance Of Artificial Intelligence In Health,” which explore the ethical challenges and risks of AI in health, identify six principles to ensure that AI benefits all countries utilizing it in the health sector, and propose recommendations to strengthen the governance of AI in health so as to maximize the technology’s potential.

Artificial intelligence refers to the ability of algorithms integrated into systems and tools to learn from data to perform automated tasks without explicit programming for each step. Generative artificial intelligence is an AI technology in which machine learning models are trained on datasets to generate new outputs, such as text, images, videos, and music. Generative AI models learn patterns and structures from their training data, allowing them to predict and generate new data based on the learned patterns. Generative AI can be applied in various fields, including design, content generation, simulation, and scientific discovery.

Large language models are a specific type of generative artificial intelligence that receives text inputs and provides responses of the same type, and they have thus attracted significant attention. Large language models are typical examples of large single-modal models and form the basis for early versions of the chatbots that integrate them. While large language models participate in conversations, the models themselves do not know what they are generating. They merely predict the next word based on the preceding words and the patterns or combinations of words learned during training.
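To illustrate that point, here is a minimal toy sketch in Python of next-word prediction. It uses simple bigram counts over an invented three-sentence corpus, an assumption for illustration only; real large language models use neural networks over subword tokens, but the underlying principle, predicting a likely continuation from learned patterns rather than from understanding, is the same.

```python
# Toy next-word predictor: counts which word follows which in a tiny
# invented corpus, then predicts the most frequent continuation.
from collections import Counter, defaultdict

corpus = (
    "the patient reports chest pain . "
    "the patient reports mild fever . "
    "the doctor reports no pain ."
).split()

# Count how often each word follows each preceding word (bigram counts).
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the continuation most often seen after `word` in training."""
    candidates = next_counts.get(word)
    return candidates.most_common(1)[0][0] if candidates else "."

print(predict_next("patient"))  # -> "reports": a learned pattern, not knowledge
```

Scaled up by many orders of magnitude and trained on far richer data, the same predict-the-next-token mechanism is what makes chatbot replies fluent without making them reliably true.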

This guide explores the increasingly widespread uses of multi-modal large models (including large language models), which are trained on highly diverse datasets in healthcare and pharmaceuticals, including not only text but also biosensor, genomic, epigenomic, proteomic, imaging, clinical, social, and environmental data. Thus, multi-modal large models can accept various types of inputs and produce outputs that are not limited to the types of data input. Multi-modal large models can be widely applied in healthcare and drug development.

Multi-modal large models differ from previous AI and machine learning models. While AI has been widely integrated into many consumer applications, the outputs of most algorithms neither require nor invite customer or user participation, apart from the main forms of AI integrated into social media platforms, which hold attention through user-generated content. Another way multi-modal large models differ from other types of AI is their versatility. Previous and existing AI models, including those for medical purposes, were designed for specific tasks and thus lack flexibility: they can only perform the tasks defined by their training sets and labels and cannot adapt or perform other functions without retraining on different datasets. Consequently, although the US Food and Drug Administration has approved over 500 AI models for clinical medicine, most are approved for only one or two narrow tasks. In contrast, multi-modal large models are trained on diverse datasets and can be used for various tasks, including some they were not explicitly trained for.

Multi-modal large models typically have an interface and format that facilitate human-computer interaction, mimicking communication between humans and thereby leading users to attribute human-like qualities to the algorithm. As a result, unlike other forms of AI, the way multi-modal large models are used and the content of the responses they generate appear “human-like,” which is one reason for their unprecedented adoption by the public. Furthermore, because the responses they provide seem authoritative, many users uncritically regard them as correct, even though multi-modal large models cannot guarantee the correctness of their responses and cannot incorporate ethical norms or moral reasoning into the responses they generate. Multi-modal large models have been used in numerous fields, including education, finance, communication, and computer science, and this guide illustrates the different ways multi-modal large models are used (or envisioned to be used) in healthcare and pharmaceuticals.

Multi-modal large models can be seen as the product of a series (or chain) of decisions made by one or more actors regarding programming and product development. Decisions made at each stage of the AI value chain can have direct or indirect impacts on downstream entities involved in the development, deployment, and use of multi-modal large models. These decisions may be influenced and regulated by governments enacting and enforcing laws and policies at national, regional, and global levels.

The AI value chain typically begins with a large tech company developing the model. Developers can also be universities, smaller tech companies, national health systems, public-private partnerships, or other entities with the resources and capabilities to utilize several inputs. These inputs comprise the “AI infrastructure,” such as data, computational power, and AI expertise used to develop general-purpose foundation models. These models can be used directly to perform various tasks, including unanticipated ones, among them tasks related to healthcare. Several general-purpose foundation models have been specifically trained for use in healthcare and pharmaceuticals.

Third parties (“providers”) can use general-purpose foundation models for specific purposes or applications through application programming interfaces (APIs). This includes: (i) fine-tuning new multi-modal large models, which may require additional training on the foundation model; (ii) integrating multi-modal large models into applications or larger software systems to provide services to users; or (iii) integrating add-on components known as “plugins” that steer, filter, and curate a multi-modal large model’s output into a formal or standardized format to generate “digestible” results.

Subsequently, providers can sell products or services based on multi-modal large models to clients (or “deployers”), such as health departments, healthcare systems, hospitals, pharmaceutical companies, or even individuals, such as healthcare service providers. Clients who purchase or obtain licenses to use products or applications can use them directly with patients, with other entities in healthcare systems, with non-professionals, or in their own businesses. The value chain can be “vertically integrated,” allowing companies (or other entities, such as national health systems) that collect data and train general-purpose foundation models to modify multi-modal large models for specific purposes and provide applications directly to users.

The WHO recognizes that AI can bring significant benefits to healthcare systems, including improving public health and helping achieve universal health coverage. However, as noted in the WHO’s guide on “Ethics And Governance Of Artificial Intelligence In Health,” AI also poses significant risks that may harm public health and jeopardize individual dignity, privacy, and human rights. Although multi-modal large models are relatively new, the pace of their acceptance and dissemination prompted the WHO to provide this guide to ensure that they can be used successfully and sustainably worldwide. The WHO recognizes that, at the time of publishing this guide, there are many competing views regarding the potential benefits and risks of AI, the ethical principles applicable to its design and use, and the methods for its governance and regulation. Since this guide was published shortly after the first applications of multi-modal large models in health and before more powerful models are released, the WHO will update this guide to keep pace with the rapid development of the technology, how societies respond to its use, and the impact on health of multi-modal large models used outside of healthcare and pharmaceuticals.

1.1 Importance Of General-purpose Foundation Models (Multi-modal Large Models)

Although multi-modal large models are relatively new and untested, they have had a significant impact on society across various fields, including healthcare and pharmaceuticals. ChatGPT is a large language model that has been released in multiple versions by a US tech company. It is estimated that by January 2023, just two months after its launch, the model had reached 100 million monthly active users, making it the fastest-growing consumer application in history.

Currently, many companies are developing multi-modal large models or integrating them into consumer applications, such as internet search engines. Large tech companies are also rapidly integrating multi-modal large models into most of their applications or developing new software. With the support of millions of dollars in private investment, startups are also racing to develop multi-modal large models; thanks to the availability of open-source platforms, they can develop multi-modal large models faster and more cheaply than the giant companies.

The emergence of multi-modal large models has spurred new investments and continuous product launches in the tech sector. However, some companies admit that they do not fully understand why multi-modal large models generate certain responses. Despite reinforcement learning from human feedback, the content generated by multi-modal large models is still not always predictable or controllable, and during “conversations” they may generate content that makes users uncomfortable or produce incorrect but highly convincing content. Even so, support for multi-modal large models is often driven not merely by enthusiasm for their capabilities but also by unconditional, uncritical claims about their performance in non-peer-reviewed publications.

Multi-modal large models have been rapidly adopted even though the datasets used to train them have not been made public, making it difficult or impossible to know whether these data are biased, whether they were legally obtained and comply with data protection rules and principles, and whether performance on a task or query reflects training on, and learned problem-solving for, the same or similar issues. Other concerns regarding the data used to train multi-modal large models, such as compliance with data protection laws, are discussed below.

Individuals and governments were not prepared for the release of multi-modal large models. Without proper training, individuals may not understand how to use multi-modal large models effectively; even though chatbots based on multi-modal large models may give the impression of being accurate and reliable, their responses are not always accurate or reliable. One study found that while the large language model GPT-3 “can generate more easily understandable accurate information compared to humans,” it can also generate “more persuasive false information,” and humans cannot distinguish between content generated by multi-modal large models and content generated by humans.

Governments are also generally unprepared. Laws and regulations established to govern the use of AI may not address the challenges or opportunities associated with multi-modal large models. The European Union has reached agreement to enact an EU-wide “AI Act” but had to modify its legislative framework in the final stages of drafting to account for multi-modal large models. Other governments are also rapidly formulating new laws or regulations or enacting temporary bans (some of which have since been lifted). In the coming months, various companies are expected to launch increasingly powerful multi-modal large models, which may bring new benefits but also new regulatory challenges. In this dynamic environment, this guide builds on previous guidelines, including ethical guidelines, to offer opinions and recommendations for using multi-modal large models in healthcare and pharmaceuticals.

1.2 WHO Guidelines On Ethics And Governance Of Artificial Intelligence In Health

The WHO’s first version of guidelines on the ethics and governance of artificial intelligence in health reviewed various methods of machine learning and various applications of AI in health but did not specifically review generative artificial intelligence or multi-modal large models. At the time of developing and publishing that guide in 2021, there was no evidence suggesting that generative artificial intelligence and multi-modal large models would soon be widely applied in clinical care, medical research, and public health.

However, the fundamental ethical challenges, core ethical principles, and recommendations proposed in that guide remain relevant for assessing and effectively and safely using multi-modal large models, despite the governance gaps and challenges that have emerged and persist with this new technology. Those challenges, principles, and recommendations also form the basis of the approach taken by the expert group in this guide to multi-modal large models.


[To Be Continued, Stay Tuned]


Editor: Shang Dan

Reviewer: Ji Pengcheng
