Revolutionizing gastrointestinal endoscopy: the emerging role of large language models
As healthcare continues to evolve with technological advancements, the integration of artificial intelligence (AI) into clinical practice has shown promising potential for enhancing patient care and operational efficiency.1-3 Large language models (LLMs), a subset of AI technologies, are at the forefront of this revolution, offering capabilities that extend far beyond simple data processing. These models can understand, generate, and interact with human language on an unprecedented scale, opening new avenues for enhancing clinical practice across various specialties, including gastroenterology.
Gastrointestinal (GI) endoscopy, a cornerstone procedure for the diagnosis and treatment of digestive tract disorders, is a field accustomed to integrating advanced technologies. Endoscopic procedures rely on the expertise of specialists in interpreting complex visual data and performing precise interventions. This reliance presents a unique set of challenges, such as variability in diagnostic accuracy and a labor-intensive reporting and documentation process. LLMs, with their vast data processing capabilities, promise to address these challenges by enhancing diagnostic accuracy, automating report generation, enabling clinical reasoning, and improving educational tools.4 This editorial explores the emerging role of LLMs in GI endoscopy and outlines future directions for the symbiotic relationship between AI technology and GI endoscopy.
POTENTIAL ROLE OF LLMS IN GI ENDOSCOPY
The advent of LLMs heralds a new era in GI endoscopy marked by improved diagnostic accuracy, streamlined documentation, and enhanced educational and patient engagement strategies. By analyzing endoscopic images with unparalleled precision and automatically generating reports, LLMs introduce a level of analysis and efficiency that was previously unattainable, which could reduce diagnostic errors and administrative burdens. Their ability to quickly assess, interpret, and synthesize large volumes of medical data can transform the diagnostic process, providing support for complex medical queries, including rare or obscure conditions, and keeping medical professionals abreast of the latest research. LLMs can also be used for clinical reasoning.
In addition to diagnostics and report generation, LLMs show significant potential in education. They can create interactive training materials, personalize patient education, and, as an unexpected benefit, provide emotional support.5,6 These capabilities can improve patient experience, enhance satisfaction, and foster a deeper understanding of medical conditions and treatments. LLMs, language vision models, and foundation models with multimodal functions are expected to be the next-generation mainstream AI technology in clinical practice.
BENEFITS AND LIMITATIONS
The integration of LLMs into GI endoscopy promises numerous benefits, including enhanced diagnostic accuracy, efficiency in clinical operations, and improved patient engagement (Fig. 1).4 The ability of LLMs to serve as a dynamic source of knowledge for both medical staff and patients facilitates better communication, supports research, and contributes to quality improvement in medical practice (Figs. 2–4).7 For instance, using LLMs to analyze electronic medical records to identify patients for specific interventions, or to derive quality metrics such as adenoma detection rates from pathology reports, exemplifies their potential to revolutionize clinical practice.8
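The adenoma detection rate mentioned above is the proportion of screening colonoscopies in which at least one adenoma is found. The following is a minimal sketch of how it could be derived once each pathology report has been classified. The names `Colonoscopy`, `mentions_adenoma`, and `adenoma_detection_rate` are hypothetical, and the keyword match is only a naive stand-in for the LLM-based report classifier that the text envisions; it is not a method described in the cited work.

```python
from dataclasses import dataclass


@dataclass
class Colonoscopy:
    procedure_id: str
    pathology_report: str  # free text; empty string if no specimen was sent


def mentions_adenoma(report: str) -> bool:
    """Hypothetical stand-in for an LLM classifier.

    Assumption: a plain keyword match. It ignores negation
    ("no adenoma seen") and would need replacing in real use.
    """
    return "adenoma" in report.lower()


def adenoma_detection_rate(procedures: list[Colonoscopy]) -> float:
    """Fraction of procedures with at least one adenoma reported."""
    if not procedures:
        raise ValueError("no procedures supplied")
    detected = sum(mentions_adenoma(p.pathology_report) for p in procedures)
    return detected / len(procedures)
```

Keeping the classifier behind a single function means the keyword match can later be swapped for a model call without changing how the rate itself is computed.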
However, the implementation of LLMs is challenging. Data privacy, biases in training data, necessity for interdisciplinary collaboration, and the need for human oversight remain significant hurdles. The technical and cultural barriers of integrating these technologies into clinical practice should be addressed along with the ethical considerations of AI use in healthcare. Recent studies have highlighted the propensity of LLMs to amplify societal biases and overrepresent stereotypes, raising concerns about the equitable application of AI technologies.9 Rather than relying solely on the recommendations produced by the model, it is essential that the model be connected to an independent, verifiable source of bias-free knowledge via retrieval-augmented generation.9,10
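To illustrate the retrieval-augmented generation idea in its simplest form, the sketch below selects the passages from a trusted collection that best match a question and places them in the prompt, so that the model is asked to answer from verifiable text and not from memory alone. Everything here is an assumption made for illustration: the function names are hypothetical, the passages are placeholders, and word overlap stands in for the embedding-based retrieval that practical systems typically use. No model is called.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, passages: list[str], k: int = 1) -> list[str]:
    """Return the k passages sharing the most words with the question."""
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]


def build_grounded_prompt(question: str, passages: list[str], k: int = 1) -> str:
    """Assemble a prompt that confines the answer to retrieved sources."""
    sources = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )
```

The quality of the answer then depends on the quality of the curated source collection, which is where the human oversight discussed above enters.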
Another major consideration is the likelihood of hallucinations. An LLM is a general-purpose model. Prompt engineering, in which explicit instructions designed to draw out the best performance of the LLM are included in the input alongside the question, can significantly improve performance on specific tasks. Hallucinations may become common, however, if users simply ask questions without providing specific instructions. Fine-tuning or retrieval-augmented generation may improve the goal directedness of LLMs.4
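The difference between a bare question and an engineered prompt can be sketched as follows. The instruction wording, the name `build_prompt`, and the default rules are illustrative assumptions and not a validated clinical prompt; the point is only that explicit instructions accompany the question in the model input.

```python
# Hypothetical default instructions; real deployments would tailor and test these.
DEFAULT_INSTRUCTIONS = (
    "You are assisting a gastroenterologist.",
    "Answer only from the information given in the question.",
    "If the information is insufficient, reply 'insufficient information'.",
    "Answer in at most three sentences.",
)


def build_prompt(
    question: str,
    instructions: tuple[str, ...] = DEFAULT_INSTRUCTIONS,
) -> str:
    """Prepend numbered, explicit instructions to the user's question."""
    numbered = "\n".join(
        f"{i}. {rule}" for i, rule in enumerate(instructions, start=1)
    )
    return f"Instructions:\n{numbered}\n\nQuestion: {question.strip()}"
```

Passing the output of `build_prompt` to a model in place of the raw question gives the model a defined role, a permitted source of information, and an explicit way to decline, which are the kinds of constraints that help limit hallucinated answers.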
FUTURE DIRECTIONS
The future of LLMs in GI endoscopy lies at the intersection of AI technology, interdisciplinary collaboration, and ethical governance. As the field advances, it will be crucial to focus on patient-centric innovations and to leverage LLMs to address global health disparities. The development of transformer-based language vision models, and their potential to take over from current convolutional neural network (CNN)-based approaches in GI endoscopy, illustrates the ongoing evolution of AI technologies. Medical practice is a multimodal task that includes history taking, visual diagnosis, data interpretation, and clinical reasoning. Notably, LLM-based foundation models with multimodal functions are expected to become the next-generation mainstream AI models in clinical practice.4
LLMs are one form of generative model; however, generative models other than text-generation models (generative adversarial networks, diffusion models, variational autoencoders, language vision models, etc.) are also evolving, and the creative features of these models are already being integrated into LLMs. The emergence of foundation models with multimodal functions underscores the shift towards more integrated and comprehensive AI tools in clinical practice, promising a future in which the capabilities of AI models are fully harnessed to enhance diagnostic and therapeutic outcomes in GI endoscopy. We are currently developing new LLMs with significantly more parameters and further optimizations. Although they have the potential to enhance our practice, it is essential to handle them carefully to avoid potential harm.4
CONCLUSION
The integration of LLMs into GI endoscopy represents a frontier of healthcare innovation with the potential to significantly enhance diagnostic accuracy, operational efficiency, and patient care. However, realizing this potential is contingent on overcoming the challenges of data privacy, ensuring the quality of the data used for AI training, and fostering interdisciplinary collaboration.
Notes
Conflicts of Interest
Chang Seok Bang is currently serving as a KSGE Publication Committee member; however, he was not involved in peer reviewer selection, evaluation, or the decision process in this study. The other author has no potential conflicts of interest.
Funding
None.
Author Contributions
Conceptualization: CSB; Investigation: EJG; Resources: EJG; Writing–original draft: all authors; Writing–review & editing: all authors.