Volume 116, Issue 2 pp. 297-307
REVIEW ARTICLE
Open Access

Current status and future direction of cancer research using artificial intelligence for clinical application

Ryuji Hamamoto

Corresponding Author

Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan

Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

Correspondence

Ryuji Hamamoto, Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan.

Email: rhamamot@ncc.go.jp

Contribution: Conceptualization, Funding acquisition, Investigation, Project administration, Visualization, Writing - original draft, Writing - review & editing

Masaaki Komatsu

Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan

Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Masayoshi Yamada

Department of Endoscopy, National Cancer Center Hospital, Tokyo, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Kazuma Kobayashi

Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan

Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Masamichi Takahashi

Department of Neurosurgery and Neuro-Oncology, National Cancer Center Hospital, Tokyo, Japan

Department of Neurosurgery, School of Medicine, Tokai University, Isehara, Kanagawa, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Mototaka Miyake

Department of Diagnostic Radiology, National Cancer Center Hospital, Tokyo, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Shunichi Jinnai

Department of Dermatologic Oncology, National Cancer Center Hospital East, Kashiwa, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Takafumi Koyama

Department of Experimental Therapeutics, National Cancer Center Hospital, Tokyo, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Nobuji Kouno

Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan

Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

Department of Surgery, Graduate School of Medicine, Kyoto University, Kyoto, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Hidenori Machino

Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan

Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Satoshi Takahashi

Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan

Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Ken Asada

Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan

Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

Naonori Ueda

Disaster Resilience Science Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

Contribution: Supervision, Writing - review & editing

Syuzo Kaneko

Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan

Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

Contribution: Conceptualization, Investigation, Writing - review & editing

First published: 18 November 2024

Abstract

The expectations for artificial intelligence (AI) technology have increased considerably in recent years, mainly due to the emergence of deep learning. At present, AI technology is being used for various purposes and has brought about change in society. In particular, the rapid development of generative AI technology, exemplified by ChatGPT, has amplified the societal impact of AI. The medical field is no exception, with a wide range of AI technologies being introduced for basic and applied research. Further, AI-equipped software as a medical device (AI-SaMD) is also being approved by regulatory bodies. Combined with the advent of big data, data-driven research utilizing AI is actively pursued. Nevertheless, while AI technology has great potential, it also presents many challenges that require careful consideration. In this review, we introduce the current status of AI-based cancer research, especially from the perspective of clinical application, and discuss the associated challenges and future directions, with the aim of helping to promote cancer research that utilizes effective AI technology.

Abbreviations

  • AI: artificial intelligence
  • CAD: computer-aided diagnosis
  • CNN: convolutional neural network
  • EAU: European Association of Urology
  • FDA: Food and Drug Administration
  • GAN: generative adversarial network
  • GPT: generative pre-trained transformer
  • GPU: graphics processing unit
  • HN: head and neck
  • IDATEN: improvement design within approval for timely evaluation and notice
  • ILSVRC: ImageNet large-scale visual recognition challenge
  • LLM: large language model
  • LMM: large multimodal model
  • MTB: Molecular Tumor Board
  • NLP: natural language processing
  • POC: proof of concept
  • SaMD: software as a medical device
  • SPECT: single-photon emission computed tomography
  • SVM: support vector machine
  • USMLE: US Medical Licensing Examination
  • XAI: explainable AI

    1 INTRODUCTION

    The expectations for AI are rising due to rapid progress in machine learning brought about by the advent of deep learning, the development of high-performance GPUs and other computing infrastructure, and the ability to utilize big data through the development and expansion of public databases.1-3 Currently, we are in what is referred to as the third AI boom,4 defined by the broad social implementation of AI technology. AI technology is widely used in today's society, from the security field (e.g., facial recognition at airports) to automatic translation, autonomous driving, and general consumer electronics.5-7 The medical field is no exception, with more than 800 AI-based medical devices approved by the US FDA.8 In Japan, a number of AI-equipped medical devices, including those developed by our group, have obtained regulatory approval and are already in clinical use. Since the transformer architecture was introduced in 2017,9 generative AI research, as represented by ChatGPT, has undergone rapid development, with ever-increasing expectations.10, 11 Some experts have proposed that we are already in the fourth AI boom.12 Following LLMs, LMM research has gained momentum, with Med-PaLM as an example from the medical field. Studies on generative AI-driven medical research continue to be published.13, 14

    In cancer research, AI is actively explored for medical image analysis, such as endoscopic, radiological, and pathological image-based diagnostics, with advances already applied clinically.15-21 In addition to medical image analysis, AI is also being used in various cancer research applications, including omics analysis, medical information analysis, analysis of electronic medical records using NLP, and the analysis of pathology and radiology reports.22-27 Further, in 2003, the human whole-genome sequencing project (the Human Genome Project), carried out by an international consortium, was completed, and the world entered what is now known as the postgenomic era. The applications of genomic information in medicine have expanded, and the concept of personalized medicine emerged.28 In particular, the announcement of the Precision Medicine Initiative by US President Barack Obama in January 2015 had a major impact on global healthcare policy.29 This is especially true in the field of cancer research, where the concept of precision oncology is actively pursued. In this context, AI is increasingly used to optimize cancer treatment selection based on genomic and medical information.30, 31

    While the expectations for AI in cancer research become ever greater, challenges remain. Examples include overfitting, wherein high accuracy is observed in training data but declines in test data; the black box problem, where AI analysis is so complex that humans (medical professionals) cannot understand it; and the domain shift problem, which arises due to differences in medical equipment and protocols between institutions. In order to advance AI-based cancer research and achieve optimal results, these issues should be considered.

    In this review, the history of medical AI is first described, followed by an introduction to the current status of AI applications in cancer research, with a discussion on issues and future directions in AI-based cancer research.

    2 BRIEF HISTORY OF MEDICAL AI

    The concept of AI itself can be traced back to an anecdote in ancient mythology in which a divine master craftsman bestowed intelligence and consciousness onto an artifact.32 The coining of the term “artificial intelligence” by John McCarthy and colleagues at the Dartmouth Conference of 1956 is considered the birth of modern AI.33

    A key event in the foundation of medical AI is considered to be the initiation of research into CAD. These early CAD systems used flowcharts, statistical pattern matching, probability theory, and knowledge bases to drive the decision-making process.34 In the 1970s, expert systems, an AI technology, attracted significant attention worldwide, and the MYCIN expert system, a very early CAD system recognized as the world's first successful medical AI system, was developed (Figure 1).35-37 MYCIN, a backward-chaining expert system, identified bacteria that cause serious infections such as bacteremia and meningitis and recommended antibiotics with dosage adjusted to the patient's weight. It used a simple inference engine and worked with a knowledge base consisting of about 500 rules. Physicians were asked a series of simple yes/no or text questions and were then provided with a list of candidate causative organisms ranked from high to low probability, together with the level of confidence in each diagnosis, the reasoning behind each diagnosis, and the recommended course of drug treatment. A study conducted at Stanford University School of Medicine showed a 65% correct response rate for bacterial infection diagnosis by MYCIN, which did not match that of specialist physicians but was higher than that (42.5%–62.5%; average 55%) of five non-specialist faculty clinicians.37 However, MYCIN was never widely used in clinical practice, as there were concerns about liability in the event of misdiagnosis, and professionals were reluctant to accept the results produced via CAD. Furthermore, expert systems were often unable to formalize expert knowledge, especially when exceptions arose among the rules.38 These shortcomings of expert systems became a subject of discussion, which brought about a downturn in AI research (the second AI winter period). Although MYCIN was never adopted in clinical practice, it is known worldwide as a pioneering medical AI system. It did not reach the level of specialist physicians, but it showed higher accuracy than general physicians, and its results have influenced the development of diagnostic and decision support tools using current AI technology, including those used in the field of oncology.
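
The backward-chaining style of inference described above can be illustrated with a minimal sketch. The rules, fact names, and the `prove` function below are hypothetical and written only for illustration; they are not MYCIN's actual rule base, and the sketch omits the certainty factors and question-asking dialogue of the real system.

```python
# Toy rule base: each rule is (set of premises that must all hold, conclusion).
# These rules are hypothetical examples, NOT taken from MYCIN.
RULES = [
    ({"gram_negative", "rod_shaped", "anaerobic"}, "organism_is_bacteroides"),
    ({"organism_is_bacteroides"}, "suggest_therapy_A"),
]


def prove(goal, facts, rules, trace, _active=frozenset()):
    """Backward chaining: start from the goal and work back to known facts.

    Every rule that fires is appended to `trace`, so the chain of reasoning
    can be shown to the user afterwards.
    """
    if goal in facts:
        return True
    if goal in _active:  # guard against circular rule chains
        return False
    for premises, conclusion in rules:
        if conclusion != goal:
            continue
        # The goal holds if every premise can itself be proven.
        if all(prove(p, facts, rules, trace, _active | {goal})
               for p in sorted(premises)):
            trace.append(f"{' AND '.join(sorted(premises))} => {conclusion}")
            return True
    return False


facts = {"gram_negative", "rod_shaped", "anaerobic"}
trace = []
result = prove("suggest_therapy_A", facts, RULES, trace)
for step in trace:
    print(step)
```

The recorded `trace` plays the role of the explanation record: it lists which rules led to the conclusion, which is the kind of justification a physician could request from such a system.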

    FIGURE 1. Schematic of MYCIN configuration for the human examination process (modified from Ref. 36). The basis for all decisions is domain-specific knowledge (“static knowledge”) obtained from experts. A group of computer programs (“rule interpreter”) uses this knowledge and data about a particular patient to draw conclusions and, in turn, generate therapeutic advice. At the same time, it records what happened, and this record can be used by the explanation routine if the physician requests justification or explanation of the conclusions reached by the program. The illustrations in this figure are from royalty-free Adobe Stock (https://stock.adobe.com/jp/).

    Since then, CAD has continued to be actively studied, with the first commercial CAD system for mammography, the ImageChecker M1000 system, approved by the US FDA in 1998.39 Subsequently, CAD for detecting lung cancer (nodular shadows) in plain chest radiographs and chest CT images and CAD for detecting polyps in CT colonography were successively approved by the US FDA.40 Insurance coverage for the use of mammography CAD became available in the United States in April 2001 (Medicare coverage), a major factor in spurring the spread of CAD.41

    In 2006, the auto-encoder, a deep learning technology, was developed. This advent of deep learning was the main catalyst for the emergence of AI technology, which ushered in what is generally referred to as the third AI boom. In 2015, a group at Microsoft used deep learning techniques to achieve an image recognition error rate in ILSVRC that was lower than the average human error rate.42 In 2017, AlphaGo, an AI-based Go program, defeated the world's top Go player.43 Such successive reports of AI surpassing human capabilities invigorated AI research and development in various fields in society, including the medical field, where it has been centered on medical image analysis. Importantly, results are not only at the basic research level: AI-powered software as a medical device (AI-SaMD) is also being applied clinically after receiving medical device approval. In response to this situation, an independent organization has published a register of AI-based software (CE certified in the EU) that can be used in clinical radiology practice, with the aim of increasing the transparency of AI in radiology (Health AI Register: https://radiology.healthairegister.com).44 The publicly available information is summarized in Figure 2 based on the type of medical imaging exam and the body part. Most approvals by type of medical imaging exam are for X-ray, CT, and MRI image analysis, and many of the approved AI-SaMDs target the neuro region and chest.

    FIGURE 2. Status of AI-based diagnostic imaging program medical devices that have received the CE Mark in the EU (as of August 2024). Data were extracted from the Health AI Register (https://radiology.healthairegister.com), in which one program may overlap with both the type of examination and the body part (for example, being available for both CT and MRI). *Including mammography.

    3 AI TECHNOLOGIES USED IN CANCER RESEARCH

    A major factor that has led to a significant focus on AI in recent years has been the emergence of deep learning, and since deep learning is also a machine learning method, it can be concluded that machine learning is used as the underlying technology for current AI. In Figure 3, we present the properties of machine learning and examples of its use in cancer research.

    FIGURE 3. The use of machine learning in the field of cancer research. The main analysis methods are not limited to one application (e.g., SVM is used for regression tasks in addition to classification tasks).

    Machine learning can be divided into four types: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Supervised learning corresponds to regression/classification tasks, unsupervised learning to clustering and dimensionality reduction tasks, semi-supervised learning to tasks with generative or discriminative models, and reinforcement learning to optimization tasks. It should be emphasized that current AI is machine learning based and thus limited to the above tasks. There is a tendency to place excessive expectations on AI due to mass media and other factors; however, there are limits to the tasks that AI can actually handle, and it is thus necessary to understand its characteristics.

    The main methods used and examples in cancer research are shown in Figure 3. Note that these methods are not limited to a single task (e.g., SVM is used for regression tasks in addition to classification tasks). Extensive work has been done in cancer research, particularly in supervised learning classification tasks, yielding results that have been applied clinically. This is attributed to the proficiency of deep learning at image analysis, which has led many researchers to work on classification tasks (benign or malignant, etc.), and to the importance of supervised learning based on annotation by medical specialists when applying for approval of AI-equipped medical devices.45-49

    More recently, generative models have received considerable attention in cancer research. For example, Das et al. proposed the NAS-SGAN model for automated breast cancer classification using quantitative analysis of histopathological images to improve classification model performance with limited annotation data.50 The model is a semi-supervised learning framework combined with deep neural network-based generative adversarial learning. Even with a limited number of labeled samples, it could better discriminate between tumor grades, thereby improving system robustness and accuracy. Pang et al. developed a semi-supervised GAN-based radiomics model for augmenting breast ultrasound images.51 The model produced high-quality breast ultrasound images, validated by two experienced radiologists, with semi-supervised learning improving the quality of generated synthetic data compared with the baseline method. More accurate breast mass classification was thus achieved compared with other state-of-the-art methods.

    4 CURRENT STATUS OF AI-BASED CANCER RESEARCH

    In this section, we introduce the current state of cancer research using AI, focusing mainly on the situation in Japan. Table 1 shows a list of AI-SaMDs related to oncology that have been approved for use in Japan; they are all products related to medical image analysis. The device classes here follow the international classification system adopted under the Pharmaceutical Affairs Law in Japan, with Class II defined as “those that are considered to have a relatively low risk to the human body even if a problem occurs” and Class III defined as “those that are considered to have a relatively high risk to the human body if a problem occurs.” One of the characteristics of AI-SaMDs approved in Japan is that many of them are AI systems that support endoscopic diagnosis. Of the 18 types of AI-SaMD approved for tumor-related applications, nine (50%) are for the purpose of supporting endoscopic diagnosis. At the research level, in addition to image analysis, there is also active research on omics analysis, NLP, and analysis using LLMs. Here, we focus on three areas: 1. medical image analysis, 2. omics analysis, and 3. NLP and LLMs.

    TABLE 1. AI-SaMD approved as a medical device in the field of oncology in Japan (as of May 2024).

    | Research area | Approval number | Product | Manufacturer | Target inspection method | Class | Year of approval |
    | --- | --- | --- | --- | --- | --- | --- |
    | Gastrointestinal endoscope | 23000BZX00372000 | Endoscopy Diagnostic Imaging Support Software EndoBRAIN | Cybernet Systems, Inc. | Colonoscopy | III | 2018 |
    | Gastrointestinal endoscope | 30200BZX00136000 | Endoscopy Diagnostic Imaging Support Software EndoBRAIN-UC | Cybernet Systems, Inc. | Colonoscopy | III | 2020 |
    | Gastrointestinal endoscope | 30200BZX00208000 | Endoscopic Imaging Support Program EndoBRAIN-EYE | Cybernet Systems, Inc. | Colonoscopy | II | 2020 |
    | Gastrointestinal endoscope | 30200BZX00288000 | Endoscopy Support Program EW10-EC02 | Fujifilm Corporation | Colonoscopy | III | 2020 |
    | Gastrointestinal endoscope | 30200BZX00235000 | Endoscopy Diagnostic Imaging Support Software EndoBRAIN-Plus | Cybernet Systems, Inc. | Colonoscopy | III | 2020 |
    | Gastrointestinal endoscope | 30200BZX00382000 | WISE VISION Endoscope Image Analysis AI | NEC Corporation | Colonoscopy | II | 2020 |
    | Gastrointestinal endoscope | 30400BZX00217000 | Endoscopy Support Program EW10-EG01 | Fujifilm Corporation | Upper gastrointestinal endoscopy | II | 2022 |
    | Gastrointestinal endoscope | 30400BZX00259000 | Medical Image Analysis Software EIRL Colon Polyp | LPIXEL | Colonoscopy | II | 2022 |
    | Gastrointestinal endoscope | 30500BZX00297000 | Endoscopic Imaging Support Software gastroAI-model G | AI Medical Service Co. | Upper gastrointestinal endoscopy | II | 2023 |
    | Radiation medicine | 30100BZX00263000 | Similar Image Case Retrieval Software Model FS-CM687 | Fujifilm Corporation | Diagnostic imaging (lung nodules, liver masses, diffuse lung disease) support | II | 2019 |
    | Radiation medicine | 30200BZX00150000 | Pulmonary nodule detection program FS-AI688 | Fujifilm Corporation | CT scan of the chest | II | 2020 |
    | Radiation medicine | 30200BZX00202000 | AI-Rad Companion | Siemens Healthcare | CT scan of the chest | II | 2020 |
    | Radiation medicine | 30200BZX00269000 | Medical Image Analysis Software EIRL X-ray Lung nodule | LPIXEL | Chest X-ray | II | 2020 |
    | Radiation medicine | 30300BZX00188000 | Chest X-ray image lesion detection (CAD) program LU-AI689 | Fujifilm Corporation | Chest X-ray | II | 2021 |
    | Radiation medicine | 30300BZX00271000 | Diagnostic Imaging Support Software KDSS-CXR-AI-101 | Konica Minolta, Inc. | Chest X-ray | II | 2021 |
    | Radiation medicine | 30500BZX00161000 | Radiation Therapy Planning Support Program Ai-Seg | Excel Creates Inc. | CT scan | III | 2023 |
    | Medical ultrasound | 30200BZX00379000 | Breast Cancer Diagnosis Support Program RN-Decarte | CES Descartes Co., Ltd. | Breast ultrasound | II | 2023 |
    | Medical ultrasound | 30600BZX00086000 | Breast Cancer Ultrasound AI Diagnosis Support Software Smart Opinion METIS Eye | Smart Opinion Inc. | Breast ultrasound | II | 2024 |

    4.1 Medical image analysis

    In Japan, research on AI-based endoscopic image analysis is active, and its results are at the forefront worldwide. Of the 18 AI-SaMDs related to oncology that have been approved for use in Japan, nine are related to endoscopes (Table 1).52-54 This can be attributed to the fact that Japanese vendors (Olympus, Fujifilm, and HOYA) dominate the endoscope market, and that the Japanese gastrointestinal endoscope field is a leading discipline worldwide.55 The target organ is often the colon, and work has been done on the development of real-time colonoscopy diagnosis support AI; in 2017, the National Cancer Center Japan announced world-leading results for lesion detection.56 After filing regulatory applications and presenting a paper,17 the product received regulatory approval in Japan in 2020 and a CE mark in Europe, and is now being used clinically in Japan and Europe (where it is marketed by NEC).57 The system was then applied to tumor detection in Barrett's esophagus,58 which received the CE mark in 2021 and was clinically applied in Europe. As for colonoscopy diagnosis support, the system has been developed into an AI system that automatically and robustly predicts pathological diagnoses based on the revised Vienna Classification using standard colonoscopy images. The results showed that the developed AI system enables unskilled endoscopists to make the same differential diagnosis of colorectal neoplasms as skilled endoscopists.59 For organs other than the colon and esophagus, endoscopic diagnostic support AI for the stomach has also been developed, and was approved as a medical device in 2023 for clinical application in Japan (gastroAI-model G, Table 1).60

    AI is also actively utilized for radiological image analysis, with several AI-SaMDs approved as medical devices and clinically applied in Japan, mainly for chest X-ray and chest CT image diagnosis support (Table 1). Research has been published on various types of radiological image analysis, including X-ray, CT, MRI, and PET, for multiple cancer types, including lung, breast, brain, and others.61-63 We previously developed an AI technology to precisely extract suspected glioma regions from MR images using the Synapse Creative Space platform,64 which was released in 2022, in collaboration with Fujifilm Corporation. Based on the POC for this AI technology, Fujifilm obtained certification under the Pharmaceutical Affairs Law for its “Region of Interest Segmentation Software for Head” and applied it clinically.65 By precisely extracting suspected glioma regions from MR images and measuring the volume of the extracted regions, it is possible to perform pre-treatment image evaluation of gliomas with higher accuracy. This system is expected to be useful for early detection, improving diagnostic accuracy, and optimizing treatment plans such as radiotherapy and surgery.
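
The volume measurement mentioned above reduces to simple arithmetic once a segmentation mask is available: count the voxels labeled as lesion and multiply by the physical volume of one voxel. The following is a minimal sketch under that assumption; the function name `lesion_volume_ml`, the toy mask, and the voxel spacing are hypothetical and do not describe the certified software.

```python
def lesion_volume_ml(mask, voxel_spacing_mm):
    """Estimate lesion volume from a binary segmentation mask.

    mask: nested lists indexed as [slice][row][col], with 1 = lesion voxel.
    voxel_spacing_mm: (slice thickness, row spacing, column spacing) in mm.
    Returns the volume in millilitres (1 mL = 1000 mm^3).
    """
    dz, dy, dx = voxel_spacing_mm
    n_voxels = sum(v for sl in mask for row in sl for v in row)
    volume_mm3 = n_voxels * dz * dy * dx
    return volume_mm3 / 1000.0


# Hypothetical 2-slice, 3x3 mask with 7 lesion voxels in total.
mask = [
    [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
    [[0, 1, 1], [0, 1, 0], [0, 0, 0]],
]
# Assumed spacing: 5 mm slices, 1 mm x 1 mm in-plane pixels.
volume = lesion_volume_ml(mask, (5.0, 1.0, 1.0))
print(volume)
```

Because the result scales directly with the voxel count, the accuracy of the volume estimate depends on the accuracy of the segmentation itself.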

    Breast ultrasound diagnostic support AI is also being actively explored for breast cancer, with two systems approved by the regulatory authorities in Japan and now in clinical use (Table 1). Ultrasonography is performed in various medical fields owing to its simplicity, non-invasiveness, and real-time performance. However, as it involves manual probe scanning to acquire images, there are large differences between operators, and ultrasound images are easily affected by acoustic shadows, which leads to deterioration in image quality and diagnostic accuracy. Due to these challenges unique to ultrasound examinations, AI technology is expected to provide support.66-68 As for “shadows” among the issues in ultrasound images, we have developed a new method to automatically detect shadows by learning from unlabeled data using deep learning, which enables us to detect shadows with higher accuracy than conventional methods.69 When analyzing using AI, it is important to be aware of the black box problem, in which the basis for the AI's judgment cannot be sufficiently explained. To address this problem, we developed a “barcode display of site detection results” function70 and a “graph chart diagram” that uses an auto-encoder to analyze the variance representation obtained from the barcode to add explanatory power to the results of ultrasound image analysis using AI.71 Furthermore, in collaboration with Fujitsu Japan, we applied these AI technologies for ultrasound support to fetal echocardiographic screening and obtained regulatory approval in Japan for clinical use in July 2024 as a world-leading achievement (Approval no. 30600BZX00155000).

    AI is also actively explored for pathological and skin image analysis in the context of oncology.15, 72-76 We developed a skin tumor prediction system by extracting brownish to dark skin lesions from 120,000 skin photographs and utilizing deep learning techniques.21 A discrimination test using six types of images (malignant melanoma, basal cell carcinoma, nevus [mole], seborrheic keratosis, hematoma/hemangioma, and senile pigmentation) showed that the accuracy of AI (86.2%) was higher than that of non-specialists (74.8%) and specialists (79.3%).

    4.2 Omics analysis

    Precision oncology, which promotes optimal cancer treatment based on genomic information, is actively pursued, with research conducted to introduce AI into the MTB, which is important for genome analysis and subsequent proposal of optimal treatment.31 AI is also being used in epigenomic, transcriptomic, proteomic, and metabolomic analyses, as well as in multi-omics analysis research.15, 77, 78 Deep learning has developed around CNNs, and, in the case of images, as there is a relationship between neighboring pixels, the positional invariance and compositional properties of convolutional computation can be fully exploited. However, when we consider genetic analysis, it is generally reported that there are about 20,000 protein-coding genes in humans, whose functions are not necessarily related even if located close to each other on the genome. Biological phenomena are complex, and information about protein–protein interactions, upstream or downstream signal transduction, etc. is more important than genomic proximity. Without taking these characteristics into account, deep learning would not produce good results in omics analysis. Furthermore, the number of variables (p) in omics analysis (e.g., about 20,000 for protein-coding genes) is very large in relation to the number of samples (n), which is called the “small n, large p problem.”79 This is known to be difficult to solve with conventional estimation methods, and caution should be exercised as the sample size is inevitably limited in the case of medical data.
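
One way to see why conventional estimation struggles when p exceeds n is the ordinary least-squares case. The derivation below is a standard textbook illustration under an assumed linear model, not a method taken from the studies cited here; the symbols X, y, β, ε, and λ are introduced only for this sketch.

```latex
\begin{align}
  y &= X\beta + \varepsilon, \qquad X \in \mathbb{R}^{n \times p}, \quad n < p \\
  \hat{\beta}_{\mathrm{OLS}} &= (X^{\top}X)^{-1}X^{\top}y
      \quad \text{(requires } X^{\top}X \in \mathbb{R}^{p \times p} \text{ to be invertible)} \\
  \operatorname{rank}(X^{\top}X) &= \operatorname{rank}(X) \le n < p
      \;\Longrightarrow\; X^{\top}X \text{ is singular} \\
  \hat{\beta}_{\lambda} &= (X^{\top}X + \lambda I_p)^{-1}X^{\top}y, \qquad \lambda > 0
\end{align}
```

Since the p × p matrix in the second line cannot be inverted when n < p, the unregularized estimate is not unique, and many coefficient vectors generally fit the training data equally well. The last line shows ridge regularization as one common remedy; dimensionality reduction before modeling is another.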

    We previously constructed an analysis system using modified Diet Networks, which inverts the relationship between n and p by transposing the input information and reduces the number of parameters in advance by encoding them with a separate network. We then successfully predicted the pathological classification of lung cancer (adenocarcinoma or squamous cell carcinoma) with high accuracy from genomic mutation information.23 In terms of multi-omics analysis, we have also built a platform for lung cancer prognosis prediction using the auto-encoder, a form of deep learning.22, 24 Six types of omics data (somatic mutation data, copy number variation data, mRNA data, microRNA data, DNA methylation data, and reverse-phase protein array data) and clinical data were used as the input, and each was auto-encoded into a 100-dimensional bottleneck feature space. Univariate Cox regression analysis was then performed on the reduced data set to select features significantly associated with patient survival. The feature selection criteria were (1) log-rank test p < 0.01 or (2) 0.01 < log-rank test p < 0.05 and the top three p-values in each category. In total, 29 features were selected, including three somatic mutations, five copy number variations, 12 mRNAs, three microRNAs, three DNA methylation modifications, and three reverse-phase protein array features. These data were combined into a single matrix called the omics matrix, for which the appropriate number of clusters was determined to be two using the Calinski–Harabasz criterion and the Silhouette index. Survival analysis using the labels estimated from k-means clustering showed a significant difference in survival between clusters (log-rank test: p = 0.003), so the clustered subtype was referred to as the integrated survival subtype. The number of patients belonging to the integrated survival subtype 1 (long survival) was 270, and the number of patients belonging to the integrated survival subtype 0 (short survival) was 213. There was no significant relationship between tumor histopathologic subtype and integrated survival subtype, indicating that our model can predict patient survival independent of tumor subtype, including both lung adenocarcinoma and lung squamous cell carcinoma.24 Multimodal analysis of the integrated mRNA and microRNA dataset using autoencoders identified prognostic factors in lung adenocarcinoma, suggesting that they may be novel potential drug targets in lung adenocarcinoma.22
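
The two-part feature selection criterion described above can be sketched as follows. This is one possible reading of the rule, stated as an assumption: features with p < 0.01 are always kept, and among features with 0.01 < p < 0.05 only the three with the smallest p-values per omics category are kept. The function name, category names, feature names, and p-values are all hypothetical, and the p-values are assumed to have been computed beforehand.

```python
def select_features(pvalues_by_category, strict=0.01, loose=0.05, top_k=3):
    """Apply the two-part selection rule to precomputed p-values.

    pvalues_by_category: {category: {feature_name: p_value}}
    Returns {category: [selected feature names, ordered by p-value]}.
    Assumption: rule (2) keeps at most `top_k` borderline features per category.
    """
    selected = {}
    for category, pvals in pvalues_by_category.items():
        # Rule (1): clearly significant features.
        strong = [f for f, p in pvals.items() if p < strict]
        # Rule (2): borderline features, best `top_k` per category.
        borderline = sorted(
            (f for f, p in pvals.items() if strict < p < loose),
            key=lambda f: pvals[f],
        )[:top_k]
        selected[category] = sorted(strong + borderline, key=lambda f: pvals[f])
    return selected


# Hypothetical p-values for two omics categories.
pvalues = {
    "mRNA": {"g1": 0.001, "g2": 0.02, "g3": 0.03, "g4": 0.04, "g5": 0.045, "g6": 0.2},
    "microRNA": {"m1": 0.5, "m2": 0.012},
}
chosen = select_features(pvalues)
print(chosen)
```

In this toy input, `g5` is dropped even though its p-value is below 0.05, because three borderline mRNA features with smaller p-values already fill the per-category quota.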

    4.3 Cancer research using NLP and LLMs

    NLP, a core AI technology, has been reported to extract important clinical variables from free text in electronic medical records in order to identify cancer cases and quantify staging and treatment results, and to automatically extract information such as stage, histological type, tumor grade, and treatment method with high accuracy from the clinical notes, pathology reports, and surgical reports of patients with lung cancer.80, 81 Nara Institute of Science and Technology and Osaka International Cancer Center are applying NLP to the automatic analysis of spoken language in order to evaluate the effects of cancer treatment on the cognitive abilities of cancer patients.82 This study aims to assess cognitive function in cancer patients in a less burdensome manner.

    An important development in the NLP field is the rapid progress of LLMs following the advent of transformers, and their use in various aspects of society, as exemplified by ChatGPT, has had a significant impact. The introduction of LLMs into the medical field is also underway, with GPT-4 reportedly exceeding the passing scores on Steps 1–3 of the USMLE by more than 20 points, making it the first computer-based system to pass the standardized US physician licensing examination.83 However, despite its potential to provide medical information, GPT-4's training data are not based on expert medical knowledge. Recent studies have reported that ChatGPT may produce fundamentally flawed or inadequate information in the field of urologic oncology because it extrapolates from relevant literature and abstracts without logic or accuracy.84 Therefore, current LLM-based chatbots may not be reliable enough for direct use. To address these challenges, it has been reported that integrating the EAU oncology guidelines into ChatGPT-4 has the potential to address questions that were previously difficult to answer accurately.84 Fine tuning of LLMs is also underway to adapt them to specific medical reasoning tasks. Here, we introduce a study from the Mayo Clinic in which an LLM was retrained to be adapted to a specific disease, with favorable results. Zhu et al. developed an LLM retrained to meet the needs of patients with head and neck (HN) cancer undergoing radiation therapy, with an emphasis on symptom management and survivorship care (Figure 4).83 Initially, they built a comprehensive external database to “educate” ChatGPT-4, integrating expert consensus guidelines on supportive care for HN patients and correspondence on 90 HN patients from physicians and nurses in the Mayo Clinic electronic health record. Performance was assessed on 20 post-treatment patient inquiries and evaluated by three radiation oncologists (RadOncs). The custom-trained model was highly accurate in assisting HN patients with evidence-based information and guidance on symptom management and survivorship care. The system introduced here uses a database from the United States; however, considering factors such as differences in disease and treatment effects by race, in medical systems, and in language, the ideal way to build a more accurate system may be to build a similar system for each region and country (in the case of Japan, a Japanese system based on a Japanese database). To advance cancer research using NLP and LLMs, it is particularly important to construct high-quality large-scale medical databases. As medical information is sensitive personal information, care must be taken when handling it; however, by building a high-quality large-scale medical database, this field should develop constructively.
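
    One common way to couple a chatbot with an external knowledge base is to retrieve the passages most relevant to a question and place them in the prompt. The sketch below illustrates that general idea only; it is not the implementation of the system described above, whose internal mechanism is not detailed here. The function names, the word-overlap scoring, and the placeholder passages are all assumptions made for illustration, and no call to an actual LLM is included.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, passages, top_k=2):
    """Return up to top_k passages sharing the most words with the question."""
    query = tokenize(question)
    scored = []
    for idx, passage in enumerate(passages):
        overlap = len(query & tokenize(passage))
        # -idx keeps the earlier passage first when overlaps are tied
        scored.append((overlap, -idx, passage))
    scored.sort(reverse=True)
    return [passage for overlap, _, passage in scored[:top_k] if overlap > 0]


def build_prompt(question, passages, top_k=2):
    """Assemble a prompt that restricts the answer to retrieved passages."""
    context = retrieve(question, passages, top_k)
    lines = ["Answer using only the reference passages below."]
    lines += [f"[{i + 1}] {p}" for i, p in enumerate(context)]
    lines.append(f"Patient question: {question}")
    return "\n".join(lines)


# Placeholder passages (hypothetical; they contain no real guideline text).
knowledge_base = [
    "Placeholder guideline text on managing dry mouth after radiation therapy.",
    "Placeholder guideline text on skin care during radiation therapy.",
    "Placeholder note on scheduling follow-up visits.",
]
question = "How can I manage dry mouth after radiation?"

prompt = build_prompt(question, knowledge_base)
print(prompt)
```

    In a regional adaptation such as the Japanese system suggested above, the knowledge base and the tokenizer would be the parts to replace, since word splitting on Latin letters does not apply to Japanese text.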

    Figure 4. Schematic of an LLM retrained to meet the needs of head and neck (HN) cancer patients undergoing radiation therapy (modified from Ref. 83). The workflow and validation process with custom-trained chatbots and external databases are shown. A specialized LLM-based model was proposed that was retrained using a high-quality external database and runs within the Gradio interface, which facilitates ChatGPT-4 training with a custom external knowledge base. During the validation process, three certified radiation oncologists (RadOncs) specializing in the treatment of HN malignancies evaluated the answers to 20 questions generated by the LLM based on four criteria: accuracy, clarity, completeness, and relevance. An automated scoring method has also been developed to evaluate and score chatbot responses using GPT-4. The illustrations in this figure are from royalty-free Adobe Stock (https://stock.adobe.com/jp/).

    5 CHALLENGES IN CONDUCTING AI-BASED CANCER RESEARCH

    Here, we have presented the current status of AI-based cancer research for clinical application. While the main focus was on successful AI-based cancer research, there are recurring challenges specific to medical AI.15, 77 The first is “overfitting,” a condition in which the training error is small but the generalization error (the error when judging unknown data) is not. In the medical field, where the amount of labeled data is limited, carefully judging the generalization performance of the trained model is essential. The second is the “black box problem,” which refers to a situation in which the analysis process by AI is so complex that humans cannot understand it. This highlights the need to develop XAI/interpretable AI. The third issue is the “domain shift problem,” in which characteristics such as differences in the manufacturer, model number, and year of the medical equipment used, as well as differences in protocols at different facilities, can be stronger than the characteristics of the pathology, significantly affecting the robustness of the resulting AI. The fourth issue is the “hallucination problem,” in which AI generates incorrect information; this should be noted in studies that utilize LLMs.85 Although we omit detailed descriptions in this review because of space limitations, various countermeasures are being taken by us and others against the “overfitting,” “black box,” “domain shift,” and “hallucination” problems.15, 71, 85, 86 In addition to technical issues, there are also issues related to legal systems and guidelines. Medical AI research and development involves the large-scale use of medical data, which is often sensitive personal information, raising many sensitive issues in terms of privacy protection. In particular, as there are more opportunities to conduct cloud-based research when utilizing LLMs, the utmost care must be taken to prevent the leakage of personal information. Furthermore, all AI-based medical devices approved for use in clinical practice so far are “locked” AI medical devices with fixed algorithms. To take full advantage of AI functions, “adaptive” AI medical devices that can learn and change through use must also be approved. The revised Pharmaceutical Affairs Law, which went into effect in Japan on September 1, 2020, introduced IDATEN (Improvement Design within Approval for Timely Evaluation and Notice) as an approval system for medical devices that appropriately considers the characteristics of medical devices that are continuously improved and enhanced as well as technological innovations such as AI.87 It is thus expected that “adaptive” AI medical devices will be approved in the near future.
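
    The gap between training error and generalization error that defines overfitting can be made concrete with a toy example. The sketch below is an assumption-laden illustration, not a method from this review: it uses synthetic one-dimensional data with noisy labels and a one-nearest-neighbor classifier, which memorizes its training set, and compares the error on the training data with the error on held-out data.

```python
import random


def make_data(n, rng, noise=0.3):
    """Synthetic samples: label is 1 if x > 0.5, flipped with probability noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = 1 if x > 0.5 else 0
        if rng.random() < noise:
            label = 1 - label  # label noise the model should not memorize
        data.append((x, label))
    return data


def predict_1nn(train, x):
    """Predict the label of the single closest training sample."""
    nearest = min(train, key=lambda item: abs(item[0] - x))
    return nearest[1]


def error_rate(train, data):
    """Fraction of samples in data misclassified by 1-NN fitted on train."""
    wrong = sum(1 for x, y in data if predict_1nn(train, x) != y)
    return wrong / len(data)


rng = random.Random(0)           # fixed seed so the run is reproducible
train = make_data(50, rng)       # small training set, as is common in medicine
held_out = make_data(200, rng)   # unseen data from the same distribution

train_error = error_rate(train, train)
held_out_error = error_rate(train, held_out)
print(f"training error: {train_error:.2f}")
print(f"held-out error: {held_out_error:.2f}")
```

    Because each training sample is its own nearest neighbor, the training error is zero even though 30% of the labels are noise, so only the held-out error says anything about performance on unknown data. This is the reason generalization performance has to be judged on data kept separate from training.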

    6 CONCLUSION

    Judging from its technological potential, there is no doubt that AI technology will be used in various aspects of cancer research in the future. However, as mentioned above, there are multiple issues that need to be addressed. Thus, we believe that AI-based cancer research and its clinical application should not be rushed. Rather, considerable caution should be exercised. We would like to emphasize the need to rigorously verify the clinical utility of AI products by conducting appropriate clinical performance evaluation studies. Priority should be placed not on the interests of individual researchers, but on the benefits to patients and the public, and this cannot be pursued through short-sighted strategies. Thus, we believe that a common effort should be mobilized to advance the utilization of AI in cancer research, driven not only by experts in medicine and informatics, but also by experts in law, bioethics, and sociology, as well as patient associations, aiming to benefit society through repeated discussion.

    AUTHOR CONTRIBUTIONS

    Ryuji Hamamoto: Conceptualization; funding acquisition; investigation; project administration; visualization; writing – original draft; writing – review and editing. Masaaki Komatsu: Conceptualization; investigation; writing – review and editing. Masayoshi Yamada: Conceptualization; investigation; writing – review and editing. Kazuma Kobayashi: Conceptualization; investigation; writing – review and editing. Masamichi Takahashi: Conceptualization; investigation; writing – review and editing. Mototaka Miyake: Conceptualization; investigation; writing – review and editing. Shunichi Jinnai: Conceptualization; investigation; writing – review and editing. Takafumi Koyama: Conceptualization; investigation; writing – review and editing. Nobuji Kouno: Conceptualization; investigation; writing – review and editing. Hidenori Machino: Conceptualization; investigation; writing – review and editing. Satoshi Takahashi: Conceptualization; investigation; writing – review and editing. Ken Asada: Conceptualization; investigation; writing – review and editing. Naonori Ueda: Supervision; writing – review and editing. Syuzo Kaneko: Conceptualization; investigation; writing – review and editing.

    ACKNOWLEDGMENTS

    The authors express their gratitude to the past and present members of the Hamamoto Laboratory.

      FUNDING INFORMATION

      This work was supported by the Cabinet Office BRIDGE (programs for bridging the gap between R&D and the ideal society (Society 5.0) and generating economic and social value) and the RIKEN Center for Advanced Intelligence Project.

      CONFLICT OF INTEREST STATEMENT

      Dr. Ryuji Hamamoto is an editorial board member of Cancer Science. R.H. and M.K. have received research grants from Fujitsu Limited, and R.H. and K.K. have received research grants from the Fujifilm Corporation. The other authors declare no conflicts of interest.

      ETHICS STATEMENT

      Approval of the research protocol by an Institutional Reviewer Board: N/A.

      Informed Consent: N/A.

      Registry and the Registration No. of the study/trial: N/A.

      Animal Studies: N/A.

      Volume 116, Issue 2, February 2025, Pages 297-307
