International Society of Surgery (ISS)
Société Internationale de Chirurgie (SIC)
Integrated Societies: IATSIC | IASMEN | BSI | ISDS
DIAGNOSTIC AND THERAPEUTIC ACCURACY OF LARGE LANGUAGE MODELS IN EMERGENCY DIGESTIVE SURGERY: A PRELIMINARY EVALUATION USING THE ARTIFICIAL INTELLIGENCE PERFORMANCE INSTRUMENT (AIPI) SCORING SYSTEM
Slot ID
366-07
Abstract Title
DIAGNOSTIC AND THERAPEUTIC ACCURACY OF LARGE LANGUAGE MODELS IN EMERGENCY DIGESTIVE SURGERY: A PRELIMINARY EVALUATION USING THE ARTIFICIAL INTELLIGENCE PERFORMANCE INSTRUMENT (AIPI) SCORING SYSTEM
Author Details
No. of Authors
4
Including the presenting author
Author 1
Marie Desonniaux marie.desonniaux@skynet.be University of Liège Faculty of Medicine Liège Belgium *
Author 2
Tommaso Federico Coppola coppolafederico@live.it Humanitas Gavazzeni University Hospital 2. Department of Minimally Invasive General and Oncologic Surgery Bergamo Italy
Author 3
Jerome R. Lechien Jerome.LECHIEN@umons.ac.be University of Mons Department of Surgery, Faculty of Medicine Mons Belgium
Author 4
Giovanni Dapri giovanni@dapri.net Humanitas Gavazzeni University Hospital 2. Department of Minimally Invasive General and Oncologic Surgery Bergamo Italy
Presenting Author Name
Marie Desonniaux
Presenting Author Email
marie.desonniaux@skynet.be
Presenting Author Country
Belgium
Abstract
Abstract type
Oral or Poster
Introduction *
Large language models (LLMs), including ChatGPT-4, Claude Sonnet 4, and DeepSeek, have gained interest as decision-support tools in medical education. Their performance in high-acuity contexts such as emergency digestive surgery, however, remains under-evaluated. This study presents a preliminary evaluation of the diagnostic and therapeutic performance of three LLMs in emergency digestive surgery using the Artificial Intelligence Performance Instrument (AIPI) and aims to determine their potential to assist junior surgical trainees in diagnostic reasoning and management planning.
Material & Method *
Data from twenty emergency digestive surgery cases were collected prospectively between May and July 2025. Each case was independently submitted to ChatGPT-4, Claude Sonnet 4, and DeepSeek using identical standardized prompts. Responses were scored independently using AIPI across four dimensions: primary diagnosis, differential diagnosis, investigations, and treatment. An independent blinded evaluation by expert surgeons is currently being conducted for comparative analysis.
Results *
All models correctly identified the primary diagnosis in 14/20 cases (70%). For differential diagnoses, Claude Sonnet 4 and ChatGPT-4 each achieved 70% accuracy, while DeepSeek scored slightly higher at 75%. For complementary investigations, Claude Sonnet 4 scored highest (90%), followed by ChatGPT-4 (85%) and DeepSeek (80%). Treatment recommendations were appropriate in 85% of cases for DeepSeek and in 80% for the other two models.
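As a reading aid only, the reported percentages can be converted back to case counts and given an approximate uncertainty range. The sketch below is not part of the study's analysis. It assumes that every percentage uses the same denominator of 20 cases, and it uses a Wilson score interval, which is a choice made for this illustration and is not mentioned in the abstract. The names `reported`, `to_count`, and `wilson_interval` are hypothetical.

```python
from math import sqrt

N_CASES = 20  # number of cases stated in the abstract (assumed denominator for every percentage)

# Percentages exactly as reported in the Results paragraph.
reported = {
    "ChatGPT-4":       {"primary": 70, "differential": 70, "investigations": 85, "treatment": 80},
    "Claude Sonnet 4": {"primary": 70, "differential": 70, "investigations": 90, "treatment": 80},
    "DeepSeek":        {"primary": 70, "differential": 75, "investigations": 80, "treatment": 85},
}


def to_count(percent, n=N_CASES):
    """Convert a reported percentage back to a whole number of cases."""
    return round(percent * n / 100)


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (illustrative choice)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    for model, scores in reported.items():
        for dimension, percent in scores.items():
            k = to_count(percent)
            low, high = wilson_interval(k, N_CASES)
            print(f"{model:16s} {dimension:15s} {k}/{N_CASES}  [{low:.2f}, {high:.2f}]")
```

Under these assumptions, 14/20 corresponds to an interval of roughly 48% to 85%, so differences of one case between models (for example 70% versus 75%) fall well within the uncertainty expected from a 20-case sample.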
Conclusion *
This preliminary evaluation suggests that LLMs provide clinically relevant support in emergency surgical decision-making. Their consistent performance across diagnostic and therapeutic tasks indicates potential value in medical education. Ongoing expert validation will clarify their role in complex cases.
Category
Select Main Category
1 General Topics organized by ISS/SIC
Select Sub Category
1.12 AI surgery
Submission Status
Submitted
Word counter
235
Abstract Prizes
Eligible for the BSI Free Paper Prize
No
- Presenting author must register to the congress by 30 November 2025
- Author must submit a full-length manuscript conforming to the format of original articles in the World Journal of Surgery (WJS) by 30 November 2025
Eligible for the Grassi Prize
No
- Author must be age 40 or younger
- One of the authors must be a member of ISDS
- Presenting author must register to the congress by 30 November 2025
- Author must submit a full-length manuscript to the World Journal of Surgery WJS by 30 November 2025
Eligible for the Kitajima Prize
No
- Author must be age 40 or younger
- One of the authors must be a member of ISDS
- Presenting author must register to the congress by 30 November 2025
- Author must submit a full-length manuscript to the World Journal of Surgery WJS by 30 November 2025