KASMEJ


Original Article
Language differences in the multiple-choice question answers of artificial intelligence programs in retinal and vitreous diseases: a comparison of ChatGPT-3.5, Gemini, and Copilot
Aims: Our study aims to investigate the effect of language differences on the success of the freely available artificial intelligence programs ChatGPT-3.5, Gemini, and Copilot in answering multiple-choice questions about retinal and vitreous diseases.
Methods: Forty-six multiple-choice questions on retinal and vitreous diseases were included in the study. These questions were translated into Turkish by a certified native speaker. The English and Turkish versions of each question were then presented one by one to the artificial intelligence programs ChatGPT-3.5, Gemini, and Copilot. The answer option that each program claimed to be correct was compared with the answer key and classified as correct or incorrect, and the programs' success in answering the questions correctly was compared statistically.
Results: ChatGPT-3.5, Gemini, and Copilot correctly answered 54.3%, 69.6%, and 63.0% of the English questions, respectively, and 43.5%, 60.9%, and 52.2% of the Turkish questions, respectively. Although all three programs gave fewer correct answers to the Turkish versions of the questions, the difference between the chatbots was not statistically significant (p>0.05).
Conclusion: Although the difference was not statistically significant, the chatbots' lower success rate on the Turkish questions indicates that these programs need improvement in understanding and handling language translations, as well as in their level of medical knowledge.
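The reported success rates can be reproduced from raw correct-answer counts. A minimal Python sketch, assuming the per-chatbot counts implied by the reported percentages over 46 questions (the abstract gives only percentages, so the counts and the rounding convention used here are inferred, not stated in the paper):

```python
# Hedged sketch: recover the reported success rates from correct-answer
# counts. Counts are inferred from the abstract's percentages over 46
# questions; they are assumptions, not figures stated in the paper.

N_QUESTIONS = 46

# correct answers out of 46, inferred from the reported percentages
correct = {
    "ChatGPT-3.5": {"en": 25, "tr": 20},  # 54.3% / 43.5%
    "Gemini":      {"en": 32, "tr": 28},  # 69.6% / 60.9%
    "Copilot":     {"en": 29, "tr": 24},  # 63.0% / 52.2%
}

def success_rate(n_correct: int, n_total: int = N_QUESTIONS) -> float:
    """Percentage of correctly answered questions, to one decimal place."""
    return round(100 * n_correct / n_total, 1)

for bot, scores in correct.items():
    print(f"{bot}: EN {success_rate(scores['en'])}%, "
          f"TR {success_rate(scores['tr'])}%")
```

Because the abstract reports only marginal accuracies (not per-question agreement between the English and Turkish runs), a paired test such as McNemar's cannot be reconstructed from these numbers alone.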


Volume 5, Issue 4, 2025
Pages: 227-230