Highlights
■ ChatGPT and Gemini show an increasing ability to answer multiple-choice questions on medical exams accurately.
■ There was no statistically significant difference in the rate of correct answers between ChatGPT 3.5 and Gemini 1.5. However, ChatGPT 4.0 performed significantly better, as did Gemini 2.5 Flash, when compared with the literature.
■ Question taxonomy did not appear to affect the models' success rate.

ABSTRACT
Objective: Given the rapid advancement […]