The best AI tutor for O-level subjects: ChatGPT, Gemini or The Wise Otter?
The Straits Times pitted three chatbots against one another to check which AI tutor is the best for Singapore students.
PHOTO ILLUSTRATION: UNSPLASH
SINGAPORE - Students are increasingly relying on artificial intelligence (AI) tutors for their daily work.
Besides mainstream tools such as Google’s Gemini and OpenAI’s ChatGPT, the latest contender in this space is The Wise Otter.
Created and launched in April by Singaporean developer Jotham Goh, 33, The Wise Otter incorporates grading criteria used in Singapore schools.
It can tackle primary school maths, as well as secondary school and junior college maths, English, physics, chemistry and biology questions with step-by-step explanations.
The Straits Times pitted the three chatbots against one another to check which AI tutor is the best for Singapore students.
For ChatGPT, ST included both GPT-4o and the latest GPT-5 models in this test.
GPT-5, launched on Aug 8, is said to hallucinate less and deliver more accurate answers compared with earlier editions.
The chatbots were asked to solve a random sample of questions found in past-year O-level papers and school examinations, including those that involve diagrams, across four subjects: maths, chemistry, physics and English.
Maths
PHOTO: SCREENSHOT FROM FREE TEST PAPERS
The three bots correctly answered a probability question about a cumulative frequency diagram showing the time that adults spent on exercise in one week. The bots also provided detailed explanations for their answers.
Asked to find the value of k (the minimum hours of weekly exercise for adults to stay fit), given that only 60 per cent of the adults in the diagram meet this minimum recommendation, GPT-4o and The Wise Otter failed to provide the right answer.
Only Gemini provided the correct value of k, which was 3.
When told that 3 was the correct answer, GPT-4o and The Wise Otter quickly reanalysed the question and changed their answers to 3.
To test their confidence, both bots were then told that 3 was wrong. GPT-4o, eager to please, wavered again and stated that k’s value was 2.67.
When the question was fed to GPT-5, it was able to provide the correct workings, but still misread the graph and gave the answer as 3.2.
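The reading the bots needed to perform can be sketched as follows. The figures below are hypothetical, since the actual graph from the paper is not reproduced here; the point is the reasoning: if 60 per cent of adults meet the minimum of k hours, then 40 per cent fall below k, so k is read off where the cumulative frequency reaches 40 per cent of the total.

```python
import numpy as np

# Hypothetical cumulative frequency data (hours of weekly exercise,
# number of adults out of 80). The real graph is not shown here.
hours = [0, 1, 2, 3, 4, 5, 6]
cum_freq = [0, 8, 20, 32, 52, 70, 80]

total = cum_freq[-1]
# 60% of adults exercise at least k hours, so 40% fall below k:
# find the time at which cumulative frequency = 40% of the total.
target = 0.4 * total  # 32 adults
k = np.interp(target, cum_freq, hours)
print(k)  # 3.0 with these assumed figures
```

With these assumed figures the 40 per cent mark lands exactly on a plotted point; on a real graph a student would read the value off by eye or interpolate along the curve in the same way.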
PHOTO: SCREENSHOT FROM FREE TEST PAPERS
Although ChatGPT and The Wise Otter struggled with understanding the graph, all three bots were able to correctly answer other text-based questions.
English
The three bots were prompted to write an essay plan for the following question: “‘I realised that I was much stronger than I had previously thought.’ Write about a time when you felt like this.”
All three bots were able to produce a comprehensive structure for students to follow, and suggested key points to include in each paragraph.
All three prompted the writer to recall a moment when a challenge seemed insurmountable, to reflect on how his or her perspective changed after the event, and to use vivid language as much as possible.
The Wise Otter went on to remind the writer that strength could be physical, emotional, mental or a combination of these.
It also gave possible classifications of the essay as “narrative”, “reflective” or “personal recount”. These are three of several essay types that O-level students in local schools are taught to identify, along with the appropriate approaches for the essay types.
The Wise Otter also provided advice on how to get the highest band for content, a scoring system used in GCE O levels – specifically, the essay needs to recount explicitly how the writer felt weak before, what happened to make the writer feel stronger, and how the writer realised he or she had strength all along.
GCE O-level examiners typically award marks for content and language, and score the papers between bands zero and five (five for the highest marks).
Essay-writing tips provided by The Wise Otter chatbot on Telegram.
PHOTO: SCREENSHOT FROM THE WISE OTTER
Chemistry
PHOTO: SCREENSHOT FROM FREE TEST PAPERS
A diagrammatic question on paper chromatography required the bots to identify which metals – lead, copper, iron, nickel and tin – could be found in mixture A, which contained three metals.
The correct answer was lead, iron and tin, as the chromatogram of mixture A showed spots at the same heights as these metals.
The bots were able to reason that matching the heights of the spots was the way to find the answer, but each correctly identified only two of the three metals.
Gemini, GPT-4o and GPT-5 mistakenly identified copper as a metal in mixture A. The Wise Otter erroneously said that nickel was found in mixture A.
Physics
PHOTO: SCREENSHOT FROM SG TEST PAPERS
When given a text-based multiple choice question that tested understanding of inertia, the bots gave an accurate definition of the concept, describing it as an object’s resistance to changing its state of motion.
The bots explained that inertia depends only on the mass of an object, and that other factors such as speed and velocity are irrelevant. They provided the correct answer that the car had the greatest inertia as it had the greatest mass.
PHOTO: SCREENSHOT FROM SG TEST PAPERS
Fed another physics question asking for the total resistance of an electrical circuit based on a diagram, the bots provided the correct answer, despite having struggled with diagrammatic maths and chemistry questions.
All the bots were able to break down the calculations into two parts – first by calculating the resistance of the parallel resistors, and then by adding it to the resistance of the fixed resistor connected in series. They provided the correct formula for calculating the equivalent resistance of the parallel resistors.
However, The Wise Otter’s Telegram interface cannot render stacked fractions, so it denotes fractions with the symbol “/” instead.
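The two-step method the bots used can be sketched in a few lines. The resistor values below are hypothetical, as the original circuit diagram is not reproduced here; the structure is the standard one: combine the parallel pair first, then add the series resistor.

```python
def parallel(*resistances):
    """Equivalent resistance of resistors in parallel: 1/Req = sum(1/Ri)."""
    return 1 / sum(1 / r for r in resistances)

# Step 1: combine the parallel pair (assumed values: 6 ohms and 3 ohms).
r_parallel = parallel(6, 3)  # 1 / (1/6 + 1/3) = 2 ohms

# Step 2: add the resistor connected in series (assumed value: 4 ohms).
r_total = r_parallel + 4
print(r_total)  # 6.0 ohms with these assumed values
```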
PHOTO: SCREENSHOT FROM THE WISE OTTER

