ChatGPT answered more than half of programming questions incorrectly
Generative artificial intelligence often makes mistakes, which is why even developers advise against relying on it to write code. To test ChatGPT’s overall abilities and knowledge in this area, the system was asked more than 500 questions about software development. It answered more than half of them incorrectly, TechSpot reports.
Researchers from Purdue University in Indiana, USA, asked ChatGPT 517 questions from Stack Overflow, a popular resource for professional programmers and enthusiasts. They evaluated the answers not only for correctness but also for consistency, comprehensiveness, and conciseness, and they also analyzed the answers’ language style and sentiment.
According to the results of the experiment, ChatGPT answered only 48% of the questions correctly. At the same time, the researchers described 77% of its answers as wordy. The thoroughness and textbook style of the bot’s writing often made incorrect answers appear correct to the volunteers.
The study notes that even when ChatGPT’s answer was clearly wrong, two of the 12 participants still preferred it because of the AI’s pleasant, confident, and positive tone.