ChatGPT is much better than humans at accurately identifying emotions in fictional textual scenarios
A new study found that ChatGPT, an increasingly popular AI chatbot capable of natural language processing, greatly outperformed humans in emotional awareness tasks in a set of fictional textual scenarios. It was much better than typical people at estimating the emotions characters would likely experience. The paper was published in Frontiers in Psychology.
ChatGPT is a sophisticated artificial intelligence language model developed by OpenAI. It is designed to engage in natural language conversations with users and to answer questions on a wide range of topics. One interacts with ChatGPT by typing queries or messages, and ChatGPT generates responses based on the patterns it learned from vast amounts of text data from the internet. While often very informative and helpful, ChatGPT is limited by the fact that it operates solely on patterns in its training data, without genuine understanding of the content it generates.
Casual observations of ChatGPT use indicate that people often try to use it to discuss their personal or mental health problems. Researchers have also long been evaluating the potential of chatbots, particularly those capable of natural language processing such as ChatGPT, in the mental health field. Initial studies have often returned positive results. For example, chatbots have been tested as providers of cognitive behavioral therapy, and the early results look promising.
One of the key cognitive capacities necessary for successfully providing mental health help is emotional awareness. Emotional awareness is the ability to recognize, understand, and acknowledge one’s own emotions and the emotions of others. It involves being attuned to the feelings and emotional states within oneself, as well as being empathetic towards the emotions expressed by others through their verbal and nonverbal cues.
Study author Zohar Elyoseph and his colleagues wanted to compare ChatGPT’s emotional awareness performance with that of the general population. The idea behind their new study is that, if chatbots based on ChatGPT or similar models are to be used for providing mental health services, their emotional awareness capacities must be no worse than those of real people.
To assess ChatGPT’s emotional awareness performance, the researchers used the Levels of Emotional Awareness Scale (LEAS). It consists of 20 open-ended questions describing emotionally charged scenarios intended to elicit emotions such as anger, fear, happiness, and sadness. In the original version, human respondents were asked to imagine themselves in each scenario and write down the emotions they would feel.
In the version used in this study, ChatGPT was prompted to describe how a “human” would feel in the described situation. ChatGPT’s answers were scored using the standard manual for the scale and compared with results from 750 French participants aged 17 to 84 (33 years on average) from another study. The researchers evaluated ChatGPT’s performance twice – once in January and again in February 2023.
The results showed that ChatGPT’s scores were much better than the scores of both women and men in the first evaluation. Still, quite a few real people outscored ChatGPT on this occasion. On the second try, ChatGPT’s performance had improved markedly; it achieved scores that only a few individuals from the general population surpassed. Finally, two licensed psychologists evaluated the accuracy of ChatGPT’s emotional awareness responses and found them to be very accurate.
“ChatGPT demonstrated significantly higher performance in all the test scales compared with the performance of the general population norms. In addition, one month after the first evaluation, ChatGPT’s emotional awareness performance significantly improved and almost reached the ceiling score of the LEAS. Accordingly, the fit-to-context (accuracy) of the emotions to the scenario evaluated by two independent licensed psychologists was also high,” the study authors concluded.
The study makes an important contribution to the scientific exploration of possibilities for using AI in the field of mental health. However, it should be noted that the study involved textual scenarios and the recognition of the most probable emotions in fictional situations. By their nature, such scenarios are simplistic, and their textual forms cover only the few aspects of the situation that the author predefined as relevant. This is profoundly unlike real-world emotion recognition, and it is also the type of task in which AI models perform best. Results on more complex tasks, or on tasks where relevant information is mixed with irrelevant information, might not be the same.
The study, “ChatGPT outperforms humans in emotional awareness evaluations”, was authored by Zohar Elyoseph, Dorit Hadar-Shoval, Kfir Asraf, and Maya Lvovsky.