The study originated at Jackson’s alma mater, Brigham Young University (BYU), and included 327 co-authors from 186 educational institutions in 14 countries.

The research results showed that students outperformed ChatGPT, a natural language processing tool driven by artificial intelligence (AI) technology, by almost 30% when no partial credit was awarded and by about 20% when partial credit was awarded.

ChatGPT had the edge on 11.3% of the questions, performing particularly well on questions about accounting information systems (AIS) and auditing. ChatGPT also performed better on true/false and multiple-choice questions than on short-answer and complex mathematical questions, where its performance was low.

We asked Jackson about his role in the research as well as his takeaways on the implications of AI technologies, like ChatGPT, for students and professionals in academia. This is what he had to say.

What was your role in assisting with the research?

David Wood, an accounting professor at BYU and one of my former professors, reached out to me about joining him on a ChatGPT-focused project. I have always been interested in cutting-edge technological developments in accounting, and this project seemed to be a good fit.

The project required me to take every question from each of my fall 2022 classes (nine exams in total) and run it through ChatGPT to see how the AI would fare. I then took these results and added them to the project database along with other relevant information about each problem.

Aside from data gathering, I was involved with the crowd-sourced effort of editing the manuscript. This process can be a lengthy one when you are doing it by yourself or with one or two co-authors. This is the first time that I have been a part of such a huge effort, and I found it to be an innovative way to do education-focused research.

Based on your research findings, do you think educators will change the way they structure their exams? If so, how?

Educators can learn a lot from these results. For example, if an exam or assignment is purely multiple choice or true/false, AI has an easier time figuring out the answers. However, if the exam includes either longer or shorter "work it out" type problems, current generations of AI would have more difficulty getting those right. If the point of exams is to give students the opportunity to showcase their learning, an educator could avoid the issue altogether by tailoring their exams around these "work it out" type problems, or even considering an oral portion to examinations.

How do you think students can leverage ChatGPT in a positive way?

Moving beyond the classroom, I imagine companies will be using the technology, so it is important that students have at least some exposure. Many of the Big 4 public accounting firms have already stated that they will be using AI to assemble sales proposals and to write first drafts of client letters. Obviously, these AI-generated documents are nowhere near the quality the Big 4 requires, so they still need a human touch to perfect.

This is where I see student engagement with AI being positive: providing a human touch and revision to an otherwise AI-generated document. There may be some polarizing thoughts on this, but I envision a written class assignment that instructs students to use AI to draft the first version of a paper or cover letter and then go back through and edit it as needed. As for AI as a study tool, given its track record of providing conflicting information, I would point students to their tried-and-true resources, like textbooks and professors.

In your opinion, how can educators use the tool to help teach students?

Educators could use AI to design assignments and potentially test them to see if they are AI-proof. This could reduce educators’ reliance on test bank problems and make it more difficult for students to cheat. But for any problem, educators should always verify that the answer makes sense. ChatGPT sometimes makes nonsensical errors, or even straight-up makes things up!

Is there anything else you would like to add on your research findings or your overall thoughts about AI and ChatGPT?

It’s important to point out that the version of AI we tested was GPT-3. OpenAI recently launched its newest chatbot, GPT-4, and it remains to be seen how the newer generation fares against humans. The thing about AI is that it is constantly learning and evolving. But for now, humans win!

With the recent buzz surrounding the benefits and drawbacks of AI’s evolution, as well as the release of GPT-4, it will be interesting to see what the repercussions of this technological advancement will be. It seems there will be a lot more research coming out on both the ethics and performance sides of this topic, and it will be interesting to cover it.

For the time being, feel free to access the abstract of the highlighted study online. The full PDF can be downloaded for a fee.

